Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Incomplete
1.2.0
None
Description
Pipelines can currently contain Estimators and Transformers.
Question for debate: Should Pipelines be able to contain Evaluators?
Pros:
- Schema check: Evaluators take input datasets with a particular schema, which should perhaps be checked before running a Pipeline.
- Intermediate results:
  - If a Transformer removes a column (which is not done by built-in Transformers currently but might be reasonable in the future), then the user can never evaluate that column. (However, users could keep all columns around.)
  - If users have to evaluate after running a Pipeline, then each evaluated column may have to be re-materialized.
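To make the schema-check and dropped-column points concrete, here is a minimal sketch in plain Python. It does not use the Spark API: `Stage`, `check_schema`, and the column names are hypothetical and exist only for illustration. The idea is that if an Evaluator were a stage, its required columns could be validated up front, before any data is processed.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """Hypothetical stage: declares the columns it requires, adds, and drops."""
    name: str
    requires: set = field(default_factory=set)
    adds: set = field(default_factory=set)
    drops: set = field(default_factory=set)


def check_schema(input_columns, stages):
    """Walk the stages over column names only; no data is touched."""
    columns = set(input_columns)
    for stage in stages:
        missing = stage.requires - columns
        if missing:
            raise ValueError(f"{stage.name} is missing columns: {sorted(missing)}")
        columns = (columns - stage.drops) | stage.adds
    return columns


# The last stage plays the role of an Evaluator: it only requires columns.
stages = [
    Stage("tokenizer", requires={"text"}, adds={"words"}),
    Stage("model", requires={"words"}, adds={"prediction"}, drops={"words"}),
    Stage("evaluator", requires={"label", "prediction"}),
]

final_columns = check_schema({"text", "label"}, stages)
```

In this sketch, an evaluator that needed the dropped `words` column would fail in `check_schema` if placed after `model`, but would pass if placed directly after `tokenizer`, which is the intermediate-results argument.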
Cons:
- API: Evaluators do not transform datasets. They produce a scalar (or a few values), which makes it hard to say how they fit into a Pipeline or a PipelineModel.
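The API concern can be illustrated with another plain-Python sketch. This is not Spark code: `AddOne`, `AccuracyEvaluator`, and `run_pipeline` are hypothetical names. One possible design, assumed here, treats an Evaluator as a pass-through stage whose scalar result is collected on the side, which shows why the return type of a Pipeline becomes awkward.

```python
class AddOne:
    """Hypothetical Transformer: returns a new dataset with a 'prediction' column."""

    def transform(self, rows):
        return [{**r, "prediction": r["x"] + 1} for r in rows]


class AccuracyEvaluator:
    """Hypothetical Evaluator: reduces a dataset to one scalar."""

    def evaluate(self, rows):
        correct = sum(1 for r in rows if r["prediction"] == r["label"])
        return correct / len(rows)


def run_pipeline(stages, rows):
    """Transformers replace the dataset; Evaluators leave it unchanged."""
    metrics = {}
    for i, stage in enumerate(stages):
        if hasattr(stage, "transform"):
            rows = stage.transform(rows)
        else:
            # The scalar has nowhere to go in the dataset, so it is stored separately.
            metrics[f"{i}_{type(stage).__name__}"] = stage.evaluate(rows)
    return rows, metrics


data = [
    {"x": 1, "label": 2},
    {"x": 2, "label": 3},
    {"x": 3, "label": 5},
    {"x": 4, "label": 5},
]
output, metrics = run_pipeline([AddOne(), AccuracyEvaluator()], data)
```

Here the pipeline has to return a pair (dataset, metrics) instead of just a dataset, so every caller must handle a second output. That is one way to read the difficulty described above.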