Description
In PySpark:
Add a parameter that controls whether the full list of sub-models is collected during CrossValidator/TrainValidationSplit training. It is off by default, so that the change does not cause OOM.
Add a method to CrossValidatorModel/TrainValidationSplitModel that lets the user get the model list.
Add an "option" to CrossValidatorModelWriter that lets the user control whether the model list is persisted to disk.
Note: when persisting the model list, use indices as the sub-model paths.
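The index-based path scheme in the note could look roughly like the sketch below. It is only an illustration: the helper name sub_model_paths, the "subModels" directory and the "fold<N>" prefix are assumptions made for the example, not something this issue specifies.

```python
import posixpath


def sub_model_paths(base_path, num_param_maps, num_folds=None):
    """Hypothetical helper: build index-based paths for persisted sub-models.

    The "subModels" directory and "fold<N>" naming are assumptions made
    for illustration only.
    """
    root = posixpath.join(base_path, "subModels")
    if num_folds is None:
        # TrainValidationSplit case: one sub-model per param map,
        # addressed by the param map index.
        return [posixpath.join(root, str(i)) for i in range(num_param_maps)]
    # CrossValidator case: one sub-model per (fold, param map) pair,
    # addressed by fold index and then param map index.
    return [
        [posixpath.join(root, "fold%d" % f, str(i)) for i in range(num_param_maps)]
        for f in range(num_folds)
    ]
```

Using indices rather than model UIDs or parameter values keeps the paths short and stable, and lets a reader rebuild the list in the original order from the number of folds and param maps alone.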
Issue Links
- blocks: SPARK-22005 CrossValidator, TrainValidationSplit dump sub models to disk when fitting: Python API (Resolved)
- is blocked by: SPARK-21911 Parallel Model Evaluation for ML Tuning: PySpark (Resolved)
- links to