Description
Create a K-Means API for spark.ml Pipelines. It should wrap the existing KMeans implementation in spark.mllib.
This will be the first clustering method added to Pipelines, so it is important to take SPARK-7610 into account and to think about how the clustering API should be designed. We do not need abstractions from the beginning (and probably should not have them), but we should think far enough ahead that they can be added later.
Issue Links
- duplicates SPARK-7879 KMeans API for spark.ml Pipelines (Resolved)