Description
It would be handy to add a binary toggle Param to HashingTF, as in scikit-learn's HashingVectorizer: http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.HashingVectorizer.html
If the Param is set, all non-zero term counts are set to 1.
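To illustrate the requested behavior, here is a minimal pure-Python sketch of the hashing trick with a binary toggle. It is not Spark's implementation: the function name `hashing_tf`, the `num_features` default, and the use of CRC32 as the hash function are all assumptions made for illustration only.

```python
import zlib
from collections import Counter


def hashing_tf(terms, num_features=1 << 18, binary=False):
    """Hypothetical helper: map terms to a sparse {index: value} dict via hashing."""
    counts = Counter()
    for term in terms:
        # CRC32 is an assumed stand-in hash; it is deterministic across runs.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        counts[index] += 1
    if binary:
        # Binary toggle: every non-zero count is replaced by 1.
        return {index: 1 for index in counts}
    return dict(counts)


doc = ["spark", "hashing", "spark", "tf", "spark"]
tf_counts = hashing_tf(doc)               # raw term frequencies
tf_binary = hashing_tf(doc, binary=True)  # presence/absence only
```

With the toggle on, the output keeps the same set of non-zero indices but discards how often each term occurred, which suits models that expect binary rather than integer counts.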
Issue Links
- is related to: SPARK-13629 Add binary toggle Param to CountVectorizer (Resolved)
- relates to: SPARK-14238 Add binary toggle Param to PySpark HashingTF in ML & MLlib (Resolved)