Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
Description
As Apache Spark logistic regression is applied more widely, we are seeing more and more requests to improve its speed and scalability. Many suggestions and discussions have emerged in the developer and user communities. While it may be difficult to find a single optimization that covers every case, understanding the various scenarios and approaches is important.
As discussed with josephkb, this JIRA was created to discuss and collect efforts on optimizing LR (logistic regression). All ongoing related JIRAs will be linked here, along with new ideas and possibilities.
Users are encouraged to share their experiences with and expectations of LR, and to track the community's development status here. Developers can use this JIRA to browse existing efforts, communicate with each other, and introduce research and development resources.
Issue Links
- relates to
  - SPARK-7159 Support multiclass logistic regression in spark.ml (Resolved)
  - SPARK-3181 Add Robust Regression Algorithm with Huber Estimator (Resolved)
  - SPARK-10078 Vector-free L-BFGS (Resolved)
  - SPARK-14464 Logistic regression performs poorly for very large vectors, even when the number of non-zero features is small (Resolved)
  - SPARK-16494 Upgrade breeze version to 0.12 (Resolved)
  - SPARK-17133 Improvements to linear methods in Spark (Resolved)