[SPARK-16592] Improving ml.Logistic Regression on speed and scalability - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: None
Fix Version/s: None
Component/s: ML
Labels:
- bulk-closed

Description

As logistic regression in Apache Spark sees wider use, we have seen more and more requests to improve its speed and scalability. Many suggestions and discussions have emerged in the developer and user communities. A single optimization that covers every case may be hard to find, but understanding the various scenarios and approaches will be important.

As discussed with josephkb, this JIRA was created to discuss and collect efforts on optimizing LR (logistic regression). All ongoing related JIRAs will be linked here, along with new ideas and possibilities.

Users are encouraged to share their experiences with and expectations of LR, and to track the community's development status here. Developers can use this JIRA to browse existing efforts, communicate with each other, and share research and development resources.

Issue Links

relates to

- SPARK-7159 Support multiclass logistic regression in spark.ml (Resolved)
- SPARK-3181 Add Robust Regression Algorithm with Huber Estimator (Resolved)
- SPARK-10078 Vector-free L-BFGS (Resolved)
- SPARK-14464 Logistic regression performs poorly for very large vectors, even when the number of non-zero features is small (Resolved)
- SPARK-16494 Upgrade breeze version to 0.12 (Resolved)
- SPARK-17133 Improvements to linear methods in Spark (Resolved)

People

Assignee: Unassigned
Reporter: yuhao yang
Votes: 0
Watchers: 5

Dates

Created: 17/Jul/16 08:17
Updated: 21/May/19 04:33
Resolved: 21/May/19 04:33