Description
In the current implementation of the GLM solvers, we keep intermediate models on the driver node and update them through broadcast and aggregation. Even with the torrent broadcast and tree aggregation added in 1.1, it is hard to scale beyond ~10 million features. This JIRA is for investigating the parameter server approach, including the algorithm, infrastructure, and dependencies.
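To make the bottleneck concrete, here is a minimal sketch of the driver-centric pattern described above. It is a plain-Python simulation, not Spark code: lists stand in for partitions, and the names `partition_gradient`, `tree_aggregate`, and `driver_step` are hypothetical. The least-squares loss is also an assumption chosen for brevity.

```python
# Illustrative simulation only: plain Python lists stand in for Spark partitions.
# All function names here are hypothetical, not part of any Spark API.

def partition_gradient(weights, rows):
    """Least-squares gradient over one partition; rows are (features, label)."""
    grad = [0.0] * len(weights)
    for x, y in rows:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += err * xi
    return grad

def tree_aggregate(vectors):
    """Sum dense vectors pairwise, level by level, mimicking a tree aggregation."""
    level = list(vectors)
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]  # one or two vectors
            nxt.append([sum(vals) for vals in zip(*pair)])
        level = nxt
    return level[0]

def driver_step(weights, partitions, step_size):
    """One gradient-descent step with the model held on the 'driver'."""
    # "Broadcast": every partition receives the full dense weight vector.
    grads = [partition_gradient(weights, rows) for rows in partitions]
    # "Aggregate": a full-length gradient flows back from every partition.
    total = tree_aggregate(grads)
    n = sum(len(rows) for rows in partitions)
    return [w - step_size * g / n for w, g in zip(weights, total)]

partitions = [[([1.0, 0.0], 1.0)], [([0.0, 1.0], 2.0)]]
new_weights = driver_step([0.0, 0.0], partitions, step_size=1.0)
```

In this sketch every step moves a full-length vector in both directions, so per-iteration communication and driver memory grow with the number of features regardless of how sparse each partition's data is. A parameter server approach would instead shard the model so that workers exchange only the coordinates they touch.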
Issue Links
- is related to: SPARK-6567 Large linear model parallelism via a join and reduceByKey (Resolved)
- relates to: SPARK-6932 A Prototype of Parameter Server (Resolved)