Description
The MLOR implementation in spark.ml trains the model in the standardized feature space by dividing the feature values by the column standard deviation in each iteration. We perform this computation many time more than is necessary in order to achieve sequential memory access pattern when computing the gradients. We can have both - sequential access patterns and reduced computation - if we use a column major layout for the coefficients.
Attachments
Issue Links
- is related to
-
SPARK-18456 Use matrix abstraction for LogisticRegression coefficients during training
- Resolved
- links to