  1. Spark
  2. SPARK-17133 Improvements to linear methods in Spark
  3. SPARK-18060

Avoid unnecessary standardization in multinomial logistic regression training

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: ML
    • Labels: None

    Description

      The multinomial logistic regression (MLOR) implementation in spark.ml trains the model in the standardized feature space by dividing each feature value by its column standard deviation in every iteration. To keep memory access sequential when computing the gradients, the current code performs this division many more times than necessary. We can have both sequential access patterns and reduced computation if we use a column-major layout for the coefficients.
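
      The trade-off can be illustrated with a minimal sketch. This is not the spark.ml code: the function names, the flat-list coefficient storage, and the division counter are all assumptions made for illustration. It computes the per-class margins for one example under both layouts and counts how many standardization divisions each one performs.

```python
# Illustrative sketch only; names and layout are hypothetical, not Spark's API.

def margins_row_major(coef, x, std, num_classes):
    """Coefficients stored class by class: coef[k * d + j]."""
    d = len(x)
    margins = [0.0] * num_classes
    divisions = 0
    for k in range(num_classes):          # outer loop over classes
        for j in range(d):                # sequential walk through coef
            if std[j] != 0.0:
                # x[j] / std[j] is recomputed for every class
                margins[k] += coef[k * d + j] * (x[j] / std[j])
                divisions += 1
    return margins, divisions


def margins_col_major(coef, x, std, num_classes):
    """Coefficients stored feature by feature: coef[j * num_classes + k]."""
    d = len(x)
    margins = [0.0] * num_classes
    divisions = 0
    for j in range(d):                    # outer loop over features
        if std[j] != 0.0:
            scaled = x[j] / std[j]        # standardize once per feature
            divisions += 1
            for k in range(num_classes):  # still a sequential walk through coef
                margins[k] += coef[j * num_classes + k] * scaled
    return margins, divisions


# Same 3-class, 2-feature model in both layouts.
row_major = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # class 0, class 1, class 2
col_major = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0]   # feature 0, feature 1
x = [2.0, 9.0]
std = [2.0, 3.0]

m_row, n_row = margins_row_major(row_major, x, std, 3)
m_col, n_col = margins_col_major(col_major, x, std, 3)
```

      In this toy case both layouts yield the same margins, while the row-major loop divides once per (class, feature) pair and the column-major loop divides once per feature.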

            People

              Assignee: Seth Hendrickson (sethah)
              Reporter: Seth Hendrickson (sethah)
              Shepherd: DB Tsai
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved: