Spark / SPARK-5406

LocalLAPACK mode in RowMatrix.computeSVD should have much smaller upper bound

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.3.0
    • Component/s: MLlib
    • Labels: None
    • Environment: CentOS; other platforms should behave similarly

    Description

      In RowMatrix.computeSVD, LocalLAPACK mode invokes brzSvd. However, the breeze svd for dense matrices has a latent size constraint. In its implementation
      ( https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/linalg/functions/svd.scala ):

      val workSize = ( 3
        * scala.math.min(m, n)
        * scala.math.min(m, n)
        + scala.math.max(scala.math.max(m, n), 4 * scala.math.min(m, n)
          * scala.math.min(m, n) + 4 * scala.math.min(m, n))
        )
      val work = new Array[Double](workSize)

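      workSize is an Int because it is used as an array length. When m = n, it reduces to 3 * n * n + 4 * n * n + 4 * n, so the number of columns n must satisfy 7 * n * n + 4 * n < Int.MaxValue,
      and thus n < 17515.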
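
      The bound can be checked with a small sketch. It re-evaluates the same workSize formula in Long arithmetic, so the result cannot overflow, and searches for the largest square dimension that stays below Int.MaxValue. The object and method names (SvdWorkSizeBound, workSize, maxSquareDim) are hypothetical and are not part of breeze or Spark.

```scala
object SvdWorkSizeBound {

  // Same expression as the breeze excerpt, but evaluated in Long
  // so that large m, n do not silently overflow Int.
  def workSize(m: Long, n: Long): Long = {
    val k = math.min(m, n)
    3 * k * k + math.max(math.max(m, n), 4 * k * k + 4 * k)
  }

  // Largest n such that the work array for an n x n input
  // stays below Int.MaxValue (binary search; workSize grows with n).
  def maxSquareDim: Int = {
    var lo = 1L       // workSize(1, 1) = 11, well within range
    var hi = 100000L  // workSize(100000, 100000) is far above Int.MaxValue
    while (lo < hi) {
      val mid = (lo + hi + 1) / 2
      if (workSize(mid, mid) < Int.MaxValue) lo = mid else hi = mid - 1
    }
    lo.toInt
  }

  def main(args: Array[String]): Unit = {
    println(maxSquareDim)             // prints 17514
    println(workSize(80000L, 80000L)) // prints 44800320000
  }
}
```

      For the 80K * 80K case, the work array alone would need about 4.48e10 doubles, which is far beyond what a single JVM array can hold.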

      This JIRA is only the first step. If possible, I hope Spark can eventually handle matrix computations up to 80K * 80K.

          People

            Assignee: yuhao yang (yuhaoyan)
            Reporter: yuhao yang (yuhaoyan)
            Votes: 0
            Watchers: 3

              Time Tracking

                Original Estimate: 2h
                Remaining Estimate: 2h
                Time Spent: Not Specified