  1. Spark
  2. SPARK-16566

Bug in SparseMatrix multiplication with SparseVector

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.6.2
    • Fix Version/s: None
    • Component/s: MLlib
    • Labels: None

    Description

      In org.apache.spark.mllib.linalg.BLAS.scala, the multiplication of a SparseMatrix (sm) by a SparseVector (sv), when sm is not transposed, assumes that the indices of sv are sorted. Nothing validates that this is the case, so a vector with unsorted indices produces a wrong result.

      This can be replicated simply by using spark-shell and entering these commands:

      import org.apache.spark.mllib.linalg.SparseMatrix
      import org.apache.spark.mllib.linalg.SparseVector
      import org.apache.spark.mllib.linalg.DenseVector
      import scala.collection.mutable.ArrayBuffer

      val vectorIndices = Array(3,2)
      val vectorValues = Array(0.1,0.2)
      val size = 4

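      // 4x4 matrix in CSC form with a single non-zero entry: sm(0, 2) = 1.0
      val sm = new SparseMatrix(size, size, Array(0, 0, 0, 1, 1), Array(0), Array(1.0))
      val dm = sm.toDense
      // Sparse vector whose indices are deliberately unsorted (3 before 2)
      val sv = new SparseVector(size, vectorIndices, vectorValues)
      val dv = new DenseVector(sv.toArray)

      // Expected to be true; with the bug the two results differ
      sm.multiply(dv) == sm.multiply(sv)

      // Print both results to compare them
      sm.multiply(dv)
      sm.multiply(sv)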
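
      Until the library validates its input, one possible workaround is to sort the index/value pairs before constructing the SparseVector, which should satisfy the sorted-indices assumption described above. The sketch below is plain Scala with no Spark dependency; the object SortedSparse and its two methods are hypothetical names introduced here for illustration and are not part of Spark.

```scala
// Hypothetical helper, not part of Spark.
object SortedSparse {
  // True when every index is strictly greater than the one before it.
  def isStrictlyIncreasing(indices: Array[Int]): Boolean =
    (1 until indices.length).forall(i => indices(i - 1) < indices(i))

  // Reorders indices and values together so that indices are ascending.
  def sortByIndex(indices: Array[Int],
                  values: Array[Double]): (Array[Int], Array[Double]) = {
    require(indices.length == values.length,
      "indices and values must have the same length")
    indices.zip(values).sortBy(_._1).unzip
  }
}

// Same data as the reproduction above: indices (3, 2), values (0.1, 0.2)
val (sortedIndices, sortedValues) =
  SortedSparse.sortByIndex(Array(3, 2), Array(0.1, 0.2))
// sortedIndices is Array(2, 3); sortedValues is Array(0.2, 0.1)
```

      In spark-shell, the sorted arrays can then be passed to new SparseVector(size, sortedIndices, sortedValues) in place of the unsorted ones. isStrictlyIncreasing illustrates the kind of check the description says is missing; it is an assumption about what such validation could look like, not the fix that was applied upstream.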

            People

              Assignee: Unassigned
              Reporter: Wilson (wilson.lauw)
              Votes: 0
              Watchers: 4