Description
In RowMatrix.computeSVD, under LocalLAPACK mode, the code invokes brzSvd. However, breeze's svd for dense matrices has a latent constraint in its implementation
( https://github.com/scalanlp/breeze/blob/master/math/src/main/scala/breeze/linalg/functions/svd.scala ):
val workSize = ( 3
  * scala.math.min(m, n)
  * scala.math.min(m, n)
  + scala.math.max(scala.math.max(m, n), 4 * scala.math.min(m, n)
    * scala.math.min(m, n) + 4 * scala.math.min(m, n))
)
val work = new Array[Double](workSize)
As a result, workSize (an Int) is 3 * n * n + max(max(m, n), 4 * n * n + 4 * n); for a tall matrix where 4 * n * n + 4 * n >= m, the number of columns n must satisfy 7 * n * n + 4 * n < Int.MaxValue,
thus n < 17515.
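The bound can be checked numerically. A minimal sketch (plain Scala, not part of Spark or breeze; assumes the tall-matrix case where 4 * n * n + 4 * n >= m, so workSize reduces to 7 * n * n + 4 * n):

```scala
object WorkSizeBound {
  // Simplified breeze work-array size for a tall matrix (4*n*n + 4*n >= m),
  // computed in Long to avoid the very Int overflow being discussed.
  def workSize(n: Long): Long = 7L * n * n + 4L * n

  def main(args: Array[String]): Unit = {
    val limit = Int.MaxValue.toLong
    // Largest n whose required work-array length still fits in an Int index.
    val maxN = Iterator.from(1).takeWhile(n => workSize(n) < limit).max
    println(maxN)                      // prints 17514, i.e. n < 17515
    println(workSize(17515) < limit)   // prints false
  }
}
```

For n = 17514, workSize is 2,147,251,428 (< Int.MaxValue = 2,147,483,647); for n = 17515 it is 2,147,496,635, which no longer fits in an Int-sized array.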
This JIRA is only the first step. If possible, I hope Spark can handle matrix computation up to 80K * 80K.