[SYSTEMDS-1224] Migrate vector and labeledpoint classes from mllib to ml - ASF JIRA

Details

Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: SystemML 0.13
Fix Version/s: SystemML 0.13
Component/s: APIs, Runtime
Labels: None

Description

On Spark 2, running test_mllearn_df.py fails with a ClassCastException when an ml SparseVector is cast to an mllib Vector:

spark-submit --driver-class-path $SYSTEMML_HOME/SystemML.jar test_mllearn_df.py

generates:

Py4JJavaError: An error occurred while calling o206.fit.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 2.0 failed 1 times, most recent failure: Lost task 1.0 in stage 2.0 (TID 17, localhost, executor driver): java.lang.ClassCastException: org.apache.spark.ml.linalg.SparseVector cannot be cast to org.apache.spark.mllib.linalg.Vector
	at org.apache.sysml.runtime.instructions.spark.utils.RDDConverterUtils.countNnz(RDDConverterUtils.java:314)
	at org.apache.sysml.runtime.instructions.spark.utils.RDDConverterUtils.access$400(RDDConverterUtils.java:71)
	at org.apache.sysml.runtime.instructions.spark.utils.RDDConverterUtils$DataFrameAnalysisFunction.call(RDDConverterUtils.java:940)
	at org.apache.sysml.runtime.instructions.spark.utils.RDDConverterUtils$DataFrameAnalysisFunction.call(RDDConverterUtils.java:921)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1762)

This can most likely be fixed by migrating the relevant classes, such as the vector and labeled-point classes, from the Spark mllib package to the ml package.
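As a rough illustration of what such a migration involves, the sketch below rewrites fully qualified mllib class names in Java or Scala source text to their ml counterparts. The helper name migrate_imports and the mapping table are hypothetical and are not taken from the actual fix. The mapping covers only the vector and labeled-point classes, and it assumes the ml-side names that ship with Spark 2 (org.apache.spark.ml.linalg.* and org.apache.spark.ml.feature.LabeledPoint). Renaming imports alone may not be sufficient, since call sites can also need changes.

```python
import re

# Assumed mapping from mllib class names to their Spark 2 ml counterparts.
# Illustrative only: it covers just the vector and labeled-point classes.
MLLIB_TO_ML = {
    "org.apache.spark.mllib.linalg.Vector": "org.apache.spark.ml.linalg.Vector",
    "org.apache.spark.mllib.linalg.Vectors": "org.apache.spark.ml.linalg.Vectors",
    "org.apache.spark.mllib.linalg.DenseVector": "org.apache.spark.ml.linalg.DenseVector",
    "org.apache.spark.mllib.linalg.SparseVector": "org.apache.spark.ml.linalg.SparseVector",
    "org.apache.spark.mllib.regression.LabeledPoint": "org.apache.spark.ml.feature.LabeledPoint",
}

# Try longer names first and require word boundaries, so that "Vectors" is
# not rewritten as "Vector" + "s" and unmapped names are left untouched.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(MLLIB_TO_ML, key=len, reverse=True))
    + r")\b"
)


def migrate_imports(source: str) -> str:
    """Return source with the mapped mllib class names replaced by ml names."""
    return _PATTERN.sub(lambda match: MLLIB_TO_ML[match.group(1)], source)


if __name__ == "__main__":
    before = (
        "import org.apache.spark.mllib.linalg.Vector;\n"
        "import org.apache.spark.mllib.regression.LabeledPoint;\n"
    )
    print(migrate_imports(before))
```

Class names outside the mapping, for example org.apache.spark.mllib.linalg.Matrix, pass through unchanged, so any remaining mllib usages stay visible for manual review.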

Activity

Jon Deron Eriksson added a comment - 04/Feb/17 03:35

Fixed by PR369.

People

Assignee: Jon Deron Eriksson

Reporter: Jon Deron Eriksson

Votes: 0

Watchers: 1

Dates

Created: 02/Feb/17 01:13

Updated: 04/Feb/17 03:35

Resolved: 04/Feb/17 03:35
