Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
SystemML 0.13
None
None
Description
Recently, we switched from the old mllib.Vector type to the new ml.Vector type. Unfortunately, as a result we no longer recognize DataFrames with mllib.Vector columns during conversion, and thus we (1) do not correctly convert them to SystemML Matrix objects, (2) instead fall back on conversion to Frame objects, and then (3) fail completely when the ensuing DML script expects to operate on matrices.
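To make the three-step failure concrete, here is a toy model in plain Python. It is not SystemML code: the type-name strings and the functions infer_target and validate_scalar_op are hypothetical names invented for illustration, and the real conversion logic may be structured quite differently. It only sketches how recognizing a single vector type could send an mllib.Vector column down the Frame path.

```python
# Toy model of the conversion decision described above; NOT SystemML code.
# All names below (type strings, functions) are illustrative assumptions.

ML_VECTOR = "ml.Vector"        # new vector type
MLLIB_VECTOR = "mllib.Vector"  # old vector type
NUMERIC = {"int", "long", "float", "double"}


def infer_target(column_types, vector_types):
    """Return "MATRIX" if every column is numeric or a recognized vector
    type; otherwise fall back to "FRAME"."""
    for col_type in column_types:
        if col_type not in NUMERIC and col_type not in vector_types:
            return "FRAME"
    return "MATRIX"


def validate_scalar_op(data_type):
    """Mimic script validation: matrix-scalar arithmetic is accepted,
    anything else is rejected."""
    if data_type != "MATRIX":
        raise TypeError(
            "Invalid Datatypes for operation %s SCALAR" % data_type
        )
    return data_type


# Schema from the report: DataFrame[__INDEX: int, sample: vector]
schema = ["int", MLLIB_VECTOR]

# Recognizing only ml.Vector models the regression: fallback to Frame.
broken = infer_target(schema, {ML_VECTOR})

# Recognizing both vector types models the expected behavior.
fixed = infer_target(schema, {ML_VECTOR, MLLIB_VECTOR})
```

In this model, passing broken to validate_scalar_op raises an error analogous to the one in the stack trace in this report, while fixed passes validation.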
Given a Spark DataFrame X_df of type DataFrame[__INDEX: int, sample: vector], where vector is of type mllib.Vector, the following script now fails (it did not previously):
script = """
# Scale images to [-1,1]
X = X / 255
X = X * 2 - 1
"""
outputs = ("X",)
script = dml(script).input(X=X_df).output(*outputs)
X = ml.execute(script).get(*outputs)
X
Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception occurred while validating script
    at org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:487)
    at org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:280)
    at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
    ... 12 more
Caused by: org.apache.sysml.parser.LanguageException: Invalid Parameters : ERROR: null -- line 4, column 4 -- Invalid Datatypes for operation FRAME SCALAR
    at org.apache.sysml.parser.Expression.raiseValidateError(Expression.java:549)
    at org.apache.sysml.parser.Expression.computeDataType(Expression.java:415)
    at org.apache.sysml.parser.Expression.computeDataType(Expression.java:386)
    at org.apache.sysml.parser.BinaryExpression.validateExpression(BinaryExpression.java:130)
    at org.apache.sysml.parser.StatementBlock.validate(StatementBlock.java:567)
    at org.apache.sysml.parser.DMLTranslator.validateParseTree(DMLTranslator.java:140)
    at org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:485)
    ... 14 more
This fixed my real-world case. Thanks, deron!