Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Description
Currently, the test_mllearn.py -> TestMLLearn.testLogisticSK1 test is failing with the following error:
Caused by: org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 151 and 164 -- Error evaluating instruction: CP°rexpand°cast=true°max=10.0°ignore=false°dir=cols°target=Y_vec°_mVar275·MATRIX·DOUBLE
    at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:152)
    at org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:374)
    ... 17 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 151 and 164 -- Error evaluating instruction: CP°rexpand°cast=true°max=10.0°ignore=false°dir=cols°target=Y_vec°_mVar275·MATRIX·DOUBLE
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:335)
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:224)
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
    at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:145)
    ... 18 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: Invalid input w/ zeros for rexpand ignore=false (rlen=1617, nnz=1455).
    at org.apache.sysml.runtime.matrix.data.LibMatrixReorg.rexpand(LibMatrixReorg.java:721)
    at org.apache.sysml.runtime.matrix.data.MatrixBlock.rexpandOperations(MatrixBlock.java:5419)
    at org.apache.sysml.runtime.instructions.cp.ParameterizedBuiltinCPInstruction.processInstruction(ParameterizedBuiltinCPInstruction.java:252)
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:305)
    ... 21 more
This test case creates MatrixBlocks directly and passes them to our LogisticRegression Scala wrapper via LogisticRegression.fit(MatrixBlock, MatrixBlock), which in turn calls MultiLogReg.dml.
Within MultiLogReg.dml (https://github.com/apache/incubator-systemml/blob/10dff5c9e3eb737a965846246d8187fcb0b03689/scripts/algorithms/MultiLogReg.dml#L148), the Y_vec input is converted from a vector of class labels to a matrix of one-hot encoded labels. During this conversion, Y_vec is first transformed so that class labels <= 0 become the largest label; the updated Y_vec therefore contains no zero values. The updated Y_vec is then passed to the table function to be one-hot encoded. At that point, the runtime uses the nnz of the MatrixBlock to check whether Y_vec contains any zeros, and the check fails because the nnz of Y_vec is still erroneously set to its value from before the transformation of class labels <= 0 (in the trace above, nnz=1455 against rlen=1617).
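To make the failure mode concrete, here is a minimal, self-contained Python sketch of the same pattern. It is not SystemML code: LabelVector, relabel_nonpositive, and one_hot are hypothetical names standing in for the MatrixBlock, the relabeling step in MultiLogReg.dml, and the zero check before one-hot encoding. The choice of max + 1 as the replacement label is an assumption made for illustration.

```python
class LabelVector:
    """Hypothetical stand-in for a column vector that caches its non-zero count."""

    def __init__(self, values):
        self.values = list(values)
        self.nnz = self.count_nonzeros()

    def count_nonzeros(self):
        return sum(1 for v in self.values if v != 0)


def relabel_nonpositive(vec, refresh_nnz):
    """Replace labels <= 0 with a new largest label (assumed here: max + 1)."""
    new_label = max(vec.values) + 1
    out = LabelVector([])
    out.values = [new_label if v <= 0 else v for v in vec.values]
    # Carry over the cached count from the input. Without a refresh it is
    # stale, which is the situation described in this issue.
    out.nnz = vec.nnz
    if refresh_nnz:
        out.nnz = out.count_nonzeros()
    return out, new_label


def one_hot(vec, num_classes):
    """One-hot encode 1-based labels, rejecting input the metadata says has zeros."""
    rlen = len(vec.values)
    if vec.nnz < rlen:
        raise ValueError(
            "Invalid input w/ zeros (rlen=%d, nnz=%d)." % (rlen, vec.nnz))
    return [[1 if v == c else 0 for c in range(1, num_classes + 1)]
            for v in vec.values]


labels = LabelVector([1, 0, 2, 0, 3])

# Stale metadata: the values contain no zeros, but the check still fails.
stale, k = relabel_nonpositive(labels, refresh_nnz=False)
try:
    one_hot(stale, k)
    stale_failed = False
except ValueError:
    stale_failed = True

# Refreshed metadata: the same values now pass the check.
fresh, k = relabel_nonpositive(labels, refresh_nnz=True)
encoded = one_hot(fresh, k)
```

In this sketch the transformed values are identical in both runs; only the cached count differs, which is why the check rejects data that no longer contains any zeros.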
Interestingly, if we revert the recent update to MultiLogReg.dml from SYSTEMML-958 (https://github.com/apache/incubator-systemml/commit/10dff5c9e3eb737a965846246d8187fcb0b03689), the test passes. Regardless, this is a bug: the nnz should be updated after Y_vec is transformed to have no zero values.
Great, that fixed the problem!