SystemDS / SYSTEMDS-1005

MultiLogReg Test Failure: Invalid input w/ zeros for rexpand ignore=false (rlen=1617, nnz=1455).

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • SystemML 0.11

    Description

      Currently, the test_mllearn.py -> TestMLLearn.testLogisticSK1 test is failing with the following error:

      Caused by: org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 151 and 164 -- Error evaluating instruction: CP°rexpand°cast=true°max=10.0°ignore=false°dir=cols°target=Y_vec°_mVar275·MATRIX·DOUBLE
              at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:152)
              at org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:374)
              ... 17 more
      Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 151 and 164 -- Error evaluating instruction: CP°rexpand°cast=true°max=10.0°ignore=false°dir=cols°target=Y_vec°_mVar275·MATRIX·DOUBLE
              at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:335)
              at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:224)
              at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
              at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:145)
              ... 18 more
      Caused by: org.apache.sysml.runtime.DMLRuntimeException: Invalid input w/ zeros for rexpand ignore=false (rlen=1617, nnz=1455).
              at org.apache.sysml.runtime.matrix.data.LibMatrixReorg.rexpand(LibMatrixReorg.java:721)
              at org.apache.sysml.runtime.matrix.data.MatrixBlock.rexpandOperations(MatrixBlock.java:5419)
              at org.apache.sysml.runtime.instructions.cp.ParameterizedBuiltinCPInstruction.processInstruction(ParameterizedBuiltinCPInstruction.java:252)
              at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:305)
              ... 21 more
      

      This test case creates MatrixBlocks directly and supplies them as input to our LogisticRegression Scala wrapper (via LogisticRegression.fit(MatrixBlock, MatrixBlock)), which in turn calls MultiLogReg.dml.

      Within MultiLogReg.dml https://github.com/apache/incubator-systemml/blob/10dff5c9e3eb737a965846246d8187fcb0b03689/scripts/algorithms/MultiLogReg.dml#L148, the Y_vec input is converted from a vector of class labels to a matrix of one-hot encoded labels. During this conversion, Y_vec is first transformed so that any class labels <= 0 are converted to the largest label, which means the updated Y_vec has no zero values. The updated Y_vec is then passed into the table function to be one-hot encoded. At this point, the runtime checks whether Y_vec contains any zeros based on the nnz of the MatrixBlock, and in this case it fails because the nnz of Y_vec is still erroneously set to its value from before the transformation of class labels <= 0.
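
The failure mode suspected above can be illustrated with a small, self-contained Python sketch. It is not SystemML code: LabelVector, remap_nonpositive, and one_hot are hypothetical names, and mapping labels <= 0 to max + 1 is an assumption about what "the largest label" means. The sketch only shows how a cached nnz that is not refreshed after the remapping makes a zero check fail even though the values no longer contain zeros.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabelVector:
    """Hypothetical stand-in for a column vector with a cached non-zero count."""
    values: List[float]
    nnz: int  # cached metadata; goes stale if it is not refreshed after an update

    @property
    def rlen(self) -> int:
        return len(self.values)


def count_nonzeros(values: List[float]) -> int:
    """Recompute the number of non-zero entries from the actual values."""
    return sum(1 for v in values if v != 0)


def remap_nonpositive(vec: LabelVector, refresh_nnz: bool) -> LabelVector:
    """Map every label <= 0 to max_label + 1 (assumed meaning of 'largest label')."""
    new_label = max(vec.values) + 1
    remapped = [new_label if v <= 0 else v for v in vec.values]
    # With refresh_nnz=False the old count is carried over, mimicking stale metadata.
    nnz = count_nonzeros(remapped) if refresh_nnz else vec.nnz
    return LabelVector(remapped, nnz)


def one_hot(vec: LabelVector, num_classes: int) -> List[List[float]]:
    """One-hot encode labels 1..num_classes, trusting the cached nnz for the zero check."""
    if vec.nnz < vec.rlen:
        raise ValueError(
            f"Invalid input w/ zeros (rlen={vec.rlen}, nnz={vec.nnz})."
        )
    return [
        [1.0 if int(v) == c else 0.0 for c in range(1, num_classes + 1)]
        for v in vec.values
    ]


y_vec = LabelVector([1.0, 0.0, 2.0, 0.0, 3.0], nnz=3)

stale = remap_nonpositive(y_vec, refresh_nnz=False)
try:
    one_hot(stale, num_classes=4)
except ValueError as err:
    print("stale nnz:", err)

fresh = remap_nonpositive(y_vec, refresh_nnz=True)
print("fresh nnz:", fresh.nnz, "rows encoded:", len(one_hot(fresh, num_classes=4)))
```

Comparing count_nonzeros(vec.values) against vec.nnz after each step is one way to locate where the metadata and the data diverge.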

      Interestingly, if we remove the recent update to MultiLogReg.dml from SYSTEMML-958, https://github.com/apache/incubator-systemml/commit/10dff5c9e3eb737a965846246d8187fcb0b03689, the test passes. Regardless, this is a bug as the nnz should be updated after Y_vec is transformed to have no 0 values.

      cc mboehm7, niketanpansare

        Activity

          dusenberrymw Mike Dusenberry added a comment - Great, that fixed the problem!

          niketanpansare Niketan Pansare added a comment - mwdusenb@us.ibm.com I have pushed a commit https://github.com/apache/incubator-systemml/commit/70799f521fcdfbe8f87ba06ffb48b701ed97c14a which should fix this. Please confirm and close this issue.


          mboehm7 Matthias Boehm added a comment - ok, I just ran a couple of sanity checks for MultiLogReg with -1/0/1-based labels as well as ProgramBlock.CHECK_MATRIX_SPARSITY enabled. ALL intermediates had correct nnz, so I can confirm that it's not related to SystemML's compiler/runtime. niketanpansare could you please have a look and fix the python api/test? Thanks.

          People

            Assignee: niketanpansare Niketan Pansare
            Reporter: dusenberrymw Mike Dusenberry