Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
The parfor optimizer has a rewrite that selects the remote Spark execution type even if the original program contains Spark operations, provided these operations fit into the memory budget of the executors. However, this rewrite does not check that the operations have valid integer dimensions, and hence execution fails with:
Caused by: org.apache.sysml.runtime.DMLRuntimeException: Matrix dimensions too large for CP runtime: 3 x 5129281161
	at org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:80)
	at org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
	at org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:207)
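The missing check can be sketched as follows. The column count in the exception, 5129281161, exceeds Integer.MAX_VALUE (2147483647), so a guard on the dimensions would have kept the affected operations out of CP. This is only a minimal illustration and not the actual SystemML code or fix: the class ParforDimCheck, the methods hasValidIntDims and canForceCp, and the memory figures passed in main are all hypothetical.

```java
// Hypothetical sketch of a dimension guard; not taken from the SystemML code base.
class ParforDimCheck {

    // Returns true only if both dimensions are non-negative and fit into a Java int.
    static boolean hasValidIntDims(long rows, long cols) {
        return rows >= 0 && cols >= 0
            && rows <= Integer.MAX_VALUE
            && cols <= Integer.MAX_VALUE;
    }

    // Assumed decision rule: an operation may be forced to CP only if it fits
    // into the memory budget AND its dimensions are valid integer dimensions.
    static boolean canForceCp(long rows, long cols,
                              long memEstimateBytes, long memBudgetBytes) {
        return memEstimateBytes <= memBudgetBytes
            && hasValidIntDims(rows, cols);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Illustrative values only: the dimensions come from the exception above,
        // the memory figures are placeholders.
        boolean ok = canForceCp(3L, 5129281161L, 109 * mb, 4638 * mb);
        System.out.println("force CP: " + ok);
    }
}
```

With such a guard, an operation whose dimensions overflow an int would remain a Spark operation even when its memory estimate fits the budget.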
Here is the related optimizer output:
----------------------------
 EXPLAIN OPT TREE (type=ABSTRACT_PLAN, size=22)
----------------------------
--PARFOR, exec=CP, k=16, dp=NONE, tp=FIXED, rm=LOCAL_AUTOMATIC
----GENERIC (lines 122-126), exec=CP, k=1
------lix, exec=CP, k=1
------b(-), exec=CP, k=1
------b(*), exec=CP, k=1
------r(t), exec=CP, k=16
------ba(+*), exec=CP, k=16
------rix, exec=CP, k=1
------r(rshape), exec=CP, k=16
------ba(+*), exec=CP, k=16
------r(rshape), exec=CP, k=16
------rix, exec=CP, k=1
------r(rshape), exec=SPARK, k=1
------rix, exec=SPARK, k=1
------b(/), exec=CP, k=1
------u(exp), exec=CP, k=16
------b(-), exec=CP, k=1
------rix, exec=CP, k=1
------ua(maxRC), exec=CP, k=16
------ua(+RC), exec=CP, k=16
------b(*), exec=CP, k=1
------ua(+RC), exec=CP, k=16
----------------------------
18/03/06 23:17:33 DEBUG Optimizer: --- RULEBASED OPTIMIZER -------
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: Optimize w/ max_mem=24271MB/4638MB/4638MB, max_k=16/144/144).
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: Optimize w/ SparkClusterConfig:
-- legacyVersion = false (2.2.0)
-- confOnly = true
-- numExecutors = 6
-- defaultPar = 144
-- memExecutor = 69478645760
-- memDataMinFrac = 0.5
-- memDataMaxFrac = 0.6
-- memBroadcastFrac = 0.21
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: estimated mem (serial exec) M=109MB
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set data partitioner' - result=NONE ()
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'remove unnecessary compare matrix' - result=false ()
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set result partitioning' - result=false
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: estimated new mem (serial exec) M=109MB
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: estimated new mem (serial exec, all CP) M=109MB
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: estimated new mem (cond partitioning) M=109MB
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set execution strategy' - result=REMOTE_SPARK (recompile=true)
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set operation exec type CP' - result=2
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'enable data colocation' - result=false
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set partition replication factor' - result=false
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set export replication factor' - result=true (3)
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set degree of parallelism' - result=(see EXPLAIN)
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set task partitioner' - result=STATIC
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set fused data partitioning and execution' - result=false
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set transpose sparse vector operations' - result=false
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set in-place result indexing' - result=true ([delta_b_softmax], M=160MB)
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'disable CP caching' - result=false (M=160MB)
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set result merge' - result=LOCAL_MEM
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'set recompile memory budget' - result=24271MB
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'remove recursive parfor' - result=0/0
18/03/06 23:17:33 DEBUG Optimizer: RULEBASED OPT: rewrite 'remove unnecessary parfor' - result=0
18/03/06 23:17:33 DEBUG OptimizationWrapper: ParFOR Opt: Optimized plan (after optimization):
----------------------------
 EXPLAIN OPT TREE (type=ABSTRACT_PLAN, size=22)
----------------------------
--PARFOR, exec=SPARK, k=3, dp=NONE, tp=STATIC, rm=LOCAL_MEM
----GENERIC (lines 122-126), exec=CP, k=1
------lix, exec=CP, k=1
------b(-), exec=CP, k=1
------b(*), exec=CP, k=1
------r(t), exec=CP, k=1
------ba(+*), exec=CP, k=1
------rix, exec=CP, k=1
------r(rshape), exec=CP, k=1
------ba(+*), exec=CP, k=1
------r(rshape), exec=CP, k=1
------rix, exec=CP, k=1
------r(rshape), exec=CP, k=1
------rix, exec=CP, k=1
------b(/), exec=CP, k=1
------u(exp), exec=CP, k=1
------b(-), exec=CP, k=1
------rix, exec=CP, k=1
------ua(maxRC), exec=CP, k=1
------ua(+RC), exec=CP, k=1
------b(*), exec=CP, k=1
------ua(+RC), exec=CP, k=1
----------------------------