Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
SystemML 0.13
-
None
Description
Deron discovered that one of the Python tests (test_mllearn_df.py) was failing with Spark 2.1.0 because the test score from linear regression was very low (~0.24). I did some investigation, and it turns out that the model parameters computed by the DML script are incorrect. In SystemML 0.12, the values of the betas from the linear regression model are [152.919, 938.237]. This is what we expect from the normal equation (I also tested this with sklearn). But the values of the betas from SystemML 0.13 (with Spark 2.1.0) come out to be [153.146, 458.489]. These are not correct, and therefore the test score is much lower than expected. The data going into the DML script is correct: I printed out the values of X and Y in DML and didn't see any issue there.
Attached are the log files for the two tests (SystemML 0.12 and 0.13), run with the explain flag.
Attachments
- python_LinearReg_test_spark.1.6.log
- 505 kB
- Imran Younus
- python_LinearReg_test_spark.2.1.log
- 526 kB
- Imran Younus
Activity
Fixed in the commit https://github.com/apache/incubator-systemml/commit/9d0087cbbd250c9b486923555b450602f816cf19 by setting regularization to 0 (similar to that of scikit-learn).
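To illustrate why the regularization setting matters here, the following is a minimal sketch, not the SystemML implementation. It assumes the direct solver solves the regularized normal equations (X^T X + diag(lambda)) beta = X^T y with no penalty on the intercept, and it uses a small synthetic data set whose single feature has a small magnitude, as in the diabetes data. The function name fit_ridge_1d and the sample values are hypothetical.

```python
def fit_ridge_1d(xs, ys, reg):
    """Solve the 2x2 regularized normal equations for one feature plus intercept.

    Assumption: the penalty `reg` applies to the slope only, not the intercept.
    Returns (intercept, slope).
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # System: [[sxx + reg, sx], [sx, n]] @ [slope, intercept] = [sxy, sy]
    det = (sxx + reg) * n - sx * sx
    slope = (sxy * n - sx * sy) / det
    intercept = ((sxx + reg) * sy - sx * sxy) / det
    return intercept, slope


# Synthetic data: small-magnitude feature, exact linear relationship.
xs = [-0.05, -0.02, 0.0, 0.02, 0.05]
ys = [150.0 + 900.0 * x for x in xs]

intercept_noreg, slope_noreg = fit_ridge_1d(xs, ys, reg=0.0)
intercept_reg, slope_reg = fit_ridge_1d(xs, ys, reg=1.0)

print("reg=0:", intercept_noreg, slope_noreg)
print("reg=1:", intercept_reg, slope_reg)
```

Because the sum of squares of such a feature is tiny compared with a regularization constant of 1, the penalty dominates the slope's diagonal entry and shrinks the slope heavily, while the intercept is barely affected.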
I am able to reproduce this behavior (I am not sure whether it is a bug) from the command line as well. Here is the output of GLM-predict (after running LinRegDS):
$ cat y_predicted.csv
189.09660701586185
133.3260601238074
157.3739106185465
132.8144037303023
135.88434209133283
154.81562865102103
194.2131709509127
136.3959984848379
125.13955782772601
137.41931127184807
178.35182275225503
123.60458864721075
152.7690030770007
141.0009060263837
116.95305553164462
161.46716176658717
144.58250078091928
144.58250078091928
170.67697684967874
117.4647119251497
Here is the output of Python mllearn:
>>> import numpy as np
>>> from pyspark.context import SparkContext
>>> from pyspark.ml import Pipeline
>>> from pyspark.ml.feature import HashingTF, Tokenizer
>>> from pyspark.sql import SparkSession
>>> from sklearn import datasets, metrics, neighbors
>>> from sklearn.datasets import fetch_20newsgroups
>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> from systemml.mllearn import LinearRegression, LogisticRegression, NaiveBayes, SVM
>>> diabetes = datasets.load_diabetes()
>>> diabetes_X = diabetes.data[:, np.newaxis, 2]
>>> diabetes_X_train = diabetes_X[:-20]
>>> diabetes_X_test = diabetes_X[-20:]
>>> diabetes_y_train = diabetes.target[:-20]
>>> diabetes_y_test = diabetes.target[-20:]
>>> sparkSession = SparkSession.builder.getOrCreate()
>>> regr = LinearRegression(sparkSession, solver="direct-solve")
>>> regr.fit(diabetes_X_train, diabetes_y_train)
Welcome to Apache SystemML!
17/02/16 22:39:21 WARN RewriteRemovePersistentReadWrite: Non-registered persistent write of variable 'X' (line 87).
17/02/16 22:39:21 WARN RewriteRemovePersistentReadWrite: Non-registered persistent write of variable 'y' (line 88).
BEGIN LINEAR REGRESSION SCRIPT
Reading X and Y...
Calling the Direct Solver...
Computing the statistics...
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,4.8020565933360324E-14
STDEV_RES_Y,67.06389890324985
DISPERSION,4497.566536105316
PLAIN_R2,0.24750834362605834
ADJUSTED_R2,0.24571669682516795
PLAIN_R2_NOBIAS,0.24750834362605834
ADJUSTED_R2_NOBIAS,0.24571669682516795
Writing the output matrix...
END LINEAR REGRESSION SCRIPT
lr
>>> regr.predict(diabetes_X_test)
17/02/16 22:39:35 WARN Expression: WARNING: null -- line 149, column 4 -- Read input file does not exist on FS (local mode):
17/02/16 22:39:35 WARN Expression: Metadata file: .mtd not provided
array([[ 188.84521284],
       [ 134.98127765],
       [ 158.20701117],
       [ 134.4871131 ],
       [ 137.45210036],
       [ 155.73618846],
       [ 193.78685827],
       [ 137.94626491],
       [ 127.07464496],
       [ 138.93459399],
       [ 178.46775744],
       [ 125.59215133],
       [ 153.75953028],
       [ 142.39374579],
       [ 119.16801227],
       [ 162.16032752],
       [ 145.8528976 ],
       [ 145.8528976 ],
       [ 171.05528929],
       [ 119.66217681]])
To reproduce the command-line output, please dump the test data into csv:
import numpy as np
from sklearn import datasets
diabetes = datasets.load_diabetes()
diabetes_X = diabetes.data[:, np.newaxis, 2]
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]
diabetes_X_test.tofile('X_test.csv', sep="\n")
diabetes_X.tofile('X.csv', sep="\n")
diabetes.target.tofile('y.csv', sep="\n")
Then execute the following commands (you may have to edit the DML script to add the format, or create a metadata file):
~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit SystemML.jar -f LinearRegDS.dml -nvargs X=X.csv Y=y.csv B=B.csv fmt=csv icpt=1 tol=0.000001 reg=1
~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit SystemML.jar -f GLM-predict.dml -nvargs X=X_test.csv M=y_predicted.csv B=B.csv fmt=csv icpt=1 tol=0.000001 reg=1
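For the metadata file mentioned above, a minimal sketch of generating one from Python follows. It assumes the JSON field names data_type, value_type, rows, cols, format, header, and sep, and the naming convention of appending .mtd to the data file name; check these against the SystemML documentation for your version. The helper names build_mtd and write_mtd are hypothetical.

```python
import json


def build_mtd(rows, cols):
    """Build a metadata dictionary for a dense CSV matrix of doubles.

    The field names are an assumption about the SystemML .mtd layout.
    """
    return {
        "data_type": "matrix",
        "value_type": "double",
        "rows": rows,
        "cols": cols,
        "format": "csv",
        "header": False,
        "sep": ",",
    }


def write_mtd(csv_path, rows, cols):
    """Write <csv_path>.mtd next to the CSV file and return its path."""
    mtd_path = csv_path + ".mtd"
    with open(mtd_path, "w") as handle:
        json.dump(build_mtd(rows, cols), handle, indent=4)
    return mtd_path
```

For the files dumped above, this would be called once per CSV file, for example write_mtd('X_test.csv', 20, 1) for the 20 held-out rows; the dimensions of the other files need to match what was actually written.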
I also tested using SystemML 0.12.0 and got the same predictions:
$ ~/spark-1.6.1-bin-hadoop2.6/bin/spark-submit systemml-0.12.0-incubating.jar -f LinearRegDS.dml -nvargs X=X.csv Y=y.csv B=B.csv fmt=csv icpt=1 tol=0.000001 reg=1
$ ~/spark-1.6.1-bin-hadoop2.6/bin/spark-submit systemml-0.12.0-incubating.jar -f GLM-predict.dml -nvargs X=X_test.csv M=y_predicted.csv B=B.csv fmt=csv icpt=1 tol=0.000001 reg=1
$ cat y_predicted.csv
189.09660701586185
133.3260601238074
157.3739106185465
132.8144037303023
135.88434209133283
154.81562865102103
194.2131709509127
136.3959984848379
125.13955782772601
137.41931127184807
178.35182275225503
123.60458864721075
152.7690030770007
141.0009060263837
116.95305553164462
161.46716176658717
144.58250078091928
144.58250078091928
170.67697684967874
117.4647119251497
And here is the output of 0.12.0 mllearn:
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Python version 2.7.12 (default, Jul 2 2016 17:42:40)
SparkContext available as sc, HiveContext available as sqlContext.
>>> import numpy as np
>>> from sklearn import datasets
>>> from systemml.mllearn import LinearRegression
>>> from pyspark.sql import SQLContext
>>> # Load the diabetes dataset
... diabetes = datasets.load_diabetes()
>>> # Use only one feature
... diabetes_X = diabetes.data[:, np.newaxis, 2]
>>> # Split the data into training/testing sets
... diabetes_X_train = diabetes_X[:-20]
>>> diabetes_X_test = diabetes_X[-20:]
>>> # Split the targets into training/testing sets
... diabetes_y_train = diabetes.target[:-20]
>>> diabetes_y_test = diabetes.target[-20:]
>>> # Create linear regression object
... regr = LinearRegression(sqlCtx, solver='direct-solve')
17/02/16 23:34:34 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-d7505265-08a5-41f5-804c-484e1de7881e/httpd-c0b01ae9-c212-4373-ab09-cc0390bcd1dd
17/02/16 23:34:34 INFO spark.HttpServer: Starting HTTP Server
17/02/16 23:34:34 INFO server.Server: jetty-8.y.z-SNAPSHOT
17/02/16 23:34:34 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:46590
17/02/16 23:34:34 INFO util.Utils: Successfully started service 'HTTP file server' on port 46590.
17/02/16 23:34:34 INFO spark.SparkContext: Added JAR /home/biuser/anaconda2/lib/python2.7/site-packages/systemml/systemml-java/systemml-0.12.0-incubating.jar at http://localhost:46590/jars/systemml-0.12.0-incubating.jar with timestamp 1487309674061
>>> # Train the model using the training sets
... regr.fit(diabetes_X_train, diabetes_y_train)
Welcome to Apache SystemML!
BEGIN LINEAR REGRESSION SCRIPT
Reading X and Y...
Calling the Direct Solver...
Computing the statistics...
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,3.633533705616816E-14
STDEV_RES_Y,63.038506337610244
DISPERSION,3973.853281276927
PLAIN_R2,0.3351312506863875
ADJUSTED_R2,0.33354822985468835
PLAIN_R2_NOBIAS,0.3351312506863875
ADJUSTED_R2_NOBIAS,0.33354822985468835
Writing the output matrix...
END LINEAR REGRESSION SCRIPT
lr
>>> regr.predict(diabetes_X_test)
array([[ 225.97316413],
       [ 115.7476731 ],
       [ 163.27609584],
       [ 114.73643007],
       [ 120.80388829],
       [ 158.21988065],
       [ 236.08559449],
       [ 121.81513133],
       [  99.56778451],
       [ 123.8376174 ],
       [ 204.73706035],
       [  96.5340554 ],
       [ 154.17490851],
       [ 130.91631866],
       [  83.38789592],
       [ 171.36604013],
       [ 137.99501992],
       [ 137.99501992],
       [ 189.5684148 ],
       [  84.39913896]])
1. I have verified that the mllearn API in 0.12.0 produces correct results.
2. No changes have been introduced in Python/Scala wrappers to affect this. The only change I see in algo since 0.12.0 is cbind. The bug is likely due to a side-effect of some other change.
3. I verified that Python wrappers are passing correct inputs to DML script by writing the input X,y to file and comparing it with original python data.
I tested LinRegDS:
A. commandline:
~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit SystemML.jar -f LinearRegDS.dml -nvargs X=X.csv Y=y.csv B=B.csv fmt=csv icpt=1 tol=0.000001 reg=1
Calling the Direct Solver...
Computing the statistics...
17/02/16 21:02:52 INFO MapPartitionsRDD: Removing RDD 17 from persistence list
17/02/16 21:02:52 INFO BlockManager: Removing RDD 17
AVG_TOT_Y,152.13348416289594
STDEV_TOT_Y,77.09300453299106
AVG_RES_Y,-2.935409582574532E-14
STDEV_RES_Y,66.48545020578437
DISPERSION,4420.315089065834
PLAIN_R2,0.2579428201690507
ADJUSTED_R2,0.2562563265785258
PLAIN_R2_NOBIAS,0.2579428201690507
ADJUSTED_R2_NOBIAS,0.2562563265785258
Writing the output matrix...
END LINEAR REGRESSION SCRIPT
B. mllearn:
Calling the Direct Solver...
Computing the statistics...
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,4.8020565933360324E-14
STDEV_RES_Y,67.06389890324985
DISPERSION,4497.566536105316
PLAIN_R2,0.24750834362605834
ADJUSTED_R2,0.24571669682516795
PLAIN_R2_NOBIAS,0.24750834362605834
ADJUSTED_R2_NOBIAS,0.24571669682516795
Writing the output matrix...
END LINEAR REGRESSION SCRIPT
lr
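The PLAIN_R2 values reported in these runs can be cross-checked outside SystemML. The following is a minimal sketch, assuming the statistic is 1 - ss_res / ss_avg_tot with ss_avg_tot = ss_tot - n * avg_tot^2, which is how the explain output of the script appears to compute it. The function name plain_r2 and the sample values are hypothetical.

```python
def plain_r2(ys, preds):
    """Compute R^2 as 1 - ss_res / ss_avg_tot.

    ss_avg_tot is the sum of squares of y around its mean,
    computed as ss_tot - n * avg_tot**2.
    """
    n = len(ys)
    avg_tot = sum(ys) / n
    ss_tot = sum(y * y for y in ys)
    ss_avg_tot = ss_tot - n * avg_tot ** 2
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return 1.0 - ss_res / ss_avg_tot


# Tiny illustrative data set (not the diabetes data).
targets = [1.0, 2.0, 3.0, 4.0]
r2_perfect = plain_r2(targets, targets)
r2_mean_only = plain_r2(targets, [2.5, 2.5, 2.5, 2.5])
```

Feeding the training targets and the model's training predictions into such a function gives a quick way to compare the score of the 0.12 and 0.13 models independently of the statistics printed by the DML script.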
output of explain from python test:
PROGRAM ( size CP/SP = 372/0 ) --MAIN PROGRAM ----GENERIC (lines 1-112) [recompile=false] ------CP print BEGIN LINEAR REGRESSION SCRIPT.SCALAR.STRING.true _Var29.SCALAR.STRING ------CP print Reading X and Y....SCALAR.STRING.true _Var30.SCALAR.STRING ------CP createvar _mVar31 scratch_space//_p17177_10.168.31.80//_t0/temp3 true MATRIX binaryblock 422 1 1000 1000 422 copy ------CP rand 422 1 1000 1000 1.0 1.0 1.0 -1 uniform 1.0 48 _mVar31.MATRIX.DOUBLE ------CP assignvar .SCALAR.STRING.true fileB.SCALAR.STRING ------CP assignvar .SCALAR.STRING.true fileO.SCALAR.STRING ------CP assignvar .SCALAR.STRING.true fileLog.SCALAR.STRING ------CP assignvar binary.SCALAR.STRING.true fmtB.SCALAR.STRING ------CP assignvar 1.0.SCALAR.DOUBLE.true intercept_status.SCALAR.DOUBLE ------CP assignvar 1.0E-6.SCALAR.DOUBLE.true tolerance.SCALAR.DOUBLE ------CP assignvar 100.0.SCALAR.DOUBLE.true max_iteration.SCALAR.DOUBLE ------CP assignvar 1.0.SCALAR.DOUBLE.true regularization.SCALAR.DOUBLE ------CP assignvar 422.SCALAR.INT.true n.SCALAR.INT ------CP assignvar 1.SCALAR.INT.true m.SCALAR.INT ------CP assignvar 1.SCALAR.INT.true m_ext.SCALAR.INT ------CP rmvar _Var29 ------CP rmvar _Var30 ------CP cpvar _mVar31 ones_n ------CP rmvar _mVar31 ----GENERIC (lines 115-116) [recompile=false] ------CP createvar _mVar32 scratch_space//_p17177_10.168.31.80//_t0/temp4 true MATRIX binaryblock 422 2 1000 1000 845 copy ------CP append X.MATRIX.DOUBLE ones_n.MATRIX.DOUBLE 1.SCALAR.INT.true _mVar32.MATRIX.DOUBLE true ------CP rmvar X ------CP assignvar 2.SCALAR.INT.true m_ext.SCALAR.INT ------CP cpvar _mVar32 X ------CP rmvar _mVar32 ------CP rmvar ones_n ----GENERIC (lines 119-119) [recompile=false] ------CP createvar _mVar33 scratch_space//_p17177_10.168.31.80//_t0/temp5 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP rand 2 1 1000 1000 1.0 1.0 1.0 -1 uniform 1.0 48 _mVar33.MATRIX.DOUBLE ------CP cpvar _mVar33 scale_lambda ------CP rmvar _mVar33 ----GENERIC (lines 122-122) [recompile=false] 
------CP createvar _mVar34 scratch_space//_p17177_10.168.31.80//_t0/temp6 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP leftIndex scale_lambda.MATRIX.DOUBLE 0.SCALAR.INT.true m_ext.SCALAR.INT.false m_ext.SCALAR.INT.false 1.SCALAR.INT.true 1.SCALAR.INT .true _mVar34.MATRIX.DOUBLE ------CP rmvar scale_lambda ------CP cpvar _mVar34 scale_lambda ------CP rmvar _mVar34 ----GENERIC (lines 135-136) [recompile=false] ------CP createvar _mVar35 scratch_space//_p17177_10.168.31.80//_t0/temp7 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP rand 2 1 1000 1000 1.0 1.0 1.0 -1 uniform 1.0 48 _mVar35.MATRIX.DOUBLE ------CP createvar _mVar36 scratch_space//_p17177_10.168.31.80//_t0/temp8 true MATRIX binaryblock 2 1 1000 1000 0 copy ------CP rand 2 1 1000 1000 0.0 0.0 1.0 -1 uniform 1.0 48 _mVar36.MATRIX.DOUBLE ------CP cpvar _mVar35 scale_X ------CP cpvar _mVar36 shift_X ------CP rmvar _mVar35 ------CP rmvar _mVar36 ----GENERIC (lines 151-152) [recompile=false] ------CP createvar _mVar37 scratch_space//_p17177_10.168.31.80//_t0/temp9 true MATRIX binaryblock 2 1 1000 1000 0 copy ------CP rand 2 1 1000 1000 0.0 0.0 1.0 -1 uniform 1.0 48 _mVar37.MATRIX.DOUBLE ------CP cpvar scale_lambda lambda ------CP cpvar _mVar37 beta_unscaled ------CP rmvar _mVar37 ------CP rmvar regularization ------CP rmvar scale_lambda ----GENERIC (lines 157-162) [recompile=false] ------CP print Running the CG algorithm....SCALAR.STRING.true _Var38.SCALAR.STRING ------CP createvar _mVar39 scratch_space//_p17177_10.168.31.80//_t0/temp10 true MATRIX binaryblock 1 422 1000 1000 422 copy ------CP r' y.MATRIX.DOUBLE _mVar39.MATRIX.DOUBLE 48 ------CP createvar _mVar40 scratch_space//_p17177_10.168.31.80//_t0/temp11 true MATRIX binaryblock 1 2 1000 1000 -1 copy ------CP ba+* _mVar39.MATRIX.DOUBLE X.MATRIX.DOUBLE _mVar40.MATRIX.DOUBLE 48 ------CP rmvar _mVar39 ------CP createvar _mVar41 scratch_space//_p17177_10.168.31.80//_t0/temp12 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP r' 
_mVar40.MATRIX.DOUBLE _mVar41.MATRIX.DOUBLE 48 ------CP rmvar _mVar40 ------CP createvar _mVar42 scratch_space//_p17177_10.168.31.80//_t0/temp13 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP - 0.SCALAR.INT.true _mVar41.MATRIX.DOUBLE _mVar42.MATRIX.DOUBLE ------CP rmvar _mVar41 ------CP assignvar 0.SCALAR.INT.true i.SCALAR.INT ------CP rmvar _Var38 ------CP cpvar _mVar42 r ------CP rmvar _mVar42 ----GENERIC (lines 168-174) [recompile=false] ------CP createvar _mVar43 scratch_space//_p17177_10.168.31.80//_t0/temp14 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP - 0.SCALAR.INT.true r.MATRIX.DOUBLE _mVar43.MATRIX.DOUBLE ------CP createvar _mVar44 scratch_space//_p17177_10.168.31.80//_t0/temp15 true MATRIX binaryblock 1 1 1000 1000 -1 copy ------CP tsmm r.MATRIX.DOUBLE _mVar44.MATRIX.DOUBLE LEFT 48 ------CP castdts _mVar44.MATRIX.DOUBLE.false _Var45.SCALAR.DOUBLE ------CP rmvar _mVar44 ------CP * _Var45.SCALAR.DOUBLE.false 1.0E-12.SCALAR.DOUBLE.true _Var46.SCALAR.DOUBLE ------CP sqrt _Var45.SCALAR.DOUBLE.false _Var47.SCALAR.DOUBLE ------CP + ||r|| initial value = .SCALAR.STRING.true _Var47.SCALAR.DOUBLE.false _Var48.SCALAR.STRING ------CP sqrt _Var46.SCALAR.DOUBLE.false _Var49.SCALAR.DOUBLE ------CP + CG_RESIDUAL_NORM,0,.SCALAR.STRING.true _Var47.SCALAR.DOUBLE.false _Var50.SCALAR.STRING ------CP rmvar _Var47 ------CP + _Var48.SCALAR.STRING.false , target value = .SCALAR.STRING.true _Var51.SCALAR.STRING ------CP rmvar _Var48 ------CP append _Var50.SCALAR.STRING.false CG_RESIDUAL_RATIO,0,1.0.SCALAR.STRING.true -1.SCALAR.INT.true _Var52.SCALAR.STRING true ------CP rmvar _Var50 ------CP + _Var51.SCALAR.STRING.false _Var49.SCALAR.DOUBLE.false _Var53.SCALAR.STRING ------CP rmvar _Var51 ------CP rmvar _Var49 ------CP print _Var53.SCALAR.STRING.false _Var54.SCALAR.STRING ------CP rmvar _Var53 ------CP assignvar _Var45.SCALAR.DOUBLE.false norm_r2.SCALAR.DOUBLE ------CP assignvar _Var45.SCALAR.DOUBLE.false norm_r2_initial.SCALAR.DOUBLE ------CP assignvar 
_Var46.SCALAR.DOUBLE.false norm_r2_target.SCALAR.DOUBLE ------CP assignvar _Var52.SCALAR.STRING.false log_str.SCALAR.STRING ------CP cpvar _mVar43 p ------CP rmvar _Var45 ------CP rmvar _Var46 ------CP rmvar _Var52 ------CP rmvar _Var54 ------CP rmvar _mVar43 ------CP rmvar tolerance ----GENERIC (lines 176-202) [recompile=false] ----WHILE (lines 176-202) ------CP < i.SCALAR.INT.false max_iteration.SCALAR.INT.false _Var55.SCALAR.BOOLEAN ------CP > norm_r2.SCALAR.DOUBLE.false norm_r2_target.SCALAR.DOUBLE.false _Var56.SCALAR.BOOLEAN ------CP && _Var55.SCALAR.BOOLEAN.false _Var56.SCALAR.BOOLEAN.false _Var57.SCALAR.BOOLEAN ------CP rmvar _Var55 ------CP rmvar _Var56 ------CP rmvar _Var57 ------GENERIC (lines 182-182) [recompile=false] --------CP cpvar p ssX_p ------GENERIC (lines 185-185) [recompile=false] --------CP createvar _mVar58 scratch_space//_p17177_10.168.31.80//_t0/temp16 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP mmchain X.MATRIX.DOUBLE ssX_p.MATRIX.DOUBLE _mVar58.MATRIX.DOUBLE XtXv 48 --------CP cpvar _mVar58 q --------CP rmvar _mVar58 --------CP rmvar ssX_p ------GENERIC (lines 191-201) [recompile=false] --------CP createvar _mVar59 scratch_space//_p17177_10.168.31.80//_t0/temp17 true MATRIX binaryblock 1 2 1000 1000 -1 copy --------CP r' p.MATRIX.DOUBLE _mVar59.MATRIX.DOUBLE 48 --------CP createvar _mVar60 scratch_space//_p17177_10.168.31.80//_t0/temp18 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP * lambda.MATRIX.DOUBLE p.MATRIX.DOUBLE _mVar60.MATRIX.DOUBLE --------CP + i.SCALAR.INT.false 1.SCALAR.INT.true _Var61.SCALAR.INT --------CP createvar _mVar62 scratch_space//_p17177_10.168.31.80//_t0/temp19 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP + q.MATRIX.DOUBLE _mVar60.MATRIX.DOUBLE _mVar62.MATRIX.DOUBLE --------CP rmvar _mVar60 --------CP + Iteration .SCALAR.STRING.true _Var61.SCALAR.INT.false _Var63.SCALAR.STRING --------CP + CG_RESIDUAL_NORM,.SCALAR.STRING.true _Var61.SCALAR.INT.false _Var64.SCALAR.STRING --------CP 
+ CG_RESIDUAL_RATIO,.SCALAR.STRING.true _Var61.SCALAR.INT.false _Var65.SCALAR.STRING --------CP createvar _mVar66 scratch_space//_p17177_10.168.31.80//_t0/temp20 true MATRIX binaryblock 1 1 1000 1000 -1 copy --------CP ba+* _mVar59.MATRIX.DOUBLE _mVar62.MATRIX.DOUBLE _mVar66.MATRIX.DOUBLE 48 --------CP rmvar _mVar59 --------CP + _Var63.SCALAR.STRING.false : ||r|| / ||r init|| = .SCALAR.STRING.true _Var67.SCALAR.STRING --------CP rmvar _Var63 --------CP + _Var64.SCALAR.STRING.false ,.SCALAR.STRING.true _Var68.SCALAR.STRING --------CP rmvar _Var64 --------CP + _Var65.SCALAR.STRING.false ,.SCALAR.STRING.true _Var69.SCALAR.STRING --------CP rmvar _Var65 --------CP castdts _mVar66.MATRIX.DOUBLE.false _Var70.SCALAR.DOUBLE --------CP rmvar _mVar66 --------CP / norm_r2.SCALAR.DOUBLE.false _Var70.SCALAR.DOUBLE.false _Var71.SCALAR.DOUBLE --------CP rmvar _Var70 --------CP createvar _mVar72 scratch_space//_p17177_10.168.31.80//_t0/temp21 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP +* beta_unscaled.MATRIX.DOUBLE _Var71.SCALAR.DOUBLE.false p.MATRIX.DOUBLE _mVar72.MATRIX.DOUBLE --------CP createvar _mVar73 scratch_space//_p17177_10.168.31.80//_t0/temp22 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP +* r.MATRIX.DOUBLE _Var71.SCALAR.DOUBLE.false _mVar62.MATRIX.DOUBLE _mVar73.MATRIX.DOUBLE --------CP rmvar _Var71 --------CP rmvar _mVar62 --------CP createvar _mVar74 scratch_space//_p17177_10.168.31.80//_t0/temp23 true MATRIX binaryblock 1 1 1000 1000 -1 copy --------CP tsmm _mVar73.MATRIX.DOUBLE _mVar74.MATRIX.DOUBLE LEFT 48 --------CP createvar _mVar75 scratch_space//_p17177_10.168.31.80//_t0/temp24 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP - 0.SCALAR.INT.true _mVar73.MATRIX.DOUBLE _mVar75.MATRIX.DOUBLE --------CP castdts _mVar74.MATRIX.DOUBLE.false _Var76.SCALAR.DOUBLE --------CP rmvar _mVar74 --------CP / _Var76.SCALAR.DOUBLE.false norm_r2.SCALAR.DOUBLE.false _Var77.SCALAR.DOUBLE --------CP / _Var76.SCALAR.DOUBLE.false 
norm_r2_initial.SCALAR.DOUBLE.false _Var78.SCALAR.DOUBLE --------CP sqrt _Var76.SCALAR.DOUBLE.false _Var79.SCALAR.DOUBLE --------CP createvar _mVar80 scratch_space//_p17177_10.168.31.80//_t0/temp25 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP +* _mVar75.MATRIX.DOUBLE _Var77.SCALAR.DOUBLE.false p.MATRIX.DOUBLE _mVar80.MATRIX.DOUBLE --------CP rmvar _mVar75 --------CP rmvar _Var77 --------CP sqrt _Var78.SCALAR.DOUBLE.false _Var81.SCALAR.DOUBLE --------CP rmvar _Var78 --------CP + _Var68.SCALAR.STRING.false _Var79.SCALAR.DOUBLE.false _Var82.SCALAR.STRING --------CP rmvar _Var68 --------CP rmvar _Var79 --------CP + _Var67.SCALAR.STRING.false _Var81.SCALAR.DOUBLE.false _Var83.SCALAR.STRING --------CP rmvar _Var67 --------CP append log_str.SCALAR.STRING.false _Var82.SCALAR.STRING.false -1.SCALAR.INT.true _Var84.SCALAR.STRING true --------CP rmvar _Var82 --------CP + _Var69.SCALAR.STRING.false _Var81.SCALAR.DOUBLE.false _Var85.SCALAR.STRING --------CP rmvar _Var69 --------CP rmvar _Var81 --------CP print _Var83.SCALAR.STRING.false _Var86.SCALAR.STRING --------CP rmvar _Var83 --------CP append _Var84.SCALAR.STRING.false _Var85.SCALAR.STRING.false -1.SCALAR.INT.true _Var87.SCALAR.STRING true --------CP rmvar _Var84 --------CP rmvar _Var85 --------CP rmvar p --------CP rmvar r --------CP rmvar beta_unscaled --------CP assignvar _Var61.SCALAR.INT.false i.SCALAR.INT --------CP assignvar _Var76.SCALAR.DOUBLE.false norm_r2.SCALAR.DOUBLE --------CP assignvar _Var87.SCALAR.STRING.false log_str.SCALAR.STRING --------CP rmvar _Var61 --------CP cpvar _mVar72 beta_unscaled --------CP cpvar _mVar73 r --------CP rmvar _Var76 --------CP cpvar _mVar80 p --------CP rmvar _Var86 --------CP rmvar _Var87 --------CP rmvar _mVar72 --------CP rmvar _mVar73 --------CP rmvar _mVar80 --------CP rmvar q ----IF (lines 204-206) ------CP >= i.SCALAR.INT.false max_iteration.SCALAR.INT.false _Var88.SCALAR.BOOLEAN ------CP rmvar _Var88 ------GENERIC (lines 205-205) [recompile=false] --------CP 
print Warning: the maximum number of iterations has been reached..SCALAR.STRING.true _Var89.SCALAR.STRING --------CP rmvar _Var89 ----GENERIC (lines 207-207) [recompile=false] ------CP print The CG algorithm is done..SCALAR.STRING.true _Var90.SCALAR.STRING ------CP rmvar _Var90 ----GENERIC (lines 214-214) [recompile=false] ------CP cpvar beta_unscaled beta ----GENERIC (lines 217-228) [recompile=false] ------CP print Computing the statistics....SCALAR.STRING.true _Var91.SCALAR.STRING ------CP uak+ y.MATRIX.DOUBLE _Var92.SCALAR.DOUBLE 48 ------CP createvar _mVar93 scratch_space//_p17177_10.168.31.80//_t0/temp26 true MATRIX binaryblock 1 1 1000 1000 -1 copy ------CP tsmm y.MATRIX.DOUBLE _mVar93.MATRIX.DOUBLE LEFT 48 ------CP createvar _mVar94 scratch_space//_p17177_10.168.31.80//_t0/temp27 true MATRIX binaryblock 422 1 1000 1000 -1 copy ------CP ba+* X.MATRIX.DOUBLE beta.MATRIX.DOUBLE _mVar94.MATRIX.DOUBLE 48 ------CP / _Var92.SCALAR.DOUBLE.false 422.SCALAR.INT.true _Var95.SCALAR.DOUBLE ------CP rmvar _Var92 ------CP castdts _mVar93.MATRIX.DOUBLE.false _Var96.SCALAR.DOUBLE ------CP rmvar _mVar93 ------CP createvar _mVar97 scratch_space//_p17177_10.168.31.80//_t0/temp28 true MATRIX binaryblock 422 1 1000 1000 -1 copy ------CP - y.MATRIX.DOUBLE _mVar94.MATRIX.DOUBLE _mVar97.MATRIX.DOUBLE ------CP rmvar _mVar94 ------CP ^ _Var95.SCALAR.DOUBLE.false 2.SCALAR.INT.true _Var98.SCALAR.DOUBLE ------CP uak+ _mVar97.MATRIX.DOUBLE _Var99.SCALAR.DOUBLE 48 ------CP createvar _mVar100 scratch_space//_p17177_10.168.31.80//_t0/temp29 true MATRIX binaryblock 1 1 1000 1000 -1 copy ------CP tsmm _mVar97.MATRIX.DOUBLE _mVar100.MATRIX.DOUBLE LEFT 48 ------CP rmvar _mVar97 ------CP * 422.SCALAR.INT.true _Var98.SCALAR.DOUBLE.false _Var101.SCALAR.DOUBLE ------CP rmvar _Var98 ------CP / _Var99.SCALAR.DOUBLE.false 422.SCALAR.INT.true _Var102.SCALAR.DOUBLE ------CP rmvar _Var99 ------CP castdts _mVar100.MATRIX.DOUBLE.false _Var103.SCALAR.DOUBLE ------CP rmvar _mVar100 ------CP - 
_Var96.SCALAR.DOUBLE.false _Var101.SCALAR.DOUBLE.false _Var104.SCALAR.DOUBLE ------CP rmvar _Var101 ------CP ^ _Var102.SCALAR.DOUBLE.false 2.SCALAR.INT.true _Var105.SCALAR.DOUBLE ------CP / _Var104.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var106.SCALAR.DOUBLE ------CP * 422.SCALAR.INT.true _Var105.SCALAR.DOUBLE.false _Var107.SCALAR.DOUBLE ------CP rmvar _Var105 ------CP / _Var103.SCALAR.DOUBLE.false _Var104.SCALAR.DOUBLE.false _Var108.SCALAR.DOUBLE ------CP - _Var103.SCALAR.DOUBLE.false _Var107.SCALAR.DOUBLE.false _Var109.SCALAR.DOUBLE ------CP rmvar _Var107 ------CP - 1.SCALAR.INT.true _Var108.SCALAR.DOUBLE.false _Var110.SCALAR.DOUBLE ------CP rmvar _Var108 ------CP assignvar _Var95.SCALAR.DOUBLE.false avg_tot.SCALAR.DOUBLE ------CP assignvar _Var96.SCALAR.DOUBLE.false ss_tot.SCALAR.DOUBLE ------CP assignvar _Var102.SCALAR.DOUBLE.false avg_res.SCALAR.DOUBLE ------CP assignvar _Var103.SCALAR.DOUBLE.false ss_res.SCALAR.DOUBLE ------CP assignvar _Var104.SCALAR.DOUBLE.false ss_avg_tot.SCALAR.DOUBLE ------CP assignvar _Var106.SCALAR.DOUBLE.false var_tot.SCALAR.DOUBLE ------CP assignvar _Var109.SCALAR.DOUBLE.false ss_avg_res.SCALAR.DOUBLE ------CP assignvar _Var110.SCALAR.DOUBLE.false plain_R2.SCALAR.DOUBLE ------CP rmvar _Var91 ------CP rmvar _Var95 ------CP rmvar _Var96 ------CP rmvar _Var102 ------CP rmvar _Var103 ------CP rmvar _Var104 ------CP rmvar _Var106 ------CP rmvar _Var109 ------CP rmvar _Var110 ------CP rmvar X ------CP rmvar y ----IF (lines 229-235) ------CP > 422.SCALAR.INT.true m_ext.SCALAR.INT.false _Var111.SCALAR.BOOLEAN ------CP rmvar _Var111 ------GENERIC (lines 230-231) [recompile=false] --------CP - 422.SCALAR.INT.true m_ext.SCALAR.INT.false _Var112.SCALAR.INT --------CP / ss_avg_tot.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var113.SCALAR.DOUBLE --------CP / ss_res.SCALAR.DOUBLE.false _Var112.SCALAR.INT.false _Var114.SCALAR.DOUBLE --------CP rmvar _Var112 --------CP / _Var114.SCALAR.DOUBLE.false _Var113.SCALAR.DOUBLE.false _Var115.SCALAR.DOUBLE 
--------CP rmvar _Var113 --------CP - 1.SCALAR.INT.true _Var115.SCALAR.DOUBLE.false _Var116.SCALAR.DOUBLE --------CP rmvar _Var115 --------CP assignvar _Var114.SCALAR.DOUBLE.false dispersion.SCALAR.DOUBLE --------CP assignvar _Var116.SCALAR.DOUBLE.false adjusted_R2.SCALAR.DOUBLE --------CP rmvar _Var114 --------CP rmvar _Var116 --------CP rmvar m_ext ----ELSE ------GENERIC (lines 233-234) [recompile=false] --------CP assignvar NaN.SCALAR.DOUBLE.true dispersion.SCALAR.DOUBLE --------CP assignvar NaN.SCALAR.DOUBLE.true adjusted_R2.SCALAR.DOUBLE ----GENERIC (lines 237-238) [recompile=false] ------CP / ss_avg_res.SCALAR.DOUBLE.false ss_avg_tot.SCALAR.DOUBLE.false _Var117.SCALAR.DOUBLE ------CP - 1.SCALAR.INT.true _Var117.SCALAR.DOUBLE.false _Var118.SCALAR.DOUBLE ------CP rmvar _Var117 ------CP assignvar 420.SCALAR.INT.true deg_freedom.SCALAR.INT ------CP assignvar _Var118.SCALAR.DOUBLE.false plain_R2_nobias.SCALAR.DOUBLE ------CP rmvar _Var118 ----IF (lines 239-246) ------CP > deg_freedom.SCALAR.INT.false 0.SCALAR.INT.true _Var119.SCALAR.BOOLEAN ------CP rmvar _Var119 ------GENERIC (lines 240-241) [recompile=false] --------CP / ss_avg_res.SCALAR.DOUBLE.false deg_freedom.SCALAR.INT.false _Var120.SCALAR.DOUBLE --------CP / ss_avg_tot.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var121.SCALAR.DOUBLE --------CP / _Var120.SCALAR.DOUBLE.false _Var121.SCALAR.DOUBLE.false _Var122.SCALAR.DOUBLE --------CP rmvar _Var121 --------CP - 1.SCALAR.INT.true _Var122.SCALAR.DOUBLE.false _Var123.SCALAR.DOUBLE --------CP rmvar _Var122 --------CP assignvar _Var120.SCALAR.DOUBLE.false var_res.SCALAR.DOUBLE --------CP assignvar _Var123.SCALAR.DOUBLE.false adjusted_R2_nobias.SCALAR.DOUBLE --------CP rmvar _Var120 --------CP rmvar _Var123 --------CP rmvar ss_avg_res --------CP rmvar ss_avg_tot --------CP rmvar deg_freedom ----ELSE ------GENERIC (lines 243-245) [recompile=false] --------CP print Warning: zero or negative number of degrees of freedom..SCALAR.STRING.true _Var124.SCALAR.STRING 
--------CP assignvar NaN.SCALAR.DOUBLE.true var_res.SCALAR.DOUBLE --------CP assignvar NaN.SCALAR.DOUBLE.true adjusted_R2_nobias.SCALAR.DOUBLE --------CP rmvar _Var124 ----GENERIC (lines 248-248) [recompile=false] ------CP / ss_res.SCALAR.DOUBLE.false ss_tot.SCALAR.DOUBLE.false _Var125.SCALAR.DOUBLE ------CP - 1.SCALAR.INT.true _Var125.SCALAR.DOUBLE.false _Var126.SCALAR.DOUBLE ------CP rmvar _Var125 ------CP assignvar _Var126.SCALAR.DOUBLE.false plain_R2_vs_0.SCALAR.DOUBLE ------CP rmvar _Var126 ----GENERIC (lines 250-250) [recompile=false] ------CP / ss_res.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var127.SCALAR.DOUBLE ------CP / ss_tot.SCALAR.DOUBLE.false 422.SCALAR.INT.true _Var128.SCALAR.DOUBLE ------CP / _Var127.SCALAR.DOUBLE.false _Var128.SCALAR.DOUBLE.false _Var129.SCALAR.DOUBLE ------CP rmvar _Var127 ------CP rmvar _Var128 ------CP - 1.SCALAR.INT.true _Var129.SCALAR.DOUBLE.false _Var130.SCALAR.DOUBLE ------CP rmvar _Var129 ------CP assignvar _Var130.SCALAR.DOUBLE.false adjusted_R2_vs_0.SCALAR.DOUBLE ------CP rmvar _Var130 ------CP rmvar ss_tot ------CP rmvar m ------CP rmvar n ------CP rmvar ss_res ----GENERIC (lines 255-265) [recompile=false] ------CP toString target=beta _Var131.SCALAR.STRING ------CP + AVG_TOT_Y,.SCALAR.STRING.true avg_tot.SCALAR.DOUBLE.false _Var132.SCALAR.STRING ------CP sqrt var_tot.SCALAR.DOUBLE.false _Var133.SCALAR.DOUBLE ------CP + AVG_RES_Y,.SCALAR.STRING.true avg_res.SCALAR.DOUBLE.false _Var134.SCALAR.STRING ------CP sqrt var_res.SCALAR.DOUBLE.false _Var135.SCALAR.DOUBLE ------CP + DISPERSION,.SCALAR.STRING.true dispersion.SCALAR.DOUBLE.false _Var136.SCALAR.STRING ------CP + PLAIN_R2,.SCALAR.STRING.true plain_R2.SCALAR.DOUBLE.false _Var137.SCALAR.STRING ------CP + ADJUSTED_R2,.SCALAR.STRING.true adjusted_R2.SCALAR.DOUBLE.false _Var138.SCALAR.STRING ------CP + PLAIN_R2_NOBIAS,.SCALAR.STRING.true plain_R2_nobias.SCALAR.DOUBLE.false _Var139.SCALAR.STRING ------CP + ADJUSTED_R2_NOBIAS,.SCALAR.STRING.true 
adjusted_R2_nobias.SCALAR.DOUBLE.false _Var140.SCALAR.STRING ------CP print _Var131.SCALAR.STRING.false _Var141.SCALAR.STRING ------CP rmvar _Var131 ------CP + STDEV_TOT_Y,.SCALAR.STRING.true _Var133.SCALAR.DOUBLE.false _Var142.SCALAR.STRING ------CP rmvar _Var133 ------CP + STDEV_RES_Y,.SCALAR.STRING.true _Var135.SCALAR.DOUBLE.false _Var143.SCALAR.STRING ------CP rmvar _Var135 ------CP append _Var132.SCALAR.STRING.false _Var142.SCALAR.STRING.false -1.SCALAR.INT.true _Var144.SCALAR.STRING true ------CP rmvar _Var132 ------CP rmvar _Var142 ------CP append _Var144.SCALAR.STRING.false _Var134.SCALAR.STRING.false -1.SCALAR.INT.true _Var145.SCALAR.STRING true ------CP rmvar _Var144 ------CP rmvar _Var134 ------CP append _Var145.SCALAR.STRING.false _Var143.SCALAR.STRING.false -1.SCALAR.INT.true _Var146.SCALAR.STRING true ------CP rmvar _Var145 ------CP rmvar _Var143 ------CP append _Var146.SCALAR.STRING.false _Var136.SCALAR.STRING.false -1.SCALAR.INT.true _Var147.SCALAR.STRING true ------CP rmvar _Var146 ------CP rmvar _Var136 ------CP append _Var147.SCALAR.STRING.false _Var137.SCALAR.STRING.false -1.SCALAR.INT.true _Var148.SCALAR.STRING true ------CP rmvar _Var147 ------CP rmvar _Var137 ------CP append _Var148.SCALAR.STRING.false _Var138.SCALAR.STRING.false -1.SCALAR.INT.true _Var149.SCALAR.STRING true ------CP rmvar _Var148 ------CP rmvar _Var138 ------CP append _Var149.SCALAR.STRING.false _Var139.SCALAR.STRING.false -1.SCALAR.INT.true _Var150.SCALAR.STRING true ------CP rmvar _Var149 ------CP rmvar _Var139 ------CP append _Var150.SCALAR.STRING.false _Var140.SCALAR.STRING.false -1.SCALAR.INT.true _Var151.SCALAR.STRING true ------CP rmvar _Var150 ------CP rmvar _Var140 ------CP assignvar _Var151.SCALAR.STRING.false str.SCALAR.STRING ------CP rmvar _Var141 ------CP rmvar _Var151 ------CP rmvar avg_res ------CP rmvar adjusted_R2 ------CP rmvar plain_R2_nobias ------CP rmvar var_res ------CP rmvar plain_R2 ------CP rmvar var_tot ------CP rmvar adjusted_R2_nobias ------CP 
rmvar avg_tot ------CP rmvar dispersion ----GENERIC (lines 274-274) [recompile=false] ------CP print str.SCALAR.STRING.false _Var152.SCALAR.STRING ------CP rmvar _Var152 ------CP rmvar str ----GENERIC (lines 278-278) [recompile=false] ------CP print Writing the output matrix....SCALAR.STRING.true _Var153.SCALAR.STRING ------CP rmvar _Var153 ----GENERIC (lines 283-283) [recompile=false] ------CP cpvar beta beta_out ------CP rmvar beta ----GENERIC (lines 285-285) [recompile=false] ------CP rmvar fileB ------CP rmvar fmtB ----GENERIC (lines 290-291) [recompile=false] ------CP print END LINEAR REGRESSION SCRIPT.SCALAR.STRING.true _Var154.SCALAR.STRING ------CP rmvar _Var154 ------CP rmvar beta_out
Output of explain from LinearRegCG.dml:
17/02/15 12:10:52 INFO api.DMLScript: EXPLAIN (RUNTIME): # Memory Budget local/remote = 3823MB/?MB/?MB/?MB # Degree of Parallelism (vcores) local/remote = 48/? PROGRAM ( size CP/SP = 385/2 ) --MAIN PROGRAM ----GENERIC (lines 86-110) [recompile=false] ------CP createvar pREADX /user/iyounus/data/diabetes_X_train.txt false MATRIX csv 422 1 -1 -1 -1 copy false , true 0.0 ------CP createvar pREADy /user/iyounus/data/diabetes_y_train.txt false MATRIX csv 422 1 -1 -1 -1 copy false , true 0.0 ------CP print BEGIN LINEAR REGRESSION SCRIPT.SCALAR.STRING.true _Var29.SCALAR.STRING ------CP print Reading X and Y....SCALAR.STRING.true _Var30.SCALAR.STRING ------CP createvar _mVar31 scratch_space//_p15212_10.168.31.80//_t0/temp1 true MATRIX binaryblock 422 1 1000 1000 422 copy ------CP rand 422 1 1000 1000 1.0 1.0 1.0 -1 uniform 1.0 48 _mVar31.MATRIX.DOUBLE ------CP createvar _mVar32 scratch_space//_p15212_10.168.31.80//_t0/temp2 true MATRIX binaryblock 422 1 1000 1000 -1 copy ------SPARK csvrblk pREADX.MATRIX.DOUBLE _mVar32.MATRIX.DOUBLE 1000 1000 false , true 0.0 ------CP createvar _mVar33 scratch_space//_p15212_10.168.31.80//_t0/temp3 true MATRIX binaryblock 422 1 1000 1000 -1 copy ------SPARK csvrblk pREADy.MATRIX.DOUBLE _mVar33.MATRIX.DOUBLE 1000 1000 false , true 0.0 ------CP assignvar beta.txt.SCALAR.STRING.true fileB.SCALAR.STRING ------CP assignvar .SCALAR.STRING.true fileO.SCALAR.STRING ------CP assignvar .SCALAR.STRING.true fileLog.SCALAR.STRING ------CP assignvar text.SCALAR.STRING.true fmtB.SCALAR.STRING ------CP assignvar 1.SCALAR.INT.true intercept_status.SCALAR.INT ------CP assignvar 1.0E-6.SCALAR.DOUBLE.true tolerance.SCALAR.DOUBLE ------CP assignvar 0.SCALAR.INT.true max_iteration.SCALAR.INT ------CP assignvar 1.0E-6.SCALAR.DOUBLE.true regularization.SCALAR.DOUBLE ------CP assignvar 422.SCALAR.INT.true n.SCALAR.INT ------CP assignvar 1.SCALAR.INT.true m.SCALAR.INT ------CP assignvar 1.SCALAR.INT.true m_ext.SCALAR.INT ------CP rmvar _Var29 ------CP rmvar _Var30 
------CP cpvar _mVar31 ones_n ------CP cpvar _mVar32 X ------CP cpvar _mVar33 y ------CP rmvar _mVar31 ------CP rmvar _mVar32 ------CP rmvar _mVar33 ----GENERIC (lines 113-114) [recompile=false] ------CP createvar _mVar34 scratch_space//_p15212_10.168.31.80//_t0/temp4 true MATRIX binaryblock 422 2 1000 1000 -1 copy ------CP append X.MATRIX.DOUBLE ones_n.MATRIX.DOUBLE 1.SCALAR.INT.true _mVar34.MATRIX.DOUBLE true ------CP rmvar X ------CP assignvar 2.SCALAR.INT.true m_ext.SCALAR.INT ------CP cpvar _mVar34 X ------CP rmvar _mVar34 ------CP rmvar ones_n ----GENERIC (lines 117-117) [recompile=false] ------CP createvar _mVar35 scratch_space//_p15212_10.168.31.80//_t0/temp5 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP rand 2 1 1000 1000 1.0 1.0 1.0 -1 uniform 1.0 48 _mVar35.MATRIX.DOUBLE ------CP cpvar _mVar35 scale_lambda ------CP rmvar _mVar35 ----GENERIC (lines 120-120) [recompile=false] ------CP createvar _mVar36 scratch_space//_p15212_10.168.31.80//_t0/temp6 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP leftIndex scale_lambda.MATRIX.DOUBLE 0.SCALAR.INT.true m_ext.SCALAR.INT.false m_ext.SCALAR.INT.false 1.SCALAR.INT.true 1.SCALAR.INT.true _mVar36.MATRIX.DOUBLE ------CP rmvar scale_lambda ------CP cpvar _mVar36 scale_lambda ------CP rmvar _mVar36 ----GENERIC (lines 133-134) [recompile=false] ------CP createvar _mVar37 scratch_space//_p15212_10.168.31.80//_t0/temp7 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP rand 2 1 1000 1000 1.0 1.0 1.0 -1 uniform 1.0 48 _mVar37.MATRIX.DOUBLE ------CP createvar _mVar38 scratch_space//_p15212_10.168.31.80//_t0/temp8 true MATRIX binaryblock 2 1 1000 1000 0 copy ------CP rand 2 1 1000 1000 0.0 0.0 1.0 -1 uniform 1.0 48 _mVar38.MATRIX.DOUBLE ------CP cpvar _mVar37 scale_X ------CP cpvar _mVar38 shift_X ------CP rmvar _mVar37 ------CP rmvar _mVar38 ----GENERIC (lines 149-150) [recompile=false] ------CP createvar _mVar39 scratch_space//_p15212_10.168.31.80//_t0/temp9 true MATRIX binaryblock 2 1 1000 1000 -1 
copy ------CP * scale_lambda.MATRIX.DOUBLE 1.0E-6.SCALAR.DOUBLE.true _mVar39.MATRIX.DOUBLE ------CP createvar _mVar40 scratch_space//_p15212_10.168.31.80//_t0/temp10 true MATRIX binaryblock 2 1 1000 1000 0 copy ------CP rand 2 1 1000 1000 0.0 0.0 1.0 -1 uniform 1.0 48 _mVar40.MATRIX.DOUBLE ------CP cpvar _mVar39 lambda ------CP cpvar _mVar40 beta_unscaled ------CP rmvar _mVar39 ------CP rmvar _mVar40 ------CP rmvar regularization ------CP rmvar scale_lambda ----GENERIC (lines 153-153) [recompile=false] ------CP assignvar m_ext.SCALAR.INT.false max_iteration.SCALAR.INT ----GENERIC (lines 155-160) [recompile=false] ------CP print Running the CG algorithm....SCALAR.STRING.true _Var41.SCALAR.STRING ------CP createvar _mVar42 scratch_space//_p15212_10.168.31.80//_t0/temp11 true MATRIX binaryblock 1 422 1000 1000 -1 copy ------CP r' y.MATRIX.DOUBLE _mVar42.MATRIX.DOUBLE 48 ------CP createvar _mVar43 scratch_space//_p15212_10.168.31.80//_t0/temp12 true MATRIX binaryblock 1 2 1000 1000 -1 copy ------CP ba+* _mVar42.MATRIX.DOUBLE X.MATRIX.DOUBLE _mVar43.MATRIX.DOUBLE 48 ------CP rmvar _mVar42 ------CP createvar _mVar44 scratch_space//_p15212_10.168.31.80//_t0/temp13 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP r' _mVar43.MATRIX.DOUBLE _mVar44.MATRIX.DOUBLE 48 ------CP rmvar _mVar43 ------CP createvar _mVar45 scratch_space//_p15212_10.168.31.80//_t0/temp14 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP - 0.SCALAR.INT.true _mVar44.MATRIX.DOUBLE _mVar45.MATRIX.DOUBLE ------CP rmvar _mVar44 ------CP assignvar 0.SCALAR.INT.true i.SCALAR.INT ------CP rmvar _Var41 ------CP cpvar _mVar45 r ------CP rmvar _mVar45 ----GENERIC (lines 166-172) [recompile=false] ------CP createvar _mVar46 scratch_space//_p15212_10.168.31.80//_t0/temp15 true MATRIX binaryblock 2 1 1000 1000 -1 copy ------CP - 0.SCALAR.INT.true r.MATRIX.DOUBLE _mVar46.MATRIX.DOUBLE ------CP createvar _mVar47 scratch_space//_p15212_10.168.31.80//_t0/temp16 true MATRIX binaryblock 1 1 1000 1000 -1 copy 
------CP tsmm r.MATRIX.DOUBLE _mVar47.MATRIX.DOUBLE LEFT 48 ------CP castdts _mVar47.MATRIX.DOUBLE.false _Var48.SCALAR.DOUBLE ------CP rmvar _mVar47 ------CP * _Var48.SCALAR.DOUBLE.false 1.0E-12.SCALAR.DOUBLE.true _Var49.SCALAR.DOUBLE ------CP sqrt _Var48.SCALAR.DOUBLE.false _Var50.SCALAR.DOUBLE ------CP + ||r|| initial value = .SCALAR.STRING.true _Var50.SCALAR.DOUBLE.false _Var51.SCALAR.STRING ------CP sqrt _Var49.SCALAR.DOUBLE.false _Var52.SCALAR.DOUBLE ------CP + CG_RESIDUAL_NORM,0,.SCALAR.STRING.true _Var50.SCALAR.DOUBLE.false _Var53.SCALAR.STRING ------CP rmvar _Var50 ------CP + _Var51.SCALAR.STRING.false , target value = .SCALAR.STRING.true _Var54.SCALAR.STRING ------CP rmvar _Var51 ------CP append _Var53.SCALAR.STRING.false CG_RESIDUAL_RATIO,0,1.0.SCALAR.STRING.true -1.SCALAR.INT.true _Var55.SCALAR.STRING true ------CP rmvar _Var53 ------CP + _Var54.SCALAR.STRING.false _Var52.SCALAR.DOUBLE.false _Var56.SCALAR.STRING ------CP rmvar _Var54 ------CP rmvar _Var52 ------CP print _Var56.SCALAR.STRING.false _Var57.SCALAR.STRING ------CP rmvar _Var56 ------CP assignvar _Var48.SCALAR.DOUBLE.false norm_r2.SCALAR.DOUBLE ------CP assignvar _Var48.SCALAR.DOUBLE.false norm_r2_initial.SCALAR.DOUBLE ------CP assignvar _Var49.SCALAR.DOUBLE.false norm_r2_target.SCALAR.DOUBLE ------CP assignvar _Var55.SCALAR.STRING.false log_str.SCALAR.STRING ------CP cpvar _mVar46 p ------CP rmvar _Var48 ------CP rmvar _Var49 ------CP rmvar _Var55 ------CP rmvar _Var57 ------CP rmvar _mVar46 ------CP rmvar tolerance ----GENERIC (lines 174-200) [recompile=false] ----WHILE (lines 174-200) ------CP < i.SCALAR.INT.false max_iteration.SCALAR.INT.false _Var58.SCALAR.BOOLEAN ------CP > norm_r2.SCALAR.DOUBLE.false norm_r2_target.SCALAR.DOUBLE.false _Var59.SCALAR.BOOLEAN ------CP && _Var58.SCALAR.BOOLEAN.false _Var59.SCALAR.BOOLEAN.false _Var60.SCALAR.BOOLEAN ------CP rmvar _Var58 ------CP rmvar _Var59 ------CP rmvar _Var60 ------GENERIC (lines 180-180) [recompile=false] --------CP cpvar p ssX_p 
------GENERIC (lines 183-183) [recompile=false] --------CP createvar _mVar61 scratch_space//_p15212_10.168.31.80//_t0/temp17 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP mmchain X.MATRIX.DOUBLE ssX_p.MATRIX.DOUBLE _mVar61.MATRIX.DOUBLE XtXv 48 --------CP cpvar _mVar61 q --------CP rmvar _mVar61 --------CP rmvar ssX_p ------GENERIC (lines 189-199) [recompile=false] --------CP createvar _mVar62 scratch_space//_p15212_10.168.31.80//_t0/temp18 true MATRIX binaryblock 1 2 1000 1000 -1 copy --------CP r' p.MATRIX.DOUBLE _mVar62.MATRIX.DOUBLE 48 --------CP createvar _mVar63 scratch_space//_p15212_10.168.31.80//_t0/temp19 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP * lambda.MATRIX.DOUBLE p.MATRIX.DOUBLE _mVar63.MATRIX.DOUBLE --------CP + i.SCALAR.INT.false 1.SCALAR.INT.true _Var64.SCALAR.INT --------CP createvar _mVar65 scratch_space//_p15212_10.168.31.80//_t0/temp20 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP + q.MATRIX.DOUBLE _mVar63.MATRIX.DOUBLE _mVar65.MATRIX.DOUBLE --------CP rmvar _mVar63 --------CP + Iteration .SCALAR.STRING.true _Var64.SCALAR.INT.false _Var66.SCALAR.STRING --------CP + CG_RESIDUAL_NORM,.SCALAR.STRING.true _Var64.SCALAR.INT.false _Var67.SCALAR.STRING --------CP + CG_RESIDUAL_RATIO,.SCALAR.STRING.true _Var64.SCALAR.INT.false _Var68.SCALAR.STRING --------CP createvar _mVar69 scratch_space//_p15212_10.168.31.80//_t0/temp21 true MATRIX binaryblock 1 1 1000 1000 -1 copy --------CP ba+* _mVar62.MATRIX.DOUBLE _mVar65.MATRIX.DOUBLE _mVar69.MATRIX.DOUBLE 48 --------CP rmvar _mVar62 --------CP + _Var66.SCALAR.STRING.false : ||r|| / ||r init|| = .SCALAR.STRING.true _Var70.SCALAR.STRING --------CP rmvar _Var66 --------CP + _Var67.SCALAR.STRING.false ,.SCALAR.STRING.true _Var71.SCALAR.STRING --------CP rmvar _Var67 --------CP + _Var68.SCALAR.STRING.false ,.SCALAR.STRING.true _Var72.SCALAR.STRING --------CP rmvar _Var68 --------CP castdts _mVar69.MATRIX.DOUBLE.false _Var73.SCALAR.DOUBLE --------CP rmvar _mVar69 --------CP / 
norm_r2.SCALAR.DOUBLE.false _Var73.SCALAR.DOUBLE.false _Var74.SCALAR.DOUBLE --------CP rmvar _Var73 --------CP createvar _mVar75 scratch_space//_p15212_10.168.31.80//_t0/temp22 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP +* beta_unscaled.MATRIX.DOUBLE _Var74.SCALAR.DOUBLE.false p.MATRIX.DOUBLE _mVar75.MATRIX.DOUBLE --------CP createvar _mVar76 scratch_space//_p15212_10.168.31.80//_t0/temp23 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP +* r.MATRIX.DOUBLE _Var74.SCALAR.DOUBLE.false _mVar65.MATRIX.DOUBLE _mVar76.MATRIX.DOUBLE --------CP rmvar _Var74 --------CP rmvar _mVar65 --------CP createvar _mVar77 scratch_space//_p15212_10.168.31.80//_t0/temp24 true MATRIX binaryblock 1 1 1000 1000 -1 copy --------CP tsmm _mVar76.MATRIX.DOUBLE _mVar77.MATRIX.DOUBLE LEFT 48 --------CP createvar _mVar78 scratch_space//_p15212_10.168.31.80//_t0/temp25 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP - 0.SCALAR.INT.true _mVar76.MATRIX.DOUBLE _mVar78.MATRIX.DOUBLE --------CP castdts _mVar77.MATRIX.DOUBLE.false _Var79.SCALAR.DOUBLE --------CP rmvar _mVar77 --------CP / _Var79.SCALAR.DOUBLE.false norm_r2.SCALAR.DOUBLE.false _Var80.SCALAR.DOUBLE --------CP / _Var79.SCALAR.DOUBLE.false norm_r2_initial.SCALAR.DOUBLE.false _Var81.SCALAR.DOUBLE --------CP sqrt _Var79.SCALAR.DOUBLE.false _Var82.SCALAR.DOUBLE --------CP createvar _mVar83 scratch_space//_p15212_10.168.31.80//_t0/temp26 true MATRIX binaryblock 2 1 1000 1000 -1 copy --------CP +* _mVar78.MATRIX.DOUBLE _Var80.SCALAR.DOUBLE.false p.MATRIX.DOUBLE _mVar83.MATRIX.DOUBLE --------CP rmvar _mVar78 --------CP rmvar _Var80 --------CP sqrt _Var81.SCALAR.DOUBLE.false _Var84.SCALAR.DOUBLE --------CP rmvar _Var81 --------CP + _Var71.SCALAR.STRING.false _Var82.SCALAR.DOUBLE.false _Var85.SCALAR.STRING --------CP rmvar _Var71 --------CP rmvar _Var82 --------CP + _Var70.SCALAR.STRING.false _Var84.SCALAR.DOUBLE.false _Var86.SCALAR.STRING --------CP rmvar _Var70 --------CP append log_str.SCALAR.STRING.false 
_Var85.SCALAR.STRING.false -1.SCALAR.INT.true _Var87.SCALAR.STRING true --------CP rmvar _Var85 --------CP + _Var72.SCALAR.STRING.false _Var84.SCALAR.DOUBLE.false _Var88.SCALAR.STRING --------CP rmvar _Var72 --------CP rmvar _Var84 --------CP print _Var86.SCALAR.STRING.false _Var89.SCALAR.STRING --------CP rmvar _Var86 --------CP append _Var87.SCALAR.STRING.false _Var88.SCALAR.STRING.false -1.SCALAR.INT.true _Var90.SCALAR.STRING true --------CP rmvar _Var87 --------CP rmvar _Var88 --------CP rmvar p --------CP rmvar r --------CP rmvar beta_unscaled --------CP assignvar _Var64.SCALAR.INT.false i.SCALAR.INT --------CP assignvar _Var79.SCALAR.DOUBLE.false norm_r2.SCALAR.DOUBLE --------CP assignvar _Var90.SCALAR.STRING.false log_str.SCALAR.STRING --------CP rmvar _Var64 --------CP cpvar _mVar75 beta_unscaled --------CP cpvar _mVar76 r --------CP rmvar _Var79 --------CP cpvar _mVar83 p --------CP rmvar _Var89 --------CP rmvar _Var90 --------CP rmvar _mVar75 --------CP rmvar _mVar76 --------CP rmvar _mVar83 --------CP rmvar q ----IF (lines 202-204) ------CP >= i.SCALAR.INT.false max_iteration.SCALAR.INT.false _Var91.SCALAR.BOOLEAN ------CP rmvar _Var91 ------GENERIC (lines 203-203) [recompile=false] --------CP print Warning: the maximum number of iterations has been reached..SCALAR.STRING.true _Var92.SCALAR.STRING --------CP rmvar _Var92 ----GENERIC (lines 205-205) [recompile=false] ------CP print The CG algorithm is done..SCALAR.STRING.true _Var93.SCALAR.STRING ------CP rmvar _Var93 ----GENERIC (lines 212-212) [recompile=false] ------CP cpvar beta_unscaled beta ----GENERIC (lines 215-226) [recompile=false] ------CP print Computing the statistics....SCALAR.STRING.true _Var94.SCALAR.STRING ------CP uak+ y.MATRIX.DOUBLE _Var95.SCALAR.DOUBLE 48 ------CP createvar _mVar96 scratch_space//_p15212_10.168.31.80//_t0/temp27 true MATRIX binaryblock 1 1 1000 1000 -1 copy ------CP tsmm y.MATRIX.DOUBLE _mVar96.MATRIX.DOUBLE LEFT 48 ------CP createvar _mVar97 
scratch_space//_p15212_10.168.31.80//_t0/temp28 true MATRIX binaryblock 422 1 1000 1000 -1 copy ------CP ba+* X.MATRIX.DOUBLE beta.MATRIX.DOUBLE _mVar97.MATRIX.DOUBLE 48 ------CP / _Var95.SCALAR.DOUBLE.false 422.SCALAR.INT.true _Var98.SCALAR.DOUBLE ------CP rmvar _Var95 ------CP castdts _mVar96.MATRIX.DOUBLE.false _Var99.SCALAR.DOUBLE ------CP rmvar _mVar96 ------CP createvar _mVar100 scratch_space//_p15212_10.168.31.80//_t0/temp29 true MATRIX binaryblock 422 1 1000 1000 -1 copy ------CP - y.MATRIX.DOUBLE _mVar97.MATRIX.DOUBLE _mVar100.MATRIX.DOUBLE ------CP rmvar _mVar97 ------CP ^ _Var98.SCALAR.DOUBLE.false 2.SCALAR.INT.true _Var101.SCALAR.DOUBLE ------CP uak+ _mVar100.MATRIX.DOUBLE _Var102.SCALAR.DOUBLE 48 ------CP createvar _mVar103 scratch_space//_p15212_10.168.31.80//_t0/temp30 true MATRIX binaryblock 1 1 1000 1000 -1 copy ------CP tsmm _mVar100.MATRIX.DOUBLE _mVar103.MATRIX.DOUBLE LEFT 48 ------CP rmvar _mVar100 ------CP * 422.SCALAR.INT.true _Var101.SCALAR.DOUBLE.false _Var104.SCALAR.DOUBLE ------CP rmvar _Var101 ------CP / _Var102.SCALAR.DOUBLE.false 422.SCALAR.INT.true _Var105.SCALAR.DOUBLE ------CP rmvar _Var102 ------CP castdts _mVar103.MATRIX.DOUBLE.false _Var106.SCALAR.DOUBLE ------CP rmvar _mVar103 ------CP - _Var99.SCALAR.DOUBLE.false _Var104.SCALAR.DOUBLE.false _Var107.SCALAR.DOUBLE ------CP rmvar _Var104 ------CP ^ _Var105.SCALAR.DOUBLE.false 2.SCALAR.INT.true _Var108.SCALAR.DOUBLE ------CP / _Var107.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var109.SCALAR.DOUBLE ------CP * 422.SCALAR.INT.true _Var108.SCALAR.DOUBLE.false _Var110.SCALAR.DOUBLE ------CP rmvar _Var108 ------CP / _Var106.SCALAR.DOUBLE.false _Var107.SCALAR.DOUBLE.false _Var111.SCALAR.DOUBLE ------CP - _Var106.SCALAR.DOUBLE.false _Var110.SCALAR.DOUBLE.false _Var112.SCALAR.DOUBLE ------CP rmvar _Var110 ------CP - 1.SCALAR.INT.true _Var111.SCALAR.DOUBLE.false _Var113.SCALAR.DOUBLE ------CP rmvar _Var111 ------CP assignvar _Var98.SCALAR.DOUBLE.false avg_tot.SCALAR.DOUBLE ------CP assignvar 
_Var99.SCALAR.DOUBLE.false ss_tot.SCALAR.DOUBLE ------CP assignvar _Var105.SCALAR.DOUBLE.false avg_res.SCALAR.DOUBLE ------CP assignvar _Var106.SCALAR.DOUBLE.false ss_res.SCALAR.DOUBLE ------CP assignvar _Var107.SCALAR.DOUBLE.false ss_avg_tot.SCALAR.DOUBLE ------CP assignvar _Var109.SCALAR.DOUBLE.false var_tot.SCALAR.DOUBLE ------CP assignvar _Var112.SCALAR.DOUBLE.false ss_avg_res.SCALAR.DOUBLE ------CP assignvar _Var113.SCALAR.DOUBLE.false plain_R2.SCALAR.DOUBLE ------CP rmvar _Var94 ------CP rmvar _Var98 ------CP rmvar _Var99 ------CP rmvar _Var105 ------CP rmvar _Var106 ------CP rmvar _Var107 ------CP rmvar _Var109 ------CP rmvar _Var112 ------CP rmvar _Var113 ------CP rmvar X ------CP rmvar y ----IF (lines 227-233) ------CP > 422.SCALAR.INT.true m_ext.SCALAR.INT.false _Var114.SCALAR.BOOLEAN ------CP rmvar _Var114 ------GENERIC (lines 228-229) [recompile=false] --------CP - 422.SCALAR.INT.true m_ext.SCALAR.INT.false _Var115.SCALAR.INT --------CP / ss_avg_tot.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var116.SCALAR.DOUBLE --------CP / ss_res.SCALAR.DOUBLE.false _Var115.SCALAR.INT.false _Var117.SCALAR.DOUBLE --------CP rmvar _Var115 --------CP / _Var117.SCALAR.DOUBLE.false _Var116.SCALAR.DOUBLE.false _Var118.SCALAR.DOUBLE --------CP rmvar _Var116 --------CP - 1.SCALAR.INT.true _Var118.SCALAR.DOUBLE.false _Var119.SCALAR.DOUBLE --------CP rmvar _Var118 --------CP assignvar _Var117.SCALAR.DOUBLE.false dispersion.SCALAR.DOUBLE --------CP assignvar _Var119.SCALAR.DOUBLE.false adjusted_R2.SCALAR.DOUBLE --------CP rmvar _Var117 --------CP rmvar _Var119 --------CP rmvar m_ext ----ELSE ------GENERIC (lines 231-232) [recompile=false] --------CP assignvar NaN.SCALAR.DOUBLE.true dispersion.SCALAR.DOUBLE --------CP assignvar NaN.SCALAR.DOUBLE.true adjusted_R2.SCALAR.DOUBLE ----GENERIC (lines 235-236) [recompile=false] ------CP / ss_avg_res.SCALAR.DOUBLE.false ss_avg_tot.SCALAR.DOUBLE.false _Var120.SCALAR.DOUBLE ------CP - 1.SCALAR.INT.true _Var120.SCALAR.DOUBLE.false 
_Var121.SCALAR.DOUBLE ------CP rmvar _Var120 ------CP assignvar 420.SCALAR.INT.true deg_freedom.SCALAR.INT ------CP assignvar _Var121.SCALAR.DOUBLE.false plain_R2_nobias.SCALAR.DOUBLE ------CP rmvar _Var121 ----IF (lines 237-244) ------CP > deg_freedom.SCALAR.INT.false 0.SCALAR.INT.true _Var122.SCALAR.BOOLEAN ------CP rmvar _Var122 ------GENERIC (lines 238-239) [recompile=false] --------CP / ss_avg_res.SCALAR.DOUBLE.false deg_freedom.SCALAR.INT.false _Var123.SCALAR.DOUBLE --------CP / ss_avg_tot.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var124.SCALAR.DOUBLE --------CP / _Var123.SCALAR.DOUBLE.false _Var124.SCALAR.DOUBLE.false _Var125.SCALAR.DOUBLE --------CP rmvar _Var124 --------CP - 1.SCALAR.INT.true _Var125.SCALAR.DOUBLE.false _Var126.SCALAR.DOUBLE --------CP rmvar _Var125 --------CP assignvar _Var123.SCALAR.DOUBLE.false var_res.SCALAR.DOUBLE --------CP assignvar _Var126.SCALAR.DOUBLE.false adjusted_R2_nobias.SCALAR.DOUBLE --------CP rmvar _Var123 --------CP rmvar _Var126 --------CP rmvar ss_avg_res --------CP rmvar ss_avg_tot --------CP rmvar deg_freedom ----ELSE ------GENERIC (lines 241-243) [recompile=false] --------CP print Warning: zero or negative number of degrees of freedom..SCALAR.STRING.true _Var127.SCALAR.STRING --------CP assignvar NaN.SCALAR.DOUBLE.true var_res.SCALAR.DOUBLE --------CP assignvar NaN.SCALAR.DOUBLE.true adjusted_R2_nobias.SCALAR.DOUBLE --------CP rmvar _Var127 ----GENERIC (lines 246-246) [recompile=false] ------CP / ss_res.SCALAR.DOUBLE.false ss_tot.SCALAR.DOUBLE.false _Var128.SCALAR.DOUBLE ------CP - 1.SCALAR.INT.true _Var128.SCALAR.DOUBLE.false _Var129.SCALAR.DOUBLE ------CP rmvar _Var128 ------CP assignvar _Var129.SCALAR.DOUBLE.false plain_R2_vs_0.SCALAR.DOUBLE ------CP rmvar _Var129 ----GENERIC (lines 248-248) [recompile=false] ------CP / ss_res.SCALAR.DOUBLE.false 421.SCALAR.INT.true _Var130.SCALAR.DOUBLE ------CP / ss_tot.SCALAR.DOUBLE.false 422.SCALAR.INT.true _Var131.SCALAR.DOUBLE ------CP / _Var130.SCALAR.DOUBLE.false 
_Var131.SCALAR.DOUBLE.false _Var132.SCALAR.DOUBLE ------CP rmvar _Var130 ------CP rmvar _Var131 ------CP - 1.SCALAR.INT.true _Var132.SCALAR.DOUBLE.false _Var133.SCALAR.DOUBLE ------CP rmvar _Var132 ------CP assignvar _Var133.SCALAR.DOUBLE.false adjusted_R2_vs_0.SCALAR.DOUBLE ------CP rmvar _Var133 ------CP rmvar ss_tot ------CP rmvar m ------CP rmvar n ------CP rmvar ss_res ----GENERIC (lines 253-263) [recompile=false] ------CP toString target=beta _Var134.SCALAR.STRING ------CP + AVG_TOT_Y,.SCALAR.STRING.true avg_tot.SCALAR.DOUBLE.false _Var135.SCALAR.STRING ------CP sqrt var_tot.SCALAR.DOUBLE.false _Var136.SCALAR.DOUBLE ------CP + AVG_RES_Y,.SCALAR.STRING.true avg_res.SCALAR.DOUBLE.false _Var137.SCALAR.STRING ------CP sqrt var_res.SCALAR.DOUBLE.false _Var138.SCALAR.DOUBLE ------CP + DISPERSION,.SCALAR.STRING.true dispersion.SCALAR.DOUBLE.false _Var139.SCALAR.STRING ------CP + PLAIN_R2,.SCALAR.STRING.true plain_R2.SCALAR.DOUBLE.false _Var140.SCALAR.STRING ------CP + ADJUSTED_R2,.SCALAR.STRING.true adjusted_R2.SCALAR.DOUBLE.false _Var141.SCALAR.STRING ------CP + PLAIN_R2_NOBIAS,.SCALAR.STRING.true plain_R2_nobias.SCALAR.DOUBLE.false _Var142.SCALAR.STRING ------CP + ADJUSTED_R2_NOBIAS,.SCALAR.STRING.true adjusted_R2_nobias.SCALAR.DOUBLE.false _Var143.SCALAR.STRING ------CP print _Var134.SCALAR.STRING.false _Var144.SCALAR.STRING ------CP rmvar _Var134 ------CP + STDEV_TOT_Y,.SCALAR.STRING.true _Var136.SCALAR.DOUBLE.false _Var145.SCALAR.STRING ------CP rmvar _Var136 ------CP + STDEV_RES_Y,.SCALAR.STRING.true _Var138.SCALAR.DOUBLE.false _Var146.SCALAR.STRING ------CP rmvar _Var138 ------CP append _Var135.SCALAR.STRING.false _Var145.SCALAR.STRING.false -1.SCALAR.INT.true _Var147.SCALAR.STRING true ------CP rmvar _Var135 ------CP rmvar _Var145 ------CP append _Var147.SCALAR.STRING.false _Var137.SCALAR.STRING.false -1.SCALAR.INT.true _Var148.SCALAR.STRING true ------CP rmvar _Var147 ------CP rmvar _Var137 ------CP append _Var148.SCALAR.STRING.false 
_Var146.SCALAR.STRING.false -1.SCALAR.INT.true _Var149.SCALAR.STRING true ------CP rmvar _Var148 ------CP rmvar _Var146 ------CP append _Var149.SCALAR.STRING.false _Var139.SCALAR.STRING.false -1.SCALAR.INT.true _Var150.SCALAR.STRING true ------CP rmvar _Var149 ------CP rmvar _Var139 ------CP append _Var150.SCALAR.STRING.false _Var140.SCALAR.STRING.false -1.SCALAR.INT.true _Var151.SCALAR.STRING true ------CP rmvar _Var150 ------CP rmvar _Var140 ------CP append _Var151.SCALAR.STRING.false _Var141.SCALAR.STRING.false -1.SCALAR.INT.true _Var152.SCALAR.STRING true ------CP rmvar _Var151 ------CP rmvar _Var141 ------CP append _Var152.SCALAR.STRING.false _Var142.SCALAR.STRING.false -1.SCALAR.INT.true _Var153.SCALAR.STRING true ------CP rmvar _Var152 ------CP rmvar _Var142 ------CP append _Var153.SCALAR.STRING.false _Var143.SCALAR.STRING.false -1.SCALAR.INT.true _Var154.SCALAR.STRING true ------CP rmvar _Var153 ------CP rmvar _Var143 ------CP assignvar _Var154.SCALAR.STRING.false str.SCALAR.STRING ------CP rmvar _Var144 ------CP rmvar _Var154 ------CP rmvar avg_res ------CP rmvar adjusted_R2 ------CP rmvar plain_R2_nobias ------CP rmvar var_res ------CP rmvar plain_R2 ------CP rmvar var_tot ------CP rmvar adjusted_R2_nobias ------CP rmvar avg_tot ------CP rmvar dispersion ----GENERIC (lines 272-272) [recompile=false] ------CP print str.SCALAR.STRING.false _Var155.SCALAR.STRING ------CP rmvar _Var155 ------CP rmvar str ----GENERIC (lines 276-276) [recompile=false] ------CP print Writing the output matrix....SCALAR.STRING.true _Var156.SCALAR.STRING ------CP rmvar _Var156 ----GENERIC (lines 281-281) [recompile=false] ------CP cpvar beta beta_out ------CP rmvar beta ----GENERIC (lines 283-283) [recompile=false] ------CP write beta_out.MATRIX.DOUBLE beta.txt.SCALAR.STRING.true textcell.SCALAR.STRING.true .SCALAR.STRING.true ------CP rmvar fileB ------CP rmvar beta_out ------CP rmvar fmtB ----GENERIC (lines 288-288) [recompile=false] ------CP print END LINEAR REGRESSION 
SCRIPT.SCALAR.STRING.true _Var157.SCALAR.STRING ------CP rmvar _Var157
I tested the LinearRegCG.dml script with the same data set that is used in this test, and I get the correct results from the DML script. Here is how I ran it:
$SPARK_HOME/bin/spark-submit --master=local --driver-memory=6g $SYSTEMML_HOME/target/SystemML.jar -f $SYSTEMML_HOME/scripts/algorithms/LinearRegCG.dml -nvargs X=/user/iyounus/data/diabetes_X_train.txt Y=/user/iyounus/data/diabetes_y_train.txt B="beta.txt" icpt=1
Here are the stats:
Running the CG algorithm...
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.013822097249150999
Iteration 2: ||r|| / ||r init|| = 7.063617429825055E-14
Warning: the maximum number of iterations has been reached.
The CG algorithm is done.
Computing the statistics...
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-1.081722178918495E-11
STDEV_RES_Y,63.03850633761024
DISPERSION,3973.8532812769263
PLAIN_R2,0.3351312506863876
ADJUSTED_R2,0.33354822985468857
PLAIN_R2_NOBIAS,0.3351312506863876
ADJUSTED_R2_NOBIAS,0.33354822985468857
Writing the output matrix...
END LINEAR REGRESSION SCRIPT
17/02/15 11:45:20 INFO api.DMLScript: SystemML Statistics:
Total execution time: 0.374 sec.
Number of executed Spark inst: 2.
The values of betas I get from this script are:
1 1 938.2368795072023
2 1 152.91886229044422
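The explain plan for LinearRegCG.dml suggests the shape of the CG loop: the residual starts at -(X^T y), the search direction is its negation, and the loop ends after max_iteration steps (the plan sets this to the number of columns, 2 here) or once ||r||^2 falls below the initial value times tolerance^2. That agrees with the logged target value, 64725.64 x 1e-6 = 0.0647. Below is a minimal pure-Python sketch of such a loop. The function name, the list-based matrix handling, and the toy data are assumptions for illustration; this is not the DML script itself.

```python
def cg_linear_regression(rows, ys, reg=0.0, tol=1e-6, max_iter=None):
    """Conjugate gradient on the normal equations (hypothetical helper).

    rows: list of feature lists. A column of ones is appended as the
    intercept, and the intercept is left unregularized.
    """
    X = [list(r) + [1.0] for r in rows]
    m = len(X[0])
    lam = [reg] * (m - 1) + [0.0]      # no penalty on the intercept
    if max_iter is None:
        max_iter = m                   # mirrors max_iteration = m_ext in the plan

    def xtx_times(v):
        # Computes X^T (X v) without forming X^T X explicitly.
        xv = [sum(a * b for a, b in zip(row, v)) for row in X]
        return [sum(row[j] * t for row, t in zip(X, xv)) for j in range(m)]

    beta = [0.0] * m
    r = [-sum(row[j] * y for row, y in zip(X, ys)) for j in range(m)]
    p = [-v for v in r]
    norm_r2 = sum(v * v for v in r)
    norm_r2_target = norm_r2 * tol * tol
    i = 0
    while i < max_iter and norm_r2 > norm_r2_target:
        q = [a + l * b for a, l, b in zip(xtx_times(p), lam, p)]
        alpha = norm_r2 / sum(a * b for a, b in zip(p, q))
        beta = [b + alpha * pj for b, pj in zip(beta, p)]
        r = [rj + alpha * qj for rj, qj in zip(r, q)]
        old_norm_r2 = norm_r2
        norm_r2 = sum(v * v for v in r)
        p = [-rj + (norm_r2 / old_norm_r2) * pj for rj, pj in zip(r, p)]
        i += 1
    return beta


# Toy data (made up): y = 150 + 900 * x on a small, centered feature.
toy_x = [[-0.10], [-0.05], [0.0], [0.05], [0.10]]
toy_y = [150.0 + 900.0 * row[0] for row in toy_x]
print(cg_linear_regression(toy_x, toy_y))  # [slope, intercept]
```

With two unknowns, CG converges in two iterations up to rounding, which matches the two iterations in the logs.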
But if I run the Python test, I get incorrect results. For completeness, here is how I'm running the test:
$SPARK_HOME/bin/spark-submit --master=local --driver-memory=6g --driver-class-path $SYSTEMML_HOME/target/SystemML.jar test_mllearn_df.py
and here are the stats:
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.01378813951373333
Iteration 2: ||r|| / ||r init|| = 4.3730800595678527E-14
The CG algorithm is done.
Computing the statistics...
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-6.688193969161777E-12
STDEV_RES_Y,67.06389890324985
DISPERSION,4497.566536105316
PLAIN_R2,0.24750834362605834
ADJUSTED_R2,0.24571669682516795
PLAIN_R2_NOBIAS,0.24750834362605834
ADJUSTED_R2_NOBIAS,0.24571669682516795
Writing the output matrix...
END LINEAR REGRESSION SCRIPT
and the values of betas are 458.489, 153.146.
I hope this helps.
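The resolution recorded on this issue was to set regularization to 0. As a rough illustration of why a ridge term could produce this pattern (slope roughly halved, intercept almost unchanged), here is a small closed-form sketch for one feature with an unpenalized intercept. The function name and the data are made up for illustration, and the regularization value is chosen only to make the effect visible; it is not the value mllearn used.

```python
def fit_ridge_1d(xs, ys, reg=0.0):
    """Closed-form fit of y = intercept + slope * x with a ridge penalty
    on the slope only (hypothetical helper, not part of SystemML)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / (sxx + reg)          # reg = 0 gives ordinary least squares
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Synthetic data: a small-scale feature, similar in spirit to a normalized column.
xs = [-0.10, -0.05, 0.0, 0.05, 0.10]
ys = [150.0 + 900.0 * x for x in xs]

print(fit_ridge_1d(xs, ys, reg=0.0))    # unregularized fit
print(fit_ridge_1d(xs, ys, reg=0.025))  # penalty comparable to sum((x - mean)^2)
```

When the feature values are small, sum((x - mean)^2) is small too, so even a modest penalty dominates the denominator and shrinks the slope, while the intercept stays near the mean of y.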
It looks like both scripts have the same plan. This appears to be an algorithm-related or repeatability issue, as the statistics after training are as follows:
python_LinearReg_test_spark.1.6.log:
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.013822097249150999
Iteration 2: ||r|| / ||r init|| = 7.063617429825055E-14
The CG algorithm is done.
Computing the statistics...
938.237 152.919
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-1.081722178918495E-11
STDEV_RES_Y,63.03850633761024
DISPERSION,3973.8532812769263
PLAIN_R2,0.3351312506863876
ADJUSTED_R2,0.33354822985468857
PLAIN_R2_NOBIAS,0.3351312506863876
ADJUSTED_R2_NOBIAS,0.33354822985468857
python_LinearReg_test_spark.2.1.log:
||r|| initial value = 64725.64237405237, target value = 0.06472564237405237
Iteration 1: ||r|| / ||r init|| = 0.01378813951373333
Iteration 2: ||r|| / ||r init|| = 4.3730800595678527E-14
The CG algorithm is done.
Computing the statistics...
458.489 153.146
AVG_TOT_Y,153.36255924170615
STDEV_TOT_Y,77.21853383600028
AVG_RES_Y,-6.688193969161777E-12
STDEV_RES_Y,67.06389890324985
DISPERSION,4497.566536105316
PLAIN_R2,0.24750834362605834
ADJUSTED_R2,0.24571669682516795
PLAIN_R2_NOBIAS,0.24750834362605834
ADJUSTED_R2_NOBIAS,0.24571669682516795
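As a sanity check on the two logs, the R2 values can be recomputed from the other reported statistics. Going by the explain plan, ADJUSTED_R2 is 1 - DISPERSION / (STDEV_TOT_Y)^2, and PLAIN_R2 rescales both terms by their degrees of freedom (n - 2 and n - 1, with n = 422). The sketch below assumes that reading of the plan; the helper name is hypothetical.

```python
N_ROWS = 422  # training rows, per the explain plan
N_COEF = 2    # slope plus intercept (m_ext in the plan)


def r2_from_stats(stdev_tot_y, dispersion, n=N_ROWS, k=N_COEF):
    """Recompute (PLAIN_R2, ADJUSTED_R2) from the logged statistics."""
    var_tot = stdev_tot_y ** 2                      # ss_avg_tot / (n - 1)
    adjusted = 1.0 - dispersion / var_tot
    plain = 1.0 - (dispersion * (n - k)) / (var_tot * (n - 1))
    return plain, adjusted


STDEV_TOT_Y = 77.21853383600028
dispersion_by_log = {
    "python_LinearReg_test_spark.1.6.log": 3973.8532812769263,
    "python_LinearReg_test_spark.2.1.log": 4497.566536105316,
}

for log_name, dispersion in dispersion_by_log.items():
    plain, adjusted = r2_from_stats(STDEV_TOT_Y, dispersion)
    print(log_name, round(plain, 6), round(adjusted, 6))
```

Both logs are internally consistent under these formulas, so the drop in R2 follows from the larger residual variance of the second fit rather than from a problem in the statistics code.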
Thanks Imran