Description
Code to reproduce:
import scala.util.Random
val rng = new Random(3)
val a1 = Array.tabulate(200)(_ => rng.nextDouble * 2.0 - 1.0) ++ Array.fill(20)(0.0) ++ Array.fill(20)(-0.0)
import spark.implicits._
val df1 = sc.parallelize(a1, 2).toDF("id")
import org.apache.spark.ml.feature.QuantileDiscretizer
val qd = new QuantileDiscretizer().setInputCol("id").setOutputCol("out").setNumBuckets(200).setRelativeError(0.0)
val model = qd.fit(df1)
This raises an error like:
java.lang.IllegalArgumentException: quantileDiscretizer_479bb5a3ca99 parameter splits given invalid value [-Infinity,-0.9986765732730827,..., -0.0, 0.0, ..., 0.9907184077958491,Infinity]
  at org.apache.spark.ml.param.Param.validate(params.scala:76)
  at org.apache.spark.ml.param.ParamPair.<init>(params.scala:634)
  at org.apache.spark.ml.param.Param.$minus$greater(params.scala:85)
  at org.apache.spark.ml.param.Params.set(params.scala:713)
  at org.apache.spark.ml.param.Params.set$(params.scala:712)
  at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
  at org.apache.spark.ml.feature.Bucketizer.setSplits(Bucketizer.scala:77)
  at org.apache.spark.ml.feature.QuantileDiscretizer.fit(QuantileDiscretizer.scala:231)
  ... 49 elided
The computed splits contain both -0.0 and 0.0. Since 0.0 > -0.0 is false, the splits are not strictly increasing, which breaks the parameter validation check.
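
The comparison behaviour can be checked without Spark. The snippet below is only a sketch meant for a Scala REPL or spark-shell: isStrictlyIncreasing is a hypothetical helper standing in for a "strictly increasing" requirement on the splits, not the actual Spark validation code, and the split values are made up for illustration. The last step shows one possible workaround (dropping a value that compares equal to its predecessor), which is an assumption and not necessarily how a fix should be implemented.

```scala
// Hypothetical helper: true when every element is strictly less than the next.
def isStrictlyIncreasing(xs: Array[Double]): Boolean =
  xs.sliding(2).forall {
    case Array(a, b) => a < b
    case _           => true // fewer than two elements
  }

// Illustrative splits containing both signed zeros, as in the error message.
val splits = Array(Double.NegativeInfinity, -0.5, -0.0, 0.0, 0.5, Double.PositiveInfinity)

// Primitive comparison treats the two zeros as equal, so neither is greater.
val zerosCompareEqual = (0.0 == -0.0) && !(0.0 > -0.0)

// They are nevertheless different bit patterns.
val zerosDifferInBits =
  java.lang.Double.doubleToLongBits(0.0) != java.lang.Double.doubleToLongBits(-0.0)

// The raw splits fail the check because -0.0 < 0.0 is false.
val rawOk = isStrictlyIncreasing(splits)

// Possible workaround (assumption): skip values equal to their predecessor.
val cleaned = splits.foldLeft(Vector.empty[Double]) { (acc, x) =>
  if (acc.nonEmpty && acc.last == x) acc else acc :+ x
}.toArray
val cleanedOk = isStrictlyIncreasing(cleaned)
```

With these inputs the raw splits fail the check, while the cleaned array (one zero removed) passes it.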