Description
Using pyspark.ml.regression, when I fit a GeneralizedLinearRegression like this:
from pyspark.ml.regression import GeneralizedLinearRegression

glr = GeneralizedLinearRegression(family="gaussian", link="identity",
                                  regParam=0.3, maxIter=10)
model = glr.fit(someData)
There seems to be no way to match the features to their coefficients or standard errors. Right now I am using an ugly workaround like this:
# Reach into the private Java field via reflection
field = model.summary._call_java('getClass').getDeclaredField("coefficientsWithStatistics")
object2 = model._call_java('summary')
field.setAccessible(True)
value = field.get(object2)
coef_value = {}
for i in range(len(value)):
    # Each entry prints as a tuple string: "(name,coefficient,...)"
    row = value[i].toString()
    values = row.split(',')
    coef_value[values[0].replace('(', '').replace(')', '')] = float(values[1])
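A possible alternative that avoids reflection, sketched under assumptions: if the features column was built with VectorAssembler, its schema metadata usually carries an "ml_attr" entry listing each feature's index and name. The helper names below (feature_names_from_metadata, match_coefficients) and the example data are hypothetical, and the metadata dict is a hand-written stand-in so the snippet runs without a Spark session.

```python
def feature_names_from_metadata(metadata):
    """Return feature names ordered by vector index from an ml_attr metadata dict."""
    attrs = metadata.get("ml_attr", {}).get("attrs", {})
    indexed = []
    # Attributes are grouped by type, e.g. "numeric", "binary", "nominal"
    for group in attrs.values():
        for attr in group:
            idx = attr["idx"]
            # Fall back to a placeholder name if the attribute is unnamed
            indexed.append((idx, attr.get("name", "feature_%d" % idx)))
    return [name for _, name in sorted(indexed)]


def match_coefficients(metadata, coefficients):
    """Pair each feature name with the coefficient at the same vector index."""
    names = feature_names_from_metadata(metadata)
    values = [float(c) for c in coefficients]
    if len(names) != len(values):
        raise ValueError("metadata lists %d features but got %d coefficients"
                         % (len(names), len(values)))
    return dict(zip(names, values))


# Hand-written stand-in for someData.schema["features"].metadata (assumption)
example_metadata = {
    "ml_attr": {
        "attrs": {
            "numeric": [{"idx": 0, "name": "age"}, {"idx": 2, "name": "income"}],
            "binary": [{"idx": 1, "name": "is_member"}],
        },
        "num_attrs": 3,
    }
}
# Stand-in for model.coefficients.toArray() (assumption)
example_coefficients = [0.5, -1.25, 2.0]

matched = match_coefficients(example_metadata, example_coefficients)
print(matched)
```

With a real model this would presumably be called as match_coefficients(someData.schema["features"].metadata, model.coefficients.toArray()), assuming the default "features" column name. It only works when the metadata is present, and it does not cover the intercept, so it is not a substitute for a proper API.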
Am I missing something?
If not, I'd like to request a method similar to model.coefficients that returns the feature names in the matching order, something like model.features.