While learning machine learning with Spark, I ran into the following error:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Column features must be of type struct<type:tinyint,size:int,indices:array<int>,values:array<double>> but was actually struct<type:tinyint,size:int,indices:array<int>,values:array<double>>.
at scala.Predef$.require(Predef.scala:281)
at org.apache.spark.ml.util.SchemaUtils$.checkColumnType(SchemaUtils.scala:44)
at org.apache.spark.ml.PredictorParams.validateAndTransformSchema(Predictor.scala:51)
at org.apache.spark.ml.PredictorParams.validateAndTransformSchema$(Predictor.scala:46)
at org.apache.spark.ml.regression.LinearRegression.org$apache$spark$ml$regression$LinearRegressionParams$$super$validateAndTransformSchema(LinearRegression.scala:177)
at org.apache.spark.ml.regression.LinearRegressionParams.validateAndTransformSchema(LinearRegression.scala:120)
at org.apache.spark.ml.regression.LinearRegressionParams.validateAndTransformSchema$(LinearRegression.scala:108)
at org.apache.spark.ml.regression.LinearRegression.validateAndTransformSchema(LinearRegression.scala:177)
at org.apache.spark.ml.Predictor.transformSchema(Predictor.scala:144)
at org.apache.spark.ml.PipelineStage.transformSchema(Pipeline.scala:74)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:100)
at sparkML.Regression$.delayedEndpoint$sparkML$Regression$1(Regression.scala:34)
at sparkML.Regression$delayedInit$body.apply(Regression.scala:10)
at scala.Function0.apply$mcV$sp(Function0.scala:39)
at scala.Function0.apply$mcV$sp$(Function0.scala:39)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
at scala.App.$anonfun$main$1$adapted(App.scala:80)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.App.main(App.scala:80)
at scala.App.main$(App.scala:78)
at sparkML.Regression$.main(Regression.scala:10)
at sparkML.Regression.main(Regression.scala)
Process finished with exit code 1
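The cause: the Spark version is 2.0+, but the code builds its data with classes from the mllib package, while the estimator in the stack trace (org.apache.spark.ml.regression.LinearRegression) belongs to the ml package and expects an org.apache.spark.ml.linalg vector in the features column. The mllib and ml vector types are different classes that print as the same struct, which is why the "expected" and "actual" types in the message look identical. Since Spark 2.0, the ml package is the recommended API.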
The original, incorrect imports:
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
They should be changed to:
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.feature.LabeledPoint
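To show the corrected imports in context, here is a minimal sketch of a regression job that uses only the ml package. The object name RegressionSketch, the local master setting, and the toy training data are assumptions made for illustration; they are not taken from the original Regression.scala.

```scala
import org.apache.spark.ml.feature.LabeledPoint
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

// Hypothetical object name; the original program is sparkML.Regression.
object RegressionSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally for demonstration purposes.
    val spark = SparkSession.builder()
      .appName("RegressionSketch")
      .master("local[*]")
      .getOrCreate()

    // Toy data (made up). LabeledPoint and Vectors both come from the ml
    // package, so the "features" column gets the ml vector type.
    val training = spark.createDataFrame(Seq(
      LabeledPoint(1.0, Vectors.dense(1.0, 2.0)),
      LabeledPoint(2.0, Vectors.dense(2.0, 1.0)),
      LabeledPoint(3.0, Vectors.dense(3.0, 4.0)),
      LabeledPoint(4.0, Vectors.dense(4.0, 3.0))
    ))

    // LinearRegression reads the default "label" and "features" columns.
    val model = new LinearRegression().setMaxIter(10).fit(training)
    println(s"coefficients: ${model.coefficients}, intercept: ${model.intercept}")

    spark.stop()
  }
}
```

If the data already exists as a DataFrame with mllib vectors (for example, loaded by older code), changing the imports may not be enough. In that case org.apache.spark.mllib.util.MLUtils.convertVectorColumnsToML can convert the vector columns, and a single mllib vector can be converted with its asML method.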