Without further ado, here is the code:
// Each line of the file is a space-separated list of numbers, parsed into a dense vector.
val docTopicData = sc.textFile("src\\main\\resources\\model\\111.txt", 1)
  .map(s => Vectors.dense(s.split(' ').map(_.toDouble)))

// Needed for the RDD-to-DataFrame conversion (toDF).
import spark.implicits._
val docTopicDF = docTopicData.zipWithIndex.map(_.swap).toDF("id", "features")

// Scale each row to unit L1 norm (p = 1.0).
val normalizer = new Normalizer()
  .setInputCol("features")
  .setOutputCol("normalfeatures")
  .setP(1.0)
val row_normalized_dt: DataFrame = normalizer.transform(docTopicDF)
row_normalized_dt.show()
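For readers unsure what `setP(1.0)` does to each row, here is a minimal plain-Scala sketch of L1 normalization. It is not Spark's implementation; the object name `L1NormalizeSketch`, the helper `l1Normalize`, and the sample values are made up for illustration.

```scala
object L1NormalizeSketch {
  // Hypothetical helper: divide every entry by the sum of absolute values,
  // so the row's L1 norm becomes 1. A zero row is returned unchanged.
  def l1Normalize(row: Array[Double]): Array[Double] = {
    val norm = row.map(math.abs).sum
    if (norm == 0.0) row else row.map(_ / norm)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample row: 1 + 3 + 4 = 8
    val normalized = l1Normalize(Array(1.0, 3.0, 4.0))
    println(normalized.mkString(" ")) // prints "0.125 0.375 0.5"
  }
}
```

With p = 1.0, the entries of each normalized row sum to 1 when all inputs are non-negative, which suits rows that should read as proportions.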
I relied on auto-import, and the run failed with the following error:
org.apache.spark.mllib.linalg.DenseVector cannot be cast to org.apache.spark.ml.linalg.Vector
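The cause turned out to be a wrong import: auto-import had picked Vectors from org.apache.spark.mllib.linalg, while the Normalizer in org.apache.spark.ml.feature expects vectors from org.apache.spark.ml.linalg.
Import the correct packages: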
import org.apache.spark.ml.feature.Normalizer
import org.apache.spark.ml.linalg.Vectors
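If the vectors come from code that cannot simply switch imports (for example, output of an older mllib-based model), they can be converted instead. The following is a minimal sketch, assuming Spark 2.0 or later running in local mode; the object name, the sample values, and the `OldVectors` alias are made up for illustration.

```scala
import org.apache.spark.ml.feature.Normalizer
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.sql.SparkSession

object VectorMigrationSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("vector-migration-sketch")
      .getOrCreate()
    import spark.implicits._

    // A DataFrame whose "features" column holds old mllib vectors (made-up values).
    val oldDF = Seq(
      (0L, OldVectors.dense(1.0, 3.0, 4.0)),
      (1L, OldVectors.dense(2.0, 2.0, 4.0))
    ).toDF("id", "features")

    // Convert the mllib vector column to the ml vector type in one call.
    val mlDF = MLUtils.convertVectorColumnsToML(oldDF, "features")

    // The ml Normalizer now accepts the column without a ClassCastException.
    val normalized = new Normalizer()
      .setInputCol("features")
      .setOutputCol("normalfeatures")
      .setP(1.0)
      .transform(mlDF)
    normalized.show(truncate = false)

    // A single mllib vector can also be converted with asML.
    val single = OldVectors.dense(1.0, 3.0, 4.0).asML
    println(single.getClass.getName)

    spark.stop()
  }
}
```

When the data is built from scratch, as in the code above, fixing the import is simpler; conversion is mainly useful when mllib vectors already exist upstream.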