Running a Spark job fails with: Exception in thread "main" java.lang.UnsupportedOperationException: CSV data source does not support struct<type:tinyint,size:int,indices:array,values:array> data type. at
Several blog posts point to the same cause: a DenseVector column cannot be written directly to a CSV file.
- There are two workarounds (both convert the offending column to a String):
- Use a UDF
import org.apache.spark.sql.functions.udf

// Join the array elements into a single comma-separated String
val stringify = udf((vs: Seq[String]) => vs.mkString(","))

df.withColumn("columnA", stringify($"columnA"))
  .withColumn("columnB", stringify($"columnB"))
  .write.csv("xxxxx")
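The UDF above assumes the column is already an array of strings. The error message, however, names the struct layout of Spark ML's VectorUDT, so the column may actually hold a Vector (e.g. the output of VectorAssembler). A sketch of the same idea for a Vector-typed column, assuming a hypothetical "features" column name:

```scala
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.functions.udf

// Hypothetical: flatten a Vector column into a comma-separated String
// so the DataFrame becomes CSV-writable.
val vecToString = udf((v: Vector) => v.toArray.mkString(","))

df.withColumn("features", vecToString($"features"))
  .write.csv("xxxxx")
```

A sparse vector is densified by toArray here; if the vectors are large and sparse, v.toString (which keeps the sparse representation) may be the better choice.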
- Convert directly via the RDD
case class Asso(antecedent: String, consequent: String, confidence: String)

df.rdd.map { line => Asso(line(0).toString, line(1).toString, line(2).toString) }
  .toDF().write.csv("xxxx")
References:
https://jimolonely.github.io/2018/01/03/spark/02-write-csv/
https://cloud.tencent.com/developer/article/1531999