Inspecting the DataFrame shows that there are two columns named temp, which is why the following error is raised at runtime:
Caused by: org.apache.spark.sql.AnalysisException: Reference 'temp' is ambiguous, could be: temp, temp.;
at org.apache.spark.sql.catalyst.expressions.package$AttributeSeq.resolve(package.scala:259)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveQuoted(LogicalPlan.scala:121)
at org.apache.spark.sql.Dataset.resolve(Dataset.scala:221)
at org.apache.spark.sql.Dataset.col(Dataset.scala:1268)
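The cause is a duplicated temp column: checking the code shows that temp was written one extra time when the DataFrame was constructed, as shown in the figure.
Removing the duplicate column fixes the error.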
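
The situation above can be sketched roughly as follows. This is only a minimal illustration, not the original code: the column names other than temp, the sample values, the object name, and the temp_dup placeholder are all assumptions made for the example, and it assumes Spark SQL is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.SparkSession

object AmbiguousColumnSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ambiguous-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data: "temp" is listed twice by mistake when naming the columns.
    val df = Seq((1, 20.5, 20.5)).toDF("id", "temp", "temp")
    df.printSchema()

    // Calling df.col("temp") here would throw
    // AnalysisException: Reference 'temp' is ambiguous

    // List the column names that occur more than once.
    val duplicated = df.columns.diff(df.columns.distinct).distinct
    println(s"Duplicated columns: ${duplicated.mkString(", ")}")

    // Preferred fix: name each column only once when constructing the DataFrame.
    val fixed = Seq((1, 20.5)).toDF("id", "temp")
    fixed.select(fixed.col("temp")).show()

    // If the DataFrame already exists, one option is to rename the columns
    // positionally and then drop the extra one ("temp_dup" is a made-up name).
    val repaired = df.toDF("id", "temp", "temp_dup").drop("temp_dup")
    repaired.select(repaired.col("temp")).show()

    spark.stop()
  }
}
```

Checking `df.columns` for repeated names right after building a DataFrame is a cheap way to catch this kind of mistake before a later `col` or `select` call fails.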