Spark SQL: How to get the difference in months or years between two dates in Spark SQL

An error occurs in Spark SQL: 'year' cannot be resolved. The input data consists of an id, a start date, and an end date. The goal is to compute the difference in years between the two dates, but datediff returns the result in days. One solution is to use a Spark User Defined Function (UDF) to compute the year difference.

I am getting the error:

org.apache.spark.sql.AnalysisException: cannot resolve 'year'

My input data:

1,2012-07-21,2014-04-09

My code:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

import sqlContext.implicits._

import org.apache.spark.sql.SaveMode

import org.apache.spark.sql._

import org.apache.spark.sql.functions._

case class c(id: Int, start: String, end: String)

val c1 = sc.textFile("date.txt")

val c2 = c1.map(_.split(",")).map(r => c(r(0).toInt, r(1), r(2)))

val c3 = c2.toDF()

c3.registerTempTable("c4")

val r = sqlContext.sql("select id,datediff(year,to_date(end), to_date(start)) AS date from c4")

What can I do to resolve the above error?

I have tried the following code, but I got the output in days and I need it in years:

val r = sqlContext.sql("select id,datediff(to_date(end), to_date(start)) AS date from c4")

Please advise me whether I can use a function like to_date to get the difference in years.

Solution

val r = sqlContext.sql("select id,datediff(year,to_date(end), to_date(start)) AS date from c4")

In the query above, "year" is not a column in the DataFrame, i.e. it is not a valid column in table "c4". Spark SQL's datediff takes exactly two arguments, datediff(end, start), and returns the difference in days; the three-argument DATEDIFF(year, ...) form is T-SQL (SQL Server) syntax, not Spark SQL. The analyzer therefore tries to resolve "year" as a column, cannot find it, and throws the AnalysisException because the query is invalid.
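Spark does ship built-in functions that cover this case without a UDF. A minimal sketch, assuming the same sqlContext and registered "c4" temp table as above (months_between is available from Spark 1.5 onward):

```scala
// Calendar-year difference: subtract the extracted year numbers.
val byYear = sqlContext.sql(
  "select id, year(to_date(end)) - year(to_date(start)) as years from c4")

// Whole years actually elapsed: months_between returns (fractional) months,
// so divide by 12 and floor the result.
val byMonths = sqlContext.sql(
  "select id, floor(months_between(to_date(end), to_date(start)) / 12) as years from c4")
```

For the sample row 1,2012-07-21,2014-04-09 the first query yields 2 (2014 - 2012), while the second yields 1, because two full years have not yet elapsed between the dates.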

Alternatively, use a Spark User Defined Function (UDF); that is a more robust approach when you need full control over the date arithmetic.
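The UDF can wrap plain JVM date arithmetic. A minimal sketch using java.time (Java 8+); yearsBetween is a hypothetical helper name, not part of any Spark API:

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Whole years elapsed between two ISO dates ("yyyy-MM-dd").
def yearsBetween(start: String, end: String): Long =
  ChronoUnit.YEARS.between(LocalDate.parse(start), LocalDate.parse(end))

// In the Spark shell, register it and call it from SQL against the
// "c4" temp table created above:
//   sqlContext.udf.register("years_between", yearsBetween _)
//   val r = sqlContext.sql("select id, years_between(start, end) as years from c4")
```

ChronoUnit.YEARS.between truncates toward zero, so it counts only fully completed years, which matches the usual "age in years" semantics.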
