Scala004: Converting a DataFrame String Column to timestamp

Intro

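  One column of the DataFrame holds strings in the "yyyyMMdd" format, and it needs to be converted to the "timestamp" type. There are probably many ways to do this (a udf, for example); this post shows a relatively simple one.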
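For contrast, the udf route mentioned above could look roughly like the sketch below. It is not from the original post: the helper name `parseYmd`, the local SparkSession, and the two sample strings are assumptions made for illustration. The helper returns None for input it cannot parse, which Spark surfaces as null.

```scala
import java.sql.Timestamp
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Assumption: a local session; in spark-shell or a notebook `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("udf-str-to-ts").getOrCreate()
import spark.implicits._

// Hypothetical helper: parse "yyyyMMdd" to midnight of that day, None on bad or null input.
def parseYmd(s: String): Option[Timestamp] =
  Try {
    val day = LocalDate.parse(s, DateTimeFormatter.ofPattern("yyyyMMdd"))
    Timestamp.valueOf(day.atStartOfDay())
  }.toOption

// Wrap the helper as a udf; an Option result becomes a nullable timestamp column.
val toTs = udf((s: String) => parseYmd(s))

val out = Seq("20200101", "not-a-date").toDF("dateStr")
  .withColumn("date", toTs(col("dateStr")))
out.printSchema()
out.show()
```

A udf like this is opaque to Spark's optimizer, which is one reason the built-in functions used in the rest of this post are usually the simpler choice.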

Building the data

import org.apache.spark.sql.functions._
import spark.implicits._
import org.apache.spark.sql.expressions.Window
val df = Seq(
  ("A1", 25, 1,0.64,0.36,"20200101"),
  ("A1", 26, 1,0.34,0.66,"20200102"),
  ("B1", 27, 0,0.55,0.45,"20200103"),
  ("C1", 30, 0,0.14,0.86,"20200104")
  ).toDF("id", "age", "label","pro0","pro1","dateStr")
df.printSchema()
df.show()
Intitializing Scala interpreter ...



Spark Web UI available at http://DESKTOP-LAO32FQ:4043
SparkContext available as 'sc' (version = 2.4.4, master = local[*], app id = local-1583251961417)
SparkSession available as 'spark'



root
 |-- id: string (nullable = true)
 |-- age: integer (nullable = false)
 |-- label: integer (nullable = false)
 |-- pro0: double (nullable = false)
 |-- pro1: double (nullable = false)
 |-- dateStr: string (nullable = true)

+---+---+-----+----+----+--------+
| id|age|label|pro0|pro1| dateStr|
+---+---+-----+----+----+--------+
| A1| 25|    1|0.64|0.36|20200101|
| A1| 26|    1|0.34|0.66|20200102|
| B1| 27|    0|0.55|0.45|20200103|
| C1| 30|    0|0.14|0.86|20200104|
+---+---+-----+----+----+--------+






import org.apache.spark.sql.functions._
import spark.implicits._
import org.apache.spark.sql.expressions.Window
df: org.apache.spark.sql.DataFrame = [id: string, age: int ... 4 more fields]

Column type conversion

After the conversion, the hour, minute, and second fields are all 0.

df.withColumn("date",unix_timestamp(col("dateStr"),"yyyyMMdd").cast("timestamp")).show()
+---+---+-----+----+----+--------+-------------------+
| id|age|label|pro0|pro1| dateStr|               date|
+---+---+-----+----+----+--------+-------------------+
| A1| 25|    1|0.64|0.36|20200101|2020-01-01 00:00:00|
| A1| 26|    1|0.34|0.66|20200102|2020-01-02 00:00:00|
| B1| 27|    0|0.55|0.45|20200103|2020-01-03 00:00:00|
| C1| 30|    0|0.14|0.86|20200104|2020-01-04 00:00:00|
+---+---+-----+----+----+--------+-------------------+
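The one-liner above does two things: `unix_timestamp` parses the string into seconds since the Unix epoch, and the cast turns those seconds into a timestamp. The sketch below splits the two steps and adds `to_timestamp` (in `org.apache.spark.sql.functions` since Spark 2.2), which should give the same result in one call. It is not from the original post: the local SparkSession, the reduced two-column data, and the column names `epochSeconds` and `date2` are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp, unix_timestamp}

// Assumption: a local session; in spark-shell or a notebook `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("str-to-ts").getOrCreate()
import spark.implicits._

// A reduced version of the article's data: only the id and the date string.
val df = Seq(("A1", "20200101"), ("B1", "20200103")).toDF("id", "dateStr")

val converted = df
  // Step 1: parse the string into seconds since the Unix epoch (a bigint column).
  .withColumn("epochSeconds", unix_timestamp(col("dateStr"), "yyyyMMdd"))
  // Step 2: cast those seconds to timestamp, as the article's one-liner does.
  .withColumn("date", col("epochSeconds").cast("timestamp"))
  // Alternative: to_timestamp parses and converts in a single call.
  .withColumn("date2", to_timestamp(col("dateStr"), "yyyyMMdd"))

converted.printSchema()
converted.show()
```

Both routes interpret the string in the session time zone, so the displayed value is midnight local time rather than midnight UTC.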

                                2020-03-04, Qixia District, Nanjing
