import org.apache.spark.sql.functions._

// The service_prod_code column is of type array
val df_info = spark.sql(
"""
select
product, service_prod_code
from
tablename
""".stripMargin)
// explode drops rows whose array is null or empty: such rows disappear from the result,
// and only rows with a non-empty array are kept
val service_result = df_info.withColumn("service_explode", explode(col("service_prod_code")))
// To keep the rows whose array is null or empty, use one of the following approaches
// Spark 2.2+
val service_result_outer = df_info.withColumn("service_explode", explode_outer(col("service_prod_code")))
// Spark <= 2.1
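// Note: this covers null arrays only; a row with an empty array is still dropped
val service_result_compat = df_info.withColumn("service_explode", explode(
  when(col("service_prod_code").isNotNull, col("service_prod_code"))
    // If null, explode an array<string> containing a single null
    .otherwise(array(lit(null).cast("string")))))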
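
The difference between the three variants is easiest to see on a tiny dataset. The following is a minimal sketch, not the original author's code: the object name `ExplodeDemo`, its helper methods, the local `SparkSession` settings, and the three sample rows are all assumptions made for illustration. The `fallback` helper also swaps the `isNotNull` test for a `size(...) > 0` test, so that empty arrays are kept as well as null ones.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col, explode, explode_outer, lit, size, when}

object ExplodeDemo {
  // Hypothetical sample rows: a populated array, an empty array, and a null array.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("p1", Option(Seq("a", "b"))),
      ("p2", Option(Seq.empty[String])),
      ("p3", Option.empty[Seq[String]])
    ).toDF("product", "service_prod_code")
  }

  // explode: rows whose array is null or empty produce no output rows.
  def inner(df: DataFrame): DataFrame =
    df.withColumn("service_explode", explode(col("service_prod_code")))

  // explode_outer (Spark 2.2+): a null or empty array yields one row with a null element.
  def outer(df: DataFrame): DataFrame =
    df.withColumn("service_explode", explode_outer(col("service_prod_code")))

  // Fallback without explode_outer: substitute a one-element array holding null
  // whenever the size check fails, which happens for both null and empty arrays.
  def fallback(df: DataFrame): DataFrame =
    df.withColumn("service_explode", explode(
      when(size(col("service_prod_code")) > 0, col("service_prod_code"))
        .otherwise(array(lit(null).cast("string")))))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("explode-demo")
      .getOrCreate()
    val df = sample(spark)
    inner(df).show()    // only the rows for p1
    outer(df).show()    // p1 rows plus one null row each for p2 and p3
    fallback(df).show() // same rows as explode_outer
    spark.stop()
  }
}
```

With this sample, `inner` should return two rows (both for `p1`), while `outer` and `fallback` should return four, because `p2` and `p3` each contribute one row whose `service_explode` is null.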
Using explode in a Spark DataFrame to turn an array column into multiple rows