Spark DataFrame: extracting a column and modifying it (updating / replacing a Column)



1.concat(exprs: Column*): Column

function note: Concatenates multiple input columns together into a single column. The function works with strings, binary and compatible array columns.

My problem: a column in the DataFrame, "XX_BM", holds values such as 0008151223000316. I want to turn every value in Column("XX_BM") into that value with a suffix appended, e.g. 0008151223000316sfjd, i.e.:

0008151223000316 + sfjd

Solution (in Scala):

import org.apache.spark.sql.functions.{concat, lit}

val tmp = dfval.col("XX_BM")

val result = concat(tmp, lit("sfjd"))

dfval = dfval.withColumn("XX_BM", result) // dfval must be a var: DataFrames are immutable, so we rebind
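At the row level, `concat(col, lit("sfjd"))` behaves like plain string concatenation on each non-null value. A minimal pure-Scala sketch of the per-value effect (the sample value is taken from the text above; no Spark session is needed to see the semantics):

```scala
// Per-row effect of concat(col("XX_BM"), lit("sfjd")):
// the literal suffix is appended to each value of the column.
val original = "0008151223000316" // sample value from the XX_BM column
val suffix   = "sfjd"
val updated  = original + suffix
println(updated) // 0008151223000316sfjd
```

Note that `concat` returns null for a row if any of its inputs is null; `concat_ws` can be used instead when null-safe joining with a separator is wanted.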

2.regexp_replace(e: Column, pattern: String, replacement: String): Column

function note: Replace all substrings of the specified string value that match regexp with rep.

My problem (quoted from a Q&A post): I have a DataFrame with 170 columns. One column holds a "name" string, and that string can sometimes contain special symbols such as "'" that are not acceptable when writing to Postgres. Can I do something like:

Df[$'name'] = Df[$'name'].map(x => x.replaceAll("'","")) ?

But I don't want to reprocess the full DataFrame, because it's very large. Please help.


Solution: You can't mutate DataFrames; you can only transform them into new DataFrames with updated values. In this case, you can use the regexp_replace function to perform the mapping on the name column:

import org.apache.spark.sql.functions._

val updatedDf = Df.withColumn("name", regexp_replace(col("name"), "'", ""))
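Per row, `regexp_replace` applies Java regex semantics, equivalent to `String.replaceAll`. A self-contained sketch of what the call above does to one sample value (the name is illustrative):

```scala
// Per-row effect of regexp_replace(col("name"), "'", ""):
// the pattern is a Java regex, and every match is replaced.
val raw     = "O'Brien's" // illustrative sample containing apostrophes
val cleaned = raw.replaceAll("'", "")
println(cleaned) // OBriens
```

Because the pattern is a regex, characters like `.`, `*`, or `(` must be escaped (e.g. `\\.`) if they are meant literally.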


3.regexp_replace(e: Column, pattern: Column, replacement: Column): Column

function note: Replace all substrings of the specified string value that match regexp with rep. (This overload takes the pattern and replacement as Columns, so they can vary from row to row.)
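With the Column-typed overload, each row can carry its own pattern and replacement. A pure-Scala sketch of the per-row equivalent (the sample rows and patterns are illustrative, not from the original post):

```scala
// Per-row equivalent of regexp_replace(valueCol, patternCol, replacementCol):
// each row's value is rewritten using that row's own pattern and replacement.
val rows = Seq(
  ("foo-123", "\\d+", "#"), // this row strips runs of digits
  ("bar_456", "_",    "-")  // this row swaps underscore for hyphen
)
val result = rows.map { case (value, pattern, replacement) =>
  value.replaceAll(pattern, replacement)
}
println(result) // List(foo-#, bar-456)
```

In Spark itself this would be written as `df.withColumn("value", regexp_replace(col("value"), col("pattern"), col("replacement")))`.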


For full function documentation, see org.apache.spark.sql.functions.
