Spark Scala: replacing strings in a DataFrame with na.replace

Create an example DataFrame

scala> val df = sc.parallelize(Seq(
     |   (0,"cat26","cat26"),
     |   (1,"cat67","cat26"),
     |   (2,"cat56","cat26"),
     |   (3,"cat8","cat26"))).toDF("Hour", "Category", "Value")

Method 1:

scala> df.na.replace("*", Map[Any, Any](
     |      "cat26" -> "cat23"
     |    )).show()
+----+--------+-----+
|Hour|Category|Value|
+----+--------+-----+
|   0|   cat23|cat23|
|   1|   cat67|cat23|
|   2|   cat56|cat23|
|   3|    cat8|cat23|
+----+--------+-----+
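
Method 1 uses "*" to target every column. When only some columns should be touched, the overload that takes a Seq of column names can be used instead. The following is a minimal sketch, assuming it is pasted into spark-shell or run as a Scala script; the local SparkSession setup and the name replaced are illustrative assumptions, not part of the original example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration. In spark-shell,
// getOrCreate() simply returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("na-replace-sketch")
  .getOrCreate()
import spark.implicits._

// Same sample data as the example DataFrame above.
val df = Seq(
  (0, "cat26", "cat26"),
  (1, "cat67", "cat26"),
  (2, "cat56", "cat26"),
  (3, "cat8", "cat26")
).toDF("Hour", "Category", "Value")

// Replace "cat26" with "cat23" only in the Category column;
// the Value column is not listed, so it is left unchanged.
val replaced = df.na.replace(Seq("Category"), Map("cat26" -> "cat23"))
replaced.show()
```

Because both keys and values are strings here, a plain Map("cat26" -> "cat23") is enough; the explicit Map[Any, Any] annotation is mainly useful when the map mixes types.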

Example from the official Spark source: org/apache/spark/sql/DataFrameNaFunctionsSuite.scala
Here, name is the column name.

df.na.replace("name", Map(
        "Bob" -> "Bravo",
        "Alice" -> null
      ))

df.na.replace("*", Map[Any, Any](
     false -> null
   ))

Method 2:

Replace 0 with 9 in the Hour column:

import com.google.common.collect.ImmutableMap;

scala> df.na.replace("hour", ImmutableMap.of(0, 9)).show()
+----+--------+-----+
|Hour|Category|Value|
+----+--------+-----+
|   9|   cat26|cat26|
|   1|   cat67|cat26|
|   2|   cat56|cat26|
|   3|    cat8|cat26|
+----+--------+-----+

Replace "cat26" with "cat222" in all columns:

scala> df.na.replace("*", ImmutableMap.of("cat26", "cat222")).show()
+----+--------+------+
|Hour|Category| Value|
+----+--------+------+
|   0|  cat222|cat222|
|   1|   cat67|cat222|
|   2|   cat56|cat222|
|   3|    cat8|cat222|
+----+--------+------+
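
Method 2 passes a Guava ImmutableMap, which is the Java-friendly form of the call. From Scala, the same numeric replacement can likely be written with a plain Scala Map, without the Guava import. The following is a minimal sketch under that assumption; the local SparkSession setup and the name hourFixed are illustrative, not taken from the original post.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration. In spark-shell,
// getOrCreate() simply returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("na-replace-numeric-sketch")
  .getOrCreate()
import spark.implicits._

// Same sample data as the example DataFrame above.
val df = Seq(
  (0, "cat26", "cat26"),
  (1, "cat67", "cat26"),
  (2, "cat56", "cat26"),
  (3, "cat8", "cat26")
).toDF("Hour", "Category", "Value")

// Replace the value 0 with 9 in the Hour column using a Scala Map
// in place of ImmutableMap.of(0, 9).
val hourFixed = df.na.replace("Hour", Map(0 -> 9))
hourFixed.show()
```

Keys and values in the map should share one type (all numeric, all string, or all boolean), matching the type of the column being edited.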

Example from the official Spark source:

org/apache/spark/sql/DataFrameNaFunctions.scala
* {{{
*   import com.google.common.collect.ImmutableMap;
*
*   // Replaces all occurrences of 1.0 with 2.0 in column "height".
*   df.na.replace("height", ImmutableMap.of(1.0, 2.0));
*
*   // Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
*   df.na.replace("name", ImmutableMap.of("UNKNOWN", "unnamed"));
*
*   // Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
*   df.na.replace("*", ImmutableMap.of("UNKNOWN", "unnamed"));
* }}}