Spark2 Dataset之collect_set与collect_list

collect_set去除重复元素;collect_list不去除重复元素
select gender,
       concat_ws(',', collect_set(children)),
       concat_ws(',', collect_list(children))
  from Affairs
 group by gender

 

1
2
3
4
5
6
7
8
9
10
11
12
13
// 创建视图
data.createOrReplaceTempView( "Affairs" )
 
val  df 3 =  spark.sql( "select gender,concat_ws(',',collect_set(children)),concat_ws(',',collect_list(children)) from Affairs group by gender" )
df 3 :  org.apache.spark.sql.DataFrame  =  [gender :  string, concat _ ws(,, collect _ set(children)) :  string ...  1  more field]
 
df 3 .show   // collect_set去除重复元素;collect_list不去除重复元素
+------+-----------------------------------+------------------------------------+
|gender|concat _ ws(,, collect _ set(children))|concat _ ws(,, collect _ list(children))|
+------+-----------------------------------+------------------------------------+
|female|                             no,yes|                    no,yes,no,no,yes|
|  male|                             no,yes|                    no,yes,no,yes,no|
+------+-----------------------------------+------------------------------------+
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值