举例,某个表形式如下
+--+--------+------+-----+----+
|id|name |course|score|year|
+--+--------+------+-----+----+
|1 |zhangsan|数学 |80 |2015|
|2 |lisi |语文 |90 |2016|
|3 |lisi |数学 |70 |2016|
|4 |wangwu |化学 |80 |2017|
|5 |zhangsan|语文 |85 |2015|
|6 |zhangsan|化学 |90 |2015|
+--+--------+------+-----+----+
concat相当于两列数据合并
select concat(id,":",name),course,score,year from course_score;
+----------+------+-----+----+
|c0 |course|score|year|
+----------+------+-----+----+
|1:zhangsan|数学 |80 |2015|
|2:lisi |语文 |90 |2016|
|3:lisi |数学 |70 |2016|
|4:wangwu |化学 |80 |2017|
|5:zhangsan|语文 |85 |2015|
|6:zhangsan|化学 |90 |2015|
+----------+------+-----+----+
concat_ws
也是合并多列的,不过分隔符只需要写一次.
只能连接字符串,所以要转型一下,利用cast函数
select concat_ws(":",cast(id as string),course,cast(score as string)) ,year from course_score;
+-------+----+
|c0 |year|
+-------+----+
|1:数学:80|2015|
|2:语文:90|2016|
|3:数学:70|2016|
|4:化学:80|2017|
|5:语文:85|2015|
|6:化学:90|2015|
+-------+----+
collect_set
收集某一列的,和上面两个功能完全不同. 另外,也可以去重
select collect_set(name) from course_score;
+----------------------------+
|c0 |
+----------------------------+
|["zhangsan","lisi","wangwu"]|
+----------------------------+