原表
id,task_id,current_unit_id
42688645,30404829,104
42688647,30404829,128
42688648,30404829,117
42688650,30404829,118
42688652,30404829,119
42688653,30404829,120
42688682,30404842,104
42688683,30404842,128
42688685,30404842,117
需要将current_unit_id按照id的顺序,以task_id 组合成一个单元
效果如下:
task_id,unit
30404842,104,117,128
30404829,104,117,118,119,120,128
使用concat_ws 及collect_set函数
collect_set 可以使一列单元值成为一个array单元值,但是内值需要转为string
concat_ws 是转置成向量后以指定符号粘合,需要配合group by使用
SELECT DISTINCT s.task_id,CONCAT_WS(',',collect_set(s.current_unit_id)) AS task_process
FROM
( SELECT DISTINCT s1.id, s1.task_id,cast(s1.current_unit_id as STRING ) as current_unit_id
FROM s1
WHERE s1.pt = '20180703'
AND task_id in(30404842,30404829)
order by s1.id LIMIT 900000000
) s
group by s.task_id