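Unlike conventional row-level deduplication, here a single column holds a comma-separated string with many repeated values, and the duplicates inside that string need to be removed.
For example, the column value is: aa,bb,cc,dd,ab,aa,cc,dd
The deduplicated result should be: aa,bb,cc,dd,ab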
Replied on 2018-07-25 15:55:26 #5 Score: 100
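Sample SQL:
    -- 唯一字段 = unique key column, 去重字段 = column to deduplicate.
    -- your_table is a placeholder for the real table name (TABLE itself is a reserved word).
    -- Note: LISTAGG sorts the tokens, so the example value comes back as aa,ab,bb,cc,dd.
    SELECT 唯一字段,
           LISTAGG(COLUMN_DISTINCT, ',') WITHIN GROUP (ORDER BY COLUMN_DISTINCT) AS 去重字段
      FROM (SELECT DISTINCT 唯一字段, COLUMN_DISTINCT
              FROM (SELECT 唯一字段,
                           REGEXP_SUBSTR(去重字段, '[^,]+', 1, lv) AS COLUMN_DISTINCT
                      FROM your_table,
                           -- Row generator: covers at most 9 tokens per value; raise the bound for longer lists.
                           (SELECT LEVEL lv FROM dual CONNECT BY LEVEL < 10) b
                     WHERE b.lv <= REGEXP_COUNT(去重字段, ',') + 1))
     GROUP BY 唯一字段;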
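The query in this reply sorts the tokens alphabetically, whereas the result asked for in the question (aa,bb,cc,dd,ab) keeps each value in the position where it first appears. A minimal sketch of one way to preserve that order is shown here, assuming Oracle 11gR2 or later. The names `t`, `id`, `tags`, `tokens`, and `first_seen` are hypothetical and not from the original post, and the bound of 100 tokens per value is an arbitrary assumption.

```sql
-- Hypothetical sample data; replace the t subquery with the real table.
WITH t AS (
  SELECT 1 AS id, 'aa,bb,cc,dd,ab,aa,cc,dd' AS tags FROM dual
),
-- Split each value into one row per token, remembering its position.
tokens AS (
  SELECT t.id,
         n.lv AS pos,
         REGEXP_SUBSTR(t.tags, '[^,]+', 1, n.lv) AS token
    FROM t
   CROSS JOIN (SELECT LEVEL AS lv FROM dual CONNECT BY LEVEL <= 100) n  -- assumed upper bound
   WHERE n.lv <= REGEXP_COUNT(t.tags, ',') + 1
),
-- Keep each distinct token once, with the position of its first occurrence.
first_seen AS (
  SELECT id, token, MIN(pos) AS first_pos
    FROM tokens
   WHERE token IS NOT NULL
   GROUP BY id, token
)
-- Reassemble the list ordered by first occurrence instead of alphabetically.
SELECT id,
       LISTAGG(token, ',') WITHIN GROUP (ORDER BY first_pos) AS deduped
  FROM first_seen
 GROUP BY id;
```

The only real change from the sorted version is that the token position is carried along and `MIN(pos)` replaces `DISTINCT`, so `LISTAGG` can order by first occurrence. Values containing empty elements (for example `aa,,bb`) may need extra handling, since `'[^,]+'` skips empty tokens.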