1、统计用逗号分隔字段中的元素
例如:有如下数据,需要把participants中每个元素出现的次数及对应的id统计出来:
id | participants |
169 | 吉利,搜狗,1 |
170 | 吉利,搜狗,2 |
171 | 吉利,3 |
172 | 吉利,4 |
173 | 吉利,5 |
SELECT group_concat(a.id) ids, substring_index( substring_index( a.participants, ',', b.help_topic_id + 1 ), ',',- 1 ) participant, count(a.id) num
FROM table_test a JOIN mysql.help_topic b
ON b.help_topic_id < ( length( a.participants ) - length( REPLACE ( a.participants, ',', '' ) ) + 1 )
group by substring_index( substring_index( a.participants, ',', b.help_topic_id + 1 ), ',',- 1 )
结果:
ids | participant | num |
169 | 1 | 1 |
170 | 2 | 1 |
171 | 3 | 1 |
172 | 4 | 1 |
173 | 5 | 1 |
169,170,173,172,171 | 吉利 | 5 |
169,170 | 搜狗 | 2 |
注意:此处借用了mysql库的辅表help_topic,由于该表的help_topic_id是连续的,且最大位为643(可以 select * from mysql.help_topic 来查看),所以这就要求我们业务表中需要统计的被逗号隔开的列participants中的元素不能超过643个。
另外,使用了group_concat函数将查询出的id字段由列合并成行。
2、查看数据表中某一字段重复的数据
select 字段名 from 表名 group by 字段名 having count( 字段名)>1;
首先要进行 group by 分组 再计算其数量大于1的数据
要切记你查看的内容(select 字段名)这一部分不可以显示全部数据(select *)
查询及删除重复记录的SQL语句
1、查找表中多余的重复记录,重复记录是根据单个字段(Id)来判断
select * from 表 where Id in (select Id from 表 group byId having count(Id) > 1)
2、删除表中多余的重复记录,重复记录是根据单个字段(Id)来判断,只留有rowid最小的记录
DELETE from 表 WHERE (id) IN ( SELECT id FROM 表 GROUP BY id HAVING COUNT(id) > 1) AND ROWID NOT IN (SELECT MIN(ROWID) FROM 表 GROUP BY id HAVING COUNT(*) > 1);
3、查找表中多余的重复记录(多个字段)
select * from 表 a where (a.Id,a.seq) in(select Id,seq from 表 group by Id,seq having count(*) > 1)
4、删除表中多余的重复记录(多个字段),只留有rowid最小的记录
delete from 表 a where (a.Id,a.seq) in (select Id,seq from 表 group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from 表 group by Id,seq having count(*)>1)
5、查找表中多余的重复记录(多个字段),不包含rowid最小的记录
select * from 表 a where (a.Id,a.seq) in (select Id,seq from 表 group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from 表 group by Id,seq having count(*)>1)