背景:存数据的时候多存了,导致数据重复
思路:group by 对重复的字段进行分组,重复意味着group by count的结果大于1,将count大于1的字段的最小id保留,其余删除
语句
delete from base_cell_extend where data_category =6 and data_key='device_num' and
cgi in (select cgi from base_cell_extend where data_category =6 and data_key='device_num' group by cgi having count(cgi) > 1 )
and base_cell_extend.id not in (select min(base_cell_extend.id) from base_cell_extend where data_category =6 and data_key='device_num' group by cgi having count(cgi) > 1 );
注意:最好还是能够在删除之前先select,并且计算一下删除后的数据总数是否正确,再进行delete
查找重复数据 group by 后having 作为条件
select cgi,count(*) from base_cell_extend
where data_category =6 and data_key='device_num'
group by cgi having count(cgi) > 1;
数据去重的几个方法:
1.distinct
2.group by
3.窗口函数