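Taking a consumption record table (consume_record) as an example, the following SQL statement deletes duplicate rows and keeps the row with the smallest id in each group of duplicates: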
DELETE consume_record
FROM
    consume_record,
    (
        SELECT
            MIN(id) id,
            user_id,
            monetary,
            consume_time
        FROM
            consume_record
        GROUP BY
            user_id,
            monetary,
            consume_time
        HAVING
            COUNT(*) > 1
    ) t2
WHERE
    consume_record.user_id = t2.user_id
    AND consume_record.monetary = t2.monetary
    AND consume_record.consume_time = t2.consume_time
    AND consume_record.id > t2.id;
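Analysis of the SQL statement:
1. Query the duplicated records into a set (the derived table t2). A record counts as duplicated when more than one row shares the same user_id, monetary and consume_time, and the set holds the smallest id of each such group.
(SELECT min(id) id, user_id, monetary, consume_time FROM consume_record GROUP BY user_id, monetary, consume_time HAVING count(*) > 1 ) t2
2. Join the original table to t2 on the columns that define a duplicate.
consume_record.user_id = t2.user_id and consume_record.monetary = t2.monetary and consume_record.consume_time = t2.consume_time
3. Delete the rows in the original table whose id is greater than the id in t2, so that only the row with the smallest id in each group remains.
DELETE consume_record FROM ... WHERE ... AND consume_record.id > t2.id;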
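Before running the DELETE, it can be useful to see which rows it would remove. The query below is a minimal sketch of such a dry run: it reuses the same derived table and the same conditions, but only selects. The alias cr and the output column kept_id are names introduced here for illustration and do not appear in the original statement.

```sql
-- Dry run: list the rows the DELETE would remove, without changing any data.
SELECT
    cr.id,
    cr.user_id,
    cr.monetary,
    cr.consume_time,
    t2.id AS kept_id   -- smallest id in the group; this row would be kept
FROM
    consume_record cr
    JOIN (
        SELECT
            MIN(id) AS id,
            user_id,
            monetary,
            consume_time
        FROM
            consume_record
        GROUP BY
            user_id,
            monetary,
            consume_time
        HAVING
            COUNT(*) > 1
    ) t2
        ON  cr.user_id = t2.user_id
        AND cr.monetary = t2.monetary
        AND cr.consume_time = t2.consume_time
WHERE
    cr.id > t2.id
ORDER BY
    cr.user_id,
    cr.consume_time,
    cr.id;
```

One thing to keep in mind: GROUP BY places NULL values in the same group, but the = comparison in the join never matches NULL, so duplicates that contain NULL in any of the three columns are neither listed here nor removed by the DELETE.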
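Test results:
Figure 1 shows the total record count before deletion, 45541. Figure 2 shows the delete operation, which removed 2800 duplicate records out of the 45541 in 0.09 seconds. Figure 3 shows the total record count after deletion. The test table is attached for anyone who needs it: download and import it to run the test yourself. consume_record.sql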
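If the attached file is not available, a small table is enough to try the statement. The script below is only a sketch under assumptions: the column types are guessed from the column names rather than taken from consume_record.sql, the syntax assumes MySQL, and the six rows are made-up sample data.

```sql
-- Hypothetical schema: column types are assumed, not taken from consume_record.sql.
CREATE TABLE consume_record (
    id           INT PRIMARY KEY,
    user_id      INT NOT NULL,
    monetary     DECIMAL(10, 2) NOT NULL,
    consume_time DATETIME NOT NULL
);

-- Made-up sample rows: ids 1-3 are one duplicate group, ids 4-5 another, id 6 is unique.
INSERT INTO consume_record (id, user_id, monetary, consume_time) VALUES
    (1, 1, 10.00, '2020-01-01 10:00:00'),
    (2, 1, 10.00, '2020-01-01 10:00:00'),
    (3, 1, 10.00, '2020-01-01 10:00:00'),
    (4, 2, 25.50, '2020-01-02 09:30:00'),
    (5, 2, 25.50, '2020-01-02 09:30:00'),
    (6, 3,  7.00, '2020-01-03 08:00:00');
```

With this data, the DELETE statement above removes the rows with id 2, 3 and 5 and leaves the rows with id 1, 4 and 6.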
The following statement applies the same approach in SQL Server, deduplicating account records by AccountEmail:
DELETE [FSDBtemp].[dbo].[CusUsers]
FROM [FSDBtemp].[dbo].[CusUsers],
    (
        SELECT
            MIN(cuid) cuid,
            [AccountEmail]
        FROM
            [FSDBtemp].[dbo].[CusUsers]
        GROUP BY
            [AccountEmail]
        HAVING
            COUNT(*) > 1
    ) t2
WHERE
    [FSDBtemp].[dbo].[CusUsers].AccountEmail = t2.AccountEmail
    AND [FSDBtemp].[dbo].[CusUsers].cuid > t2.cuid
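For comparison, SQL Server also allows deleting through a common table expression that numbers the rows inside each group. This is a different technique from the join shown above, given here only as a hedged sketch: the names ranked and rn are introduced for illustration, and the transaction is rolled back so the sketch reports a count without changing the table.

```sql
BEGIN TRANSACTION;

-- Number the rows of each AccountEmail group by cuid; rn = 1 is the row to keep.
WITH ranked AS (
    SELECT
        cuid,
        ROW_NUMBER() OVER (PARTITION BY AccountEmail ORDER BY cuid) AS rn
    FROM [FSDBtemp].[dbo].[CusUsers]
)
DELETE FROM ranked
WHERE rn > 1;

-- Number of rows the DELETE removed.
SELECT @@ROWCOUNT AS deleted_rows;

-- Undo the delete; replace with COMMIT TRANSACTION once the count looks right.
ROLLBACK TRANSACTION;
```

Unlike the join version, PARTITION BY puts rows whose AccountEmail is NULL into one group, so this variant would also collapse those rows down to a single one. If that is not wanted, add a WHERE AccountEmail IS NOT NULL filter inside the CTE.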