CREATE TABLE [dbo].[Datamonitoring](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Monitmun] [nvarchar](50) NULL,
[Monitdate] [smalldatetime] NOT NULL,
在收集检测数据的时候,由于各种手残+脑残将一条或者几条数据同时上传,导致数据库出现某些数据有一条或者两条重复数据。我在处理时时根据时间来判断重复。
处理SQL:
DELETE FROM [dbo].[Datamonitoring] WHERE
ID NOT IN (SELECT MAX(ID) FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) >= 1 )
分析:
总体思路,我们是要删除重复的数据,并且要保留一条。那么不重复的数据(只有一条的数据)不能删除,保留重复的其中一条数据。
1.获取重复的数据
SELECT MONITDATE FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) > 1
2.获取重复数据中的一条.(MAX或者MIN都可以,其实我就只是想要其中一条)
SELECT MAX(ID) FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) > 1
3.获取不重复的数据
SELECT MONITDATE FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) = 1
4.现在只需要删除ID不在第2,3中的结果中的数据即可。
SELECT FROM [dbo].[Datamonitoring] WHERE
ID NOT IN SELECT MAX(ID) FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) > 1
AND
ID NOT IN SELECT MONITDATE FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) = 1 (这句肯定要错,ID 和 MONITDATE 类型不一致,但是在聚合中无法使用ID)
怎么办呢,其实只需要将 MONITDATE 修改为 MAX(ID)或者MIN(ID)其实结果都一样,只有一条嘛。
SELECT FROM [dbo].[Datamonitoring] WHERE
ID NOT IN SELECT MAX(ID) FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) > 1
AND
ID NOT IN SELECT MAX(ID) FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) = 1
5.咋一看两句一样,于是进行合并。
SELECT FROM [dbo].[Datamonitoring] WHEREID NOT IN (SELECT MAX(ID) FROM [dbo].[Datamonitoring] GROUP BY MONITDATE HAVING COUNT(1) >= 1 )
<pre name="code" class="sql">