下面方法适用于所有数据库
需求:删除表内重复的行数据,没有主键的情况
新建测试表:
create table #a_aa_a(
t_name varchar(10),
t_date int,
t_grade int
)
insert into #a_aa_a VALUES ('盲僧',20210401,1)--
insert into #a_aa_a VALUES ('盲僧',20210403,2)
insert into #a_aa_a VALUES ('盲僧',20210401,1)--
insert into #a_aa_a VALUES ('亚索',20210401,1)
insert into #a_aa_a VALUES ('亚索',20210403,2)--
insert into #a_aa_a VALUES ('亚索',20210403,2)--
insert into #a_aa_a VALUES ('盖伦',20210401,1)
insert into #a_aa_a VALUES ('盖伦',20210403,2)
insert into #a_aa_a VALUES ('盖伦',20210408,3)
上面的建表语句结果
思路:得到重复数据的最大ID或者最小ID,删除其他数据,就得到唯一数据
问题:数据表里没有唯一ID的情况怎么办
解决:使用(开窗函数<---点它点它点它点它 )得到ID--链接有开窗函数的用法和例子
语句:
select row_number() over(partition by t_name order by t_name,t_date)
as id ,* from #a_aa_a
结果截图
得到的ID是与t_name,t_date有关联的,所以并不连续,但是也得到了唯一ID值
之后通过对结果集的t_name、t_date分组得到唯一ID
select min(a.id) as min_id,a.t_name,a.t_date from
(select row_number() over(partition by t_name order by t_name,t_date)
as id ,* from #a_aa_a) a group by a.t_name,a.t_date
结果图:与上述建表语句相比,后面有--的重复数据都变为不重复数据
查出要删除的重复数据
select * from (select row_number() over(partition by t_name order by t_name,t_date)
as id ,* from #a_aa_a) t1 join (select min(a.id) as min_id,a.t_name,a.t_date from
(select row_number() over(partition by t_name order by t_name,t_date) as id ,* from
#a_aa_a) a group by a.t_name,a.t_date) t2 on t1.t_date=t2.t_date
and t1.t_name=t2.t_name and t1.id<>t2.min_id
结果图:
改为delete语句--注意删除的表是T1
delete t1 from (select row_number() over(partition by t_name order by t_name,t_date)
as id ,* from #a_aa_a) t1 join (select min(a.id) as min_id,a.t_name,a.t_date from
(select row_number() over(partition by t_name order by t_name,t_date) as id ,* from
#a_aa_a) a group by a.t_name,a.t_date) t2 on t1.t_date=t2.t_date
and t1.t_name=t2.t_name and t1.id<>t2.min_id
查询有ID的结果集数据
同理。查询初始表,数据已经被删除
哎,又没事干了,继续划水