ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)

Kodyaz Development ResourcesDevelopment resources, articles, tutorials, samples, codes and tools for .Net, SQL Server, Vista, etc. Welcome to Kodyaz Development Resources Sign in | Join | Help Home Articles News Tools Sample Chapters Trainers Blogs Forums Photos Files Roller T-SQL ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) to Delete Duplicate Rows in SQL Table If you have duplicate rows in your MS SQL Server database tables, you may want to delete duplicate records. T-SQL Row_Number() function can help sql developers to solve this sql problem. Before removing some of the rows in the SQL database table, you should decide which one of the row duplicates you will keep. And you should decide which rows to delete. You might want to keep the first entry, which can be assumed as the inserted date is earliest. Or if your sql table has an identity column, let's say an Id column with data type integer (int). You may want to keep the duplicate row with the smallest Id value, or with the biggest Id value. So SQL Row_Number enables tsql developers rank records and table rows over desired columns and table fields even enables partitioning the result set based on table columns. Before giving an sql example for deleting duplicate rows, let's create sql table and populate table with sample duplicate rows. CREATE TABLE Suppliers ( Id int identity (1,1), CompanyTitle nvarchar(1000), ContactName nvarchar(100), LastContactDate datetime ) Now insert sample data into Suppliers table for removing duplicate records example. INSERT Suppliers VALUES (N'Melody Music Instruments',N'James Manning', '20090623 10:15') INSERT Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090720 15:40') INSERT Suppliers VALUES (N'Top Music',N'Katy Swan', '20090827 18:00') INSERT Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090806 10:00') INSERT Suppliers VALUES (N'Melody Music Instruments',N'James Brown', '20080121 11:20') INSERT Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00') INSERT Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00') Order Rows by Datetime Fields Let's order table rows according to Last Contact Date within each Company record. SELECT RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC), Id, CompanyTitle, ContactName, LastContactDate FROM Suppliers Now we can develop t-sql code to delete rows where RN is different than 1, which we assume duplicates since they are old records. We have a newer row in the sql table Suppliers. This sql script uses the T-SQL Row_Number() function with Partition By option. T-SQL developers will soon realize that they can use Partition Over clause in many cases to solve their problems. WITH [CTE DUPLICATE] AS ( SELECT RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC), Id, CompanyTitle, ContactName, LastContactDate FROM Suppliers ) DELETE FROM [CTE DUPLICATE] WHERE RN > 1 Order Rows by Identity Field For this example code, we will order sql table rows according to the Identity field column Id within each Company record. SELECT RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY Id DESC), Id, CompanyTitle, ContactName, LastContactDate FROM Suppliers And within each group that is formed by the Company field, we will only keep the row with biggest Identity field value (Id column value), and remove others by deleting them. WITH [CTE DUPLICATE] AS ( SELECT RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY Id DESC), Id, CompanyTitle, ContactName, LastContactDate FROM Suppliers ) DELETE FROM [CTE DUPLICATE] WHERE RN > 1 Summary - Delete Duplicate Rows in SQL Table It is very easy to identify duplicate rows and delete duplicates due to your requirements by using a partition over clause, ROW_NUMBER() OVER (PARTITION BY columnname1, columnname2 ORDER BY columnname3 DESC) and using it within a CTE t-sql statement as shown in the above t-sql examples. I hope you enjoy the sql examples and find them useful. Additional Tutorials for Dublicate Rows If you have identical rows, I mean all column values are identical, how can we identify the duplicate rows and delete duplicate records from database table using t-sql commands. In this case there is still methods for sql administrators and t-sql developers to delete duplicate rows in sql tables. You can find an other sql tutorial titled How to delete duplicate records or rows among identical rows in a table where no primary key exists among Kodyaz SQL articles. Related SQL Resources SQL Server Articles SQL Server Tools SQL Blog SQL Server 2008 Blog Certification Exams Blog Reporting Services Blog Analysis Services Blog MS SQL Server Forums Free Exam Vouchers Copyright © 2004 - 2010 Eralper Yilmaz. All rights reserved.
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值