如何在SQL Server实现upsert功能

最新推荐文章于 2024-05-29 09:15:37 发布

omage

最新推荐文章于 2024-05-29 09:15:37 发布

阅读量1.9k

点赞数

分类专栏： SQL Server

本文链接：https://blog.csdn.net/omage/article/details/99690998

版权

SQL Server 专栏收录该内容

10 篇文章

订阅专栏

upsert是什么？

相信各位码农在工作当中经常碰到这样的场景，当这个表存在某条记录就更新其值，不存在则插入，如下代码例子所示

IF EXISTS (SELECT 1 FROM mytable WHERE Id = @Id)
    BEGIN
        UPDATE dbo.mytable
        SET Value = @Value
        WHERE Id = @Id;
    END
ELSE
    BEGIN
        INSERT INTO dbo.mytable (Id, Value)
        VALUES(@Id, @Value)
    END

这就是upsert的概念。你一定会想，这么简单的场景需要讨论吗？如果是在单线程环境下，这个确实没啥好讨论，简单的做下判断即可。但是如果是在多用户并发访问的情况下就不是那么简单了。

并发访问会有什么问题？

会造成重复插入一条相同id的记录，会造成primary key violation, 即违反主键约束，从而造成SQL语句抛出异常。这是由于当并发操作时，可能会存在两个或以上的线程执行时都判断不存在这个id的记录，从而同时执行插入操作。

解决办法

原理：事务 + isolation level serializable + 排他锁(updlock）

SERIALIZABLE隔离级别:

Statements cannot read data that has been modified but not yet committed by other transactions. (语句不能读取其他事务已修改但未提交的数据，即避免脏读)
No other transactions can modify data that has been read by the current transaction until the current transaction completes.（其他事务不能修改当前事务已读取的数据，除非当前事务已完成）
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes(其他事务不能插入新的记录，该记录的键值属于当前事务读取的键值范围内，除非当前事务已完成).

以下具体方法整理相关参考网址，从个人的理解都是遵循上述原理，只是实现上有些区别。

方法1：

set transaction isolation level serializable
begin transaction
if exists (select 1 from mytable with (updlock) where id = @id)
	update mytable set value = @value where id = @id;
else 
	insert mytable (id, value) values (@id, @value);
commit

在事务开始前，显式设置隔离级别为serializable ，该隔离级别对当前会话保持有效。

当采用serializable隔离级别后，可以避免出现主键冲突。但还需要在select语句显式的指定updlock提示，否则容易造成事务之间死锁。因为select语句默认会获取一个RangeS-S共享锁，共享锁是不排他的，即其他事务也会获取这个共享锁。当执行后面的update或insert语句时，当前事务需要将RangeS-S共享锁升级为排他锁，而如果其他事务也拥有这个共享锁，就无法升级，从而导致死锁。加上updlock提示，则一开始select语句获得的就是排他的更新锁，从而避免死锁。（详见https://samsaffron.com/blog/archive/2007/04/04/14.aspx)

方法2

begin tran
if exists (select * from t with (updlock,serializable) where pk = @id)
   begin
    update mytable set value = @value where id = @id;
   end
else
  begin
   insert mytable (id, value) values (@id, @value);
  end
commit tran

对select语句加提示serializable,是对这个表采用serializable隔离级别，而方法1是隔离级别对当前会话有效。

方法3

先执行update语句，如果执行结果为空，则插入，因update本身默认就是获得更新锁，可以不用加updlock提示。

begin tran
   update mytable with (serializable) 
       set value = @value where id = @id;
   if @@rowcount = 0
   begin
    insert mytable (id, value) values (@id, @value);
   end
commit tran

根据https://www.cnblogs.com/zhenfengren/p/5618511.html的描述，方法3在都是insert的情况下要好于方法1和2。

方法4

MERGE mytable WITH (serializable) AS t

USING (SELECT @ID AS ID) AS new_id

      ON t.ID = new_id.ID

WHEN MATCHED THEN

    UPDATE  SET t.value = @value

WHEN NOT MATCHED THEN

    INSERT ( ID, Value ) VALUES (new_id.ID, @value);

从SQL Server 2008之后，微软引入了一个新的的命令语法：Merge,详见 https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017

Merge和insert,update一样，本身是个原子语句声明，不需要用begin tran/commit去显式的声明事务。在这里只需要使用

with (serializable）指定使用serializable隔离级别即可。

根据多篇参考文章所述，都推荐使用这个方法，该方法性能最佳，原因如下：

Faster performance. The Engine needs to parse, compile, and execute only one query instead of three (and no temporary variable to hold the key).（更快的性能，引擎只需解析、编译和只需一条语句而不是三条，且不需要临时变量保存键值）
Neater and simpler T-SQL code (after you get proficient in MERGE). (干净简洁的代码，当你熟悉merge语法后)
No need for explicit BEGIN TRANSACTION/COMMIT. MERGE is a single statement and is executed in one implicit transaction. （无需显式声明begin transaction/commit. merge是单个声明，在隐式事务中执行）
Greater functionality. MERGE can delete rows that are not matched by source (SRC table above). For example, we can delete row 1 from A_Table because its Data column does not match Search_Col in the SRC table. There is also a way to return inserted/deleted values using the OUTPUT clause.“ （很棒的功能，merge可以删除源未匹配的记录，举例，我们可以删除A表一条记录当数据字段未匹配源表查询，而且还可以通过output子句返回插入或删除的值）