SQL Server中的万圣节问题和建议的解决方案

描述 (Description)

As per Wikipedia, the Halloween problem was first discovered by Don ChamberlinPat Selinger, and Morton Astrahan, on Halloween day, 1976.

根据维基百科 ,万圣节问题最早是由 ·张伯林帕特·塞林格莫顿·阿斯特拉罕于1976年万圣节当天发现的。

Logically, there are two cursors performing a typical update operation. One to read and other to write. A read cursor identifies the row which is to be updated and another cursor that performs the write operation to update the rows. In terms of execution, this process is thus divided into two separate parts. To be more specific, first, the write cursor of an updated plan does not affect the second read cursor.

逻辑上,有两个游标执行典型的更新操作。 一个读,另一个写。 读游标标识要更新的行,另一个游标执行写操作以更新行。 在执行方面,该过程因此分为两个单独的部分。 更具体地说,首先,更新计划的写游标不影响第二个读游标。

This means that if the data is updated first by the “write” cursor before the data was read by the “read” cursor, then it could be possible to change the position of the row just by updating it. Consequently, read it a second time and it might be different. The means of a key being updated using such an operation in SQL Server, is called the Halloween effect. Let me explain in detail for more clarification.

这意味着,如果在“读取”游标读取数据之前先通过“写入”游标更新数据,则可以仅通过更新来更改行的位置。 因此,再次阅读它可能会有所不同。 在SQL Server中使用这样的操作更新密钥的方法称为万圣节效果 。 让我详细解释以进一步阐明。

In this situation, a row (or rows) in the queried data may move position within the result set. In the context of the aforementioned operations; let me provide an example of this. The cursor for writing, I liken it to a cricket game and for the second cursor in terms of reading, I also assume to be players. In this example, prior to all players available on the ground, the cricket game starts! This would obviously not be a good situation.

在这种情况下,查询数据中的一行(或几行)可能会在结果集中移动位置。 在上述操作的范围内; 让我提供一个例子。 用于写作的光标,我将其比喻为板球游戏;对于阅读方面的第二光标,我也假设是玩家。 在此示例中,在所有玩家都在地面上之前,板球比赛开始了! 这显然不是一个好情况。

We must ensure data is available to the write, which has been fully read. For securing protection against this scenario, it’s possible to introduce a blocking operator in the plan. For our example, the SPOOL operator would be a good candidate. Unfortunately, inserting a blocking operator, such as an eager spool, into the updated plan is required for copying all rows output by the read cursor and this copying is quite expensive. It is not necessary for the introduced blocking operator (e.g. spool) in every scenario, though. Generally, SQL Server uses this operator, because its cost is lower in comparison to other blocking operator. Functionally, the data has been inserted into tempDB, before it is used by a write operation. In short, it needs to be ensured the data is read first before modification happen.

我们必须确保已对​​数据进行完全读取的写入数据可用。 为了防止这种情况发生,可以在计划中引入阻塞运算符。 对于我们的示例, SPOOL运算符将是一个不错的选择。 不幸的是,为了复制读取游标输出的所有行,需要在更新后的计划中插入阻塞的运算符(例如,急切后台处理程序),并且此复制非常昂贵。 但是,并非在每种情况下都需要引入的阻塞运算符(例如,线轴)。 通常,SQL Server使用此运算符,因为与其他阻塞运算符相比,它的成本较低。 从功能上讲,在写入操作使用数据之前,已将其插入tempDB。 简而言之,需要确保在修改发生之前先读取数据。

There is a different type of blocking operators available in SQL Server. For the purpose of more clarification, I have described operators overview.

SQL Server中提供了另一种类型的阻塞运算符。 为了更清楚起见,我已经描述了操作员概述。

非阻塞VS阻塞运算符 (Non-blocking VS Blocking operator)

Generally, execution plan operators are logically divided into 4 different groups

通常,执行计划运算符在逻辑上分为4个不同的组

  • Logical and physical operators 逻辑和物理运算符

    Also called iterators. Its highlighted as blue icons, it may represent query execution or DML operations.

    也称为迭代器。 其突出显示为蓝色图标,它可能表示查询执行或DML操作。

  • Parallelism physical operators 并行物理运算符

    These are the same highlighted as blue icons. It might be a subset of logical and physical operators.

    这些与蓝色图标相同。 它可能是逻辑和物理运算符的子集。

  • Cursor operators 光标运算符

    These display color in yellow icons and is used it while cursor used in T-SQL.

    它们以黄色图标显示颜色,并在T-SQL中使用光标时使用。

  • Language Elements 语言要素

    Language elements display as green icons. Its represent T-SQL language elements.

    语言元素显示为绿色图标。 它代表T-SQL语言元素。

Out of those operators, A non-blocking operator gives output immediately, while performing the action logically. I have mentioned some non-blocking operators like Compute Scalar, Merge Join, Stream Aggregate, Lazy Spool, etc.

在这些运算符中,非阻塞运算符在逻辑执行操作的同时立即提供输出。 我已经提到了一些非阻塞运算符,例如Compute Scalar,Merge Join,Stream Aggregate, Lazy Spool等。

A blocking operator first reads all the input for performing the action, then return the data. Logically We can say simply it to stop and go operators as well. I have mentioned some blocking operators like are as follows.

阻塞运算符首先读取所有输入以执行操作,然后返回数据。 从逻辑上讲,我们也可以简单地说它既可以停止也可以停止操作。 我已经提到了一些阻塞运算符,如下所示。

  • Sort

    分类
  • Eagar Hash Distinct

    伊加尔哈希
  • Eagar Table spool

    Eagar表线轴
  • Eagar Row Count Spool

    Eagar行计数线轴
  • Hash Join

    哈希联接
  • UDX

    UDX
  • Scalar

    标量
  • Batch Hash Table build

    批处理哈希表构建
  • Remote range

    远程范围
  • Remote query

    远程查询
  • Remote fetch

    远程获取
  • Remote modify

    远程修改
  • Hatch Match(Aggregate)

    舱口盖匹配(总计)

The Halloween problem was originally detected in an update, but it could also occur in an Insert, Delete or Merge Statement as well. For more clarification, I will demonstrate it now.

万圣节问题最初是在更新中检测到的,但也可能在插入,删除或合并语句中发生。 为了进一步说明,我现在将演示它。

万圣节问题与更新声明 (Halloween problem with an Update statement)

The Halloween problem can appear in certain update operations. For demonstration purposes, here is the database creation sample script.

万圣节问题可能出现在某些更新操作中。 出于演示目的,这是数据库创建示例脚本。

样例脚本 (Sample Script)

CREATE DATABASE Halloween
GO
 
USE Halloween
GO
 
CREATE TABLE Halloween_Protection
(
	id INT PRIMARY KEY IDENTITY(1,1),
	custCode TINYINT,
	creditAmount DECIMAL(18,2),
	debitAmount DECIMAL(18,2)
)
GO
 
INSERT INTO Halloween_Protection
(
 custCode,
 creditAmount,
 debitAmount
)
SELECT	101,1000,0
UNION ALL
SELECT 102,500,0
UNION ALL
SELECT 103,900,0
UNION ALL
SELECT 104,800,0
UNION ALL
SELECT 105,700,0
UNION ALL
SELECT 106,1500,0
UNION ALL
SELECT 107,0,3000
GO

I have applied above mentioned script, I have DB Halloween ready for execution. As per my requirement, I need to update credit amount with 10% commission, which customer credit value is residing up to 1000. Based on that I have created a script and execute as follows with include execution plan.

我已经应用了上述脚本,我已经准备好执行DB Halloween。 根据我的要求,我需要用10%的佣金更新信用额,该信用额最多可容纳1000个客户信用额。基于此,我创建了一个脚本,并按以下步骤执行包含执行计划。

Here the whole update operation is divided into multiple phases, it’s referred to as phase separation. In the first phase, all the records are found, based on where clauses predicate, and information is curated for the next phase. Once phase1 is completed, the whole set of update information is passed to the next second phase, in which each record is updated using a unique identifier and the new creditAmount is updated. In the last phase, the data integrity constraint validation of this process is checked.

在这里,整个更新操作分为多个阶段,这称为阶段分离。 在第一阶段,根据where子句谓词找到所有记录,并为下一阶段收集信息。 一旦阶段1完成,整个更新信息集将传递到下一个第二阶段,在第二阶段中,使用唯一的标识符更新每个记录,并更新新的creditAmount。 在最后一个阶段,检查此过程的数据完整性约束验证。

The process in this execution flow reads rows at a time from the base source table. If rows are filtered with where clauses, as per above mentioned query the credit amount is increasing, this process is repeated until all rows are processed from the source table.

该执行流程中的过程一次从基本源表读取行。 如果使用where子句过滤行,则按照上面提到的查询,信用额度会增加,重复此过程,直到从源表中处理了所有行。

Look at wat the Clustered index scan operator found. It reads rows one at a time from the storage engine until it does not get the filtered records, here for the phase separation correctness introduced compute scalar operator. It performs a scalar computation and returns the computed value. Mostly this operator represents a minimal cost with respect execution plan. Despite of this, there are different type of blocking operator found in the plan. For more clarity purpose, I will demonstration it now.

看一下找到的聚集索引扫描运算符。 它一次从存储引擎读取一行,直到没有得到过滤的记录为止,这里是为引入计算标量运算符而提出的相分离正确性。 它执行标量计算并返回计算值。 通常,此操作员代表执行计划所需的最低成本。 尽管如此,计划中还是有不同类型的阻塞运算符。 为了更清晰起见,我现在将对其进行演示。

添加阻塞运算符 (Adding Blocking operators )

As per the above script, the table is ready. Now I am going to add a non-clustered index.

按照上面的脚本,表已准备就绪。 现在,我将添加一个非聚集索引。

USE [Halloween]
GO
CREATE NONCLUSTERED INDEX [NCI_Halloween] ON [dbo].[Halloween_Protection]
(
	[creditAmount] ASC
)

After adding an index, I executed two update statements with include execution plan.

添加索引后,我执行了两个包含include执行计划的更新语句。

As per te above-mentioned query1 plan, we have found multiple compute scalars due to a non-clustered index. It means here index NCI_Halloween is created on column creditAmount, due to this query is finding qualified rows in which an index exists. In Query2, I have tried to used index NCI_Halloween, due to that introduced blocking operator Table spool.

根据上述query1计划,由于非聚集索引,我们发现了多个计算标量。 这意味着此处索引NCI_Halloween是在creditAmount列上创建的,因为此查询正在查找存在索引的合格行。 在Query2中,由于引入了阻塞运算符Table spool 因此我尝试使用索引NCI_Halloween

插入语句的万圣节问题 (Halloween problem with Insert statement)

I have applied the insert statement with the not exists statement for the purpose of maintaining unique data into the table based on custcode column.

我已将insert语句与不存在语句一起应用,目的是基于custcode列在表中维护唯一数据。

Generally, while on the insert statement Halloween protection found where the target table is also referenced in the select statement, though I have found a Clustered Index Insert in the query plan.

通常,虽然我在查询计划中找到了聚集索引插入,但是在插入语句中,万圣节保护发现在select语句中也引用了目标表。

万圣节问题,删除语句 (Halloween problem with Delete statement)

I have also applied a delete statement with using the exists keyword.

我还通过使用exist关键字应用了一条delete语句。

By using self-join references while deleting the records based on particular condition, there is a requirement for Halloween protection as well as phase separation.

通过使用自联接引用同时根据特定条件删除记录,就需要万圣节保护以及相分离。

建议措施 (Recommended actions)

This is a SQL protection issue, not an exception or query specific issue. But sometimes due to this Halloween problem, we have a case of a performance bottleneck being realized. So, for such cases, we need to take care while during database design and development. Pursuant to this, I am trying to share generic views which might be useful in this scenario.

这是一个SQL保护问题,而不是异常或特定于查询的问题。 但有时由于这个万圣节问题,我们遇到了性能瓶颈的情况。 因此,在这种情况下,我们需要在数据库设计和开发期间多加注意。 为此,我尝试共享通用视图,这在这种情况下可能有用。

  • Try to apply a clustered index on the unique column in the table and keep the column in index key which is not updated infrequently.

    尝试在表的唯一列上应用聚集索引,并将该列保留在索引键中,该键不会经常更新。
  • Keep clustered Index size narrow and static same in non-clustered as well.

    保持聚集索引的大小狭窄,并且在非聚集索引中也保持不变。
  • Despite having a table with a clustered index, proper indexing in the table, is also required, as per use cases.

    尽管有一个具有聚集索引的表,但根据用例,还需要在表中进行适当的索引编制。
  • Whenever executing T-SQL Statements specifically with Insert, select, delete, update or merge try to index tuned, its need to index seek in lieu of scan of huge data. Try to introduce temporary table or table variable in lieu of prepared single complex DML statement.

    每当使用插入,选择,删除,更新或合并来专门执行T-SQL语句时,都尝试对索引进行调整,它需要索引查找来代替对大数据的扫描。 尝试引入临时表或表变量来代替准备的单个复杂DML语句。
  • On the coding side, make sure the T-SQL Statement is optimal in terms of Query cost.

    在编码方面,请确保T-SQL语句在查询成本方面是最佳的。
  • It would be more intuitive if we reduce the data volume in an update transaction using Means Used set based operation, Partition scheme etc. There are several approaches using T-SQL which could be useful for reducing query cost.

    如果我们使用基于均值集的操作,分区方案等来减少更新事务中的数据量,将会更加直观。使用T-SQL的几种方法可能对降低查询成本很有用。
  • Using Merge statement, we can avoid Halloween protection in join cases. In some case, Halloween protection in Merge statement could be required as well. It depends on T-SQL statements.

    使用Merge语句,我们可以避免在join情况下保护万圣节。 在某些情况下,也可能需要在Merge语句中提供万圣节保护。 它取决于T-SQL语句。
  • As always, prior rollout in production, everything should be thoroughly tested.

    与往常一样,在生产中首先进行部署,应该对所有内容进行全面测试。

参考资料 (References)

翻译自: https://www.sqlshack.com/the-halloween-problem-in-sql-server-and-suggested-solutions/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值