如何解决SQL Server中SQL身份危机

描述 (Description)

SQL Identity columns provide a convenient way to auto-number an ID column within a table without the need to manage the sequence. This convenience can save immense amounts of time, but also presents a single challenge: What happens when an identity column runs out of space within the data type chosen?

SQL Identity列提供了一种方便的方法来自动编号表中的ID列,而无需管理序列。 这种便利可以节省大量时间,但是也带来了一个挑战:当标识列在所选数据类型内的空间不足时会发生什么?

In this article, we will answer this question, as well as explore a variety of ways to detect this problem before it arises and solve it without causing disruption to important tables.

在本文中,我们将回答这个问题,并探索各种方法以在出现问题之前对其进行检测并解决该问题,而不会对重要表造成干扰。

介绍 (Introduction)

SQL Identity columns are often used as a way to auto-number some data element when we have no need to assign any specific values to it. Either the values are arbitrary, the column is a surrogate key, or we wish to generate numbers for use in other processes downstream.

当不需要为数据元素分配任何特定值时,SQL Identity列通常用作自动编号某些数据元素的方法。 值是任意的,列是代理键,或者我们希望生成数字以供下游的其他过程使用。

For most applications, identity columns are set-it-and-forget-it. For an integer, which can contain 2,147,483,647 positive values, it’s easy to assume that a number so large is out-of-reach for a fledgling table. As time goes on, though, and applications grow larger, two billion can quickly seem like a much smaller number. It’s important to understand our data usage, predict identity consumption over time, and proactively manage data type changes before an emergency arises.

对于大多数应用程序,标识列是设置后忘记的。 对于一个可以包含2,147,483,647个正值的整数,很容易假设这么大的数字对于新创建的表是不可行的。 但是,随着时间的流逝,应用程序越来越大,很快就会有20亿个数字变得很小。 重要的是要了解我们的数据使用情况,预测一段时间内的身份消耗,并在紧急情况发生之前主动管理数据类型的更改。

SQL标识列用尽时会发生什么 (What happens when a SQL Identity Column is exhausted)

When a SQL identity column reaches its limit, all insert operations will fail. We can test this easily by creating a table with an identity column, reseeding it to its limit, and then trying to insert a new row. Here’s a table with only 2 columns, an identity that is set to a seed near its maximum value and a string:

当SQL标识列达到其限制时,所有插入操作将失败。 我们可以通过创建一个带有标识列的表,将其重新设置到其极限,然后尝试插入新行来轻松测试。 这是一个只有两列的表,一个身份设置为接近其最大值的种子和一个字符串:

CREATE TABLE dbo.Identity_Test
(	My_Identity INT NOT NULL IDENTITY(2147483646,1) CONSTRAINT PK_Identity_Test PRIMARY KEY CLUSTERED,
	My_Dinosaur VARCHAR(25) NOT NULL	);
 

With the table created, we can begin inserting some data:

创建表后,我们可以开始插入一些数据:

 
INSERT INTO dbo.Identity_Test
	(My_Dinosaur)
VALUES
	('Euoplocephalus');
INSERT INTO dbo.Identity_Test
	(My_Dinosaur)
VALUES
	('Triceratops');

After these inserts, we can view the data in our table:

这些插入之后,我们可以查看表中的数据:

Sample data

Note that the identity value for our Triceratops is at the highest allowed by an integer data type. Now, let’s insert one more row:

请注意,三角龙的标识值是整数数据类型所允许的最高值。 现在,让我们再插入一行:

INSERT INTO dbo.Identity_Test
	(My_Dinosaur)
VALUES
	('Micropachycephalosaurus');

The result of this SQL INSERT statement is the following error:

此SQL INSERT语句的结果是以下错误:

The result of this SQL INSERT statement

SQL Server provides no built-in warning. When we exceed the bounds of a data type, we receive the same error that would be returned if we tried to store a higher number in the INTEGER. Until an action is taken on our part to resolve the limit we have hit, inserts will continue to fail. All other operations on this table will execute normally, including DELETE, UPDATE, and SELECT.

SQL Server不提供内置警告。 当我们超出数据类型的界限时,我们收到与试图将更大的数字存储在INTEGER中时将返回的相同错误。 在我们采取行动解决我们遇到的限制之前,插入将继续失败。 该表上的所有其他操作将正常执行,包括DELETE,UPDATE和SELECT。

The remainder of this article will deal with detecting a data type that is getting full with enough time so that we can take an action that does not need to be based on panic 🙂

本文的其余部分将讨论检测足够时间填充的数据类型,以便我们可以采取不必基于恐慌的操作🙂

如何确定SQL标识列中的剩余空间 (How to determine remaining space in a SQL Identity Column)

SQL Server provides two tools that allow us to accurately and quickly research identity column usage:

SQL Server提供了两个工具,这些工具使我们能够准确快速地研究身份列的使用情况:

  1. System views

    系统视图
  2. IDENT_CURRENT

    IDENT_CURRENT

System views provide us with information about tables, schemas, columns, and other objects in SQL Server. We can use these to generate lists of different types of objects that can then be used for reporting or further research. For our work here, we will use system views to collect a list of all identity columns in a given database. Once we have this list, we can begin researching each column and understanding how far along we are within each column’s data type. The following views will be used in our work:

系统视图为我们提供了有关SQL Server中的表,架构,列和其他对象的信息。 我们可以使用它们来生成不同类型对象的列表,然后将这些列表用于报告或进一步研究。 对于此处的工作,我们将使用系统视图来收集给定数据库中所有标识列的列表。 有了此列表后,我们就可以开始研究每一列并了解我们在每一列的数据类型中的距离。 在我们的工作中将使用以下视图:

  • sys.tables: A view with a row per user table in a given database. sys.tables :一个视图,在给定数据库中每个用户表都有一行。
  • sys.columns: Returns a row for each column in any view or table. sys.columns :为任何视图或表中的每一列返回一行。
  • sys.types: Contains a row for each data type in defined in a given database. This includes both user and system data types. sys.types :在给定数据库中定义的每种数据类型包含一行。 这包括用户和系统数据类型。
  • sys.schemas: Returns a row for each schema defined within a database. sys.schemas :为数据库内定义的每个架构返回一行。

By joining these views together, we can create a query that will provide details about each SQL identity column within a database:

通过将这些视图连接在一起,我们可以创建一个查询,该查询将提供有关数据库中每个SQL标识列的详细信息:

SELECT
	DB_NAME() AS database_name,
	schemas.name AS schema_name,
	tables.name AS table_name,
	columns.name AS column_name,
	types.name AS type_name
FROM sys.tables
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
INNER JOIN sys.types
ON types.user_type_id = columns.user_type_id
INNER JOIN sys.schemas
ON schemas.schema_id = tables.schema_id
WHERE columns.is_identity = 1;

When run in the WideWorldImporters Microsoft demo database, we get the following result set:

WideWorldImporters Microsoft演示数据库中运行时,我们得到以下结果集:

The result of this SQL INSERT statement in WideWorldImporters database

This database contains three SQL identity columns, including the one we created earlier in this article. Our system views return the schema, table, and column name, as well as the data type. Filtering on columns.is_identity = 1, we can limit our result set from all columns to only those that are defined with the IDENTITY attribute.

该数据库包含三个SQL标识列,其中包括我们在本文前面创建的列。 我们的系统视图返回模式,表和列名称以及数据类型。 过滤column.is_identity = 1 ,我们可以将所有列的结果集限制为仅使用IDENTITY属性定义的列。

The next tool at our disposal is the built-in function IDENT_CURRENT, which returns the last identity value that was allocated for a given identity column. This function accepts a table name (including the schema name) and returns the last SQL identity value used:

我们可以使用的下一个工具是内置函数IDENT_CURRENT,它返回分配给给定标识列的最后一个标识值。 此函数接受一个表名(包括模式名)并返回最后使用SQL标识值:

SELECT IDENT_CURRENT('Warehouse.VehicleTemperatures') AS current_identity_value;

This query returns the following results:

该查询返回以下结果:

Query result

Since that column is a BIGINT, we have a long way to go before we exhaust the ~9.2 quintillion values available to it.

由于该列是BIGINT,因此要用尽约9.2十亿五千万的值,我们还有很长的路要走。

This is a good start, but we do not want to run a separate query on every single table. While the schema name is optional, if we do not include it, then it will default to the dbo schema, which will not work for any tables that are not in the default dbo schema. For tables in other schemas, we will need to write dynamic SQL to insert both the schema and table name into the IDENT_CURRENT function:

这是一个好的开始,但是我们不想在每个表上都运行单独的查询。 尽管架构名称是可选的,但如果我们不包含它,则它将默认为dbo架构,这不适用于不在默认dbo架构中的任何表。 对于其他模式中的表,我们将需要编写动态SQL,以将模式和表名都插入IDENT_CURRENT函数中:

USE WideWorldImporters;
GO
 
CREATE TABLE #identity_columns
(	[database_name] SYSNAME NOT NULL,
	[schema_name] SYSNAME NOT NULL,
	table_name SYSNAME NOT NULL,
	column_name SYSNAME NOT NULL,
	[type_name] SYSNAME NOT NULL,
	maximum_identity_value BIGINT NOT NULL,
	current_identity_value BIGINT NULL,
	percent_consumed DECIMAL(25,4) NULL	);
 
DECLARE @Table_Name NVARCHAR(MAX);
DECLARE @Schema_Name NVARCHAR(MAX)
DECLARE @Sql_Command NVARCHAR(MAX) = '';
 
SELECT @Sql_Command = @Sql_Command + '
INSERT INTO #identity_columns
	([database_name], [schema_name], table_name, column_name, [type_name], maximum_identity_value, current_identity_value)
SELECT
	DB_NAME() AS database_name,
	''' + schemas.name + ''' AS schema_name,
	''' + tables.name + ''' AS table_name,
	''' + columns.name + ''' AS column_name,
	''' + types.name + ''' AS type_name,
	CASE
		WHEN ''' + types.name + ''' = ''TINYINT'' THEN CAST(255 AS BIGINT)
		WHEN ''' + types.name + ''' = ''SMALLINT'' THEN CAST(32767 AS BIGINT)
		WHEN ''' + types.name + ''' = ''INT'' THEN CAST(2147483647 AS BIGINT)
		WHEN ''' + types.name + ''' = ''BIGINT'' THEN CAST(9223372036854775807 AS BIGINT)
		WHEN ''' + types.name + ''' IN (''DECIMAL'', ''NUMERIC'') THEN CAST(REPLICATE(9, (' + CAST(columns.precision AS VARCHAR(MAX)) + ' - ' + CAST(columns.scale AS VARCHAR(MAX)) + ')) AS BIGINT)
		ELSE -1
	END AS maximum_identity_value,
	IDENT_CURRENT(''[' + schemas.name + '].[' + tables.name + ']'') AS current_identity_value;
'
FROM sys.tables
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
INNER JOIN sys.types
ON types.user_type_id = columns.user_type_id
INNER JOIN sys.schemas
ON schemas.schema_id = tables.schema_id
WHERE columns.is_identity = 1;
 
EXEC sp_executesql @Sql_Command;
 
UPDATE #identity_columns
	SET percent_consumed = CAST(CAST(current_identity_value AS DECIMAL(25,4)) / CAST(maximum_identity_value AS DECIMAL(25,4)) AS DECIMAL(25,2)) * 100;
 
SELECT
	*
FROM #identity_columns;

This script adds additional elements to our previous query, including:

该脚本将其他元素添加到我们先前的查询中,包括:

  • The size of a column. For TINYINT, SMALLINT, INT, and BIGINT, this is a fixed value. For DECIMAL and NUMERIC, it is based on the definition of the column and how much space is allocated to the non-decimal component of the data type.

    列的大小。 对于TINYINT,SMALLINT,INT和BIGINT,这是一个固定值。 对于DECIMAL和NUMERIC,它基于列的定义以及为数据类型的非小数部分分配了多少空间。
  • The current SQL identity value, which is the number returned by IDENT_CURRENT.

    当前SQL身份值,即IDENT_CURRENT返回的数字。
  • The percent of the column consumed, as determined by dividing the current identity value by the maximum value and multiplying by 100.

    列消耗的百分比,通过将当前标识值除以最大值再乘以100来确定。

When run on WideWorldImporters, we get the following output:

WideWorldImporters上运行时,我们得到以下输出:

Identity tables in sample database

These results provide us with everything we need to understand how much of an identity columns data space has been consumed. We can see that the first two columns are safe for now, as they have an immense amount of identity values to churn through before being in danger of running out. Our test table, on the other hand, is at 100% capacity. As we recall from the demo earlier, this column has no space left and INSERT operations against it will fail due to this limit.

这些结果为我们提供了了解已消耗了多少标识列数据空间所需的一切。 我们可以看到,前两列目前是安全的,因为它们有大量的标识值需要经过大量修改后才能耗尽。 另一方面,我们的测试台容量为100%。 正如我们在前面的演示中所回顾的那样,此列没有剩余空间,并且由于此限制,针对它的INSERT操作将失败。

If we run the query above against AdventureWorks2016, we can see a larger result set:

如果我们对AdventureWorks2016运行以上查询,则可以看到更大的结果集:

Identity tables in sample database Adventureworks2014

We can see here that nearly all columns are under 1% used, with the exception of the TINYINT identity on HumanResources.Shift.

我们可以在这里看到几乎所有列的使用率均低于1%,除了HumanResources.Shift上的TINYINT标识之外。

Links are provided at the end of this article to where you can download Microsoft’s various demo databases, which are used throughout this article.

本文末尾提供链接,您可以从这些链接下载Microsoft的各种演示数据库,本文将使用它们。

解决SQL身份完整性 (Resolving SQL Identity Fullness)

Now that we have the ability to quickly research identity column utilization, we can take the TSQL presented above and move it into a stored procedure that gets executed via a job or some other process semi-regularly. We can then choose some threshold that we need to act on. For example, we may decide that any identity column that is over 85% consumed will be the target of DBA actions in order to prevent it from exhausting the range of its data type.

现在,我们已经能够快速研究标识列的利用率,我们可以使用上面介绍的TSQL并将其移到存储过程中,该存储过程可以通过作业或某些其他进程半定期地执行。 然后,我们可以选择需要执行的一些阈值。 例如,我们可以决定将消耗超过85%的任何标识列作为DBA操作的目标,以防止耗尽其数据类型范围。

Once we have identified a table that needs to be acted upon, we have a variety of choices in how to solve the problem. Determining the best option for resolving an identity column’s fullness will be require us to consider the table, its usage, contention, and resources available. Here are some ways that we can solve our problem:

一旦确定了需要处理的表格,我们就可以在解决问题的方式上有多种选择。 确定解决身份列的完整性的最佳选择将需要我们考虑表,表的用法,争用和可用资源。 以下是一些我们可以解决问题的方法:

Alter an identity column in-place

就地更改标识列

The simplest way to increase the range of a SQL identity column is to alter it to a larger data type. For example, we can change a SMALLINT into an INT, or an INT into a BIGINT. This option requires very little TSQL, but will lock the table while it is executing. In addition, quite a bit of log space can be used to complete the operation.

增加SQL Identity列范围的最简单方法是将其更改为更大的数据类型。 例如,我们可以将SMALLINT更改为INT,或将INT更改为BIGINT。 该选项只需要很少的TSQL,但是在执行时将锁定表。 此外,可以使用大量日志空间来完成操作。

Pros

优点

  • Simple TSQL, minimal effort to implement. Application may need to be updated to address the new data type, but no significant code changes should be required.

    简单的TSQL,只需最少的努力即可实现。 可能需要更新应用程序以解决新的数据类型,但是不需要进行重大的代码更改。
  • No need to rename objects or start a new table for our data.

    无需重命名对象或为我们的数据启动新表。
  • The risk of overflowing data decreases drastically after this change.

    更改后,数据溢出的风险将大大降低。

Cons

缺点

  • Will lock the table while executing. For a large table, this can take a long time and prevent applications and processes from using the table.

    执行时将锁定表。 对于大表,这可能需要很长时间,并且会阻止应用程序和进程使用该表。
  • Can cause significant log growth if the table is large.

    如果表很大,可能导致日志显着增长。


Altering a column in place is good for a smaller table or in scenarios where an application outage is tolerable. Consider that every row in the table needs to be updated as additional space needs to be allocated to the column in each row. For a table with a billion rows with an integer identity column, altering to a BIGINT will require approximately 4 billion additional bytes. That’s about 4GB of data that needs to be written to the table, at minimum. In practice, the space used will likely be much higher as the data will not all be compactly squeezed into consecutive, full pages. Instead, there will be page splits and the index (or heap) will temporarily take up more space, until index maintenance is performed.

在较小的表上或在容许应用程序中断的情况下,在适当位置更改列是一个好选择。 考虑到表中的每一行都需要更新,因为需要为每一行中的列分配额外的空间。 对于具有十亿行且具有整数标识列的表,更改为BIGINT将需要大约40亿个额外字节。 至少需要将大约4GB的数据写入表中。 实际上,由于数据不会全部紧凑地压缩为连续的完整页面,因此使用的空间可能会更高。 相反,将进行页面拆分,索引(或堆)将临时占用更多空间,直到执行索引维护为止。

To test a column alter on our table from earlier, we can implement it as follows:

为了从较早的时候在我们的表上测试一列变更,我们可以如下实现它:

 
ALTER TABLE dbo.Identity_Test ALTER COLUMN My_Identity BIGINT NOT NULL;
 

Executing this results in an error message:

执行此操作将导致错误消息:

Alter table alter column command

Ah, we need to drop our clustered primary key on this column before we can alter its data type:

啊,我们需要在此列上删除集群主键,然后才能更改其数据类型:

 
ALTER TABLE dbo.Identity_Test DROP CONSTRAINT PK_Identity_Test;
 
ALTER TABLE dbo.Identity_Test ALTER COLUMN My_Identity BIGINT NOT NULL;
 

This succeeds, and we can view the new data type in the table:

这样成功了,我们可以在表中查看新的数据类型:

new data type in the table

Now, we can add back the clustered primary key that originally was defined on the column:

现在,我们可以重新添加最初在该列上定义的集群主键:

 
ALTER TABLE dbo.Identity_Test ADD CONSTRAINT PK_Identity_Test PRIMARY KEY CLUSTERED (My_Identity);
 

And, we’re done! Of course, this table only had a handful of rows, so we expected this to be a quick and painless operation.

并且,我们完成了! 当然,该表只有几行,因此我们希望这是一种快速而轻松的操作。

Reseed the SQL Identity Column

重新设置“ SQL标识”列

An additional option to resolve an identity column that is running out of space is to reseed to column to a very low number and allow it to begin growing naturally again. This option requires no application-facing schema changes, but it will force the reuse of old IDs, which may not be an acceptable option. In addition, if the column is truly growing fast, this may only serve as a temporary solution. If the identity column fills up again soon, then we’ll be forced to revisit this problem again (and again).

解决标识列已用完的另一种选择是将种子列重新设置为非常小的数目,并使其重新开始自然增长。 此选项不需要更改面向应用程序的架构,但是它将强制重用旧的ID,这可能不是一个可接受的选项。 另外,如果色谱柱确实快速增长,则只能作为临时解决方案。 如果“身份”列很快又被填满,那么我们将被迫再次(又一次)重新研究此问题。

Pros

优点

  • No application-facing schema changes.

    没有面向应用程序的架构更改。
  • Will not block applications.

    不会阻止应用程序。
  • Will not cause any significant log growth.

    不会导致任何显着的日志增长。
  • Will not take long to execute.

    不需要很长时间即可执行。

Cons

缺点

  • Reuses old SQL identity values, which may be unacceptable for tables in which the ID values are important and must be unique all-time.

    重用旧SQL身份值,这对于ID值很重要并且必须始终唯一的表可能是不可接受的。
  • If table is very fast-growing, it will only be a temporary solution that we will need to visit again soon.

    如果表增长非常快,那将只是一个临时解决方案,我们需要尽快再次访问。
  • Need to confirm existing column values to ensure that we will not duplicate a value in the near future.

    需要确认现有的列值,以确保我们在不久的将来不会重复一个值。

This solution truly feels like we are kicking the can down the road a bit, and in many scenarios that will be the case. For a table in which the ID values are disposable and the data short-lived, though, this can be a quick and easy way to resolve the problem without having to resort to lengthening the column.

这种解决方案确实让我们感觉像是在将罐子推开了,在许多情况下都是如此。 但是,对于ID值是一次性的并且数据寿命短的表,这可以是解决问题的快速简便的方法,而不必求助于延长列。

Let’s demonstrate how to reseed an identity column to a new value:

让我们演示如何将标识列重新设置为新值:

DBCC CHECKIDENT ('Dbo.Identity_Test', RESEED, -2147483648);

Executing this is quick and painless, and there’s no need to drop dependencies, since the schema itself isn’t changing. Now if we insert a row, we’ll note that it now has an ID that’s a very, very negative number:

执行此操作既快速又轻松,并且无需删除依赖项,因为架构本身不会更改。 现在,如果我们插入一行,我们会注意到它现在的ID是一个非常非常负的数字:

INSERT INTO dbo.Identity_Test
	(My_Dinosaur)
VALUES
	('Pterodactyl');
SELECT * FROM dbo.Identity_Test;

The results of the SELECT are as follows:

SELECT的结果如下:

Reseed identity in SQL

Note that the Pterodactyl has the most negative value allowed by an integer data type. We’re now all set until we’ve exhausted about 4.2 billion integers!

请注意,翼手龙具有整数数据类型所允许的最大负值。 现在我们已经准备就绪,直到用尽约42亿个整数!

If you choose to reseed the column on a semi-regular basis, be sure to validate that ID values will never, ever be reused. What happens if we reuse an identity value? We can test that here:

如果您选择半定期重新设置列的种子,请确保验证ID值永远不会被重用。 如果我们重复使用身份值会怎样? 我们可以在这里测试:

DBCC CHECKIDENT ('Dbo.Identity_Test', RESEED, 2147483646);

Now we can try once again to add another dinosaur to our table:

现在,我们可以再次尝试将另一个恐龙添加到表中:

INSERT INTO dbo.Identity_Test
	(My_Dinosaur)
VALUES
	('Micropachycephalosaurus');

The result is an error:

结果是一个错误:

Violation of primary key error

Note that once again, the error did not mention that a SQL identity column crashed into itself or even that the identity itself was to blame. It simply tried to insert a value that was already there and threw a primary key violation as a result. Because of this—be careful when reseeding identity columns and ensure that numbers cannot be accidentally reused. Also, keep in mind that you may reseed an INT or BIGINT to a negative value as we did earlier in order to increase the range of numbers that can be used before the column’s data type is exhausted.

请再次注意,该错误并未提及SQL身份列崩溃到自身,甚至没有提到身份本身。 它只是试图插入一个已经存在的值,并因此引发主键冲突。 因此,在重新填充标识列时要小心,并确保不会意外重用数字。 另外,请记住,您可以像我们之前那样将INT或BIGINT设置为负值,以扩大在用尽列的数据类型之前可以使用的数字范围。

Create a New Table

创建一个新表

A more complex, but good solution for the scenario in which you have far more data is to create a new table with a correctly sized identity column, slowly sync the data up, and then switch over to use it at a future time. This allows us to stage our data and prepare the new column at our leisure without causing any significant blocking or application outage while we are updating our schema.

对于您拥有更多数据的情况,一种更复杂但很好的解决方案是创建一个具有正确大小的标识列的新表,缓慢地同步数据,然后再切换以备将来使用。 这样,我们就可以在暂存数据时暂存数据并准备新的列,而不会在更新架构时造成任何重大的阻塞或应用程序中断。

Pros

优点

  • We avoid identity value reuse or the churn of altering a column in-place.

    我们避免了身份值重用或就地更改列的麻烦。
  • The new column can be backfilled slowly, preventing the need for a long outage while the new data is populated.

    可以缓慢回填新列,从而避免在填充新数据时长时间停机。
  • Scales, even for very large tables.

    秤,即使是非常大的桌子也可以。

Cons

缺点

  • The small outage should still be planned while the tables are swapped.

    交换表时,仍应计划少量中断。
  • More complex solution than previous solutions.

    比以前的解决方案更复杂的解决方案。
  • Trigger (or some maintenance logic) is required on source table to keep both in sync.

    源表上需要触发器(或某些维护逻辑)以使两者保持同步。

There are a number of ways of accomplishing this, each with varying degrees of rigidness in the release process. The method demonstrated here will ensure that all data is copied to a new table and kept in sync until we are ready to swap. If your needs are less stringent, feel free to simplify the process to reduce complexity and/or work. Here is the starting point for our efforts:

有多种方法可以实现此目的,每种方法在释放过程中具有不同程度的刚度。 此处演示的方法将确保所有数据都复制到新表中并保持同步,直到我们准备交换为止。 如果您的需求不太严格,请随时简化过程以减少复杂性和/或工作。 这是我们努力的起点:

CREATE TABLE dbo.Identity_Test_BIGINT
(	My_Identity BIGINT NOT NULL IDENTITY(1,1) CONSTRAINT PK_My_Identity_BIGINT PRIMARY KEY CLUSTERED,
	My_Dinosaur VARCHAR(25) NOT NULL	);
GO
 
CREATE TRIGGER TR_Identity_Test_Bigint_Upgrade
ON dbo.Identity_Test
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
	SET NOCOUNT ON;
	-- INSERTS ONLY
	SET IDENTITY_INSERT dbo.Identity_Test_BIGINT ON;
	INSERT INTO dbo.Identity_Test_BIGINT
		(My_Identity, My_Dinosaur)
	SELECT
		Inserted.My_Identity,
		Inserted.My_Dinosaur
	FROM Inserted
	LEFT JOIN Deleted
	ON Deleted.My_Identity = Inserted.My_Identity
	WHERE Deleted.My_Identity IS NULL;
	SET IDENTITY_INSERT dbo.Identity_Test_BIGINT OFF;
 
	-- DELETES ONLY
	DELETE Identity_Test_BIGINT
	FROM dbo.Identity_Test_BIGINT
	WHERE Identity_Test_BIGINT.My_Identity IN (
		SELECT
			Deleted.My_Identity
		FROM Deleted
		LEFT JOIN Inserted
		ON Deleted.My_Identity = Inserted.My_Identity
		WHERE Inserted.My_Identity IS NULL);
 
	-- UPDATES ONLY
	UPDATE Identity_Test_BIGINT
	SET My_Dinosaur = Inserted.My_Dinosaur
	FROM Inserted
	INNER JOIN Deleted
	ON Deleted.My_Identity = Inserted.My_Identity
	INNER JOIN dbo.Identity_Test_BIGINT
	ON Identity_Test_BIGINT.My_Identity = Inserted.My_Identity
	WHERE Inserted.My_Dinosaur <> Deleted.My_Dinosaur;
 
	-- UPDATES WHERE THE ROW HAS YET TO EXIST IN THE NEW TABLE
	SET IDENTITY_INSERT dbo.Identity_Test_BIGINT ON;
	INSERT INTO dbo.Identity_Test_BIGINT
		(My_Identity, My_Dinosaur)
	SELECT
		Inserted.My_Identity,
		Inserted.My_Dinosaur
	FROM Inserted
	INNER JOIN Deleted
	ON Deleted.My_Identity = Inserted.My_Identity
	WHERE Inserted.My_Identity NOT IN
		(SELECT Identity_Test_BIGINT.My_Identity FROM dbo.Identity_Test_BIGINT);
	SET IDENTITY_INSERT dbo.Identity_Test_BIGINT OFF;
END

This script creates a new table called Identity_Test_BIGINT and adds a trigger onto Identity_Test that will mirror all operations onto the new table: INSERT, UPDATE, and DELETE. In addition, if an update occurs on a row that is not in the new table, we’ll also catch that and copy it over. Here are some demos of the operations and results:

该脚本将创建一个名为Identity_Test_BIGINT的新表,并在Identity_Test上添加一个触发器,该触发器将所有操作镜像到该新表上:INSERT,UPDATE和DELETE。 此外,如果对不在新表中的行进行了更新,我们也将捕获并复制过来。 以下是操作和结果的一些演示:

INSERT INTO dbo.Identity_Test
	(My_Dinosaur)
VALUES
	('Troodon');
SELECT
	*
FROM dbo.Identity_Test;
SELECT
	*
FROM dbo.Identity_Test_BIGINT;

Sample data

We can see that the new row was correctly inserted into the new table.

我们可以看到新行已正确插入到新表中。

UPDATE dbo.Identity_Test
	SET My_Dinosaur = 'Parasaurolophus'
WHERE My_Dinosaur = 'Troodon';

Identity range

The update was also correctly applied to both the exsiting table, as well as the new one.

此更新也已正确应用于现有表和新表。

DELETE
FROM dbo.Identity_Test
WHERE My_Dinosaur = 'Parasaurolophus';

Identity range

Similarly, the delete operation removes the row from both tables. Lastly, what happens if we update a row in SQL Identity table Identity_Test that does not exist in our new table?

同样,删除操作将从两个表中删除该行。 最后,如果我们更新SQL身份表Identity_Test中新表中不存在的行,会发生什么情况?

UPDATE dbo.Identity_Test
	SET My_Dinosaur = 'Parasaurolophus'
WHERE My_Dinosaur = 'Pterodactyl';

Identity range

The last section of the trigger ensured that the update resulted in an insertion into the new table. At this point, we can backfill our data using whatever method we’d like. The following will batch IDs into groups of 2000, allowing for a safe and slow backfill that should have minimal impact on production operations. The batch sizes can be increased, but use caution to ensure that no negative impact results from writing too much data at once.

触发器的最后一部分确保更新导致插入到新表中。 此时,我们可以使用所需的任何方法回填数据。 以下内容会将ID批处理为2000个组,以确保安全且缓慢的回填,这对生产操作的影响应最小。 可以增加批处理的大小,但请谨慎使用,以确保一次写入太多数据不会造成负面影响。

WHILE @@ROWCOUNT > 0
BEGIN
	SET IDENTITY_INSERT dbo.Identity_Test_BIGINT ON;
	INSERT INTO dbo.Identity_Test_BIGINT
		(My_Identity, My_Dinosaur)
	SELECT
		Identity_Test.My_Identity,
		Identity_Test.My_Dinosaur
	FROM dbo.Identity_Test
	WHERE Identity_Test.My_Identity NOT IN
		(SELECT Identity_Test_BIGINT.My_Identity FROM dbo.Identity_Test_BIGINT)
	SET IDENTITY_INSERT dbo.Identity_Test_BIGINT OFF;
END

This runs quickly enough on our test data, and we can now confirm that both tables are synced up:

这可以在我们的测试数据上足够快地运行,现在我们可以确认两个表都已同步:

Identity range example

Once the tables are identical, the last step is a one-time release to swap our tables:

一旦表相同,最后一步是一次性发行以交换表:

EXEC sp_rename @objname = 'dbo.Identity_Test', @newname = 'Identity_Test_OLD';
EXEC sp_rename @objname = 'dbo.PK_Identity_Test', @newname = 'PK_Identity_Test_OLD', @objtype = 'OBJECT';
EXEC sp_rename @objname = 'dbo.Identity_Test_BIGINT', @newname = 'Identity_Test';
EXEC sp_rename @objname = 'dbo.PK_My_Identity_BIGINT', @newname = 'PK_Identity_Test', @objtype = 'OBJECT';
DROP TRIGGER TR_Identity_Test_Bigint_Upgrade;

Because this is a DDL operation in which no data is being modified, it will execute very quickly. The now-old objects are renamed and the new table is given the original table name of Identity_Test. The old table can be dropped at a later date, whenever is convenient or when the release is deemed a success.

因为这是DDL操作,其中没有数据被修改,所以它将很快执行。 现在重命名的对象被重命名,新表被赋予Identity_Test的原始表名。 只要方便或在认为发行成功时,便可以在以后删除旧表。

As a result, the timing of this release can be made to occur in an extremely short window. Realistically, a very short outage for the database would be the safe way to deploy, but even without an outage, the interruption would be extremely brief. An application with only modest resilience would likely see little in the way of interruption.

结果,可以使释放的时机在极短的窗口内发生。 实际上,非常短的数据库中断是部署的安全方法,但是即使没有中断,中断也将非常短暂。 只有适度的弹性的应用程序可能不会受到干扰。

This process can easily be tweaked, as needed, in order to accommodate tables of different sizes or complexities, but the general approach applies regardless.

可以根据需要轻松调整此过程,以适应不同大小或复杂程度的表,但是无论如何,都适用通用方法。

结论 (Conclusion)

SQL Identity columns are useful ways to generate sequential values with minimal effort or intervention. Checking periodically to ensure that identities have sufficient room to grow can help avert disaster and allow the problem to be solved preemptively without the need for last-minute panic.

SQL Identity列是使用最少的精力或干预即可生成顺序值的有用方法。 定期检查以确保身份具有足够的成长空间可以帮助避免灾难,并可以在不需要紧急恐慌的情况下提前解决问题。

Once identified, increasing the size of an identity column isn’t difficult, and we can choose from a variety of solutions based on the size, usage, and availability required of the table. Like many maintenance tasks, being able to perform this work on our terms saves immense time, effort, and stress. This not only makes us happier, but frees up more time so we can work on more important tasks!

一旦确定,增加身份列的大小并不困难,并且我们可以根据表的大小,用途和可用性从多种解决方案中进行选择。 与许多维护任务一样,能够按照我们的条件执行这项工作可以节省大量时间,精力和压力。 这不仅使我们更快乐,而且腾出了更多时间,以便我们可以从事更重要的任务!

参考资料 (References)

翻译自: https://www.sqlshack.com/solve-identity-crisis-sql-server/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值