如何在SQL Server中使用文件组以及如何在文件组之间迁移数据

最新推荐文章于 2023-12-03 21:48:53 发布

culuo4781

最新推荐文章于 2023-12-03 21:48:53 发布

阅读量936

点赞数 1

文章标签：数据库 python java mysql linux

原文链接：https://www.sqlshack.com/how-to-work-with-filegroups-in-sql-server-and-migrate-data-between-them/

版权

As you may already have figured out, the default settings in SQL Server are not always the best. Such is the case when you are working with new user databases; usually you get a single data (*.mdf) and transaction log (*.ldf) file. The data file resides within the PRIMARY file group; the only one we have so far, and it will store all of our databases objects, system objects, user tables, user stored procedures and all other objects. In some cases this default configuration may be good enough for you, but let us cover why would you prefer a different configuration in your production environment.

您可能已经知道，SQL Server中的默认设置并不总是最好的。使用新的用户数据库时就是这种情况。通常您会得到一个数据（* .mdf）和事务日志（* .ldf）文件。数据文件位于PRIMARY文件组内；到目前为止，我们仅有的一个，它将存储我们所有的数据库对象，系统对象，用户表，用户存储过程以及所有其他对象。在某些情况下，这种默认配置可能对您已经足够了，但让我们介绍一下为什么您会在生产环境中选择其他配置。

Working with multiple files and filegroups allows you to achieve higher throughput and to be more flexible when designing your environment.

使用多个文件和文件组可以使您获得更高的吞吐量，并在设计环境时更加灵活。

An added benefit is that additional system PFS, GAM and SGAM pages can be latched within the memory avoiding contention,although most commonly it affects the TempDB system database. In addition the possibility exists to spread your data files between numerous disk drives achieving higher IO operations.

另一个好处是，可以在内存中锁存其他系统PFS，GAM和SGAM页，从而避免争用，尽管最常见的情况是它会影响TempDB系统数据库。此外，还可能在众多磁盘驱动器之间传播数据文件，从而实现更高的IO操作。

Utilizing the approach for using multiple FILEGROUPs, you have the possibility to split your data between multiple files within the same Filegroup, even when using table partitioning. In addition to that, you can use the additional filegroups to optimize your database throughput and achieve more flexible backup and restore process. The DBCC CHECKDB command in latest versions also allows you to perform checks on Filegroup level instead of only on the individual user database.

利用使用多个FILEGROUP的方法，即使使用表分区，也可以在同一文件组内的多个文件之间拆分数据。除此之外，您可以使用其他文件组来优化数据库吞吐量并实现更灵活的备份和还原过程。最新版本的DBCC CHECKDB命令还允许您在文件组级别执行检查，而不仅仅是在单个用户数据库上执行检查。

Upon creating a new database the SQL Server usually leaves you with a single Filegroup,PRIMARY, and two files, a single data file and a log file.

创建新数据库后，SQL Server通常只为您提供一个Filegroup，PRIMARY和两个文件，即一个数据文件和一个日志文件。

 
CREATE DATABASE [February]
 CONTAINMENT = NONE
 ON  PRIMARY 
( NAME = N'February', FILENAME = N'C:\MF\February.mdf' , SIZE = 8192KB , FILEGROWTH = 65536KB )
 LOG ON 
( NAME = N'February_log', FILENAME = N'C:\MF\February_log.ldf' , SIZE = 8192KB , FILEGROWTH = 65536KB )

Let us create an additional Filegroup and then place a new data file in it:

让我们创建一个附加的文件组，然后在其中放置一个新的数据文件：

 
USE [master]
GO
ALTER DATABASE [February] ADD FILEGROUP [MAIN_FILEGROUP]
GO
ALTER DATABASE [February] ADD FILE ( NAME = N'February_Data', FILENAME = N'C:\MF\February_Data.ndf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [MAIN_FILEGROUP]
GO

At the moment any new user object we create will go to the currently default Filegroup;PRIMARY. However, as we have already mentioned we would like to leave the PRIMARY filegroup only for the default system objects and all new user objects to be created within our ‘MAIN_FILEGROUP’ for example. To change this behavior we need to set the filegroup ‘MAIN_FILEGROUP’ as default. To do so we modify the filegroup as follows:

目前，我们创建的任何新用户对象都将转到当前默认的Filegroup; PRIMARY。但是，正如我们已经提到的，我们只想保留PRIMARY文件组，以用于默认的系统对象以及所有要在“ MAIN_FILEGROUP”中创建的新用户对象。要更改此行为，我们需要将文件组“ MAIN_FILEGROUP”设置为默认值。为此，我们如下修改文件组：

 
USE [February]
GO
IF NOT EXISTS (SELECT name FROM sys.filegroups WHERE is_default=1 AND name = N'MAIN_FILEGROUP') ALTER DATABASE [February] MODIFY FILEGROUP [MAIN_FILEGROUP] DEFAULT
GO

The new table ‘UserTable’ we’ve created is automatically created within the ‘MAIN_FILEGROUP’:

我们创建的新表'UserTable'是在'MAIN_FILEGROUP'中自动创建的：

If within our user database we have more than one filegroup and we need to create a user object in a filegroup that is not the default one, we must specify this upon the creation of the object.

如果在我们的用户数据库中我们有多个文件组，并且需要在不是默认文件组的文件组中创建一个用户对象，则必须在创建对象时指定该对象组。

The following user table will be created within filegroup ARCH_FILEGROUP by stating this with the ‘ON’ statement.

通过在文件组ARCH_FILEGROUP中使用“ ON”语句进行说明，将创建以下用户表。

 
USE [February]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ArchiveTable](
	[Column1] [nchar](10) NULL
) ON [ARCH_FILEGROUP]
GO

Not all user databases that we have to work with come configured the best way possible, usually as applications become more complex and databases become bigger we need to tune our environment. One of the options we have is distributing the data between different filegroups and files. However we would need to migrate the user objects from one place to another.

并非所有需要使用的用户数据库都以最佳方式配置，通常是随着应用程序变得越来越复杂以及数据库越来越大，我们需要调整环境。我们拥有的一种选择是在不同的文件组和文件之间分配数据。但是，我们需要将用户对象从一个地方迁移到另一个地方。

The problem of migrating a table from one filegroup to another can be solved in two different ways.

将表从一个文件组迁移到另一个文件组的问题可以通过两种不同的方式解决。

The first and the not-so-pretty solution is simply to create a new table in the target filegroup, to migrate all data from the old table to the new table and then to switch their names.

第一个也不是那么漂亮的解决方案只是在目标文件组中创建一个新表，将所有数据从旧表迁移到新表，然后切换它们的名称。

A more elegant solution is to use the possibilities of the CREATE INDEX T-SQL Command. Note that ALTER INDEX command cannot be used to repartition an index or move it to a different filegroup.

一个更优雅的解决方案是使用CREATE INDEX T-SQL命令的可能性。请注意，ALTER INDEX命令不能用于对索引重新分区或将其移动到其他文件组。

 
USE [February]
GO
DROP INDEX [CL_IX_U_Column1] ON [dbo].[ArchiveTable] WITH (ONLINE = OFF)
GO
 
CREATE UNIQUE CLUSTERED INDEX [CL_IX_U_Column1] ON [dbo].[ArchiveTable]
(	[Column1] ASC
) ON [MAIN_FILEGROUP]
GO

The action above can be achieved by utilizing both clustered and non-clustered indexes. However when working with heaps we would need to temporary create an index only for the purpose of migrating the table.

可以通过利用聚簇索引和非聚簇索引来实现上述操作。但是，在使用堆时，我们仅需要出于迁移表的目的临时创建索引。

Here we are creating a new index on a heap table which purpose is only to point the user object to a new filegroup. Once this index is created we can immediately drop it being confident that dbo.heap_table will now reside in the ‘MAIN_FILEGROUP’.

在这里，我们在堆表上创建一个新索引，其目的仅仅是将用户对象指向一个新文件组。创建该索引后，我们可以立即删除它，并确信dbo.heap_table现在将位于“ MAIN_FILEGROUP”中。

 
USE [February]
GO
CREATE NONCLUSTERED INDEX [CL_IX_Column1] ON [dbo].[heap_table]
([column1] ASC
) ON [MAIN_FILEGROUP]
DROP INDEX [CL_IX_Column1] ON [dbo].[heap_table]
GO

In order to better design your database structure it is really important to understand how SQL Server works with multiple files within a filegroup.

为了更好地设计数据库结构，了解SQL Server如何处理文件组中的多个文件非常重要。

By default the SQL Server uses a proportional fill strategy across all of the files within each filegroups. When new data is added it is not just written to the first file up until it is full, but new data is spread depending on the free space you have within your files.

默认情况下，SQL Server对每个文件组中的所有文件使用比例填充策略。添加新数据后，它不仅会写到第一个文件中，直到充满为止，还会根据文件中的可用空间来散布新数据。

A very simple example is when you have two data files within the same filegroup; data_file_1.ndf and data_file_2.ndf. The first file has 100MB free space and the second has 200MB free. When new data is added, the SQL Server first checks the free space allocations within the two files. Then, based on the free space information it has it will allocate one extend in data_file_1.ndf and two extends in data_file_2.ndf and so on. By using this simple striping algorithm both files should be full usually at the same time.

一个非常简单的示例是当您在同一文件组中有两个数据文件时； data_file_1.ndf和data_file_2.ndf。第一个文件具有100MB的可用空间，第二个文件具有200MB的可用空间。添加新数据后，SQL Server首先检查两个文件中的可用空间分配。然后，根据可用空间信息，它将在data_file_1.ndf中分配一个扩展，在data_file_2.ndf中分配两个扩展，依此类推。通过使用这种简单的条带化算法，两个文件通常应该同时充满。

Sooner or later all files within a filegroup will be fully utilized and if automatic file growth is enabled the SQL server will expand them one file at a time in a round-robin manner. It will start expanding the first file and as soon it is full it will move the second file. It will expand it and write data to it, as soon it is full it will move to the next;data_file_3.ndf and so on, starting from the data_file_1.ndf again.

迟早将充分利用文件组中的所有文件，并且如果启用了自动文件增长功能，则SQL Server将以循环方式一次将它们扩展为一个文件。它将开始扩展第一个文件，并且一旦它满了，它将移动第二个文件。它将对其进行扩展并向其中写入数据，一旦充满，它将移至下一个； data_file_3.ndf，依此类推，再次从data_file_1.ndf开始。

In the SQL Server 2014 there was the possibility to use trace flag T1117, which changes the behavior of file growth. In a filegroup containing multiple files, if a single file needs to grow it forces all other files to grow as well. Although this works great for the system database TempDB in combination with trace flag T1118 it was not a common configuration due to being a server wide setting.

在SQL Server 2014中，可以使用跟踪标志T1117，它会更改文件增长的行为。在包含多个文件的文件组中，如果单个文件需要增长，它将强制所有其他文件也增长。尽管这对于将系统数据库TempDB与跟踪标志T1118结合使用非常有效，但由于是服务器范围的设置，因此它不是常用配置。

Starting with SQL Server 2016 the defaults for TempDB system database are as its using trace flags T1117 and T1118 without affecting other user databases – it is achieved by introducing new filegroup level settings.

从SQL Server 2016开始，TempDB系统数据库的默认设置为使用跟踪标志T1117和T1118，而不会影响其他用户数据库-这是通过引入新的文件组级别设置来实现的。

Within SQL Server 2016 an additional ALTER DATABASE setting at the FILEGROUP level has been introduced. The default one – ‘AUTOGROW_SINGLE_FILE’ works exactly as explained above – utilizing the round-robin algorithm. For filegroups having more than one file the new option ‘AUTOGROW_ALL_FILES’ can be enabled which forces an file expand to happen on all files within the filegroup.

在SQL Server 2016中，已引入FILEGROUP级别的其他ALTER DATABASE设置。默认值-'AUTOGROW_SINGLE_FILE'与循环算法完全一样，如上所述。对于具有多个文件的文件组，可以启用新选项“ AUTOGROW_ALL_FILES”，这将强制文件扩展发生在文件组中的所有文件上。

In order to enable it we would need to alter the database:

为了启用它，我们需要更改数据库：

 
ALTER DATABASE February MODIFY FILEGROUP MAIN_FILEGROUP AUTOGROW_ALL_FILES

DBCC CHECKFILEGROUP allows us to perform a CHECKDB operation only on a single filegroup, the following will check only the PRIMARY filegroup:

DBCC CHECKFILEGROUP允许我们仅在单个文件组上执行CHECKDB操作，以下将仅检查PRIMARY文件组：

 
DBCC CHECKFILEGROUP 
WITH NO_INFOMSGS 
GO

To perform a check on one of the additional file groups we would need information about the group ID:

要检查其他文件组之一，我们需要有关组ID的信息：

To perform check on the MAIN_FILEGROUP we need to specify this using the group ID – 2:

要对MAIN_FILEGROUP进行检查，我们需要使用组ID – 2进行指定：

 
DBCC CHECKFILEGROUP (2)
WITH NO_INFOMSGS 
GO

SQL Server databases with multiple filegroups can be restored in stages using piecemeal restore. The piecemeal restore works quite similar as the normal restore operation utilizing the three phases; data copy, redo and undo. You can perform a restore of one of your filegroups while your database and the other filegroups remain online.Please keep in mind that when recovering the PRIMARY filegroup your database should be offline.

具有多个文件组SQL Server数据库可以使用逐段还原来分阶段还原。零序还原的工作原理与使用这三个阶段的正常还原操作非常相似。数据复制，重做和撤消。您可以在数据库和其他文件组保持联机状态时执行其中一个文件组的还原。请记住，在恢复PRIMARY文件组时，数据库应处于脱机状态。