filestream_SQL Server FILESTREAM内部概述

filestream

In the article FILESTREAM in SQL Server, we provided a SQL Server FILESTREAM overview with a focus on internal functionality. In this article, we will cover various additional aspects of the FILESTREAM feature.

SQL Server中的FILESTREAM文章 ,我们提供了一个SQL Server FILESTREAM概述,重点是内部功能。 在本文中,我们将介绍FILESTREAM功能的各个其他方面。

Before we move further, let us check the data in the FILESTREAM table and the files in the file system container.

在继续之前,让我们检查FILESTREAM表中的数据和文件系统容器中的文件。

Below are the files in the container. We have two files present in the file system container. It also matches the metadata present in the table.

以下是容器中的文件。 文件系统容器中存在两个文件。 它还与表中存在的元数据匹配。

In the article, Managing data with SQL Server FILESTREAM tables, we wrote about how to update FILESTREAM objects.

在“ 使用SQL Server FILESTREAM表管理数据 ”一文中,我们写了关于如何更新FILESTREAM对象的信息。

Let us perform two updates for the existing records. In the first update, we are going to replace the file with 5 KB image file.

让我们对现有记录执行两次更新。 在第一个更新中,我们将用5 KB图像文件替换该文件。

UPDATE DemoFileStreamTable_1
SET [File] = (SELECT *
FROM OPENROWSET(
BULK 'C:\sqlshack\logo.png',
SINGLE_BLOB) AS Document)
WHERE fileid = '8D114AD3-12AA-4064-A1D8-3E8749712D5D'
 
Update DemoFileStreamTable_1
set filename='Logo files'
WHERE fileid = '8D114AD3-12AA-4064-A1D8-3E8749712D5D'
GO

It took 155 ms elapsed time and 4 lob logical reads to complete the update.

经过155 ms的时间和4个lob逻辑读取才能完成更新。

Now in another update, we are going to replace the existing file with an ISO file of 1.53 GB size.

现在,在另一个更新中,我们将用1.53 GB大小的ISO文件替换现有文件。

SET STATISTICS TIME ON
SET STATISTICS IO ON
UPDATE DemoFileStreamTable_1
SET [File] = (SELECT *
FROM OPENROWSET(
BULK 'C:\sqlshack\ubuntu-16.04.5-desktop-amd64.iso',
SINGLE_BLOB) AS Document)
WHERE fileid = '60236384-AC5B-45D1-97F8-4C05D90784F8'
 
Update DemoFileStreamTable_1
set filename='Installation files'
WHERE fileid = '60236384-AC5B-45D1-97F8-4C05D90784F8'
GO

This time update took 1 minute 19 seconds to complete along with 2248710 lob logical reads and 1,386,960 lob read-ahead reads.

这次更新完成了1分19秒,完成了2248710个lob逻辑读取和1,386,960个lob预读读取。

When we perform an update for the files in the SQL Server FILESTREAM, it copies the entire file from the source location to the file stream container. In the second update, we replaced the previous file with 1.5 GB ISO file, therefore; it took time and system resources to copy the file in the FILESTREAM container.

当我们对SQL Server FILESTREAM中的文件执行更新时,它将整个文件从源位置复制到文件流容器。 因此,在第二个更新中,我们用1.5 GB ISO文件替换了先前的文件。 将文件复制到FILESTREAM容器中需要花费时间和系统资源。

Now go to the FILESTREAM container and view the files there. We can see four files here. We did not insert any new files here however, it contains multiple files.

现在转到FILESTREAM容器并在其中查看文件。 我们可以在这里看到四个文件。 我们没有在这里插入任何新文件,但是它包含多个文件。

It contains the old files as well. In the article Managing data with SQL Server FILESTREAM tables, we provided an overview of the garbage collection process and how it works with the transaction log backup and CheckPoint. SQL Server maintains an internal table filestream_tombstone to track this garbage collection process. We can view the internal files using the dedicated administrator connection (DAC). DAC allows connecting to SQL Server even if no one can connect to it due to resources issues. We can enable the DAC connection using the below command.

它也包含旧文件。 在“ 使用SQL Server FILESTREAM表管理数据 ”一文中,我们概述了垃圾收集过程以及该过程如何与事务日志备份和CheckPoint一起使用。 SQL Server维护一个内部表filestream_tombstone来跟踪此垃圾回收过程。 我们可以使用专用管理员连接(DAC)查看内部文件。 DAC允许连接到SQL Server,即使由于资源问题没有人可以连接到它。 我们可以使用以下命令启用DAC连接。

Use master
GO
sp_configure 'remote admin connections', 1 
GO
RECONFIGURE with OVERRIDE
GO

You can refer to article SQL Server Dedicated Admin Connection (DAC) – how to enable, connect and use for more information about DAC.

您可以参考文章SQL Server专用管理员连接(DAC)–如何启用,连接和使用有关DAC的更多信息。

Now connect to the SQL Server using DAC using the connection string as ‘ADMIN: Instance’. Please note SQL Browser service should be running to use DAC connection.

现在,使用连接字符串作为“ ADMIN:实例”使用DAC连接到SQL Server。 请注意,SQL Browser服务应该正在运行以使用DAC连接。

Once connected, execute the below command to check the internal tables for SQL Server FILESTREAM garbage collection.

连接后,执行以下命令以检查内部表中是否存在SQL Server FILESTREAM垃圾回收。

use FileStreamDemoDB_test
go
select * from sys.internal_tables where name like 'filestream_tomb%'

In the above image, you get the internal table ‘ filestream_tombstone_2073058421’.

在上图中,您获得了内部表'filestream_tombstone_2073058421'。

Now view the content of this file. We need to use the ‘sys’ schema to view this table content using the select command.

现在查看此文件的内容。 我们需要使用“ sys”模式来通过select命令查看该表的内容。

use FileStreamDemoDB_test
go
select * from sys.filestream_tombstone_2073058421

In this table sys.filestream_tombstone_2073058421 , you can get the old file names in the column filestream_value_name. Now we can compare the records in this table and files in the file stream container.

在此表sys.filestream_tombstone_2073058421中 ,您可以在列filestream_value_name中获取旧文件名。 现在,我们可以比较该表中的记录和文件流容器中的文件。

In the below image, you can notice that both the old files (before the update) are present in this internal table.

在下图中,您可以注意到此内部表中同时存在两个旧文件(更新之前)。

These tables get processed when the garbage collection process runs however, it does not clear the files until these changes are not backed up.

这些表在运行垃圾收集过程时得到处理,但是,只有在不备份这些更改之前,它才会清除文件。

Lets us now run the transaction log backup.

现在让我们运行事务日志备份。

Once the transaction log backup is completed, check the content in the table sys.filestream_tombstone_2073058421 and as shown below 0 records found in the table.

事务日志备份完成后,检查表sys.filestream_tombstone_2073058421中的内容,如下表所示,找到0条记录。

In the SQL Server FILESTREAM container also, we can see the updated files (after update) only. It is how the internal garbage collection works.

同样在SQL Server FILESTREAM容器中,我们只能看到更新后的文件(更新后)。 这就是内部垃圾收集的工作方式。

In the article, FILESTREAM in SQL Server, we wrote about the folder structure using the FILESTREAM container. In the below image you can see these folders and files.

SQL Server中的FILESTREAM文章 ,我们使用FILESTREAM容器介绍了文件夹结构。 在下图中,您可以看到这些文件夹和文件。

  1. Folder for each FILESTREAM table

    每个FILESTREAM表的文件夹
  2. FILESTREAM Folder to show the column in the FILESTREAM table

    FILESTREAM文件夹,用于显示FILESTREAM表中的列
  3. File in the FILESTREAM container

    FILESTREAM容器中的文件

As you see these folders and files have specific names in the form of GUID. You might be wondering how these names are derived for the folders. We will execute the below command in the DAC connection window. Please note that you can have only one DAC connection. Therefore, if the DAC connection is already connected, use that connection only to run this query. If you run the query in standard connection, you get the error message. This error message comes because you cannot access the internal tables without a DAC connection.

如您所见,这些文件夹和文件具有GUID形式的特定名称。 您可能想知道文件夹的这些名称是如何派生的。 我们将在DAC连接窗口中执行以下命令。 请注意,您只能有一个DAC连接。 因此,如果已经连接了DAC连接,请仅使用该连接来运行此查询。 如果在标准连接中运行查询,则会收到错误消息。 出现此错误消息的原因是,如果没有DAC连接,您将无法访问内部表。

Let us run the below query in DAC window,

让我们在DAC窗口中运行以下查询,

SELECT
SELECT
    [so].[name] AS [Table],
    [r].[rsguid] AS [Rowset GUID],
    [rs].[colguid] AS [Column GUID]
FROM sys.sysrowsets [r] CROSS APPLY sys.sysrscols [rs]
JOIN sys.partitions [pt]
    ON [rs].[rsid] = [pt].[partition_id]
JOIN sys.objects [so]
    ON [so].[object_id] = [pt].[object_id]
JOIN sys.syscolpars [sco]
    ON [sco].[colid] = [rs].[rscolid]
WHERE [rs].[colguid] IS NOT NULL
    AND [so].[object_id] = [sco].[id]
    AND [r].[rsguid] IS NOT NULL
    AND [r].[rowsetid] = [rs].[rsid];
GO
 
    

Let us compare the query output with the folder GUID.

让我们将查询输出与文件夹GUID进行比较。

In the output, you can notice that the top-level directory GUID takes reference from the Rowset GUID and column level folder name shows name from the Column GUID. It is an interesting observation to know the internal of the system behaviour.

在输出中,您会注意到顶级目录GUID从行集GUID获取了引用,列级文件夹的名称显示了列GUID的名称。 了解系统行为的内部是一个有趣的观察。

Now, we further want to dig into the internals and need to know the file name. As you know, FILESTREAM process copies the file from the source directory to the FILESTREAM container directory. Size of the source file and the container file is same, however; it does not maintain the file name inside the FILESTREAM container. There is a mechanism behind the file name. Let us explore this using the undocumented DBCC command, i.e. DBCC IND and DBCC Page. DBCC IND is used to identify the page that belongs to a table or index.

现在,我们进一步想深入了解内部结构,并且需要知道文件名。 如您所知,FILESTREAM进程将文件从源目录复制到FILESTREAM容器目录。 但是,源文件和容器文件的大小相同。 它不会在FILESTREAM容器内保留文件名。 文件名后面有一种机制。 让我们使用未记录的DBCC命令(即DBCC IND和DBCC Page)对此进行探讨。 DBCC IND用于标识属于表或索引的页面。

Execute the below command and pass the DB name and object name into the parameter. In this command, -1 displays all indexes for the object.

执行以下命令,并将数据库名称和对象名称传递给参数。 在此命令中,-1显示对象的所有索引。

DBCC IND('FileStreamDemoDB_test','DemoFileStreamTable_1',-1)

The output of this command is as below.

该命令的输出如下。

We can examine a particular page using the DBCC page command. We need to turn on trace flag 3604 before running the DBCC Page command. DBCC Page shows the content of the database pages. Run the command with parameter ‘3’ to give the information about the page header and per row interpretation as well.

我们可以使用DBCC page命令检查特定页面。 在运行DBCC页面命令之前,我们需要打开跟踪标志3604。 DBCC页面显示数据库页面的内容。 使用参数“ 3”运行命令以提供有关页眉和每行解释的信息。

Run the below query to examine the page number 288 information with level 3.

运行以下查询以检查级别3的页码288信息。

DBCC traceon(3604)
DBCC Page ('FileStreamDemoDB_test',1,288,3)

In the output, we need to look at the createLSN field.

在输出中,我们需要查看createLSN字段。

This CreateLSN value is the same as of the filename in the SQL Server FILESTREAM container file in the example image.

该CreateLSN值与示例图像中SQL Server FILESTREAM容器文件中的文件名相同。

Similarly, the other file name also matches with the createLSN value.

同样,其他文件名也与createLSN值匹配。

结论 (Conclusion)

In this article, we explored the internals of the SQL Server FILESTREAM processes including the garbage collector folder name and the file names. This gives a better understanding of the overall system process for the FILESTREAM if we know the internal threads of it. We will cover a few more aspects of this feature in future articles.

在本文中,我们探索了SQL Server FILESTREAM进程的内部,包括垃圾收集器文件夹名称和文件名。 如果我们知道FILESTREAM的内部线程,则可以更好地理解整个系统过程。 我们将在以后的文章中介绍此功能的更多方面。

目录 (Table of contents)

FILESTREAM in SQL Server
Managing data with SQL Server FILESTREAM tables
SQL Server FILESTREAM Database backup overview
Restoring a SQL Server FILESTREAM enabled database
SQL Server FILESTREAM database recovery scenarios
Working with SQL Server FILESTREAM – Adding columns and moving databases
SQL Server FILESTREAM internals overview
Importing SQL Server FILESTREAM data with SSIS packages
SQL Server FILESTREAM queries and Filegroups
Viewing SQL Server FILESTREAM data with SSRS
SQL Server FILESTREAM Database Corruption and Remediation
Export SQL Server FILESTREAM Objects with PowerShell and SSIS
SQL FILESTREAM and SQL Server Full Text search
SQL Server FILESTREAM and Replication
SQL Server FILESTREAM with Change Data Capture
Transaction log backups in a SQL FILESTREAM database
SQL FILESTREAM Compatibility with Database Snapshot, Mirroring, TDE and Log Shipping
SQL Server FILETABLE – the next generation of SQL FILESTREAM
Managing Data in SQL Server FILETABLEs
SQL Server FILETABLE Use Cases
SQL Server中的文件流
使用SQL Server FILESTREAM表管理数据
SQL Server FILESTREAM数据库备份概述
还原启用了SQL Server FILESTREAM的数据库
SQL Server FILESTREAM数据库恢复方案
使用SQL Server FILESTREAM –添加列和移动数据库
SQL Server FILESTREAM内部概述
使用SSIS包导入SQL Server FILESTREAM数据
SQL Server FILESTREAM查询和文件组
使用SSRS查看SQL Server FILESTREAM数据
SQL Server FILESTREAM数据库损坏和修复
使用PowerShell和SSIS导出SQL Server FILESTREAM对象
SQL FILESTREAM和SQL Server全文搜索
SQL Server FILESTREAM和复制
具有更改数据捕获功能SQL Server FILESTREAM
SQL FILESTREAM数据库中的事务日志备份
SQL FILESTREAM与数据库快照,镜像,TDE和日志传送的兼容性
SQL Server FILETABLE –下一代SQL FILESTREAM
在SQL Server FILETABLEs中管理数据
SQL Server FILETABLE用例

翻译自: https://www.sqlshack.com/sql-server-filestream-internals-overview/

filestream

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值