SQL Server中的递归CTE和外键引用

介绍 (Introduction)

Foreign key constraints are a powerful mechanism for preserving referential integrity in a database. They can also represent a challenge when doing bulk table loads, since you need to find a “base” table to start with – that is, a table that has no foreign key constraints defined. Let’s label tables like this as level 0, or ground level if you like. Once that is loaded, you can begin to load other tables that have foreign key references to the base table. We can label those tables level 1, and so on. If you start with table data that already has referentially integrity and load tables by their level numbers — level 0, level 1, level 2 and so on – the load should proceed without problems. Let’s look at a simple example:

外键约束是在数据库中保留引用完整性的强大机制。 在进行批量表加载时,它们也可能构成挑战,因为您需要找到一个以“ base”表开头的表,即没有定义外键约束的表。 让我们将像这样的表标记为0级,或者将其标记为地面。 加载后,您可以开始加载对基表具有外键引用的其他表。 我们可以将这些表标记为1级,依此类推。 如果从已经具有参照完整性的表数据开始,并按其级别编号(级别0,级别1,级别2等等)加载表,则加载应该没有问题。 让我们看一个简单的例子:

CREATE TABLE base(
    id int IDENTITY(1,1) PRIMARY KEY, 
    b float)
INSERT INTO base(b) VALUES
(42), (3.14159), (2010401)
 
CREATE TABLE facts(
    id int IDENTITY(11,1) PRIMARY KEY, 
    base_id int FOREIGN KEY REFERENCES base(id), 
    c varchar(50))
INSERT INTO facts(base_id, c) VALUES
    (1, 'The Answer'), 
    (2, 'pi'), 
    (3, 'April Fools Day 2018')
 
CREATE TABLE morefacts(
    id int IDENTITY(21,1) PRIMARY KEY, 
    facts_id int FOREIGN KEY REFERENCES facts(id), 
    d varchar(50))
INSERT INTO morefacts(facts_id, d) VALUES
    (11, 'to the question'), 
    (12, 'transcendental number'), 
    (13, 'the jokes on you!')

This set of three tables are at levels 0, 1, and 2, respectively, since “base” has no FK references, “facts” refers to “base” and “morefacts” has an FK referring to “facts”. Now, imagine that you have new data for all three tables, in the form of INSERT statements:

这三个表的集合分别处于级别0、1和2,因为“基本”没有FK引用,所以“事实”是指“基本”,而“更多事实”的FK则是指“事实”。 现在,假设您以INSERT语句的形式拥有所有三个表的新数据:

INSERT INTO morefacts(facts_id, d) VALUES
    (14, 'golden ratio'),
    (15, 'limit of (1 + 1/n)^n')
 
INSERT INTO facts(base_id, c) VALUES
    (4, 'phi'),
    (5, 'Euler''s number')
 
INSERT INTO base(b) VALUES
(1.618), (2.718)

Now, you know that you can’t insert them that way, or you’ll get an error message like this one:

现在,您知道您不能以这种方式插入它们,否则将收到类似以下的错误消息:

The INSERT statement conflicted with the FOREIGN KEY constraint “FK__morefacts__facts”

INSERT语句与FOREIGN KEY约束“ FK__morefacts__facts”相冲突

You need to do these in reverse order to preserve referential integrity. This is easy with this little example since we are in total control. Now, imagine that you were asked to load up a database with lots of foreign key relationships, but you didn’t know the levels of any of the tables. How would you proceed? There are a few different ways to tackle the problem and in this article I’m going to leverage the power of recursive Common Table Expressions, or CTEs, to do it.

您需要按照相反的顺序进行操作,以保持参照完整性。 通过这个小例子,这很容易,因为我们完全可以控制。 现在,假设您被要求加载具有许多外键关系的数据库,但是您不知道任何表的级别。 您将如何进行? 有几种不同的方法可以解决该问题,在本文中,我将利用递归通用表表达式(CTE)的功能来解决此问题。

系统目录视图sys.foreign_keys (The System Catalog View sys.foreign_keys)

SQL Server now provides almost 300 system catalog views that are useful for all sorts of metadata operations. If I include the dynamic management views the total is almost 500!. sys.foreign_keys, as the name implies, shows you the foreign key relationships for a database. I can combine this view with a recursive CTE to dig out the foreign key relationships in the database. A full discussion of recursive CTEs is outside the scope of this article. See the references section for more detail on how they work. For now, just keep in mind that a recursive CTE has two parts, just like a mathematical recurrence:

SQL Server现在提供将近300种系统目录视图,这些视图对于各种元数据操作很有用。 如果包括动态管理视图,则总数将近500!。 顾名思义,sys.foreign_keys向您显示数据库的外键关系。 我可以将此视图与递归CTE结合使用,以挖掘数据库中的外键关系。 递归CTE的完整讨论不在本文讨论范围之内。 有关它们如何工作的更多详细信息,请参见参考资料部分。 现在,请记住,递归CTE有两个部分,就像数学递归一样:

  1. A base case

    基本案例
  2. A recursive case that builds on the base case

    基于基本案例的递归案例

For our example, a query to get the base case would look like this:

对于我们的示例,查询以获取基本情况如下所示:

SELECT DISTINCT
      fk.object_id 		   	AS FK,
      fk.schema_id 		   	AS SchemaId,
      fk.parent_object_id     	AS TableId,
      t.schema_id 		  	AS ReferencedSchema,
      fk.referenced_object_id 	AS ReferencedTable
 
     FROM sys.foreign_keys AS fk	 
     JOIN sys.tables AS t ON fk.referenced_object_id = t.object_id
     WHERE fk.type = 'F'

Run this on any database you have access to and observe the results. On my test database I see:

在您有权访问的任何数据库上运行此命令,并观察结果。 在我的测试数据库中,我看到:

The recursive case builds on this by finding tables referenced by the base case:

递归案例基于此通过查找基本案例引用的表建立:

SELECT 
          fk.object_id,
	   fk.schema_id,
	   fk.parent_object_id,
	   t.schema_id,
	   fk.referenced_object_id
 
     FROM sys.foreign_keys fk	 
     JOIN sys.tables t ON fk.referenced_object_id = t.object_id
	JOIN base_case ON fk.parent_object_id = base_case.referenced_object_id
     WHERE fk.type = 'F'

This query is almost the same as the one above except that it joins with the base case, matching the parent object id, which is the table containing the FK reference. To the base case referenced object id. In this way we can get the tables referring to the base case tables and continue until there are no more, since this is recursive!

该查询与上面的查询几乎相同,除了它与基本情况结合在一起,匹配父对象ID(包含FK引用的表)。 以基本情况引用的对象ID。 这样,我们可以获取引用基本案例表的表,并继续进行直到没有更多为止,因为这是递归的!

Putting the two queries – the base case and the recursive case – together in a recursive CTE yields this query:

将两个查询(基本案例和递归案例)放到递归CTE中会产生以下查询:

WITH cte AS
    (SELECT DISTINCT
          fk.object_id,
	   fk.schema_id,
	   fk.parent_object_id,
	   t.schema_id AS referenced_schema_id,
	   fk.referenced_object_id,
	   0 AS Depth
     FROM sys.foreign_keys AS fk	 
     JOIN sys.tables AS t ON fk.referenced_object_id = t.object_id
     WHERE fk.type = 'F' 
	--AND fk.parent_object_id = OBJECT_ID(N'morefacts', N'U')
 
	UNION ALL
 
	SELECT 
          fk.object_id,
	   fk.schema_id,
	   fk.parent_object_id,
	   t.schema_id,
	   fk.referenced_object_id,
	   cte.Depth - 1
     FROM sys.foreign_keys AS fk	 
     JOIN sys.tables AS t ON fk.referenced_object_id = t.object_id
	JOIN CTE ON fk.parent_object_id = cte.referenced_object_id
     WHERE fk.type = 'F' 
	--AND fk.parent_object_id <> cte.referenced_object_id
	)
 
SELECT 
    OBJECT_NAME(cte.object_id) AS ReferringKey,
    SCHEMA_NAME(cte.schema_id) AS ReferringSchema,
    OBJECT_NAME(cte.parent_object_id) as ReferringTable,
    SCHEMA_NAME(cte.referenced_schema_id) AS ReferencedSchema,
    OBJECT_NAME(cte.referenced_object_id) as ReferencedTable,
    cte.referenced_object_id,
    cte.Depth
 
FROM cte
ORDER BY Depth DESC,ReferencedSchema, ReferencedTable, ReferringKey;

There are two commented lines that I’ll come back to in a moment. Running this on my test database I get:

我将在稍后再谈两句话。 在我的测试数据库上运行它,我得到:

Here, the column “Depth” represents the level a table is with respect to one referring to it. In this case, the table “morefacts” refers to the table “facts”. So “morefacts” is at ground level (Depth=0) and “facts” is in the first basement (Depth = -1). With such a report I know I need to load the deepest levels first, then those above them and so on until I reach ground level.

在此,“深度”列表示表格相对于引用该表格的级别。 在这种情况下,表“更多事实”是指表“事实”。 因此,“更多事实”在地面上(深度= 0),“事实”在第一层地下室(深度= -1)。 有了这样的报告,我知道我需要首先加载最深的层次,然后再加载它们之上的层次,依此类推,直到达到地面层次。

Now, let’s look at the two commented lines. The first one, if uncommented, lets me just look at the references from a specific table:

现在,让我们看一下两条注释行。 第一个,如果没有注释,让我只看特定表中的引用:

AND fk.parent_object_id = OBJECT_ID(N'morefacts', N'U')

Running that produces a smaller report:

运行将生成较小的报告:

No surprise there. The second commented line is trickier. On my SQL Server instance I also have the sample database WideWorldImporters installed. Let’s try the query with both lines commented on that database. I get an error:

毫不奇怪。 第二条评论线比较棘手。 在我SQL Server实例上,我还安装了示例数据库WideWorldImporters。 让我们尝试在该数据库中用两行注释的查询。 我收到一个错误:

Msg 530, Level 16, State 1, Line 1

消息530,第16级,状态1,第1行

The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

声明终止。 在语句完成之前,最大递归100已用尽。

The problem is that this database contains a table that is self-referential:

问题在于此数据库包含一个自引用表:

This comes from the fact that the People table contains a hierarchy. Hierarchies can be used to show people in a reporting structure, where an employee points to their manager in the same table. Here it actually looks like a mistake! The foreign key says, “Be sure that a person in the table really is also in the table”. Of course that is always true! For our discussion though, it means that we are chasing our own tail during the recursive part of the query. If I uncomment the second comment:

这是由于People表包含一个层次结构。 层次结构可用于在报表结构中显示人员,其中员工在同一表中指向其经理。 在这里,实际上看起来像是一个错误! 外键说:“确保桌子上的人确实也在桌子上”。 当然总是如此! 但是,对于我们的讨论,这意味着我们在查询的递归部分中追逐自己的尾巴。 如果我不评论第二条评论:

AND fk.parent_object_id <> cte.referenced_object_id

The problem will disappear:

问题将消失:

I just pasted part of the output. It is actually quite a bit longer, which you can verify for yourself.

我只是粘贴了部分输出。 实际上要更长一些,您可以自己验证。

自下而上? (Bottom up?)

The query we’ve been using takes a top-down approach. The problem we had with the People table suggests another approach. Can we find tables that refer to it? We can! We’ll use a bottom up approach. Actually the query changes very little:

我们一直在使用的查询采用自顶向下的方法。 我们在“人员”表中遇到的问题提出了另一种方法。 我们可以找到引用它的表吗? 我们可以! 我们将使用自下而上的方法。 实际上查询的变化很小:

WITH cte AS
    (SELECT DISTINCT
          fk.object_id,
	   fk.schema_id,
	   fk.parent_object_id,
	   t.schema_id AS referenced_schema_id,
	   fk.referenced_object_id,
	   1 AS Level
     FROM sys.foreign_keys AS fk	 
     JOIN sys.tables AS t ON fk.referenced_object_id = t.object_id
     WHERE fk.type = 'F' 
	AND fk.referenced_object_id = OBJECT_ID(N'Application.People', N'U')
 
	UNION ALL
 
	SELECT 
          fk.object_id,
	   fk.schema_id,
	   fk.parent_object_id,
	   t.schema_id,
	   fk.referenced_object_id,
	   cte.Level + 1
     FROM sys.foreign_keys AS fk	 
     JOIN sys.tables AS t ON fk.referenced_object_id = t.object_id
	JOIN CTE ON fk.referenced_object_id = cte.parent_object_id
     WHERE fk.type = 'F' 
	AND fk.parent_object_id <> cte.referenced_object_id
	)
 
SELECT DISTINCT
    OBJECT_NAME(cte.object_id) AS ReferringKey,
    SCHEMA_NAME(cte.schema_id) AS ReferringSchema,
    OBJECT_NAME(cte.parent_object_id) as ReferringTable,
    SCHEMA_NAME(cte.referenced_schema_id) AS ReferencedSchema,
    OBJECT_NAME(cte.referenced_object_id) as ReferencedTable,
    Level
 
FROM cte
ORDER BY Level, ReferencedSchema, ReferencedTable, ReferringKey;
RETURN;

I’ve highlighted the changes. Basically, we start with tables that refer to Application. People and work up from there. This query yields the desired result for the WideWorldImporters database, though they are too big to post here (325 lines). The Level goes all the way up to 10, indicating a little of the complexity of the data model used here.

我强调了这些变化。 基本上,我们从引用应用程序的表开始。 人们并从那里开始工作。 该查询为WideWorldImporters数据库产生了所需的结果,尽管它们太大而无法在此处发布(325行)。 级别一直上升到10,表明这里使用的数据模型的复杂性很小。

摘要 (Summary)

This brief excursion into recursive CTEs applied to system views shows how easy it can be to tease out the relationships between objects in a SQL Server database. If you’re new to common table expressions, especially the recursive variant, this gives you a simple example to understand how they work. Don’t use them for everything, however! Sometimes developers are tempted to use recursive CTEs in place of cursors or while loops, thinking that there will be some performance advantage. Usually, those hopes are dashed! Internally, recursive CTEs are processed “Row By Agonizing Row”, or RBAR, a term created by Jeff Moden, a veritable super-DBA in the Microsoft SQL Server space.

对应用于系统视图的递归CTE的简短介绍表明,弄清楚SQL Server数据库中对象之间的关系是多么容易。 如果您不熟悉通用表表达式,尤其是递归变量,那么这为您提供了一个简单的示例来了解它们如何工作。 但是,不要将它们用于所有内容! 有时,开发人员倾向于使用递归CTE代替游标或while循环,以为这​​会带来一些性能优势。 通常,那些希望破灭了! 在内部,递归CTE是通过“行化行”或RBAR来处理的,RBAR是由Jeff Moden(Microsoft SQL Server空间中的名副其实的超级DBA)创建的。

If you’re new to system catalog views, let this serve as the briefest of introductions to a large topic!

如果您是系统目录视图的新手,请使用它作为对大主题的简介中最简短的!

翻译自: https://www.sqlshack.com/recursive-ctes-and-foreign-key-references-in-sql-server/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值