组织架构递归_映射架构和递归管理数据–第2部分

组织架构递归

介绍 (Introduction)

In Part 1 of this article, we built a case for the benefits of documenting our schema via the use of a stored procedure, as well as the ways this information can be used for DML operations on a relational database. Below, we will continue where we left off by speeding up our SQL queries and then putting everything together so that we can demo a usable script that will hopefully make your life easier!

在本文的第1部分中,我们建立了一个案例,其目的是通过使用存储过程来记录我们的架构,以及将该信息用于关系数据库上的DML操作的方式的好处。 下面,我们将通过加快SQL查询的速度,然后将所有内容放在一起,继续我们遗留的地方,以便我们可以演示一个有用的脚本,该脚本有望使您的生活更轻松!

优化 (Optimization)

One additional step that has not been addressed yet, but should be, is optimization.  As we collect row counts, we will undoubtedly find many relationships with row counts of zero.  For the sake of deleting data, these relationships may be removed from our foreign key data set.  But wait—there’s more!  Any relationship that is a child of a parent with zero rows can also be deleted.  This is an immense optimization step as we can slice off large swaths of data very quickly.  Even more importantly, each future iteration of our WHILE loop won’t have to touch that newly defined chunk of irrelevant relationship data.  In tests I conducted on large test databases, these additional steps reduced runtime by as much as 95%. 

优化尚未解决,但应该解决的另一步骤。 当我们收集行计数时,无疑会发现许多行计数为零的关系。 为了删除数据,可以从我们的外键数据集中删除这些关系。 但是,等等-还有更多! 也可以删除作为具有零行的父级的子级的任何关系。 这是一个巨大的优化步骤,因为我们可以非常快速地分割大量数据。 更重要的是,我们WHILE循环的每个未来迭代都不必触及不相关关系数据的新定义块。 在大型测试数据库上进行的测试中,这些额外的步骤将运行时间减少了多达95%。

The following change to our TSQL from above illustrates this optimization procedure:

从上面对我们的TSQL进行的以下更改说明了此优化过程:

 
        IF @row_count = 0
       BEGIN
              DELETE FKEYS
              FROM @foreign_keys FKEYS
              WHERE FKEYS.object_id_hierarchy_rank LIKE @object_id_hierarchy_rank + '%' -- Remove all paths that share the same root as this one.
              AND (FKEYS.object_id_hierarchy_rank <> @object_id_hierarchy_rank OR FKEYS.foreign_key_id = @foreign_key_id) -- Don't remove paths where there are multiple foreign keys from one table to another.
              AND FKEYS.referencing_column_name_rank LIKE @referencing_column_name_rank + '%' -- Don't remove paths that have identical table relationships, but that occur through different FK columns.
       END
       ELSE
       BEGIN
              UPDATE FKEYS
                     SET processed = 1,
                           row_count = @row_count,
                           join_condition_sql = @join_sql
              FROM @foreign_keys FKEYS
              WHERE FKEYS.foreign_key_id = @foreign_key_id;
       END
 

By adding in a row count check, we can choose one of two scenarios:

通过添加行计数检查,我们可以选择以下两种情况之一:

  1. There are no rows of data to process: delete this foreign key relationship and all that are children of it.

    没有要处理的数据行:删除此外键关系及其所有​​子级。
  2. There are rows of data for this relationship, update @foreign_keys with the necessary information.

    有这种关系的数据行,用必要的信息更新@foreign_keys。

The generation of the INNER JOIN data that is stored in #inner_join_tables is a bit complex, and is written as it is to ensure that we never join into the same relationship twice or incorrectly.  We want to the correct path from the current relationship back to the target table through a sequence of unique column relationships.  The additional WHERE clauses guard against a handful of important scenarios:

存储在#inner_join_tables中的INNER JOIN数据的生成有点复杂,并按原样编写以确保我们永远不会两次或错误地加入相同的关系。 我们希望通过一系列唯一的列关系来确定从当前关系到目标表的正确路径。 附加的WHERE子句可防止少数重要情况:

  1. If a relationship could theoretically loop back through a table we have already referenced, prevent further processing.  Otherwise, we would allow infinite loops to occur.

    如果理论上某个关系可以循环回我们已经引用的表,请阻止进一步的处理。 否则,我们将允许无限循环发生。
  2. Do not process the target table in the loop.  Handle it separately as it is a special case.

    不要在循环中处理目标表。 由于是特例,请分开处理。
  3. If multiple relationship paths exist between two tables, ensure that only the current path is traversed.  The @has_same_object_id_hierarchy variable checks for identical table paths and allows for extra logic to be included when this happens.

    如果两个表之间存在多个关系路径,请确保仅遍历当前路径。 @has_same_object_id_hierarchy变量检查相同的表路径,并在发生这种情况时允许包含额外的逻辑。

At this point, we can take all of the data gathered above and use it to generate a list of DELETE statements, in order, for the target table and WHERE clause provided in the stored procedure parameters.

至此,我们可以获取上面收集的所有数据,并使用它为存储过程参数中提供的目标表和WHERE子句按顺序生成一系列DELETE语句。

 
WHILE EXISTS (SELECT * FROM @foreign_keys WHERE processed = 0 AND level > 0 )
BEGIN
       SELECT @sql_command = '';
       SELECT @join_sql = '';
       SELECT @old_table_name = '';
       SELECT @old_schema_name = '';
 
       SELECT TOP 1
              @process_schema_name = referencing_schema_name,
              @process_table_name = referencing_table_name,
              @object_id_hierarchy_rank = object_id_hierarchy_rank,
              @row_count = row_count,
              @foreign_key_id = foreign_key_id
       FROM @foreign_keys
       WHERE processed = 0
       AND level > 0
       ORDER BY level DESC;
 
       SET @sql_command = '-- Maximum rows to be deleted: ' + CAST(@row_count AS VARCHAR(25)) + CHAR(10) +
              'DELETE [' + @process_table_name + ']' + CHAR(10) + 'FROM [' + @process_schema_name + '].[' + @process_table_name + ']' + CHAR(10);
 
       SELECT
              @join_sql = FKEYS.join_condition_sql
       FROM @foreign_keys FKEYS
       WHERE FKEYS.foreign_key_id = @foreign_key_id
 
       SELECT @sql_command = @sql_command +  @join_sql;
 
       IF @where_clause <> ''
       BEGIN
              SELECT @sql_command = @sql_command + 'WHERE (' + @where_clause + ')' + CHAR(10);
       END
 
       -- If rows exist to be deleted, then print those delete statements.
       PRINT @sql_command + 'GO' + CHAR(10);
 
       UPDATE @foreign_keys
              SET processed = 1
       WHERE foreign_key_id = @foreign_key_id
END
 
-- Delete data from the root table
SET @sql_command = '-- Rows to be deleted: ' + CAST(@base_table_row_count AS VARCHAR(25)) + CHAR(10) +
'DELETE FROM [' + @process_schema_name + '].[' + @table_name + ']';
 
IF @where_clause <> ''
BEGIN
       SELECT @sql_command = @sql_command + CHAR(10) + 'WHERE ' + @where_clause;
END
 
-- Print deletion statement for root table
PRINT @sql_command;
 

Comments are added into the PRINT statements with the maximum number of rows to be deleted.  This is based on join data and may be rendered inaccurate as execution occurs, as a row of data may belong to several relationships, and once deleted will not be available for deletion as part of any others.  These estimates are useful, though, in gauging the volume of data that each relationship represents.

注释将添加到PRINT语句中,并带有要删除的最大行数。 这是基于联接数据的,并且在执行时可能会变得不准确,因为一行数据可能属于多个关系,并且一旦删除将无法作为其他任何部分删除。 但是,这些估计值对于衡量每个关系表示的数据量很有用。

The bulk of the work in our stored procedure is done when the counts are calculated.  The deletion section is iterating through a similar loop and printing out the appropriate delete TSQL, in order, for each relationship that was previously defined and enumerated.

计算计数后,将完成存储过程中的大部分工作。 删除部分通过一个类似的循环进行迭代,并针对先前定义和枚举的每个关系依次打印出适当的删除TSQL。

There is one final task to manage, and that is self-referencing relationships.  If a table has a parent-child relationship with itself, we explicitly avoided it above.  How to properly handle these relationships, should they exist, is up to you.  Whether we cascade DELETE statements through the rest of our work or simply set the foreign key column to NULL would be based on the appropriate business logic.  In the following TSQL, we set any of these relationships to NULL that happen to directly relate to the target table.  We could also tie it into our big loop and traverse all relationships previously defined, but I have left this edge case out as it is not too common:

有最后一项任务要管理,那就是自引用关系。 如果表与其自身具有父子关系,则上面我们明确避免了它。 如果存在这些关系,如何正确处理取决于您。 我们是在其余工作中级联DELETE语句,还是将外键列设置为NULL都将基于适当的业务逻辑。 在下面的TSQL中,我们将这些关系中的任何一个都设置为NULL,而这些关系恰好与目标表直接相关。 我们也可以将其绑定到我们的大循环中并遍历先前定义的所有关系,但由于这种情况不太常见,因此我省略了这种情况:

 
WHILE EXISTS (SELECT * FROM @self_referencing_keys SRKEYS WHERE SRKEYS.processed = 0)
BEGIN
       -- Get next self-referencing relationship to process
       SELECT TOP 1
              @relationship_id = SRKEY.self_referencing_keys_id
       FROM @self_referencing_keys SRKEY
       WHERE processed = 0;
       -- Get row counts for the update statement
       SELECT
              @count_sql_command = 'SELECT COUNT(*)' + CHAR(10) +
              'FROM [' + SRKEY.referencing_schema_name + '].[' + SRKEY.referencing_table_name + ']' + CHAR(10) +
              'WHERE [' + SRKEY.referencing_column_name + '] IN' + CHAR(10) +
              '     (SELECT ' + SRKEY.primary_key_column_name + ' FROM [' + SRKEY.primary_key_schema_name + '].[' + SRKEY.primary_key_table_name + '])' + CHAR(10)
       FROM @self_referencing_keys SRKEY
       WHERE SRKEY.self_referencing_keys_id = @relationship_id;
 
       INSERT INTO @row_counts
              (row_count)
       EXEC (@count_sql_command)
       SELECT @row_count = row_count FROM @row_counts;
       IF @row_count > 0
       BEGIN
              SELECT
                     @sql_command =
                     '-- Rows to be updated: ' + CAST(@row_count AS VARCHAR(MAX)) + CHAR(10) +
                     'UPDATE [' + SRKEY.referencing_schema_name + '].[' + SRKEY.referencing_table_name + ']' + CHAR(10) +
                     '     SET ' + SRKEY.referencing_column_name + ' = NULL' + CHAR(10) +
                     'FROM [' + SRKEY.referencing_schema_name + '].[' + SRKEY.referencing_table_name + ']' + CHAR(10) +
                     'WHERE [' + SRKEY.referencing_column_name + '] IN' + CHAR(10) +
                     '     (SELECT ' + SRKEY.primary_key_column_name + ' FROM [' + SRKEY.primary_key_schema_name + '].[' + SRKEY.primary_key_table_name + ')' + CHAR(10)
              FROM @self_referencing_keys SRKEY
              WHERE SRKEY.self_referencing_keys_id = @relationship_id;
 
              -- Print self-referencing data modification statements
              PRINT @sql_command;
       END
       ELSE
       BEGIN
              -- Remove any rows for which we have no data.
              DELETE SRKEY
              FROM @self_referencing_keys SRKEY
              WHERE SRKEY.self_referencing_keys_id = @relationship_id;
       END
 
       UPDATE @self_referencing_keys
              SET processed = 1,
                     row_count = @row_count
       WHERE self_referencing_keys_id = @relationship_id;
 
       DELETE FROM @row_counts;
END
 

This logic is simplified, but at least identifies relationships where data exists, and provides some sample TSQL that could be used to clear them out in their entirety, if that is the best approach for your data.

此逻辑经过简化,但是至少可以识别数据存在的关系,并提供一些示例TSQL,如果可以的话,这些示例TSQL可以全部清除它们(如果这是处理数据的最佳方法)。

放在一起 (Putting Everything Together)

With all of our objectives completed, we can now piece together our stored procedure in its final form.  The relationship data is selected back to SSMS at the end as a reference, so you can easily view & save this data as needed:

完成所有目标后,我们现在可以将存储过程整理成最终形式。 最后将关系数据选择回SSMS作为参考,因此您可以根据需要轻松查看和保存此数据:

 
CREATE PROCEDURE dbo.atp_schema_mapping
       @schema_name SYSNAME,
       @table_name SYSNAME,
       @where_clause VARCHAR(MAX) = ''
AS
BEGIN
       SET NOCOUNT ON;
 
       DECLARE @sql_command VARCHAR(MAX) = ''; -- Used for many dynamic SQL statements
 
       SET @where_clause = ISNULL(LTRIM(RTRIM(@where_clause)), ''); -- Clean up WHERE clause, to simplify future SQL
 
       DECLARE @relationship_id INT; -- Will temporarily hold row ID for use in iterating through relationships
       DECLARE @count_sql_command VARCHAR(MAX) = ''; -- Used for dynamic SQL for count calculations
       DECLARE @row_count INT; -- Temporary holding place for relationship row count
       DECLARE @row_counts TABLE -- Temporary table to dump dynamic SQL output into
              (row_count INT);
 
       DECLARE @base_table_row_count INT; -- This will hold the row count of the base entity.
       SELECT @sql_command = 'SELECT COUNT(*) FROM [' + @schema_name + '].[' + @table_name + ']' + -- Build COUNT statement
              CASE
                     WHEN @where_clause <> '' -- Add WHERE clause, if provided
                           THEN CHAR(10) + 'WHERE ' + @where_clause
                     ELSE ''
              END;
 
       INSERT INTO @row_counts
              (row_count)
       EXEC (@sql_command);
      
       SELECT
              @base_table_row_count = row_count -- Extract count from temporary location.
       FROM @row_counts;
 
       -- If there are no matching rows to the input provided, exit immediately with an error message.
       IF @base_table_row_count = 0
       BEGIN
              PRINT '-- There are no rows to process based on the input table and where clause.  Execution aborted.';
              RETURN;
       END
 
       DELETE FROM @row_counts;
 
       -- This table will hold all foreign key relationships
       DECLARE @foreign_keys TABLE
       (   foreign_key_id INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
              referencing_object_id INT NULL,
              referencing_schema_name SYSNAME NULL,
              referencing_table_name SYSNAME NULL,
              referencing_column_name SYSNAME NULL,
              primary_key_object_id INT NULL,
              primary_key_schema_name SYSNAME NULL,
              primary_key_table_name SYSNAME NULL,
              primary_key_column_name SYSNAME NULL,
              level INT NULL,
              object_id_hierarchy_rank VARCHAR(MAX) NULL,
              referencing_column_name_rank VARCHAR(MAX) NULL,
              row_count INT DEFAULT 0 NOT NULL,
              processed BIT DEFAULT 0 NOT NULL,
              join_condition_sql VARCHAR(MAX) DEFAULT ''); -- Save this after we complete the count calculations so we don't have to do it again later.
 
       -- Table to exclusively store self-referencing foreign key data
       DECLARE @self_referencing_keys TABLE
       (      self_referencing_keys_id INT NOT NULL IDENTITY(1,1),
              referencing_primary_key_name SYSNAME NULL,
              referencing_schema_name SYSNAME NULL,
              referencing_table_name SYSNAME NULL,
              referencing_column_name SYSNAME NULL,
              primary_key_schema_name SYSNAME NULL,
              primary_key_table_name SYSNAME NULL,
              primary_key_column_name SYSNAME NULL,
              row_count INT DEFAULT 0 NOT NULL,
              processed BIT DEFAULT 0 NOT NULL);
      
       -- Insert all foreign key relational data into the table variable using a recursive CTE over system tables.
       WITH fkey     (referencing_object_id,
                            referencing_schema_name,
                            referencing_table_name,
                            referencing_column_name,
                            primary_key_object_id,
                            primary_key_schema_name,
                            primary_key_table_name,
                            primary_key_column_name,
                            level,
                            object_id_hierarchy_rank,
                            referencing_column_name_rank) AS
       (      SELECT
                     parent_table.object_id AS referencing_object_id,
                     parent_schema.name AS referencing_schema_name,
                     parent_table.name AS referencing_table_name,
                     CONVERT(SYSNAME, NULL) AS referencing_column_name,
                     CONVERT(INT, NULL) AS referenced_table_object_id,
                     CONVERT(SYSNAME, NULL) AS referenced_schema_name,
                     CONVERT(SYSNAME, NULL) AS referenced_table_name,
                     CONVERT(SYSNAME, NULL) AS referenced_key_column_name,
                     0 AS level,
                     CONVERT(VARCHAR(MAX), parent_table.object_id) AS object_id_hierarchy_rank,
                     CAST('' AS VARCHAR(MAX)) AS referencing_column_name_rank
                     FROM sys.objects parent_table
                     INNER JOIN sys.schemas parent_schema
                     ON parent_schema.schema_id = parent_table.schema_id
                     WHERE parent_table.name = @table_name
                     AND parent_schema.name = @schema_name
              UNION ALL
              SELECT
                     child_object.object_id AS referencing_object_id,
                     child_schema.name AS referencing_schema_name,
                     child_object.name AS referencing_table_name,
                     referencing_column.name AS referencing_column_name,
                     referenced_table.object_id AS referenced_table_object_id,
                     referenced_schema.name AS referenced_schema_name,
                     referenced_table.name AS referenced_table_name,
                     referenced_key_column.name AS referenced_key_column_name,
                     f.level + 1 AS level,
                     f.object_id_hierarchy_rank + '-' + CONVERT(VARCHAR(MAX), child_object.object_id) AS object_id_hierarchy_rank,
                     f.referencing_column_name_rank + '-' + CAST(referencing_column.name AS VARCHAR(MAX)) AS referencing_column_name_rank
              FROM sys.foreign_key_columns sfc
              INNER JOIN sys.objects child_object
              ON sfc.parent_object_id = child_object.object_id
              INNER JOIN sys.schemas child_schema
              ON child_schema.schema_id = child_object.schema_id
              INNER JOIN sys.columns referencing_column
              ON referencing_column.object_id = child_object.object_id
              AND referencing_column.column_id = sfc.parent_column_id
              INNER JOIN sys.objects referenced_table
              ON sfc.referenced_object_id = referenced_table.object_id
              INNER JOIN sys.schemas referenced_schema
              ON referenced_schema.schema_id = referenced_table.schema_id
              INNER JOIN sys.columns AS referenced_key_column
              ON referenced_key_column.object_id = referenced_table.object_id
              AND referenced_key_column.column_id = sfc.referenced_column_id
              INNER JOIN fkey f
              ON f.referencing_object_id = sfc.referenced_object_id
              WHERE ISNULL(f.primary_key_object_id, 0) <> f.referencing_object_id -- Exclude self-referencing keys
              AND f.object_id_hierarchy_rank NOT LIKE '%' + CAST(child_object.object_id AS VARCHAR(MAX)) + '%'
       )
       INSERT INTO @foreign_keys
       (      referencing_object_id,
              referencing_schema_name,
              referencing_table_name,
              referencing_column_name,
              primary_key_object_id,
              primary_key_schema_name,
              primary_key_table_name,
              primary_key_column_name,
              level,
              object_id_hierarchy_rank,
              referencing_column_name_rank)
       SELECT DISTINCT
              referencing_object_id,
              referencing_schema_name,
              referencing_table_name,
              referencing_column_name,
              primary_key_object_id,
              primary_key_schema_name,
              primary_key_table_name,
              primary_key_column_name,
              level,
              object_id_hierarchy_rank,
              referencing_column_name_rank
       FROM fkey;
 
       UPDATE FKEYS
              SET referencing_column_name_rank = SUBSTRING(referencing_column_name_rank, 2, LEN(referencing_column_name_rank)) -- Remove extra leading dash leftover from the top-level column, which has no referencing column relationship.
       FROM @foreign_keys FKEYS
 
       -- Insert all data for self-referencing keys into a separate table variable.
       INSERT INTO @self_referencing_keys
               ( referencing_primary_key_name,
                       referencing_schema_name,
                       referencing_table_name,
                 referencing_column_name,
                       primary_key_schema_name,
                 primary_key_table_name,
                 primary_key_column_name)
       SELECT
              (SELECT COL_NAME(SIC.OBJECT_ID, SIC.column_id)
               FROM sys.indexes SI INNER JOIN sys.index_columns SIC
               ON SIC.index_id = SI.index_id AND SIC.object_id = SI.object_id
               WHERE SI.is_primary_key = 1
               AND OBJECT_NAME(SIC.OBJECT_ID) = child_object.name) AS referencing_primary_key_name,
              child_schema.name AS referencing_schema_name,
              child_object.name AS referencing_table_name,
              referencing_column.name AS referencing_column_name,
              referenced_schema.name AS primary_key_schema_name,
              referenced_table.name AS primary_key_table_name,
              referenced_key_column.name AS primary_key_column_name
       FROM sys.foreign_key_columns sfc
       INNER JOIN sys.objects child_object
       ON sfc.parent_object_id = child_object.object_id
       INNER JOIN sys.schemas child_schema
       ON child_schema.schema_id = child_object.schema_id
       INNER JOIN sys.columns referencing_column
       ON referencing_column.object_id = child_object.object_id
       AND referencing_column.column_id = sfc.parent_column_id
       INNER JOIN sys.objects referenced_table
       ON sfc.referenced_object_id = referenced_table.object_id
       INNER JOIN sys.schemas referenced_schema
       ON referenced_schema.schema_id = referenced_table.schema_id
       INNER JOIN sys.columns AS referenced_key_column
       ON referenced_key_column.object_id = referenced_table.object_id
       AND referenced_key_column.column_id = sfc.referenced_column_id
       WHERE child_object.name = referenced_table.name
       AND child_object.name IN -- Only consider self-referencing relationships for tables somehow already referenced above, otherwise they are irrelevant.
              (SELECT referencing_table_name FROM @foreign_keys);
 
       -------------------------------------------------------------------------------------------------------------------------------
       -- Generate the Delete script for self-referencing data
       -------------------------------------------------------------------------------------------------------------------------------
 
       WHILE EXISTS (SELECT * FROM @self_referencing_keys SRKEYS WHERE SRKEYS.processed = 0)
       BEGIN
              -- Get next self-referencing relationship to process
              SELECT TOP 1
                     @relationship_id = SRKEY.self_referencing_keys_id
              FROM @self_referencing_keys SRKEY
              WHERE processed = 0;
              -- Get row counts for the update statement
              SELECT
                     @count_sql_command = 'SELECT COUNT(*)' + CHAR(10) +
                     'FROM [' + SRKEY.referencing_schema_name + '].[' + SRKEY.referencing_table_name + ']' + CHAR(10) +
                     'WHERE [' + SRKEY.referencing_column_name + '] IN' + CHAR(10) +
                     '     (SELECT ' + SRKEY.primary_key_column_name + ' FROM [' + SRKEY.primary_key_schema_name + '].[' + SRKEY.primary_key_table_name + '])' + CHAR(10)
              FROM @self_referencing_keys SRKEY
              WHERE SRKEY.self_referencing_keys_id = @relationship_id;
 
              INSERT INTO @row_counts
                     (row_count)
              EXEC (@count_sql_command)
              SELECT @row_count = row_count FROM @row_counts;
 
              IF @row_count > 0
              BEGIN
                     SELECT
                           @sql_command =
                           '-- Rows to be updated: ' + CAST(@row_count AS VARCHAR(MAX)) + CHAR(10) +
                           'UPDATE [' + SRKEY.referencing_schema_name + '].[' + SRKEY.referencing_table_name + ']' + CHAR(10) +
                           '     SET ' + SRKEY.referencing_column_name + ' = NULL' + CHAR(10) +
                           'FROM [' + SRKEY.referencing_schema_name + '].[' + SRKEY.referencing_table_name + ']' + CHAR(10) +
                           'WHERE [' + SRKEY.referencing_column_name + '] IN' + CHAR(10) +
                           '     (SELECT ' + SRKEY.primary_key_column_name + ' FROM [' + SRKEY.primary_key_schema_name + '].[' + SRKEY.primary_key_table_name + ')' + CHAR(10)
                     FROM @self_referencing_keys SRKEY
                     WHERE SRKEY.self_referencing_keys_id = @relationship_id;
 
                     -- Print self-referencing data modification statements
                     PRINT @sql_command;
              END
              ELSE
              BEGIN
                     -- Remove any rows for which we have no data.
                     DELETE SRKEY
                     FROM @self_referencing_keys SRKEY
                     WHERE SRKEY.self_referencing_keys_id = @relationship_id;
              END
 
              UPDATE @self_referencing_keys
                     SET processed = 1,
                           row_count = @row_count
              WHERE self_referencing_keys_id = @relationship_id;
 
              DELETE FROM @row_counts;
       END
 
       -------------------------------------------------------------------------------------------------------------------------------
       -- Generate row counts for non-self-referencing data and delete any entries that have a zero row count
       -------------------------------------------------------------------------------------------------------------------------------
       DECLARE @object_id_hierarchy_sql VARCHAR(MAX);
     
       DECLARE @process_schema_name SYSNAME = '';
       DECLARE @process_table_name SYSNAME = '';
       DECLARE @referencing_column_name SYSNAME = '';
       DECLARE @join_sql VARCHAR(MAX) = '';
       DECLARE @object_id_hierarchy_rank VARCHAR(MAX) = '';
       DECLARE @referencing_column_name_rank VARCHAR(MAX) = '';
       DECLARE @old_schema_name SYSNAME = '';
       DECLARE @old_table_name SYSNAME = '';
       DECLARE @foreign_key_id INT;
       DECLARE @has_same_object_id_hierarchy BIT; -- Will be used if this foreign key happens to share a hierarchy with other keys
       DECLARE @level INT;
 
       WHILE EXISTS (SELECT * FROM @foreign_keys WHERE processed = 0 AND level > 0 )
       BEGIN
              SELECT @count_sql_command = '';
              SELECT @join_sql = '';
              SELECT @old_schema_name = '';
              SELECT @old_table_name = '';
 
              CREATE TABLE #inner_join_tables
                     (      id INT NOT NULL IDENTITY(1,1),
                           object_id INT);
             
              SELECT TOP 1
                     @process_schema_name = FKEYS.referencing_schema_name,
                     @process_table_name = FKEYS.referencing_table_name,
                     @object_id_hierarchy_rank = FKEYS.object_id_hierarchy_rank,
                     @referencing_column_name_rank = FKEYS.referencing_column_name_rank,
                     @foreign_key_id = FKEYS.foreign_key_id,
                     @referencing_column_name = FKEYS.referencing_column_name,
                     @has_same_object_id_hierarchy = CASE WHEN (SELECT COUNT(*) FROM @foreign_keys FKEYS2 WHERE FKEYS2.object_id_hierarchy_rank = FKEYS.object_id_hierarchy_rank) > 1 THEN 1 ELSE 0 END,
                     @level = FKEYS.level
              FROM @foreign_keys FKEYS
              WHERE FKEYS.processed = 0
              AND FKEYS.level > 0
              ORDER BY FKEYS.level ASC;
 
              SELECT @object_id_hierarchy_sql ='SELECT ' + REPLACE (@object_id_hierarchy_rank, '-', ' UNION ALL SELECT ');
 
              INSERT INTO #inner_join_tables
                     EXEC(@object_id_hierarchy_sql);
 
              SET @count_sql_command = 'SELECT COUNT(*) FROM [' + @process_schema_name + '].[' + @process_table_name + ']' + CHAR(10);
 
              SELECT
                     @join_sql = @join_sql +
                     CASE
                           WHEN (@old_table_name <> FKEYS.primary_key_table_name OR @old_schema_name <> FKEYS.primary_key_schema_name)
                                  THEN 'INNER JOIN [' + FKEYS.primary_key_schema_name + '].[' + FKEYS.primary_key_table_name + '] ' + CHAR(10) + ' ON ' +
                                  ' [' + FKEYS.primary_key_schema_name + '].[' + FKEYS.primary_key_table_name + '].[' + FKEYS.primary_key_column_name + '] =  [' + FKEYS.referencing_schema_name + '].[' + FKEYS.referencing_table_name + '].[' + FKEYS.referencing_column_name + ']' + CHAR(10)
                           ELSE ''
                     END
                     , @old_table_name = CASE
                                                              WHEN (@old_table_name <> FKEYS.primary_key_table_name OR @old_schema_name <> FKEYS.primary_key_schema_name)
                                                                     THEN FKEYS.primary_key_table_name
                                                              ELSE @old_table_name
                                                       END
                     , @old_schema_name = CASE
                                                              WHEN (@old_table_name <> FKEYS.primary_key_table_name OR @old_schema_name <> FKEYS.primary_key_schema_name)
                                                                     THEN FKEYS.primary_key_schema_name
                                                              ELSE @old_schema_name
                                                       END
              FROM @foreign_keys FKEYS
              INNER JOIN #inner_join_tables join_details
              ON FKEYS.referencing_object_id  = join_details.object_id
              WHERE CHARINDEX(FKEYS.object_id_hierarchy_rank + '-', @object_id_hierarchy_rank + '-') <> 0 -- Do not allow cyclical joins through the same table we are originating from
              AND FKEYS.level > 0
              AND ((@has_same_object_id_hierarchy = 0) OR (@has_same_object_id_hierarchy = 1 AND FKEYS.referencing_column_name = @referencing_column_name) OR (@has_same_object_id_hierarchy = 1 AND @level > FKEYS.level))
              ORDER BY join_details.ID DESC;
 
              SELECT @count_sql_command = @count_sql_command +  @join_sql;
 
              IF @where_clause <> ''
              BEGIN
                     SELECT @count_sql_command = @count_sql_command + ' WHERE (' + @where_clause + ')';
              END
 
              INSERT INTO @row_counts
                     (row_count)
              EXEC (@count_sql_command);
              SELECT @row_count = row_count FROM @row_counts;
 
              IF @row_count = 0
              BEGIN
                     DELETE FKEYS
                     FROM @foreign_keys FKEYS
                     WHERE FKEYS.object_id_hierarchy_rank LIKE @object_id_hierarchy_rank + '%' -- Remove all paths that share the same root as this one.
                     AND (FKEYS.object_id_hierarchy_rank <> @object_id_hierarchy_rank OR FKEYS.foreign_key_id = @foreign_key_id) -- Don't remove paths where there are multiple foreign keys from one table to another.
                     AND FKEYS.referencing_column_name_rank LIKE @referencing_column_name_rank + '%' -- Don't remove paths that have identical table relationships, but that occur through different FK columns.
              END
              ELSE
              BEGIN
                     UPDATE FKEYS
                           SET processed = 1,
                                  row_count = @row_count,
                                  join_condition_sql = @join_sql
                     FROM @foreign_keys FKEYS
                     WHERE FKEYS.foreign_key_id = @foreign_key_id;
              END
 
              DELETE FROM @row_counts;
              DROP TABLE #inner_join_tables
       END
 
       -- Reset processed flag for all rows
       UPDATE @foreign_keys
       SET processed = 0;
 
       -------------------------------------------------------------------------------------------------------------------------------
       -- Generate the Delete script for non-self-referencing data
       -------------------------------------------------------------------------------------------------------------------------------
 
       WHILE EXISTS (SELECT * FROM @foreign_keys WHERE processed = 0 AND level > 0 )
       BEGIN
              SELECT @sql_command = '';
              SELECT @join_sql = '';
              SELECT @old_table_name = '';
              SELECT @old_schema_name = '';
 
              SELECT TOP 1
                     @process_schema_name = referencing_schema_name,
                     @process_table_name = referencing_table_name,
                     @object_id_hierarchy_rank = object_id_hierarchy_rank,
                     @row_count = row_count,
                     @foreign_key_id = foreign_key_id
              FROM @foreign_keys
              WHERE processed = 0
              AND level > 0
              ORDER BY level DESC;
 
              SET @sql_command = '-- Maximum rows to be deleted: ' + CAST(@row_count AS VARCHAR(25)) + CHAR(10) +
              'DELETE [' + @process_table_name + ']' + CHAR(10) + 'FROM [' + @process_schema_name + '].[' + @process_table_name + ']' + CHAR(10);
 
              SELECT
                     @join_sql = FKEYS.join_condition_sql
              FROM @foreign_keys FKEYS
              WHERE FKEYS.foreign_key_id = @foreign_key_id
 
              SELECT @sql_command = @sql_command +  @join_sql;
 
              IF @where_clause <> ''
              BEGIN
                     SELECT @sql_command = @sql_command + 'WHERE (' + @where_clause + ')' + CHAR(10);
              END
 
              -- If rows exist to be deleted, then print those delete statements.
              PRINT @sql_command + 'GO' + CHAR(10);
 
              UPDATE @foreign_keys
                     SET processed = 1
              WHERE foreign_key_id = @foreign_key_id
       END
 
       -- Delete data from the root table
       SET @sql_command = '-- Rows to be deleted: ' + CAST(@base_table_row_count AS VARCHAR(25)) + CHAR(10) +
       'DELETE FROM [' + @process_schema_name + '].[' + @table_name + ']';
 
       IF @where_clause <> ''
       BEGIN
              SELECT @sql_command = @sql_command + CHAR(10) + 'WHERE ' + @where_clause;
       END
 
       -- Print deletion statement for root table
       PRINT @sql_command;
 
       -- Select remaining data from hierarchical tables & Update SELECT data for the base table to reflect the row count calculated at the start of this script
       UPDATE @foreign_keys
              SET row_count = @base_table_row_count,
                     processed = 1
       WHERE level = 0;
 
       IF (SELECT COUNT(*) FROM @self_referencing_keys) > 0
       BEGIN
              SELECT
                     *
              FROM @self_referencing_keys;
       END
       SELECT
              *
       FROM @foreign_keys;
END
GO
 

We can run some test scenarios on Adventureworks:

我们可以在Adventureworks上运行一些测试方案:

 
EXEC dbo.atp_schema_mapping
       @schema_name = 'Production',
       @table_name = 'Product',
       @where_clause = 'Product.Color = ''Silver'''
GO
 

This will return 14 relationships that stem from Production.Product, up to 2 levels away from that table.  The DELETE statements are printed to the text window and are directly tied to each row of relationship output from above. 

这将返回14个源自Production.Product的关系,距离该表最多2个级别。 DELETE语句打印到文本窗口,并直接绑定到从上方输出的每一行关系。

Another example can be run that shows an additional level of relationship abstraction, and the 15 relationships that exist as part of it:

可以运行另一个示例,该示例显示关系抽象的附加级别,以及作为其一部分存在的15个关系:

 
EXEC dbo.atp_schema_mapping
       @schema_name = 'Production',
       @table_name = 'ProductModel',
       @where_clause = 'ProductModel.Name = ''HL Mountain Frame'''
GO
 

We can look at one relationship from this example to see what data was output and why:

我们可以从此示例中查看一种关系,以查看输出了哪些数据以及原因:

foreign_key_id: 20 referencing_object_id: 1154103152

foreign_key_id :20 referenceencing_object_id :1154103152

referencing_schema_name: Sales
referencing_table_name: SalesOrderDetail
referencing_column_name: SpecialOfferID
primary_key_object_id: 414624520
primary_key_schema_name: Sales
primary_key_table_name: SpecialOfferProduct
primary_key_column_name: SpecialOfferID
level: 3
object_id_hierarchy_rank: 418100530-1973582069-414624520-1154103152
referencing_column_hierarchy_rank: ProductModelID-ProductID-SpecialOfferID

referenceencing_schema_name :销售
referencing_table_name :SalesOrderDetail
referencing_column_name :SpecialOfferID
primary_key_object_id :414624520
primary_key_schema_name :销售
primary_key_table_name :SpecialOfferProduct
primary_key_column_name :SpecialOfferID
等级 :3
object_id_hierarchy_rank :418100530-1973582069-414624520-1154103152
referenceencing_column_hierarchy_rank :ProductModelID-ProductID-SpecialOfferID

This charts the foreign key relationships Sales. SalesOrderDetailSalesSpecialOfferProductProduction.Product, and Production.ProductModel as is shown in the following ERD:

这将绘制外键关系Sales。 SalesOrderDetailSalesSpecialOfferProductProduction.Product  Production.ProductModel,如以下ERD中所示:

The resulting DELETE statement for this relationship is as follows:

此关系的结果DELETE语句如下:

 
DELETE [SalesOrderDetail]
FROM [Sales].[SalesOrderDetail]
INNER JOIN [Sales].[SpecialOfferProduct]
 ON  [Sales].[SpecialOfferProduct].[SpecialOfferID] =  [Sales].[SalesOrderDetail].[SpecialOfferID]
INNER JOIN [Production].[Product]
 ON  [Production].[Product].[ProductID] =  [Sales].[SpecialOfferProduct].[ProductID]
INNER JOIN [Production].[ProductModel]
 ON  [Production].[ProductModel].[ProductModelID] =  [Production].[Product].[ProductModelID]
WHERE (ProductModel.Name = 'HL Mountain Frame')
GO
 

结论 (Conclusion)

The ability to quickly map out a complete relationship hierarchy can be a huge time-saving tool to have at one’s disposal.  The stored procedure presented here is intended to be a base research script.  You, the user, can customize to your heart’s content, adding additional parameters, print options, or unique logic that is exclusive to your business.  Deletion was the target usage, but a similar process could be used in any application where complete hierarchical knowledge of a database (or part of a database) is needed.  This stored procedure can be used as-is with no modifications, and will still provide quite a bit of information on your database schema.

快速绘制出完整的关系层次结构的能力可能是节省大量时间的工具。 本文介绍的存储过程旨在作为基础研究脚本。 用户(您)可以根据自己的心脏进行自定义,添加其他参数,打印选项或业务专有的独特逻辑。 删除是目标用法,但是在需要完整的数据库层次结构知识(或数据库的一部分)的任何应用程序中,都可以使用类似的过程。 此存储过程可以按原样使用,而无需进行任何修改,并且仍将提供有关数据库模式的大量信息。

If you are working in a relational environment that lacks foreign keys, you can still utilize this approach, but would have to define those relationships in a table, file, or TSQL statement.  Once available, the stored procedure could be modified slightly to take that data as an input, format it as the system views would, and continue as though it were the same data.

如果您在缺少外键的关系环境中工作,则仍然可以使用此方法,但是必须在表,文件或TSQL语句中定义这些关系。 一旦可用,可以对存储过程进行一些修改,以将该数据作为输入,按照系统视图的格式对其进行格式化,然后继续进行,就好像它们是相同的数据一样。

Leaving off the WHERE clause allows you to get a complete relationship tree if one were to want to touch all data in a target table.  If your interest was a database map with no omissions, you could drop the DELETE statement that was added in for efficiency.  The resulting stored procedure will take more time to execute, but will provide all possible relationships from a given entity, which could be useful when researching code changes or data integrity.

如果要触摸目标表中的所有数据,则无需使用WHERE子句即可获得完整的关系树。 如果您的兴趣是没有遗漏的数据库映射,则可以删除为提高效率而添加的DELETE语句。 结果存储过程将花费更多时间执行,但是将提供给定实体的所有可能关系,这在研究代码更改或数据完整性时可能很有用。

Regardless of usage, knowledge is power, and the ability to obtain large amounts of schema-dependent information quickly & efficiently can turn complex tasks into trivial bits of work.  The ability to customize those processes allows for nearly limitless applications and the ability to conduct research and reconnaissance that would otherwise be extremely labor-intensive, error-prone, or seemingly impossible!

无论使用什么,知识就是力量,快速有效地获取大量与模式相关的信息的能力会将复杂的任务变成琐碎的工作。 定制这些流程的能力允许几乎无限的应用程序,以及进行研究和侦察的能力,否则这些工作将非常费力,容易出错或看起来不可能!

翻译自: https://www.sqlshack.com/mapping-schema-and-recursively-managing-data-part-2/

组织架构递归

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值