避免死循环――第一部分
原文:
尽管有很多关于使用游标(cursor)还是循环来获取最好性能的讨论,不过有一点是毫无疑问的,死循环的性能是最差。每当你写一个循环时,你必须确保该循环不是死循环。我将举两个例子——两个很常见的、很可能导致死循环的场景。本文中,我将讲述一个用于存储和遍历层级关系数据的方法,该方法很常见但很容易出现错误。下一篇中,我将举例讲述逐行处理数据时可能出现的死循环。
准备条件
下列表用于存储层级数据但并没有防止层级循环。
CREATE TABLE data.Employee(
EmployeeID int NOT NULL,
ManagerID int NULL,
FirstName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
LastName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT PK_Employee_EmployeeID PRIMARY KEY CLUSTERED
(
EmployeeID ASC
)WITH (IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_PADDING OFF
GO
USE Test
GO
ALTER TABLE data.Employee WITH CHECK ADD CONSTRAINT FK_Employee_Employee_ManagerID FOREIGN KEY(ManagerID)
REFERENCES data.Employee (EmployeeID)
下面简单的员工组织关系是完全无误的:
INSERT INTO data.Employee
(EmployeeID
,ManagerID
,FirstName
,LastName)
SELECT 1, NULL, 'Dan', 'Chang' UNION ALL
SELECT 2, 1, 'June', 'Yang'
而下面的员工组织关系则并不完美,虽然保存该组织关系没有任何问题,但其中包含着循环关系。
INSERT INTO data.Employee
(EmployeeID
,ManagerID
,FirstName
,LastName)
SELECT 3, 4, 'Sydney', 'Hobart' UNION ALL
SELECT 4, 3, 'Hobart', 'Sydney'
读取该层级数据可能导致死循环。请看下列函数:
CREATE FUNCTION Readers.GetManagerAndTeam (@EmployeeID INT)
RETURNS @ret TABLE
(
EmployeeID INT NOT NULL,
ManagerID INT NULL,
FirstName varchar(50),
LastName varchar(50),
Level INT NOT NULL
)
AS
BEGIN
DECLARE @level INT, @rc INT;
INSERT INTO @ret(EmployeeID,
ManagerID,
FirstName,
LastName,
Level)
SELECT EmployeeID,
ManagerID,
FirstName,
LastName,
1
FROM data.Employee
WHERE EmployeeID=@EmployeeID;
SELECT @level=1, @rc=@@ROWCOUNT;
WHILE @rc>0 BEGIN
INSERT INTO @ret(EmployeeID,
ManagerID,
FirstName,
LastName,
Level)
SELECT e.EmployeeID,
e.ManagerID,
e.FirstName,
e.LastName,
@level+1
FROM data.Employee e JOIN @ret r ON e.ManagerID=r.EmployeeID
WHERE r.Level=@level;
SELECT @level=@level+1, @rc=@@ROWCOUNT;
END
RETURN;
END
该函数能正确的返回一个有效的员工组织关系:
SELECT EmployeeID, ManagerID, FirstName, LastName, Level
FROM Readers.GetManagerAndTeam(1)
但该函数有时会陷入死循环。请读者自己分析:
SELECT EmployeeID, ManagerID, FirstName, LastName, Level
FROM Readers.GetManagerAndTeam(3)
使用递归函数作为权宜之计
可以使用递归实现相同的功能,如下所示:
CREATE FUNCTION Readers.GetManagerAndTeam(@ManagerID INT)
RETURNS TABLE
AS
RETURN(
WITH ManagerAndTeam AS(
SELECT EmployeeID, ManagerID, FirstName, LastName, CAST(1 AS INT) AS Level
FROM data.Employee
WHERE EmployeeID = @ManagerID
UNION ALL
SELECT e.EmployeeID, e.ManagerID, e.FirstName, e.LastName, m.Level + 1
FROM data.Employee e JOIN ManagerAndTeam m
ON m.EmployeeID = e.ManagerID)
SELECT EmployeeID, ManagerID, FirstName, LastName, Level
FROM ManagerAndTeam);
在默认的设置中,该函数不会陷入死循环,经过100次递归后该函数会终止递归:
SELECT EmployeeID, ManagerID, FirstName, LastName, Level
FROM Readers.GetManagerAndTeam(3)
Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
然而,该函数还可能陷入死循环。请看:
SELECT EmployeeID, ManagerID, FirstName, LastName, Level
FROM Readers.GetManagerAndTeam(3) OPTION (MAXRECURSION 0)
解决数据问题!
你完全可以写出一个更好的更健壮的函数(UDF)来处理脏数据。不过更好的解决办法是解决脏数据的问题。先前的两个函数在层级数据中没有循环的假定条件下可以正确运行。那么就让我们加强数据库对脏数据的限制。下表就做到了使无效的员工组织关系无法存入。
CREATE TABLE data.Employee(
EmployeeID int NOT NULL,
ManagerID int NULL,
EmployeeLevel INT NOT NULL,
ManagerLevel INT NULL,
FirstName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
LastName varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
CONSTRAINT PK_Employee_EmployeeID PRIMARY KEY CLUSTERED
(
EmployeeID ASC
)WITH (IGNORE_DUP_KEY = OFF),
CONSTRAINT UNQ_Employee_EmployeeID_EmployeeLevel UNIQUE
(
EmployeeID, EmployeeLevel
),
CONSTRAINT CHK_Employee_EmployeeLevel_ManagerLevel
CHECK(EmployeeLevel=ManagerLevel+1),
CONSTRAINT CHK_Employee_ManagerID_ManagerLevel
CHECK((ManagerID IS NULL AND ManagerLevel IS NULL)
OR (ManagerID IS NOT NULL AND ManagerLevel IS NOT NULL))
)
ALTER TABLE data.Employee WITH CHECK ADD CONSTRAINT FK_Employee_Employee_ManagerID FOREIGN KEY(ManagerID,ManagerLevel)
REFERENCES data.Employee (EmployeeID,EmployeeLevel)
GO
只有有效的层级数据才可插入到该表中。
INSERT INTO data.Employee
(EmployeeID
,ManagerID
,EmployeeLevel
,ManagerLevel
,FirstName
,LastName)
SELECT 1, NULL, 1, NULL, 'Dan', 'Chang' UNION ALL
SELECT 2, 1, 2, 1, 'June', 'Yang'
There are several other ways to store hierarchies. Discussing them is beyond the scope of this post.
还有几种其它的方法来存储层级数据。讨论这些方法则超出了本文的范围。
存储层级数据请参考: