临时表vs.表变量以及它们对SQLServer性能的影响

最新推荐文章于 2023-06-24 18:35:23 发布

peng21o

最新推荐文章于 2023-06-24 18:35:23 发布

阅读量1k

点赞数

文章标签： sqlserver insert table 存储测试 go

--王成辉翻译整理，转贴请注明出自微软BI开拓者 www.windbi.com

在临时表

create table #T (…)

和表变量

declare @T table (…)

之间主要有
3 个理论上的不同。

第一个
不同使事务日志不会记录表变量。因此，它们脱离了事务机制的范围，从下面的例子可显而易见：

create table #T (s varchar(128))

declare @T table (s varchar(128))

insert into #T select 'old value #'

insert into @T select 'old value @'

begin transaction

update #T set s='new value #'

update @T set s='new value @'

rollback transaction

select * from #T

select * from @T

s

---------------

old value #

s

---------------

new value @

在声明临时表
#T 和表变量 @T 之后，给它们分配一个相同的值为 old value 字符串。然后，开始一个事务去更新它们。此时，它们都将有新的相同的值 new value 字符串。但当事务回滚时，正如你所看到的，表变量 @T 保留了这个新值而没有返回 old value 字符串。这是因为即使表变量在事务内被更新了，它本身不是事务的一部分。

第二个
主要的不同是任何一个使用临时表的存储过程都不会被预编译，然而使用表变量的存储过程的执行计划可以预先静态的编译。预编译一个脚本的主要好处在于加快了执行的速度。这个好处对于长的存储过程更加显著，因为对它来说重新编译代价太高。

最后
，表变量仅存在于那些变量能存在的相同范围内。和临时表相反，它们在内部存储过程和 exec （ string ）语句里是不可见的。它们也不能在 insert/exec 语句里使用。

性能比较

首先，准备一个有
100 万记录的测试表：

create table NUM (n int primary key, s varchar(128))

GO

set nocount on

declare @n int

set @n=1000000

while @n>0 begin

insert into NUM

select @n,'Value: '+convert(varchar,@n)

set @n=@n-1

end

GO

准备测试存储过程
T1 ：

create procedure T1

@total int

as

create table #T (n int, s varchar(128))

insert into #T select n,s from NUM

where n%100>0 and n<=@total

declare @res varchar(128)

select @res=max(s) from NUM

where n<=@total and

not exists(select * from #T

where #T.n=NUM.n)

GO

使用参数从
10 ， 100 ， 1000 ， 10000 ， 100000 到 1000000 不等来调用，它复制给定数量的记录到临时表（一些另外，它跳过那些能被 100 整除的数值），然后找到缺失记录的最大值。当然，记录越多，执行的时间就越长：
为了测量正好的执行时间，使用下面的代码：

declare @t1 datetime, @n int

set @t1=getdate()

set @n=
100 – (**)
while @n>0 begin

exec T1 1000 – (*)

   set @n=@n-1 end

select datediff(ms,@t1,getdate())

GO

（
* ）表示程序里边的参数从 10 到 1000000 不等。
（
** ）表示如果执行时间太短，就重复相同的循环 10 到 100 次不等。

多次运行代码以获得执行的结果。

该结果在下面的表
1 里能找到。

下面试着给临时表添加一个主键来提升存储过程的性能：

create procedure T2

   @total int

as

create table #T (n int primary key, s varchar(128))

insert into #T select n,s from NUM

where n%100>0 and n<=@total

declare @res varchar(128)

select @res=max(s) from NUM

where n<=@total and

not exists(select * from #T

where #T.n=NUM.n)

GO

然后，创建第三个。此时有聚集索引，它会工作得更好。但是是在插入数据到临时表之后创建的索引——通常，这样会更好：

create procedure T3

   @total int

as

create table #T (n int, s varchar(128))

insert into #T select n,s from NUM

where n%100>0 and n<=@total

   create clustered index Tind on #T (n)

   declare @res varchar(128)

select @res=max(s) from NUM

where n<=@total and

not exists(select * from #T

where #T.n=NUM.n)

GO

令人惊奇！大数据量花费的时间很长；仅仅添加
10 条记录就花费了 13 毫秒。这个问题在于创建索引语句强迫 SQLServer 去重新编译存储过程，显著的降低了执行效率。

现在试着使用表变量来完成相同的事情：

create procedure V1

   @total int

as

   declare @V table (n int, s varchar(128))

   insert into @V select n,s from NUM

where n%100>0 and n<=@total

declare @res varchar(128)

select @res=max(s) from NUM

where n<=@total and

not exists(select * from @V V

where V.n=NUM.n)

GO

使我们惊奇的是，该版本不是明显的比用临时表的快。这是由于在存储过程开头创建表
#T 语句时进行了特别优化的缘故。对整个范围内的值， V1 和 T1 工作得一样好。

下面试试有主键的情形：

create procedure V2

   @total int

as

declare @V table (n int primary key, s varchar(128))

insert into @V select n,s from NUM

where n%100>0 and n<=@total

declare @res varchar(128)

select @res=max(s) from NUM

where n<=@total and

not exists(select * from @V V

where V.n=NUM.n)

GO

这个结果很快，但
T2 超过了该版本。

Records

T1

T2

T3

V1

V2

10

0.7

1

13.5

0.6

0.8

100

1.2

1.7

14.2

1.2

1.3

1000

7.1

5.5

27

7

5.3

10000

72

57

82

71

48

100000

883

480

580

840

510

1000000

45056

6090

15220

20240

12010

表
1 ：使用 SQLServer2000 ，时间单位毫秒

但真正使我们震惊的是在
SQLServer2005 上的情形：

N

T1

T2

T3

V1

V2

10

0.5

0.5

5.3

0.2

0.2

100

2

1.2

6.4

61.8

2.5

1000

9.3

8.5

13.5

168

140

10000

67.4

79.2

71.3

17133

13910

100000

700

794

659
Too long! Too long!

1000000

10556

8673

6440
Too long! Too long!

表
2 ：使用 SQLServer2005 （时间单位毫秒）

有时，
SQL2005 比 SQL2000 快（上面标记为绿色的部分）。但大多数情况下，特别是在数据量巨大时，存储过程使用表变量花费了更长的时间（红色部分）。在 4 种情形下，我甚至放弃了等待。

结论

- 在什么时候和什么地方使用临时表或表变量没有一个普遍的规则。试着都测试测试它们。
- 在你的测试里，少量的记录和大量的数据集都要进行测试。
- 当在你的存储过程里使用了复杂的逻辑的时候要小心迁移到SQL2005。相同的代码在SQLServer2005上可能运行要慢10到100倍。

sp:

每种临时对像都有自己的优点，表变量的优势是它所涉及的重新编译次数较少。缺点拓狼提到了，因为缺少分布统计信息，当优化器需要用直方图确定选择性时，可能会得到低效的查询计划。特别是当数据量大的时候尤为严重。

感觉数据量少的时候可以用表变量，而重要的查询用临时表。但当你大量使用临时表时，可能你得多多注意你的tempdb，它可能会成为你系统中的瓶颈。

peng21o

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
临时表vs.表变量以及它们对SQLServer性能的影响

--王成辉翻译整理，转贴请注明出自微软BI开拓者www.windbi.com 在临时表create table #T (…)和表变量declare @T table (…)之间主要有3个理论上的不同。第一个不同使事务日志不会记录表变量。因此，它们脱离了事务机制的范围，从下面的例子可显而易见：cr
复制链接

扫一扫

Records	T1	T2	T3	V1	V2
10	0.7	1	13.5	0.6	0.8
100	1.2	1.7	14.2	1.2	1.3
1000	7.1	5.5	27	7	5.3
10000	72	57	82	71	48
100000	883	480	580	840	510
1000000	45056	6090	15220	20240	12010

N	T1	T2	T3	V1	V2
10	0.5	0.5	5.3	0.2	0.2
100	2	1.2	6.4	61.8	2.5
1000	9.3	8.5	13.5	168	140
10000	67.4	79.2	71.3	17133	13910
100000	700	794	659	Too long!	Too long!
1000000	10556	8673	6440	Too long!	Too long!