SQL Server Aggregate Function COUNT: An Optimization Analysis

薛定谔的DBA

Article link: https://blog.csdn.net/kk185800961/article/details/44580363

Version used:

Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (Intel X86) Apr 2 2010 15:53:02 Copyright (c) Microsoft Corporation Enterprise Edition on Windows NT 5.2 <X86> (Build 3790: Service Pack 2) (Hypervisor)

--	Create the test table
--	drop table tb_CountTest
create table tb_CountTest
(
	[uniqueidentifier] [uniqueidentifier] not null,
	[bigint] [bigint] not null,
	[tinyint] [tinyint] not null,
	[int] [int] not null,
	[int0] [int] null
)
go
--	Sizes: uniqueidentifier (16 bytes), bigint (8 bytes), int (4 bytes), smallint (2 bytes), tinyint (1 byte)

--	Insert 2000 rows of test data
insert into tb_CountTest([uniqueidentifier],[bigint],[tinyint],[int],[int0])
select NEWID(),number*3-1,number*2%256,number,case when number%6=0 then null else number end
from (
	select distinct number 
	from master.dbo.spt_values 
	where number between 1 and 2000
)tab
go

--	Create the clustered index on [uniqueidentifier]
--	drop index ix_tb_CountTest_uniqueidentifier on tb_CountTest
create clustered index ix_tb_CountTest_uniqueidentifier on tb_CountTest([uniqueidentifier])
go

--	Create the nonclustered index on [int]
--	drop index ix_tb_CountTest_int on tb_CountTest
create index ix_tb_CountTest_int on tb_CountTest([int])
go

--	Run the following statements and look at their execution plans. The results are discussed below.
select count(*) from dbo.tb_counttest
select count(1) from dbo.tb_counttest
select count([uniqueidentifier]) from dbo.tb_counttest
select count([bigint]) from dbo.tb_counttest
select count([tinyint]) from dbo.tb_counttest
select count([int]) from dbo.tb_counttest

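As the plans show, the statistics are identical for all six statements. Each of them returns 2000, each scans the nonclustered index (ix_tb_CountTest_int), and the cost of the six ways of counting is the same. The following count, however, uses a clustered index scan (ix_tb_CountTest_uniqueidentifier) and returns 1667, because NULL values are filtered out.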

select count([int0]) from dbo.tb_counttest
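
The gap between 2000 and 1667 follows from how COUNT treats NULLs: count(*) and count(1) count rows, while count(column) counts only the rows where that column is not NULL. A minimal sketch on a throwaway table variable (the names @t, id and val are made up for illustration and are not part of the test above):

```sql
-- Hypothetical table variable, used only to illustrate NULL handling in COUNT
declare @t table (id int not null, val int null);

insert into @t (id, val)
values (1, 10), (2, null), (3, 30), (4, null), (5, 50), (6, null);

select
	count(*)   as all_rows,        -- counts every row
	count(val) as non_null_vals,   -- skips rows where val is NULL
	sum(case when val is null then 1 else 0 end) as null_vals
from @t;
```

This returns 6, 3 and 3. The same applies to the test table: [int0] is NULL for every number divisible by 6, which is 333 of the 2000 rows, leaving 1667.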

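Two questions:

Q1. Why do all of these use a nonclustered index scan?

Q2. Why does count([int0]) use the clustered index?

A1. Why do all of these use a nonclustered index scan?

Because scanning the nonclustered index reads fewer pages. Both access paths are indexes, and the commands below show how many pages each index occupies.

For example: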

DBCC TRACEON(3604,-1)

DBCC IND(TestDB,tb_counttest,-1)

DBCC PAGE(TestDB,1,590,3) -- clustered index (root page)

DBCC PAGE(TestDB,1,959,3) -- nonclustered index (root page)

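The output above shows that the clustered index needs 11 pages to reach the data; adding 2 IAM pages, the IO reads 13 pages in total.

The leaf level of the nonclustered index has 6 index pages; adding 2 IAM pages, the IO reads 8 pages in total.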
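
As a cross-check that does not rely on the undocumented DBCC IND and DBCC PAGE commands, the page count of each index level can also be read from the sys.dm_db_index_physical_stats dynamic management function. This is only a sketch, and it assumes it is run inside the database that holds tb_CountTest. Its page_count covers index and data pages, so the figures may not line up exactly with the totals above, which include IAM pages.

```sql
-- Pages and rows per index level for the test table.
-- 'DETAILED' mode is used so that every level is listed and record_count is filled in.
select
	i.name as index_name,
	ps.index_type_desc,
	ps.index_level,        -- 0 = leaf level
	ps.page_count,
	ps.record_count
from sys.dm_db_index_physical_stats(
		db_id(), object_id('dbo.tb_CountTest'), null, null, 'DETAILED') as ps
join sys.indexes as i
	on i.object_id = ps.object_id
	and i.index_id = ps.index_id
order by ps.index_id, ps.index_level;
```

The leaf level (index_level = 0) of the nonclustered index should be visibly smaller than that of the clustered index, since the clustered leaf level holds the full rows.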

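The leaf rows of the nonclustered index do contain the clustered index key columns. More importantly, [uniqueidentifier], [bigint], [tinyint] and [int] are all declared NOT NULL, so counting any of them is the same as counting rows: the engine does not need the column values at all, only one index entry per row. That is why even count([uniqueidentifier]) is answered by scanning the nonclustered index (ix_tb_CountTest_int), the smallest index that has one entry per row. The database engine applies this optimization automatically.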
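
One way to see which columns qualify for this shortcut is to look at their nullability in the catalog. The query below is a small sketch against the sys.columns and sys.types catalog views; a column with is_nullable = 0 can never make count(column) differ from count(*).

```sql
-- List each column of the test table with its type, size in bytes and nullability
select
	c.name as column_name,
	t.name as type_name,
	c.max_length,
	c.is_nullable          -- 0 = NOT NULL, 1 = NULL allowed
from sys.columns as c
join sys.types as t
	on t.user_type_id = c.user_type_id
where c.object_id = object_id('dbo.tb_CountTest')
order by c.column_id;
```

For the table created above, only [int0] should come back with is_nullable = 1.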

Checking with set statistics io on confirms this: each of the first six statements (count(*) and the rest) reads 8 pages, while count([int0]) reads 13 pages.
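
The comparison can be reproduced roughly as follows. This sketch only switches the IO statistics on around two of the statements; the page counts appear as "logical reads" in the Messages output, and the exact numbers depend on the instance.

```sql
set statistics io on
go

-- Expected to scan the nonclustered index ix_tb_CountTest_int
select count(*) from dbo.tb_counttest

-- Expected to scan the clustered index, because [int0] is nullable
select count([int0]) from dbo.tb_counttest
go

set statistics io off
go
```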

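A2. Why does count([int0]) use the clustered index?

Because column [int0] contains NULLs, its values have to be read, and the nonclustered index ix_tb_CountTest_int does not store [int0]. Running the statement below shows how badly the alternative performs.

Force the nonclustered index (the result still excludes NULLs):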

select count([int0]) from dbo.tb_counttest with(index(ix_tb_CountTest_int))

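The IO now shows 4008 page reads. The leaf level of the nonclustered index is scanned for 2000 rows, then a key lookup into the clustered index is done for each of those 2000 rows, and each of the two accesses also reads 2 IAM pages and 2 intermediate index pages: 2000 + 2000 + 2 × (2 + 2) = 4008. The execution plan is worse as well. That is why count([int0]) uses the clustered index.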

-- Now create another index:

--	Create a nonclustered index on [tinyint]
--	drop index ix_tb_CountTest_tinyint on tb_CountTest
create index ix_tb_CountTest_tinyint on tb_CountTest([tinyint])
go

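Scanning this narrower index takes only 5 (index pages) + 2 (IAM pages) = 7 pages in total, so performance is a little better still!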
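
By the same reasoning, count([int0]) might also benefit from a nonclustered index that stores [int0] itself, since its leaf rows would be narrower than the full rows in the clustered index. The sketch below is an assumption rather than something measured in this article: the index name ix_tb_CountTest_int0 is hypothetical, and whether the optimizer picks it, and how many pages it reads, has to be checked on a real instance.

```sql
--	Hypothetical index on the nullable column, not part of the original test
create index ix_tb_CountTest_int0 on tb_CountTest([int0])
go

set statistics io on
go

--	Check the plan and the logical reads; the result should still be 1667
select count([int0]) from dbo.tb_counttest
go

set statistics io off
go

--	Clean up
--	drop index ix_tb_CountTest_int0 on tb_CountTest
```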