2.5 组合索引

Loong Tu

于 2020-04-27 10:56:36 发布

阅读量147

点赞数

分类专栏： # 使用索引

本文链接：https://blog.csdn.net/u011354817/article/details/105784627

版权

使用索引专栏收录该内容

5 篇文章 0 订阅

订阅专栏

文章目录

准备测试数据
性能对比测试
组合索引使用效果总结
- 组合索引使用判断
参考

有时候查询中会带有多个字段的查询条件，但是其中任何单个字段的选择率都不高，而多个字段组合起来却有比较好的选择率。这种场景是bitmap索引大显身手的地方，但是bitmap索引对更新性能的影响相当大。PostgreSQL不支持bitmap索引，但是有一个临时的内存中的类似bitmap索引的东西，叫Bitmap Index Scan。

除了Bitmap Index Scan，组合索引也是一种选择。

准备测试数据

CREATE TABLE tb1 (
    c1 int,
    c2 int
);


INSERT INTO tb1
SELECT
    round(random()*100),
    round(random()*1000)
FROM
    generate_series(1,10000000);


SELECT 
   pg_size_pretty(pg_table_size('tb1'));

 pg_size_pretty
----------------
 346 MB
(1 row)

性能对比测试

全表扫描

postgres=# \timing


SELECT
    count(*)
FROM
    tb1 
WHERE
    c1=99
    AND c2=999;
    
 count
-------
    98
(1 row)

Time: 1200.874 ms

c1单索引扫描

CREATE INDEX tb1_idx1 ON tb1 (c1);


postgres=# ANALYZE;


EXPLAIN (analyze true, buffers true, timing true)
SELECT
    count(*)
FROM
    tb1 
WHERE
    c1=99
    AND c2=999;

                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=47394.85..47394.86 rows=1 width=8) (actual time=92.643..92.643 rows=1 loops=1)
   Buffers: shared hit=39883
   ->  Bitmap Heap Scan on tb1  (cost=1737.18..47394.63 rows=91 width=0) (actual time=23.775..92.613 rows=98 loops=1)
         Recheck Cond: (c1 = 99)
         Filter: (c2 = 999)
         Rows Removed by Filter: 99813
         Heap Blocks: exact=39607
         Buffers: shared hit=39883
         ->  Bitmap Index Scan on tb1_idx1  (cost=0.00..1737.16 rows=93963 width=0) (actual time=15.078..15.078 rows=99911 loops=1)
               Index Cond: (c1 = 99)
               Buffers: shared hit=276
 Planning time: 0.080 ms
 Execution time: 93.037 ms
(13 rows)

c2单索引扫描

DROP INDEX tb1_idx1;


CREATE INDEX tb1_idx2 ON tb1 (c2);


postgres=# ANALYZE;


EXPLAIN (analyze true, buffers true, timing true)
SELECT
    count(*)
FROM
    tb1
WHERE
    c1=99
    AND c2=999;
    
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=23542.41..23542.42 rows=1 width=8) (actual time=15.505..15.505 rows=1 loops=1)
   Buffers: shared hit=8880 read=30
   ->  Bitmap Heap Scan on tb1  (cost=180.77..23542.18 rows=91 width=0) (actual time=3.831..15.482 rows=98 loops=1)
         Recheck Cond: (c2 = 999)
         Filter: (c1 = 99)
         Rows Removed by Filter: 9732
         Heap Blocks: exact=8880
         Buffers: shared hit=8880 read=30
         ->  Bitmap Index Scan on tb1_idx2  (cost=0.00..180.75 rows=9642 width=0) (actual time=2.099..2.099 rows=9830 loops=1)
               Index Cond: (c2 = 999)
               Buffers: shared read=30
 Planning time: 0.244 ms
 Execution time: 15.622 ms
(13 rows)

c1+c2索引bitmap扫描

CREATE INDEX tb1_idx1 ON tb1 (c1);


postgres=# ANALYZE;


EXPLAIN (analyze true, buffers true, timing true)
SELECT
    count(*)
FROM
    tb1
WHERE
    c1=99
    AND c2=999;
    
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2420.36..2420.37 rows=1 width=8) (actual time=21.250..21.250 rows=1 loops=1)
   Buffers: shared hit=128 read=276
   ->  Bitmap Heap Scan on tb1  (cost=1964.58..2420.07 rows=118 width=0) (actual time=21.076..21.236 rows=98 loops=1)
         Recheck Cond: ((c2 = 999) AND (c1 = 99))
         Heap Blocks: exact=98
         Buffers: shared hit=128 read=276
         ->  BitmapAnd  (cost=1964.58..1964.58 rows=118 width=0) (actual time=21.041..21.041 rows=0 loops=1)
               Buffers: shared hit=30 read=276
               ->  Bitmap Index Scan on tb1_idx2  (cost=0.00..235.44 rows=12667 width=0) (actual time=2.260..2.260 rows=9830 loops=1)
                     Index Cond: (c2 = 999)
                     Buffers: shared hit=30
               ->  Bitmap Index Scan on tb1_idx1  (cost=0.00..1728.83 rows=93386 width=0) (actual time=16.887..16.887 rows=99911 loops=1)
                     Index Cond: (c1 = 99)
                     Buffers: shared read=276
 Planning time: 0.386 ms
 Execution time: 21.355 ms
(16 rows)

组合索引扫描

DROP INDEX tb1_idx1;


DROP INDEX tb1_idx2;


CREATE INDEX tb1_idx3 ON tb1(c1,c2);


postgres=# ANALYZE;


EXPLAIN (analyze true, buffers true, timing true)
SELECT
    count(*)
FROM
    tb1
WHERE
    c1=99
    AND c2=999;
    
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=358.58..358.59 rows=1 width=8) (actual time=0.276..0.277 rows=1 loops=1)
   Buffers: shared hit=98 read=3
   ->  Bitmap Heap Scan on tb1  (cost=5.37..358.35 rows=91 width=0) (actual time=0.070..0.257 rows=98 loops=1)
         Recheck Cond: ((c1 = 99) AND (c2 = 999))
         Heap Blocks: exact=98
         Buffers: shared hit=98 read=3
         ->  Bitmap Index Scan on tb1_idx3  (cost=0.00..5.34 rows=91 width=0) (actual time=0.040..0.041 rows=98 loops=1)
               Index Cond: ((c1 = 99) AND (c2 = 999))
               Buffers: shared read=3
 Planning time: 0.331 ms
 Execution time: 0.322 ms
(11 rows)

由此可见，本例中，组合索引的效率相当高。

组合索引使用效果总结

组合索引多字段是有序的，并且是个完整的B-Tree索引，有最左原则。
多列索引是先按照第一列进行排序，然后在第一列排好序的基础上再对第二列排序，如果没有第一列的话，直接访问第二列，那第二列肯定是无序的，直接访问后面的列就用不到索引了。
组合索引的生效原则是：从前往后依次使用生效，如果中间某个索引没有使用，那么断点前面的索引部分起作用，断点后面的索引没有起作用。
例如组合索引（a,b,c）：
where a=3 and b=45 and c=5，这种情况下三个索引顺序使用中间没有断点，全部发挥作用；
where a=3 and c=5，这种情况下b就是断点，a发挥了效果，c没有效果；
where b=3 and c=4，这种情况下a就是断点，在a后面的索引都没有发挥作用，这种写法联合索引没有发挥任何效果；
where b=45 and a=3 and c=5，这个跟第一个一样，全部发挥作用，abc只要用上了就行，跟写的顺序无关。

组合索引使用判断

select * from mytable where a=3 and b=5 and c=4;
abc三个索引都在where条件里面用到了，而且都发挥了作用
select * from mytable where a=3 and c=7;
a用到索引，b没有用，所以c是没有用到索引效果的
select * from mytable where a=3 and b>7 and c=3; --范围值就算是断点
a用到了，b也用到了，c没有用到，这个地方b是范围值，也算断点，只不过自身用到了索引
select * from mytable where b=3 and c=4;
因为a索引没有使用，所以这里bc都没有用上索引效果
select * from mytable where a>4 and b=7 and c=9;
a用到了，b没有使用，c没有使用
select * from mytable where a=3 order by b;
a用到了索引，b在结果排序中也用到了索引的效果，前面说了，a下面任意一段的b是排好序的
select * from mytable where a=3 order by c;
a用到了索引，但是这个地方c没有发挥排序效果，因为中间断点了
select * from mytable where b=3 order by a;
b没有用到索引，排序中a也没有发挥索引效果

参考

关于PostgreSQL中的组合索引之一(b-tree篇)

Loong Tu

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
2.5 组合索引

文章目录测试数据性能对比测试全表扫描c1单索引扫描c2单索引扫描c1+c2索引bitmap扫描组合索引扫描组合索引使用效果总结组合索引使用判断参考有时候查询中会带有多个字段的查询条件，但是其中任何单个字段的选择率都不高，而多个字段组合起来却有比较好的选择率。这种场景是bitmap索引大显身手的地方，但是bitmap索引对更新性能的影响相当大。PostgreSQL不支持bitmap索引，但是有一个...
复制链接

扫一扫