之前写过一篇关于,在PG中如何使用组合索引的文章
https://blog.csdn.net/chuckchen1222/article/details/90514332
今天是要记录关于查询第二列的使用。
建里组合索引,一个是为了性能,当使用同样的条件进行查询的时候,可以直接用到组合索引,第二也是为了避免过多索引带给数据库性能的损耗。
如果一个SQL,直接去使用后边的列,是不会走到索引的,那是为了它再建一个索引吗?
下面来看实验,先建一个实验表
testdb=> \d test0411
Table "public.test0411"
Column | Type | Collation | Nullable | Default
---------+-----------------------------+-----------+----------+--------------------------------------
id | integer | | not null | nextval('test0411_id_seq'::regclass)
name | character varying(10) | | |
sex | character varying(10) | | |
age | smallint | | |
utime | timestamp without time zone | | |
country | character varying(10) | | |
Indexes:
"test0411_pkey" PRIMARY KEY, btree (id)
首先建立一个 sex,age,country的索引。
testdb=> create index on test0411(sex,age,country);
CREATE INDEX
当查询 时,指定sex,age时,会使用到索引,当只是用age时,必定不会使用索引。
testdb=> explain select id, name from test0411 where sex='m' and age = 20;
QUERY PLAN
--------------------------------------------------------------------------------------------
Bitmap Heap Scan on test0411 (cost=4.50..20.53 rows=22 width=11)
Recheck Cond: (((sex)::text = 'm'::text) AND (age = 20))
-> Bitmap Index Scan on test0411_sex_age_country_idx (cost=0.00..4.50 rows=22 width=0)
Index Cond: (((sex)::text = 'm'::text) AND (age = 20))
(4 rows)
testdb=> explain select id, name from test0411 where age = 20;
QUERY PLAN
-----------------------------------------------------------
Seq Scan on test0411 (cost=0.00..39.46 rows=44 width=11)
Filter: (age = 20)
(2 rows)
那如果我想查询age的时候,也使用到索引怎么办?
将sex也加入到查询条件中,这样就可以使用到索引了。
testdb=> explain select id, name from test0411 where sex in ('m','f') and age = 20;
QUERY PLAN
--------------------------------------------------------------------------------------------
Bitmap Heap Scan on test0411 (cost=9.01..24.67 rows=44 width=11)
Recheck Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age = 20))
-> Bitmap Index Scan on test0411_sex_age_country_idx (cost=0.00..8.99 rows=44 width=0)
Index Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age = 20))
(4 rows)
当我们使用范围扫描查询年龄的时候,依然可以走到索引上。
testdb=> explain select id, name from test0411 where sex in ('m','f') and age > 20 and age < 30;
QUERY PLAN
----------------------------------------------------------------------------------------------
Bitmap Heap Scan on test0411 (cost=13.78..35.94 rows=409 width=11)
Recheck Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age > 20) AND (age < 30))
-> Bitmap Index Scan on test0411_sex_age_country_idx (cost=0.00..13.68 rows=409 width=0)
Index Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age > 20) AND (age < 30))
(4 rows)
如果我们在查询中加上country的条件,依然可以走到索引上。
testdb=> explain select id, name from test0411 where sex in ('m','f') and age > 20 and age < 30 and country = 'ab';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
---------
Bitmap Heap Scan on test0411 (cost=14.71..30.82 rows=21 width=11)
Recheck Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age > 20) AND (age < 30) AND ((country)::text = 'ab'::te
xt))
-> Bitmap Index Scan on test0411_sex_age_country_idx (cost=0.00..14.70 rows=21 width=0)
Index Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age > 20) AND (age < 30) AND ((country)::text = 'ab'
::text))
(4 rows)
testdb=> explain select id, name from test0411 where sex in ('m','f') and age > 20 and age < 30 and country in ('ab','ac');
QUERY PLAN
----------------------------------------------------------------------------------------------
Bitmap Heap Scan on test0411 (cost=13.69..36.87 rows=41 width=11)
Recheck Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age > 20) AND (age < 30))
Filter: ((country)::text = ANY ('{ab,ac}'::text[]))
-> Bitmap Index Scan on test0411_sex_age_country_idx (cost=0.00..13.68 rows=409 width=0)
Index Cond: (((sex)::text = ANY ('{m,f}'::text[])) AND (age > 20) AND (age < 30))
(5 rows)
所以如果建立组合索引的时候,我们关注以下几个问题:
- 查询的条件。都需要将哪些列加入到组合索引。
- 列使用的频率。
- 列值的属性。列值的种类,比如没有像性别这种很少的判断,那我们可以放在前边,当查询用不到它的时候,也可以将它列在SQL中,这样就可以使用到这个组合索引。避免了再建一个新索引或是用不到组合索引的麻烦。
另附上插入实验数据的脚本:
#!/usr/bin/python
# -*- coding: UTF-8 -*-
conn = psycopg2.connect(database="testdb", user="dbadmin", password="", host="127.0.0.1", port="5432")
cursor = conn.cursor()
for i in range(1,2001):
sex = random.choice('mf')
age = random.randrange(18, 60, 1)
country = ''.join(random.sample('abcde',2))
name = ''.join(random.sample(string.ascii_letters + string.digits, 6))
sql = '''insert into test0411(name, sex, age, country, utime)
values (\'%s\', \'%s\',\'%s\',\'%s\',now()) ''' % (name, sex, age, country)
cursor.execute(sql)
conn.commit()