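Index Types
I. B-tree Indexes
Broadly speaking, storage engines all use some form of balanced tree, but the implementations differ slightly between engines. For example, the NDB engine uses a T-tree, while MyISAM and InnoDB use B-tree indexes by default.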
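Common misconceptions about B-tree indexes:
1. Adding an index to every column that frequently appears in WHERE conditions
Example: where cat_id = 3 and price > 100; // products in category 3 priced above 100
Misconception: put a separate index on cat_id and another on price.
Reality: the query generally uses only the cat_id index or the price index. They are independent single-column indexes, so normally only one of them is used at a time. A composite index on (cat_id, price) would let both conditions use the index.
2. After creating a multi-column index, the index helps no matter which column is queried
Reality: a multi-column index only helps when the query satisfies the leftmost-prefix requirement.
Take index(a, b, c) as an example: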
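Statement | Does the index help? |
---|---|
where a=3 | Yes, only column a is used |
where a=3 and b=5 | Yes, columns a and b are used |
where a=3 and b=5 and c=4 | Yes, columns a, b and c are used |
where b=4 / where c=4 | No |
where a=3 and c=4 | a can use the index, c cannot |
where a=3 and b>10 and c=7 | a and b can use the index, c cannot |
where a=3 and b like 'XXX%' and c=7 | a and b can use the index, c cannot |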
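The leftmost-prefix rule in the table can be sketched as a small function. This is a simplified model written for these notes, not MySQL's optimizer: the function name `usable_columns` and the "eq"/"range" labels are assumptions made for illustration.

```python
def usable_columns(index_cols, predicates):
    """Return the index columns a B-tree lookup can use under the leftmost-prefix rule.

    predicates maps a column name to "eq" (col = const) or
    "range" (>, <, LIKE 'abc%'). Hypothetical helper, for illustration only.
    """
    used = []
    for col in index_cols:
        kind = predicates.get(col)
        if kind is None:
            break          # gap in the prefix: later columns cannot be used
        used.append(col)
        if kind == "range":
            break          # a range condition ends the usable prefix
    return used


index_abc = ["a", "b", "c"]
print(usable_columns(index_abc, {"a": "eq"}))                           # ['a']
print(usable_columns(index_abc, {"b": "eq"}))                           # []
print(usable_columns(index_abc, {"a": "eq", "c": "eq"}))                # ['a']
print(usable_columns(index_abc, {"a": "eq", "b": "range", "c": "eq"}))  # ['a', 'b']
```

The two `break` statements correspond to the two ways a column drops out in the table: a skipped column (where a=3 and c=4) and a range condition on an earlier column (where a=3 and b>10 and c=7).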
Experiment:
Suppose a table has a composite index (c1,c2,c3,c4). How is the index used by each of the following?
A where c1=x and c2=x and c4>x and c3=x
B where c1=x and c2=x and c4=x order by c3
C where c1=x and c4= x group by c3,c2
D where c1=x and c5=x order by c2,c3
E where c1=x and c2=x and c5=x order by c2,c3
create table t4 (
c1 tinyint(1) not null default 0,
c2 tinyint(1) not null default 0,
c3 tinyint(1) not null default 0,
c4 tinyint(1) not null default 0,
c5 tinyint(1) not null default 0,
index c1234(c1,c2,c3,c4)
);
insert into t4 values (1,3,5,6,7),(2,3,9,8,3),(4,3,2,7,5);
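Analysis of A: c1=x and c2=x and c4>x and c3=x <==equivalent==> c1=x and c2=x and c3=x and c4>x, because the order of AND conditions does not matter. Therefore c1, c2, c3 and c4 can all use the index (key_len is 4).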
mysql> explain select * from t4 where c1=1 and c2=2 and c4>3 and c3=3;
+----+-------------+-------+-------+---------------+-------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------+---------+------+------+-------------+
| 1 | SIMPLE | t4 | range | c1234 | c1234 | 4 | NULL | 1 | Using where |
+----+-------------+-------+-------+---------------+-------+---------+------+------+-------------+
1 row in set
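The key_len column is a convenient way to tell how many index columns were used for the lookup. A minimal sketch of the arithmetic, assuming the t4 definition above (each TINYINT occupies 1 byte, and MySQL adds 1 byte for a nullable column); the helper name `key_len` and the `COLUMN_BYTES` table are made up for this example:

```python
# Byte widths of the indexed columns in t4 (TINYINT is 1 byte in MySQL).
COLUMN_BYTES = {"c1": 1, "c2": 1, "c3": 1, "c4": 1}


def key_len(used_cols, nullable=()):
    """Estimate EXPLAIN's key_len: width of each used column, plus 1 byte if nullable."""
    return sum(COLUMN_BYTES[c] + (1 if c in nullable else 0) for c in used_cols)


print(key_len(["c1", "c2", "c3", "c4"]))  # 4 -> all four columns used (query A)
print(key_len(["c1", "c2"]))              # 2 -> only c1 and c2 used for the lookup
print(key_len(["c1"]))                    # 1 -> only c1 used for the lookup
```

All columns in t4 are declared not null, so the nullable adjustment does not apply to the outputs in these notes.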
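Analysis of B: c1 and c2 use the index for the lookup (key_len is 2). With c1 and c2 fixed, the index entries are already ordered by c3, so no extra sort is needed; c4 plays no part in the lookup. If the sort is on c5 instead of c3, an extra sort (Using filesort) is required.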
mysql> explain select * from t4 where c1=1 and c2=2 and c4=3 order by c3;
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-------------+
| 1 | SIMPLE | t4 | ref | c1234 | c1234 | 2 | const,const | 1 | Using where |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-------------+
1 row in set
mysql> explain select * from t4 where c1=1 and c2=2 and c4=3 order by c5;
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-----------------------------+
| 1 | SIMPLE | t4 | ref | c1234 | c1234 | 2 | const,const | 1 | Using where; Using filesort |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-----------------------------+
1 row in set
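Analysis of C: only c1 uses the index for the lookup. The order in group by c3,c2 does not match the index order (c2,c3), so the index cannot be used for grouping, and a temporary table plus filesort is needed. If the grouping is changed to c2,c3, the c2 and c3 parts of the index can be used and the sort step is avoided.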
mysql> explain select * from t4 where c1=1 and c4=2 group by c3,c2;
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------------------------------------------+
| 1 | SIMPLE | t4 | ref | c1234 | c1234 | 1 | const | 1 | Using where; Using temporary; Using filesort |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------------------------------------------+
1 row in set
mysql> explain select * from t4 where c1=1 and c4=2 group by c2,c3;
+----+-------------+-------+------+---------------+-------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-------------+
| 1 | SIMPLE | t4 | ref | c1234 | c1234 | 1 | const | 1 | Using where |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-------------+
1 row in set
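Analysis of D: with c1 fixed, the entries are ordered by c2, and within each c2 value they are ordered by c3, so the c2 and c3 parts of the index provide the sort order. Therefore no filesort is used.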
mysql> explain select * from t4 where c1=1 and c5=2 order by c2,c3;
+----+-------------+-------+------+---------------+-------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-------------+
| 1 | SIMPLE | t4 | ref | c1234 | c1234 | 1 | const | 1 | Using where |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-------------+
1 row in set
Analysis of E: this statement is equivalent to select * from t4 where c1=1 and c2=3 and c5=2 order by c3; because c2 is fixed to a constant, it is ignored when sorting.
mysql> explain select * from t4 where c1=1 and c2=3 and c5=2 order by c2,c3;
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-------------+
| 1 | SIMPLE | t4 | ref | c1234 | c1234 | 2 | const,const | 1 | Using where |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+-------------+
1 row in set
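The sorting behaviour in experiments B to E follows one pattern: columns fixed by an equality condition drop out, and the remaining sort columns must match the remaining index columns in order. The function below is a rough model of that pattern only. It is not how MySQL decides on filesort, and `order_uses_index` is a name invented for this sketch.

```python
def order_uses_index(index_cols, eq_cols, order_cols):
    """Simplified check: can ORDER BY / GROUP BY read rows in index order?

    eq_cols holds the columns fixed by `col = const` in the WHERE clause.
    """
    # A column fixed to a constant does not affect ordering, so drop it
    # from both the sort list and the index column list.
    remaining_order = [c for c in order_cols if c not in eq_cols]
    remaining_index = [c for c in index_cols if c not in eq_cols]
    # The remaining sort columns must be the leading remaining index columns.
    return remaining_index[:len(remaining_order)] == remaining_order


idx = ["c1", "c2", "c3", "c4"]
print(order_uses_index(idx, {"c1", "c2", "c4"}, ["c3"]))        # True  (B, order by c3)
print(order_uses_index(idx, {"c1", "c2", "c4"}, ["c5"]))        # False (B, order by c5)
print(order_uses_index(idx, {"c1", "c4"}, ["c3", "c2"]))        # False (C, group by c3,c2)
print(order_uses_index(idx, {"c1", "c4"}, ["c2", "c3"]))        # True  (C, group by c2,c3)
print(order_uses_index(idx, {"c1", "c5"}, ["c2", "c3"]))        # True  (D)
print(order_uses_index(idx, {"c1", "c2", "c5"}, ["c2", "c3"]))  # True  (E)
```

A False result corresponds to the Using filesort (or Using temporary) entries in the EXPLAIN output above.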
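II. Hash Indexes
The theoretical lookup time complexity of a hash index is O(1).
If hash index lookups are so efficient, why not use hash indexes everywhere?
1. The output of the hash function is effectively random, so keys that are close in value end up in unrelated positions.
2. Range queries cannot be optimized. The hash has to be computed for each key separately, and the results are scattered.
3. Prefix lookups cannot use the index. With a B-tree index on a column whose value is "helloworld", both a query for xx="helloworld" and a prefix match such as xx like "hello%" can use the index. With a hash index, hash("helloworld") and hash("hello") are unrelated values.
4. Sorting cannot be optimized.
5. A row lookup is always required: the index only yields the location of the data, and the row must then be fetched from the table.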
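The contrast between the two index types can be sketched with Python's built-in structures: a dict stands in for the hash index and a sorted list for the B-tree. Both stand-ins, the sample `names` data, and the helper names are assumptions for illustration; real engines use on-disk structures.

```python
from bisect import bisect_left

names = ["hello", "helloworld", "help", "world", "apple"]

# Hash-style index: a dict maps each full key to its row position.
hash_index = {name: pos for pos, name in enumerate(names)}

# B-tree-style index: keys kept in sorted order (a sorted list stands in for the tree).
sorted_keys = sorted(names)


def hash_lookup(key):
    """Equality lookup on the hash index; returns the row position or None."""
    return hash_index.get(key)


def prefix_lookup(prefix):
    """Prefix lookup on the sorted index: matching keys form one contiguous run."""
    lo = bisect_left(sorted_keys, prefix)   # first key >= prefix
    hi = lo
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]


print(hash_lookup("helloworld"))  # 1
print(hash_lookup("hel"))         # None: the hash of a prefix is unrelated to the full key
print(prefix_lookup("hel"))       # ['hello', 'helloworld', 'help']
```

The same ordering that makes `prefix_lookup` work is what lets a B-tree serve range queries and sorted output. The dict offers neither, because its layout depends only on hash values.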
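III. Clustered Indexes
When the primary index file stores the row data directly, it is called a clustered index. Secondary indexes store a reference to the primary key. InnoDB is an example.
Notes:
1. The primary index stores the index values and also stores the row data in its leaf nodes.
2. If there is no primary key, InnoDB uses the first UNIQUE index whose columns are all NOT NULL as the clustered index.
3. If there is no such unique index, the system generates a hidden internal row ID to serve as the primary key.
4. Advantage: when looking up a small number of rows by primary key, no extra row lookup is needed.
Disadvantage: when data is inserted in irregular (non-sequential) key order, pages split frequently.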
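The page-split disadvantage can be illustrated with a toy simulation. Everything here is an assumption made for the sketch: pages hold only 4 keys (`PAGE_CAPACITY`), a full page splits in half, and an insert at the far right simply starts a new page. InnoDB's actual page management is considerably more involved.

```python
from bisect import insort

PAGE_CAPACITY = 4  # hypothetical; real pages hold far more rows


def count_page_splits(keys):
    """Insert keys into sorted fixed-size pages and count mid-page splits."""
    pages = [[]]
    splits = 0
    for key in keys:
        # Find the page whose key range should contain the new key.
        idx = 0
        while idx + 1 < len(pages) and key >= pages[idx + 1][0]:
            idx += 1
        page = pages[idx]
        insort(page, key)
        if len(page) > PAGE_CAPACITY:
            if idx == len(pages) - 1 and page[-1] == key:
                # New largest key: move it to a fresh page, no rows are shuffled.
                pages.append([page.pop()])
            else:
                # Key landed inside a full page: split the page in half.
                mid = len(page) // 2
                pages[idx:idx + 1] = [page[:mid], page[mid:]]
                splits += 1
    return splits


sequential = list(range(1, 17))
scattered = [8, 3, 12, 1, 15, 6, 10, 2, 14, 5, 9, 4, 13, 7, 11, 16]

print(count_page_splits(sequential))  # 0
print(count_page_splits(scattered))   # greater than 0
```

Under these assumptions, keys arriving in increasing order (such as an auto-increment primary key) never force a mid-page split, while the same keys in scattered order do.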
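IV. Non-clustered Indexes
The index and the data are stored separately, as in the MyISAM engine. Both primary and secondary indexes point to the physical row (its location on disk).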
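The difference between the two layouts is mainly a difference in what each index entry points to. The sketch below uses plain dicts in place of B-trees, since only the pointer structure matters here; the sample rows and all names are hypothetical.

```python
# Hypothetical rows: (id, name, price); id is the primary key.
rows = [(10, "pen", 3), (20, "ink", 8), (30, "pad", 5)]

# --- Clustered layout (InnoDB-like) ---
# The primary index holds the full row; the secondary index holds the primary key.
clustered_primary = {r[0]: r for r in rows}         # id -> row
clustered_secondary = {r[1]: r[0] for r in rows}    # name -> id


def clustered_by_name(name):
    pk = clustered_secondary[name]     # step 1: secondary index yields the primary key
    return clustered_primary[pk]       # step 2: primary index yields the row itself


# --- Non-clustered layout (MyISAM-like) ---
# Rows live in a separate data file (a list here); both indexes store row positions.
data_file = list(rows)
heap_primary = {r[0]: pos for pos, r in enumerate(data_file)}     # id -> position
heap_secondary = {r[1]: pos for pos, r in enumerate(data_file)}   # name -> position


def heap_by_id(pk):
    return data_file[heap_primary[pk]]       # index yields a position, then read the data


def heap_by_name(name):
    return data_file[heap_secondary[name]]   # same path for the secondary index


print(clustered_primary[20])      # (20, 'ink', 8): primary-key lookup returns the row directly
print(clustered_by_name("ink"))   # (20, 'ink', 8): secondary lookup goes through the primary key
print(heap_by_id(30))             # (30, 'pad', 5)
print(heap_by_name("pen"))        # (10, 'pen', 3)
```

In the clustered model a primary-key lookup ends at the row, while a secondary lookup takes two steps. In the non-clustered model every lookup, primary or secondary, takes the extra hop into the data file.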