为了加快检索数据,有了索引的概念
检索包括select、insert、update、delete
以B+树为基础结构来进行管理
索引常规使用及优化
工具
explain
optimize trace
mysql.trace
explain
type 列
依次从最优到最差分别为:system > const > eq_ref > ref > range > index > ALL
system/const
mysql 对查询能够优化成一个常量,比如使用 primary key 或者 unique key 最多匹配当一行数据
system是const的特例,表里只有一条元组匹配时为system
explain extended 命令可以通过 show warnings 查看附加信息。
eq_ref
primary key 或 unique key 索引的所有部分被连接使用 ,最多只会返回一条符合条件的记录
ref
相比 eq_ref,不使用唯一索引,而是使用普通索引或者唯一性索引的部分前缀,索引要和某个值相比较,可能会找到多个符合条件的行。
possible_keys列
可能使用的索引列
key列
实际使用的索引列
key_len列
key_len计算规则如下:
1)字符串
char(n):n字节长度
varchar(n):2字节存储字符串长度,如果是utf-8,则长度 3n + 2
2)数值类型
tinyint:1字节
smallint:2字节
int:4字节
bigint:8字节
3)时间类型
date:3字节
timestamp:4字节
datetime:8字节
注意:如果字段允许为 NULL,需要1字节记录是否为 NULL
ref列
显示了在key列记录的索引中,表查找值所用到的列或常量
rows列
mysql估计要读取并检测的行数,注意这个不是结果集里的行数
Extra列
Using index:查询的列被索引覆盖
Using where:查询的列未被索引覆盖,where筛选条件非索引的前导
Using temporary:mysql需要创建一张临时表来处理查询。
Using filesort:mysql 会对结果使用一个外部索引排序,而不是按索引次序从表里读取行。
DROP TABLE IF EXISTS `employees`;
CREATE TABLE `employees` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(24) NOT NULL DEFAULT '' COMMENT '姓名',
`age` int(11) NOT NULL DEFAULT '0' COMMENT '年龄',
`position` varchar(20) NOT NULL DEFAULT '' COMMENT '职位',
`hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '入职时间',
PRIMARY KEY (`id`),
KEY `idx_name_age_position` (`name`,`age`,`position`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='员工记录表';
INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei',22,'manager',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('HanMeimei', 23,'dev',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('Lucy',23,'dev',NOW());
-- 1.按照联合索引顺序全值匹配
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei';
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22;
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';
-- 2.最左前缀匹配 查询从索引的最左前列开始并且不跳过索引中的列
EXPLAIN SELECT * FROM employees WHERE age = 22 AND position ='manager';
EXPLAIN SELECT * FROM employees WHERE position = 'manager';
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';
-- 3.不在索引列上做任何操作(计算、函数、(自动or手动)类型转换),会导致索引失效而转向全表扫描
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';
EXPLAIN SELECT * FROM employees WHERE left(name,3) = 'LiLei';
-- 4.存储引擎不能使用索引中范围条件右边的列
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age > 22 AND position ='manager';
-- 5.*尽量使用覆盖索引(只访问索引的查询(索引列包含查询列)),减少select 语句
EXPLAIN SELECT name,age FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';
-- 6.mysql在使用不等于(!=或者<>)的时候无法使用索引会导致全表扫描
EXPLAIN SELECT * FROM employees WHERE name != 'LiLei';
-- 7.is null,is not null 也无法使用索引
EXPLAIN SELECT * FROM employees WHERE name is null;
-- 8.like以通配符开头(’$abc…’)mysql索引失效会变成全表扫描操作
EXPLAIN SELECT * FROM employees WHERE name like '%Lei';
EXPLAIN SELECT * FROM employees WHERE name like 'Lei%';
-- 9.解决like '%字符串%' 索引不被使用的方法?
EXPLAIN SELECT name,age,position FROM employees WHERE name like '%Lei%';
-- 10.字符串不加单引号索引失效
EXPLAIN SELECT * FROM employees WHERE name = '1000';
EXPLAIN SELECT * FROM employees WHERE name = 1000;
-- 11.少用or,用它连接时很多情况下索引会失效
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' or name = 'HanMeimei';
-- like KK%相当于=常量,%KK和%KK% 相当于范围
误区
MySQL的WHERE子句中包含 IS NULL、IS NOT NULL、!= 这些条件时便不能使用索引查询,只能使用全表扫描。
CREATE TABLE s1 (
id INT NOT NULL AUTO_INCREMENT,
key1 VARCHAR(100),
key2 VARCHAR(100),
key3 VARCHAR(100),
key_part1 VARCHAR(100),
key_part2 VARCHAR(100),
key_part3 VARCHAR(100),
common_field VARCHAR(100),
PRIMARY KEY (id),
KEY idx_key1 (key1),
KEY idx_key2 (key2),
KEY idx_key3 (key3),
KEY idx_key_part(key_part1, key_part2, key_part3)
) Engine=InnoDB CHARSET=utf8;
explain select * from s1 where id is null;
explain select * from s1 where key1 is null;
explain select * from s1 where key2 is not null;
explain select * from s1 where key3 != 'abc';
CREATE TABLE record_format_demo (
c1 VARCHAR(10),
c2 VARCHAR(10) NOT NULL,
c3 CHAR(10),
c4 VARCHAR(10)
) CHARSET=ascii ROW_FORMAT=COMPACT;
-- 针对一行记录来说
-- 值为NULL的列的存储结构:
比如一行记录 c1:NULL, c2='asd', c3:NULL, c4:NULL
0000 0111
0000 0 1 1 1
c4 c3 c1
MySQL query 访问成本
IO成本: 单页为1
I/O成本==页面数量*1.0+1.1
CPU成本:单行为0.2
CPU成本=ROWS*0.2+1.0
总成本=I/O成本+CPU成本
engine_cost
server_cost
show status like ‘last_query_cost’;
optimizer trace
set session optimizer_trace="enabled=on",end_markers_in_json=on; ‐‐开启trace
select * from t1 where b >= 2 and b < 8 and c > 1 and d != 4 and e != 'a';
SELECT * FROM information_schema.OPTIMIZER_TRACE;
set session optimizer_trace="enabled=off"; ‐‐关闭trace
{
"steps": [
{
"join_preparation": {
"select#": 1,
"steps": [
{
"expanded_query": "/* select#1 */ select `t1`.`a` AS `a`,`t1`.`b` AS `b`,`t1`.`c` AS `c`,`t1`.`d` AS `d`,`t1`.`e` AS `e` from `t1` where ((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
}
] /* steps */
} /* join_preparation */
},
{
"join_optimization": {
"select#": 1,
"steps": [
{
"condition_processing": {
"condition": "WHERE",
"original_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))",
"steps": [
{
"transformation": "equality_propagation",
"resulting_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
},
{
"transformation": "constant_propagation",
"resulting_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
},
{
"transformation": "trivial_condition_removal",
"resulting_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
}
] /* steps */
} /* condition_processing */
},
{
"substitute_generated_columns": {
} /* substitute_generated_columns */
},
{
"table_dependencies": [
{
"table": "`t1`",
"row_may_be_null": false,
"map_bit": 0,
"depends_on_map_bits": [
] /* depends_on_map_bits */
}
] /* table_dependencies */
},
{
"ref_optimizer_key_uses": [
] /* ref_optimizer_key_uses */
},
{
"rows_estimation": [
{
"table": "`t1`",
"range_analysis": {
"table_scan": {
"rows": 8,
"cost": 4.7
} /* table_scan */,
"potential_range_indexes": [
{
"index": "PRIMARY",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_t1_bcd",
"usable": true,
"key_parts": [
"b",
"c",
"d",
"a"
] /* key_parts */
}
] /* potential_range_indexes */,
"setup_range_conditions": [
] /* setup_range_conditions */,
"group_index_range": {
"chosen": false,
"cause": "not_group_by_or_distinct"
} /* group_index_range */,
"analyzing_range_alternatives": {
"range_scan_alternatives": [
{
"index": "idx_t1_bcd",
"ranges": [
"2 <= b < 8"
] /* ranges */,
"index_dives_for_eq_ranges": true,
"rowid_ordered": false,
"using_mrr": false,
"index_only": false,
"rows": 6,
"cost": 8.21,
"chosen": false,
"cause": "cost"
}
] /* range_scan_alternatives */,
"analyzing_roworder_intersect": {
"usable": false,
"cause": "too_few_roworder_scans"
} /* analyzing_roworder_intersect */
} /* analyzing_range_alternatives */
} /* range_analysis */
}
] /* rows_estimation */
},
{
"considered_execution_plans": [
{
"plan_prefix": [
] /* plan_prefix */,
"table": "`t1`",
"best_access_path": {
"considered_access_paths": [
{
"rows_to_scan": 8,
"access_type": "scan",
"resulting_rows": 8,
"cost": 2.6,
"chosen": true
}
] /* considered_access_paths */
} /* best_access_path */,
"condition_filtering_pct": 100,
"rows_for_plan": 8,
"cost_for_plan": 2.6,
"chosen": true
}
] /* considered_execution_plans */
},
{
"attaching_conditions_to_tables": {
"original_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))",
"attached_conditions_computation": [
] /* attached_conditions_computation */,
"attached_conditions_summary": [
{
"table": "`t1`",
"attached": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
}
] /* attached_conditions_summary */
} /* attaching_conditions_to_tables */
},
{
"refine_plan": [
{
"table": "`t1`"
}
] /* refine_plan */
}
] /* steps */
} /* join_optimization */
},
{
"join_execution": {
"select#": 1,
"steps": [
] /* steps */
} /* join_execution */
}
] /* steps */
}
优化器选项
# 查看优化选项
select @@optimizer_switch\G;
*************************** 1. row ***************************
@@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on
1 row in set (0.01 sec)
index_merge=on
index_merge_union=on
index_merge_sort_union=on
index_merge_intersection=on
engine_condition_pushdown=on
index_condition_pushdown=on
mrr=on
mrr_cost_based=on
block_nested_loop=on
batched_key_access=off
materialization=on
semijoin=on
loosescan=on
firstmatch=on
duplicateweedout=on
subquery_materialization_cost_based=on
use_index_extensions=on
condition_fanout_filter=on
derived_merge=on
弱水三千,只取一瓢
索引相关查询优化
索引条件下推 ICP [index condition pushdown]
存储引擎通过判断索引是否符合MySQL服务器传递的条件,只有当索引符合条件时才会将数据检索出来返回给MySQL服务器
Index Condition Pushdown限制条件:
当需要访问全表时,ICP用于range,ref,eq_ref和ref_or_null访问类型。
ICP可用于InnoDB和MyISAM表,包括分区的InnoDB和MyISAM表。
对于InnoDB表,ICP仅用于辅助索引。ICP的目标是减少全行读取的数量,从而减少I/O操作。 对于InnoDB聚簇索引,完整记录已经读入InnoDB缓冲区。 在这种情况下使用ICP不会降低I/O.
在虚拟生成列上创建的辅助索引不支持ICP。 InnoDB支持虚拟生成列的辅助索引。
子查询的条件无法下推。
存储函数的条件无法下推。存储引擎无法调用存储的函数。
触发条件无法下推。
CREATE TABLE `address` (
`address_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`address` varchar(50) NOT NULL,
`address2` varchar(50) DEFAULT NULL,
`district` varchar(20) NOT NULL,
`city_id` smallint(5) unsigned NOT NULL,
`postal_code` varchar(10) DEFAULT NULL,
`phone` varchar(20) NOT NULL,
`location` geometry NOT NULL,
`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`address_id`),
KEY `idx_fk_city_id` (`city_id`)
) ENGINE=InnoDB AUTO_INCREMENT=606 DEFAULT CHARSET=utf8;
mysql> set optimizer_switch = "index_condition_pushdown=off";
mysql> explain select * from sakila.address d where d.city_id > 500;
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | d | NULL | range | idx_fk_city_id | idx_fk_city_id | 2 | NULL | 101 | 100.00 | Using where |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> set optimizer_switch = "index_condition_pushdown=on";
mysql> explain select * from sakila.address d where d.city_id > 500;
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | d | NULL | range | idx_fk_city_id | idx_fk_city_id | 2 | NULL | 101 | 100.00 | Using index condition |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
索引合并 index merge
通过多个range类型的扫描并且合并它们的结果集来检索行的。
合并索引可能会引起死锁。死锁原因是加锁不一致。
CREATE TABLE `t_xxx_customer` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT '主键',
`partner_id` bigint(20) unsigned DEFAULT NULL,
`customer_id` bigint(20) unsigned DEFAULT NULL,
`deleted` tinyint(4) DEFAULT NULL,
`partner_user_id` bigint(20) unsigned DEFAULT NULL,
`xxx_id` varchar(128) DEFAULT NULL,
`xxx_name` varchar(256) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `partner_id` (`partner_id`),
KEY `customer_id` (`customer_id`),
KEY `partner_user_id` (`partner_user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=140249 DEFAULT CHARSET=utf8;
UPDATE t_xxx_customer SET xxx_id='101', xxx_name='bbb' where customer_id=235646 and partner_id=1688 and deleted=0;
UPDATE t_xxx_customer SET xxx_id='102', xxx_name='aaa' where customer_id=151069 and partner_id=1688 and deleted=0;
show engine innodb status;
*** (1) TRANSACTION: UPDATE t_xxx_customer SET xxx_id='101', xxx_name='bbb' where customer_id=235646 and partner_id=1688 and deleted=0;
*** (1) HOLDS THE LOCK(S): RECORD LOCKS space id 1640 page no 3947 n bits 432 index partner_id of table xxx.t_xxx_customer trx id 2625291980 lock_mode X locks rec but not gap Record lock, heap no 334 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 0000000000000698; asc ;; 1: len 8; hex 0000000000021747; asc G;;
*** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1640 page no 3395 n bits 160 index PRIMARY of table t_xxx_customer trx id 2625291980 lock_mode X locks rec but not gap waiting Record lock, heap no 89 PHYSICAL RECORD: n_fields 25; compact format; info bits 0
*** (2) TRANSACTION: UPDATE t_xxx_customer SET xxx_id='102', xxx_name='aaa' where customer_id=151069 and partner_id=1688 and deleted=0;
*** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 1640 page no 3395 n bits 160 index PRIMARY of table xxx.t_xxx_customer trx id 2625291981 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1640 page no 3947 n bits 432 index partner_id of table xxx.t_xxx_customer trx id 2625291981 lock_mode X locks rec but not gap waiting Record lock, heap no 334 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 0000000000000698; asc ;; 1: len 8; hex 0000000000021747; asc G;;
*** WE ROLL BACK TRANSACTION (2)
对于这条记录,第一个事务语句只有partnerid索引(1688)满足条件;对于第二个事务,customer_id和partner_id索引都满足条件。
由于每个语句执行时都需要利用两个二级索引,假设先使用customer_id索引扫描,然后使用partner_id索引扫描。
那么对于id为0x21747的记录,事务1的partner_id=1688满足条件,加partner_id锁,然后对对应的PK索引加锁;
对于事务2,对customer_id= 151069加锁,对对应的PK索引加锁,然后对partner_id=1688索引加锁。那么对partner_id二级索引和PK主键索引在两个事务的上锁顺序是相反的,所以导致了死锁。
----|----------
序号 事务1 事务2
1 customer_id 不满足条件不加锁 customer_id= 151069 加锁
2 partner_id=1688加锁 PK=0x21747加锁
3 PK=0x21747加锁 partner_id=1688加锁
4 PK=0x21747加锁
表格第2步和第3步,两个事务的加锁顺序是相反的,导致了死锁发生。
组合索引 mmr [Multi-Range Read]
MRR 通过把「随机磁盘读」,转化为「顺序磁盘读」,从而提高了索引查询的性能。
MySQL5.6版本推出来性能优化,主要功能是对于非聚簇索引查找时将随机IO转换为顺序IO;
因为不能保证二级索引上面储存的主键是顺序排序的,那么再回表的时候读取的数据页可能散落在各个节点上面,势必会造成随机IO读;
MRR的优化就是在回表的过程中,将二级索引树上查找的主键放在一块叫read_rnd_buffer的内存中先保存起来,然后对buffer的主键进行排序,排序后的主键在表中查找时效率就会变高;
buffer的大小由read_rnd_buffer_size参数控制,如果说一次MRR中buffer里面的主键越多那么优化效果也就越好;
索引覆盖
尽量较少 select * 的使用,尽可能的将查询字段覆盖在索引上,这样查询数据就不需要回表,会快很多。
Order by
索引排序
依据索引是按照顺序排序的特性,读取数据直接按照索引顺序读取,则数据是排序的
文件排序【filesort】
需要专门来按照排序条件,对数据进行排序
sort_buffer是排序时候用到的内存,是每个线程独有的,由参数sort_buffer_size控制大小;(还有一个Innodb_sort_buffer_size参数,这个参数是指在创建innodb索引期间用于对数据排序的排序缓存区大小)
根据索引a找到主键值,回到主索引树上面取出主键对应的一行记录中需要的字段,存入sort_buffer中;(如果sort_buffer 放不下则只存放,排序指定字段和主键,排序完成后,再去拿去其他所需要的字段)
从索引a中继续取下一个主键的值,重复步骤2和3直到找到所有满足的行;
在sort_buffer中对数据按照order by 指定的字段做快排;
取排序后的数据当作结果集返回;
如果需要排序的数据小于sort_buffer_size,那么排序可以在内存中完成,如果大于缓存区,则需要借助磁盘的临时文件辅助排序
可根据optimizer_trace 进行追踪
优化手段
尽量将多列索引做成联合索引
选择合适索引列顺序【联合索引,选择性最高的索引放在最左侧】
覆盖索引,尽量减少 select *
使用索引扫描来做排序
减少重复、冗余、未使用的索引 INFORMATION_SCHEMA.INDEX_STATISTICS 查询到每个索引的使用频率
减少索引和数据碎片 【使用OPTIMIZE TABLE 或者重新导入数据表来整理数据】
数据碎片:大量数据删除,插入后会填补空缺位置,未填满的碎片称作数据碎片