Mysql索引和优化工具

最新推荐文章于 2024-08-23 17:37:15 发布

趣开源

最新推荐文章于 2024-08-23 17:37:15 发布

阅读量341

点赞数

分类专栏：数据库文章标签： mysql 数据库前端

本文链接：https://blog.csdn.net/scott_nodejs/article/details/126262956

版权

数据库专栏收录该内容

3 篇文章 0 订阅

订阅专栏

为了加快检索数据，有了索引的概念
检索包括select、insert、update、delete

以B+树为基础结构来进行管理

索引常规使用及优化

工具

explain

optimize trace

mysql.trace

explain

type 列

依次从最优到最差分别为：system > const > eq_ref > ref > range > index > ALL

system/const

mysql 对查询能够优化成一个常量，比如使用 primary key 或者 unique key 最多匹配当一行数据

system是const的特例，表里只有一条元组匹配时为system

explain extended 命令可以通过 show warnings 查看附加信息。

eq_ref
primary key 或 unique key 索引的所有部分被连接使用，最多只会返回一条符合条件的记录

ref
相比 eq_ref，不使用唯一索引，而是使用普通索引或者唯一性索引的部分前缀，索引要和某个值相比较，可能会找到多个符合条件的行。

possible_keys列

可能使用的索引列

key列

实际使用的索引列

key_len列

key_len计算规则如下：
1)字符串
    char(n)：n字节长度
    varchar(n)：2字节存储字符串长度，如果是utf-8，则长度 3n + 2
2)数值类型
    tinyint：1字节
    smallint：2字节
    int：4字节
    bigint：8字节　　
3)时间类型
    date：3字节
    timestamp：4字节
    datetime：8字节

注意：如果字段允许为 NULL，需要1字节记录是否为 NULL

ref列

显示了在key列记录的索引中，表查找值所用到的列或常量

rows列

mysql估计要读取并检测的行数，注意这个不是结果集里的行数

Extra列

Using index：查询的列被索引覆盖
Using where：查询的列未被索引覆盖，where筛选条件非索引的前导

Using temporary：mysql需要创建一张临时表来处理查询。
Using filesort：mysql 会对结果使用一个外部索引排序，而不是按索引次序从表里读取行。


DROP TABLE IF EXISTS `employees`;
CREATE TABLE `employees` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(24) NOT NULL DEFAULT '' COMMENT '姓名',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT '年龄',
  `position` varchar(20) NOT NULL DEFAULT '' COMMENT '职位',
  `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '入职时间',
  PRIMARY KEY (`id`),
  KEY `idx_name_age_position` (`name`,`age`,`position`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='员工记录表';

INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei',22,'manager',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('HanMeimei', 23,'dev',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('Lucy',23,'dev',NOW());

-- 1.按照联合索引顺序全值匹配
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei';
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22;
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';

-- 2.最左前缀匹配 查询从索引的最左前列开始并且不跳过索引中的列
EXPLAIN SELECT * FROM employees WHERE age = 22 AND position ='manager';
EXPLAIN SELECT * FROM employees WHERE position = 'manager';
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';

-- 3.不在索引列上做任何操作（计算、函数、（自动or手动）类型转换），会导致索引失效而转向全表扫描
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';
EXPLAIN SELECT * FROM employees WHERE left(name,3) = 'LiLei';

-- 4.存储引擎不能使用索引中范围条件右边的列
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age > 22 AND position ='manager';

-- 5.*尽量使用覆盖索引（只访问索引的查询（索引列包含查询列）），减少select 语句
EXPLAIN SELECT name,age FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';

-- 6.mysql在使用不等于（！=或者<>）的时候无法使用索引会导致全表扫描
EXPLAIN SELECT * FROM employees WHERE name != 'LiLei';

-- 7.is null,is not null 也无法使用索引
EXPLAIN SELECT * FROM employees WHERE name is null;

-- 8.like以通配符开头（’$abc…’）mysql索引失效会变成全表扫描操作
EXPLAIN SELECT * FROM employees WHERE name like '%Lei';
EXPLAIN SELECT * FROM employees WHERE name like 'Lei%';

-- 9.解决like '%字符串%' 索引不被使用的方法？
EXPLAIN SELECT name,age,position FROM employees WHERE name like '%Lei%';

-- 10.字符串不加单引号索引失效
EXPLAIN SELECT * FROM employees WHERE name = '1000';
EXPLAIN SELECT * FROM employees WHERE name = 1000;

-- 11.少用or,用它连接时很多情况下索引会失效
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' or name = 'HanMeimei';

-- like KK%相当于=常量，%KK和%KK% 相当于范围

误区

MySQL的WHERE子句中包含 IS NULL、IS NOT NULL、!= 这些条件时便不能使用索引查询，只能使用全表扫描。


CREATE TABLE s1 (
    id INT NOT NULL AUTO_INCREMENT,
    key1 VARCHAR(100),
    key2 VARCHAR(100),
    key3 VARCHAR(100),
    key_part1 VARCHAR(100),
    key_part2 VARCHAR(100),
    key_part3 VARCHAR(100),
    common_field VARCHAR(100),
    PRIMARY KEY (id),
    KEY idx_key1 (key1),
    KEY idx_key2 (key2),
    KEY idx_key3 (key3),
    KEY idx_key_part(key_part1, key_part2, key_part3)
) Engine=InnoDB CHARSET=utf8;

explain select * from s1 where id is null;

explain select * from s1 where key1 is null;

explain select * from s1 where key2 is not null;

explain select * from s1 where key3 != 'abc';


CREATE TABLE record_format_demo (
     c1 VARCHAR(10),
     c2 VARCHAR(10) NOT NULL,
     c3 CHAR(10),
     c4 VARCHAR(10)
 ) CHARSET=ascii ROW_FORMAT=COMPACT;

-- 针对一行记录来说
-- 值为NULL的列的存储结构：
比如一行记录 c1:NULL, c2='asd', c3:NULL, c4:NULL
0000 0111
0000 0   1   1   1
         c4  c3  c1

MySQL query 访问成本

IO成本：单页为1
I/O成本==页面数量*1.0+1.1

CPU成本：单行为0.2
CPU成本=ROWS*0.2+1.0

总成本=I/O成本+CPU成本

engine_cost
server_cost

show status like ‘last_query_cost’;

optimizer trace


set session optimizer_trace="enabled=on",end_markers_in_json=on; ‐‐开启trace 

select * from t1 where b >= 2 and b < 8 and c > 1 and d != 4 and e != 'a';

SELECT * FROM information_schema.OPTIMIZER_TRACE; 

set session optimizer_trace="enabled=off"; ‐‐关闭trace

{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `t1`.`a` AS `a`,`t1`.`b` AS `b`,`t1`.`c` AS `c`,`t1`.`d` AS `d`,`t1`.`e` AS `e` from `t1` where ((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
                }
              ] /* steps */
            } /* condition_processing */
          },
          {
            "substitute_generated_columns": {
            } /* substitute_generated_columns */
          },
          {
            "table_dependencies": [
              {
                "table": "`t1`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "ref_optimizer_key_uses": [
            ] /* ref_optimizer_key_uses */
          },
          {
            "rows_estimation": [
              {
                "table": "`t1`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 8,
                    "cost": 4.7
                  } /* table_scan */,
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_t1_bcd",
                      "usable": true,
                      "key_parts": [
                        "b",
                        "c",
                        "d",
                        "a"
                      ] /* key_parts */
                    }
                  ] /* potential_range_indexes */,
                  "setup_range_conditions": [
                  ] /* setup_range_conditions */,
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  } /* group_index_range */,
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "idx_t1_bcd",
                        "ranges": [
                          "2 <= b < 8"
                        ] /* ranges */,
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 6,
                        "cost": 8.21,
                        "chosen": false,
                        "cause": "cost"
                      }
                    ] /* range_scan_alternatives */,
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    } /* analyzing_roworder_intersect */
                  } /* analyzing_range_alternatives */
                } /* range_analysis */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`t1`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 8,
                      "access_type": "scan",
                      "resulting_rows": 8,
                      "cost": 2.6,
                      "chosen": true
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "condition_filtering_pct": 100,
                "rows_for_plan": 8,
                "cost_for_plan": 2.6,
                "chosen": true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))",
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table": "`t1`",
                  "attached": "((`t1`.`b` >= 2) and (`t1`.`b` < 8) and (`t1`.`c` > 1) and (`t1`.`d` <> 4) and (`t1`.`e` <> 'a'))"
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "refine_plan": [
              {
                "table": "`t1`"
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}

优化器选项


# 查看优化选项
select @@optimizer_switch\G;

*************************** 1. row ***************************
@@optimizer_switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on
1 row in set (0.01 sec)


index_merge=on
index_merge_union=on
index_merge_sort_union=on
index_merge_intersection=on
engine_condition_pushdown=on
index_condition_pushdown=on
mrr=on
mrr_cost_based=on
block_nested_loop=on
batched_key_access=off
materialization=on
semijoin=on
loosescan=on
firstmatch=on
duplicateweedout=on
subquery_materialization_cost_based=on
use_index_extensions=on
condition_fanout_filter=on
derived_merge=on

弱水三千，只取一瓢

索引相关查询优化

索引条件下推 ICP [index condition pushdown]

    存储引擎通过判断索引是否符合MySQL服务器传递的条件，只有当索引符合条件时才会将数据检索出来返回给MySQL服务器

    Index Condition Pushdown限制条件：
        当需要访问全表时，ICP用于range，ref，eq_ref和ref_or_null访问类型。

        ICP可用于InnoDB和MyISAM表，包括分区的InnoDB和MyISAM表。

        对于InnoDB表，ICP仅用于辅助索引。ICP的目标是减少全行读取的数量，从而减少I/O操作。 对于InnoDB聚簇索引，完整记录已经读入InnoDB缓冲区。 在这种情况下使用ICP不会降低I/O.

        在虚拟生成列上创建的辅助索引不支持ICP。 InnoDB支持虚拟生成列的辅助索引。

        子查询的条件无法下推。

        存储函数的条件无法下推。存储引擎无法调用存储的函数。

        触发条件无法下推。

CREATE TABLE `address` (
  `address_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `address` varchar(50) NOT NULL,
  `address2` varchar(50) DEFAULT NULL,
  `district` varchar(20) NOT NULL,
  `city_id` smallint(5) unsigned NOT NULL,
  `postal_code` varchar(10) DEFAULT NULL,
  `phone` varchar(20) NOT NULL,
  `location` geometry NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`address_id`),
  KEY `idx_fk_city_id` (`city_id`)
) ENGINE=InnoDB AUTO_INCREMENT=606 DEFAULT CHARSET=utf8;


mysql> set optimizer_switch = "index_condition_pushdown=off";

mysql> explain select * from sakila.address d where d.city_id > 500;
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | d     | NULL       | range | idx_fk_city_id | idx_fk_city_id | 2       | NULL |  101 |   100.00 | Using where |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)




mysql> set optimizer_switch = "index_condition_pushdown=on";

mysql> explain select * from sakila.address d where d.city_id > 500;
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | d     | NULL       | range | idx_fk_city_id | idx_fk_city_id | 2       | NULL |  101 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

索引合并 index merge

    通过多个range类型的扫描并且合并它们的结果集来检索行的。

    合并索引可能会引起死锁。死锁原因是加锁不一致。

CREATE TABLE `t_xxx_customer` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT '主键',
`partner_id` bigint(20) unsigned DEFAULT NULL,
`customer_id` bigint(20) unsigned DEFAULT NULL,
`deleted` tinyint(4) DEFAULT NULL,
`partner_user_id` bigint(20) unsigned DEFAULT NULL,
`xxx_id` varchar(128) DEFAULT NULL,
`xxx_name` varchar(256) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `partner_id` (`partner_id`),
KEY `customer_id` (`customer_id`),
KEY `partner_user_id` (`partner_user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=140249 DEFAULT CHARSET=utf8;

UPDATE t_xxx_customer SET xxx_id='101', xxx_name='bbb' where customer_id=235646 and partner_id=1688 and deleted=0;

UPDATE t_xxx_customer SET xxx_id='102', xxx_name='aaa' where customer_id=151069 and partner_id=1688 and deleted=0;

show engine innodb status;

*** (1) TRANSACTION: UPDATE t_xxx_customer SET xxx_id='101', xxx_name='bbb' where customer_id=235646 and partner_id=1688 and deleted=0;
*** (1) HOLDS THE LOCK(S): RECORD LOCKS space id 1640 page no 3947 n bits 432 index partner_id of table xxx.t_xxx_customer trx id 2625291980 lock_mode X locks rec but not gap Record lock, heap no 334 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 0000000000000698; asc ;; 1: len 8; hex 0000000000021747; asc G;;
*** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1640 page no 3395 n bits 160 index PRIMARY of table t_xxx_customer trx id 2625291980 lock_mode X locks rec but not gap waiting Record lock, heap no 89 PHYSICAL RECORD: n_fields 25; compact format; info bits 0
*** (2) TRANSACTION: UPDATE t_xxx_customer SET xxx_id='102', xxx_name='aaa' where customer_id=151069 and partner_id=1688 and deleted=0;
*** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 1640 page no 3395 n bits 160 index PRIMARY of table xxx.t_xxx_customer trx id 2625291981 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1640 page no 3947 n bits 432 index partner_id of table xxx.t_xxx_customer trx id 2625291981 lock_mode X locks rec but not gap waiting Record lock, heap no 334 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 0000000000000698; asc ;; 1: len 8; hex 0000000000021747; asc G;;
*** WE ROLL BACK TRANSACTION (2)


对于这条记录，第一个事务语句只有partnerid索引(1688)满足条件；对于第二个事务，customer_id和partner_id索引都满足条件。

由于每个语句执行时都需要利用两个二级索引，假设先使用customer_id索引扫描，然后使用partner_id索引扫描。
那么对于id为0x21747的记录，事务1的partner_id=1688满足条件，加partner_id锁，然后对对应的PK索引加锁；
对于事务2，对customer_id= 151069加锁，对对应的PK索引加锁，然后对partner_id=1688索引加锁。那么对partner_id二级索引和PK主键索引在两个事务的上锁顺序是相反的，所以导致了死锁。



----|----------


序号        事务1                                   事务2


1           customer_id 不满足条件不加锁            customer_id= 151069 加锁


2           partner_id=1688加锁                     PK=0x21747加锁


3           PK=0x21747加锁                          partner_id=1688加锁


4                                                   PK=0x21747加锁


表格第2步和第3步，两个事务的加锁顺序是相反的，导致了死锁发生。

组合索引 mmr [Multi-Range Read]

    MRR 通过把「随机磁盘读」，转化为「顺序磁盘读」，从而提高了索引查询的性能。

    MySQL5.6版本推出来性能优化，主要功能是对于非聚簇索引查找时将随机IO转换为顺序IO；
    因为不能保证二级索引上面储存的主键是顺序排序的，那么再回表的时候读取的数据页可能散落在各个节点上面，势必会造成随机IO读；

    MRR的优化就是在回表的过程中，将二级索引树上查找的主键放在一块叫read_rnd_buffer的内存中先保存起来，然后对buffer的主键进行排序，排序后的主键在表中查找时效率就会变高；
    buffer的大小由read_rnd_buffer_size参数控制，如果说一次MRR中buffer里面的主键越多那么优化效果也就越好；

索引覆盖

    尽量较少 select * 的使用，尽可能的将查询字段覆盖在索引上，这样查询数据就不需要回表，会快很多。

Order by

索引排序

依据索引是按照顺序排序的特性，读取数据直接按照索引顺序读取，则数据是排序的

文件排序【filesort】

需要专门来按照排序条件，对数据进行排序

sort_buffer是排序时候用到的内存，是每个线程独有的，由参数sort_buffer_size控制大小；(还有一个Innodb_sort_buffer_size参数，这个参数是指在创建innodb索引期间用于对数据排序的排序缓存区大小)

根据索引a找到主键值，回到主索引树上面取出主键对应的一行记录中需要的字段，存入sort_buffer中；(如果sort_buffer 放不下则只存放，排序指定字段和主键，排序完成后，再去拿去其他所需要的字段)

从索引a中继续取下一个主键的值，重复步骤2和3直到找到所有满足的行；

在sort_buffer中对数据按照order by 指定的字段做快排；

取排序后的数据当作结果集返回；

如果需要排序的数据小于sort_buffer_size，那么排序可以在内存中完成，如果大于缓存区，则需要借助磁盘的临时文件辅助排序

可根据optimizer_trace 进行追踪

优化手段

尽量将多列索引做成联合索引
选择合适索引列顺序【联合索引，选择性最高的索引放在最左侧】
覆盖索引，尽量减少 select *
使用索引扫描来做排序
减少重复、冗余、未使用的索引    INFORMATION_SCHEMA.INDEX_STATISTICS 查询到每个索引的使用频率
减少索引和数据碎片 【使用OPTIMIZE TABLE 或者重新导入数据表来整理数据】
    数据碎片：大量数据删除，插入后会填补空缺位置，未填满的碎片称作数据碎片