主键为什么查询比索引快
举例子:
表t
字段 a b c d 四个字段
| |
pk key
select c,d from t where a= 100; 直接根据聚集索引主键a从B+树中找到对应的指针位置,如果b+树高度为3,那么要查一次来找到c和d的值。
select c,d from t where b= 100; 要从二级索引字段的值和主键值(b,a)中找到a的信息,再根据主键a找到对应的指针位置,如果b+树高度为3,那么要查两次来找到c和d的值(书签查找,也叫回表)
实际测试下索引的影响
索引解析部分,这个是一个订单表有一百五十万行数据
SHOW CREATE TABLE orders\G;
*************************** 1. row ***************************
Table: orders
Create Table: CREATE TABLE `orders` (
`o_orderkey` int(11) NOT NULL,
`o_custkey` int(11) DEFAULT NULL,
`o_orderstatus` char(1) DEFAULT NULL,
`o_totalprice` double DEFAULT NULL,
`o_orderDATE` date DEFAULT NULL,
`o_orderpriority` char(15) DEFAULT NULL,
`o_clerk` char(15) DEFAULT NULL,
`o_shippriority` int(11) DEFAULT NULL,
`o_comment` varchar(79) DEFAULT NULL,
PRIMARY KEY (`o_orderkey`), -- 主键
KEY `i_o_orderdate` (`o_orderDATE`), -- 索引
KEY `i_o_custkey` (`o_custkey`) -- 索引
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
执行explain命令解析语句,看到这条语句使用了 PRIMARY KEY 主键
EXPLAIN SELECT * FROM orders where o_orderkey=1;
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | orders | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
接下来使用二级索引查询,注意此时的rows是150万行。,且执行时间为0.022s
(root@localhost) [dbt3]> EXPLAIN SELECT * FROM orders where o_orderDATE='1996-01-02';
+----+-------------+--------+------------+------+---------------+---------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+---------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | orders | NULL | ref | i_o_orderdate | i_o_orderdate | 4 | const | 637 | 100.00 | NULL |
+----+-------------+--------+------------+------+---------------+---------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
删除i_o_orderdate 这个索引 alter table orders drop index i_o_orderdate;删除索引不需要工具,
再次解析并执行,发现未使用索引,且查询时间增加了一百倍!注意此时的rows是150万行。
(root@localhost) [dbt3]> EXPLAIN SELECT * FROM orders where o_orderDATE='1996-01-02';
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | SIMPLE | orders | NULL | ALL | NULL | NULL | NULL | NULL | 1491940 | 10.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
同时由于开启了慢查询日志和未使用索引记录(具体怎么开启看我mysql配置的文章),且未使用索引的查询也会记录,对于线上来说,绝大部分优化需要看慢查询日志,DBA必学!
cat /my_mysql/mysql8.0/slow.log
# Time: 2024-07-27T06:06:26.832961Z
# User@Host: root[root] @ [192.168.174.1] Id: 5
# Query_time: 0.977493 Lock_time: 0.000071 Rows_sent: 637 Rows_examined: 1500000
SET timestamp=1722060386;
SELECT * FROM orders where o_orderDATE='1996-01-02';
# Time: 2024-07-27T06:06:26.841818Z
# User@Host: root[root] @ [192.168.174.1] Id: 5
# Query_time: 0.000508 Lock_time: 0.000076 Rows_sent: 15 Rows_examined: 318
SET timestamp=1722060386;
SELECT QUERY_ID, SUM(DURATION) AS SUM_DURATION FROM INFORMATION_SCHEMA.PROFILING GROUP BY QUERY_ID;
# Time: 2024-07-27T06:06:26.842549Z
# User@Host: root[root] @ [192.168.174.1] Id: 5
# Query_time: 0.000453 Lock_time: 0.000081 Rows_sent: 22 Rows_examined: 333
SET timestamp=1722060386;
SELECT STATE AS `Status`, ROUND(SUM(DURATION),7) AS `Duration`, CONCAT(ROUND(SUM(DURATION)/0.000588*100,3), '') AS `Percentage` FROM INFORMATION_SCHEMA.PROFILING WHERE QUERY_ID=118 GROUP BY SEQ, STATE ORDER BY SEQ;
线上一般使用采样,因为问题语句大部分都是重复的,没用必要全部查看
所以可以使用 tail -n 10000 /my_mysql/mysql8.0/slow.log >/my_mysql/mysql8.0/slow_0727.log
然后使用命令 mysqldumpslow /my_mysql/mysql8.0/slow_0727.log 来格式化查看日志内容
mysqldumpslow /my_mysql/mysql8.0/slow_0727.log
Reading mysql slow query log from /my_mysql/mysql8.0/slow_0727.log
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), 0users@0hosts
Time: N-N-27T06:N:N.842549Z
# User@Host: root[root] @ [N.N.N.N] Id: N
# Query_time: N.N Lock_time: N.N Rows_sent: N Rows_examined: N
SET timestamp=N;
SELECT STATE AS `Status`, ROUND(SUM(DURATION),N) AS `Duration`, CONCAT(ROUND(SUM(DURATION)/N.N*N,N), 'S') AS `Percentage` FROM INFORMATION_SCHEMA.PROFILING WHERE QUERY_ID=N GROUP BY SEQ, STATE ORDER BY SEQ
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), 0users@0hosts
Time: N-N-27T06:N:N.841818Z
# User@Host: root[root] @ [N.N.N.N] Id: N
# Query_time: N.N Lock_time: N.N Rows_sent: N Rows_examined: N
SET timestamp=N;
SELECT QUERY_ID, SUM(DURATION) AS SUM_DURATION FROM INFORMATION_SCHEMA.PROFILING GROUP BY QUERY_ID
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), 0users@0hosts
Time: N-N-27T06:N:N.674731Z
# User@Host: root[root] @ [N.N.N.N] Id: N
# Query_time: N.N Lock_time: N.N Rows_sent: N Rows_examined: N
SET timestamp=N;
SELECT QUERY_ID, SUM(DURATION) AS SUM_DURATION FROM INFORMATION_SCHEMA.PROFILING GROUP BY QUERY_ID
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), 0users@0hosts
# Time: N-N-27T06:N:N.832961Z
# User@Host: root[root] @ [N.N.N.N] Id: N
# Query_time: N.N Lock_time: N.N Rows_sent: N Rows_examined: N
SET timestamp=N;
SELECT * FROM orders where o_orderDATE='S'
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0 (0), 0users@0hosts
Time: N-N-27T06:N:N.675413Z
# User@Host: root[root] @ [N.N.N.N] Id: N
# Query_time: N.N Lock_time: N.N Rows_sent: N Rows_examined: N
SET timestamp=N;
SELECT STATE AS `Status`, ROUND(SUM(DURATION),N) AS `Duration`, CONCAT(ROUND(SUM(DURATION)/N.N*N,N), 'S') AS `Percentage` FROM INFORMATION_SCHEMA.PROFILING WHERE QUERY_ID=N GROUP BY SEQ, STATE ORDER BY SEQ