1、analyze table优化表的统计信息
mysql> show index from t;
+-------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| t | 0 | PRIMARY | 1 | id | A | 1045625 | NULL | NULL | | BTREE | | | YES | NULL |
| t | 1 | idx_dept | 1 | dept | A | 9 | NULL | NULL | | BTREE | | | YES | NULL |
| t | 1 | idx_create_time | 1 | create_time | A | 1031590 | NULL | NULL | | BTREE | | | YES | NULL |
| t | 1 | idx_last_login_time | 1 | last_login_time | A | 1041290 | NULL | NULL | YES | BTREE | | | YES | NULL |
+-------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.06 sec)
mysql> select count(1) from t_log_iteminfo;
+----------+
| count(1) |
+----------+
| 2262731 |
+----------+
1 row in set (0.80 sec)
mysql>
mysql> show index from t_log_iteminfo;
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_log_iteminfo | 1 | AK_Key_1 | 1 | id | A | 5095731 | NULL | NULL | | BTREE | | |
| t_log_iteminfo | 1 | Index_1 | 1 | CurrentDate | A | 1201908 | NULL | NULL | | BTREE | | |
| t_log_iteminfo | 1 | Index_1 | 2 | UserId | A | 1775250 | NULL | NULL | | BTREE | | |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
很明显,t_log_iteminfo全表才226万数据,但主键的索引基数居然有509万,说明统计信息不准确了,有必要对表进行analyze来更新表的统计信息,以使执行计划能更准确。
analyze的语法如下:
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE tbl_name [, tbl_name] ...
默认执行analyze是会记录binlog,如果不想记录binlog,可加上NO_WRITE_TO_BINLOG参数或LOCAL参数,LOCAL参数是NO_WRITE_TO_BINLOG参数的别名,就是更好记点,两个参数效果是一样的。
值得注意的是,使用ANALYZE TABLE分析表的过程中,数据库系统会对表加一个只读锁。在分析期间,只能读取表中的记录,不能更新和插入记录。ANALYZE TABLE语句能够分析InnoDB和MyISAM类型的表。
mysql> analyze no_write_to_binlog table t_log_iteminfo;
+---------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+---------------------+---------+----------+----------+
| game.t_log_iteminfo | analyze | status | OK |
+---------------------+---------+----------+----------+
1 row in set (0.05 sec)
mysql>
mysql> show index from t_log_iteminfo;
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_log_iteminfo | 1 | AK_Key_1 | 1 | id | A | 2252847 | NULL | NULL | | BTREE | | |
| t_log_iteminfo | 1 | Index_1 | 1 | CurrentDate | A | 415329 | NULL | NULL | | BTREE | | |
| t_log_iteminfo | 1 | Index_1 | 2 | UserId | A | 382967 | NULL | NULL | | BTREE | | |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)
执行analyze后,主键的索引基数由原来的509万降到225万,和表记录数226万很接近了,说明统计信息更新正确。
2、optimize table对表进行碎片整理
OPTIMIZE TABLE重新组织表数据和关联索引数据的物理存储,以减少存储空间并提高访问表时的I / O效率。通俗点理解就是碎片整理。它会重建表以更新索引统计信息,因此执行optimize就没必要再执行analyze table了。
我们继续分析t_log_iteminfo表的情况:
mysql> show table status like 't_log_iteminfo'\G
*************************** 1. row ***************************
Name: t_log_iteminfo
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 2252847
Avg_row_length: 78
Data_length: 176373760
Max_data_length: 0
Index_length: 158449664
Data_free: 495976448
Auto_increment: 71789430
Create_time: 2018-12-30 03:46:06
Update_time: 2020-04-01 02:08:35
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment: 操作日志
1 row in set (0.00 sec)
通过查看表状态可见,Data_free是495976448(也就是495976448/1024/1024=473MB),它比Data_length + Index_length还有大。
我们先解释下Data_length 和Index_length。从字面上直译就是“数据长度”和“索引长度”,直译的意思也很明显了,我们知道,一个表占用的磁盘空间,主要是由数据和索引构成的,在MyISAM里,数据和索引是分开两个文件存储的,在InnoDB里则是存放在相同的ibd文件里,在MySQL 8.0里,InnoDB的frm文件也取消了,把表结构也一起放在ibd文件里实现了原子性ddl。看官注意此处是“原子性ddl”,而不是“事务性ddl”,MySQL 8.0还是不支持打开一个事务,然后drop一张表,然后还想rollback这骚操作。
上面一段句总结就是 “表大小(单位MB)= (Data_length + Index_length)/1024/1024”(不要问我为什么要除两次1024,拒绝回答这样的问题)。我再看t_log_iteminfo表,通过上面公式计算的结果约等于319MB,但Data_free却有473MB,说明碎片很多了。简单点理解,Data_free就是碎片的多少(虽然严格来说不能完全这样理解)。
我们确定了表有很多碎片,则可以使用optimize来进行碎片整理,语法如下:
OPTIMIZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE tbl_name [, tbl_name] ...
默认执行optimize是会记录binlog,如果不想记录binlog,可加上NO_WRITE_TO_BINLOG参数或LOCAL参数,LOCAL参数是NO_WRITE_TO_BINLOG参数的别名,就是更好记点,两个参数效果是一样的。
使用optimize table碎片整理过程中,数据库系统会对表进行锁这下,因此一定要注意要在数据库不繁忙时执行。
mysql> optimize local table t_log_iteminfo;
+----------------------------------+----------+----------+-------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+----------------------------------+----------+----------+-------------------------------------------------------------------+
| db_log_fxhf_00813.t_log_iteminfo | optimize | note | Table does not support optimize, doing recreate + analyze instead |
| db_log_fxhf_00813.t_log_iteminfo | optimize | status | OK |
+----------------------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (12.10 sec)
mysql> show table status like 't_log_iteminfo'\G
*************************** 1. row ***************************
Name: t_log_iteminfo
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 2253722
Avg_row_length: 89
Data_length: 200998912
Max_data_length: 0
Index_length: 106086400
Data_free: 4194304
Auto_increment: 71789835
Create_time: 2020-04-01 02:24:52
Update_time: 2020-04-01 02:25:44
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment: 操作日志
1 row in set (0.00 sec)
可以看到Data_free由原来的473MB缩小到现在的4194304字节(即4MB),回收了不少空间。
上面执行optimize时可以看到“Table does not support optimize, doing recreate + analyze instead”,说明InnoDB是通过重建表加analyze方式来代替optimize table操作的。
没必要每天执行optimize table,可以根据实际碎片产生的情况来制定相应的维护计划,例如每月一次,或者每季度一次,或者每半年一次等。