背景:zabbix监控的数据库,原生history表数据增长迅猛,同时7*24监控,没有维护窗口。
但现实需要对表做碎片处理以及数据归档。采取方案:范围分区。
以下涉及到配置情况:
服务器 | mysql版本 | 存储设备 | 参数同异 |
master节点 | 5.7.11 | 机械硬盘 | 异 |
slave节点 | 5.7.22 | SSD | 同 |
新库 | 5.7.11 | SSD | 同 |
最终对history表进行做分区表(这步操作就不细说了)。分区表的表结构:
1
mysql> show create table history\G;
2
*************************** 1. row ***************************
3
Table: history
4
Create Table: CREATE TABLE `history` (
5
`itemid` bigint(20) unsigned NOT NULL,
6
`clock` int(11) NOT NULL DEFAULT '0',
7
`value` double(16,4) NOT NULL DEFAULT '0.0000',
8
`ns` int(11) NOT NULL DEFAULT '0',
9
KEY `history_1` (`itemid`,`clock`),
10
KEY `idx_history_clock` (`clock`)
11
) ENGINE=InnoDB DEFAULT CHARSET=utf8
12
/*!50500 PARTITION BY RANGE COLUMNS(clock)
13
(PARTITION p201806 VALUES LESS THAN (1530374400) ENGINE = InnoDB,
14
PARTITION p201807 VALUES LESS THAN (1533052800) ENGINE = InnoDB,
15
PARTITION p201808 VALUES LESS THAN (1535731200) ENGINE = InnoDB,
16
PARTITION p201809 VALUES LESS THAN (1538323200) ENGINE = InnoDB,
17
PARTITION p201810 VALUES LESS THAN (1541001600) ENGINE = InnoDB,
18
PARTITION p201811 VALUES LESS THAN (1543593600) ENGINE = InnoDB,
19
PARTITION p201812 VALUES LESS THAN (1546272000) ENGINE = InnoDB,
20
PARTITION p201901 VALUES LESS THAN (1548950400) ENGINE = InnoDB,
21
PARTITION p201902 VALUES LESS THAN (1551369600) ENGINE = InnoDB,
22
PARTITION p201903 VALUES LESS THAN (1554048000) ENGINE = InnoDB,
23
PARTITION p201904 VALUES LESS THAN (1556640000) ENGINE = InnoDB,
24
PARTITION p201905 VALUES LESS THAN (1559318400) ENGINE = InnoDB,
25
PARTITION p201906 VALUES LESS THAN (1561910400) ENGINE = InnoDB,
26
PARTITION p201907 VALUES LESS THAN (1564588800) ENGINE = InnoDB,
27
PARTITION p201908 VALUES LESS THAN (1567267200) ENGINE = InnoDB,
28
PARTITION p201909 VALUES LESS THAN (1569859200) ENGINE = InnoDB,
29
PARTITION p201910 VALUES LESS THAN (1572537600) ENGINE = InnoDB,
30
PARTITION p201911 VALUES LESS THAN (1575129600) ENGINE = InnoDB,
31
PARTITION p201912 VALUES LESS THAN (1577808000) ENGINE = InnoDB) */
32
1 row in set (0.01 sec)
33
34
ERROR:
35
No query specified
做了分区表后,发现zabbix的最新数据上查询非常慢,甚至无法使用。通过慢日志追踪到相应的SQL:
1
SELECT * FROM history h WHERE h.itemid='102985' AND h.clock>1532913996 ORDER BY h.clock DESC LIMIT 2 OFFSET 0;
分析期间,除了怀疑隐式转换问题外,其他都没发现问题。同时,尝试去除隐式转换的影响后,发现情况不变。分析到最后,就剩下两种情况排查:
1.数据库版本问题
2.数据库配置参数问题
刚好这个zabbix数据库是ms架构(master:mysql 5.7.11版本,slave:mysql 5.7.22版本)
master:
1
mysql> select version();
2
+------------+
3
| version() |
4
+------------+
5
| 5.7.11-log |
6
+------------+
7
1 row in set (0.01 sec)
master上的SQL执行情况以及执行计划:
1
mysql> SELECT * FROM history partition(p201807) h WHERE h.itemid=102985 AND h.clock>1532913996 and h.clock< 1533000615 ORDER BY h.clock DESC LIMIT 2 OFFSET 0;
2
Empty set (35.61 sec)
1
mysql> explain SELECT * FROM history partition(p201807) h WHERE h.itemid=102985 AND h.clock>1532913996 and h.clock< 1533000615 ORDER BY h.clock DESC LIMIT 2 OFFSET 0;
2
+----+-------------+-------+------------+-------+-----------------------------+-----------+---------+------+------+----------+-----------------------+
3
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
4
+----+-------------+-------+------------+-------+-----------------------------+-----------+---------+------+------+----------+-----------------------+
5
| 1 | SIMPLE | h | p201807 | range | history_1,idx_history_clock | history_1 | 12 | NULL | 1 | 100.00 | Using index condition |
6
+----+-------------+-------+------------+-------+-----------------------------+-----------+---------+------+------+----------+-----------------------+
7
1 row in set, 1 warning (0.00 sec)
slave:
1
mysql> select version();
2
+------------+
3
| version() |
4
+------------+
5
| 5.7.22-log |
6
+------------+
7
1 row in set (0.00 sec
slave上的SQL执行情况以及执行计划:
1
mysql> SELECT * FROM history h WHERE h.itemid=102985 ORDER BY h.clock DESC LIMIT 2 OFFSET 0;
2
Empty set (0.00 sec)
1
mysql> explain SELECT * FROM history h WHERE h.itemid=102985 ORDER BY h.clock DESC LIMIT 2 OFFSET 0;
2
+----+-------------+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+-----------+---------+-------+------+----------+-------------+
3
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
4
+----+-------------+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+-----------+---------+-------+------+----------+-------------+
5
| 1 | SIMPLE | h | p201806,p201807,p201808,p201809,p201810,p201811,p201812,p201901,p201902,p201903,p201904,p201905,p201906,p201907,p201908,p201909,p201910,p201911,p201912 | ref | history_1 | history_1 | 8 | const | 1 | 100.00 | Using where |
6
+----+-------------+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+-----------+---------+-------+------+----------+-------------+
7
1 row in set, 1 warning (0.01 sec)
通过master-slave之间查询,初步判定版本问题。但不排除配置参数影响。
为了更加准确的定位到问题,并排查参数影响。重新搭建一个5.7.11的库,并配置参数与slave库的参数一致(参数配置就不在这里展现)。
新库的情况如下:
1
mysql> select version();
2
+------------+
3
| version() |
4
+------------+
5
| 5.7.11-log |
6
+------------+
7
1 row in set (0.00 sec)
1
mysql> SELECT * FROM history h WHERE h.itemid='102985' AND h.clock>1532913996 ORDER BY h.clock DESC LIMIT 2 OFFSET 0;
2
Empty set (1 min 44.02 sec)
1
mysql> explain SELECT * FROM history h WHERE h.itemid='102985' AND h.clock>1532913996 ORDER BY h.clock DESC LIMIT 2 OFFSET 0;
2
+----+-------------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+-------+-----------------------------+-----------+---------+------+------+----------+-----------------------+
3
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
4
+----+-------------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+-------+-----------------------------+-----------+---------+------+------+----------+-----------------------+
5
| 1 | SIMPLE | h | p201807,p201808,p201809,p201810,p201811,p201812,p201901,p201902,p201903,p201904,p201905,p201906,p201907,p201908,p201909,p201910,p201911,p201912 | range | history_1,idx_history_clock | history_1 | 12 | NULL | 1 | 100.00 | Using index condition |
6
+----+-------------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------+-------+-----------------------------+-----------+---------+------+------+----------+-----------------------+
故,再此确认5.7.11的版本与5.7.22版本之间存在的性能差别。
后面,发现5.7.11版本上,去除desc降序时,效率就正常。 故5.7.11版本性能问题主要是:分区+空值+desc导致的。
确认把原master下架后,我们去做upgrade到5.7.22版本后,SQL效率恢复正常。(因没收集资料,故无法提供资料证明。)