需求来源:
假设有这样一些情况:
(1)由于max_connections 设置的过小,导致大量用户访问时出现 “Too many connections” 的错误,需要增大max_connections
(2)由于innodb_buffer_pool_size 设置的过小,导致部分update或delete操作失败,若where条件字段不适合添加索引,则解决的办法转变为了增大innodb_buffer_pool_size
上面举例的参数均为静态参数,不支持实时修改,如下
mysql> set global innodb_buffer_pool_size = 5G;
ERROR 1238 (HY000): Variable 'innodb_buffer_pool_size' is a read only variable
但实际情况下业务不能中断,又无法通过重启MySQL服务器的方法来更新参数。
实验目的:以innodb_buffer_pool_size为例,在不重启应用和数据库的情况下,实现线上数据库静态参数的修改
关于静态、动态参数总结如下
System Variable Summary:https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html
Dynamic Variable Summary:https://dev.mysql.com/doc/refman/5.5/en/dynamic-system-variables.html
实验背景:有主从复制关系的两台MySQL服务器,并已经利用MHA实现了高可用
主库:192.168.237.12
从库:192.168.237.13
VIP:192.168.237.19
主从库配置参数除 server-id 之外,完全一致。
为了快速模拟参数设置不当导致的部分高内存消耗的DML语句执行失败的情形,这里将 innodb_buffer_pool_size 设置为初始大小5M
mysql> show variables like "%innodb%buffer%";
+------------------------------+---------+
| Variable_name | Value |
+------------------------------+---------+
| innodb_buffer_pool_instances | 1 |
| innodb_buffer_pool_size | 5242880 |
| innodb_change_buffering | all |
| innodb_log_buffer_size | 8388608 |
+------------------------------+---------+
4 rows in set (0.01 sec)
利用sysbench创建测试表:sysbench1.sbtest1,一共500万行
[root@237_12 ~]# sysbench --test=/usr/local/share/sysbench/tests/include/oltp_legacy/select.lua --oltp-table-size=5000000 --mysql-tableengine=innodb --mysql-user=root --mysql-password=123456 --mysql-port=3306 --mysql-host=127.0.0.1 --mysql-db=sysbench1 --max-requests=0 --max-time=60 --oltptables-count=20 --report-interval=10 --num_threads=2 prepare
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
WARNING: --num-threads is deprecated, use --threads instead
WARNING: --max-time is deprecated, use --time instead
sysbench 1.0.8 (using bundled LuaJIT 2.1.0-beta2)
Creating table 'sbtest1'...
Inserting 5000000 records into 'sbtest1'
Creating secondary indexes on 'sbtest1'...
[root@237_12 ~]# sysbench --test=/usr/local/share/sysbench/tests/include/oltp_legacy/select.lua --oltp-table-size=5000000 --mysql-tableengine=innodb --mysql-user=root --mysql-password=123456 --mysql-port=3306 --mysql-host=127.0.0.1 --mysql-db=sysbench1 --max-requests=0 --max-time=60 --oltptables-count=20 --report-interval=10 --num_threads=2 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
WARNING: --num-threads is deprecated, use --threads instead
WARNING: --max-time is deprecated, use --time instead
sysbench 1.0.8 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 2
Report intermediate results every 10 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 10s ] thds: 2 tps: 5563.54 qps: 5563.54 (r/w/o: 5563.54/0.00/0.00) lat (ms,95%): 0.74 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 2 tps: 5212.39 qps: 5212.39 (r/w/o: 5212.39/0.00/0.00) lat (ms,95%): 0.87 err/s: 0.00 reconn/s: 0.00
[ 30s ] thds: 2 tps: 3552.90 qps: 3552.90 (r/w/o: 3552.90/0.00/0.00) lat (ms,95%): 1.52 err/s: 0.00 reconn/s: 0.00
[ 40s ] thds: 2 tps: 5267.47 qps: 5267.47 (r/w/o: 5267.47/0.00/0.00) lat (ms,95%): 0.70 err/s: 0.00 reconn/s: 0.00
[ 50s ] thds: 2 tps: 5832.78 qps: 5832.78 (r/w/o: 5832.78/0.00/0.00) lat (ms,95%): 0.61 err/s: 0.00 reconn/s: 0.00
SQL statistics:
queries performed:
read: 313186
write: 0
other: 0
total: 313186
transactions: 313186 (5219.32 per sec.)
queries: 313186 (5219.32 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 60.0009s
total number of events: 313186
Latency (ms):
min: 0.09
avg: 0.38
max: 109.09
95th percentile: 0.77
sum: 119189.04
Threads fairness:
events (avg/stddev): 156593.0000/9.00
execution time (avg/stddev): 59.5945/0.01
测试表结构如下
mysql> desc sbtest1;
+-------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| k | int(10) unsigned | NO | MUL | 0 | |
| c | char(120) | NO | | | |
| pad | char(60) | NO | | | |
+-------+------------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)
mysql> update sbtest1 set k='123' where c="%abc%";
ERROR 1206 (HY000): The total number of locks exceeds the lock table size
2)执行delete
mysql> delete from sbtest1 where c="%abc%";
ERROR 1206 (HY000): The total number of locks exceeds the lock table size
故障现象“ERROR 1206”成功复现。
解决方案:
1、从库修改 innodb_buffer_pool_size,重启使参数生效
innodb_buffer_pool_size = 5M -> 5G
2、监控系统(Zabbix等)拉维护;停止MHA manager,手工切换主库
3、原主库修改 innodb_buffer_pool_size,重启使参数生效
4、原主库以备库的身份与新主库重新建立复制关系
5、再次启动MHA监控,开启监控系统(Zabbix)
方法二:
使用gdb,仅实验用,生产环境中易导致进程直接挂掉
mysql> show variables like 'log_slave_updates';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| log_slave_updates | OFF |
+-------------------+-------+
1 row in set (0.00 sec)
mysql> set global log_slave_updates=1;
ERROR 1238 (HY000): Variable 'log_slave_updates' is a read only variable
mysql> system gdb -p $(pidof mysqld) -ex "set opt_log_slave_updates=1" -batch
mysql> show variables like 'log_slave_updates';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| log_slave_updates | ON |
+-------------------+-------+
1 row in set (0.00 sec)
mysql> show slave status \G
Replicate_Do_DB: test
mysql> system gdb -p $(pidof mysqld)
-ex 'call rpl_filter->add_do_db(strdup("hehehe"))' -batch
mysql> show slave status \G
Replicate_Do_DB: test,hehehe