mysql 常见线上故障_mysql小白系列_14 线上故障分析与排错

1.重现故障5---线上执行update报错,并处理。(表结构和UPDATE语句自己构造,请给出详细步骤)

1)update故障出现ERROR 1206 (HY000): The total number of locks exceeds the lock table size

root@slave01 15:34: [test]> select count(*) from AZ2C_D01600_1;

+----------+

| count(*) |

+----------+

| 10394605 |

+----------+

1 row in set (10.86 sec)

root@slave01 16:03: [test]> desc AZ2C_D01600_1;

+-------+--------------+------+-----+---------+-------+

| Field | Type | Null | Key | Default | Extra |

+-------+--------------+------+-----+---------+-------+

| f1 | varchar(255) | YES | | NULL | |

| f2 | varchar(255) | YES | | NULL | |

| f3 | varchar(255) | YES | | NULL | |

| f4 | varchar(255) | YES | | NULL | |

| f5 | varchar(255) | YES | | NULL | |

| f6 | varchar(255) | YES | | NULL | |

| f7 | varchar(255) | YES | | NULL | |

+-------+--------------+------+-----+---------+-------+

7 rows in set (0.00 sec)

root@slave01 16:03: [test]> show create table AZ2C_D01600_1;

+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| Table | Create Table |

+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| AZ2C_D01600_1 | CREATE TABLE `AZ2C_D01600_1` (

`f1` varchar(255) DEFAULT NULL,

`f2` varchar(255) DEFAULT NULL,

`f3` varchar(255) DEFAULT NULL,

`f4` varchar(255) DEFAULT NULL,

`f5` varchar(255) DEFAULT NULL,

`f6` varchar(255) DEFAULT NULL,

`f7` varchar(255) DEFAULT NULL

) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 row in set (0.00 sec)

root@slave01 16:03: [test]> update AZ2C_D01600_1 set f3='asdfasdf ' where f1 like '%sdfsfd%';

ERROR 1206 (HY000): The total number of locks exceeds the lock table size

root@slave01 16:04: [test]>

2)检查参数innodb_buffer_pool_size

root@slave01 16:06: [test]> show variables like 'innodb_buffer_pool_size';

+-------------------------+---------+

| Variable_name | Value |

+-------------------------+---------+

| innodb_buffer_pool_size | 5242880 |

+-------------------------+---------+

1 row in set (0.00 sec)

[root@slave01 ~]# cat /etc/my.cnf |grep innodb_buffer_pool_size

innodb_buffer_pool_size = G

[root@slave01 ~]# ssh master cat /etc/my.cnf |grep innodb_buffer_pool_size

innodb_buffer_pool_size = G

[root@slave01 ~]# ssh slave02 cat /etc/my.cnf |grep innodb_buffer_pool_size

innodb_buffer_pool_size = G

当前主库是slave01,一主二从的innodb_buffe_pool_size均设置错误

3)首先修正从库错误参数,并重启生效

root@slave02 16:11: [(none)]> select @@innodb_buffer_pool_size;

+---------------------------+

| @@innodb_buffer_pool_size |

+---------------------------+

| 2147483648 |

+---------------------------+

1 row in set (0.00 sec)

root@master 16:12: [(none)]> select @@innodb_buffer_pool_size;

+---------------------------+

| @@innodb_buffer_pool_size |

+---------------------------+

| 2147483648 |

+---------------------------+

1 row in set (0.00 sec)

4)准备切换到从库

1.停止mha服务/usr/local/bin/masterha_stop --conf=/mysqlData/mha/etc/app.cnf务

2.如果两次mhafaliover时间不超过8小时,需要删除文件/mysqlData/mha/etc/app/app.failover.complete

3.在mha配置文件/mysqlData/mha/etc/app.conf确定新的mastercandidate_master=1

5)停止zabbix或者其他监控告警,停止mha服务,切换到从库

1.关闭原主库的事件调度event_scheduler=OFF

2.切换新主库/usr/local/bin/masterha_master_switch --master_state=alive --conf=/mysqlData/mha/etc/app.cnf

Wed Mar 14 22:58:51 2018 - [info] * Phase 5: New master cleanup phase..

Wed Mar 14 22:58:51 2018 - [info]

Wed Mar 14 22:58:51 2018 - [info] slave01: Resetting slave info succeeded.

Wed Mar 14 22:58:51 2018 - [info] Switching master to slave01(192.168.3.21:3306) completed successfully.

3.更改原来主库的参数innodb_buffer_pool_size为正确值,并重启

4.启动新slave复制

CHANGE MASTER TO \

MASTER_HOST='192.168.3.21', \

MASTER_PORT=3306, \

MASTER_AUTO_POSITION=1, \

MASTER_USER='repl', \

MASTER_PASSWORD='Rep12#3@';

#因为手工在从库做了修改,导致主从同步失败

#启用了GTID,使用mysqldump备份恢复进行重建从库

SET @@GLOBAL.GTID_PURGED='73c029dc-2034-11e8-90a5-005056a365b6:1-594908,

856d79f8-2038-11e8-b511-005056a330bb:1-3,

b658767f-2044-11e8-951f-005056a330bb:1-188730';

--

-- Position to start replication or point-in-time recovery from

--

-- CHANGE MASTER TO MASTER_LOG_FILE='binlog.000135', MASTER_LOG_POS=1060653593;

--

-- Current Database: `db1`

--

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `db1` /*!40100 DEFAULT CHARACTER SET utf8 */;

#直接在mysqldmp恢复后的从库上执行change master后start slave,无法同步

#重启myslq后报错Slave failed to initialize relay log info structure from the repository,reset slave就OK了

root@master 17:52: [(none)]> reset slave;

Query OK, 0 rows affected (0.01 sec)

root@master 17:53: [(none)]> CHANGE MASTER TO

-> MASTER_HOST='172.16.3.153',

-> MASTER_PORT=3306,

-> MASTER_USER='repl',

-> MASTER_PASSWORD='Rep12#3@',

-> master_auto_position=1;

Query OK, 0 rows affected, 2 warnings (0.04 sec)

2.请给出MySQL数据丢失的最佳解决方案?(自由发挥)

数据写入内存,但未写log未落盘

1)数据丢失后的解决方案?

业务重做

找客服,客服安抚客户

2)防止数据丢失

主从复制采用半同步方式

双1

超融合存储

换oracle

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值