I needed to refresh a slave (its data can be discarded), so I wrote a script with ansible. The restore-related part of the code is as follows:
- name: run innobackupex
  shell: cd {{ srcpath|default('/m') }}{{ shard.zfill(2) }}/mysql; innobackupex --defaults-group={{ group }} --socket=/var/lib/mysql/mysql.sock{{ shard.zfill(2) }} --ibbackup=xtrabackup_55 --safe-slave-backup --slave-info --stream=tar ./ --defaults-file={{ srcmycnf|default('/etc/my.cnf')}} --password={{ srcpassword|default('centos') }}
- name: change master to ...
  shell: master_log=`cat {{ dstpath }}{{ shard.zfill(2) }}/mysql/xtrabackup_binlog_pos_innodb | awk '{gsub(".*/", "", $1); print $1}'`;log_pos=`cat {{ dstpath }}{{ shard.zfill(2) }}/mysql/xtrabackup_binlog_pos_innodb |awk '{print $2}'`; echo "CHANGE MASTER TO MASTER_LOG_FILE='$master_log', MASTER_LOG_POS=$log_pos,MASTER_HOST='{{ repl_master_host }}',MASTER_PORT={{ port }},MASTER_USER='repl',MASTER_PASSWORD='1quf82swe3';">change-master-{{ shard }}.sql;cat change-master-{{ shard }}.sql; mysql -S /var/lib/mysql/mysql.sock{{ shard.zfill(2) }} < change-master-{{ shard }}.sql
After running start slave, the slave reported Last_IO_Error: Got fatal error 1236 from master when reading data from binary log ......
Logging in to the slave and taking a look:
# more xtrabackup_binlog_info
mysql-bin.000402 107
# more xtrabackup_binlog_pos_innodb
/m10/scrubbing/mysql/var/mysql-bin.000400 330333429
The master's binlog file and position, as seen before running ansible:
mysql -S mysql.sock10 -e "show master status\G;"
> *************************** 1. row ***************************
> File: mysql-bin.000402
> Position: 107
> Binlog_Do_DB:
> Binlog_Ignore_DB:
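This matches xtrabackup_binlog_info exactly. During this time all applications on the master were stopped, so there were no reads or writes. The position in xtrabackup_binlog_pos_innodb, which the ansible task reads, is the one that does not match.
After some searching, this turned out to be a version-related problem with xtrabackup (innobackupex). Related discussions: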
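The disagreement between the two files can be detected mechanically before any CHANGE MASTER TO is issued. The following is only a minimal sketch, assuming each file holds the binlog file name and the position as the first two whitespace-separated fields of its first line (as in the output above). The function names `binlog_coords` and `coords_match` are hypothetical helpers and not part of xtrabackup.

```shell
# Print "<binlog file basename> <position>" from the first line of file $1.
binlog_coords() {
    awk 'NR == 1 { f = $1; sub(".*/", "", f); print f, $2 }' "$1"
}

# Succeed only when both files record the same binlog file and position.
coords_match() {
    [ "$(binlog_coords "$1")" = "$(binlog_coords "$2")" ]
}
```

For example, `coords_match xtrabackup_binlog_info xtrabackup_binlog_pos_innodb || echo "coordinates differ"` would flag the situation described here.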
https://bugs.launchpad.net/percona-xtrabackup/+bug/1005855
http://www.cnblogs.com/liushuiwuqing/p/4014339.html
Fix: change the ansible code, specifically this part:
- name: change master to ...
  shell: master_log=`cat {{ dstpath }}{{ shard.zfill(2) }}/mysql/xtrabackup_binlog_pos_innodb # change to xtrabackup_binlog_info
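As a rough illustration of the corrected step outside of ansible, the statement can be built from xtrabackup_binlog_info in a single awk pass. This is a sketch under assumptions, not the author's final task: `build_change_master` is a hypothetical helper, the host and port arguments stand in for `repl_master_host` and `port`, and the replication credentials are left out and would need to be appended as in the original task.

```shell
# Build a CHANGE MASTER TO statement from xtrabackup_binlog_info.
# Assumes the first line is "<binlog file> <position>", whitespace separated.
build_change_master() {
    info_file=$1   # path to xtrabackup_binlog_info
    host=$2        # master host (illustrative parameter)
    port=$3        # master port (illustrative parameter)
    awk -v host="$host" -v port="$port" 'NR == 1 {
        file = $1
        sub(".*/", "", file)   # drop any directory prefix, like the gsub in the original task
        printf "CHANGE MASTER TO MASTER_LOG_FILE='\''%s'\'', MASTER_LOG_POS=%s, MASTER_HOST='\''%s'\'', MASTER_PORT=%s;\n", file, $2, host, port
    }' "$info_file"
}
```

The output can be redirected to a file and fed to `mysql -S <socket>` in the same way the original task does with change-master-{{ shard }}.sql.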