【Mysql】Slave_IO_Running: No---Got fatal error 1236 from master

最新推荐文章于 2021-11-06 12:57:22 发布

cml2016

最新推荐文章于 2021-11-06 12:57:22 发布

阅读量291

点赞数

文章标签：数据库 shell

本地MySQL环境，是两台MySQL做M-M复制。今天发现错误信息：

mysql 5.5.28-log> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 88.88.88.88
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: testdbbinlog.000005
          Read_Master_Log_Pos: 98359687
               Relay_Log_File: mysql-relay-bin.000020
                Relay_Log_Pos: 4
        Relay_Master_Log_File: testdbbinlog.000005
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 98359687
              Relay_Log_Space: 107
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
               Last_SQL_Errno: 0
               Last_SQL_Error:
Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)

mysql 5.5.28-log>

参考网上文档： http://blog.csdn.net/billfanggs/article/details/8905991

在source那边，执行：

flush logs;
show master status;

记下File, Position。 ---记录一个新的binlog

在target端，执行：

CHANGE MASTER TO MASTER_LOG_FILE='testdbbinlog.000008',MASTER_LOG_POS=107; ---从新的binlog开始复制，跳过旧的binlog。
slave start;
show slave status \G

一切正常。但是会有数据丢失，可以检查后重新搭建从库

如果使用了GTID新的特新
（原英文地址https://www.percona.com/blog/2013/02/08/how-to-createrestore-a-slave-using-gtid-replication-in-mysql-5-6/）
How to restore a slave in a bad and fast way（快速但不好的方法）

Let’s imagine that our slave has been down for several days and the binary logs from the master have been purged. This is the error we are going to

点击(此处)折叠或打开

Slave_IO_Running: No
Slave_SQL_Running: Yes
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'

 
         So, let’s try to solve it. First we have the bad and fast way, that is, point to another GTID that the master has in the binary logs. First, we get the GTID_EXECUTED from the master:

master > show global variables like 'GTID%';
+---------------+-------------------------------------------+
| Variable_name | Value |
+---------------+-------------------------------------------+
| gtid_executed | 9a511b7b-7059-11e2-9a24-08002762b8af:1-14 |
+---------------+-------------------------------------------+

|gtid_purged | 9a511b7b-7059-11e2-9a24-08002762b8af:2 |

And we set it on the slave:

点击(此处)折叠或打开

slave> set global GTID_EXECUTED="9a511b7b-7059-11e2-9a24-08002762b8af:2"
ERROR 1238 (HY000): Variable 'gtid_executed' is a read only variable

Error! Remember, we get the GTID_EXECUTED from the master and set is as GTID_PURGED on the slave

点击(此处)折叠或打开

slave1 > set global GTID_PURGED="9a511b7b-7059-11e2-9a24-08002762b8af:2";
ERROR 1840 (HY000): GTID_PURGED can only be set when GTID_EXECUTED is empty.

Error again, GTID_EXECUTED should be empty before changing GTID_PURGED manually but we can’t change it with SET because is a read only variable. The only way to change it is with reset master (yes, on a slave server):

slave1> reset master;
slave1 > show global variables like 'GTID_EXECUTED';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_executed | |
+---------------+-------+
slave1 > set global GTID_PURGED="9a511b7b-7059-11e2-9a24-08002762b8af:2"; 可通过这篇文章来熟悉一下 http://www.cnblogs.com/cenalulu/p/4309009.html
思路：将丢失的binlog放进purge，表示已删除，跳过报错的，英文原版的有点特殊：刚做个slave就报错，slave的purge已经到13了，所以使用的是execute14
生产上一般是运行过程中报错，因为主库的binarylog已经被purge 但是还没应用到slave，可以我们可以查看一下 master与slave的purge，很明显slave的purge绝壁落后着
slave1> start slave io_thread;
slave1> show slave status\G
[...]
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[...]

How to restore a slave in a good and slow way（慢但是比较好的方法）

The good way is mysqldump again. We take a dump from the master like we saw before and try to restore it on the slave

slave1 [localhost] {msandbox} ((none)) > source test.sql; ---导入搭建时的全备
[...]
ERROR 1840 (HY000): GTID_PURGED can only be set when GTID_EXECUTED is empty.
[...]

 
Shell

Wop! It is important to mention that these kind of error messages can dissapear on the shell buffer because the restore of the dump will continue. Be cautious.

Same problem again so same solution too:

点击(此处)折叠或打开

slave1> reset master;
slave1> source test.sql; ----重新到slave库导搭建是的那个全备一遍。。相当于搭建master-slave中的那一步
slave1> start slave;
slave1> show slave status\G
[...]
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[...]

模拟实验

slave ：stop slave，此时Master_Log_File: mysqlbin.000011 Relay_Master_Log_File: mysqlbin.000011 都停留在binlog 11这个日志上
master：
(root@127.0.0.1) [test]> insert into t1 select * from t1; --------11 binlog里面记录了insert8条的记录
Query OK, 8 rows affected (0.00 sec)
Records: 8 Duplicates: 0 Warnings: 0

(root@127.0.0.1) [test]> flush logs;
Query OK, 0 rows affected (0.02 sec)

(root@127.0.0.1) [test]> insert into t1 select * from t1;
Query OK, 16 rows affected (0.00 sec)
Records: 16 Duplicates: 0 Warnings: 0

(root@127.0.0.1) [test]> show master status; -------12 binlog 里面记录了insert 16条的记录
+-----------------+----------+--------------+------------------+------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-----------------+----------+--------------+------------------+------------------------------------------+
| mysqlbin.000012 | 567 | | | 8a0d06d7-91cb-11e5-843b-00163ec09859:1-7 |
+-----------------+----------+--------------+------------------+------------------------------------------+
1 row in set (0.00 sec)

(root@127.0.0.1) [test]> flush logs;
Query OK, 0 rows affected (0.01 sec)

(root@127.0.0.1) [test]> insert into t1 select * from t1; ------13 binlog里面记录了insert 32条的记录
Query OK, 32 rows affected (0.01 sec)
Records: 32 Duplicates: 0 Warnings: 0

(root@127.0.0.1) [test]> flush logs;
Query OK, 0 rows affected (0.02 sec)

(root@127.0.0.1) [test]> show master status;
+-----------------+----------+--------------+------------------+------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-----------------+----------+--------------+------------------+------------------------------------------+
| mysqlbin.000014 | 191 | | | 8a0d06d7-91cb-11e5-843b-00163ec09859:1-8 |
+-----------------+----------+--------------+------------------+------------------------------------------+
1 row in set (0.00 sec)

(root@127.0.0.1) [test]> purge master logs to 'mysqlbin.000012'; ---记录8条的binlog没了
Query OK, 0 rows affected (0.01 sec)

slave：
(root@127.0.0.1) [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)

(root@127.0.0.1) [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 218.78.186.162
Master_User: repluser
Master_Port: 3308
Connect_Retry: 60
Master_Log_File: mysqlbin.000011
Read_Master_Log_Pos: 191
Relay_Log_File: mysql-relay-bin.000014
Relay_Log_Pos: 399
Relay_Master_Log_File: mysqlbin.000011
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 191
Relay_Log_Space: 689
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 13308
Master_UUID: 8a0d06d7-91cb-11e5-843b-00163ec09859
Master_Info_File: /data/mydata/3309/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 160519 15:06:15
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-5
Executed_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-5
Auto_Position: 1
1 row in set (0.00 sec)

ERROR:
No query specified

解决办法：
master：
(root@127.0.0.1) [test]> show variables like 'gtid%';
+---------------+------------------------------------------+
| Variable_name | Value |
+---------------+------------------------------------------+
| gtid_executed | |
| gtid_mode | ON |
| gtid_next | AUTOMATIC |
| gtid_owned | |
| gtid_purged | 8a0d06d7-91cb-11e5-843b-00163ec09859:1-6 |
+---------------+------------------------------------------+

slave：
(root@127.0.0.1) [(none)]> reset master;
Query OK, 0 rows affected (0.02 sec)
(root@127.0.0.1) [(none)]> set global gtid_purged='8a0d06d7-91cb-11e5-843b-00163ec09859:1-6';
Query OK, 0 rows affected (0.02 sec)

(root@127.0.0.1) [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
(root@127.0.0.1) [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 218.78.186.162
Master_User: repluser
Master_Port: 3308
Connect_Retry: 60
Master_Log_File: mysqlbin.000014
Read_Master_Log_Pos: 191
Relay_Log_File: mysql-relay-bin.000019
Relay_Log_Pos: 415
Relay_Master_Log_File: mysqlbin.000014
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 191
Relay_Log_Space: 721
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 13308
Master_UUID: 8a0d06d7-91cb-11e5-843b-00163ec09859
Master_Info_File: /data/mydata/3309/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-5:7-8 ---出现断层了，8a0d06d7-91cb-11e5-843b-00163ec09859:6 被我们purge了
Executed_Gtid_Set: 8a0d06d7-91cb-11e5-843b-00163ec09859:1-8
Auto_Position: 1
1 row in set (0.00 sec)

ERROR:
No query specified

(root@127.0.0.1) [test]> select count(*) from t1;
+----------+
| count(*) |
+----------+
| 56 | -----只有56条，刚好记录的那8条被purge了，数据丢失了！
+----------+
1 row in set (0.00 sec)

当线上出现1236的错误后可以这样解决，但是解决完后要检查一下主从数据的一致性，不然会留下隐患！可能后面会导致1023等一些列错误

各种导致1236的原因！：http://blog.itpub.net/22664653/viewspace-1714269/

 
   
          1 
        

          2 
        

          3 
        

          4 
        

          5 
        

          6 
        

          7 
        

          8 
        

          9 
        

          10 
        

          11 
        

          12 
        

          13 
        

          14 
        
 
         slave1 
         > 
           
         reset  
         master 
         ; 
        
 
         slave1 
           
         > 
           
         show  
         global  
         variables  
         like 
           
         'GTID_EXECUTED' 
         ; 
        
 
         + 
         -- 
         -- 
         -- 
         -- 
         -- 
         -- 
         -- 
         - 
         + 
         -- 
         -- 
         -- 
         - 
         + 
        
 
         | 
           
         Variable_name 
           
         | 
           
         Value 
           
         | 
        
 
         + 
         -- 
         -- 
         -- 
         -- 
         -- 
         -- 
         -- 
         - 
         + 
         -- 
         -- 
         -- 
         - 
         + 
        
 
         | 
           
         gtid_executed 
           
         | 
                 
         | 
        
 
         + 
         -- 
         -- 
         -- 
         -- 
         -- 
         -- 
         -- 
         - 
         + 
         -- 
         -- 
         -- 
         - 
         + 
        
 
         slave1 
           
         > 
           
         set  
         global  
         GTID_PURGED 
         = 
         "9a511b7b-7059-11e2-9a24-08002762b8af:1-14" 
         ; 
        
 
         slave1 
         > 
           
         start  
         slave  
         io_threa 
        
 
  

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/29096438/viewspace-1808376/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/29096438/viewspace-1808376/

cml2016

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
【Mysql】Slave_IO_Running: No---Got fatal error 1236 from master

本地MySQL环境，是两台MySQL做M-M复制。今天发现错误信息： mysql 5.5.28-log> show slave status\G*************************** ...
复制链接

扫一扫