MySQL复制中slave延迟监控

最新推荐文章于 2023-07-24 15:23:38 发布

weixin_34081595

最新推荐文章于 2023-07-24 15:23:38 发布

阅读量84

点赞数

文章标签：数据库

通常是根据 Seconds_Behind_Master 的值来判断slave的延迟。这么做大部分情况下尚可接受，但并不够准确，而应该考虑更多因素。

查看slave的状态

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.20
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000150
          Read_Master_Log_Pos: 226326418
               Relay_Log_File: abce-relay-bin.000003
                Relay_Log_Pos: 77106856
        Relay_Master_Log_File: mysql-bin.000149
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1046292913
              Relay_Log_Space: 330964012
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 30510
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
1 row in set (0.00 sec)

mysql>

从查询结果可以看出，slave落后master的时间是30510秒(即Seconds_Behind_Master的值)

查看一下slave上的复制进程在做什么：

mysql> show full processlist\G
*************************** 1. row ***************************
     Id: 4
   User: system user
   Host: 
     db: NULL
Command: Connect
   Time: 6018
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 5
   User: system user
   Host: 
     db: abce
Command: Connect
   Time: 30501
  State: updating
   Info: delete from s_item_e
        where
          
           (
             ng_user_id = 23880
             and dt_shift_date in
              (  
                 '2017-10-30 00:00:00'
              , 
                 '2017-10-29 00:00:00'
              ) 
           )
          or  
           (
             ng_user_id = 24853
             and dt_shift_date in
              (  
                 '2017-10-30 00:00:00'
              , 
                 '2017-10-29 00:00:00'
              ) 
           )
		......

sql线程在做updating操作，具体内容是从表中删除数据。Time的值是30501。通常简单的sql不会执行30501秒。

在REPLICATION进程中，Time这列的值还可能有几种情况：
1、SQL线程当前执行的binlog（实际上是relay log）中的timestamp和IO线程最新的timestamp的差值，这就是通常大家认为的Seconds_Behind_Master值，并不是某个SQL的实际执行耗时；
2、SQL线程当前如果没有活跃SQL在执行的话，Time值就是SQL线程的idle time；

而IO线程的Time值则是该线程自从启动以来的总时长（多少秒），如果系统时间在IO线程启动后发生修改的话，可能会导致该Time值异常，比如变成负数，或者非常大。

可以根据以下顺序进行延迟的监控：

1.首先比较 Read_Master_Log_Pos 和 Master_Log_File 是否有差异
2.如果 Relay_Master_Log_File 和 Master_Log_File 是一样的话，再看 Exec_Master_Log_Pos 和 Read_Master_Log_Pos 的差异，对比SQL线程比IO线程慢了多少个binlog事件；
3.如果 Relay_Master_Log_File 和 Master_Log_File 不一样，那说明延迟可能较大，需要从MASTER上取得binlog status，判断当前的binlog和MASTER上的差距；

查看主库的binary日志：

mysql> SHOW BINARY LOGS;
+------------------+------------+
| Log_name         | File_size  |
+------------------+------------+
| mysql-bin.000142 | 1073778812 |
| mysql-bin.000143 | 1073798740 |
| mysql-bin.000144 | 1073910052 |
| mysql-bin.000145 | 1073781405 |
| mysql-bin.000146 | 1073845859 |
| mysql-bin.000147 | 1073746199 |
| mysql-bin.000148 | 1073863295 |
| mysql-bin.000149 | 1073822822 |
| mysql-bin.000150 |  226343255 |
+------------------+------------+
9 rows in set (0.00 sec)

mysql>

计算延迟应该是mysql-bin.000149的最大事件数减去已经被执行完的事件数，
即 1073822822 - 1046292913 = 27529909 个binlog event，再加上mysql-bin.000150这个binlog已经产生的226343255个binlog event，共226343255个binlog event。即：1073822822 - 1046292913 + 226343255

查看过程中也可以设置pager，只查看关注的几个status值：

mysql> pager cat | egrep -i 'system user|Exec_Master_Log_Pos|Seconds_Behind_Master|Read_Master_Log_Pos'
PAGER set to 'cat | egrep -i 'system user|Exec_Master_Log_Pos|Seconds_Behind_Master|Read_Master_Log_Pos''
mysql> 

这时有活跃SQL，Time值是30583，Seconds_Behind_Master是30565 
mysql>  show processlist; show slave status\G
|  4 | system user |           | NULL   | Connect |  7168 | Waiting for master to send event | NULL                                                                                                 |
|  5 | system user |           | abce   | Connect | 30583 | updating                         | delete from s_item_e
3 rows in set (0.00 sec)

          Read_Master_Log_Pos: 226364447
          Exec_Master_Log_Pos: 1057661705
        Seconds_Behind_Master: 30565
1 row in set (0.00 sec)

mysql>

weixin_34081595

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
MySQL复制中slave延迟监控

通常是根据 Seconds_Behind_Master 的值来判断slave的延迟。这么做大部分情况下尚可接受，但并不够准确，而应该考虑更多因素。查看slave的状态mysql&gt; show slave status\G*************************** 1. row *************************** Slav...
复制链接

扫一扫