Thu Feb 18 18:23:37 2016 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Thu Feb 18 18:23:37 2016 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --binlog_prefix=data01-bin
Thu Feb 18 18:23:37 2016 - [warning] HealthCheck: SSH to 10.1.10.244 is NOT reachable.
Thu Feb 18 18:23:43 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.1.10.244' (4))
Thu Feb 18 18:23:43 2016 - [warning] Connection failed 2 time(s)..
Thu Feb 18 18:23:46 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.1.10.244' (4))
Thu Feb 18 18:23:46 2016 - [warning] Connection failed 3 time(s)..
Thu Feb 18 18:23:49 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.1.10.244' (4))
Thu Feb 18 18:23:49 2016 - [warning] Connection failed 4 time(s)..
Thu Feb 18 18:23:49 2016 - [warning] Master is not reachable from health checker!
Thu Feb 18 18:23:49 2016 - [warning] Master 10.1.10.244(10.1.10.244:3306) is not reachable!
Thu Feb 18 18:23:49 2016 - [warning] SSH is NOT reachable.
Thu Feb 18 18:23:49 2016 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha-manager.cnf again, and trying to connect to all servers to check server status..
Thu Feb 18 18:23:49 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Feb 18 18:23:49 2016 - [info] Reading application default configuration from /etc/mha-manager.cnf..
Thu Feb 18 18:23:49 2016 - [info] Reading server configuration from /etc/mha-manager.cnf..
Thu Feb 18 18:23:50 2016 - [info] GTID failover mode = 0
Thu Feb 18 18:23:50 2016 - [info] Dead Servers:
Thu Feb 18 18:23:50 2016 - [info] 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:50 2016 - [info] Alive Servers:
Thu Feb 18 18:23:50 2016 - [info] 10.1.10.80(10.1.10.80:3306)
Thu Feb 18 18:23:50 2016 - [info] 10.1.10.91(10.1.10.91:3306)
Thu Feb 18 18:23:50 2016 - [info] Alive Slaves:
Thu Feb 18 18:23:50 2016 - [info] 10.1.10.80(10.1.10.80:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:50 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:50 2016 - [info] 10.1.10.91(10.1.10.91:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:50 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:50 2016 - [info] Checking slave configurations..
Thu Feb 18 18:23:50 2016 - [info] read_only=1 is not set on slave 10.1.10.80(10.1.10.80:3306).
Thu Feb 18 18:23:50 2016 - [warning] relay_log_purge=0 is not set on slave 10.1.10.80(10.1.10.80:3306).
Thu Feb 18 18:23:50 2016 - [info] read_only=1 is not set on slave 10.1.10.91(10.1.10.91:3306).
Thu Feb 18 18:23:50 2016 - [warning] relay_log_purge=0 is not set on slave 10.1.10.91(10.1.10.91:3306).
Thu Feb 18 18:23:50 2016 - [info] Checking replication filtering settings..
Thu Feb 18 18:23:50 2016 - [info] Replication filtering check ok.
Thu Feb 18 18:23:50 2016 - [info] Master is down!
Thu Feb 18 18:23:50 2016 - [info] Terminating monitoring script.
Thu Feb 18 18:23:50 2016 - [info] Got exit code 20 (Master dead).
Thu Feb 18 18:23:50 2016 - [info] MHA::MasterFailover version 0.56.
Thu Feb 18 18:23:50 2016 - [info] Starting master failover.
Thu Feb 18 18:23:50 2016 - [info]
Thu Feb 18 18:23:50 2016 - [info] * Phase 1: Configuration Check Phase..
Thu Feb 18 18:23:50 2016 - [info]
Thu Feb 18 18:23:51 2016 - [info] GTID failover mode = 0
Thu Feb 18 18:23:51 2016 - [info] Dead Servers:
Thu Feb 18 18:23:51 2016 - [info] 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:51 2016 - [info] Checking master reachability via MySQL(double check)...
Thu Feb 18 18:23:52 2016 - [info] ok.
Thu Feb 18 18:23:52 2016 - [info] Alive Servers:
Thu Feb 18 18:23:52 2016 - [info] 10.1.10.80(10.1.10.80:3306)
Thu Feb 18 18:23:52 2016 - [info] 10.1.10.91(10.1.10.91:3306)
Thu Feb 18 18:23:52 2016 - [info] Alive Slaves:
Thu Feb 18 18:23:52 2016 - [info] 10.1.10.80(10.1.10.80:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:52 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:52 2016 - [info] 10.1.10.91(10.1.10.91:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:52 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:52 2016 - [info] Starting Non-GTID based failover.
Thu Feb 18 18:23:52 2016 - [info]
Thu Feb 18 18:23:52 2016 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Feb 18 18:23:52 2016 - [info]
Thu Feb 18 18:23:52 2016 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Feb 18 18:23:52 2016 - [info]
Thu Feb 18 18:23:52 2016 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Feb 18 18:23:52 2016 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Thu Feb 18 18:23:52 2016 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Feb 18 18:23:53 2016 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 3: Master Recovery Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] The latest binary log file/position on all slaves is data01-bin.000016:120
Thu Feb 18 18:23:53 2016 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Feb 18 18:23:53 2016 - [info] 10.1.10.80(10.1.10.80:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:53 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:53 2016 - [info] 10.1.10.91(10.1.10.91:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:53 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:53 2016 - [info] The oldest binary log file/position on all slaves is data01-bin.000016:120
Thu Feb 18 18:23:53 2016 - [info] Oldest slaves:
Thu Feb 18 18:23:53 2016 - [info] 10.1.10.80(10.1.10.80:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:53 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:53 2016 - [info] 10.1.10.91(10.1.10.91:3306) Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
Thu Feb 18 18:23:53 2016 - [info] Replicating from 10.1.10.244(10.1.10.244:3306)
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [warning] Dead Master is not SSH reachable. Could not save it's binlogs. Transactions that were not sent to the latest slave (Read_Master_Log_Pos to the tail of the dead master's binlog) were lost.
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 3.3: Determining New Master Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Thu Feb 18 18:23:53 2016 - [info] All slaves received relay logs to the same position. No need to resync each other.
Thu Feb 18 18:23:53 2016 - [info] Searching new master from slaves..
Thu Feb 18 18:23:53 2016 - [info] Candidate masters from the configuration file:
Thu Feb 18 18:23:53 2016 - [info] Non-candidate masters:
Thu Feb 18 18:23:53 2016 - [info] New master is 10.1.10.80(10.1.10.80:3306)
Thu Feb 18 18:23:53 2016 - [info] Starting master failover..
Thu Feb 18 18:23:53 2016 - [info]
From:
10.1.10.244(10.1.10.244:3306) (current master)
+--10.1.10.80(10.1.10.80:3306)
+--10.1.10.91(10.1.10.91:3306)
To:
10.1.10.80(10.1.10.80:3306) (new master)
+--10.1.10.91(10.1.10.91:3306)
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 3.4: Master Log Apply Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Thu Feb 18 18:23:53 2016 - [info] Starting recovery on 10.1.10.80(10.1.10.80:3306)..
Thu Feb 18 18:23:53 2016 - [info] This server has all relay logs. Waiting all logs to be applied..
Thu Feb 18 18:23:53 2016 - [info] done.
Thu Feb 18 18:23:53 2016 - [info] All relay logs were successfully applied.
Thu Feb 18 18:23:53 2016 - [info] Getting new master's binlog name and position..
Thu Feb 18 18:23:53 2016 - [info] data02-bin.000003:684
Thu Feb 18 18:23:53 2016 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.1.10.80', MASTER_PORT=3306, MASTER_LOG_FILE='data02-bin.000003', MASTER_LOG_POS=684, MASTER_USER='repl_user', MASTER_PASSWORD='xxx';
Thu Feb 18 18:23:53 2016 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Thu Feb 18 18:23:53 2016 - [info] ** Finished master recovery successfully.
Thu Feb 18 18:23:53 2016 - [info] * Phase 3: Master Recovery Phase completed.
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 4: Slaves Recovery Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Thu Feb 18 18:23:53 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] -- Slave diff file generation on host 10.1.10.91(10.1.10.91:3306) started, pid: 18577. Check tmp log /var/log/mha//10.1.10.91_3306_20160218182350.log if it takes time..
Thu Feb 18 18:23:54 2016 - [info]
Thu Feb 18 18:23:54 2016 - [info] Log messages from 10.1.10.91 ...
Thu Feb 18 18:23:54 2016 - [info]
Thu Feb 18 18:23:53 2016 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Thu Feb 18 18:23:54 2016 - [info] End of log messages from 10.1.10.91.
Thu Feb 18 18:23:54 2016 - [info] -- 10.1.10.91(10.1.10.91:3306) has the latest relay log events.
Thu Feb 18 18:23:54 2016 - [info] Generating relay diff files from the latest slave succeeded.
Thu Feb 18 18:23:54 2016 - [info]
Thu Feb 18 18:23:54 2016 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Thu Feb 18 18:23:54 2016 - [info]
Thu Feb 18 18:23:54 2016 - [info] -- Slave recovery on host 10.1.10.91(10.1.10.91:3306) started, pid: 18579. Check tmp log /var/log/mha//10.1.10.91_3306_20160218182350.log if it takes time..
Thu Feb 18 18:23:55 2016 - [info]
Thu Feb 18 18:23:55 2016 - [info] Log messages from 10.1.10.91 ...
Thu Feb 18 18:23:55 2016 - [info]
Thu Feb 18 18:23:54 2016 - [info] Starting recovery on 10.1.10.91(10.1.10.91:3306)..
Thu Feb 18 18:23:54 2016 - [info] This server has all relay logs. Waiting all logs to be applied..
Thu Feb 18 18:23:54 2016 - [info] done.
Thu Feb 18 18:23:54 2016 - [info] All relay logs were successfully applied.
Thu Feb 18 18:23:54 2016 - [info] Resetting slave 10.1.10.91(10.1.10.91:3306) and starting replication from the new master 10.1.10.80(10.1.10.80:3306)..
Thu Feb 18 18:23:54 2016 - [info] Executed CHANGE MASTER.
Thu Feb 18 18:23:54 2016 - [info] Slave started.
Thu Feb 18 18:23:55 2016 - [info] End of log messages from 10.1.10.91.
Thu Feb 18 18:23:55 2016 - [info] -- Slave recovery on host 10.1.10.91(10.1.10.91:3306) succeeded.
Thu Feb 18 18:23:55 2016 - [info] All new slave servers recovered successfully.
Thu Feb 18 18:23:55 2016 - [info]
Thu Feb 18 18:23:55 2016 - [info] * Phase 5: New master cleanup phase..
Thu Feb 18 18:23:55 2016 - [info]
Thu Feb 18 18:23:55 2016 - [info] Resetting slave info on the new master..
Thu Feb 18 18:23:55 2016 - [info] 10.1.10.80: Resetting slave info succeeded.
Thu Feb 18 18:23:55 2016 - [info] Master failover to 10.1.10.80(10.1.10.80:3306) completed successfully.
Thu Feb 18 18:23:55 2016 - [info]
----- Failover Report -----
mha-manager: MySQL Master failover 10.1.10.244(10.1.10.244:3306) to 10.1.10.80(10.1.10.80:3306) succeeded
Master 10.1.10.244(10.1.10.244:3306) is down!
Check MHA Manager logs at manager:/var/log/mha/manager.log for details.
Started automated(non-interactive) failover.
The latest slave 10.1.10.80(10.1.10.80:3306) has all relay logs for recovery.
Selected 10.1.10.80(10.1.10.80:3306) as a new master.
10.1.10.80(10.1.10.80:3306): OK: Applying all logs succeeded.
10.1.10.91(10.1.10.91:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
10.1.10.91(10.1.10.91:3306): OK: Applying all logs succeeded. Slave started, replicating from 10.1.10.80(10.1.10.80:3306)
10.1.10.80(10.1.10.80:3306): Resetting slave info succeeded.
Master failover to 10.1.10.80(10.1.10.80:3306) completed successfully.