在前面的两篇文章中,介绍了mysql组复制的特点及配置过程,本文演示mysql单组复制下的模拟故障测试。
一、组复制所有成员服务器宕机重启后的恢复
连接所有的mysql实例查询当前的组复制成员情况,状态都是OFFLINE,这种情况下如何恢复组复制?mysql> select * from performance_schema.replication_group_members;
+---------------------------+-----------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+-----------+-------------+-------------+--------------+
| group_replication_applier | | | NULL | OFFLINE |
+---------------------------+-----------+-------------+-------------+--------------+
1 row in set (0.00 sec)
在第一台mysql实例上将group_replication_bootstrap_group设置为ON,然后开启组复制,完成后再设置group_replication_bootstrap_group参数为OFFmysql> set global group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.01 sec)
mysql> start group_replication;
Query OK, 0 rows affected (2.19 sec)
mysql> set global group_replication_bootstrap_group=OFF;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 8b643791-6d30-11e7-986a-000c29d53b31 | vm1 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)
第二台mysql实例上开启组复制mysql> select * from performance_schema.replication_group_members;
+---------------------------+-----------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+-----------+-------------+-------------+--------------+
| group_replication_applier | | | NULL | OFFLINE |
+---------------------------+-----------+-------------+-------------+--------------+
1 row in set (0.00 sec)
mysql> change master to master_user='repl',master_password='123456' for channel 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.03 sec)
mysql> start group_replication;
Query OK, 0 rows affected (6.76 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 35c331eb-6d3d-11e7-91e3-000c29e07281 | vm2 | 3306 | ONLINE |
| group_replication_applier | 941bce76-6d40-11e7-b2fe-000c2909332c | vm3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0.00 sec)
在第三台mysql实例开启组复制mysql> change master to master_user='repl',master_password='123456' for channel 'group_replication_recovery';
Query OK, 0 rows affected, 2 warnings (0.03 sec)
mysql> start group_replication;
Query OK, 0 rows affected (3.39 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 35c331eb-6d3d-11e7-91e3-000c29e07281 | vm2 | 3306 | ONLINE |
| group_replication_applier | 8b643791-6d30-11e7-986a-000c29d53b31 | vm1 | 3306 | ONLINE |
| group_replication_applier | 941bce76-6d40-11e7-b2fe-000c2909332c | vm3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)
一般来说,第一台启动group replication的为组复制中的master角色,我们可以使用如下的SQL进行确认SELECT
*
FROM
PERFORMANCE_SCHEMA .replication_group_members
WHERE
member_id = (
SELECT
variable_value
FROM
PERFORMANCE_SCHEMA .global_status
WHERE
VARIABLE_NAME = 'group_replication_primary_member'
);
二、如果master宕机,组复制会如何?下面我们来测试一下
首先确认当前的master主机是VM2
[root@vm2 ~]# service mysqld stop
Shutting down MySQL............. SUCCESS![root@vm2 ~]# tail -f /mydata/vm2.err
2017-10-27T15:42:19.×××27Z 0 [Note] Giving 7 client threads a chance to die gracefully
2017-10-27T15:42:19.999375Z 0 [Note] Shutting down slave threads
2017-10-27T15:42:22.000341Z 0 [Note] Forcefully disconnecting 7 remaining clients
2017-10-27T15:42:22.000556Z 0 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2017-10-27T15:42:22.002961Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-10-27T15:42:25.462993Z 0 [Note] Plugin group_replication reported: 'state 4345 action xa_terminate'
2017-10-27T15:42:25.463472Z 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-10-27T15:42:25.463514Z 0 [Note] Plugin group_replication reported: 'state 4272 action xa_exit'
2017-10-27T15:42:25.463650Z 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-10-27T15:42:25.463675Z 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-10-27T15:42:30.555581Z 0 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1'
2017-10-27T15:42:30.555639Z 0 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1'
2017-10-27T15:42:30.556496Z 19 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2017-10-27T15:42:30.564277Z 16 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
2017-10-27T15:42:30.565453Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2017-10-27T15:42:30.613823Z 0 [Note] Binlog end
2017-10-27T15:42:30.619483Z 0 [Note] Shutting down plugin 'group_replication'
2017-10-27T15:42:30.619582Z 0 [Note] Plugin group_replication reported: 'All Group Replication server observers have been successfully unregistered'
2017-10-27T15:42:30.620753Z 0 [Note] Shutting down plugin 'ngram'
2017-10-27T15:42:30.620801Z 0 [Note] Shutting down plugin 'BLACKHOLE'
2017-10-27T15:42:30.620820Z 0 [Note] Shutting down plugin 'ARCHIVE'
2017-10-27T15:42:30.620832Z 0 [Note] Shutting down plugin 'partition'
2017-10-27T15:42:30.620841Z 0 [Note] Shutting down plugin 'INNODB_SYS_VIRTUAL'
2017-10-27T15:42:30.620853Z 0 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'
2017-10-27T15:42:30.620937Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESPACES'
2017-10-27T15:42:30.620946Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN_COLS'
2017-10-27T15:42:30.620951Z 0 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN'
2017-10-27T15:42:30.620957Z 0 [Note] Shutting down plugin 'INNODB_SYS_FIELDS'
2017-10-27T15:42:30.620961Z 0 [Note] Shutting down plugin 'INNODB_SYS_COLUMNS'
2017-10-27T15:42:30.620966Z 0 [Note] Shutting down plugin 'INNODB_SYS_INDEXES'
2017-10-27T15:42:30.620971Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLESTATS'
2017-10-27T15:42:30.620976Z 0 [Note] Shutting down plugin 'INNODB_SYS_TABLES'
2017-10-27T15:42:30.620980Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_TABLE'
2017-10-27T15:42:30.620985Z 0 [Note] Shutting down plugin 'INNODB_FT_INDEX_CACHE'
2017-10-27T15:42:30.620990Z 0 [Note] Shutting down plugin 'INNODB_FT_CONFIG'
2017-10-27T15:42:30.620994Z 0 [Note] Shutting down plugin 'INNODB_FT_BEING_DELETED'
2017-10-27T15:42:30.620999Z 0 [Note] Shutting down plugin 'INNODB_FT_DELETED'
2017-10-27T15:42:30.621003Z 0 [Note] Shutting down plugin 'INNODB_FT_DEFAULT_STOPWORD'
2017-10-27T15:42:30.621008Z 0 [Note] Shutting down plugin 'INNODB_METRICS'
2017-10-27T15:42:30.621013Z 0 [Note] Shutting down plugin 'INNODB_TEMP_TABLE_INFO'
2017-10-27T15:42:30.621017Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_POOL_STATS'
2017-10-27T15:42:30.621022Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE_LRU'
2017-10-27T15:42:30.621027Z 0 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE'
2017-10-27T15:42:30.621032Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX_RESET'
2017-10-27T15:42:30.621042Z 0 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX'
2017-10-27T15:42:30.621051Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM_RESET'
2017-10-27T15:42:30.621056Z 0 [Note] Shutting down plugin 'INNODB_CMPMEM'
2017-10-27T15:42:30.621061Z 0 [Note] Shutting down plugin 'INNODB_CMP_RESET'
2017-10-27T15:42:30.621066Z 0 [Note] Shutting down plugin 'INNODB_CMP'
2017-10-27T15:42:30.621071Z 0 [Note] Shutting down plugin 'INNODB_LOCK_WAITS'
2017-10-27T15:42:30.621076Z 0 [Note] Shutting down plugin 'INNODB_LOCKS'
2017-10-27T15:42:30.621081Z 0 [Note] Shutting down plugin 'INNODB_TRX'
2017-10-27T15:42:30.621085Z 0 [Note] Shutting down plugin 'InnoDB'
2017-10-27T15:42:30.621309Z 0 [Note] InnoDB: FTS optimize thread exiting.
2017-10-27T15:42:30.622401Z 0 [Note] InnoDB: Starting shutdown...
2017-10-27T15:42:30.723215Z 0 [Note] InnoDB: Dumping buffer pool(s) to /mydata/ib_buffer_pool
2017-10-27T15:42:30.723542Z 0 [Note] InnoDB: Buffer pool(s) dump completed at 171027 11:42:30
2017-10-27T15:42:32.472064Z 0 [Note] InnoDB: Shutdown completed; log sequence number 2767811
2017-10-27T15:42:32.474350Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2017-10-27T15:42:32.474373Z 0 [Note] Shutting down plugin 'MEMORY'
2017-10-27T15:42:32.474385Z 0 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2017-10-27T15:42:32.474419Z 0 [Note] Shutting down plugin 'MRG_MYISAM'
2017-10-27T15:42:32.474431Z 0 [Note] Shutting down plugin 'MyISAM'
2017-10-27T15:42:32.474459Z 0 [Note] Shutting down plugin 'CSV'
2017-10-27T15:42:32.474471Z 0 [Note] Shutting down plugin 'sha256_password'
2017-10-27T15:42:32.474481Z 0 [Note] Shutting down plugin 'mysql_native_password'
2017-10-27T15:42:32.475429Z 0 [Note] Shutting down plugin 'binlog'
2017-10-27T15:42:32.477356Z 0 [Note] /usr/local/mysql/bin/mysqld: Shutdown complete
vm1主机的日志[root@vm1 ~]# tail -f /mydata/vm1.err
2017-10-27T15:42:22.634326Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-10-27T15:42:23.095198Z 0 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
2017-10-27T15:42:23.095627Z 0 [Note] Plugin group_replication reported: 'Unsetting super_read_only.'
2017-10-27T15:42:23.095987Z 5 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
vm3主机的日志[root@vm3 ~]# tail -f /mydata/vm3.err
2017-10-27T15:42:22.235922Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-10-27T15:42:22.696811Z 0 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
2017-10-27T15:42:22.697087Z 0 [Note] Plugin group_replication reported: 'Setting super_read_only.'
2017-10-27T15:42:22.697303Z 5 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
可以看到这个时候VM1自动提升为主了,且写入的数据可以继续同步到vm3,vm3的角色为slave,不能写入数据mysql> use yang
Database changed
mysql> insert into t1 values (2,'two');
Query OK, 1 row affected (0.03 sec)
mysql> select * from t1;
+----+------+
| id | name |
+----+------+
| 1 | one |
| 2 | two |
+----+------+
2 rows in set (0.00 sec)
mysql> select * from yang.t1;
+----+------+
| id | name |
+----+------+
| 1 | one |
| 2 | two |
+----+------+
2 rows in set (0.00 sec)
mysql> insert into yang.t1 values (3,'three');
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
如果此时vm1也跪了会如何呢?[root@vm1 ~]# service mysqld stop
Shutting down MySQL............. SUCCESS!
观察vm3的日志[root@vm3 ~]# tail -f /mydata/vm3.err
2017-10-27T15:42:22.235922Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-10-27T15:42:22.696811Z 0 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
2017-10-27T15:42:22.697087Z 0 [Note] Plugin group_replication reported: 'Setting super_read_only.'
2017-10-27T15:42:22.697303Z 5 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
2017-10-27T15:46:54.618998Z 5 [Note] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection'
2017-10-27T15:49:57.546061Z 0 [Note] Plugin group_replication reported: 'getstart group_id 4317e324'
2017-10-27T15:49:58.612942Z 0 [Note] Plugin group_replication reported: 'Unsetting super_read_only.'
2017-10-27T15:49:58.613092Z 5 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
VM3写入测试mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 941bce76-6d40-11e7-b2fe-000c2909332c | vm3 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)
mysql> insert into yang.t1 values (3,'three');
Query OK, 1 row affected (0.02 sec)
mysql> select * from yang.t1;
+----+-------+
| id | name |
+----+-------+
| 1 | one |
| 2 | two |
| 3 | three |
+----+-------+
3 rows in set (0.00 sec)
三、将VM1和VM2恢复后会如何?我们来测试一下
在最早宕机VM2查询数据是否正常
四、总结
通过上述简单的测试,发现mysql的组复制还是挺好用的,3台mysql实例组成的组复制结构下,只要有1台主机存活,整个mysql服务就可用,在故障主机恢复之后,组复制会自动同步数据,恢复组复制状态。
但在实际的使用过程中,需要考虑到客户端或者数据库中间件的连接问题。因为在单组复制模式下,slave是自读的,所有的写入请求都通过master,如果master宕机,程序或者客户端、数据库中间件读写分离的连接如何平滑处理?