详解 InnoDB Cluster 主机名问题

详解 InnoDB Cluster 主机名问题

导言

因在写 【InnoDB Cluster】修改已有集群实例名称及成员实例选项 时发现主机名这块有一些问题,在其中进行了部分测试,但为使其内容精简,故将此部分单独拿出来形成一篇文章。要了解更多信息,请读者参阅该篇文章。

测试过程

使用cluster.setInstanceOption('旧成员实例名称','label','新成员实例名称')来修改成员实例名称。

 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica3:3306','label','replica2.ic')
Setting the value of 'label' to 'replica2.ic' in the instance: 'replica3:3306' ...

Successfully set the value of 'label' to 'replica2.ic' in the cluster member: 'replica3:3306'.

cluster.status()结果:

 MySQL  ic-source:33060+ ssl  JS > cluster.status()
{
    "clusterName": "myCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "ic-source:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "ic-replica1:3306": {
                "address": "ic-replica1:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "ic-source:3306": {
                "address": "ic-source:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "replica2.ic": {
                "address": "replica3:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "ic-source:3306"
}

cluster.options()结果:

 MySQL  ic-source:33060+ ssl  JS > cluster.options()
 ...
         "tags": {
            "global": [], 
            "ic-replica1:3306": [], 
            "ic-source:3306": [], 
            "replica2.ic": []
        },
 ...
 
            "replica2.ic": [

注意,这里有个问题,我故意拿 replica3:3306 作为cluster.setInstanceOption()方法的实例名称参数,却依然可以运行,证明旧的实例名称并没有完全失效。那是主机缓存的问题吗?

我们来FLUSH HOSTS;试试。

mysql> flush hosts;                            
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                              |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1287 | 'FLUSH HOSTS' is deprecated and will be removed in a future release. Please use TRUNCATE TABLE performance_schema.host_cache instead |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show warnings;
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                              |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1287 | 'FLUSH HOSTS' is deprecated and will be removed in a future release. Please use TRUNCATE TABLE performance_schema.host_cache instead |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show tables from performance_schema like 'host%';
+--------------------------------------+
| Tables_in_performance_schema (host%) |
+--------------------------------------+
| host_cache                           |
| hosts                                |
+--------------------------------------+
2 rows in set (0.04 sec)

mysql> select * from performance_schema.host_cache; 
Empty set (0.06 sec)

再次设置label选项:

 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica3:3306','label','replica2')
Setting the value of 'label' to 'replica2' in the instance: 'replica3:3306' ...

Successfully set the value of 'label' to 'replica2' in the cluster member: 'replica3:3306'.
 MySQL  ic-source:33060+ ssl  JS > cluster.status()
{
    "clusterName": "myCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "ic-source:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "ic-replica1:3306": {
                "address": "ic-replica1:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "ic-source:3306": {
                "address": "ic-source:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "replica2": {
                "address": "replica3:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "ic-source:3306"
}

还是可以运行。那再截断performance_schema.hosts表呢?

mysql> select * from performance_schema.hosts;
+-------------+---------------------+-------------------+-------------------------------+--------------------------+
| HOST        | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS | MAX_SESSION_CONTROLLED_MEMORY | MAX_SESSION_TOTAL_MEMORY |
+-------------+---------------------+-------------------+-------------------------------+--------------------------+
| NULL        |                  39 |                86 |                        272128 |                 66816672 |
| localhost   |                   2 |                 5 |                       1477504 |                  1985033 |
| ic-source   |                   0 |                45 |                       2466600 |                  2817255 |
| ic-replica1 |                   0 |                 2 |                          8240 |                    74084 |
| ic-replica2 |                   0 |                 6 |                         20576 |                   126813 |
| replica2    |                   1 |                38 |                         20576 |                    74084 |
| source      |                  15 |               159 |                       2466360 |                  2860667 |
| replica1    |                   1 |                 3 |                         20576 |                    74084 |
+-------------+---------------------+-------------------+-------------------------------+--------------------------+
8 rows in set (0.00 sec)

mysql> truncate table performance_schema.hosts;
Query OK, 0 rows affected (0.09 sec)

mysql> select * from performance_schema.hosts; 
+-----------+---------------------+-------------------+-------------------------------+--------------------------+
| HOST      | CURRENT_CONNECTIONS | TOTAL_CONNECTIONS | MAX_SESSION_CONTROLLED_MEMORY | MAX_SESSION_TOTAL_MEMORY |
+-----------+---------------------+-------------------+-------------------------------+--------------------------+
| NULL      |                  39 |                39 |                        108512 |                  1370284 |
| localhost |                   2 |                 2 |                        647040 |                   671012 |
| replica2  |                   1 |                 1 |                          8240 |                    74084 |
| source    |                  15 |                15 |                       1841072 |                  2397944 |
| replica1  |                   1 |                 1 |                          8240 |                    74084 |
+-----------+---------------------+-------------------+-------------------------------+--------------------------+
5 rows in set (0.00 sec)

再次设置label选项:

 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica3:3306','label','replica2.ic')
Setting the value of 'label' to 'replica2.ic' in the instance: 'replica3:3306' ...

Successfully set the value of 'label' to 'replica2.ic' in the cluster member: 'replica3:3306'.

还是可以运行。

 MySQL  ic-source:33060+ ssl  JS > cluster.status()
 ...
            "replica2.ic": {
                "address": "replica3:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }

所以估计还是其他地方有缓存,或者和addresses可以识别出旧的实例名称。继而,我修改addresses中的mysqlClassic值为 r2-table:3306,即普通 MySQL 客户端使用的经典(标准) MySQL 协议的通讯地址。

mysql> update instances set addresses=json_replace(addresses,'$.mysqlClassic','r2-table:3306') where instance_id=4;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from instances where instance_id=4;
+-------------+--------------------------------------+---------------+--------------------------------------+---------------+-------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
| instance_id | cluster_id                           | address       | mysql_server_uuid                    | instance_name | addresses                                                                                 | attributes                                                                                                                                             | description |
+-------------+--------------------------------------+---------------+--------------------------------------+---------------+-------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
|           4 | f465b0e3-6ce3-11ed-8310-000c298d6cb9 | replica3:3306 | 2737b324-a0f6-11ea-afd1-000c2988ff33 | replica2.ic   | {"mysqlX": "replica3:33060", "grLocal": "r2-table:3306", "mysqlClassic": "r2-table:3306"} | {"joinTime": "2022-11-26 01:12:34.768", "server_id": 1250867127, "recoveryAccountHost": "%", "recoveryAccountUser": "mysql_innodb_cluster_1250867127"} | NULL        |
+-------------+--------------------------------------+---------------+--------------------------------------+---------------+-------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
1 row in set (0.00 sec)

然后尝试修改label,发现报错,

 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica3:3306','label','replica2')
Cluster.setInstanceOption: The instance 'replica3:3306' does not belong to the cluster. (RuntimeError)

 MySQL  ic-source:33060+ ssl  JS > cluster.status()
{
    "clusterName": "myCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "ic-source:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "ic-replica1:3306": {
                "address": "ic-replica1:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "ic-source:3306": {
                "address": "ic-source:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "replica2.ic": {
                "address": "r2-table:3306", 
                "instanceErrors": [
                    "ERROR: Metadata for this instance does not match hostname reported by instance (metadata=r2-table:3306, actual=replica3:3306). Use rescan() to update the metadata."
                ], 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "ic-source:3306"
}

这里我们暂时先不rescan(),尝试使用其他成员实例名称(当前成员实例名称 replica2.ic,当前成员实例的 group_replication_local_address 变量值 r2-cnf:3306,和当前元数据表中新设的值 r2-table:3306 ):

 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica2.ic','label','replica2')
Cluster.setInstanceOption: The instance 'replica2.ic:3306' does not belong to the cluster. (RuntimeError)
 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('r2-cnf:3306','label','replica2')
Cluster.setInstanceOption: The instance 'r2-cnf:3306' does not belong to the cluster. (RuntimeError)
 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('r2-table:3306','label','replica2')
Cluster.setInstanceOption: The instance 'r2-table:3306' does not belong to the cluster. (RuntimeError)

再试下mysqlX的值 replica3:33060 ,显然协议都不对:

 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica3:33060','label','replica2')
ERROR: Unable to connect to the target instance 'replica3:33060'. Please verify the connection settings, make sure the instance is available and try again.
Cluster.setInstanceOption: Could not open connection to 'replica3:33060': The provided URI uses the X protocol, which is not supported by this command. (RuntimeError)

都报错,证明改这个东西导致 MYSQL Shell 元数据与实际情况出现了不匹配,虽然集群仍保持高可用,成员实例仍保持在线状态,但 MYSQL Shell API 已经无法对该成员实例进行修改了。我验证了此时也仍可建立新客户端会话。由此证明mysqlClassic很重要!
图1

那么我们改回正确的值 replica3:3306

mysql> update instances set addresses=json_replace(addresses,'$.mysqlClassic','replica3:3306') where instance_id=4;         
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

查看集群状态显示已经恢复了。把所有/etc/hosts中该 IP 地址对应的名称全都拿来作为成员实例名称传入该方法的第一个参数中:

 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('r2-table:3306','label','replica2')
Setting the value of 'label' to 'replica2' in the instance: 'r2-table:3306' ...

Successfully set the value of 'label' to 'replica2' in the cluster member: 'r2-table:3306'.
 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica3:3306','label','replica2.ic')
Setting the value of 'label' to 'replica2.ic' in the instance: 'replica3:3306' ...

Successfully set the value of 'label' to 'replica2.ic' in the cluster member: 'replica3:3306'.
 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('r2-cnf:3306','label','replica2')
Setting the value of 'label' to 'replica2' in the instance: 'r2-cnf:3306' ...
 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('replica2.ic:3306','label','replica2.ic')
Setting the value of 'label' to 'replica2.ic' in the instance: 'replica2.ic:3306' ...

Successfully set the value of 'label' to 'replica2.ic' in the cluster member: 'replica2.ic:3306'.
 MySQL  ic-source:33060+ ssl  JS > cluster.setInstanceOption('ic-replica2:3306','label','replica2')
Setting the value of 'label' to 'replica2' in the instance: 'ic-replica2:3306' ...

Successfully set the value of 'label' to 'replica2' in the cluster member: 'ic-replica2:3306'.

都可以运行,证明并不是只识别旧的实例名称,而是/etc/hosts中为该 IP 地址定义的所有名称!

那么为什么如此呢?前面我们已经分析得出结论,需要重启 MySQL 服务器(进程)才会将修改的主机名应用到数据库。所以这个实例名称 MySQL Shell API 是通过 MySQL 的主机名(SELECT @@hostname;)和协议使用的端口号拼接而成的。因为 MySQL 8.0.27 版本以后的 MySQL 8.0 组复制通信堆栈 支持 MySQL 通信协议 作为可选项,且 InnoDB Cluster 默认采用 MySQL 通信堆栈。所以此处的端口号是 MYSQL 默认的 3306 ,而非 XCOM 默认的 33061

mysql> SELECT @@group_replication_communication_stack;
+-----------------------------------------+
| @@group_replication_communication_stack |
+-----------------------------------------+
| MYSQL                                   |
+-----------------------------------------+
1 row in set (0.00 sec)

如果我们刚刚使用rescan()呢?再现刚刚的问题:

mysql> update instances set addresses=json_replace(addresses,'$.mysqlClassic','r2-table:3306') where instance_id=4;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
}
 MySQL  ic-source:33060+ ssl  JS > cluster.rescan()
Rescanning the cluster...

Result of the rescanning operation for the 'myCluster' cluster:
{
    "name": "myCluster", 
    "newTopologyMode": null, 
    "newlyDiscoveredInstances": [], 
    "unavailableInstances": [], 
    "updatedInstances": [
        {
            "host": "replica3:3306", 
            "id": 4, 
            "label": "replica2", 
            "member_id": "2737b324-a0f6-11ea-afd1-000c2988ff33", 
            "old_host": "ic-replica2:3306"
        }
    ]
}

The instance 'replica3:3306' is part of the cluster but its reported address has changed. Old address: ic-replica2:3306. Current address: replica3:3306.
Updating instance metadata...
The instance metadata for 'replica3:3306' was successfully updated.

 MySQL  ic-source:33060+ ssl  JS > cluster.status()
{
    "clusterName": "myCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "ic-source:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "ic-replica1:3306": {
                "address": "ic-replica1:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "ic-source:3306": {
                "address": "ic-source:3306", 
                "memberRole": "PRIMARY", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }, 
            "replica2": {
                "address": "replica3:3306", 
                "memberRole": "SECONDARY", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "replicationLag": "applier_queue_applied", 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.31"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "ic-source:3306"
}

再次印证了我刚刚的猜想。

结论

主机名这块,最关键的是 MySQL 的主机名(SELECT @@hostname;),而非操作系统当前的主机名。 当然,只要 MySQL 服务器(进程)一重启,就会读取操作系统当前的主机名作为 MySQL 的主机名(SELECT @@hostname;)。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

独上西楼影三人

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值