[MySQL Shell] 8.9.1 Fencing Clusters in an InnoDB ClusterSet

独上西楼影三人

Last modified 2023-02-13 20:38:23

Category: MySQL Shell 8.0. Tags: mysql, database, InnoDB ClusterSet, MySQL Shell

First published 2023-02-10 17:32:07

Article link: https://blog.csdn.net/wudi53433927/article/details/128973494

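After an emergency failover, if there is a risk that the transaction sets differ between parts of the ClusterSet, you must fence the cluster from either write traffic or all traffic.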

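If a network partition occurs, a split-brain situation is possible: instances lose synchronization and cannot communicate correctly to determine the synchronization state. For example, when a DBA forces a replica cluster to become the primary cluster, more than one primary cluster can exist at the same time, which leads to a split-brain situation.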

In this situation, a DBA can choose to fence the original primary cluster from:

Write traffic.
All traffic.

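There are three fencing operations:

- <Cluster>.fenceWrites(): stops write traffic to the primary cluster of a ClusterSet. Replica clusters do not accept writes, so this operation has no effect on them.
  From 8.0.31, it can be used on INVALIDATED replica clusters. In addition, if it is run on a replica cluster where super_read_only is disabled, it enables super_read_only.
- <Cluster>.unfenceWrites(): resumes write traffic. This operation can be run on a cluster that was previously fenced from write traffic with <Cluster>.fenceWrites().
  cluster.unfenceWrites() cannot be used on replica clusters.
- <Cluster>.fenceAllTraffic(): fences the cluster from all traffic. If you have fenced a cluster from all traffic with <Cluster>.fenceAllTraffic(), you must restart the cluster with the MySQL Shell command dba.rebootClusterFromCompleteOutage().

For more information on dba.rebootClusterFromCompleteOutage(), see Section 7.8.3, "Rebooting a Cluster from a Major Outage".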

fenceWrites()
Running .fenceWrites() on a replica cluster returns an error:

ERROR: Unable to fence Cluster from write traffic: 
operation not permitted on REPLICA Clusters
Cluster.fenceWrites: The Cluster '<Cluster>' is a REPLICA Cluster 
of the ClusterSet '<ClusterSet>' (MYSQLSH 51616)

Although you mainly use fencing on clusters that belong to a ClusterSet, you can also fence a standalone cluster with <Cluster>.fenceAllTraffic().
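The rules stated above for fenceWrites() and unfenceWrites() can be sketched as a small pre-check. This is only an illustration under assumptions: the helper names (compareVersions, canFenceWrites, canUnfenceWrites) and the shape of the input object (role, invalidated, fencedWrites, shellVersion) are hypothetical and not part of the MySQL Shell API. The sketch encodes nothing beyond what this article says.

```javascript
// Hypothetical helpers: they encode only the rules stated in this article
// and do not call MySQL Shell.

// Compares two "major.minor.patch" version strings numerically.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// fenceWrites() is meant for the primary cluster. Replica clusters are
// rejected, except INVALIDATED replica clusters from 8.0.31 onwards.
function canFenceWrites(cluster) {
  if (cluster.role === "PRIMARY") return true;
  return cluster.invalidated === true &&
    compareVersions(cluster.shellVersion, "8.0.31") >= 0;
}

// unfenceWrites() applies to a cluster previously fenced from writes,
// and cannot be used on replica clusters.
function canUnfenceWrites(cluster) {
  return cluster.role === "PRIMARY" && cluster.fencedWrites === true;
}

console.log(canFenceWrites({ role: "REPLICA", invalidated: false, shellVersion: "8.0.32" })); // false
```

In a real deployment the role and status would come from the ClusterSet status report rather than from a hand-built object.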

To fence the primary cluster from write traffic, use the cluster.fenceWrites command as follows:

<Cluster>.fenceWrites()

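After the command runs:

Automatic super_read_only management is disabled on the cluster.
super_read_only is enabled on all instances in the cluster.
All applications are blocked from performing writes on the cluster.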

cluster.fenceWrites()
    The Cluster 'primary' will be fenced from write traffic

    * Disabling automatic super_read_only management on the Cluster...
    * Enabling super_read_only on '127.0.0.1:3311'...
    * Enabling super_read_only on '127.0.0.1:3312'...
    * Enabling super_read_only on '127.0.0.1:3313'...

    NOTE: Applications will now be blocked from performing writes on Cluster 'primary'.
    Use <Cluster>.unfenceWrites() to resume writes if you are certain a split-brain is not in effect.

    Cluster successfully fenced from write traffic

To check whether the primary cluster has been fenced from write traffic, query the status of the ClusterSet, as follows:

      <Cluster>.clusterset.status()

The output is as follows:

clusterset.status()
{
    "clusters": {
        "primary": {
            "clusterErrors": [
                "WARNING: Cluster is fenced from Write traffic. Use cluster.unfenceWrites() to unfence the Cluster."
            ],
            "clusterRole": "PRIMARY",
            "globalStatus": "OK_FENCED_WRITES",
            "primary": null,
            "status": "FENCED_WRITES",
            "statusText": "Cluster is fenced from Write Traffic."
        },
        "replica": {
            "clusterRole": "REPLICA",
            "clusterSetReplicationStatus": "OK",
            "globalStatus": "OK"
        }
    },
    "domainName": "primary",
    "globalPrimaryInstance": null,
    "primaryCluster": "primary",
    "status": "UNAVAILABLE",
    "statusText": "Primary Cluster is fenced from write traffic."
}
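A status report like the one above can also be checked programmatically. The following is a minimal sketch that runs on a plain object copied from that output, so it does not need a live ClusterSet. The function name writeFencedClusters and the trimmed sampleStatus object are assumptions made for illustration.

```javascript
// Sample object trimmed from the status output shown in this article.
const sampleStatus = {
  clusters: {
    primary: {
      clusterRole: "PRIMARY",
      globalStatus: "OK_FENCED_WRITES",
      status: "FENCED_WRITES"
    },
    replica: {
      clusterRole: "REPLICA",
      clusterSetReplicationStatus: "OK",
      globalStatus: "OK"
    }
  },
  primaryCluster: "primary",
  status: "UNAVAILABLE"
};

// Hypothetical helper: returns the names of clusters whose
// "status" field reports FENCED_WRITES.
function writeFencedClusters(status) {
  return Object.entries(status.clusters)
    .filter(([, info]) => info.status === "FENCED_WRITES")
    .map(([name]) => name);
}

console.log(writeFencedClusters(sampleStatus)); // [ 'primary' ]
```

In MySQL Shell, the same function could be given the object returned by the ClusterSet status call instead of the sample.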

To unfence the cluster and resume write traffic to the primary cluster, use the cluster.unfenceWrites command as follows:

<Cluster>.unfenceWrites()

Automatic super_read_only management is enabled on the primary cluster, and super_read_only is disabled on the cluster's primary instance.

cluster.unfenceWrites()
    The Cluster 'primary' will be unfenced from write traffic

    * Enabling automatic super_read_only management on the Cluster...
    * Disabling super_read_only on the primary '127.0.0.1:3311'...

    Cluster successfully unfenced from write traffic

To fence a cluster from all traffic, use the cluster.fenceAllTraffic command as follows:

<Cluster>.fenceAllTraffic()

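super_read_only is enabled on the cluster's primary instance before offline_mode is enabled on all instances in the cluster. As the output shows, Group Replication is then stopped on each instance: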

cluster.fenceAllTraffic()
    The Cluster 'primary' will be fenced from all traffic

    * Enabling super_read_only on the primary '127.0.0.1:3311'...
    * Enabling offline_mode on the primary '127.0.0.1:3311'...
    * Enabling offline_mode on '127.0.0.1:3312'...
    * Stopping Group Replication on '127.0.0.1:3312'...
    * Enabling offline_mode on '127.0.0.1:3313'...
    * Stopping Group Replication on '127.0.0.1:3313'...
    * Stopping Group Replication on the primary '127.0.0.1:3311'...

    Cluster successfully fenced from all traffic

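To unfence a cluster from all traffic, use the MySQL Shell command dba.rebootClusterFromCompleteOutage(). Once the cluster is restored, you are asked whether to rejoin each instance to the cluster. Answer Y to rejoin it: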

cluster = dba.rebootClusterFromCompleteOutage()
	Restoring the cluster 'primary' from complete outage...

	The instance '127.0.0.1:3312' was part of the cluster configuration.
	Would you like to rejoin it to the cluster? [y/N]: Y

	The instance '127.0.0.1:3313' was part of the cluster configuration.
	Would you like to rejoin it to the cluster? [y/N]: Y

	* Waiting for seed instance to become ONLINE...
	127.0.0.1:3311 was restored.
	Rejoining '127.0.0.1:3312' to the cluster.
	Rejoining instance '127.0.0.1:3312' to cluster 'primary'...

	The instance '127.0.0.1:3312' was successfully rejoined to the cluster.

	Rejoining '127.0.0.1:3313' to the cluster.
	Rejoining instance '127.0.0.1:3313' to cluster 'primary'...

	The instance '127.0.0.1:3313' was successfully rejoined to the cluster.

	The cluster was successfully rebooted.

	<Cluster:primary>
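The two recovery paths described in this article can be summarized in one small dispatcher. This is a sketch under assumptions: the function name unfence and the fenceKind labels "writes" and "all" are hypothetical, and the stand-in objects below only record calls so that the example runs outside MySQL Shell. Inside MySQL Shell you would pass a real cluster object and the global dba object instead.

```javascript
// Hypothetical dispatcher: picks the recovery call that matches
// how the cluster was fenced.
function unfence(fenceKind, cluster, dba) {
  if (fenceKind === "writes") {
    // Fenced with fenceWrites(): resume writes on the same cluster object.
    cluster.unfenceWrites();
    return cluster;
  }
  if (fenceKind === "all") {
    // Fenced with fenceAllTraffic(): the cluster must be rebooted.
    return dba.rebootClusterFromCompleteOutage();
  }
  throw new Error("unknown fence kind: " + fenceKind);
}

// Stand-in objects (not the real MySQL Shell objects) that log each call.
const calls = [];
const fakeCluster = { unfenceWrites: () => calls.push("unfenceWrites") };
const fakeDba = {
  rebootClusterFromCompleteOutage: () => {
    calls.push("rebootClusterFromCompleteOutage");
    return fakeCluster;
  }
};

unfence("writes", fakeCluster, fakeDba);
unfence("all", fakeCluster, fakeDba);
console.log(calls.join(", "));
```

As the NOTE in the fenceWrites() output says, resume writes only if you are certain that a split-brain is not in effect.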