Redis Cluster on K8s, Demystified

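We previously discussed containerizing Redis in "Is Containerizing Redis a 'Soft Persimmon' (an Easy Target)?". The industry has no shortage of practical write-ups on the topic, and KubeBlocks has also adapted to Redis Cluster and offers a corresponding solution. What problems did KubeBlocks run into while containerizing Redis, and how were they solved? This article takes you along to give this "persimmon" a squeeze.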

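Background

Redis Cluster is the distributed solution for the Redis database. It spreads data across multiple nodes to provide high availability and scalability: large datasets are sharded across nodes, and shards can be migrated between nodes while the cluster stays online.

Redis Cluster manages data distribution with hash slots. The key space is divided into a fixed number of hash slots (16384), and each slot is assigned to a node, so every node serves the data in its own subset of slots. Clients can connect directly to any node, with no intermediate proxy required.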
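To make the key-to-slot mapping concrete, here is a minimal Python sketch. It assumes the mapping documented in the Redis Cluster specification (CRC16/XMODEM of the key, modulo 16384, with the hash tag rule for keys containing {...}). The function names are ours and are not part of Redis or KubeBlocks.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) that a key maps to."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end > start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % 16384


if __name__ == "__main__":
    for key in ("foo", "{user1000}.following", "{user1000}.followers"):
        print(key, key_slot(key))
```

Keys that share a hash tag land in the same slot, which is what allows multi-key operations on them in cluster mode.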

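In an application deployment, the overall architecture usually consists of Redis Cluster on the backend and a smart client on the application side.

Redis Cluster provides the following features:

  1. Sharding and online data migration: when nodes join or leave the cluster, hash slots and their data can be moved to the right nodes without downtime, keeping the data evenly distributed. In open source Redis this resharding is triggered by the operator (for example with redis-cli --cluster reshard or rebalance) rather than started by the cluster on its own.
  2. High availability: Redis Cluster uses master-replica replication, and each master can have one or more replicas. When a master fails, one of its replicas is promoted automatically and takes over.
  3. Load distribution: clients can connect directly to any node. A node that does not own the requested slot does not proxy the request; it answers with a MOVED or ASK redirection and the smart client then talks to the owning node, so requests end up spread across the masters.

By distributing data across multiple nodes and providing automatic failover and request redirection, Redis Cluster lets applications handle large datasets and highly concurrent access. It is a powerful distributed solution, commonly used where high performance and scalability are needed, such as caching, session storage and real-time counters.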

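Reproducing the problem

Many KubeBlocks customers have a strong need for Redis Cluster, so we adapted Redis Cluster on top of KubeBlocks. During this work we found that Redis Cluster has compatibility problems with some networking setups in Kubernetes container environments.

The steps to reproduce the problem are as follows:

1. Install KubeBlocks 0.9.0 (0.9.0-beta.8 in this walkthrough)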

slc@slcmac kbcli % ./bin/kbcli kubeblocks list-versions --devel
VERSION         RELEASE-NOTES
0.9.0-beta.8    https://github.com/apecloud/kubeblocks/releases/tag/v0.9.0-beta.8
0.9.0-beta.7    https://github.com/apecloud/kubeblocks/releases/tag/v0.9.0-beta.7
slc@slcmac kbcli % kbcli kubeblocks install --version="0.9.0-beta.8"

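2. Install the redis-cluster addon

A redis addon is installed by default, but because of the network adaptation issue described in this article, the default addon does not yet support Redis Cluster properly.

# Disable the default addon first
slc@slcmac addons % kbcli addon disable redis
# Install the latest addon from the branch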
slc@slcmac addons % git clone git@github.com:apecloud/kubeblocks-addons.git
slc@slcmac addons % cd kubeblocks-addons/addons/redis 
slc@slcmac addons % helm dependency build && cd ..
slc@slcmac addons % helm install redis ./redis
slc@slcmac addons % helm list
NAME          NAMESPACE        REVISION        UPDATED                                     STATUS          CHART                      APP VERSION
redis         default          1               2024-04-15 21:29:37.953119 +0800 CST        deployed        redis-0.9.0                7.0.6

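To make the problem easier to reproduce, we slightly modified some of the addon's configuration and steps before running helm install redis.

3. Create the Redis Cluster

The instance is created in NodePort mode, with 3 masters and 3 replicas.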

slc@slcmac addons % helm install redisc ./redis-cluster --set mode=cluster --set nodePortEnabled=true --set redisCluster.shardCount=3
slc@slcmac addons % kg pods | grep -v job
NAME                                           READY   STATUS    RESTARTS   AGE
redisc-shard-hxx-1                             3/3     Running   0          14m
redisc-shard-hxx-0                             3/3     Running   0          14m
redisc-shard-xwz-0                             3/3     Running   0          14m
redisc-shard-xwz-1                             3/3     Running   0          14m
redisc-shard-5g8-0                             3/3     Running   0          14m
redisc-shard-5g8-1                             3/3     Running   0          14m

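The pods for all 3 masters and 3 replicas are created successfully, but at this point the relationships between the cluster nodes have not been established yet.

Announce ip/port/bus-port: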

redisc-shard-5g8-0
kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-ip 172.18.0.2
kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-port 30039
kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-bus-port 32461
redisc-shard-hxx-0
kubectl exec -it redisc-shard-hxx-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-ip 172.18.0.2
kubectl exec -it redisc-shard-hxx-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-port 30182
kubectl exec -it redisc-shard-hxx-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-bus-port 31879
redisc-shard-xwz-0
kubectl exec -it redisc-shard-xwz-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-ip 172.18.0.2
kubectl exec -it redisc-shard-xwz-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-port 31993
kubectl exec -it redisc-shard-xwz-0 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-bus-port 30105

Create Slot:

kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- redis-cli -a O3605v7HsS cluster ADDSLOTSRANGE 0 5461
kubectl exec -it redisc-shard-hxx-0 -c redis-cluster -- redis-cli -a O3605v7HsS cluster ADDSLOTSRANGE 5462 10922
kubectl exec -it redisc-shard-xwz-0 -c redis-cluster -- redis-cli -a O3605v7HsS cluster ADDSLOTSRANGE 10923 16383
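
The three ranges passed to ADDSLOTSRANGE above follow from splitting the 16384 slots as evenly as possible. The Python sketch below shows one way to compute them; split_slots is a hypothetical helper written for this illustration and is not the logic used by the addon.

```python
from typing import List, Tuple


def split_slots(shards: int, total: int = 16384) -> List[Tuple[int, int]]:
    """Split slots 0..total-1 into contiguous, near-equal inclusive ranges."""
    base, extra = divmod(total, shards)
    ranges = []
    start = 0
    for index in range(shards):
        # The first `extra` shards each take one additional slot.
        size = base + (1 if index < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for first, last in split_slots(3):
        print(f"cluster ADDSLOTSRANGE {first} {last}")
```

For 3 shards the first shard receives 5462 slots and the other two 5461 each, which matches the ranges used in the commands above.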

Cluster Meet:

Log in to one of the master nodes:
slc@slcmac redis % kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- /bin/bash
root@redisc-shard-5g8-0:/# redis-cli -a O3605v7HsS
127.0.0.1:6379> cluster nodes
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 myself,master - 0 0 0 connected 0-5461
Only the node itself is listed, so it still has to meet the other two nodes explicitly:
slc@slcmac redis %  kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- redis-cli -a O3605v7HsS cluster meet 172.18.0.2 30182 31879
OK
slc@slcmac redis %  kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- redis-cli -a O3605v7HsS cluster meet 172.18.0.2 31993 30105
OK
Check the cluster topology again:
slc@slcmac redis % kubectl exec -it redisc-shard-5g8-0 -c redis-cluster -- /bin/bash
root@redisc-shard-5g8-0:/# redis-cli -a O3605v7HsS
127.0.0.1:6379> cluster nodes
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 myself,master - 0 1713324462000 0 connected 0-5461
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713324462989 2 connected 10923-16383
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 master - 0 1713324463091 1 connected 5462-10922

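At this point a cluster of 3 master nodes has been set up correctly.

4. Join a replica through its headless address

We use the pod redisc-shard-5g8-1 as the replica of master redisc-shard-5g8-0.

Check the connections on the replica. They are clean, with no connections to any master yet: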
root@redisc-shard-5g8-1:/# netstat -anop | grep redis
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 127.0.0.1:6379          127.0.0.1:46948         ESTABLISHED 1/redis-server *:63  keepalive (123.22/0/0)
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

Headless address of the replica: redisc-shard-5g8-1.redisc-shard-5g8-headless:6379
The complete join command is:

slc@slcmac redis % kubectl exec -it redisc-shard-5g8-1 -c redis-cluster -- /bin/bash
root@redisc-shard-5g8-1:/# redis-cli -a O3605v7HsS --cluster add-node redisc-shard-5g8-1.redisc-shard-5g8-headless:6379 172.18.0.2:30039 --cluster-slave --cluster-master-id ff935854b7626a7e4374598857d5fbe998297799
>>> Adding node redisc-shard-5g8-1.redisc-shard-5g8-headless:6379 to cluster 172.18.0.2:30039
>>> Performing Cluster Check (using node 172.18.0.2:30039)
M: ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039
   slots:[0-5461] (5462 slots) master
M: e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993
   slots:[10923-16383] (5461 slots) master
M: a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182
   slots:[5462-10922] (5461 slots) master
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node redisc-shard-5g8-1.redisc-shard-5g8-headless:6379 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 172.18.0.2:30039.
[OK] New node added correctly.

172.18.0.2:30039 is the announced ip/port of the master node.

Check the connections:

root@redisc-shard-5g8-1:/# netstat -anop | grep redis
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.237:48424       172.18.0.2:31879        ESTABLISHED 1/redis-server *:63  off (0.00/0/0) // master-2 announced bus port
tcp        0      0 10.42.0.237:36154       172.18.0.2:32461        ESTABLISHED 1/redis-server *:63  off (0.00/0/0) // master-1 announced bus port
tcp        0      0 10.42.0.237:33504       172.18.0.2:30039        ESTABLISHED 1/redis-server *:63  keepalive (285.22/0/0) // master-1 announced port
tcp        0      0 127.0.0.1:6379          127.0.0.1:46948         ESTABLISHED 1/redis-server *:63  keepalive (279.99/0/0) // local redis-cli
tcp        0      0 10.42.0.237:58576       172.18.0.2:30105        ESTABLISHED 1/redis-server *:63  off (0.00/0/0) // master-3 announced bus port
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

The replica has connected to the announced bus port of each of the 3 masters, plus one extra connection to the announced port of its own master.
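
The annotations in the netstat output above can be reproduced mechanically. The Python sketch below labels ESTABLISHED connections by peer port; the port table is copied from the announce commands earlier in this walkthrough, while classify_connections and the label strings are names made up for this illustration.

```python
from typing import Dict, List, Tuple

# Announced ports from this walkthrough; the labels are ours.
ANNOUNCED_PORTS: Dict[int, str] = {
    30039: "master-1 port",
    32461: "master-1 bus port",
    30182: "master-2 port",
    31879: "master-2 bus port",
    31993: "master-3 port",
    30105: "master-3 bus port",
}


def classify_connections(netstat_output: str) -> List[Tuple[str, str]]:
    """Label ESTABLISHED connections whose peer port is an announced port."""
    labelled = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Columns: proto, recv-q, send-q, local address, peer address, state, ...
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        peer = fields[4]
        port = int(peer.rpartition(":")[2])
        label = ANNOUNCED_PORTS.get(port)
        if label is not None:
            labelled.append((peer, label))
    return labelled
```

Feeding it the output of netstat -anop from the replica should list three bus-port links and one link to the announced port of master-1.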

The cluster topology seen from the replica is correct:

root@redisc-shard-5g8-1:/# redis-cli -a O3605v7HsS
127.0.0.1:6379> cluster nodes
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 master - 0 1713327060494 0 connected 0-5461
3a136cd50eb3f2c0dcc3844a0de63d5e44b462d7 :6379@16379 myself,slave ff935854b7626a7e4374598857d5fbe998297799 0 0 0 connected
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713327060696 2 connected 10923-16383
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 master - 0 1713327060605 1 connected 5462-10922

The cluster topology seen from the master is missing the newly added replica:

root@redisc-shard-5g8-0:/# redis-cli -a O3605v7HsS
127.0.0.1:6379> cluster nodes
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 myself,master - 0 1713327106000 0 connected 0-5461
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713327107004 2 connected 10923-16383
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 master - 0 1713327107106 1 connected 5462-10922
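
A disagreement like this between two nodes' views is easy to miss by eye. As a rough sketch, the two cluster nodes outputs can be compared by node ID in Python. It assumes the standard CLUSTER NODES line layout, in which the node ID is the first of at least eight space-separated fields; node_ids and missing_from are hypothetical helper names.

```python
from typing import Set


def node_ids(cluster_nodes_output: str) -> Set[str]:
    """Collect the node IDs listed in a CLUSTER NODES reply."""
    ids = set()
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        # id, address, flags, master, ping-sent, pong-recv, epoch, link-state
        if len(fields) >= 8:
            ids.add(fields[0])
    return ids


def missing_from(view: str, reference: str) -> Set[str]:
    """Node IDs that appear in `reference` but not in `view`."""
    return node_ids(reference) - node_ids(view)
```

Calling missing_from(master_view, replica_view) with the two outputs above should return the ID of the replica that the master does not know about.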

During the add-node above, cluster meet reported success, yet the master never actually saw the replica. Looking through /data/running.log turns up the following errors:

root@redisc-shard-5g8-0:/data# grep 16379 running.log
1:M 17 Apr 2024 04:05:37.610 - Connection with Node 30e6d55c687bfc08e4a2fcd2ef586ba5458d801f at 10.42.0.1:16379 failed: Connection refused
(the same message is repeated 10 times in total)

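So this cluster meet actually failed. Why?

Troubleshooting

1. The mysterious IP

The default bus port of Redis Cluster is 16379 = 6379 + 10000. If no bus port is explicitly announced, Redis Cluster uses this port. So the likely story is that, after receiving the meet request, the master tries to connect back to the peer on its default bus port (16379) and keeps failing. But the replica's pod IP (10.42.0.237) is not the IP shown in the error message (10.42.0.1). Why would the master connect back to a different IP?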

slc@slcmac redis %  kg pods -A -o wide | grep redisc-shard-5g8-1
default       redisc-shard-5g8-1                             3/3     Running     0              72m    10.42.0.237   k3d-k3s-default-server-0

Digging further, 10.42.0.1 turns out to be the address of cni0 in k3d (the Kubernetes distribution used in our development environment):

slc@slcmac redis % docker ps
CONTAINER ID   IMAGE                            COMMAND                  CREATED        STATUS        PORTS                             NAMES
8f8958df3298   moby/buildkit:buildx-stable-1    "buildkitd --allow-i…"   6 weeks ago    Up 6 weeks                                      buildx_buildkit_project-v3-builder0
f8f349b2faab   ghcr.io/k3d-io/k3d-proxy:5.4.6   "/bin/sh -c nginx-pr…"   6 months ago   Up 3 months   80/tcp, 0.0.0.0:57830->6443/tcp   k3d-k3s-default-serverlb
3e291f02144a   rancher/k3s:v1.24.4-k3s1         "/bin/k3d-entrypoint…"   6 months ago   Up 3 months                                     k3d-k3s-default-server-0
slc@slcmac redis % docker exec -it 3e291f02144a /bin/sh
/ # ifconfig
cni0      Link encap:Ethernet  HWaddr 32:22:34:47:9D:BF
          inet addr:10.42.0.1  Bcast:10.42.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:219424018 errors:0 dropped:0 overruns:0 frame:0
          TX packets:238722923 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:33805804056 (31.4 GiB)  TX bytes:199941577234 (186.2 GiB)

eth0      Link encap:Ethernet  HWaddr 02:42:AC:12:00:02
          inet addr:172.18.0.2  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:74602028 errors:0 dropped:0 overruns:0 frame:0
          TX packets:68167266 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:39814942542 (37.0 GiB)  TX bytes:17167663962 (15.9 GiB)
slc@slcmac redis % kg node -o wide
NAME                       STATUS   ROLES                  AGE    VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE   KERNEL-VERSION      CONTAINER-RUNTIME
k3d-k3s-default-server-0   Ready    control-plane,master   183d   v1.24.4+k3s1   172.18.0.2    <none>        K3s dev    5.10.104-linuxkit   containerd://1.6.6-k3s1

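In other words, 10.42.* is the default pod CIDR range of k3d, and 172.18.0.2 is the address of the only k3d node (which is why every NodePort address we see is 172.18.0.2).

2. The elusive link

It turns out that the connections carrying the gossip protocol (local ephemeral port -> peer NodePort -> peer port 16379) are NATed before they reach the target. Using tcpdump we tracked down one gossip session. Although the session is NATed along the way, the TS val and ecr values (the TCP timestamp option) still let us reconstruct it completely. Below we reconstruct the gossip link between master-1 and master-2, which are already connected:

Connections on master-1 redisc-shard-5g8-0: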

root@redisc-shard-5g8-0:/data# netstat -anop | grep redis
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 127.0.0.1:6379          127.0.0.1:46798         ESTABLISHED 1/redis-server *:63  keepalive (117.47/0/0)
tcp        0      0 10.42.0.236:58412       172.18.0.2:31879        ESTABLISHED 1/redis-server *:63  off (0.00/0/0) // peer is the NodePort of master-2
tcp        0      0 10.42.0.236:6379        10.42.0.1:45255         ESTABLISHED 1/redis-server *:63  keepalive (118.11/0/0)
tcp        0      0 10.42.0.236:36528       172.18.0.2:30105        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.236:16379       10.42.0.1:16471         ESTABLISHED 1/redis-server *:63  keepalive (1.20/0/0)
tcp        0      0 10.42.0.236:16379       10.42.0.1:30788         ESTABLISHED 1/redis-server *:63  keepalive (0.08/0/0)
tcp        0      0 10.42.0.236:16379       10.42.0.1:20521         ESTABLISHED 1/redis-server *:63  keepalive (1.42/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

Connections on master-2 redisc-shard-hxx-0:

root@redisc-shard-hxx-0:/# netstat -anop | grep redis
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.232:16379       10.42.0.1:24780         ESTABLISHED 1/redis-server *:63  keepalive (0.72/0/0) // master-1's address after NAT
tcp        0      0 10.42.0.232:41974       172.18.0.2:30105        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.232:16379       10.42.0.1:6717          ESTABLISHED 1/redis-server *:63  keepalive (1.34/0/0)
tcp        0      0 10.42.0.232:16379       10.42.0.1:24130         ESTABLISHED 1/redis-server *:63  keepalive (0.33/0/0)
tcp        0      0 10.42.0.232:33306       172.18.0.2:32461        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 127.0.0.1:6379          127.0.0.1:46626         ESTABLISHED 1/redis-server *:63  keepalive (24.56/0/0)
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

How the two connections map to each other:

# Capture on master-1 redisc-shard-5g8-0 for NodePort 31879 (master-2 redisc-shard-hxx-0):
05:40:04.817984 IP redisc-shard-5g8-0.redisc-shard-5g8-headless.default.svc.cluster.local.58412 > k3d-k3s-default-server-0.31879: Flags [P.], seq 6976:9336, ack 7081, win 10027, options [nop,nop,TS val 4191410578 ecr 867568717], length 2360
05:40:04.818428 IP k3d-k3s-default-server-0.31879 > redisc-shard-5g8-0.redisc-shard-5g8-headless.default.svc.cluster.local.58412: Flags [.], ack 9336, win 498, options [nop,nop,TS val 867569232 ecr 4191410578], length 0
05:40:04.819269 IP k3d-k3s-default-server-0.31879 > redisc-shard-5g8-0.redisc-shard-5g8-headless.default.svc.cluster.local.58412: Flags [P.], seq 7081:9441, ack 9336, win 501, options [nop,nop,TS val 867569233 ecr 4191410578], length 2360
05:40:04.819309 IP redisc-shard-5g8-0.redisc-shard-5g8-headless.default.svc.cluster.local.58412 > k3d-k3s-default-server-0.31879: Flags [.], ack 9441, win 10026, options [nop,nop,TS val 4191410580 ecr 867569233], length 0

# Capture on master-2 redisc-shard-hxx-0 for port 24780 (the NATed source port of master-1 redisc-shard-5g8-0):
05:40:04.818178 IP 10.42.0.1.24780 > redisc-shard-hxx-0.redisc-shard-hxx-headless.default.svc.cluster.local.16379: Flags [P.], seq 32624:34984, ack 32937, win 10027, options [nop,nop,TS val 4191410578 ecr 867568717], length 2360
05:40:04.818371 IP redisc-shard-hxx-0.redisc-shard-hxx-headless.default.svc.cluster.local.16379 > 10.42.0.1.24780: Flags [.], ack 34984, win 498, options [nop,nop,TS val 867569232 ecr 4191410578], length 0
05:40:04.819239 IP redisc-shard-hxx-0.redisc-shard-hxx-headless.default.svc.cluster.local.16379 > 10.42.0.1.24780: Flags [P.], seq 32937:35297, ack 34984, win 501, options [nop,nop,TS val 867569233 ecr 4191410578], length 2360
05:40:04.819327 IP 10.42.0.1.24780 > redisc-shard-hxx-0.redisc-shard-hxx-headless.default.svc.cluster.local.16379: Flags [.], ack 35297, win 10026, options [nop,nop,TS val 4191410580 ecr 867569233], length 0

As the captures show, every packet sent from a Pod to a NodePort arrives at the receiving peer with its source address NATed to the cni0 address 10.42.0.1.
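
The manual matching of the two captures can be sketched in Python. The idea is the one used above: NAT rewrites addresses and ports but leaves the TCP timestamp option and the payload length untouched, so the triple (TS val, ecr, length) serves as a fingerprint. The regular expressions assume the tcpdump text format shown above, and parse and correlate are hypothetical names. Fingerprints are not guaranteed to be unique, so treat a match as a strong hint rather than proof.

```python
import re
from typing import Dict, List, Optional, Tuple

TS_RE = re.compile(r"TS val (\d+) ecr (\d+)\], length (\d+)")
ENDPOINTS_RE = re.compile(r" IP (\S+) > (\S+): ")

Fingerprint = Tuple[int, int, int]
Endpoints = Tuple[str, str]


def parse(line: str) -> Optional[Tuple[Fingerprint, Endpoints]]:
    """Extract (TS val, ecr, length) and (source, destination) from a line."""
    ts = TS_RE.search(line)
    ends = ENDPOINTS_RE.search(line)
    if ts is None or ends is None:
        return None
    val, ecr, length = (int(x) for x in ts.groups())
    return (val, ecr, length), (ends.group(1), ends.group(2))


def correlate(capture_a: List[str], capture_b: List[str]) -> List[Tuple[Endpoints, Endpoints]]:
    """Pair packets from two captures that share the same fingerprint."""
    seen_b: Dict[Fingerprint, Endpoints] = {}
    for line in capture_b:
        parsed = parse(line)
        if parsed is not None:
            seen_b[parsed[0]] = parsed[1]
    pairs = []
    for line in capture_a:
        parsed = parse(line)
        if parsed is not None and parsed[0] in seen_b:
            pairs.append((parsed[1], seen_b[parsed[0]]))
    return pairs
```

Each returned pair shows the endpoints as seen on the sending pod next to the endpoints as seen on the receiving pod, which exposes the address rewrite directly.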

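3. The truth comes out

At this point the reason the meet failed is fairly clear. Without any announce settings, slave-1 meets master-1 from its pod IP (10.42.0.237), and by the time the meet packet reaches the master-1 pod its source has been NATed to 10.42.0.1. master-1 then tries to connect back to slave-1 using the default bus port 16379 together with the source IP taken from the packet (10.42.0.1). Since 10.42.0.1 is not actually a Redis pod, there is no redis-server process listening on 16379 there, and the connection to 10.42.0.1:16379 fails with connection refused.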
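
The reasoning above can be condensed into a small Python model. This is a simplification of the behaviour described in this article, not a transcription of the Redis source: the receiver uses the announced values when the sender has set them, and otherwise falls back to the source IP it observed plus the default ports. The function name and parameters are hypothetical.

```python
from typing import Optional, Tuple

BUS_PORT_OFFSET = 10000  # default bus port = data port + 10000


def endpoint_seen_by_peer(
    source_ip: str,
    listen_port: int = 6379,
    announce_ip: Optional[str] = None,
    announce_port: Optional[int] = None,
    announce_bus_port: Optional[int] = None,
) -> Tuple[str, int, int]:
    """Return the (ip, port, bus port) a peer will use to connect back."""
    ip = announce_ip if announce_ip else source_ip
    port = announce_port if announce_port else listen_port
    bus_port = announce_bus_port if announce_bus_port else listen_port + BUS_PORT_OFFSET
    return ip, port, bus_port


if __name__ == "__main__":
    # No announce settings: the NATed source address and default ports are used.
    print(endpoint_seen_by_peer("10.42.0.1"))
    # With announce settings: the NodePort address is used instead.
    print(endpoint_seen_by_peer("10.42.0.1", announce_ip="172.18.0.2",
                                announce_port=31309, announce_bus_port=31153))
```

The first call yields 10.42.0.1 with bus port 16379, the unreachable target from the log; the second yields an address the master can actually reach.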

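Fixing the problem

1. slave-1 announce & remeet

Once the cause is known, the problem is fairly easy to solve.

For a failed meet like this one, slave-1 can announce its ip/port/bus-port and then actively meet again, so that the connection back is made to the announced address.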

slc@slcmac redis % kubectl exec -it redisc-shard-5g8-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-ip 172.18.0.2
slc@slcmac redis % kubectl exec -it redisc-shard-5g8-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-port 31309
slc@slcmac redis % kubectl exec -it redisc-shard-5g8-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-bus-port 31153

# Run cluster nodes on redisc-shard-5g8-1: the newly announced address and ports are now in use
127.0.0.1:6379> cluster nodes
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 master - 0 1713334354116 0 connected 0-5461
# before the announce this entry was :6379@16379
3a136cd50eb3f2c0dcc3844a0de63d5e44b462d7 172.18.0.2:31309@31153 myself,slave ff935854b7626a7e4374598857d5fbe998297799 0 0 0 connected
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713334354325 2 connected 10923-16383
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 master - 0 1713334354532 1 connected 5462-10922

# meet master-1 again
127.0.0.1:6379> cluster meet 172.18.0.2 30039 32461
OK

On master-1 we can see the difference before and after the meet:

root@redisc-shard-5g8-0:/data# redis-cli -a O3605v7HsS
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> cluster nodes
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 myself,master - 0 1713334463000 0 connected 0-5461
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713334463613 2 connected 10923-16383
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 master - 0 1713334463613 1 connected 5462-10922
127.0.0.1:6379> cluster nodes
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 myself,master - 0 1713334506000 0 connected 0-5461
3a136cd50eb3f2c0dcc3844a0de63d5e44b462d7 172.18.0.2:31309@31153 slave ff935854b7626a7e4374598857d5fbe998297799 0 1713334506133 0 connected
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713334506133 2 connected 10923-16383
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 master - 0 1713334506233 1 connected 5462-10922

On master-1 there is now one additional gossip connection to slave-1:

root@redisc-shard-5g8-0:/data# netstat -anop | grep redis
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 127.0.0.1:6379          127.0.0.1:46798         ESTABLISHED 1/redis-server *:63  keepalive (22.34/0/0)
tcp        0      0 10.42.0.236:58412       172.18.0.2:31879        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.236:6379        10.42.0.1:45255         ESTABLISHED 1/redis-server *:63  keepalive (22.15/0/0)
tcp        0      0 10.42.0.236:43732       172.18.0.2:31153        ESTABLISHED 1/redis-server *:63  off (0.00/0/0) // to slave-1 nodeport
tcp        0      0 10.42.0.236:36528       172.18.0.2:30105        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.236:16379       10.42.0.1:16471         ESTABLISHED 1/redis-server *:63  keepalive (1.17/0/0)
tcp        0      0 10.42.0.236:16379       10.42.0.1:30788         ESTABLISHED 1/redis-server *:63  keepalive (0.97/0/0)
tcp        0      0 10.42.0.236:16379       10.42.0.1:20521         ESTABLISHED 1/redis-server *:63  keepalive (1.48/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

On slave-1 there are now three additional gossip connections, coming from master-1/2/3:

root@redisc-shard-5g8-1:/# netstat -anop | grep redis
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.237:48424       172.18.0.2:31879        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.237:16379       10.42.0.1:35577         ESTABLISHED 1/redis-server *:63  keepalive (1.11/0/0) // from NAT master
tcp        0      0 10.42.0.237:36154       172.18.0.2:32461        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.237:16379       10.42.0.1:32078         ESTABLISHED 1/redis-server *:63  keepalive (0.15/0/0) // from NAT master
tcp        0      0 10.42.0.237:33504       172.18.0.2:30039        ESTABLISHED 1/redis-server *:63  keepalive (0.00/0/0)
tcp        0      0 127.0.0.1:6379          127.0.0.1:46948         ESTABLISHED 1/redis-server *:63  keepalive (0.00/0/0)
tcp        0      0 10.42.0.237:58576       172.18.0.2:30105        ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 10.42.0.237:16379       10.42.0.1:35265         ESTABLISHED 1/redis-server *:63  keepalive (1.22/0/0) // from NAT master
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

These three connections were in fact made by the masters through slave-1's NodePort; on the Pod their source shows up NATed to the cni0 address.

2. slave-2 announce & meet

Announce ip/port/bus-port:

slc@slcmac redis % kubectl exec -it redisc-shard-hxx-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-ip 172.18.0.2
slc@slcmac redis % kubectl exec -it redisc-shard-hxx-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-port 30662
slc@slcmac redis % kubectl exec -it redisc-shard-hxx-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-bus-port 30960
slc@slcmac redis % kubectl exec -it redisc-shard-hxx-1 -c redis-cluster -- /bin/bash

Add-node slave-2 (this step includes the meet operation):

redis-cli -a O3605v7HsS --cluster add-node 172.18.0.2:30662 172.18.0.2:30182 --cluster-slave --cluster-master-id a54e8fa9474c620154f4c1abc9628116deb3dc7e
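
Since the same announce-then-add-node sequence is repeated for every replica, it can be generated rather than typed. The Python sketch below only builds the command strings used in this walkthrough and does not execute anything. The helper names are hypothetical, and the container name redis-cluster is taken from the commands above.

```python
import shlex
from typing import List


def announce_commands(pod: str, ip: str, port: int, bus_port: int,
                      password: str, container: str = "redis-cluster") -> List[str]:
    """Build the three `config set cluster-announce-*` commands for a pod."""
    base = ["kubectl", "exec", "-it", pod, "-c", container, "--",
            "redis-cli", "-a", password, "config", "set"]
    settings = [
        ("cluster-announce-ip", ip),
        ("cluster-announce-port", str(port)),
        ("cluster-announce-bus-port", str(bus_port)),
    ]
    return [shlex.join(base + [name, value]) for name, value in settings]


def add_node_command(new_node: str, existing_node: str,
                     master_id: str, password: str) -> str:
    """Build the redis-cli command that joins new_node as a replica."""
    return shlex.join([
        "redis-cli", "-a", password, "--cluster", "add-node",
        new_node, existing_node,
        "--cluster-slave", "--cluster-master-id", master_id,
    ])
```

Announcing before add-node matters here: as shown for slave-1, a node that joins without announce settings cannot be reached when the master connects back.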

Cluster topology seen from slave-2:

127.0.0.1:6379> cluster nodes
3a136cd50eb3f2c0dcc3844a0de63d5e44b462d7 172.18.0.2:31309@31153 slave ff935854b7626a7e4374598857d5fbe998297799 0 1713335442641 0 connected
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 master - 0 1713335442328 1 connected 5462-10922
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713335442328 2 connected 10923-16383
4d497f9b4ff459b8c65f50afa6621e122e1d8470 172.18.0.2:30662@30960 myself,slave a54e8fa9474c620154f4c1abc9628116deb3dc7e 0 1713335442000 1 connected
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 master - 0 1713335442641 0 connected 0-5461

Cluster topology seen from master-2:

127.0.0.1:6379> cluster nodes
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713335448690 2 connected 10923-16383
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 master - 0 1713335448892 0 connected 0-5461
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 myself,master - 0 1713335448000 1 connected 5462-10922
4d497f9b4ff459b8c65f50afa6621e122e1d8470 172.18.0.2:30662@30960 slave a54e8fa9474c620154f4c1abc9628116deb3dc7e 0 1713335448998 1 connected
3a136cd50eb3f2c0dcc3844a0de63d5e44b462d7 172.18.0.2:31309@31153 slave ff935854b7626a7e4374598857d5fbe998297799 0 1713335448794 0 connected

3. slave-3 announce & meet

Announce first, then add-node:

slc@slcmac redis % kubectl exec -it redisc-shard-xwz-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-ip 172.18.0.2
slc@slcmac redis % kubectl exec -it redisc-shard-xwz-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-port 30110
slc@slcmac redis % kubectl exec -it redisc-shard-xwz-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-bus-port 30971
slc@slcmac redis % kubectl exec -it redisc-shard-xwz-1 -c redis-cluster -- /bin/bash
root@redisc-shard-xwz-1:/# redis-cli -a O3605v7HsS --cluster add-node 172.18.0.2:30110 172.18.0.2:31993 --cluster-slave --cluster-master-id e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b
>>> Adding node 172.18.0.2:30110 to cluster 172.18.0.2:31993
>>> Performing Cluster Check (using node 172.18.0.2:31993)
M: e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993
   slots:[10923-16383] (5461 slots) master
M: ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039
   slots:[0-5461] (5462 slots) master
   1 additional replica(s)
S: 3a136cd50eb3f2c0dcc3844a0de63d5e44b462d7 172.18.0.2:31309
   slots: (0 slots) slave
   replicates ff935854b7626a7e4374598857d5fbe998297799
M: a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182
   slots:[5462-10922] (5461 slots) master
   1 additional replica(s)
S: 4d497f9b4ff459b8c65f50afa6621e122e1d8470 172.18.0.2:30662
   slots: (0 slots) slave
   replicates a54e8fa9474c620154f4c1abc9628116deb3dc7e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 172.18.0.2:30110 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 172.18.0.2:31993.
[OK] New node added correctly.

Check the cluster topology on any master (master-2 here):

127.0.0.1:6379> cluster nodes
e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 172.18.0.2:31993@30105 master - 0 1713335724101 2 connected 10923-16383
ff935854b7626a7e4374598857d5fbe998297799 172.18.0.2:30039@32461 master - 0 1713335724101 0 connected 0-5461
a54e8fa9474c620154f4c1abc9628116deb3dc7e 172.18.0.2:30182@31879 myself,master - 0 1713335724000 1 connected 5462-10922
4d497f9b4ff459b8c65f50afa6621e122e1d8470 172.18.0.2:30662@30960 slave a54e8fa9474c620154f4c1abc9628116deb3dc7e 0 1713335724404 1 connected
3a136cd50eb3f2c0dcc3844a0de63d5e44b462d7 172.18.0.2:31309@31153 slave ff935854b7626a7e4374598857d5fbe998297799 0 1713335724510 0 connected
161ff6ea42047be45d986ed8ba4505afd07096d9 172.18.0.2:30110@30971 slave e4d9b914e7ee7c4fd399bdf3dd1c98f7a0a1791b 0 1713335724101 2 connected

The cluster is now in its complete form, with 3 masters and 3 replicas.
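
A final state like this can be checked automatically from a cluster nodes reply. The Python sketch below assumes the standard CLUSTER NODES line layout (flags in the third field, master ID in the fourth, slot ranges from the ninth field onward) and reports whether all 16384 slots are covered and how many replicas each master has. check_topology is a hypothetical helper, not part of KubeBlocks.

```python
from typing import Dict, Tuple


def check_topology(cluster_nodes_output: str) -> Tuple[bool, Dict[str, int]]:
    """Return (all slots covered, replica count per master ID)."""
    covered = set()
    replicas: Dict[str, int] = {}
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        node_id, flags, master_id = fields[0], fields[2].split(","), fields[3]
        if "master" in flags:
            replicas.setdefault(node_id, 0)
            for slot_range in fields[8:]:
                if slot_range.startswith("["):
                    continue  # skip migrating/importing markers
                first, _, last = slot_range.partition("-")
                covered.update(range(int(first), int(last or first) + 1))
        elif "slave" in flags:
            replicas[master_id] = replicas.get(master_id, 0) + 1
    return len(covered) == 16384, replicas
```

For the topology above the expected result is full coverage with one replica per master; a master with a count of 0 would point to a replica that failed to join, as slave-1 did at first.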

About CNI

1. k3s + Flannel + NodePort/Pod

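The default CNI of k3s/k3d is flannel. As analyzed above, flannel has the NAT mapping problem.

2. k3s + Calico + NodePort

We also tested k3s + Calico, where Calico uses vxlan to build the Pod network. The test showed that the NAT problem still exists on Calico when NodePort is used. Suppose the NodePort in use is 10.128.0.52:32135. In the inbound direction, for traffic to the local port 16379, the NodePort address (10.128.0.52) is still translated to the address of the vxlan.calico network device on the Node's host (192.168.238.0).

These are the network connections of one of the slaves: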

root@redisc-shard-ffv-1:/# netstat -anop | grep redis
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:41800    10.128.0.52:32135       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:45578    10.128.0.52:31952       ESTABLISHED 1/redis-server *:63  keepalive (277.76/0/0) // to a remote NodePort
tcp        0      0 127.0.0.1:6379          127.0.0.1:45998         ESTABLISHED 1/redis-server *:63  keepalive (185.62/0/0)
tcp        0      0 192.168.32.136:53280    10.128.0.52:32675       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    192.168.238.0:8740      ESTABLISHED 1/redis-server *:63  keepalive (8.79/0/0) // from a remote NodePort, after NAT
tcp        0      0 192.168.32.136:16379    192.168.238.0:9617      ESTABLISHED 1/redis-server *:63  keepalive (1.70/0/0)
tcp        0      0 192.168.32.136:34040    10.128.0.52:31454       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    192.168.238.0:18110     ESTABLISHED 1/redis-server *:63  keepalive (1.82/0/0)
tcp        0      0 192.168.32.136:39006    10.128.0.52:30390       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    192.168.238.0:32651     ESTABLISHED 1/redis-server *:63  keepalive (1.57/0/0)
tcp        0      0 192.168.32.136:54986    10.128.0.52:30459       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    192.168.238.0:43310     ESTABLISHED 1/redis-server *:63  keepalive (1.83/0/0)
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

On Node 10.128.0.52 we can see two devices:

ens4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1460
        inet 10.128.0.52  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::4001:aff:fe80:34  prefixlen 64  scopeid 0x20<link>
        ether 42:01:0a:80:00:34  txqueuelen 1000  (Ethernet)
        RX packets 3228477  bytes 3975395572 (3.9 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3025699  bytes 2382110168 (2.3 GB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
vxlan.calico: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1410
        inet 192.168.238.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::64b2:cdff:fe99:7f96  prefixlen 64  scopeid 0x20<link>
        ether 66:b2:cd:99:7f:96  txqueuelen 1000  (Ethernet)
        RX packets 587707  bytes 714235654 (714.2 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 810205  bytes 682665081 (682.6 MB)
        TX errors 0  dropped 31 overruns 0  carrier 0  collisions 0

如果 NodePort 使用的 Node 为 Pod 所在的主机,在 Calico 中不会被 NAT。

slc@cluster-1:~$ kubectl exec -it redisc-shard-ffv-1 -c redis-cluster -- redis-cli -a O3605v7HsS config set cluster-announce-ip 10.128.0.54 # set the announced IP to the IP of the local Node that hosts the Pod
OK
slc@cluster-1:~$ kubectl exec -it redisc-shard-ffv-1 -c redis-cluster -- /bin/bash
root@redisc-shard-ffv-1:/# netstat -anop | grep redis
tcp        0      0 0.0.0.0:16379           0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    10.128.0.54:44757       ESTABLISHED 1/redis-server *:63  keepalive (6.92/0/0)
tcp        0      0 192.168.32.136:41800    10.128.0.52:32135       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    10.128.0.54:16772       ESTABLISHED 1/redis-server *:63  keepalive (0.64/0/0)
tcp        0      0 192.168.32.136:45578    10.128.0.52:31952       ESTABLISHED 1/redis-server *:63  keepalive (70.79/0/0)
tcp        0      0 127.0.0.1:6379          127.0.0.1:45998         ESTABLISHED 1/redis-server *:63  keepalive (0.00/0/0)
tcp        0      0 192.168.32.136:53280    10.128.0.52:32675       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    10.128.0.54:16440       ESTABLISHED 1/redis-server *:63  keepalive (8.62/0/0)
tcp        0      0 192.168.32.136:34040    10.128.0.52:31454       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    10.128.0.54:28655       ESTABLISHED 1/redis-server *:63  keepalive (0.14/0/0)
tcp        0      0 192.168.32.136:39006    10.128.0.52:30390       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:54986    10.128.0.52:30459       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    10.128.0.54:29959       ESTABLISHED 1/redis-server *:63  keepalive (8.62/0/0)
tcp6       0      0 :::16379                :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)
tcp6       0      0 :::6379                 :::*                    LISTEN      1/redis-server *:63  off (0.00/0/0)

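So in the Calico VXLAN setup, whether NodePort traffic is SNATed depends on the source Node address: traffic from the local Node is not SNATed, while traffic from a remote Node is. Because we announce the address explicitly, the Redis Cluster meet still succeeds in both cases.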
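To check this behaviour on a live Pod, one option is to group the inbound cluster-bus connections in the netstat output by peer address. The following is a minimal Python sketch, not part of KubeBlocks or Redis. The function names `inbound_bus_peers` and `classify_peer` are hypothetical, and the code assumes the `netstat -anop` column layout shown above and the bus port 16379.

```python
from collections import Counter

BUS_PORT = 16379  # cluster bus port seen in the article's netstat output


def inbound_bus_peers(netstat_text, bus_port=BUS_PORT):
    """Count ESTABLISHED IPv4 connections accepted on the bus port, grouped by peer IP."""
    peers = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) < 6 or fields[0] != "tcp" or fields[5] != "ESTABLISHED":
            continue
        _, _, local_port = fields[3].rpartition(":")
        peer_ip, _, _ = fields[4].rpartition(":")
        if local_port == str(bus_port):
            peers[peer_ip] += 1
    return peers


def classify_peer(peer_ip, local_node_ip, remote_tunnel_ips):
    """Label a peer address; both inputs are assumptions supplied by the caller."""
    if peer_ip in remote_tunnel_ips:
        return "remote node (SNAT to tunnel address)"
    if peer_ip == local_node_ip:
        return "local node (no SNAT)"
    return "other"


if __name__ == "__main__":
    sample = """\
tcp        0      0 192.168.32.136:16379    192.168.238.0:18110     ESTABLISHED 1/redis-server *:63  keepalive (1.82/0/0)
tcp        0      0 192.168.32.136:39006    10.128.0.52:30390       ESTABLISHED 1/redis-server *:63  off (0.00/0/0)
tcp        0      0 192.168.32.136:16379    10.128.0.54:44757       ESTABLISHED 1/redis-server *:63  keepalive (6.92/0/0)
"""
    for ip, count in inbound_bus_peers(sample).items():
        label = classify_peer(ip, "10.128.0.54", {"192.168.238.0"})
        print(ip, count, label)
```

Here 192.168.238.0 is the `vxlan.calico` address of Node 10.128.0.52 and 10.128.0.54 is the Node that hosts the Pod, both taken from the output above. In another cluster these values have to be looked up first.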

3. k3s + Calico + Pod

If only Pod IPs are used, the Redis Cluster meet succeeds and the cluster topology is correct.

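Summary

1. In some k8s versions, depending on the CNI implementation, Pod and NodePort traffic may go through NAT. Other members of the cluster cannot connect back to the translated IP and port, so the meet fails.

2. Because of this mechanism, a Redis Cluster created on k8s must either use the host network, or use NodePort and explicitly announce ip/port/bus-port. For a pure Pod network without explicit announce, NAT has to be ruled out entirely, and that depends on the CNI implementation.

3. Redis Cluster shares one set of IP addresses between internal and external communication. Once an announce IP is set, it overrides the Pod IP for all subsequent communication, so the internal gossip negotiation also travels over the announce network. This is unnecessary overhead, so our suggestion for the future is to separate the internal protocol link from the data link used by external applications.

4. Even if the Pod IP and the announce IP are used separately, with internal communication on the Pod network and the data link between the cluster and external clients on the announce network, the CNI NAT problem remains. Because of Redis Cluster's connect-back mechanism, a node cannot connect back directly to an address that has been through NAT, so the Redis Cluster communication protocol itself needs to be extended. Ideally: 1) internal communication uses the Pod network, needs connect-back, and carries the original Pod IP as the source IP, so that the source IP can be recovered even after NAT; 2) external communication uses the announce network, which can be NodePort or LoadBalancer, needs no connect-back, and is unaffected by NAT. Internal communication can of course also go through NodePort or LoadBalancer, provided that the original source IP is carried along (the announce IP is in effect a kind of source IP). This is KubeBlocks' current solution.

5. Using NodePort introduces another problem: when a Node goes down, the announce IP of the cluster nodes has to be updated. This is not easy to implement and requires close cooperation between the operator and the HA nodes.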
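Points 2 and 5 can be illustrated with a minimal Python sketch. It is not KubeBlocks' actual implementation. The `cluster-announce-ip`, `cluster-announce-port` and `cluster-announce-bus-port` directives are real Redis settings, but the helper names `announce_directives` and `stale_announcements`, the NodePort numbers, and the second Pod name are hypothetical values chosen for illustration.

```python
def announce_directives(node_ip, node_port, bus_node_port):
    """Render the redis.conf lines that announce a NodePort address explicitly."""
    return [
        f"cluster-announce-ip {node_ip}",
        f"cluster-announce-port {node_port}",
        f"cluster-announce-bus-port {bus_node_port}",
    ]


def stale_announcements(announced_ip_by_pod, ready_node_ips):
    """Return the Pods whose announced IP no longer belongs to a ready Node.

    These are the cluster nodes whose announce IP would need to be
    updated after a Node goes down.
    """
    return sorted(
        pod
        for pod, ip in announced_ip_by_pod.items()
        if ip not in ready_node_ips
    )


if __name__ == "__main__":
    # Hypothetical NodePorts for the data port and the bus port.
    print("\n".join(announce_directives("10.128.0.52", 30001, 30002)))

    announced = {
        "redisc-shard-ffv-1": "10.128.0.54",
        "redisc-shard-abc-0": "10.128.0.52",  # hypothetical Pod name
    }
    # Assume Node 10.128.0.52 has gone down and only 10.128.0.54 is ready.
    print(stale_announcements(announced, {"10.128.0.54"}))
```

The sketch only identifies which announcements are stale. Choosing a replacement Node and applying the new address without disturbing the cluster is the part that needs the operator and the HA nodes to cooperate.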

