Redis 6.0 Cluster Installation and Deployment
{Redis Cluster setup, Redis cluster deployment, Redis master-replica}
create-time:2022-04-26
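This article walks you through building a Redis Cluster hands-on.
Preface
Redis Cluster deployments are mainly used in large caching architectures. Smaller architectures usually handle their load with Redis master-replica replication plus a Sentinel group.
With Redis Cluster you can scale the cluster dynamically: add and remove nodes, reshard slots, and rely on automatic failover.
A typical Redis Cluster uses three masters and three replicas, one replica per master. Keep each master and its replica on different machines wherever possible, so that a single machine failure cannot bring down the cluster; this is what keeps the cluster highly available.
I. Cluster Planning
Six servers host the nodes: 3 masters and 3 replicas, one node per server.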
Server | Role | IP:port |
db-hs-1-40.bohai.org | master1 | 10.96.1.40:7000 |
db-hs-1-104.bohai.org | master2 | 10.96.1.104:7000 |
db-hs-1-112.bohai.org | master3 | 10.96.1.112:7000 |
db-hs-1-205.bohai.org | slave1 | 10.96.1.205:7000 |
db-hs-1-254.bohai.org | slave2 | 10.96.1.254:7000 |
db-hs-1-167.bohai.org | slave3 | 10.96.1.167:7000 |
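The plan above can be captured once in shell variables so that later commands reuse the same addresses. This is only a minimal sketch: NODE_IPS, NODE_PORT and cluster_endpoints are hypothetical names introduced here, not part of Redis.

```shell
# Node inventory taken from the planning table (masters first, then replicas).
NODE_IPS="10.96.1.40 10.96.1.104 10.96.1.112 10.96.1.205 10.96.1.254 10.96.1.167"
NODE_PORT=7000

# Print "ip:port" for every node on a single line, in table order.
# The body runs in a subshell, so its variables stay local.
cluster_endpoints() (
  out=""
  for ip in $NODE_IPS; do
    out="${out:+$out }${ip}:${NODE_PORT}"
  done
  printf '%s\n' "$out"
)
```

The printed list could, for example, be passed to redis-cli --cluster create in place of the hand-typed addresses.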
Software versions:
OS: CentOS 7.6
Redis: redis-6.0.9
II. Server Settings
Change the hostname
Method 1. Use the hostnamectl command: hostnamectl set-hostname name. Then check that the change took effect with hostname or hostnamectl status.
# hostnamectl set-hostname db-hs-1-167.bohai.org
Method 2. Edit the /etc/hostname configuration file directly with a text editor:
# vim /etc/hostname
db-hs-1-167.bohai.org
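Kernel settings
1. Make /etc/rc.d/rc.local executable (on CentOS 7, /etc/rc.local is a symlink to it), so that the commands appended to it in the steps below run at boot:
# chmod +x /etc/rc.d/rc.local
2. Add vm.overcommit_memory = 1 to /etc/sysctl.conf, then run sysctl vm.overcommit_memory=1 to apply it immediately.
3. Make sure the Linux kernel feature Transparent Huge Pages is disabled; it has a strongly negative effect on memory usage and latency. This can be done with the following commands.
Run this first (takes effect immediately, lost on reboot):
echo never> /sys/kernel/mm/transparent_hugepage/enabled
Then make it permanent:
vim /etc/rc.local
Append: echo never>/sys/kernel/mm/transparent_hugepage/enabled
4. Fix the Redis startup warning "The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128".
Run this first (takes effect immediately, lost on reboot):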
# cat /proc/sys/net/core/somaxconn
128
# echo 511 > /proc/sys/net/core/somaxconn
Then make it permanent:
vim /etc/rc.local
Append: echo 511 > /proc/sys/net/core/somaxconn
# tail -n2 /etc/rc.local
echo never>/sys/kernel/mm/transparent_hugepage/enabled
echo 511 > /proc/sys/net/core/somaxconn
5. Reboot the server.
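After the reboot it is worth confirming that the kernel settings survived. The helpers below are a sketch with hypothetical names (thp_disabled, somaxconn_ok, append_once); each takes the file to inspect as an argument so it can be tried on a copy first.

```shell
# Succeeds when transparent huge pages are disabled, i.e. the kernel file
# shows "never" as the selected value, as in "always madvise [never]".
thp_disabled() {
  grep -q '\[never\]' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Succeeds when the somaxconn value stored in the given file is at least 511.
somaxconn_ok() {
  [ "$(cat "${1:-/proc/sys/net/core/somaxconn}")" -ge 511 ]
}

# Append a line to a file only if that exact line is not already present,
# so that re-running the setup does not duplicate entries in /etc/rc.local.
append_once() {
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}
```

Called without arguments, thp_disabled and somaxconn_ok read the live kernel files; append_once takes the line and the target file, for example /etc/rc.local.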
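III. Redis Installation
Perform the following steps on each of the 6 servers.
1. Avoiding the GCC version problem
Redis is written in C, so building it requires a C compiler toolchain.
The key requirement for building Redis 6 is GCC 5 or later, while the GCC shipped with CentOS 7.6 is 4.8.x, so GCC must be upgraded before installing (check the version with gcc --version).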
# gcc --version
4.8.5
# yum -y install gcc tcl
rpm -ivh https://cbs.centos.org/kojifiles/packages/centos-release-scl-rh/2/3.el7.centos/noarch/centos-release-scl-rh-2-3.el7.centos.noarch.rpm
rpm -ivh https://cbs.centos.org/kojifiles/packages/centos-release-scl/2/3.el7.centos/noarch/centos-release-scl-2-3.el7.centos.noarch.rpm
yum -y install centos-release-scl
yum -y install devtoolset-9-gcc devtoolset-9-gcc-c++ devtoolset-9-binutils
Check the GCC version (still the system GCC, because devtoolset-9 has not been enabled yet):
# gcc -v
gcc version 4.8.5
scl enable only takes effect for the current shell; after you exit it, the system GCC version is back:
# scl enable devtoolset-9 bash
The following command enables it permanently (it applies to new login shells):
# echo "source /opt/rh/devtoolset-9/enable" >> /etc/profile
gcc -v
gcc version 9.3.1
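The version check can also be scripted so that the build stops early when the compiler is too old. A minimal sketch; version_major_at_least is a hypothetical helper, and the threshold of 5 is the one stated in this article.

```shell
# Succeeds when the major component of a version string such as "9.3.1"
# (or just "9") is greater than or equal to the required major version.
# The body runs in a subshell, so "major" does not leak into the caller.
version_major_at_least() (
  major=${1%%.*}
  [ "$major" -ge "$2" ]
)

# Possible use before running make (gcc -dumpversion prints the version only):
#   version_major_at_least "$(gcc -dumpversion)" 5 || echo "GCC is too old for Redis 6"
```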
2. Download and install
cd /opt
wget http://download.redis.io/releases/redis-6.0.9.tar.gz
tar -xvf redis-6.0.9.tar.gz
cd redis-6.0.9
make MALLOC=libc
make install PREFIX=/usr/local/redis
List /usr/local/redis/bin; if the Redis tools are there, Redis was installed successfully:
[root@db-hs-1-40 redis-6.0.9]# ll /usr/local/redis/bin
total 18324
-rwxr-xr-x 1 root root 728952 Apr 27 11:17 redis-benchmark
-rwxr-xr-x 1 root root 5658256 Apr 27 11:17 redis-check-aof
-rwxr-xr-x 1 root root 5658256 Apr 27 11:17 redis-check-rdb
-rwxr-xr-x 1 root root 1049336 Apr 27 11:17 redis-cli
lrwxrwxrwx 1 root root 12 Apr 27 11:17 redis-sentinel -> redis-server
-rwxr-xr-x 1 root root 5658256 Apr 27 11:17 redis-server
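On six servers, checking the listing by eye gets tedious. The sketch below prints any tool from the listing above that is missing; missing_redis_tools is a hypothetical helper, and the default path is the PREFIX used in this article.

```shell
# Print the name of every expected Redis tool that is missing or not
# executable in the given bin directory; prints nothing if all are present.
missing_redis_tools() (
  bindir=${1:-/usr/local/redis/bin}
  for tool in redis-server redis-cli redis-benchmark \
              redis-check-aof redis-check-rdb redis-sentinel; do
    [ -x "$bindir/$tool" ] || printf '%s\n' "$tool"
  done
)
```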
3. Redis configuration
Create the directories that Redis will use:
mkdir -p /usr/local/redis/run
mkdir -p /usr/local/redis/log
mkdir -p /usr/local/redis/data/7000
mkdir -p /usr/local/redis/conf
Set up the Redis configuration file:
cp /opt/redis-6.0.9/redis.conf /usr/local/redis/conf/redis_7000.conf
vi /usr/local/redis/conf/redis_7000.conf
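Open redis_7000.conf and change the settings below. The trailing # notes are annotations for this article only; redis.conf does not accept a comment on the same line as a directive, so leave them out of the real file:
bind 10.96.1.40 # this machine's IP (10.96.1.40 on master1)
port 7000 # port
pidfile /usr/local/redis/run/redis_7000.pid # PID file path
logfile /usr/local/redis/log/redis_7000.log # log file path
dir /usr/local/redis/data/7000 # data directory; it must be created beforehand
cluster-enabled yes # enable cluster mode
cluster-config-file nodes-7000.conf # cluster node state file; Redis maintains it and it must not be edited by hand. Make sure every cluster node uses a different file
cluster-node-timeout 15000 # node timeout in milliseconds; a node unreachable for longer than this is considered failed
appendonly yes # enable AOF persistence
daemonize yes # run as a daemon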
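Because only the IP differs between the six servers, the per-node settings can be generated instead of typed. The sketch below assumes the directory layout of this article; render_node_conf is a hypothetical helper, and the values simply repeat the list above.

```shell
# Print the per-node overrides for one node. Arguments: IP and port.
# Paths follow the /usr/local/redis layout used in this article.
render_node_conf() (
  ip=$1
  port=$2
  base=/usr/local/redis
  cat <<EOF
bind $ip
port $port
pidfile $base/run/redis_$port.pid
logfile $base/log/redis_$port.log
dir $base/data/$port
cluster-enabled yes
cluster-config-file nodes-$port.conf
cluster-node-timeout 15000
appendonly yes
daemonize yes
EOF
)
```

For example, render_node_conf 10.96.1.104 7000 prints the block for master2. Redis uses the last processed value of a directive, so the output could be appended to the copied redis.conf, although editing the existing lines by hand as described above works just as well.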
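The complete configuration file, as used on master1 (10.96.1.40), is shown below. Note that it sets cluster-node-timeout to 1500 and cluster-config-file to 1.40-7000.conf, which differ from the values listed above: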
bind 10.96.1.40
protected-mode yes
port 7000
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize yes
supervised no
pidfile /usr/local/redis/run/redis_7000.pid
loglevel notice
logfile /usr/local/redis/log/redis_7000.log
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
rdb-del-sync-files no
dir /usr/local/redis/data/7000
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-diskless-load disabled
repl-disable-tcp-nodelay no
replica-priority 100
acllog-max-len 128
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
replica-lazy-flush no
lazyfree-lazy-user-del no
oom-score-adj no
oom-score-adj-values 0 200 800
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
stream-node-max-bytes 4096
stream-node-max-entries 100
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
dynamic-hz yes
aof-rewrite-incremental-fsync yes
rdb-save-incremental-fsync yes
jemalloc-bg-thread yes
cluster-enabled yes
cluster-config-file 1.40-7000.conf
cluster-node-timeout 1500
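With a file this long it is easy to lose track of which value is actually in effect. The sketch below reads a single directive back from a redis.conf-style file; conf_get is a hypothetical helper, not a Redis tool.

```shell
# Print the value of a directive from a redis.conf-style file.
# Arguments: file, directive name. Comment lines never match because their
# first field starts with "#". If the directive occurs several times, the
# last occurrence is printed. Fails if the directive is absent.
conf_get() {
  awk -v key="$2" '
    $1 == key { $1 = ""; sub(/^[ \t]+/, ""); val = $0; found = 1 }
    END { if (found) print val; else exit 1 }
  ' "$1"
}
```

For example, conf_get /usr/local/redis/conf/redis_7000.conf cluster-node-timeout shows which timeout a node was really started with.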
4. Create the start and stop scripts
# cd /usr/local/redis/bin
Start script:
# vi cluster_start.sh
./redis-server ../conf/redis_7000.conf
# chmod +x cluster_start.sh
Stop script:
# vi cluster_shutdown.sh
pgrep redis-server | xargs kill -9
# chmod +x cluster_shutdown.sh
5. Start and stop Redis:
Start Redis:
# ./cluster_start.sh
ps -ef|grep redis
root 16045 1 0 11:35 ? 00:00:00 ./redis-server 10.96.1.40:7000 [cluster]
Stop Redis:
# ./cluster_shutdown.sh
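The stop script above kills the process outright, which suits the failover drill later on but gives Redis no chance to save its data first. For routine restarts, a gentler alternative is the SHUTDOWN command sent through redis-cli. The sketch below is an assumption of how that could be wrapped; graceful_stop and the DRY_RUN switch are hypothetical names.

```shell
# Print (DRY_RUN=1) or run a graceful shutdown for one node.
# Arguments: IP and optional port (default 7000).
# SHUTDOWN asks Redis to persist its data and exit cleanly, unlike kill -9.
graceful_stop() {
  cmd="/usr/local/redis/bin/redis-cli -h $1 -p ${2:-7000} shutdown"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$cmd"
  else
    $cmd
  fi
}
```

Running it with DRY_RUN=1 first shows the exact command without touching the node.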
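IV. Redis Cluster
Before creating the cluster, start the Redis service on every node, then run the following command on any one of the Redis servers.
1. Create the cluster
In Redis 3.0 and 4.0, clusters were created with redis-trib.rb. Since 5.0, redis-cli can create the cluster directly. This article uses Redis 6.0, so the cluster is created with redis-cli.
Create the cluster with the command below. The --cluster-replicas 1 option requests one replica for every master, which here means 3 masters and 3 replicas: the first three addresses become masters and the last three become replicas.
With this method you cannot choose which master a given replica is attached to; redis-cli tries to place each master and its replica on different machines.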
# redis-cli --cluster create 10.96.1.40:7000 10.96.1.104:7000 10.96.1.112:7000 10.96.1.167:7000 10.96.1.205:7000 10.96.1.254:7000 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.96.1.205:7000 to 10.96.1.40:7000
Adding replica 10.96.1.254:7000 to 10.96.1.104:7000
Adding replica 10.96.1.167:7000 to 10.96.1.112:7000
M: 7f1784b713182540a36e8653422b7c390c3a73c2 10.96.1.40:7000
slots:[0-5460] (5461 slots) master
M: aee7bc0be26213154d24c009fd71f6c12d88dbd8 10.96.1.104:7000
slots:[5461-10922] (5462 slots) master
M: a7b3ff893bd2e8e5c411da785554b838976d680f 10.96.1.112:7000
slots:[10923-16383] (5461 slots) master
S: cf4905804ba0eaa5160319fa680e3bc69ba51cd4 10.96.1.167:7000
replicates a7b3ff893bd2e8e5c411da785554b838976d680f
S: 47f67b951bc42cdd71ea97b896c44792c6abf8f5 10.96.1.205:7000
replicates 7f1784b713182540a36e8653422b7c390c3a73c2
S: 300661faa3bc246289c49f635c0ffe4cc820f03c 10.96.1.254:7000
replicates aee7bc0be26213154d24c009fd71f6c12d88dbd8
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
>>> Performing Cluster Check (using node 10.96.1.40:7000)
M: 7f1784b713182540a36e8653422b7c390c3a73c2 10.96.1.40:7000
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 47f67b951bc42cdd71ea97b896c44792c6abf8f5 10.96.1.205:7000
slots: (0 slots) slave
replicates 7f1784b713182540a36e8653422b7c390c3a73c2
S: 300661faa3bc246289c49f635c0ffe4cc820f03c 10.96.1.254:7000
slots: (0 slots) slave
replicates aee7bc0be26213154d24c009fd71f6c12d88dbd8
M: aee7bc0be26213154d24c009fd71f6c12d88dbd8 10.96.1.104:7000
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
M: a7b3ff893bd2e8e5c411da785554b838976d680f 10.96.1.112:7000
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: cf4905804ba0eaa5160319fa680e3bc69ba51cd4 10.96.1.167:7000
slots: (0 slots) slave
replicates a7b3ff893bd2e8e5c411da785554b838976d680f
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
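The slot ranges in the output above follow from splitting the 16384 hash slots as evenly as possible across the masters. The sketch below reproduces that arithmetic for illustration only; slot_ranges is a hypothetical helper and is not taken from the redis-cli source.

```shell
# Print "index first last count" for each of n masters, splitting the
# 16384 hash slots as evenly as possible.
slot_ranges() {
  awk -v n="$1" 'BEGIN {
    total = 16384
    first = 0
    for (i = 0; i < n; i++) {
      # Round the running boundary to the nearest slot; the last master
      # always ends at slot 16383.
      last = (i == n - 1) ? total - 1 : int((i + 1) * total / n + 0.5) - 1
      printf "%d %d %d %d\n", i, first, last, last - first + 1
      first = last + 1
    }
  }'
}
```

For three masters this yields the ranges 0-5460, 5461-10922 and 10923-16383, matching the allocation shown above.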
2. Check the cluster state
# ./redis-cli -c -h 10.96.1.40 -p 7000 cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:601
cluster_stats_messages_pong_sent:614
cluster_stats_messages_sent:1215
cluster_stats_messages_ping_received:609
cluster_stats_messages_pong_received:601
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:1215
3. List the cluster nodes
# ./redis-cli -c -h 10.96.1.40 -p 7000 cluster nodes
47f67b951bc42cdd71ea97b896c44792c6abf8f5 10.96.1.205:7000@17000 slave 7f1784b713182540a36e8653422b7c390c3a73c2 0 1651038056192 1 connected
300661faa3bc246289c49f635c0ffe4cc820f03c 10.96.1.254:7000@17000 slave aee7bc0be26213154d24c009fd71f6c12d88dbd8 0 1651038056192 2 connected
aee7bc0be26213154d24c009fd71f6c12d88dbd8 10.96.1.104:7000@17000 master - 0 1651038055692 2 connected 5461-10922
a7b3ff893bd2e8e5c411da785554b838976d680f 10.96.1.112:7000@17000 master - 0 1651038056000 3 connected 10923-16383
cf4905804ba0eaa5160319fa680e3bc69ba51cd4 10.96.1.167:7000@17000 slave a7b3ff893bd2e8e5c411da785554b838976d680f 0 1651038056092 3 connected
7f1784b713182540a36e8653422b7c390c3a73c2 10.96.1.40:7000@17000 myself,master - 0 1651038056000 1 connected 0-5460
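Each line of the cluster nodes output has the fields: node ID, address, flags, master ID (or - for a master), ping sent, pong received, config epoch, link state, and slots. Matching replicas to masters by ID is easier with a small filter. A sketch, with replica_map as a hypothetical helper name:

```shell
# Read CLUSTER NODES output on stdin and print one line per replica:
# "<replica address> -> <master address>", in input order.
replica_map() {
  awk '
    {
      split($2, a, "@")          # drop the "@17000" cluster bus suffix
      addr[$1] = a[1]
      flags[$1] = $3
      master[$1] = $4
      order[NR] = $1
    }
    END {
      for (i = 1; i <= NR; i++) {
        id = order[i]
        if (flags[id] ~ /slave/) print addr[id] " -> " addr[master[id]]
      }
    }'
}
```

It can be fed directly from the command above: ./redis-cli -c -h 10.96.1.40 -p 7000 cluster nodes | replica_map.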
V. Test Cases
[root@db-hs-1-40 bin]# ./redis-cli -c -h 10.96.1.40 -p 7000
10.96.1.40:7000> set name node1
-> Redirected to slot [5798] located at 10.96.1.104:7000
OK
[root@db-hs-1-40 bin]# ./redis-cli -c -h 10.96.1.104 -p 7000
10.96.1.104:7000> get name
"node1"
[root@db-hs-1-40 bin]# ./redis-cli -c -h 10.96.1.205 -p 7000
10.96.1.205:7000> get name
-> Redirected to slot [5798] located at 10.96.1.104:7000
"node1"
10.96.1.104:7000> del name
(integer) 1
10.96.1.104:7000> get name
(nil)
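The redirect to slot 5798 in the session above comes from the way Redis Cluster maps keys to slots: CRC16 of the key modulo 16384. The sketch below recomputes it in plain shell. crc16 and key_slot are hypothetical helper names, and hash tags ({...}) are deliberately not handled.

```shell
# CRC16 (XMODEM variant: polynomial 0x1021, initial value 0) of a string,
# computed bit by bit over its bytes. od turns the string into decimal bytes.
crc16() (
  crc=0
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    i=0
    while [ "$i" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      i=$(( i + 1 ))
    done
  done
  printf '%s\n' "$crc"
)

# Hash slot of a key: CRC16 modulo 16384. Hash tags are ignored in this sketch.
key_slot() {
  echo $(( $(crc16 "$1") % 16384 ))
}
```

key_slot name gives 5798, the slot reported in the redirect, which lies in the 5461-10922 range served by 10.96.1.104.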
VI. Directory Structure
# pwd
/usr/local/redis
[root@db-hs-1-40 redis]# tree
.
├── bin
│   ├── cluster_shutdown.sh
│   ├── cluster_start.sh
│   ├── redis-benchmark
│   ├── redis-check-aof
│   ├── redis-check-rdb
│   ├── redis-cli
│   ├── redis-sentinel -> redis-server
│   └── redis-server
├── conf
│   └── redis_7000.conf
├── data
│   └── 7000
│       ├── 1.40-7000.conf
│       ├── appendonly.aof
│       └── dump.rdb
├── log
│   └── redis_7000.log
└── run
    └── redis_7000.pid
6 directories, 14 files
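VII. Failover Drill
The test confirmed that when a master goes offline, its replica is automatically promoted to master, and when the old master comes back, it automatically becomes a replica of the new master.
Take one master offline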
10.96.1.40# ps aux|grep redis
root 16183 0.1 0.0 146200 3212 ? Ssl 13:16 0:19 ./redis-server 10.96.1.40:7000 [cluster]
10.96.1.40# kill -9 16183
Replication between the replica and the offline master is interrupted
[10.96.1.205]# tail -n100 /usr/local/redis/log/redis_7000.log
15910:S 27 Apr 2022 17:18:32.367 # 【Connection with master lost.】
15910:S 27 Apr 2022 17:18:32.367 * 【Caching the disconnected master state.】
15910:S 27 Apr 2022 17:18:32.597 * 【Connecting to MASTER 10.96.1.40:7000】
15910:S 27 Apr 2022 17:18:32.597 * 【MASTER <-> REPLICA sync started】
15910:S 27 Apr 2022 17:18:32.598 # 【Error condition on socket for SYNC: Connection refused】
Another replica marks the offline master as down
[root@db-hs-1-254 bin]# tail /usr/local/redis/log/redis_7000.log
15923:S 27 Apr 2022 17:18:34.772 * Marking node 7f1784b713182540a36e8653422b7c390c3a73c2 as failing (quorum reached).
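Across all master and replica nodes, only one replica (slave2) logged this mark.
(This differs from the referenced article, 018.Redis Cluster故障转移原理 - 云+社区 - 腾讯云.)
- More than half of the masters consider the offline master subjectively down (PFAIL), so it is marked objectively down (FAIL).
- After a delay of 576 ms, the replica prepares for the election; its replication offset against the offline master is 18508.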
[root@db-hs-1-205 ~]# tail /usr/local/redis/log/redis_7000.log
15910:S 27 Apr 2022 17:18:34.773 # Cluster state changed: fail
15910:S 27 Apr 2022 17:18:34.801 # Start of election delayed for 576 milliseconds (rank #0, offset 18508).
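The 576 ms in the log line above fits the delay formula given in the Redis Cluster specification: a fixed 500 ms, plus a random delay of up to 500 ms, plus 1000 ms times the replica's rank. The sketch below prints the resulting bounds for a given rank; election_delay_bounds is a hypothetical helper name.

```shell
# Print the minimum and maximum election delay in milliseconds for a replica
# of the given rank: 500 ms fixed + 0..500 ms random + rank * 1000 ms.
election_delay_bounds() {
  echo "$(( 500 + $1 * 1000 )) $(( 1000 + $1 * 1000 ))"
}
```

For rank #0 the bounds are 500 and 1000 ms, so the logged 576 ms corresponds to a random component of 76 ms.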
The replica increments its epoch and starts the election
15910:S 27 Apr 2022 17:18:35.402 # Starting a failover election for epoch 7.
The other two masters vote for the replica
104_master2#15900:M 27 Apr 2022 17:18:35.404 # Failover auth granted to 47f67b951bc42cdd71ea97b896c44792c6abf8f5 for epoch 7
112_master3#15977:M 27 Apr 2022 17:18:35.404 # Failover auth granted to 47f67b951bc42cdd71ea97b896c44792c6abf8f5 for epoch 7
Log records from all nodes, kept for study and reference:
master1 (10.96.1.40): no log entries
master2[root@db-hs-1-104 ~]# tail /usr/local/redis/log/redis_7000.log |grep 17:18
15900:M 27 Apr 2022 17:18:34.773 * FAIL message received from 300661faa3bc246289c49f635c0ffe4cc820f03c about 7f1784b713182540a36e8653422b7c390c3a73c2
15900:M 27 Apr 2022 17:18:34.773 # Cluster state changed: fail
15900:M 27 Apr 2022 17:18:35.404 # Failover auth granted to 47f67b951bc42cdd71ea97b896c44792c6abf8f5 for epoch 7
15900:M 27 Apr 2022 17:18:35.407 # Cluster state changed: ok
master3[root@db-hs-1-112 ~]# tail /usr/local/redis/log/redis_7000.log |grep 17:18
15977:M 27 Apr 2022 17:18:34.773 * FAIL message received from 300661faa3bc246289c49f635c0ffe4cc820f03c about 7f1784b713182540a36e8653422b7c390c3a73c2
15977:M 27 Apr 2022 17:18:34.773 # Cluster state changed: fail
15977:M 27 Apr 2022 17:18:35.404 # Failover auth granted to 47f67b951bc42cdd71ea97b896c44792c6abf8f5 for epoch 7
15977:M 27 Apr 2022 17:18:35.406 # Cluster state changed: ok
slave1 is promoted to master
[root@db-hs-1-205 ~]# tail -n100 /usr/local/redis/log/redis_7000.log
15910:S 27 Apr 2022 17:18:32.367 # Connection with master lost.
15910:S 27 Apr 2022 17:18:32.367 * Caching the disconnected master state.
15910:S 27 Apr 2022 17:18:32.597 * Connecting to MASTER 10.96.1.40:7000
15910:S 27 Apr 2022 17:18:32.597 * MASTER <-> REPLICA sync started
15910:S 27 Apr 2022 17:18:32.598 # Error condition on socket for SYNC: Connection refused
15910:S 27 Apr 2022 17:18:33.598 * Connecting to MASTER 10.96.1.40:7000
15910:S 27 Apr 2022 17:18:33.598 * MASTER <-> REPLICA sync started
15910:S 27 Apr 2022 17:18:33.598 # Error condition on socket for SYNC: Connection refused
15910:S 27 Apr 2022 17:18:34.599 * Connecting to MASTER 10.96.1.40:7000
15910:S 27 Apr 2022 17:18:34.599 * MASTER <-> REPLICA sync started
15910:S 27 Apr 2022 17:18:34.599 # Error condition on socket for SYNC: Connection refused
15910:S 27 Apr 2022 17:18:34.773 * FAIL message received from 300661faa3bc246289c49f635c0ffe4cc820f03c about 7f1784b713182540a36e8653422b7c390c3a73c2
15910:S 27 Apr 2022 17:18:34.773 # Cluster state changed: fail
15910:S 27 Apr 2022 17:18:34.801 # Start of election delayed for 576 milliseconds (rank #0, offset 18508).
15910:S 27 Apr 2022 17:18:35.402 # Starting a failover election for epoch 7.
15910:S 27 Apr 2022 17:18:35.405 # Failover election won: I'm the new master.
15910:S 27 Apr 2022 17:18:35.405 # configEpoch set to 7 after successful failover
15910:M 27 Apr 2022 17:18:35.405 * Discarding previously cached master state.
15910:M 27 Apr 2022 17:18:35.405 # Setting secondary replication ID to 47e6d9c34f07f163d54f66e3c9831a0cedf4cba5, valid up to offset: 18509. New replication ID is ae23afbc0581c70342c4521e55ac8f033e2564eb
15910:M 27 Apr 2022 17:18:35.405 # Cluster state changed: ok
15910:M 27 Apr 2022 18:18:36.450 * Replication backlog freed after 3600 seconds without connected replicas.
slave2 [root@db-hs-1-254 ~]# tail /usr/local/redis/log/redis_7000.log
15923:S 27 Apr 2022 17:18:34.772 * Marking node 7f1784b713182540a36e8653422b7c390c3a73c2 as failing (quorum reached).
15923:S 27 Apr 2022 17:18:34.772 # Cluster state changed: fail
15923:S 27 Apr 2022 17:18:35.406 # Cluster state changed: ok
slave3
[root@db-hs-1-167 ~]# tail /usr/local/redis/log/redis_7000.log
16016:S 27 Apr 2022 17:18:34.773 * FAIL message received from 300661faa3bc246289c49f635c0ffe4cc820f03c about 7f1784b713182540a36e8653422b7c390c3a73c2
16016:S 27 Apr 2022 17:18:34.773 # Cluster state changed: fail
16016:S 27 Apr 2022 17:18:35.406 # Cluster state changed: ok
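The timestamps in these logs give the duration of each failover phase. The sketch below subtracts two log timestamps from the same day; to_ms and elapsed_ms are hypothetical helper names.

```shell
# Convert "HH:MM:SS.mmm" to milliseconds since midnight. awk does the
# arithmetic so that fields with leading zeros such as "08" stay decimal.
to_ms() {
  printf '%s\n' "$1" | awk -F'[:.]' '{ printf "%d\n", (($1 * 60 + $2) * 60 + $3) * 1000 + $4 }'
}

# Milliseconds elapsed between two timestamps on the same day.
elapsed_ms() {
  echo $(( $(to_ms "$2") - $(to_ms "$1") ))
}
```

On slave1, elapsed_ms 17:18:32.367 17:18:35.405 gives 3038: about three seconds passed between losing the connection to the master and winning the failover election.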
Check the current cluster nodes
# ./redis-cli -c -h 10.96.1.104 -p 7000 cluster nodes
7f1784b713182540a36e8653422b7c390c3a73c2 10.96.1.40:7000@17000 master,fail - 1651051112590 1651051111790 1 disconnected
47f67b951bc42cdd71ea97b896c44792c6abf8f5 10.96.1.205:7000@17000 master - 0 1651055237000 7 connected 0-5460
cf4905804ba0eaa5160319fa680e3bc69ba51cd4 10.96.1.167:7000@17000 slave a7b3ff893bd2e8e5c411da785554b838976d680f 0 1651055236733 3 connected
a7b3ff893bd2e8e5c411da785554b838976d680f 10.96.1.112:7000@17000 master - 0 1651055237334 3 connected 10923-16383
aee7bc0be26213154d24c009fd71f6c12d88dbd8 10.96.1.104:7000@17000 myself,master - 0 1651055236000 2 connected 5461-10922
300661faa3bc246289c49f635c0ffe4cc820f03c 10.96.1.254:7000@17000 slave aee7bc0be26213154d24c009fd71f6c12d88dbd8 0 1651055237133 2 connected
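In a larger cluster the failed node is harder to spot by eye. The sketch below prints only the nodes whose flags field contains fail or fail? (the latter being the subjective PFAIL state); failed_nodes is a hypothetical helper name.

```shell
# Read CLUSTER NODES output on stdin and print the address of every node
# flagged "fail" or "fail?". Flags are compared one by one so that other
# flags that merely contain the word, such as "nofailover", do not match.
failed_nodes() {
  awk '{
    n = split($3, f, ",")
    for (i = 1; i <= n; i++) {
      if (f[i] == "fail" || f[i] == "fail?") {
        split($2, a, "@")
        print a[1]
        break
      }
    }
  }'
}
```

Piping the output above through failed_nodes prints 10.96.1.40:7000.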
Restart the offline master
[root@db-hs-1-40 bin]# ./cluster_start.sh
[root@db-hs-1-40 bin]# ps aux|grep redis
root 16544 0.0 0.0 146036 2980 ? Ssl 18:33 0:00 ./redis-server 10.96.1.40:7000 [cluster]
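After the old master starts, it finds that the slots it used to serve are now assigned to another node, so it accepts the current cluster configuration and becomes a replica of the new master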
16544:M 27 Apr 2022 18:33:41.164 * Node configuration loaded, I'm 7f1784b713182540a36e8653422b7c390c3a73c2
16544:M 27 Apr 2022 18:33:41.165 # Configuration change detected. Reconfiguring myself as a replica of 47f67b951bc42cdd71ea97b896c44792c6abf8f5
16544:S 27 Apr 2022 18:33:41.165 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
The other nodes in the cluster receive the ping from the restarted node and clear its FAIL state
slave[root@db-hs-1-167 bin]#* Clear FAIL state for node 7f1784b713182540a36e8653422b7c390c3a73c2: master without slots is reachable again.
The new master and its replica start replicating
#slave[root@db-hs-1-40 bin]# tail -n100 /usr/local/redis/log/redis_7000.log
16544:S 27 Apr 2022 18:33:42.166 * Connecting to MASTER 10.96.1.205:7000
16544:S 27 Apr 2022 18:33:42.166 * MASTER <-> REPLICA sync started
16544:S 27 Apr 2022 18:33:42.166 * Non blocking connect for SYNC fired the event.
16544:S 27 Apr 2022 18:33:42.166 * Master replied to PING, replication can continue...
16544:S 27 Apr 2022 18:33:42.167 * Trying a partial resynchronization (request 7dd3336553ef70bf59ca2b65771b24c65a8ddd9d:1).
16544:S 27 Apr 2022 18:33:42.167 * Full resync from master: e6c32bee70c79b5e0b0e9a23eabce72efcc7be91:18508
#master[root@db-hs-1-205 ~]#
15910:M 27 Apr 2022 18:33:42.167 * Replica 10.96.1.40:7000 asks for synchronization
15910:M 27 Apr 2022 18:33:42.167 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '7dd3336553ef70bf59ca2b65771b24c65a8ddd9d', my replication IDs are '79e4771593f2505f4dc9cc6e0703c971a2f6fe19' and '0000000000000000000000000000000000000000')
15910:M 27 Apr 2022 18:33:42.167 * Replication backlog created, my new replication IDs are 'e6c32bee70c79b5e0b0e9a23eabce72efcc7be91' and '0000000000000000000000000000000000000000'
15910:M 27 Apr 2022 18:33:42.167 * Starting BGSAVE for SYNC with target: disk
15910:M 27 Apr 2022 18:33:42.167 * Background saving started by pid 16177
16177:C 27 Apr 2022 18:33:42.168 * DB saved on disk
16177:C 27 Apr 2022 18:33:42.169 * RDB: 0 MB of memory used by copy-on-write
15910:M 27 Apr 2022 18:33:42.171 * Background saving terminated with success
15910:M 27 Apr 2022 18:33:42.171 * Synchronization with replica 10.96.1.40:7000 succeeded
Check the current cluster nodes
# ./redis-cli -c -h 10.96.1.104 -p 7000 cluster nodes
7f1784b713182540a36e8653422b7c390c3a73c2 10.96.1.40:7000@17000 slave 47f67b951bc42cdd71ea97b896c44792c6abf8f5 0 1651056368812 7 connected
47f67b951bc42cdd71ea97b896c44792c6abf8f5 10.96.1.205:7000@17000 master - 0 1651056369000 7 connected 0-5460
cf4905804ba0eaa5160319fa680e3bc69ba51cd4 10.96.1.167:7000@17000 slave a7b3ff893bd2e8e5c411da785554b838976d680f 0 1651056369013 3 connected
a7b3ff893bd2e8e5c411da785554b838976d680f 10.96.1.112:7000@17000 master - 0 1651056369212 3 connected 10923-16383
aee7bc0be26213154d24c009fd71f6c12d88dbd8 10.96.1.104:7000@17000 myself,master - 0 1651056368000 2 connected 5461-10922
300661faa3bc246289c49f635c0ffe4cc820f03c 10.96.1.254:7000@17000 slave aee7bc0be26213154d24c009fd71f6c12d88dbd8 0 1651056369013 2 connected
Problems encountered and their solutions:
① A "package not found" error while installing packages
# yum -y install centos-release-scl
No package centos-release-scl available.
Error: Nothing to do
Solution:
# rpm -ivh https://cbs.centos.org/kojifiles/packages/centos-release-scl-rh/2/3.el7.centos/noarch/centos-release-scl-rh-2-3.el7.centos.noarch.rpm
# rpm -ivh https://cbs.centos.org/kojifiles/packages/centos-release-scl/2/3.el7.centos/noarch/centos-release-scl-2-3.el7.centos.noarch.rpm
[root@db-hs-1-40 yum.repos.d]# yum -y install centos-release-scl
centos-sclo-rh | 3.0 kB 00:00:00
centos-sclo-sclo | 3.0 kB 00:00:00
(1/2): centos-sclo-sclo/x86_64/primary_db | 300 kB 00:00:00
(2/2): centos-sclo-rh/x86_64/primary_db | 3.3 MB 00:00:02
Nothing to do
Reference: https://cbs.centos.org/koji/buildinfo?buildID=24739
② The tree command is not installed: # tree
-bash: tree: command not found
Solution: # yum -y install tree
References:
1. Redis——6.0集群安装部署 - 曹伟雄 - 博客园
2. Official Redis cluster guide: Scaling with Redis Cluster | Redis
3. Redis cluster tutorial: REDIS cluster-tutorial -- Redis中文资料站 -- Redis中国用户组(CRUG)
4. Redis Cluster failover principles: 018.Redis Cluster故障转移原理 - 云+社区 - 腾讯云