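I. Comparing Redis Clustering Options
1.1 Sentinel mode

1.2 High-availability cluster mode

II. Building a Highly Available Redis Cluster
2.1 Setting up the Redis cluster
A Redis cluster needs at least three master nodes. Here we set up three masters and give each one a slave, for six Redis nodes in total. All six instances run on a single machine, so this is a pseudo-cluster. The steps are as follows.
1. Create the folder redis-cluster under /opt, then create six subfolders inside it, 8001 through 8006.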
[root@k8s-master01 redis-cluster]# pwd
/opt/redis-cluster
[root@k8s-master01 redis-cluster]# ll
total 24
drwxr-xr-x 2 root root 4096 Nov 21 19:45 8001
drwxr-xr-x 2 root root 4096 Nov 21 19:48 8002
drwxr-xr-x 2 root root 4096 Nov 21 19:48 8003
drwxr-xr-x 2 root root 4096 Nov 21 19:48 8004
drwxr-xr-x 2 root root 4096 Nov 21 19:48 8005
drwxr-xr-x 2 root root 4096 Nov 21 19:49 8006
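2. Copy the existing redis.conf into 8001 and change the following settings:
(1) daemonize yes
(2) port 8001 (each instance gets its own port)
(3) pidfile /var/run/redis_8001.pid # the process ID is written to this file
(4) dir /opt/redis-cluster/8001/ (data directory; each instance must use a different directory, otherwise data will be lost)
(5) cluster-enabled yes (enable cluster mode)
(6) cluster-config-file nodes-8001.conf (cluster node state file; keep the 800x in the name in line with the port)
(7) cluster-node-timeout 10000
(8) # bind 127.0.0.1 (bind lists the IPs of this machine's network interfaces; with several interfaces you can list several IPs, meaning clients may connect through those interface IPs. On an internal network bind can usually be left unset, i.e. commented out)
(9) protected-mode no (disable protected mode)
(10) appendonly yes
To set a password, add the following as well:
(11) requirepass 123456 (password for accessing Redis)
(12) masterauth 123456 (password used between cluster nodes; must match the one above)
3. Once one file is configured, copy it to the other directories (run from inside the 8001 directory):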
cp redis.conf ../8002/redis.conf
cp redis.conf ../8003/redis.conf
cp redis.conf ../8004/redis.conf
cp redis.conf ../8005/redis.conf
cp redis.conf ../8006/redis.conf
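4. Use sed to batch-replace the port in each copy. Note that the pattern rewrites every occurrence of 8001 in the file (port, pidfile, dir, and cluster-config-file):
[root@k8s-master01 redis-cluster]# pwd
/opt/redis-cluster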
sed -i "s/8001/8002/g" 8002/redis.conf
sed -i "s/8001/8003/g" 8003/redis.conf
sed -i "s/8001/8004/g" 8004/redis.conf
sed -i "s/8001/8005/g" 8005/redis.conf
sed -i "s/8001/8006/g" 8006/redis.conf
5. Start the six Redis instances, then check that they are all running:
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8001/redis.conf
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8002/redis.conf
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8003/redis.conf
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8004/redis.conf
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8005/redis.conf
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8006/redis.conf
[root@k8s-master01 8001]# ps -ef|grep redis
root 12831 1 0 20:06 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8001 [cluster]
root 12916 1 0 20:08 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8002 [cluster]
root 13003 1 0 20:09 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8003 [cluster]
root 13011 1 0 20:09 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8004 [cluster]
root 13020 1 0 20:09 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8005 [cluster]
root 13029 1 0 20:10 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8006 [cluster]
root 13038 11154 0 20:10 pts/1 00:00:00 grep --color=auto redis
6. Create the cluster with redis-cli (before Redis 5, cluster management relied on the Ruby script redis-trib.rb)
List the cluster-related commands:
/opt/redis-5.0.3/src/redis-cli -h
/opt/redis-5.0.3/src/redis-cli --cluster help
[root@k8s-master01 redis-5.0.3]# /opt/redis-5.0.3/src/redis-cli --cluster help
Cluster Manager Commands:
create host1:port1 ... hostN:portN
--cluster-replicas <arg>
check host:port
--cluster-search-multiple-owners
info host:port
fix host:port
--cluster-search-multiple-owners
reshard host:port
--cluster-from <arg>
--cluster-to <arg>
--cluster-slots <arg>
--cluster-yes
--cluster-timeout <arg>
--cluster-pipeline <arg>
--cluster-replace
rebalance host:port
--cluster-weight <node1=w1...nodeN=wN>
--cluster-use-empty-masters
--cluster-timeout <arg>
--cluster-simulate
--cluster-pipeline <arg>
--cluster-threshold <arg>
--cluster-replace
add-node new_host:new_port existing_host:existing_port
--cluster-slave
--cluster-master-id <arg>
del-node host:port node_id
call host:port command arg arg .. arg
set-timeout host:port milliseconds
import host:port
--cluster-from <arg>
--cluster-copy
--cluster-replace
help
For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.
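# The 1 in the command below means one slave is created for each master.
# Before running it, make sure the Redis instances on all machines can reach each other. The simplest way is to turn the firewall off on every machine;
# otherwise open both the Redis service port and the cluster bus (gossip) port, which is the service port plus 10000 (16379 for the default 6379, 18001 for 8001).
# Turn off the firewall
# systemctl stop firewalld # stop the firewall until the next reboot
# systemctl disable firewalld # do not start it at boot
# Create the cluster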
/opt/redis-5.0.3/src/redis-cli -a 123456 --cluster create --cluster-replicas 1 47.96.114.225:8001 47.96.114.225:8002 47.96.114.225:8003 47.96.114.225:8004 47.96.114.225:8005 47.96.114.225:8006
[root@k8s-master01 redis-5.0.3]# /opt/redis-5.0.3/src/redis-cli -a 123456 --cluster create --cluster-replicas 1 47.96.114.225:8001 47.96.114.225:8002 47.96.114.225:8003 47.96.114.225:8004 47.96.114.225:8005 47.96.114.225:8006
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 47.96.114.225:8004 to 47.96.114.225:8001
Adding replica 47.96.114.225:8005 to 47.96.114.225:8002
Adding replica 47.96.114.225:8006 to 47.96.114.225:8003
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: 0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001
slots:[0-5460] (5461 slots) master
M: 7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002
slots:[5461-10922] (5462 slots) master
M: 52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003
slots:[10923-16383] (5461 slots) master
S: 5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004
replicates 7255b58225c7108f9839169d6a621efbf6e63b1d
S: 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005
replicates 52dafe801931dbe34ef518af7b35d6578c9e8860
S: 38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006
replicates 0a706941748a0863d94de0e4701af0f90a40c0ce
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
...
>>> Performing Cluster Check (using node 47.96.114.225:8001)
M: 0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004
slots: (0 slots) slave
replicates 7255b58225c7108f9839169d6a621efbf6e63b1d
S: 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005
slots: (0 slots) slave
replicates 52dafe801931dbe34ef518af7b35d6578c9e8860
S: 38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006
slots: (0 slots) slave
replicates 0a706941748a0863d94de0e4701af0f90a40c0ce
M: 52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
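7. Verify the cluster
(1) Connect to any node:
./redis-cli -a <password> -c -h <host> -p <port> (-a gives the server password, -c enables cluster mode, -h and -p give the IP address and port)
(2) Check it with cluster info (cluster status) and cluster nodes (node list):
[root@k8s-master01 redis-cluster]# /opt/redis-5.0.3/src/redis-cli -a 123456 -c -p 8002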
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:8002> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:2
cluster_stats_messages_ping_sent:262
cluster_stats_messages_pong_sent:283
cluster_stats_messages_meet_sent:4
cluster_stats_messages_sent:549
cluster_stats_messages_ping_received:281
cluster_stats_messages_pong_received:265
cluster_stats_messages_meet_received:2
cluster_stats_messages_received:548
127.0.0.1:8002> cluster nodes
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 slave 52dafe801931dbe34ef518af7b35d6578c9e8860 0 1637498642000 5 connected
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637498642595 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637498641592 1 connected 0-5460
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 master - 0 1637498640000 3 connected 10923-16383
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637498640000 2 connected 5461-10922
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637498640590 4 connected
(3) Verify reads and writes:
127.0.0.1:8002> set k1 v1
-> Redirected to slot [12706] located at 47.96.114.225:8003
OK
47.96.114.225:8003> set k2 v2
-> Redirected to slot [449] located at 47.96.114.225:8001
OK
47.96.114.225:8001> set k3 v3
OK
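(4) To stop the cluster, shut the nodes down one at a time with:
/opt/redis-5.0.3/src/redis-cli -a 123456 -c -p 800* shutdown
After the nodes are shut down, simply start them again. The cluster does not need to be recreated, because each nodes-800*.conf stores the cluster state and the cluster is restored from it.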
[root@k8s-master01 8001]# cat nodes-8001.conf
7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002@18002 master - 0 1637498383907 2 connected 5461-10922
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637498382905 4 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 slave 52dafe801931dbe34ef518af7b35d6578c9e8860 0 1637498382000 5 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 172.17.12.88:8001@18001 myself,master - 0 1637498383000 1 connected 0-5460
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637498384911 6 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 master - 0 1637498383000 3 connected 10923-16383
vars currentEpoch 6 lastVoteEpoch 0
Failover test: when a master in the cluster goes down, is its slave promoted to master?
47.96.114.225:8003> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637503663000 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 172.17.12.88:8003@18003 myself,master - 0 1637503664000 3 connected 10923-16383
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 slave 52dafe801931dbe34ef518af7b35d6578c9e8860 0 1637503665792 5 connected
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637503664000 6 connected
7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002@18002 master - 0 1637503665000 2 connected 5461-10922
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637503664787 1 connected 0-5460
Take 8003 down:
[root@k8s-master01 redis-5.0.3]# ps -ef|grep redis
root 12831 1 0 20:06 ? 00:00:07 /opt/redis-5.0.3/src/redis-server *:8001 [cluster]
root 12916 1 0 20:08 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8002 [cluster]
root 13003 1 0 20:09 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8003 [cluster]
root 13011 1 0 20:09 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8004 [cluster]
root 13020 1 0 20:09 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8005 [cluster]
root 13029 1 0 20:10 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8006 [cluster]
root 14797 3402 0 20:43 pts/5 00:00:00 /opt/redis-5.0.3/src/redis-cli -a 123456 -c -p 8002
root 19131 11154 0 22:08 pts/1 00:00:00 grep --color=auto redis
[root@k8s-master01 redis-5.0.3]# kill 13003
[root@k8s-master01 redis-5.0.3]# ps -ef|grep redis
root 12831 1 0 20:06 ? 00:00:07 /opt/redis-5.0.3/src/redis-server *:8001 [cluster]
root 12916 1 0 20:08 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8002 [cluster]
root 13011 1 0 20:09 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8004 [cluster]
root 13020 1 0 20:09 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8005 [cluster]
root 13029 1 0 20:10 ? 00:00:06 /opt/redis-5.0.3/src/redis-server *:8006 [cluster]
root 14797 3402 0 20:43 pts/5 00:00:00 /opt/redis-5.0.3/src/redis-cli -a 123456 -c -p 8002
root 19147 11154 0 22:09 pts/1 00:00:00 grep --color=auto redis
Check the nodes: 8005 is now a master. Once 8003 is started again, it rejoins as a slave of 8005.
127.0.0.1:8002> cluster nodes
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637503805800 7 connected 10923-16383
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637503804000 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637503806810 1 connected 0-5460
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 master,fail - 1637503749526 1637503748526 3 disconnected
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637503806000 2 connected 5461-10922
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637503804798 4 connected
2.2 Accessing the Redis cluster from Java
Jedis
The cluster above can be accessed with Jedis, a Java client for Redis. The Maven coordinates are:
<dependency>
<groupId>redis.clients</groupId>
<artifactId>jedis</artifactId>
<version>3.2.0</version>
</dependency>
Code:
package com.zengqingfa.test;

import org.junit.Test;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.JedisPoolConfig;

import java.util.HashSet;
import java.util.Set;

public class JedisClusterTest {

    @Test
    public void test() {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxIdle(10);
        config.setMaxTotal(20);
        config.setMinIdle(5);

        Set<HostAndPort> jedisClusterNode = new HashSet<HostAndPort>();
        jedisClusterNode.add(new HostAndPort("47.96.114.225", 8001));
        jedisClusterNode.add(new HostAndPort("47.96.114.225", 8002));
        jedisClusterNode.add(new HostAndPort("47.96.114.225", 8003));
        jedisClusterNode.add(new HostAndPort("47.96.114.225", 8004));
        jedisClusterNode.add(new HostAndPort("47.96.114.225", 8005));
        jedisClusterNode.add(new HostAndPort("47.96.114.225", 8006));

        JedisCluster jedisCluster = null;
        try {
            // connectionTimeout: how long to wait when establishing a connection
            // soTimeout: how long to wait for a response once connected
            // JedisCluster(Set<HostAndPort> jedisClusterNode, int connectionTimeout, int soTimeout, int maxAttempts,
            //              String password, GenericObjectPoolConfig poolConfig)
            jedisCluster = new JedisCluster(jedisClusterNode, 6000, 5000, 10, "123456", config);
            System.out.println(jedisCluster.set("cluster", "shenlongfeixian"));
            System.out.println(jedisCluster.get("cluster"));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // close() shuts down the cluster client together with its per-node connection pools
            if (jedisCluster != null) {
                jedisCluster.close();
            }
        }
    }
}
Output:
OK
shenlongfeixian
How does JedisCluster work out which node to talk to?
When the cluster information is initialized, the client fetches the mapping between nodes and slots:
redis.clients.jedis.BinaryJedisCluster#BinaryJedisCluster (BinaryJedisCluster is the parent class of JedisCluster)
->redis.clients.jedis.JedisSlotBasedConnectionHandler
->redis.clients.jedis.JedisClusterConnectionHandler
Member variable: cache->redis.clients.jedis.JedisClusterInfoCache
redis.clients.jedis.JedisClusterConnectionHandler#initializeSlotsCache
redis.clients.jedis.JedisClusterInfoCache
private final Map<String, JedisPool> nodes = new HashMap<String, JedisPool>();
private final Map<Integer, JedisPool> slots = new HashMap<Integer, JedisPool>();

public void assignSlotsToNode(List<Integer> targetSlots, HostAndPort targetNode) {
    w.lock();
    try {
        JedisPool targetPool = setupNodeIfNotExist(targetNode);
        for (Integer slot : targetSlots) {
            slots.put(slot, targetPool);
        }
    } finally {
        w.unlock();
    }
}

public T run(String key) {
    return runWithRetries(JedisClusterCRC16.getSlot(key), this.maxAttempts, false, null);
}
->redis.clients.jedis.JedisClusterCommand#runWithRetries
->redis.clients.jedis.JedisSlotBasedConnectionHandler#getConnectionFromSlot
The connection pool is looked up by slot, and a connection is taken from it:
@Override
public Jedis getConnectionFromSlot(int slot) {
    JedisPool connectionPool = cache.getSlotPool(slot);
    if (connectionPool != null) {
        // It can't guaranteed to get valid connection because of node
        // assignment
        return connectionPool.getResource();
    } else {
        renewSlotCache(); //It's abnormal situation for cluster mode, that we have just nothing for slot, try to rediscover state
        connectionPool = cache.getSlotPool(slot);
        if (connectionPool != null) {
            return connectionPool.getResource();
        } else {
            //no choice, fallback to new connection to random node
            return getConnection();
        }
    }
}
Slot calculation: redis.clients.jedis.util.JedisClusterCRC16#getSlot(java.lang.String)
public static int getSlot(String key) {
    if (key == null) {
        throw new JedisClusterOperationException("Slot calculation of null is impossible");
    }
    key = JedisClusterHashTagUtil.getHashTag(key);
    // optimization with modulo operator with power of 2 equivalent to getCRC16(key) % 16384
    return getCRC16(key) & (16384 - 1);
}
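Verification:
import org.junit.Test;
import redis.clients.jedis.util.JedisClusterCRC16;

public class CRC16Test {
    @Test
    public void test() {
        System.out.println(JedisClusterCRC16.getCRC16("cluster") & (16384 - 1)); // 14041
        System.out.println(JedisClusterCRC16.getCRC16("cluster") % 16384); // 14041
    }
}
Setting the same key in Redis shows slot 14041, which matches the value computed in code.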
127.0.0.1:8002> set cluster 666
-> Redirected to slot [14041] located at 47.96.114.225:8005
OK
47.96.114.225:8005>
Spring Boot integration
Add the dependencies:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-pool2 -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-pool2</artifactId>
<version>2.8.1</version>
</dependency>
Configuration file: application.yaml
spring:
  application:
    name: springboot-redis-demo
  redis:
    database: 0
    timeout: 3000
    lettuce:
      pool:
        max-active: 100
        max-idle: 50
        max-wait: 1000
        min-idle: 10
    cluster:
      nodes: 47.96.114.225:8001,47.96.114.225:8002,47.96.114.225:8003,47.96.114.225:8004,47.96.114.225:8005,47.96.114.225:8006
    password: 123456
server:
  port: 8087
Code:
package com.zengqingfa.springboot.redis.rest;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/redis")
@Slf4j
public class RedisController {

    @Autowired
    private StringRedisTemplate stringRedisTemplate;

    // Writes the key/value pair to the cluster, then reads it back.
    @GetMapping("/cluster")
    public String cluster(String key, String value) {
        stringRedisTemplate.opsForValue().set(key, value);
        return stringRedisTemplate.opsForValue().get(key);
    }
}
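III. How Redis Cluster Works
Redis Cluster divides all data into 16384 slots, and each node is responsible for a subset of them. The slot assignment is stored on every node. When a Redis Cluster client connects, it also obtains a copy of the slot configuration and caches it locally, so that when it needs to look up a key it can go straight to the target node. Because the client's cached slot information can get out of sync with the servers, a correction mechanism is also needed to validate and adjust it.
3.1 Slot location algorithm
By default, Cluster hashes the key with the CRC16 algorithm to get an integer, then takes that integer modulo 16384 to get the slot.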
HASH_SLOT = CRC16(key) mod 16384
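The formula can be checked without a Redis client. The sketch below is a minimal, dependency-free version, assuming the CRC16 variant is CRC-16/XMODEM (polynomial 0x1021, initial value 0), which is the variant Redis Cluster documents. The class and method names are made up for illustration, and hash tags are deliberately ignored here.

```java
import java.nio.charset.StandardCharsets;

final class SlotSketch {
    // CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? (crc << 1) ^ 0x1021 : (crc << 1);
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // HASH_SLOT = CRC16(key) mod 16384 (hash tags are not handled in this sketch).
    static int slot(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"cluster", "k1", "k2"}) {
            System.out.println(key + " -> slot " + slot(key));
        }
    }
}
```

The keys in main are the ones used in the redis-cli sessions in this article, so the printed slots can be compared with the "Redirected to slot" lines there.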
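3.2 Redirection
When a client sends a command to the wrong node, that node sees that the key's slot is not one it manages. It replies with a special redirection (MOVED) carrying the address of the node that owns the slot, telling the client to go there for the data. On receiving it, the client retries on the correct node and also updates its local slot mapping cache; all subsequent keys use the updated mapping.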
47.96.114.225:8005> set k3 v3
-> Redirected to slot [4576] located at 47.96.114.225:8001
OK
47.96.114.225:8001>
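On the wire, the redirection arrives as an error reply of the form "MOVED <slot> <host>:<port>". The sketch below shows one way a client could correct its local slot table from such a reply. It is a simplified illustration with made-up class and method names, not how Jedis or any other client is actually written; a real client may refresh the whole slot table instead of a single entry.

```java
import java.util.HashMap;
import java.util.Map;

final class MovedSketch {
    // Local cache: slot number -> "host:port" of the node believed to own it.
    private final Map<Integer, String> slotOwner = new HashMap<>();

    // Parses a reply such as "MOVED 4576 47.96.114.225:8001", records the new
    // owner for that slot, and returns the address to retry against.
    // Returns null when the reply is not a MOVED redirection.
    String handleError(String error) {
        String[] parts = error.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("MOVED")) {
            return null;
        }
        int slot = Integer.parseInt(parts[1]);
        slotOwner.put(slot, parts[2]);
        return parts[2];
    }

    // Returns the cached owner of a slot, or null if the slot is unknown.
    String ownerOf(int slot) {
        return slotOwner.get(slot);
    }

    public static void main(String[] args) {
        MovedSketch client = new MovedSketch();
        String target = client.handleError("MOVED 4576 47.96.114.225:8001");
        System.out.println("retry on " + target);
        System.out.println("cached owner of 4576: " + client.ownerOf(4576));
    }
}
```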
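3.3 Communication between Redis cluster nodes
Redis Cluster nodes communicate with each other using a gossip protocol.
Cluster metadata (node information, master/slave roles, node count, data shared between nodes, and so on) can be maintained in two ways: centralized or gossip.
Centralized
The advantage is that metadata updates and reads are very timely: as soon as the metadata changes it is written to the central store, and other nodes see the change on their next read. The drawback is that all metadata update traffic lands in one place, which can put pressure on the metadata store. Many middleware systems rely on ZooKeeper as a centralized metadata store.
Gossip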

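The advantage of gossip is that metadata updates are spread out rather than concentrated in one place: update requests reach all nodes gradually, with some delay, which lowers the load. The drawback is that this delay can make some cluster operations lag behind.
3.4 The +10000 port for gossip communication
Each node has a dedicated port for gossip communication between nodes: its service port plus 10000. For example, a node serving on 7001 uses 17001 for node-to-node communication. At regular intervals each node sends ping messages to a few other nodes, and the nodes that receive a ping reply with a pong.
3.5 Network jitter
Real-world data center networks are rarely calm; small problems occur all the time. Network jitter is a very common one: some connections suddenly become unreachable and then quickly recover. To deal with this, Redis Cluster provides the cluster-node-timeout option (it can be set fairly high, for example 5s). A node is considered failed, and a master/slave switch is triggered, only after it has been unreachable for longer than this timeout. Without this option, network jitter would cause frequent failovers (and data re-replication).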
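3.6 How elections work in Redis Cluster
When a slave finds that its master has entered the FAIL state, it attempts a failover in order to become the new master. Because the failed master may have several slaves, they compete to become master. The process is as follows:
1. The slave finds that its master has entered the FAIL state.
2. It increments its recorded cluster currentEpoch by 1 and broadcasts a FAILOVER_AUTH_REQUEST to all nodes, masters and slaves alike.
3. The other nodes receive the request, but only masters respond. A master checks that the requester is legitimate and sends a FAILOVER_AUTH_ACK; it sends at most one ack per epoch.
4. The slave attempting the failover collects the FAILOVER_AUTH_ACK replies from the masters.
5. Once the slave has acks from more than half of the masters, it becomes the new master. (This is why a cluster needs at least three masters: with only two, when one goes down the single remaining master cannot form a majority.)
Question: what if two slaves receive the same number of votes? A new election round is started. The randomized delay described below makes a repeated tie unlikely.
6. The new master broadcasts a pong message to notify the other cluster nodes.
A slave does not start an election the instant its master enters FAIL; it waits for a short delay. The delay gives the FAIL state time to propagate through the cluster. If the slave tried immediately, other masters might not yet know about the FAIL state and might refuse to vote.
• Delay formula:
DELAY = 500ms + random(0 ~ 500ms) + SLAVE_RANK * 1000ms
• SLAVE_RANK is this slave's rank by the amount of data it has replicated from the master. A lower rank means more up-to-date data, so (in theory) the slave holding the newest data starts its election first.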
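To see why the rank term dominates, the delay formula can be written out directly. This is only a sketch of the arithmetic with illustrative names, not Redis source code: the rank-0 slave waits between 500 and 1000 ms, while a rank-1 slave waits between 1500 and 2000 ms, so the most up-to-date slave always asks for votes first.

```java
import java.util.Random;

final class ElectionDelaySketch {
    // DELAY = 500 ms + random(0..500 ms) + SLAVE_RANK * 1000 ms
    static long delayMillis(int slaveRank, Random rng) {
        return 500 + rng.nextInt(501) + slaveRank * 1000L;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int rank = 0; rank < 3; rank++) {
            System.out.println("rank " + rank + " waits " + delayMillis(rank, rng) + " ms");
        }
    }
}
```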
By default Redis Cluster follows an AP design: it favors availability.
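3.7 Data loss from cluster split-brain
Redis Cluster does not require a majority acknowledgment for writes, so it is exposed to split-brain. After a network partition, more than one master may accept writes for the same slots. When the partition heals, one of those masters is demoted to a slave, and the writes it accepted during the partition are lost, which can be a large amount of data.
This can be mitigated by adding the following parameter to the Redis configuration (it cannot rule out data loss completely; compare the cluster leader election mechanism):
## Minimum number of slaves that must be connected and in sync for the master to accept a write. It can be set to mimic a majority rule:
## for example, with three nodes in total set it to 1; together with the leader that makes 2, more than half.
min-replicas-to-write 1
Note: this setting reduces availability to some degree. If fewer than 1 slave is available, the master cannot accept writes even though it is healthy itself, so weigh it against your scenario. Requiring a write to reach a majority before it succeeds is what many middleware systems do.
Redis here favors availability (AP). When Redis is used as a cache, lost entries can simply be reloaded from the backing database.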
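The rule the master applies can be sketched as a simple count. The code below is an illustration with made-up names, not Redis internals. It assumes a replica counts as "good" when its replication lag is within a limit; in redis.conf that limit is the companion setting min-replicas-max-lag (in seconds), which this article does not set.

```java
final class MinReplicasSketch {
    // Returns true if a master would accept a write, given the lag (in seconds)
    // of each connected replica. A value of 0 for minReplicasToWrite disables
    // the check.
    static boolean acceptsWrite(int[] replicaLagSeconds, int minReplicasToWrite, int maxLagSeconds) {
        if (minReplicasToWrite == 0) {
            return true;
        }
        int good = 0;
        for (int lag : replicaLagSeconds) {
            if (lag <= maxLagSeconds) {
                good++;
            }
        }
        return good >= minReplicasToWrite;
    }

    public static void main(String[] args) {
        // One healthy replica: writes are accepted.
        System.out.println(acceptsWrite(new int[] {1}, 1, 10));
        // No replica connected: writes are refused (the NOREPLICAS case).
        System.out.println(acceptsWrite(new int[] {}, 1, 10));
    }
}
```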
Test:
127.0.0.1:8002> keys *
1) "slfx98"
127.0.0.1:8002> set slfx53 111
OK
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637556126000 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637556126000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637556126318 7 connected 10923-16383
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637556128331 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637556127328 1 connected 0-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637556123000 2 connected 5461-10922
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave,fail 7255b58225c7108f9839169d6a621efbf6e63b1d 1637556182249 1637556180000 4 disconnected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637556202000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637556201000 7 connected 10923-16383
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637556203669 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637556202675 1 connected 0-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637556203000 2 connected 5461-10922
127.0.0.1:8002> set slfx53 222
(error) NOREPLICAS Not enough good replicas to write.
127.0.0.1:8002>
Restart the slave and set the value again:
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8004/redis.conf
## Client test
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637556307066 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637556310080 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637556309000 7 connected 10923-16383
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637556309069 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637556306063 1 connected 0-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637556308000 2 connected 5461-10922
127.0.0.1:8002> set slfx53 222
OK
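3.8 Must the cluster be complete to serve requests?
When cluster-require-full-coverage is set to no in redis.conf, the cluster remains available even if the master responsible for some slots goes offline and has no slave to take over. When it is set to yes, the cluster becomes unavailable in that situation.
3.9 Why does a Redis cluster need at least three masters, and why is an odd number recommended?
Electing a new master requires agreement from more than half of the cluster's masters. With only two masters, when one goes down that condition cannot be met. An odd number of masters meets the election condition with one node fewer. Compare a three-master and a four-master cluster: if either loses one master, it can still elect a replacement; if either loses two, it cannot. So the preference for an odd number is mainly about saving machines.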
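The comparison between three and four masters is plain majority arithmetic. The sketch below, with illustrative names, computes how many votes an election needs and how many masters can be lost while that majority is still reachable.

```java
final class QuorumSketch {
    // Votes needed: strictly more than half of all masters.
    static int votesNeeded(int masters) {
        return masters / 2 + 1;
    }

    // Masters that can be down while the rest can still form a majority.
    static int tolerableFailures(int masters) {
        return masters - votesNeeded(masters);
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 5; n++) {
            System.out.println(n + " masters: need " + votesNeeded(n)
                    + " votes, tolerate " + tolerableFailures(n) + " failure(s)");
        }
    }
}
```

Three and four masters both tolerate a single failure, which is the point the paragraph above makes: the fourth master adds capacity but no extra fault tolerance for elections.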
3.10 Support for multi-key commands in Redis Cluster
For native multi-key commands such as mset and mget, Redis Cluster only supports the case where all keys fall in the same slot.
127.0.0.1:8002> mset k1 v1 k2 v2
(error) CROSSSLOT Keys in request don't hash to the same slot
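If you must run mset on several keys in a cluster, put {XX} at the front of each key. Only the text inside the braces is then used for the slot hash, which guarantees that the different keys land in the same slot. For example: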
127.0.0.1:8002> mset {user1}:1:name slfx {user1}:1:age 18
OK
127.0.0.1:8002> keys *
1) "slfx98"
2) "{user1}:1:name"
3) "slfx53"
4) "{user1}:1:age"
127.0.0.1:8002>
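Even if name and age on their own would hash to different slots, when this command runs in the cluster Redis only uses user1, the text inside the braces, for the slot calculation. The computed slot is therefore the same and both keys land in the same slot.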
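The extraction step can be sketched on its own. The code below follows the hash tag rule as documented for Redis Cluster: take the text between the first '{' and the first '}' after it, and fall back to the whole key when that text is empty or the braces are missing. Names are illustrative, and the result would then be fed into the CRC16 slot calculation.

```java
final class HashTagSketch {
    // Returns the part of the key that is hashed for slot selection.
    static String hashTag(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            // Require at least one character between the braces.
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(hashTag("{user1}:1:name"));
        System.out.println(hashTag("{user1}:1:age"));
        System.out.println(hashTag("slfx53"));
    }
}
```

Both {user1} keys reduce to the same string before hashing, so they cannot end up in different slots.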
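3.11 Sentinel leader election
When a sentinel considers a master to be down, it negotiates with the other sentinels to elect a sentinel leader that will carry out the failover. Every sentinel that detects the master going down can ask the others to elect it as leader, and votes are granted first come, first served. Each sentinel also increments the configuration epoch (election round) on every election, and only one sentinel leader is chosen per epoch. If more than half of all sentinels vote for a given sentinel, it becomes the leader. The leader then performs the failover and picks a new master from the surviving slaves; this election is very similar to the master election in a cluster. With only a single sentinel, Redis master/slave replication still works and a new master can still be chosen: if the master goes down, the lone sentinel is the leader and can promote a new master as usual. For high availability, however, at least three sentinels are recommended. The reason for recommending an odd number of sentinels is the same as for an odd number of cluster masters.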
IV. Operating a Redis Cluster
Starting from the original cluster, we add one master (8007) and one slave (8008). The cluster after the change is shown in the figure below, with the new nodes drawn in dashed boxes.
4.1 Adding Redis instances
Create directories 8007 and 8008 under /opt/redis-cluster and copy redis.conf from the 8001 folder into each of them.
## Create the folders and copy the config file
[root@k8s-master01 redis-cluster]# pwd
/opt/redis-cluster
[root@k8s-master01 redis-cluster]# mkdir 8007
[root@k8s-master01 redis-cluster]# mkdir 8008
[root@k8s-master01 redis-cluster]# cp 8001/redis.conf 8007/redis.conf
[root@k8s-master01 redis-cluster]# cp 8001/redis.conf 8008/redis.conf
## Edit the config files
[root@k8s-master01 redis-cluster]# pwd
/opt/redis-cluster
[root@k8s-master01 redis-cluster]# sed -i "s/8001/8007/g" 8007/redis.conf
[root@k8s-master01 redis-cluster]# sed -i "s/8001/8008/g" 8008/redis.conf
## Start 8007 and 8008 and check that they are running
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8007/redis.conf
/opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8008/redis.conf
[root@k8s-master01 redis-cluster]# /opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8007/redis.conf
[root@k8s-master01 redis-cluster]# /opt/redis-5.0.3/src/redis-server /opt/redis-cluster/8008/redis.conf
[root@k8s-master01 redis-cluster]# ps -ef|grep redis
root 12831 1 0 Nov21 ? 00:01:27 /opt/redis-5.0.3/src/redis-server *:8001 [cluster]
root 13020 1 0 Nov21 ? 00:01:26 /opt/redis-5.0.3/src/redis-server *:8005 [cluster]
root 13029 1 0 Nov21 ? 00:01:24 /opt/redis-5.0.3/src/redis-server *:8006 [cluster]
root 16862 1 0 18:34 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8007 [cluster]
root 16872 1 0 18:34 ? 00:00:00 /opt/redis-5.0.3/src/redis-server *:8008 [cluster]
root 16879 16244 0 18:34 pts/1 00:00:00 grep --color=auto redis
root 19264 1 0 Nov21 ? 00:01:17 /opt/redis-5.0.3/src/redis-server *:8003 [cluster]
root 31208 1 0 12:40 ? 00:00:22 /opt/redis-5.0.3/src/redis-server *:8002 [cluster]
root 31441 1 0 12:45 ? 00:00:21 /opt/redis-5.0.3/src/redis-server *:8004 [cluster]
[root@k8s-master01 redis-cluster]#
Show the help for the redis-cli cluster commands:
[root@k8s-master01 redis-cluster]# /opt/redis-5.0.3/src/redis-cli --cluster help
Cluster Manager Commands:
create host1:port1 ... hostN:portN
--cluster-replicas <arg>
check host:port
--cluster-search-multiple-owners
info host:port
fix host:port
--cluster-search-multiple-owners
reshard host:port
--cluster-from <arg>
--cluster-to <arg>
--cluster-slots <arg>
--cluster-yes
--cluster-timeout <arg>
--cluster-pipeline <arg>
--cluster-replace
rebalance host:port
--cluster-weight <node1=w1...nodeN=wN>
--cluster-use-empty-masters
--cluster-timeout <arg>
--cluster-simulate
--cluster-pipeline <arg>
--cluster-threshold <arg>
--cluster-replace
add-node new_host:new_port existing_host:existing_port
--cluster-slave
--cluster-master-id <arg>
del-node host:port node_id
call host:port command arg arg .. arg
set-timeout host:port milliseconds
import host:port
--cluster-from <arg>
--cluster-copy
--cluster-replace
help
For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.
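1. create: create a cluster from host1:port1 … hostN:portN
2. call: run a Redis command
3. add-node: add a node to the cluster; the first argument is the new node's ip:port, the second is the ip:port of any node already in the cluster
4. del-node: remove a node
5. reshard: redistribute slots
6. check: check the cluster state
Making 8007 a cluster master
Use add-node to add 8007 as a new master. The first ip:port is the node being added and the second is a node already in the cluster. If the log ends with "[OK] New node added correctly", the node has joined successfully.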
/opt/redis-5.0.3/src/redis-cli -a 123456 --cluster add-node 47.96.114.225:8007 47.96.114.225:8001
[root@k8s-master01 redis-cluster]# /opt/redis-5.0.3/src/redis-cli -a 123456 --cluster add-node 47.96.114.225:8007 47.96.114.225:8001
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 47.96.114.225:8007 to cluster 47.96.114.225:8001
>>> Performing Cluster Check (using node 47.96.114.225:8001)
M: 0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004
slots: (0 slots) slave
replicates 7255b58225c7108f9839169d6a621efbf6e63b1d
M: 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: 38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006
slots: (0 slots) slave
replicates 0a706941748a0863d94de0e4701af0f90a40c0ce
S: 52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003
slots: (0 slots) slave
replicates 2a09e8f47fa9a50a39416e9acfada5582fca4eb6
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 47.96.114.225:8007 to make it join the cluster.
[OK] New node added correctly.
Check the cluster state:
[root@k8s-master01 redis-cluster]# /opt/redis-5.0.3/src/redis-cli -a 123456 -c -p 8002
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:8002> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:7
cluster_size:3
cluster_current_epoch:7
cluster_my_epoch:2
cluster_stats_messages_ping_sent:22936
cluster_stats_messages_pong_sent:21906
cluster_stats_messages_fail_sent:4
cluster_stats_messages_sent:44846
cluster_stats_messages_ping_received:21905
cluster_stats_messages_pong_received:21515
cluster_stats_messages_meet_received:1
cluster_stats_messages_fail_received:1
cluster_stats_messages_received:43422
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637577633805 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637577631000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637577632000 7 connected 10923-16383
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637577632802 0 connected
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637577632000 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637577634806 1 connected 0-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637577633000 2 connected 5461-10922
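Note: a newly added node holds no data, because no slots (hash slots) have been assigned to it yet. We need to assign hash slots to the new node manually.
Use redis-cli to assign hash slots to 8007: pick any master in the cluster and reshard through it.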
/opt/redis-5.0.3/src/redis-cli -a 123456 --cluster reshard 47.96.114.225:8001
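The output looks like this:
How many slots do you want to move (from 1 to 16384)? 600
(Note: how many slots to move to the new node; choose your own number, for example 600 hash slots)
What is the receiving node ID? 4099074a0472c846d51128f952ce3cfb0b2801e8
(Note: which node receives the 600 slots; give its node ID)
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node 1:all
(Note: all takes slots from every current master, which at this point means 8001, 8002, and 8005, since 8003 became a slave after the failover. The slots are assigned to the new node; the listing that follows shows 199 + 201 + 199 = 599 slots actually moved for the requested 600)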
…
Do you want to proceed with the proposed reshard plan (yes/no)? yes
…
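To make the effect of answering all more concrete, here is a minimal sketch of one proportional split that reproduces the numbers in this walkthrough. The function name `reshard_shares` and the rounding rule (largest source rounded up, the others rounded down) are assumptions made for illustration, not a statement of how redis-cli is implemented; the slot counts come from the `cluster nodes` output shown before the reshard.

```python
import math


def reshard_shares(requested, source_slots):
    """Split `requested` slots across sources in proportion to their size.

    source_slots maps a node label to the number of slots it owns.
    Assumed rule: the largest source is rounded up, the rest are rounded down.
    """
    total = sum(source_slots.values())
    # Largest source first; sorted() is stable, so ties keep their input order.
    ordered = sorted(source_slots.items(), key=lambda kv: kv[1], reverse=True)
    shares = {}
    for index, (label, owned) in enumerate(ordered):
        exact = requested / total * owned
        shares[label] = math.ceil(exact) if index == 0 else math.floor(exact)
    return shares


# Slot counts before resharding:
# 8001 owns 0-5460, 8002 owns 5461-10922, 8005 owns 10923-16383.
before = {"8001": 5461, "8002": 5462, "8005": 5461}
shares = reshard_shares(600, before)
print(shares)
print(sum(shares.values()))
```

Under this assumed rule 8002 gives up 201 slots and 8001 and 8005 give up 199 each, 599 in total, which agrees with the slot ranges that 8007 owns in the `cluster nodes` output that follows.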
Check the latest cluster state:
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637578012000 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637578014000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637578015158 7 connected 11122-16383
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637578014156 8 connected 0-198 5461-5661 10923-11121
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637578011147 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637578012150 1 connected 199-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637578010000 2 connected 5662-10922
127.0.0.1:8002>
As shown above, 8007 now has hash slots, which means data can be read from and written to 8007. At this point 8007 has joined the cluster as a master node.
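Reading slot ranges by eye is error-prone, so a small script can total them instead. The sketch below parses the text of `cluster nodes` and counts the slots owned by each master. The helper name `slots_per_master` is hypothetical, and the parsing assumes the layout visible in the output above: eight fixed fields followed by slot ranges.

```python
def slots_per_master(cluster_nodes_text):
    """Return {address: slot_count} for every master in `cluster nodes` output."""
    counts = {}
    for line in cluster_nodes_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a node line
        address = fields[1].split("@")[0]  # drop the cluster bus port
        flags = fields[2].split(",")
        if "master" not in flags:
            continue
        total = 0
        # Everything after the eighth field is a slot or a slot range.
        for item in fields[8:]:
            if item.startswith("["):
                continue  # bracketed entries mark slots in migration; skipped here
            start, _, end = item.partition("-")
            total += int(end or start) - int(start) + 1
        counts[address] = total
    return counts


SAMPLE = """\
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637578014000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637578015158 7 connected 11122-16383
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637578014156 8 connected 0-198 5461-5661 10923-11121
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637578012150 1 connected 199-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637578010000 2 connected 5662-10922
"""

counts = slots_per_master(SAMPLE)
for address, count in counts.items():
    print(address, count)
```

For the sample lines, copied from the output above, the four masters add up to 16384 slots, with 599 of them on 8007.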
Configuring 8008 as a slave of 8007
Add node 8008 to the cluster:
/opt/redis-5.0.3/src/redis-cli -a 123456 --cluster add-node 47.96.114.225:8008 47.96.114.225:8001
Check the cluster state:
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637578117511 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637578121000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637578121524 7 connected 11122-16383
9e1062a5585bf0b2b54036fa37130fcf6c0fa0c1 47.96.114.225:8008@18008 master - 0 1637578118512 0 connected
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637578122527 8 connected 0-198 5461-5661 10923-11121
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637578120516 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637578119515 1 connected 199-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637578121000 2 connected 5662-10922
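As shown above, 8008 is still a master node and has not been assigned any hash slots.
We need to run the replicate command to tell the current node which master it should replicate. First connect a client to the newly added 8008 node, then use the cluster command to attach 8008 (the future slave) to a master node (here the 8007 master created earlier):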
/opt/redis-5.0.3/src/redis-cli -a 123456 -c -p 8008
[root@k8s-master01 ~]# /opt/redis-5.0.3/src/redis-cli -a 123456 -c -p 8008
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:8008> cluster replicate 4099074a0472c846d51128f952ce3cfb0b2801e8
OK
127.0.0.1:8008> cluster nodes
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637578300000 1 connected
7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002@18002 master - 0 1637578299000 2 connected 5662-10922
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637578299920 7 connected 11122-16383
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637578296000 2 connected
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637578300921 8 connected 0-198 5461-5661 10923-11121
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637578298415 7 connected
9e1062a5585bf0b2b54036fa37130fcf6c0fa0c1 172.17.12.88:8008@18008 myself,slave 4099074a0472c846d51128f952ce3cfb0b2801e8 0 1637578296000 0 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637578298918 1 connected 199-5460
127.0.0.1:8008>
The cluster state shows that 8008 has been added as a slave of 8007.
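The master/slave relationship can also be checked programmatically. In `cluster nodes` output the fourth field of a slave line holds the node ID of its master, so grouping on that field gives the replication topology. The following is a minimal sketch; `slaves_by_master` is a hypothetical helper name and the sample lines are taken from the output above.

```python
def slaves_by_master(cluster_nodes_text):
    """Return {master_address: [slave_address, ...]} from `cluster nodes` output."""
    nodes = {}
    for line in cluster_nodes_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a node line
        node_id = fields[0]
        address = fields[1].split("@")[0]  # drop the cluster bus port
        flags = fields[2].split(",")
        master_id = fields[3]  # "-" for masters
        nodes[node_id] = (address, flags, master_id)

    topology = {}
    for address, flags, master_id in nodes.values():
        if "slave" in flags and master_id in nodes:
            master_address = nodes[master_id][0]
            topology.setdefault(master_address, []).append(address)
    return topology


SAMPLE = """\
7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002@18002 master - 0 1637578299000 2 connected 5662-10922
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637578296000 2 connected
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637578300921 8 connected 0-198 5461-5661 10923-11121
9e1062a5585bf0b2b54036fa37130fcf6c0fa0c1 172.17.12.88:8008@18008 myself,slave 4099074a0472c846d51128f952ce3cfb0b2801e8 0 1637578296000 0 connected
"""

topology = slaves_by_master(SAMPLE)
print(topology)
```

A slave whose master is missing from the parsed text is silently skipped here; a stricter version might report it instead.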
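4.2 Removing Redis instances
Deleting slave node 8008
Use del-node to delete slave node 8008, giving the IP and port of the node to delete followed by its node ID (the final argument is the node ID of 8008):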
/opt/redis-5.0.3/src/redis-cli -a 123456 --cluster del-node 47.96.114.225:8008 9e1062a5585bf0b2b54036fa37130fcf6c0fa0c1
[root@k8s-master01 ~]# /opt/redis-5.0.3/src/redis-cli -a 123456 --cluster del-node 47.96.114.225:8008 9e1062a5585bf0b2b54036fa37130fcf6c0fa0c1
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Removing node 9e1062a5585bf0b2b54036fa37130fcf6c0fa0c1 from cluster 47.96.114.225:8008
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
Check the cluster state again. As shown below, the 8008 slave node has been removed and its redis service has also been stopped:
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637578528024 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637578526000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637578530032 7 connected 11122-16383
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637578528000 8 connected 0-198 5461-5661 10923-11121
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637578528000 6 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637578529025 1 connected 199-5460
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637578527000 2 connected 5662-10922
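Deleting master node 8007
Finally, we delete the master node 8007 that was added earlier. This takes a little more work because a master node has hash slots assigned to it: the slots on 8007 must first be moved to another available master node, and only then can the node be removed; otherwise data will be lost. (Each interactive reshard run has a single receiving node, so here all of 8007's slots go to one node instead of being spread evenly.) Run the following command: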
/opt/redis-5.0.3/src/redis-cli -a 123456 --cluster reshard 47.96.114.225:8007
[root@k8s-master01 ~]# /opt/redis-5.0.3/src/redis-cli -a 123456 --cluster reshard 47.96.114.225:8007
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing Cluster Check (using node 47.96.114.225:8007)
M: 4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007
slots:[0-198],[5461-5661],[10923-11121] (599 slots) master
S: 5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004
slots: (0 slots) slave
replicates 7255b58225c7108f9839169d6a621efbf6e63b1d
M: 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005
slots:[11122-16383] (5262 slots) master
1 additional replica(s)
M: 7255b58225c7108f9839169d6a621efbf6e63b1d 47.96.114.225:8002
slots:[5662-10922] (5261 slots) master
1 additional replica(s)
S: 38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006
slots: (0 slots) slave
replicates 0a706941748a0863d94de0e4701af0f90a40c0ce
M: 0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001
slots:[199-5460] (5262 slots) master
1 additional replica(s)
S: 52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003
slots: (0 slots) slave
replicates 2a09e8f47fa9a50a39416e9acfada5582fca4eb6
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 600
What is the receiving node ID? 0a706941748a0863d94de0e4701af0f90a40c0ce
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1: 4099074a0472c846d51128f952ce3cfb0b2801e8
Source node #2: done
Ready to move 600 slots.
Source nodes:
M: 4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007
slots:[0-198],[5461-5661],[10923-11121] (599 slots) master
Destination node:
M: 0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001
slots:[199-5460] (5262 slots) master
1 additional replica(s)
Resharding plan:
Moving slot 0 from 4099074a0472c846d51128f952ce3cfb0b2801e8
Moving slot 1 from 4099074a0472c846d51128f952ce3cfb0b2801e8
Moving slot 2 from 4099074a0472c846d51128f952ce3cfb0b2801e8
The same prompts again, with notes on each answer:
How many slots do you want to move (from 1 to 16384)? 600
What is the receiving node ID? 0a706941748a0863d94de0e4701af0f90a40c0ce
(Note: where the data should be moved to, i.e. the node ID of master 8001.)
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1: 4099074a0472c846d51128f952ce3cfb0b2801e8
(Note: the data source, i.e. the node ID of 8007.)
Source node #2: done
(Note: entering done here starts generating the migration plan.)
…
…
Do you want to proceed with the proposed reshard plan (yes/no)? yes
(Note: entering yes here starts the migration.)
We have now moved the data of master node 8007 to 8001. The current cluster state is shown below: 8007 no longer holds any hash slots, which confirms that the migration succeeded.
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637593092000 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637593091000 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637593094084 7 connected 11122-16383
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637593093000 8 connected
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637593095086 9 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637593093082 9 connected 0-5661 10923-11121
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637593092000 2 connected 5662-10922
127.0.0.1:8002>
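Because deleting a master that still owns slots loses data, it can be worth scripting this check before running del-node. The sketch below looks up one node by ID in the `cluster nodes` text and reports whether any slot fields remain. Both function names are hypothetical, and the sample lines are copied from the output above.

```python
def owned_slot_fields(cluster_nodes_text, node_id):
    """Return the raw slot fields (e.g. ['0-5661', '10923-11121']) for one node."""
    for line in cluster_nodes_text.strip().splitlines():
        fields = line.split()
        if fields and fields[0] == node_id:
            # The first eight fields are fixed; anything after them is slots.
            return fields[8:]
    raise KeyError(f"node {node_id} not found")


def safe_to_delete(cluster_nodes_text, node_id):
    """True when the node owns no slots (always the case for a slave)."""
    return len(owned_slot_fields(cluster_nodes_text, node_id)) == 0


SAMPLE = """\
4099074a0472c846d51128f952ce3cfb0b2801e8 47.96.114.225:8007@18007 master - 0 1637593093000 8 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637593093082 9 connected 0-5661 10923-11121
"""

node_8007 = "4099074a0472c846d51128f952ce3cfb0b2801e8"
node_8001 = "0a706941748a0863d94de0e4701af0f90a40c0ce"
print(safe_to_delete(SAMPLE, node_8007))
print(safe_to_delete(SAMPLE, node_8001))
```

This only inspects a snapshot of the text; it does not guard against slots being assigned between the check and the delete.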
Finally, delete master node 8007 directly with the del-node command:
/opt/redis-5.0.3/src/redis-cli -a 123456 --cluster del-node 47.96.114.225:8007 4099074a0472c846d51128f952ce3cfb0b2801e8
[root@k8s-master01 ~]# /opt/redis-5.0.3/src/redis-cli -a 123456 --cluster del-node 47.96.114.225:8007 4099074a0472c846d51128f952ce3cfb0b2801e8
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Removing node 4099074a0472c846d51128f952ce3cfb0b2801e8 from cluster 47.96.114.225:8007
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
View the cluster nodes from the client:
127.0.0.1:8002> cluster nodes
5420f9f4e146e0e38307aa1bd8fd1d1bd1ea7dde 47.96.114.225:8004@18004 slave 7255b58225c7108f9839169d6a621efbf6e63b1d 0 1637593714000 4 connected
52dafe801931dbe34ef518af7b35d6578c9e8860 47.96.114.225:8003@18003 slave 2a09e8f47fa9a50a39416e9acfada5582fca4eb6 0 1637593715471 7 connected
2a09e8f47fa9a50a39416e9acfada5582fca4eb6 47.96.114.225:8005@18005 master - 0 1637593716474 7 connected 11122-16383
38eb38de65af79da6d3edd3d04f3f96218dc6130 47.96.114.225:8006@18006 slave 0a706941748a0863d94de0e4701af0f90a40c0ce 0 1637593713000 9 connected
0a706941748a0863d94de0e4701af0f90a40c0ce 47.96.114.225:8001@18001 master - 0 1637593715000 9 connected 0-5661 10923-11121
7255b58225c7108f9839169d6a621efbf6e63b1d 172.17.12.88:8002@18002 myself,master - 0 1637593715000 2 connected 5662-10922
127.0.0.1:8002>