Redis Cluster High-Availability Series [3]: Installing a Redis Cluster (Redis 3.0.3 + CentOS 6.6 x64)

A Redis 3.0 cluster needs at least 3 master nodes to work properly, and to achieve high availability each master node should have at least 1 slave. Based on these characteristics and requirements, the cluster deployment is planned as follows:

Use 6 servers (physical or virtual machines) to deploy 3 masters + 3 slaves:


Hostname     | IP            | Service port [default 6379] | Cluster bus port [service port + 10000] | Role
edu-redis-01 | 192.168.1.111 | 7111                        | 17111                                   | Master
edu-redis-02 | 192.168.1.112 | 7112                        | 17112                                   | Master
edu-redis-03 | 192.168.1.113 | 7113                        | 17113                                   | Master
edu-redis-04 | 192.168.1.114 | 7114                        | 17114                                   | Slave
edu-redis-05 | 192.168.1.115 | 7115                        | 17115                                   | Slave
edu-redis-06 | 192.168.1.116 | 7116                        | 17116                                   | Slave

According to the plan, open the corresponding ports in each host's firewall:

192.168.1.111

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7111 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17111 -j ACCEPT

192.168.1.112

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7112 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17112 -j ACCEPT

192.168.1.113

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7113 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17113 -j ACCEPT

192.168.1.114

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7114 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17114 -j ACCEPT

192.168.1.115

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7115 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17115 -j ACCEPT

192.168.1.116

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7116 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17116 -j ACCEPT

Installation directory:

/usr/local/redis3

User:

root

Packages required for building and installing:

# yum install gcc tcl

Download (or upload) Redis 3:

# cd /usr/local/src

# wget http://download.redis.io/releases/redis-3.0.3.tar.gz

Create the installation directory:

# mkdir /usr/local/redis3

Extract:

# tar -zxvf redis-3.0.3.tar.gz

# cd redis-3.0.3

Install (use PREFIX to specify the installation directory):

# make PREFIX=/usr/local/redis3 install

After installation, /usr/local/redis3 contains a bin directory holding the Redis executables:

redis-benchmark redis-check-aof redis-check-dump redis-cli redis-server

Create the cluster configuration directory on each node and copy the redis.conf configuration file into it:

192.168.1.111

# mkdir -p /usr/local/redis3/cluster/7111

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7111/redis-7111.conf

192.168.1.112

# mkdir -p /usr/local/redis3/cluster/7112

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7112/redis-7112.conf

192.168.1.113

# mkdir -p /usr/local/redis3/cluster/7113

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7113/redis-7113.conf

192.168.1.114

# mkdir -p /usr/local/redis3/cluster/7114

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7114/redis-7114.conf

192.168.1.115

# mkdir -p /usr/local/redis3/cluster/7115

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7115/redis-7115.conf

192.168.1.116

# mkdir -p /usr/local/redis3/cluster/7116

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7116/redis-7116.conf

Modify the following options in each node's configuration file.

The six nodes' configuration files are identical except for the node-specific values (port, pidfile, cluster-config-file and dir); the table below shows the values for node 7111:

Option                        | Value                                     | Description
daemonize                     | yes                                       | Run as a daemon
pidfile                       | /var/run/redis-7111.pid                   | PID file when running as a daemon (default /var/run/redis.pid)
port                          | 7111                                      | Listening port, default 6379. Note: the cluster bus port defaults to this port + 10000, e.g. 17111
databases                     | 1                                         | Number of databases, default 16; data goes to DB 0 by default. Unless there is a special need, configure a single database (Redis Cluster only uses DB 0)
cluster-enabled               | yes                                       | Enable Redis Cluster mode
cluster-config-file           | /usr/local/redis3/cluster/7111/nodes.conf | Cluster state file, generated automatically at startup; do not edit it by hand
cluster-node-timeout          | 15000                                     | Node inter-connection timeout, in milliseconds
cluster-migration-barrier     | 1                                         | Replica migration threshold: the minimum number of healthy slaves a master must keep before it will give one slave away to a master that has none
cluster-require-full-coverage | yes                                       | If any part of the key space is not covered by any node, the cluster stops accepting writes
appendonly                    | yes                                       | Enable AOF persistence. RDB snapshots are written only according to the save conditions, so some data may exist only in memory for a while. Default: no
dir                           | /usr/local/redis3/cluster/7111            | Directory where this node's persistence files are stored (recommended to set explicitly)

A cluster configuration file containing only the minimal options looks like this:

port 7000

cluster-enabled yes

cluster-config-file nodes.conf

cluster-node-timeout 5000

appendonly yes
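Putting the table together, the cluster-related options for node 7111 assemble into the following block (taken directly from the table above; the other five nodes differ only in the port number and in the "7111" parts of the paths):

daemonize yes
pidfile /var/run/redis-7111.pid
port 7111
databases 1
cluster-enabled yes
cluster-config-file /usr/local/redis3/cluster/7111/nodes.conf
cluster-node-timeout 15000
cluster-migration-barrier 1
cluster-require-full-coverage yes
appendonly yes
dir /usr/local/redis3/cluster/7111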

Start the 6 Redis node instances with the following commands:

192.168.1.111

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7111/redis-7111.conf

192.168.1.112

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7112/redis-7112.conf

192.168.1.113

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7113/redis-7113.conf

192.168.1.114

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7114/redis-7114.conf

192.168.1.115

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7115/redis-7115.conf

192.168.1.116

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7116/redis-7116.conf

After starting them, check the instances with ps:

[root@edu-redis-01 cluster]# ps -ef | grep redis

root 5443 1 0 22:49 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7111 [cluster]

[root@edu-redis-02 cluster]# ps -ef | grep redis

root 5421 1 0 22:49 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7112 [cluster]

[root@edu-redis-03 cluster]# ps -ef | grep redis

root 5457 1 0 22:49 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7113 [cluster]

[root@edu-redis-04 cluster]# ps -ef | grep redis

root 5379 1 0 22:50 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7114 [cluster]

[root@edu-redis-05 cluster]# ps -ef | grep redis

root 5331 1 0 22:50 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7115 [cluster]

[root@edu-redis-06 cluster]# ps -ef | grep redis

root 5687 1 0 22:50 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7116 [cluster]

Note: after starting, the 6 Redis instances do not yet form a cluster.

----------------------------------------------- Next: create the cluster ---------------------------------------------

Install ruby and rubygems (note: ruby 1.8.7 or later is required):

# yum install ruby rubygems

Check the ruby version:

# ruby -v

ruby 1.8.7 (2013-06-27 patchlevel 374) [x86_64-linux]

Install the redis ruby binding with gem:

# gem install redis

Successfully installed redis-3.2.1

1 gem installed

Installing ri documentation for redis-3.2.1...

Installing RDoc documentation for redis-3.2.1...

Run the Redis cluster creation command (it only needs to be run once, on any one of the nodes):

# cd /usr/local/src/redis-3.0.3/src/

# cp redis-trib.rb /usr/local/bin/redis-trib

# redis-trib create --replicas 1 192.168.1.114:7114 192.168.1.115:7115 192.168.1.116:7116 192.168.1.111:7111 192.168.1.112:7112 192.168.1.113:7113

Output:

>>> Creating cluster

Connecting to node 192.168.1.114:7114: OK

Connecting to node 192.168.1.115:7115: OK

Connecting to node 192.168.1.116:7116: OK

Connecting to node 192.168.1.111:7111: OK

Connecting to node 192.168.1.112:7112: OK

Connecting to node 192.168.1.113:7113: OK

>>> Performing hash slots allocation on 6 nodes...

Using 3 masters:

192.168.1.113:7113

192.168.1.112:7112

192.168.1.111:7111

Adding replica 192.168.1.116:7116 to 192.168.1.113:7113

Adding replica 192.168.1.115:7115 to 192.168.1.112:7112

Adding replica 192.168.1.114:7114 to 192.168.1.111:7111

S: 007a3fe8d7451d3d0a78fffd2653c8641809499c 192.168.1.114:7114

replicates 94e140b9ca0735040ae3428983835f1d93327aeb

S: ea69b6b6e2e7723eed50b1dabea9d244ccf3f098 192.168.1.115:7115

replicates c642b3071c4b2b073707ed3c3a2c16d53a549eff

S: 5f09dc0671732cf06a09f28631c90e0c68408520 192.168.1.116:7116

replicates 896a3c99da4fcf680de1f42406fccb551d8c40c3

M: 94e140b9ca0735040ae3428983835f1d93327aeb 192.168.1.111:7111

slots:10923-16383 (5461 slots) master

M: c642b3071c4b2b073707ed3c3a2c16d53a549eff 192.168.1.112:7112

slots:5461-10922 (5462 slots) master

M: 896a3c99da4fcf680de1f42406fccb551d8c40c3 192.168.1.113:7113

slots:0-5460 (5461 slots) master

Can I set the above configuration? (type 'yes' to accept): yes

(After typing yes and pressing Enter, the tool applies the configuration to each node and joins the nodes together, i.e. the nodes start talking to each other.) Output:

>>> Nodes configuration updated

>>> Assign a different config epoch to each node

>>> Sending CLUSTER MEET messages to join the cluster

Waiting for the cluster to join....

>>> Performing Cluster Check (using node 192.168.1.114:7114)

M: 007a3fe8d7451d3d0a78fffd2653c8641809499c 192.168.1.114:7114

slots: (0 slots) master

replicates 94e140b9ca0735040ae3428983835f1d93327aeb

M: ea69b6b6e2e7723eed50b1dabea9d244ccf3f098 192.168.1.115:7115

slots: (0 slots) master

replicates c642b3071c4b2b073707ed3c3a2c16d53a549eff

M: 5f09dc0671732cf06a09f28631c90e0c68408520 192.168.1.116:7116

slots: (0 slots) master

replicates 896a3c99da4fcf680de1f42406fccb551d8c40c3

M: 94e140b9ca0735040ae3428983835f1d93327aeb 192.168.1.111:7111

slots:10923-16383 (5461 slots) master

M: c642b3071c4b2b073707ed3c3a2c16d53a549eff 192.168.1.112:7112

slots:5461-10922 (5462 slots) master

M: 896a3c99da4fcf680de1f42406fccb551d8c40c3 192.168.1.113:7113

slots:0-5460 (5461 slots) master

If everything is fine, the following is printed:

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

The last line means that all 16384 slots in the cluster are served by at least one master node, so the cluster is working properly.

Notes on the cluster creation command:

(1) The subcommand given to redis-trib is create, meaning we want to create a new cluster;

(2) --replicas 1 means every master node gets one slave node;

(3) the remaining arguments are the list of instance addresses used to build the new cluster;

In short, the command asks redis-trib to create a cluster with three masters and three slaves. redis-trib then prints the proposed configuration; if it looks right (check that the master/slave pairing is what you want), type yes and redis-trib applies that configuration to the cluster.
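The same check can also be done from a client. A minimal Jedis sketch (assuming Jedis 2.7.x on the classpath; the class name is illustrative) that connects to any one node and verifies the cluster state:

import redis.clients.jedis.Jedis;

public class ClusterStateCheck {
    public static void main(String[] args) {
        // Connect to any single node and ask for its view of the cluster.
        Jedis jedis = new Jedis("192.168.1.111", 7111);
        try {
            String info = jedis.clusterInfo();
            System.out.println(info);
            // A healthy cluster reports cluster_state:ok and cluster_slots_assigned:16384.
            if (info.contains("cluster_state:ok")) {
                System.out.println("Cluster is up and all slots are covered.");
            }
        } finally {
            jedis.close();
        }
    }
}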

Simple cluster test

Connect to the cluster with redis-cli:

[root@edu-redis-04 bin]# ./redis-cli -c -p 7114

127.0.0.1:7114> set Redis Redis

-> Redirected to slot [8559] located at 192.168.1.112:7112

OK

[root@edu-redis-01 bin]# ./redis-cli -c -p 7111

127.0.0.1:7111> get Redis

-> Redirected to slot [8559] located at 192.168.1.112:7112

"Redis"

[root@edu-redis-02 bin]# ./redis-cli -c -p 7112

127.0.0.1:7112> get Redis

"Redis"

127.0.0.1:7112>

[root@edu-redis-01 bin]# ./redis-cli -p 7111 cluster nodes

[Screenshot: output of cluster nodes]

Running Redis as a service

(applies to non-pseudo clusters, i.e. when each node is deployed on its own machine):

Following the steps above, the Redis startup script is: /usr/local/src/redis-3.0.3/utils/redis_init_script

Copy the startup script to /etc/rc.d/init.d/ and name it redis:

# cp /usr/local/src/redis-3.0.3/utils/redis_init_script /etc/rc.d/init.d/redis

Edit /etc/rc.d/init.d/redis and adjust the settings so that it can be registered as a service:

# vi /etc/rc.d/init.d/redis

#!/bin/sh

#

# Simple Redis init.d script conceived to work on Linux systems

# as it does use of the /proc filesystem.

REDISPORT=6379

EXEC=/usr/local/bin/redis-server

CLIEXEC=/usr/local/bin/redis-cli

PIDFILE=/var/run/redis_${REDISPORT}.pid

CONF="/etc/redis/${REDISPORT}.conf"

case "$1" in

start)

if [ -f $PIDFILE ]

then

echo "$PIDFILE exists, process is already running or crashed"

else

echo "Starting Redis server..."

$EXEC $CONF

fi

;;

stop)

if [ ! -f $PIDFILE ]

then

echo "$PIDFILE does not exist, process is not running"

else

PID=$(cat $PIDFILE)

echo "Stopping ..."

$CLIEXEC -p $REDISPORT shutdown

while [ -x /proc/${PID} ]

do

echo "Waiting for Redis to shutdown ..."

sleep 1

done

echo "Redis stopped"

fi

;;

*)

echo "Please use start or stop as first argument"

;;

esac

Review the service script above and prepare the following changes:

(1) Add a line right after the first line:

#chkconfig: 2345 80 90

(Without it, registering the service fails with: service redis does not support chkconfig)

(2) Change REDISPORT to the node's port (note that the port is also used in the configuration file name below);

(3) Change EXEC=/usr/local/bin/redis-server to EXEC=/usr/local/redis3/bin/redis-server

(4) Change CLIEXEC=/usr/local/bin/redis-cli to CLIEXEC=/usr/local/redis3/bin/redis-cli

(5) Adjust the CONF setting:

CONF="/etc/redis/${REDISPORT}.conf"

becomes CONF="/usr/local/redis3/cluster/${REDISPORT}/redis-${REDISPORT}.conf"

(6) Change the start command so the server runs in the background:

$EXEC $CONF &   (the trailing "&" sends the process to the background)

The modified /etc/rc.d/init.d/redis service script (note that the port differs per node):

#!/bin/sh

#chkconfig: 2345 80 90

#

# Simple Redis init.d script conceived to work on Linux systems

# as it does use of the /proc filesystem.

REDISPORT=7111

EXEC=/usr/local/redis3/bin/redis-server

CLIEXEC=/usr/local/redis3/bin/redis-cli

PIDFILE=/var/run/redis-${REDISPORT}.pid

CONF="/usr/local/redis3/cluster/${REDISPORT}/redis-${REDISPORT}.conf"

case "$1" in

start)

if [ -f $PIDFILE ]

then

echo "$PIDFILE exists, process is already running or crashed"

else

echo "Starting Redis server..."

$EXEC $CONF &

fi

;;

stop)

if [ ! -f $PIDFILE ]

then

echo "$PIDFILE does not exist, process is not running"

else

PID=$(cat $PIDFILE)

echo "Stopping ..."

$CLIEXEC -p $REDISPORT shutdown

while [ -x /proc/${PID} ]

do

echo "Waiting for Redis to shutdown ..."

sleep 1

done

echo "Redis stopped"

fi

;;

*)

echo "Please use start or stop as first argument"

;;

esac

After the changes above, register Redis as a service:

# chkconfig --add redis

Open the corresponding ports in the firewall; they differ per node (skip this step if already done earlier):

# vi /etc/sysconfig/iptables

Add:

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7111 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17111 -j ACCEPT

Restart the firewall:

# service iptables restart

Start the Redis service:

# service redis start

Add Redis to the environment variables:

# vi /etc/profile

Append the following at the end:

## Redis env

export PATH=$PATH:/usr/local/redis3/bin

Apply the changes:

# source /etc/profile

Now redis-cli and the other redis commands can be used directly:

[Screenshot: running redis-cli from any directory]

Stop the Redis service:

# service redis stop

By default Redis has no authentication enabled; a password can be required by setting requirepass in /usr/local/redis3/cluster/7111/redis-7111.conf.

Additional reference material

Testing the Redis cluster (using the Jedis client)

1. It is recommended to upgrade the Jedis client to the latest version (2.7.3 at the time of writing), which has good support for 3.0.x clusters.

2. Connecting to the Redis cluster directly from Java code:

// Assumes Jedis 2.7.x on the classpath and a logger field named "log" in the enclosing class.
import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.JedisPoolConfig;

// Connection pool configuration
JedisPoolConfig config = new JedisPoolConfig();
config.setMaxTotal(100);
config.setMaxIdle(50);
config.setMinIdle(20);
config.setMaxWaitMillis(6 * 1000);
config.setTestOnBorrow(true);

// The set of Redis cluster nodes
Set<HostAndPort> jedisClusterNodes = new HashSet<HostAndPort>();
jedisClusterNodes.add(new HostAndPort("192.168.1.111", 7111));
jedisClusterNodes.add(new HostAndPort("192.168.1.112", 7112));
jedisClusterNodes.add(new HostAndPort("192.168.1.113", 7113));
jedisClusterNodes.add(new HostAndPort("192.168.1.114", 7114));
jedisClusterNodes.add(new HostAndPort("192.168.1.115", 7115));
jedisClusterNodes.add(new HostAndPort("192.168.1.116", 7116));

// Create the cluster connection object from the node set
//JedisCluster jedisCluster = new JedisCluster(jedisClusterNodes);
// nodes, timeout, max redirections, connection pool
JedisCluster jedisCluster = new JedisCluster(jedisClusterNodes, 2000, 100, config);

int num = 1000;
String key = "redis";
String value = "";
for (int i = 1; i <= num; i++) {
    // write
    jedisCluster.set(key + i, "Redis" + i);
    // read
    value = jedisCluster.get(key + i);
    log.info(key + i + "=" + value);
    // delete
    //jedisCluster.del(key + i);
    //value = jedisCluster.get(key + i);
    //log.info(key + i + "=" + value);
}

3. Spring configuration for connecting Jedis to a Redis 3.0 cluster:
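The original Spring XML configuration is not reproduced here. As a sketch only, an equivalent Java-based Spring configuration (assuming Spring's @Configuration support and Jedis 2.7.x; the bean name matches the getBean("jedisCluster") call below, and the pool settings mirror the plain-Java example above) could look like this:

import java.util.HashSet;
import java.util.Set;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.JedisPoolConfig;

@Configuration
public class RedisClusterConfig {

    @Bean(name = "jedisCluster")
    public JedisCluster jedisCluster() {
        // Connection pool settings, same as in the plain-Java example above
        JedisPoolConfig poolConfig = new JedisPoolConfig();
        poolConfig.setMaxTotal(100);
        poolConfig.setMaxIdle(50);
        poolConfig.setMinIdle(20);
        poolConfig.setMaxWaitMillis(6 * 1000);
        poolConfig.setTestOnBorrow(true);

        // Seed nodes; JedisCluster discovers the rest of the cluster from them
        Set<HostAndPort> nodes = new HashSet<HostAndPort>();
        nodes.add(new HostAndPort("192.168.1.111", 7111));
        nodes.add(new HostAndPort("192.168.1.112", 7112));
        nodes.add(new HostAndPort("192.168.1.113", 7113));
        nodes.add(new HostAndPort("192.168.1.114", 7114));
        nodes.add(new HostAndPort("192.168.1.115", 7115));
        nodes.add(new HostAndPort("192.168.1.116", 7116));

        // nodes, timeout (ms), max redirections, pool config
        return new JedisCluster(nodes, 2000, 100, poolConfig);
    }
}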

The corresponding Java calling code sample:

JedisCluster jedisCluster = (JedisCluster) context.getBean("jedisCluster");
int num = 1000;
String key = "redis";
String value = "";
for (int i = 1; i <= num; i++) {
    // write
    jedisCluster.set(key + i, "Redis" + i);
    // read
    value = jedisCluster.get(key + i);
    log.info(key + i + "=" + value);
    // delete
    //jedisCluster.del(key + i);
}

Redis cluster high-availability test

I. Redis cluster characteristics

1. Cluster architecture:

(1) All redis nodes are interconnected (PING-PONG mechanism) and use a binary protocol internally to optimize transfer speed and bandwidth;

(2) A node is only marked as failed when more than half of the nodes in the cluster detect the failure;

(3) Clients connect to redis nodes directly, without an intermediate proxy layer. A client does not need to connect to all cluster nodes, only to any one reachable node;

(4) redis-cluster maps all physical nodes onto the slots [0-16383] (hash slots); the cluster is responsible for maintaining the node <-> slot <-> value mapping (see the slot calculation sketch after this list).
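On the client side the same mapping is simply CRC16(key) mod 16384 (with special handling for hash tags in {...}). A minimal sketch of computing a key's slot locally, assuming Jedis 2.7.x where the helper class lives in redis.clients.util:

import redis.clients.util.JedisClusterCRC16;

public class SlotDemo {
    public static void main(String[] args) {
        // CRC16(key) mod 16384 decides which hash slot, and therefore which master, owns the key.
        String key = "Redis";
        int slot = JedisClusterCRC16.getSlot(key);
        // The redis-cli test earlier in this guide showed the server redirecting this key to slot 8559.
        System.out.println(key + " -> slot " + slot);
    }
}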

2. Cluster election and fault tolerance:

(1) Failure detection involves all masters in the cluster: if more than half of the masters time out (cluster-node-timeout) while communicating with the master being checked, that master is considered down;

(2) When does the whole cluster become unavailable (cluster_state:fail)?

A: If any master goes down and it has no slave, the cluster enters the fail state; in other words, the cluster fails whenever the slot mapping [0-16383] is incomplete. P.S.: redis-3.0.0.rc1 introduced the cluster-require-full-coverage parameter; setting it to no lets the cluster keep serving the part of the key space that is still covered;

B: If more than half of the masters go down, the cluster enters the fail state regardless of slaves. P.S.: while the cluster is unavailable, every operation against it fails with the error (error) CLUSTERDOWN The cluster is down.

II. Client cluster commands

Cluster

cluster info : print cluster information.

cluster nodes : list all nodes currently known to the cluster, along with their details.

Nodes

cluster meet <ip> <port> : add the node at ip:port to the cluster, making it part of the cluster.

cluster forget <node_id> : remove the node identified by node_id from the cluster.

cluster replicate <node_id> : make the current node a slave of the node identified by node_id.

cluster saveconfig : save the node's configuration file to disk.

Slots

cluster addslots <slot> [slot ...] : assign one or more slots to the current node.

cluster delslots <slot> [slot ...] : remove the assignment of one or more slots from the current node.

cluster flushslots : remove all slots assigned to the current node, leaving it with no slots.

cluster setslot <slot> node <node_id> : assign the slot to the node identified by node_id; if the slot is already assigned to another node, that node drops it first.

cluster setslot <slot> migrating <node_id> : migrate this node's slot to the node identified by node_id.

cluster setslot <slot> importing <node_id> : import the slot from the node identified by node_id into this node.

cluster setslot <slot> stable : cancel an in-progress import or migration of the slot.

cluster keyslot <key> : compute which slot the key should be placed in.

cluster countkeysinslot <slot> : return the number of key-value pairs currently held in the slot.

cluster getkeysinslot <slot> <count> : return up to count keys from the slot.
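These are redis-cli commands, but the same operations are exposed through Jedis. A small read-only sketch (assuming Jedis 2.7.x, connected to any single node):

import redis.clients.jedis.Jedis;

public class ClusterCommandDemo {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("192.168.1.111", 7111);
        try {
            System.out.println(jedis.clusterInfo());   // CLUSTER INFO
            System.out.println(jedis.clusterNodes());  // CLUSTER NODES
            long slot = jedis.clusterKeySlot("Redis"); // CLUSTER KEYSLOT
            System.out.println("key 'Redis' maps to slot " + slot);
            // CLUSTER COUNTKEYSINSLOT is only meaningful on the node that owns the slot.
            System.out.println(jedis.clusterCountKeysInSlot((int) slot));
        } finally {
            jedis.close();
        }
    }
}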

III. Cluster high-availability test

1. Rebuild the cluster. Steps:

(1) Shut down every node of the cluster;

(2) On each node, delete nodes.conf, appendonly.aof and dump.rdb from the data directory;

# rm -f appendonly.aof dump.rdb nodes.conf

(3) Start all nodes again:

192.168.1.111

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7111/redis-7111.conf

192.168.1.112

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7112/redis-7112.conf

192.168.1.113

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7113/redis-7113.conf

192.168.1.114

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7114/redis-7114.conf

192.168.1.115

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7115/redis-7115.conf

192.168.1.116

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7116/redis-7116.conf

(4) Run the cluster creation command (only once, on any one of the nodes):

# cd /usr/local/src/redis-3.0.3/src/

# cp redis-trib.rb /usr/local/bin/redis-trib

# redis-trib create --replicas 1 192.168.1.114:7114 192.168.1.115:7115 192.168.1.116:7116 192.168.1.111:7111 192.168.1.112:7112 192.168.1.113:7113

2. Check the current state of each node in the cluster:

[root@edu-redis-01 7111]# /usr/local/redis3/bin/redis-cli -c -p 7111

127.0.0.1:7111> cluster nodes

3. Use the demo application to write 1000 key-value pairs into the cluster

Log into each node with /usr/local/redis3/bin/redis-cli -c -p 711X and run keys * to see which keys each node holds.

4. Run the demo application to read back all key-value pairs

Stop if any value comes back empty (see the read-back sketch below).
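A minimal sketch of such a read-back check, matching the keys written by the Jedis demo code earlier (redis1 .. redis1000; the class name is illustrative):

import java.util.HashSet;
import java.util.Set;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class ReadBackCheck {
    public static void main(String[] args) {
        // One seed node is enough; JedisCluster discovers the rest of the cluster from it.
        Set<HostAndPort> nodes = new HashSet<HostAndPort>();
        nodes.add(new HostAndPort("192.168.1.111", 7111));
        JedisCluster cluster = new JedisCluster(nodes);
        for (int i = 1; i <= 1000; i++) {
            String value = cluster.get("redis" + i);
            if (value == null) {
                // A missing value would indicate data loss, e.g. during a failover.
                System.out.println("redis" + i + " is missing, stopping");
                return;
            }
        }
        System.out.println("All 1000 keys read back successfully");
    }
}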

5. Simulate a cluster node going down

(1) Have the Jedis client operate on cluster data in a loop (simulating continuous user traffic);

(2) Check the current state of the Redis cluster (as a baseline for the state changes that follow):

[Screenshot: cluster nodes output before the failover test]

(3) Shut down one of the master nodes (7111);

(4) Watch how the state of that master and of its slave changes:

[Screenshot: cluster nodes output during the failover (fail? / fail)]

A node state of fail? means the cluster is still deciding whether the node has failed.

A node state of fail means the node has failed, and its slave is promoted to master.

(5) Check the cluster state again:

# /usr/local/src/redis-3.0.3/src/redis-trib.rb check 192.168.1.116:7116

[Screenshot: redis-trib check output after the failover]

As shown above, node 7114 has taken over from 7111, being promoted from slave to master.

Running the demo application again to fetch all key-value pairs still works, which confirms that the slave replaced the master successfully and the cluster is healthy.

6. Recover the failed node

(1) Start 7111:

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7111/redis-7111.conf

(2) Check the cluster state:

[Screenshot: cluster nodes output after restarting 7111]

7111 has now become a slave of 7114.

7. Observe the impact on clients while cluster nodes switch roles

Common exceptions seen when operating on the cluster through JedisCluster:

(1) Too many redirections

redis.clients.jedis.exceptions.JedisClusterMaxRedirectionsException: Too many Cluster redirections

Fix: set maxRedirections when initializing JedisCluster:

// cluster node set, timeout (default 2 seconds), max redirections (default 5), connection pool
new JedisCluster(jedisClusterNodes, 2000, 100, config);

(2) Cluster unavailable

redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down

Cause: there is a brief interruption while the cluster switches node states; the client just needs to retry the operation.

(3) Connection timeout

redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out

Fix: set the timeout when initializing JedisCluster (the default is two seconds); the default in the Jedis source can also be changed.
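For the transient CLUSTERDOWN and read-timeout cases above, a simple client-side retry is usually enough. A sketch (retry count and back-off are illustrative, not prescriptive):

import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.exceptions.JedisClusterException;
import redis.clients.jedis.exceptions.JedisConnectionException;

public class RetryingSet {
    // Retries a SET a few times, since CLUSTERDOWN and timeouts during a failover are usually short-lived.
    public static void setWithRetry(JedisCluster cluster, String key, String value) {
        int attempts = 0;
        while (true) {
            try {
                cluster.set(key, value);
                return;
            } catch (JedisClusterException | JedisConnectionException e) {
                if (++attempts >= 3) {
                    throw e; // give up after a few attempts
                }
                try {
                    Thread.sleep(500); // give the cluster a moment to finish the failover
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}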

8. Summary:

Advantages:

When a master node goes down, its slave is automatically promoted to master so the cluster keeps serving; when the failed node comes back, it automatically rejoins the cluster as a slave.

Disadvantages:

Because Redis replication is asynchronous, the cluster may lose write commands during an automatic failover. However, Redis sends the reply to the client and replicates the command to the slave at almost the same time, so in practice the window in which commands can be lost is very small.

Redis cluster scaling test

I. Install new Redis nodes for the scaling test

1. Install Redis 3 on the 192.168.1.117 virtual machine in the same way and start two instances, planned as follows:

Hostname     | IP            | Service port [default 6379] | Cluster bus port [service port + 10000] | Role
edu-redis-07 | 192.168.1.117 | 7117                        | 17117                                   | Master
edu-redis-07 | 192.168.1.117 | 7118                        | 17118                                   | Slave

According to the plan, open the corresponding ports in the firewall on 192.168.1.117:

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7117 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17117 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 7118 -j ACCEPT

-A INPUT -m state --state NEW -m tcp -p tcp --dport 17118 -j ACCEPT

2. Redis installation procedure

Same as the cluster installation above:

# yum install gcc tcl

# cd /usr/local/src

# wget http://download.redis.io/releases/redis-3.0.3.tar.gz

# mkdir /usr/local/redis3

# tar -zxvf redis-3.0.3.tar.gz

# cd redis-3.0.3

# make PREFIX=/usr/local/redis3 install

# yum install ruby rubygems

# gem install redis

3. Create the cluster configuration directories and copy the redis.conf configuration file into each:

192.168.1.117

# mkdir -p /usr/local/redis3/cluster/7117

# mkdir -p /usr/local/redis3/cluster/7118

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7117/redis-7117.conf

# cp /usr/local/src/redis-3.0.3/redis.conf /usr/local/redis3/cluster/7118/redis-7118.conf

Note: see the redis-7117.conf and redis-7118.conf files provided with the tutorial for the full contents; the main addition is the dir (data directory) setting.

4. On 192.168.1.117, start the two Redis instances:

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7117/redis-7117.conf

# /usr/local/redis3/bin/redis-server /usr/local/redis3/cluster/7118/redis-7118.conf

# ps -ef | grep redis

root 4865 1 0 01:01 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7117 [cluster]

root 4869 1 0 01:01 ? 00:00:00 /usr/local/redis3/bin/redis-server *:7118 [cluster]

II. Redis cluster scalability test

1. Overview of the redis-trib.rb command:

[root@edu-redis-01 src]# /usr/local/src/redis-3.0.3/src/redis-trib.rb

Usage: redis-trib <command> <options> <arguments ...>

  import          host:port
                  --from <arg>
  set-timeout     host:port milliseconds
  del-node        host:port node_id
  create          host1:port1 ... hostN:portN
                  --replicas <arg>
  help            (show this help)
  add-node        new_host:new_port existing_host:existing_port
                  --slave
                  --master-id <arg>
  reshard         host:port
                  --slots <arg>
                  --to <arg>
                  --yes
                  --from <arg>
  fix             host:port
  check           host:port
  call            host:port command arg arg .. arg

For check, fix, reshard, del-node, set-timeout you can specify the host and port of any working node in the cluster.

redis-trib.rb subcommands:

call: run a redis command against the cluster

create: create a new cluster (covered above)

add-node: add a node to the cluster; the first argument is the new node's ip:port, the second is the ip:port of any healthy node already in the cluster (with --slave and --master-id the node is added as a slave)

reshard: re-shard, i.e. move slots between nodes

check: check the cluster's state

del-node: remove a node

2. Add a new master node:

add-node adds a node to the cluster; the first argument is the new node's ip:port, the second is the ip:port of any existing node:

# /usr/local/src/redis-3.0.3/src/redis-trib.rb add-node 192.168.1.117:7117 192.168.1.111:7111

>>> Adding node 192.168.1.117:7117 to cluster 192.168.1.111:7111

Connecting to node 192.168.1.111:7111: OK

Connecting to node 192.168.1.116:7116: OK

Connecting to node 192.168.1.113:7113: OK

Connecting to node 192.168.1.112:7112: OK

Connecting to node 192.168.1.115:7115: OK

Connecting to node 192.168.1.114:7114: OK

>>> Performing Cluster Check (using node 192.168.1.111:7111)

M: cc50047487b52697d62b1a72b231b7c74e08e051 192.168.1.111:7111

slots:10923-16383 (5461 slots) master

1 additional replica(s)

S: b21ae6d0a3e614e53bbc52639173ec3ad68044b5 192.168.1.116:7116

slots: (0 slots) slave

replicates 041addd95fa0a15d98be363034e53dd06f69ef47

M: 041addd95fa0a15d98be363034e53dd06f69ef47 192.168.1.113:7113

slots:0-5460 (5461 slots) master

1 additional replica(s)

M: 712e523b617eea5a2ed8df732a50ff298ae2ea48 192.168.1.112:7112

slots:5461-10922 (5462 slots) master

1 additional replica(s)

S: 55c0db5af1b917f3ce0783131fb8bab28920e1f3 192.168.1.115:7115

slots: (0 slots) slave

replicates 712e523b617eea5a2ed8df732a50ff298ae2ea48

S: 8a6ca1452d61f8b4726f0649e6ce49a6ec4afee2 192.168.1.114:7114

slots: (0 slots) slave

replicates cc50047487b52697d62b1a72b231b7c74e08e051

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

Connecting to node 192.168.1.117:7117: OK

>>> Send CLUSTER MEET to node 192.168.1.117:7117 to make it join the cluster.

[OK] New node added correctly.

The output above shows the node was added successfully. The new node holds no data because it has not been assigned any slots. It joins as a master, and it will not be chosen when the cluster needs to promote a slave to a new master.

Assign hash slots to the new node:

You only need to give the address of one node in the cluster; redis-trib finds the other nodes automatically. At present redis-trib can only reshard with an administrator's help, using the following command:

# /usr/local/src/redis-3.0.3/src/redis-trib.rb reshard 192.168.1.111:7111

Connecting to node 192.168.1.111:7111: OK

Connecting to node 192.168.1.117:7117: OK

Connecting to node 192.168.1.116:7116: OK

Connecting to node 192.168.1.113:7113: OK

Connecting to node 192.168.1.112:7112: OK

Connecting to node 192.168.1.115:7115: OK

Connecting to node 192.168.1.114:7114: OK

>>> Performing Cluster Check (using node 192.168.1.111:7111)

M: cc50047487b52697d62b1a72b231b7c74e08e051 192.168.1.111:7111

slots:10923-16383 (5461 slots) master

1 additional replica(s)

M: badbc0ffde2a3700df7e179d23fa2762108eabba 192.168.1.117:7117

slots: (0 slots) master

0 additional replica(s)

S: b21ae6d0a3e614e53bbc52639173ec3ad68044b5 192.168.1.116:7116

slots: (0 slots) slave

replicates 041addd95fa0a15d98be363034e53dd06f69ef47

M: 041addd95fa0a15d98be363034e53dd06f69ef47 192.168.1.113:7113

slots:0-5460 (5461 slots) master

1 additional replica(s)

M: 712e523b617eea5a2ed8df732a50ff298ae2ea48 192.168.1.112:7112

slots:5461-10922 (5462 slots) master

1 additional replica(s)

S: 55c0db5af1b917f3ce0783131fb8bab28920e1f3 192.168.1.115:7115

slots: (0 slots) slave

replicates 712e523b617eea5a2ed8df732a50ff298ae2ea48

S: 8a6ca1452d61f8b4726f0649e6ce49a6ec4afee2 192.168.1.114:7114

slots: (0 slots) slave

replicates cc50047487b52697d62b1a72b231b7c74e08e051

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

How many slots do you want to move (from 1 to 16384)?500

The prompt above asks how many hash slots you intend to move (here we move 500).

Besides the number of slots, redis-trib needs to know the reshard target, i.e. the node that will receive these 500 slots. The target is specified by node ID, not by IP address and port. We want the newly added master to be the target; its address is 192.168.1.117:7117 and its node ID is badbc0ffde2a3700df7e179d23fa2762108eabba, so we give redis-trib that ID: What is the receiving node ID? badbc0ffde2a3700df7e179d23fa2762108eabba

Next redis-trib asks for the reshard source nodes, i.e. which nodes the 500 slots should be taken from. If we do not want to take a specific number of slots from particular nodes, we can answer all, in which case every master becomes a source and redis-trib takes a portion of slots from each of them until it has 500, then moves them to the target:

Please enter all the source node IDs.

Type 'all' to use all the nodes as source nodes for the hash slots.

Type 'done' once you entered all the source nodes IDs.

Source node #1:all

After you type all and press Enter, redis-trib prints the slot movement plan:

Do you want to proceed with the proposed reshard plan (yes/no)? yes

If it looks right, type yes and press Enter again; redis-trib then starts the resharding and moves the specified slots from the source nodes to the target node, one by one.

Note: you can watch whether resharding affects clients that are using the cluster at the same time (result: it does not).

Before the move, 7117 holds no slots:

[Screenshot: cluster nodes output before resharding]

After the move, 7117 holds 3 slot ranges:

[Screenshot: cluster nodes output after resharding]

After the resharding finishes, check that the cluster is healthy with:

# /usr/local/src/redis-3.0.3/src/redis-trib.rb check 192.168.1.111:7111

[Screenshot: redis-trib check output after resharding]

The check output above shows that resharding succeeded and the cluster is in a normal state.

The node status can also be inspected again with:

# /usr/local/redis3/bin/redis-cli -c -p 7111 cluster nodes

[Screenshot: cluster nodes output]

The fields in the cluster state output above are:

(1) node ID

(2) IP:PORT

(3) node flags: master, slave, myself, fail?, fail

(4) if the node is a slave, the ID of its master; for a master, '-'

(5) how long ago the cluster sent the node a PING that has not yet been answered

(6) timestamp of the most recent PONG received from the node

(7) the node's config epoch

(8) the node's link state: connected or disconnected

(9) for a master, the slots the node serves
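A minimal sketch that pulls these fields out of the raw CLUSTER NODES text through Jedis (field positions as listed above; the class name is illustrative):

import redis.clients.jedis.Jedis;

public class ClusterNodesParser {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("192.168.1.111", 7111);
        try {
            // Each line: <id> <ip:port> <flags> <master-id|-> <ping-sent> <pong-recv> <config-epoch> <link-state> [slots ...]
            for (String line : jedis.clusterNodes().split("\n")) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                String[] f = line.split(" ");
                StringBuilder slots = new StringBuilder();
                for (int i = 8; i < f.length; i++) {
                    slots.append(f[i]).append(' ');
                }
                System.out.println(f[0] + " " + f[1] + " flags=" + f[2] + " master=" + f[3] + " slots=" + slots.toString().trim());
            }
        } finally {
            jedis.close();
        }
    }
}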

3. Add a new slave node

(1) Add the node:

# /usr/local/src/redis-3.0.3/src/redis-trib.rb add-node 192.168.1.117:7118 192.168.1.111:7111

Output:

>>> Adding node 192.168.1.117:7118 to cluster 192.168.1.111:7111

Connecting to node 192.168.1.111:7111: OK

Connecting to node 192.168.1.117:7117: OK

Connecting to node 192.168.1.116:7116: OK

Connecting to node 192.168.1.113:7113: OK

Connecting to node 192.168.1.112:7112: OK

Connecting to node 192.168.1.115:7115: OK

Connecting to node 192.168.1.114:7114: OK

>>> Performing Cluster Check (using node 192.168.1.111:7111)

M: cc50047487b52697d62b1a72b231b7c74e08e051 192.168.1.111:7111

slots:11089-16383 (5295 slots) master

1 additional replica(s)

M: badbc0ffde2a3700df7e179d23fa2762108eabba 192.168.1.117:7117

slots:0-165,5461-5627,10923-11088 (499 slots) master

0 additional replica(s)

S: b21ae6d0a3e614e53bbc52639173ec3ad68044b5 192.168.1.116:7116

slots: (0 slots) slave

replicates 041addd95fa0a15d98be363034e53dd06f69ef47

M: 041addd95fa0a15d98be363034e53dd06f69ef47 192.168.1.113:7113

slots:166-5460 (5295 slots) master

1 additional replica(s)

M: 712e523b617eea5a2ed8df732a50ff298ae2ea48 192.168.1.112:7112

slots:5628-10922 (5295 slots) master

1 additional replica(s)

S: 55c0db5af1b917f3ce0783131fb8bab28920e1f3 192.168.1.115:7115

slots: (0 slots) slave

replicates 712e523b617eea5a2ed8df732a50ff298ae2ea48

S: 8a6ca1452d61f8b4726f0649e6ce49a6ec4afee2 192.168.1.114:7114

slots: (0 slots) slave

replicates cc50047487b52697d62b1a72b231b7c74e08e051

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

Connecting to node 192.168.1.117:7118: OK

>>> Send CLUSTER MEET to node 192.168.1.117:7118 to make it join the cluster.

[OK] New node added correctly.

[Screenshot: cluster nodes output]

The newly added 7118 joins as a master.

(2) Connect to the new node with redis-cli and run cluster replicate <node-id of the target master>:

# /usr/local/redis3/bin/redis-cli -c -p 7118

127.0.0.1:7118>cluster replicate ab31611b3424990e2b9bbe73135cb4cb0ace394f

OK

When a slave is added online, the master has to dump its entire dataset and transfer it to the slave, which then loads the RDB file into memory. The master may be unable to serve requests while the RDB is being transferred, and the whole process is IO-intensive, so perform it with care.

Check the result:

127.0.0.1:7116> cluster nodes

[Screenshot: cluster nodes output after cluster replicate]

7118 has now become a slave of 7117.

4. Online resharding of data

If load or data is unevenly distributed, slots can be reshuffled online; the procedure is the same as resharding for a newly added master, except that the receiving node is an existing one.

5. Delete a slave node

# cd /usr/local/src/redis-3.0.3/src/

# ./redis-trib.rb del-node 192.168.1.117:7118 5256e05a17c106c93285a03aff1b1b9e7ca7bf0c

[Screenshot: redis-trib del-node output]

Now check the cluster state again; the slave node has been removed:

# /usr/local/redis3/bin/redis-cli -c -p 7116 cluster nodes

[Screenshot: cluster nodes output after removing the slave]

Removing a node also shuts down that node's Redis process.

6. Delete a master node

Before deleting a master, use reshard to move all of its slots away, then delete the node (currently the slots of the master being removed can only be migrated to a single node). The procedure is similar to assigning slots, except that a specific source node is given:

# /usr/local/src/redis-3.0.3/src/redis-trib.rb reshard 192.168.1.117:7117

Connecting to node 192.168.1.117:7117: OK

Connecting to node 192.168.1.112:7112: OK

Connecting to node 192.168.1.115:7115: OK

Connecting to node 192.168.1.114:7114: OK

Connecting to node 192.168.1.111:7111: OK

Connecting to node 192.168.1.116:7116: OK

Connecting to node 192.168.1.113:7113: OK

>>> Performing Cluster Check (using node 192.168.1.117:7117)

M: ab31611b3424990e2b9bbe73135cb4cb0ace394f 192.168.1.117:7117

slots:0-165,5461-5627,10923-11088 (499 slots) master

0 additional replica(s)

M: d2c6c159b07e8197e2c8d2eae8c847050159f602 192.168.1.112:7112

slots:5628-10922 (5295 slots) master

1 additional replica(s)

S: f34b28f1483f0c0d9543e93938fc12b8818050cb 192.168.1.115:7115

slots: (0 slots) slave

replicates d2c6c159b07e8197e2c8d2eae8c847050159f602

M: 48db78bcc55c4c3a3788940a6458b921ccf95d44 192.168.1.114:7114

slots:11089-16383 (5295 slots) master

1 additional replica(s)

S: 8dd55e9b4da9f62b9b15232e86553f1337864179 192.168.1.111:7111

slots: (0 slots) slave

replicates 48db78bcc55c4c3a3788940a6458b921ccf95d44

S: 1fd90d54090925afb4087d4ef94a1710a25160d6 192.168.1.116:7116

slots: (0 slots) slave

replicates 4e46bd06654e8660e617f7249fa22f6fa1fdff0d

M: 4e46bd06654e8660e617f7249fa22f6fa1fdff0d 192.168.1.113:7113

slots:166-5460 (5295 slots) master

1 additional replica(s)

[OK] All nodes agree about slots configuration.

>>> Check for open slots...

>>> Check slots coverage...

[OK] All 16384 slots covered.

//enter the total number of slots held by the master being removed

How many slots do you want to move (from 1 to 16384)? 499

//node ID of the master that will receive the slots

What is the receiving node ID? 48db78bcc55c4c3a3788940a6458b921ccf95d44

Please enter all the source node IDs.

Type 'all' to use all the nodes as source nodes for the hash slots.

Type 'done' once you entered all the source nodes IDs.

//enter the node-id of the master being removed

Source node #1: ab31611b3424990e2b9bbe73135cb4cb0ace394f

Source node #2:done

Ready to move 499 slots.

Source nodes:

M: ab31611b3424990e2b9bbe73135cb4cb0ace394f 192.168.1.117:7117

slots:0-165,5461-5627,10923-11088 (499 slots) master

0 additional replica(s)

Destination node:

M: 48db78bcc55c4c3a3788940a6458b921ccf95d44 192.168.1.114:7114

slots:11089-16383 (5295 slots) master

1 additional replica(s)

Resharding plan:

Moving slot 0 from ab31611b3424990e2b9bbe73135cb4cb0ace394f

Moving slot 1 from ab31611b3424990e2b9bbe73135cb4cb0ace394f

……

//type yes to run the reshard

Do you want to proceed with the proposed reshard plan (yes/no)? yes

After all slots have been moved off the master, check the cluster state again; the node no longer owns any slots:

# /usr/local/redis3/bin/redis-cli -c -p 7116 cluster nodes

[Screenshot: cluster nodes output showing 7117 with no slots]

Once the master owns no slots, it can be deleted (the command is the same as for deleting a slave):

# cd /usr/local/src/redis-3.0.3/src/

# ./redis-trib.rb del-node 192.168.1.117:7117 ab31611b3424990e2b9bbe73135cb4cb0ace394f

[Screenshot: redis-trib del-node output]

Running ps -ef | grep redis now shows that the removed node's Redis instance has also been shut down.

[Screenshot: ps output with the removed instance gone]
