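Redis Clustering
Introduction to Redis clustering
- A Redis cluster is a group of Redis nodes that share one dataset between them
- A Redis cluster does not support multi-key commands whose keys are spread across nodes, because serving them would mean moving data between nodes, which causes high load
- Through partitioning, a Redis cluster provides a degree of availability: when a node goes down, the cluster can keep processing commands
Advantages of Redis clustering
- Data is split across nodes automatically
- The cluster keeps processing commands when some of its nodes are down
The three Redis clustering modes
- Master-replica replication: the foundation of Redis high availability. Replication provides multi-machine backups of the data, load balancing for reads, and simple failure recovery. Its drawbacks are that failure recovery cannot be automated, writes cannot be load balanced, and storage capacity is limited to a single machine
- Sentinel: builds on master-replica replication to automate failure recovery. Its drawbacks are that writes still cannot be load balanced and storage capacity is still limited to a single machine
- Cluster: removes both the write load-balancing limit and the single-machine storage limit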
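Redis master-replica replication
How replication works
- When a replica (slave) process starts, it sends a SYNC command (PSYNC in Redis 2.8 and later) to the master to request synchronization
- For a full synchronization, whether on the first connection or on a reconnection, the master starts a background process that saves a snapshot of its data to a data file
- Once the background process has finished saving, the master sends the data file to the replica, which writes it to disk and then loads it into memory
- After accepting a replica's connection, the master sends it the complete data file. If several replicas request synchronization at the same time, the master starts a single background process to save the data file and then sends that file to all of them, so that every replica ends up in sync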
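The full synchronization steps above can be pictured with a toy in-memory model. This is only an illustrative sketch: the `ToyMaster` and `ToyReplica` classes are hypothetical, JSON stands in for the snapshot file format, and no networking or real Redis code is involved.

```python
import json


class ToyMaster:
    """Hypothetical stand-in for a master; keeps its dataset in a dict."""

    def __init__(self):
        self.data = {}
        self.snapshots_taken = 0

    def set(self, key, value):
        self.data[key] = value

    def snapshot(self):
        # Stands in for the background save: serialize the whole dataset once.
        self.snapshots_taken += 1
        return json.dumps(self.data).encode("utf-8")

    def full_sync(self, replicas):
        # One snapshot is taken and the same bytes go to every replica
        # that asked for synchronization at the same time.
        payload = self.snapshot()
        for replica in replicas:
            replica.receive(payload)


class ToyReplica:
    """Hypothetical stand-in for a replica."""

    def __init__(self):
        self.data = {}
        self.saved_file = b""

    def receive(self, payload):
        # "Write to disk" (kept in memory here), then load into memory.
        self.saved_file = payload
        self.data = json.loads(payload.decode("utf-8"))


master = ToyMaster()
master.set("key1", "1")
replica1, replica2 = ToyReplica(), ToyReplica()
master.full_sync([replica1, replica2])
print(replica1.data, replica2.data, master.snapshots_taken)
```

Both toy replicas end up with the master's data even though only one snapshot was taken, which is the point of the last step in the list.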
Deploying master-replica replication
- Environment
master 192.168.1.5
slave1 192.168.1.102
slave2 192.168.1.103
- Install dependencies
[root@localhost ]# yum -y install gcc gcc-c++ make
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* extras: mirrors.aliyun.com
* updates: mirrors.aliyun.com
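Package gcc-4.8.5-44.el7.x86_64 already installed and latest version
Package gcc-c++-4.8.5-44.el7.x86_64 already installed and latest version
Package 1:make-3.82-24.el7.x86_64 already installed and latest version
Nothing to do
- Then unpack the source archive, enter its directory, and run make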
[root@localhost ]# tar zxf redis-5.0.7.tar.gz
[root@localhost ]# cd redis-5.0.7/
[root@localhost redis-5.0.7]# make PREFIX=/usr/local/redis install //the Redis source archive ships with a Makefile, so there is no ./configure step; run make directly
- Then enter utils and run the install script
[root@localhost redis-5.0.7]# cd utils/
[root@localhost utils]# ./install_server.sh
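- Press Enter to accept each default until the script reaches the step that asks for a path, which has to be typed in
- Make the commands available on the command path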
[root@slave1 utils]# ln -s /usr/local/redis/bin/* /usr/local/bin/
- Check that Redis is running
[root@slave1 utils]# netstat -antp | grep redis
tcp 0 0 127.0.0.1:6379 0.0.0.0:* LISTEN 13152/redis-server
- Edit the master's configuration file first, then restart the service
[root@master ~]# vim /etc/redis/6379.conf
70 bind 0.0.0.0 //listen on 0.0.0.0 (all interfaces)
137 daemonize yes //run as a daemon
172 logfile /var/log/redis_6379.log //log file path
264 dir /var/lib/redis/6379 //working directory
700 appendonly yes //enable AOF persistence
[root@master ~]# /etc/init.d/redis_6379 restart
Stopping ...
Waiting for Redis to shutdown ...
Redis stopped
Starting Redis server...
- Configure the replicas
[root@slave1 ~]# vim /etc/redis/6379.conf
70 bind 0.0.0.0 //listen on 0.0.0.0 (all interfaces)
137 daemonize yes //run as a daemon
172 logfile /var/log/redis_6379.log //log file path
264 dir /var/lib/redis/6379 //working directory
287 replicaof 192.168.1.5 6379 //IP and port of the master to replicate from
700 appendonly yes //enable AOF persistence
- Verify replication
[root@master ~]# redis-cli info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.102,port=6379,state=online,offset=0,lag=1
slave1:ip=192.168.1.103,port=6379,state=online,offset=0,lag=1
master_replid:aa9db638fd539e3c6c3f9eb80ecb800760dcf384 //40-character random hexadecimal string generated when the master starts; it identifies the master
master_replid2:0000000000000000000000000000000000000000
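master_repl_offset:0 //offset in the replication stream; after processing a write command, the master adds the command's length in bytes to this counter
second_repl_offset:-1 //on both master and replicas, recorded together with master_replid2: the previous master's replid and replication offset, used by sibling replicas and cascading replication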
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:0
[root@master ~]# tail -f /var/log/redis_6379.log
60682:M 08 Aug 2021 22:37:46.478 * Background saving terminated with success
60682:M 08 Aug 2021 22:37:46.479 * Synchronization with replica 192.168.1.102:6379 succeeded
60682:M 08 Aug 2021 22:37:47.426 * Replica 192.168.1.103:6379 asks for synchronization
60682:M 08 Aug 2021 22:37:47.426 * Full resync requested by replica 192.168.1.103:6379
60682:M 08 Aug 2021 22:37:47.427 * Starting BGSAVE for SYNC with target: disk
60682:M 08 Aug 2021 22:37:47.427 * Background saving started by pid 60785
60785:C 08 Aug 2021 22:37:47.428 * DB saved on disk
60785:C 08 Aug 2021 22:37:47.428 * RDB: 0 MB of memory used by copy-on-write
60682:M 08 Aug 2021 22:37:47.485 * Background saving terminated with success
60682:M 08 Aug 2021 22:37:47.485 * Synchronization with replica 192.168.1.103:6379 succeeded
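The `info replication` check above can also be scripted. The following is a minimal sketch that parses the text format shown in the transcript; the helper names `parse_info` and `online_replicas` are made up for illustration, and the sample string is a shortened copy of the output above rather than a live query.

```python
def parse_info(text):
    """Parse `redis-cli info` style text into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def online_replicas(fields):
    """Return (ip, port) for every slaveN entry whose state is online."""
    result = []
    for key, value in fields.items():
        if key.startswith("slave") and key[5:].isdigit():
            attrs = dict(item.split("=", 1) for item in value.split(","))
            if attrs.get("state") == "online":
                result.append((attrs["ip"], int(attrs["port"])))
    return result


sample = """# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.102,port=6379,state=online,offset=0,lag=1
slave1:ip=192.168.1.103,port=6379,state=online,offset=0,lag=1
master_repl_offset:0
"""

info = parse_info(sample)
replicas = online_replicas(info)
print(info["role"], replicas)
```

In practice the text would come from running `redis-cli info replication` on the master; how it is captured is left out here.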
- Check that data is synchronized
[root@master ~]# redis-cli
127.0.0.1:6379> set key1 1
OK
[root@slave1 ~]# redis-cli
127.0.0.1:6379> get key1
"1"
[root@slave2 ~]# redis-cli
127.0.0.1:6379> get key1
"1"
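Redis Sentinel mode
Main functions of Sentinel
- Monitoring: checks whether the Redis master and replica processes are working properly
- Notification: when a node has a problem, Sentinel sends an alert to the administrator
- Failover: if the master goes down, its role is moved to a replica automatically
- Configuration provider: when a failover happens, clients are notified of the new master's address
How Sentinel monitors the nodes in the system
- The master's details are set in the Sentinel configuration file
- The Sentinel opens two connections to the configured master: a command connection and a subscription connection
- Over the command connection, the Sentinel sends an INFO command every 10 seconds; the master replies with its own run_id and information about its replicas
- The Sentinel also opens two connections to each replica
- Over the command connection, the Sentinel sends INFO to each replica to obtain its run_id (server ID), role, replication offset, and other details
- Over the command connection, the Sentinel publishes a message to the server's Sentinel hello channel containing its own IP, port, run_id, configuration, and so on
- Over the subscription connection, the Sentinel listens on the server's Sentinel hello channel, so it receives every message that any Sentinel publishes there
- By parsing the messages it receives, the Sentinel learns which other Sentinels are monitoring the same master and replicas, and records them in its internal structures
- The Sentinel opens a command connection to each of the other Sentinels it has discovered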
哨兵模式下的故障迁移
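- Subjectively down
Once per second, a Sentinel sends a PING command to every instance it has a command connection to. If an instance does not respond within down-after-milliseconds milliseconds, the Sentinel marks it in its own structures as SRI_S_DOWN, subjectively down
- Objectively down
When a Sentinel finds the master subjectively down, it asks the other Sentinels whether they also consider that node down. If at least the configured quorum of Sentinels agree, the Sentinel marks the master in its own structures as SRI_O_DOWN, objectively down
- Leader election
Once the master is considered objectively down, the Sentinels hold an election among themselves, using the command sentinel is-master-down-by-addr, to choose the Sentinel that will carry out the failover
- Failover
① Select the new master from the replicas: it must be reachable, candidates are ordered by priority, and when priorities are equal the one with the largest offset is chosen
② Promote that node to master with slaveof no one, and confirm that subsequent INFO replies report its role as master
③ Reconfigure the other replicas to replicate from the new master
④ Turn the old master into a replica of the new master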
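Step ① of the failover can be sketched as a small selection function. This is an illustration, not Sentinel's actual code: the `Replica` dataclass and the sample offsets are made up, and the details beyond what the notes state (a lower priority number is preferred, priority 0 is never promoted, ties on offset are broken by run ID) are assumptions based on Sentinel's documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    reachable: bool
    priority: int  # assumed: lower is preferred, 0 means never promote
    offset: int    # replication offset; larger means more up to date
    run_id: str


def pick_new_master(replicas):
    """Return the best promotion candidate, or None if there is none."""
    candidates = [r for r in replicas if r.reachable and r.priority != 0]
    if not candidates:
        return None
    # Priority ascending, then offset descending, then run_id ascending.
    return min(candidates, key=lambda r: (r.priority, -r.offset, r.run_id))


# Hypothetical state of the two replicas from the lab environment.
replicas = [
    Replica("192.168.1.102:6379", True, 100, 1200, "b" * 40),
    Replica("192.168.1.103:6379", True, 100, 1500, "c" * 40),
]
chosen = pick_new_master(replicas)
print(chosen.name)
```

With equal priorities, the replica with the larger offset is selected, so the sketch prints the 192.168.1.103 entry.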
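How Sentinel works
① Sentinel depends on master-replica replication. A Sentinel should be deployed on every node, and together they monitor whether all the Redis nodes are working properly
② When the master has a problem, the Sentinels (not the replicas) vote on it. Once enough of them agree that the master is down, one Sentinel is elected to carry out the failover, and it selects the new master from the replicas
③ That Sentinel is chosen by the Sentinels exchanging messages and voting; the one with the most votes wins
④ When a Sentinel finds that the master is unreachable, it marks the master as subjectively down and notifies the other Sentinels. They try to connect to the master themselves, and if enough of them (the configured quorum) confirm that it is down, the master is marked as objectively down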
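The voting described above comes down to two counts, sketched below. The function names are hypothetical, and the rule that the elected Sentinel needs both a majority of all Sentinels and at least quorum votes is an assumption about Sentinel's behaviour rather than something stated in these notes.

```python
def objectively_down(opinions, quorum):
    """opinions maps a Sentinel name to True if it sees the master as down.

    Returns (is_objectively_down, number_of_agreeing_sentinels).
    """
    agree = sum(1 for is_down in opinions.values() if is_down)
    return agree >= quorum, agree


def leader_elected(votes_for_candidate, total_sentinels, quorum):
    # Assumption: the winner needs a majority of all Sentinels
    # and at least `quorum` votes.
    needed = max(total_sentinels // 2 + 1, quorum)
    return votes_for_candidate >= needed


opinions = {"sentinel-1": True, "sentinel-2": True, "sentinel-3": True}
odown, agree = objectively_down(opinions, quorum=2)
print(f"odown={odown} quorum {agree}/2")
print(leader_elected(2, total_sentinels=3, quorum=2))
```

With three Sentinels and a quorum of 2, two agreeing Sentinels are enough to mark the master objectively down, and two votes are enough to elect the Sentinel that performs the failover.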
Deploying Sentinel
- Lab environment
master 192.168.1.5
slave1 192.168.1.102
slave2 192.168.1.103
- Edit the Sentinel configuration file; this must be done on every node
[root@master ~]# vim redis-5.0.7/sentinel.conf
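17 protected-mode no //disable protected mode
21 port 26379 //default Sentinel listening port
26 daemonize yes //run as a daemon
36 logfile "/var/log/sentinel.log" //log file path
65 dir /var/lib/redis/6379 //working directory
84 sentinel monitor mymaster 192.168.1.5 6379 2 //name, IP, and port of the master; the quorum of 2 means two Sentinels must agree before the master is judged to have failed and a failover starts
113 sentinel down-after-milliseconds mymaster 30000 //how long a server may be unresponsive before it is considered down; the default is 30 seconds
146 sentinel failover-timeout mymaster 180000 //failover timeout of 180 seconds
- Start Sentinel, on the master first and then on the replicas, and check the Sentinel status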
[root@master redis-5.0.7]# redis-sentinel sentinel.conf &
[root@master redis-5.0.7]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.1.5:6379,slaves=2,sentinels=1
[1]+ Done redis-sentinel sentinel.conf
- Shut down the master to simulate a failure
[root@master redis-5.0.7]# kill -9 41447
[root@master redis-5.0.7]# rm -rf /var/run/redis_6379.pid
- Check the Sentinel status again
[root@master redis-5.0.7]# redis-cli -p 26379 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.1.103:6379,slaves=2,sentinels=3 //the master address has changed, which shows that a failover took place
- Check the log
[root@slave1 redis-5.0.7]# tail -f /var/log/sentinel.log
14402:X 08 Aug 2021 22:13:46.737 # +vote-for-leader c0113cda2755db59d9cb0bf710cf89c026f5409a 1
14402:X 08 Aug 2021 22:13:47.744 # +odown master mymaster 192.168.1.5 6379 #quorum 3/2
14402:X 08 Aug 2021 22:13:47.745 # Next failover delay: I will not start a failover before Sun Aug 8 22:19:46 2021
14402:X 08 Aug 2021 22:13:47.845 # +config-update-from sentinel c0113cda2755db59d9cb0bf710cf89c026f5409a 192.168.1.103 26379 @ mymaster 192.168.1.5 6379
14402:X 08 Aug 2021 22:13:47.845 # +switch-master mymaster 192.168.1.5 6379 192.168.1.103 6379
14402:X 08 Aug 2021 22:13:47.845 * +slave slave 192.168.1.102:6379 192.168.1.102 6379 @ mymaster 192.168.1.103 6379
14402:X 08 Aug 2021 22:13:47.845 * +slave slave 192.168.1.5:6379 192.168.1.5 6379 @ mymaster 192.168.1.103 6379
14402:X 08 Aug 2021 22:14:17.903 # +sdown slave 192.168.1.5:6379 192.168.1.5 6379 @ mymaster 192.168.1.103 6379
14402:X 08 Aug 2021 22:14:17.903 # +sdown slave 192.168.1.102:6379 192.168.1.102 6379 @ mymaster 192.168.1.103 6379
14402:X 08 Aug 2021 22:14:56.527 # -sdown slave 192.168.1.5:6379 192.168.1.5 6379 @ mymaster 192.168.1.103 6379
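The before-and-after comparison of `info sentinel` can be automated by parsing the `master0:` line. A minimal sketch follows; `parse_master_line` is a hypothetical helper, and the two input strings are copied from the outputs shown earlier in this section.

```python
def parse_master_line(line):
    """Parse a `masterN:name=...,status=...,address=ip:port,...` line."""
    _, _, rest = line.partition(":")
    fields = dict(item.split("=", 1) for item in rest.split(","))
    host, _, port = fields["address"].rpartition(":")
    return {
        "name": fields["name"],
        "status": fields["status"],
        "host": host,
        "port": int(port),
        "slaves": int(fields["slaves"]),
        "sentinels": int(fields["sentinels"]),
    }


before = parse_master_line(
    "master0:name=mymaster,status=ok,address=192.168.1.5:6379,slaves=2,sentinels=1"
)
after = parse_master_line(
    "master0:name=mymaster,status=ok,address=192.168.1.103:6379,slaves=2,sentinels=3"
)
# A changed master address between two checks indicates a failover.
failed_over = (before["host"], before["port"]) != (after["host"], after["port"])
print(failed_over, after["host"])
```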
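Redis Cluster
In a cluster, master nodes handle read and write requests and maintain the cluster state; replica nodes only replicate their master's data and state
What a cluster provides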
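- Data partitioning
Data partitioning is the core function of a cluster. The cluster spreads data across multiple nodes, which lifts the single-machine memory limit and lets the masters share both reads and writes, greatly increasing the cluster's capacity to respond
- High availability
The cluster supports master-replica replication and automatic failover of master nodes, so it can keep serving requests when a node fails
- Data sharding
Redis Cluster introduces the concept of hash slots. There are 16384 hash slots (numbered 0~16383), and each master node in the cluster is responsible for a subset of them. A key's slot is the CRC16 checksum of the key modulo 16384
With three nodes, for example:
Node A holds hash slots 0-5460
Node B holds hash slots 5461-10922
Node C holds hash slots 10923-16383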
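The slot calculation can be written out in a few lines. The sketch below assumes the XMODEM variant of CRC16 (polynomial 0x1021, initial value 0), which is the variant Redis Cluster uses; hash tags are ignored for simplicity, and the node names A, B, and C are the ones from the example above.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot of a key: CRC16(key) mod 16384 (hash tags not handled)."""
    return crc16_xmodem(key.encode("utf-8")) % 16384


# Slot ranges from the three-node example.
SLOT_RANGES = {"A": (0, 5460), "B": (5461, 10922), "C": (10923, 16383)}


def owner(slot: int) -> str:
    for node, (first, last) in SLOT_RANGES.items():
        if first <= slot <= last:
            return node
    raise ValueError(f"slot {slot} is out of range")


for key in ("k1", "k2"):
    slot = key_slot(key)
    print(key, slot, owner(slot))
# k1 12706 C
# k2 449 A
```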
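Building a cluster
- A Redis cluster generally needs 6 nodes (3 masters, each with one replica); for convenience, I simulate all six on a single machine
- First create a working directory for each of the 6 ports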
[root@master redis-5.0.7]# cd /etc/redis
[root@master redis]# mkdir -p redis-cluster/redis700{1..6}
- Then write a script that copies the required files into each directory, and run it
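#!/bin/bash
# Copy the config file and the binaries into each instance directory.
# The source paths are relative: run from the directory that contains redis-5.0.7/.
for i in {1..6}
do
cp redis-5.0.7/redis.conf /etc/redis/redis-cluster/redis700$i
cp redis-5.0.7/src/redis-cli redis-5.0.7/src/redis-server /etc/redis/redis-cluster/redis700$i
done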
[root@master redis7001]# sh -x /opt/redis.sh
- Then check that the files were copied
[root@master redis7001]# ls
redis-cli redis.conf redis-server
[root@master redis7001]# cd ..
[root@master redis-cluster]# cd redis7002
[root@master redis7002]# ls
redis-cli redis.conf redis-server
[root@master redis7002]# cd ../redis7003
[root@master redis7003]# ls
redis-cli redis.conf redis-server
[root@master redis7003]# cd ../redis7004
[root@master redis7004]# ls
redis-cli redis.conf redis-server
[root@master redis7004]# cd ../redis7005
[root@master redis7005]# ls
redis-cli redis.conf redis-server
[root@master redis7005]# cd ../redis7006
[root@master redis7006]# ls
redis-cli redis.conf redis-server
- Edit the configuration file; the other five instances only need the port number changed to match
[root@master redis7001]# vim redis.conf
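69 bind 127.0.0.1 //comment out, or leave unchanged
88 protected-mode no //disable protected mode
92 port 7001 //listening port; set the matching port in each of the other five
136 daemonize yes //run as a daemon
699 appendonly yes //enable AOF persistence
832 cluster-enabled yes //uncomment to enable cluster mode
840 cluster-config-file nodes-7001.conf //uncomment and rename to match the port
846 cluster-node-timeout 15000 //uncomment to set the cluster node timeout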
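Since the six configuration files differ only in the port, the per-instance values can be generated instead of typed by hand. The sketch below only prints the directives listed above for each port; the helper names are hypothetical, and it does not edit any redis.conf file.

```python
def cluster_overrides(port):
    """Directives changed in redis.conf for the instance on `port`."""
    return {
        "protected-mode": "no",
        "port": str(port),
        "daemonize": "yes",
        "appendonly": "yes",
        "cluster-enabled": "yes",
        "cluster-config-file": f"nodes-{port}.conf",
        "cluster-node-timeout": "15000",
    }


def render(overrides):
    """Render the directives in redis.conf syntax, one per line."""
    return "\n".join(f"{name} {value}" for name, value in overrides.items()) + "\n"


for port in range(7001, 7007):
    print(f"# /etc/redis/redis-cluster/redis{port}/redis.conf")
    print(render(cluster_overrides(port)))
```

Only `port` and `cluster-config-file` change from one instance to the next, which matches the note that the other five files need just the port adjusted.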
- Then start the services; a start script makes this easier
[root@master redis7001]# vim /opt/redis_start.sh
#!/bin/bash
for d in {1..6}
do
cd /etc/redis/redis-cluster/redis700$d
redis-server redis.conf
done
ps -ef | grep redis
- Then run the script
[root@master redis7001]# sh /opt/redis_start.sh
49088:C 09 Aug 2021 00:57:15.730 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
49088:C 09 Aug 2021 00:57:15.730 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=49088, just started
49088:C 09 Aug 2021 00:57:15.730 # Configuration loaded
49090:C 09 Aug 2021 00:57:15.735 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
49090:C 09 Aug 2021 00:57:15.735 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=49090, just started
49090:C 09 Aug 2021 00:57:15.735 # Configuration loaded
49092:C 09 Aug 2021 00:57:15.739 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
49092:C 09 Aug 2021 00:57:15.739 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=49092, just started
49092:C 09 Aug 2021 00:57:15.739 # Configuration loaded
49094:C 09 Aug 2021 00:57:15.743 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
49094:C 09 Aug 2021 00:57:15.743 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=49094, just started
49094:C 09 Aug 2021 00:57:15.743 # Configuration loaded
49096:C 09 Aug 2021 00:57:15.746 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
49096:C 09 Aug 2021 00:57:15.746 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=49096, just started
49096:C 09 Aug 2021 00:57:15.746 # Configuration loaded
49098:C 09 Aug 2021 00:57:15.749 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
49098:C 09 Aug 2021 00:57:15.749 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=49098, just started
49098:C 09 Aug 2021 00:57:15.749 # Configuration loaded
root 49010 1 0 00:54 ? 00:00:00 redis-server *:7001 [cluster]
root 49061 1 0 00:56 ? 00:00:00 redis-server *:7002 [cluster]
root 49064 1 0 00:56 ? 00:00:00 redis-server *:7003 [cluster]
root 49068 1 0 00:56 ? 00:00:00 redis-server *:7004 [cluster]
root 49073 1 0 00:56 ? 00:00:00 redis-server *:7005 [cluster]
root 49079 1 0 00:56 ? 00:00:00 redis-server *:7006 [cluster]
root 49087 106291 0 00:57 pts/2 00:00:00 sh /opt/redis_start.sh
root 49101 49087 0 00:57 pts/2 00:00:00 grep redis
root 83130 1 0 00:10 ? 00:00:11 redis-sentinel *:26379 [sentinel]
root 83184 1 0 00:14 ? 00:00:08 /usr/local/bin/redis-server 0.0.0.0:6379
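- Join all the nodes into a cluster
[root@master redis7001]# redis-cli --cluster create 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006 --cluster-replicas 1 //the first three become masters and the last three replicas; 1 means one replica per master, which gives three groups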
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 127.0.0.1:7005 to 127.0.0.1:7001
Adding replica 127.0.0.1:7006 to 127.0.0.1:7002
Adding replica 127.0.0.1:7004 to 127.0.0.1:7003
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
M: db3b7682a85fc1094918c8aaa2f678b44f37e4c7 127.0.0.1:7001
slots:[0-5460] (5461 slots) master
M: 3f0455cc5f4ac46449bcd04a206ad305e1f91f73 127.0.0.1:7002
slots:[5461-10922] (5462 slots) master
M: 3a3a73300dc0d5a7ff20d75a56a9f2517d172fb4 127.0.0.1:7003
slots:[10923-16383] (5461 slots) master
S: 32e46faba6d5f814b71900d2b8ee6230c2f0852d 127.0.0.1:7004
replicates 3a3a73300dc0d5a7ff20d75a56a9f2517d172fb4
S: 6779211f79f7af7f20f399860c05bcccba66e4e0 127.0.0.1:7005
replicates db3b7682a85fc1094918c8aaa2f678b44f37e4c7
S: f1c4fd994e4fc57549582a6fb8192ebbee79aa06 127.0.0.1:7006
replicates 3f0455cc5f4ac46449bcd04a206ad305e1f91f73
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 127.0.0.1:7001)
M: db3b7682a85fc1094918c8aaa2f678b44f37e4c7 127.0.0.1:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 3f0455cc5f4ac46449bcd04a206ad305e1f91f73 127.0.0.1:7002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 32e46faba6d5f814b71900d2b8ee6230c2f0852d 127.0.0.1:7004
slots: (0 slots) slave
replicates 3a3a73300dc0d5a7ff20d75a56a9f2517d172fb4
S: 6779211f79f7af7f20f399860c05bcccba66e4e0 127.0.0.1:7005
slots: (0 slots) slave
replicates db3b7682a85fc1094918c8aaa2f678b44f37e4c7
M: 3a3a73300dc0d5a7ff20d75a56a9f2517d172fb4 127.0.0.1:7003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: f1c4fd994e4fc57549582a6fb8192ebbee79aa06 127.0.0.1:7006
slots: (0 slots) slave
replicates 3f0455cc5f4ac46449bcd04a206ad305e1f91f73
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
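The slot ranges in the output above (5461, 5462, and 5461 slots) can be reproduced with a short calculation. This is a sketch that happens to match the ranges shown for three masters; the rounding rule is an assumption and is not claimed to be redis-cli's exact implementation.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(masters):
    """Split slots 0..16383 into one contiguous range per master."""
    per_node = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        # Round the fractional boundary to the nearest slot (half rounds up).
        last = math.floor(cursor + per_node - 1 + 0.5)
        if i == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1  # the final master takes the remainder
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


for index, (first, last) in enumerate(split_slots(3)):
    print(f"Master[{index}] -> Slots {first} - {last} ({last - first + 1} slots)")
```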
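- Test the cluster
[root@master redis7001]# redis-cli -p 7002 -c //the -c option enables cluster mode, so the client follows redirections between nodes
127.0.0.1:7002> cluster slots //show the hash slot range of each node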
1) 1) (integer) 5461
2) (integer) 10922
3) 1) "127.0.0.1"
2) (integer) 7002
3) "3f0455cc5f4ac46449bcd04a206ad305e1f91f73"
4) 1) "127.0.0.1"
2) (integer) 7006
3) "f1c4fd994e4fc57549582a6fb8192ebbee79aa06"
2) 1) (integer) 10923
2) (integer) 16383
3) 1) "127.0.0.1"
2) (integer) 7003
3) "3a3a73300dc0d5a7ff20d75a56a9f2517d172fb4"
4) 1) "127.0.0.1"
2) (integer) 7004
3) "32e46faba6d5f814b71900d2b8ee6230c2f0852d"
3) 1) (integer) 0
2) (integer) 5460
3) 1) "127.0.0.1"
2) (integer) 7001
3) "db3b7682a85fc1094918c8aaa2f678b44f37e4c7"
4) 1) "127.0.0.1"
2) (integer) 7005
3) "6779211f79f7af7f20f399860c05bcccba66e4e0"
127.0.0.1:7002> set k1 zhangsan //create a key
-> Redirected to slot [12706] located at 127.0.0.1:7003 //the key is stored on 7003
OK
127.0.0.1:7003> cluster keyslot k1 //show the key's hash slot number
(integer) 12706
127.0.0.1:7003> set k2 lisi
-> Redirected to slot [449] located at 127.0.0.1:7001 //the target node is determined by the key's hash slot, not chosen at random; the client follows the redirection
OK
127.0.0.1:7001> cluster keyslot k2
(integer) 449
127.0.0.1:7001> get k1
-> Redirected to slot [12706] located at 127.0.0.1:7003
"zhangsan"
127.0.0.1:7003> //reading the key moved the client from 7001 to 7003
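The redirections in this session follow directly from the slot table. The sketch below replays the three commands against a static copy of the `cluster slots` output; the `route` helper is hypothetical, it does not talk to Redis, and the CRC16 routine assumes the XMODEM variant used by Redis Cluster.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    return crc16_xmodem(key.encode("utf-8")) % 16384


# Masters and slot ranges as reported by `cluster slots` above.
SLOT_OWNERS = [
    ((0, 5460), "127.0.0.1:7001"),
    ((5461, 10922), "127.0.0.1:7002"),
    ((10923, 16383), "127.0.0.1:7003"),
]


def route(current_node, key):
    """Return (target_node, slot, redirected) for a command on `key`."""
    slot = key_slot(key)
    for (first, last), node in SLOT_OWNERS:
        if first <= slot <= last:
            return node, slot, node != current_node
    raise ValueError(f"slot {slot} is not covered")


node = "127.0.0.1:7002"  # the session starts on port 7002
for command, key in (("set", "k1"), ("set", "k2"), ("get", "k1")):
    target, slot, redirected = route(node, key)
    if redirected:
        print(f"{command} {key}: redirected to slot [{slot}] at {target}")
    node = target  # with -c, the client continues on the target node
```

Each of the three commands lands on a different node from the one the client is connected to, which is why every command in the session shows a redirection.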