1. Environment setup (example)
1.1 Host environment
- CentOS 7.4 x86_64
- Redis 3.2.5
1.2 Redis environment
Name | IP | Port | NIC | Notes |
---|---|---|---|---|
Redis_1 | 192.168.56.108 | 6380 | enp0s8 | master |
Redis_2 | 192.168.56.109 | 6380 | enp0s8 | replica |
Redis_3 | 192.168.56.110 | 6380 | enp0s8 | replica |
Redis-Sentinel_1 | 192.168.56.108 | 26380 | enp0s8 | |
Redis-Sentinel_2 | 192.168.56.109 | 26380 | enp0s8 | |
Redis-Sentinel_3 | 192.168.56.110 | 26380 | enp0s8 | |
The architecture is as follows:
       +----+
       | M1 |
       | S1 |
       +----+
          |
+----+    |    +----+
| R2 |----+----| R3 |
| S2 |         | S3 |
+----+         +----+
Configuration: quorum = 2
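2. Master-replica configuration
2.1 Deploying Redis
The deployment itself can be done directly with a script.
Deploy two instances, on ports 63xx and 263xx; the instance whose port starts with 2 will later run in Sentinel mode.
2.2 Points to note
2.2.1 Pitfalls in redis.conf
The 63xx instances may need their configuration file changed: restore the CONFIG command that was renamed in redis.conf. Sentinel issues CONFIG REWRITE when it reconfigures an instance, so a renamed CONFIG can get in the way of a failover.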
$ vim /data/redis/63xx/redis.conf
# Command renaming.
#
# It is possible to change the name of dangerous commands in a shared
# environment. For instance the CONFIG command may be renamed into something
# hard to guess so that it will still be available for internal-use tools
# but not available for general clients.
#
# Example:
#
# rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
#
# It is also possible to completely kill a command by renaming it into
# an empty string:
#
# rename-command CONFIG ""
#rename-command FLUSHALL xxflushall
#rename-command FLUSHDB xxplushdb
#rename-command CONFIG xxconfig
After the change, restart the service.
$ systemctl restart redis63xx
$ systemctl status redis63xx
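Before relying on the restart, it can help to confirm that no active rename-command CONFIG line is left in the file. The following is a minimal sketch; the function name has_renamed_config is illustrative and not part of Redis.

```shell
#!/bin/bash
# Return 0 if the given redis.conf still contains an active (uncommented)
# "rename-command CONFIG ..." directive, 1 otherwise.
has_renamed_config() {
    local conf_file="$1"
    grep -Eiq '^[[:space:]]*rename-command[[:space:]]+CONFIG([[:space:]]|$)' "$conf_file"
}

# Example usage (the port is elided in the same way as in the text above):
#   if has_renamed_config /data/redis/63xx/redis.conf; then
#       echo "CONFIG is still renamed" >&2
#   fi
```

Commented-out lines such as the ones in the excerpt above are ignored, because the pattern only matches directives at the start of a line.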
2.2.2 Changes for Redis-Sentinel
TODO: deployment script for Redis-Sentinel (May 24, 2021)
For Redis-Sentinel, everything needs to be changed.
First, stop the service.
$ systemctl stop redis263xx
$ systemctl status redis263xx
Edit the configuration file.
$ vim /data/redis/263xx/sentinel.conf
port 263xx
dir "/data/redis/263xx"
logfile "/data/redis/263xx/redis-sentinel.log"
daemonize yes
protected-mode no
sentinel monitor mymaster 192.168.56.108 6380 2
sentinel down-after-milliseconds mymaster 5000
sentinel client-reconfig-script mymaster /home/redis/notify_master.sh
$ chown -R redis:redis /data/redis/26380
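Where:
daemonize yes / protected-mode no
Both are required; otherwise the other Sentinels mark this Sentinel as subjectively down (+sdown).
SENTINEL MONITOR <name> <ip> <port> <quorum>
The name of the master-replica group, the master's IP, the master's port, and the quorum (the number of agreeing votes). With 1 master and 2 replicas, 2 votes are needed; otherwise a split brain can occur.
sentinel down-after-milliseconds
The time, in milliseconds, that the master must be unresponsive before a switchover to a replica is considered.
sentinel client-reconfig-script <name> <script_full_path>
This can be filled in later.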
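The quorum decides how many Sentinels must agree that the master is down; actually carrying out a failover additionally needs authorization from a majority of all Sentinels. As a small worked example of that arithmetic (the function name sentinel_majority is illustrative):

```shell
#!/bin/bash
# Smallest number of Sentinels that forms a majority of a group of size n.
sentinel_majority() {
    local n="$1"
    echo $(( n / 2 + 1 ))
}

# Example usage:
#   sentinel_majority 3
```

For the three Sentinels in this setup the majority is 2, which is the same value as the quorum configured above.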
Edit the systemd unit file
The previous unit file is no longer usable at all, so it can be emptied and rewritten.
$ cat /dev/null > /etc/systemd/system/redis263xx.service
$ vim /etc/systemd/system/redis263xx.service
[Unit]
Description=redis
After=network.target
After=syslog.target
[Install]
WantedBy=multi-user.target
[Service]
Type=forking
User=redis
Group=redis
ExecStart=/home/redis/redis/src/redis-sentinel /data/redis/263xx/sentinel.conf
LimitNOFILE = 65535
Restart=always
RestartSec=1
StartLimitInterval=0
$ systemctl daemon-reload
After the reload, there is no need to restart the service yet.
2.3 Setting up replication
Use 192.168.56.108 as the master.
192.168.56.109
$ cd /home/redis/redis/src/
$ ./redis-cli -p 63xx SLAVEOF 192.168.56.108 63xx
OK
$ ./redis-cli -p 63xx info replication
# Replication
role:slave
master_host:192.168.56.108
master_port:6380
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:749037
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.56.110
$ cd /home/redis/redis/src/
$ ./redis-cli -p 63xx SLAVEOF 192.168.56.108 63xx
OK
$ ./redis-cli -p 63xx info replication
# Replication
role:slave
master_host:192.168.56.108
master_port:6380
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:749037
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
192.168.56.108
$ cd /home/redis/redis/src/
$ ./redis-cli -p 63xx info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.56.109,port=6380,state=online,offset=775840,lag=0
slave1:ip=192.168.56.110,port=6380,state=online,offset=775840,lag=1
master_repl_offset:775840
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:775839
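When checking several nodes, reading the whole INFO output by eye gets tedious. A minimal sketch of a helper that pulls a single field out of the output follows; the function name replication_field is illustrative.

```shell
#!/bin/bash
# Read "INFO replication" output on stdin and print the value of one field.
# INFO replies use CRLF line endings, so carriage returns are stripped first.
replication_field() {
    local key="$1"
    tr -d '\r' | awk -F: -v k="$key" '$1 == k { print $2; exit }'
}

# Example usage (port elided as in the commands above):
#   ./redis-cli -p 63xx info replication | replication_field role
#   ./redis-cli -p 63xx info replication | replication_field master_link_status
```

On a healthy replica the two fields of interest are role and master_link_status, as in the outputs shown above.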
2.3.1 Breaking replication
For reference only; do not use this unless there is absolutely no other option.
$ cd /home/redis/redis/src/
$ ./redis-cli -p 63xx slaveof no one
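3. Sentinel configuration
3.1 VIP script
3.1.1 Granting sudo privileges to the redis user
The ip and arping commands used in the script both require superuser privileges, so the redis user must be granted sudo (a major pitfall; do not skip this).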
$ echo -e "redis\tALL=(ALL)\tNOPASSWD:/sbin/ip,NOPASSWD:/sbin/arping" > /etc/sudoers.d/redis
$ sed -i "s|Defaults.*requiretty|#Defaults\trequiretty|" /etc/sudoers
$ chmod 440 /etc/sudoers.d/redis
3.1.2 VIP script contents
$ vim /home/redis/notify_master.sh
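#!/bin/bash
# Sentinel calls this script as:
#   <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
# so ${6} is the IP of the new master.
MASTER_IP=${6}
MY_IP='192.168.56.108'   # This server's IP
VIP='192.168.56.250'     # VIP
NETMASK='24'             # Netmask
INTERFACE='enp0s8'       # NIC name; check with ifconfig

if [ "${MASTER_IP}" = "${MY_IP}" ]; then
    # This server is the new master: take the VIP and update neighbours' ARP caches.
    sudo /sbin/ip addr add "${VIP}/${NETMASK}" dev "${INTERFACE}"
    sudo /sbin/arping -q -c 3 -A "${VIP}" -I "${INTERFACE}"
    exit 0
else
    # Another server is the master: release the VIP.
    sudo /sbin/ip addr del "${VIP}/${NETMASK}" dev "${INTERFACE}"
    exit 0
fi
exit 1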
$ chown -R redis:redis /home/redis/notify_master.sh
$ chmod +x /home/redis/notify_master.sh
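Sentinel passes seven arguments to a client-reconfig-script: <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>, which is why the script reads the sixth one. To check the decision logic without touching the network, a dry-run sketch like the one below can be used. The function name vip_action and the extra eighth argument (this server's own IP) are illustrative assumptions, not part of Sentinel.

```shell
#!/bin/bash
# Print what the VIP script would do for a given Sentinel invocation,
# without changing any network configuration.
# Arguments 1-7 mirror client-reconfig-script:
#   <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
# Argument 8 is this server's own IP (illustrative extra parameter).
vip_action() {
    local to_ip="$6"
    local my_ip="$8"
    if [ "$to_ip" = "$my_ip" ]; then
        echo "add"
    else
        echo "del"
    fi
}

# Example usage:
#   vip_action mymaster leader start 192.168.56.108 6380 192.168.56.110 6380 192.168.56.110
```

Running it once per node with that node's IP as the last argument shows which server would end up holding the VIP.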
3.1.3 Manually configuring the VIP on the master host
$ ip addr add 192.168.56.250/24 dev enp0s8
$ arping -q -c 3 -A 192.168.56.250 -I enp0s8
Command to remove the VIP:
$ ip addr del 192.168.56.250/24 dev enp0s8
3.1.4 sentinel client-reconfig-script <name> <script_full_path>
In sentinel client-reconfig-script <name> <script_full_path>, set <script_full_path> to the full path of the VIP script.
Finally, start the Sentinel instance.
$ systemctl start redis26380
$ systemctl status redis26380
3.2 Checking the Sentinel status
3.2.1 Checking the log
$ cd /data/redis/263xx/
$ tail -f redis-sentinel.log
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 3.2.5 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26380
 |    `-._   `._    /     _.-'    |     PID: 16735
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
16735:X 24 May 15:25:38.839 # Sentinel ID is 80afff4eafa75c0ef53dcf9f37fd04513812c44f
16735:X 24 May 15:25:38.840 # +monitor master mymaster 192.168.56.110 6380 quorum 2
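3.2.2 Checking master-replica status through Sentinel
Some useful commands:
redis-cli -p 26380
PING: returns PONG.
SENTINEL masters: lists all monitored masters.
SENTINEL slaves <master name>: lists all replicas of the given master.
SENTINEL get-master-addr-by-name <master name>: returns the IP address and port of the master with the given name.
SENTINEL reset <pattern>: resets all masters whose names match the given pattern.
SENTINEL failover <master name>: forces a failover as if the master were unreachable, without asking the other Sentinels for agreement.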
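When redis-cli output is piped rather than shown on a terminal, the reply of SENTINEL get-master-addr-by-name arrives as two plain lines, the IP followed by the port. A minimal sketch that turns it into a single ip:port string (the function name format_master_addr is illustrative):

```shell
#!/bin/bash
# Read the two-line reply of SENTINEL get-master-addr-by-name on stdin
# (IP on the first line, port on the second) and print "ip:port".
format_master_addr() {
    local ip port
    read -r ip
    read -r port
    echo "${ip}:${port}"
}

# Example usage:
#   redis-cli -p 26380 SENTINEL get-master-addr-by-name mymaster | format_master_addr
```

Asking each of the three Sentinels in turn is a quick way to confirm that they all agree on the current master.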
At this point, the master-replica + Sentinel + VIP setup is complete.
4. Failover test
On host 192.168.56.108
VIP state before the failover:
$ ip a
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:58:18:13 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.108/24 brd 192.168.56.255 scope global dynamic enp0s8
valid_lft 545sec preferred_lft 545sec
inet 192.168.56.250/24 scope global secondary enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::84e5:9a69:72d6:f1c7/64 scope link
valid_lft forever preferred_lft forever
192.168.56.108 currently holds two IPs. The expected result is that 192.168.56.250 moves to the server that becomes the master after the failover.
Stop the Redis instance:
$ systemctl stop redis6380
Check the Redis-Sentinel log and watch how the failover proceeds:
$ tail -f redis-sentinel.log
16886:X 24 May 17:44:55.223 # +sdown master mymaster 192.168.56.108 6380
16886:X 24 May 17:44:55.265 # +new-epoch 5
16886:X 24 May 17:44:55.269 # +vote-for-leader 80afff4eafa75c0ef53dcf9f37fd04513812c44f 5
16886:X 24 May 17:44:55.295 # +odown master mymaster 192.168.56.108 6380 #quorum 3/2
16886:X 24 May 17:44:55.295 # Next failover delay: I will not start a failover before Mon May 24 17:50:56 2021
16886:X 24 May 17:44:56.411 # +config-update-from sentinel 80afff4eafa75c0ef53dcf9f37fd04513812c44f 192.168.56.108 26380 @ mymaster 192.168.56.108 6380
16886:X 24 May 17:44:56.411 # +switch-master mymaster 192.168.56.108 6380 192.168.56.110 6380
16886:X 24 May 17:44:56.411 * +slave slave 192.168.56.109:6380 192.168.56.109 6380 @ mymaster 192.168.56.110 6380
16886:X 24 May 17:44:56.411 * +slave slave 192.168.56.108:6380 192.168.56.108 6380 @ mymaster 192.168.56.110 6380
16886:X 24 May 17:45:01.435 # +sdown slave 192.168.56.108:6380 192.168.56.108 6380 @ mymaster 192.168.56.110 6380
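The key line in this log is +switch-master, which carries the master name followed by the old and new master addresses. A minimal sketch for pulling just those events out of a Sentinel log (the function name switch_master_events is illustrative):

```shell
#!/bin/bash
# Print "old_ip:old_port -> new_ip:new_port" for every +switch-master event
# found in Sentinel log text read from stdin.
switch_master_events() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "+switch-master") {
                # Fields after the event name: master name, old ip, old port, new ip, new port
                print $(i+2) ":" $(i+3) " -> " $(i+4) ":" $(i+5)
            }
        }
    }'
}

# Example usage:
#   switch_master_events < /data/redis/263xx/redis-sentinel.log
```

Applied to the log above, it reports the move from 192.168.56.108:6380 to 192.168.56.110:6380.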
IP addresses on 192.168.56.110:
$ ip a
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:0b:ca:b6 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.110/24 brd 192.168.56.255 scope global dynamic enp0s8
valid_lft 391sec preferred_lft 391sec
inet 192.168.56.250/24 scope global secondary enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::23b1:ee0c:c8a6:7409/64 scope link
valid_lft forever preferred_lft forever
The failover succeeded, and the VIP moved over as well.
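The same check can be scripted instead of reading the ip a output by eye. A minimal sketch follows; the function name has_vip is illustrative.

```shell
#!/bin/bash
# Return 0 if "ip a" style output on stdin lists the given IPv4 address, 1 otherwise.
has_vip() {
    local vip="$1"
    grep -Fq "inet ${vip}/"
}

# Example usage:
#   ip -4 addr show dev enp0s8 | has_vip 192.168.56.250 && echo "VIP is on this host"
```

Running it on all three hosts after a failover should report the VIP on exactly one of them, the new master.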
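References
[1] Redis-Sentinelのclient-reconfig-scriptでVIPをつける (Attaching a VIP with the Redis-Sentinel client-reconfig-script)
[2] 故障转移的过程的日志详解 (A detailed walkthrough of the logs of the failover process)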