A Hands-On High-Availability Setup for a Production Redis Cluster

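2017-05-15, by 民工哥, WeChat public account 友侃有笑 (youkanyouxiao)

This is 民工哥's personal public account. Everything published here is personal opinion; corrections are welcome. Thank you!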

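Deployment overview

1. Sentinel monitors the master and slave services in the Redis cluster, sends notifications, and performs automatic failover.

2. The Redis cluster (one master with several slaves) serves client requests.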


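How Sentinel works

Principle:

Sentinel is a distributed system: several Sentinel processes can run in one deployment. They use a gossip protocol to exchange information about whether the Redis master is down, and an agreement protocol to decide whether to perform automatic failover and which slave to promote to master.

Gossip protocol:

Each Sentinel checks the servers it monitors with PING. When a single Sentinel gets no valid reply within down-after-milliseconds, it marks the server as subjectively down (+sdown). When at least quorum Sentinels agree that the same master is down, the master is marked as objectively down (+odown), which is the condition that can start a failover.

Agreement protocol:

This is essentially an election. The Sentinels first elect a leader among themselves, which requires the votes of a majority of all Sentinels. The leader then picks one slave by a fixed set of rules, promotes it to master, and reconfigures the remaining servers as slaves of the new master. Each Sentinel rewrites its own configuration file to match.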
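
To make the difference between the two down states concrete, here is a minimal sketch. The function name master_state is hypothetical and not part of Redis. It only classifies a master from two numbers: how many Sentinels currently report it as down, and the quorum given as the last argument of "sentinel monitor".

```shell
# Hypothetical helper, not part of Redis: classify a master from the number
# of Sentinels that report it as down and the configured quorum.
master_state() {
    down_reports=$1   # Sentinels that currently see the master as down
    quorum=$2         # last argument of "sentinel monitor"
    if [ "$down_reports" -ge "$quorum" ]; then
        echo "odown"  # enough Sentinels agree: objectively down
    elif [ "$down_reports" -ge 1 ]; then
        echo "sdown"  # only an individual opinion so far: subjectively down
    else
        echo "up"
    fi
}

master_state 1 2   # a single Sentinel cannot reach the master
master_state 2 2   # quorum reached
```

With the quorum of 2 used in this article, one report alone only yields sdown; a second agreeing Sentinel is needed before the master counts as odown.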


Server deployment plan

The lab environment uses two servers to simulate the cluster.

Server OS

CentOS 6.6 x86_64

Master server 10.0.0.3/24

Redis-Master 10.0.0.3:6379

Redis-Slave1 10.0.0.3:63791

Redis-Slave2 10.0.0.3:63792

Sentinel services

s 10.0.0.3:26379

s1 10.0.0.3:26378

Slave server 10.0.0.4/24

Redis-Slave3 10.0.0.4:63793

Redis-Slave4 10.0.0.4:63794

Sentinel services

s2 10.0.0.4:26379

s3 10.0.0.4:26378
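
One consequence of this layout is worth checking with a little arithmetic. A failover needs two things: quorum Sentinels agreeing the master is down, and a majority of all known Sentinels to elect a leader. The sketch below assumes exactly that rule; the function name can_failover is hypothetical.

```shell
# Hypothetical helper: can the Sentinels that are still reachable fail over?
# Needs "quorum" agreeing Sentinels and a majority of ALL Sentinels.
can_failover() {
    total=$1    # Sentinels configured for this master
    alive=$2    # Sentinels still running and reachable
    quorum=$3   # last argument of "sentinel monitor"
    majority=$(( total / 2 + 1 ))
    if [ "$alive" -ge "$quorum" ] && [ "$alive" -ge "$majority" ]; then
        echo yes
    else
        echo no
    fi
}

# Layout in this article: 4 Sentinels, quorum 2.
can_failover 4 4 2   # only the master's redis-server process died
can_failover 4 2 2   # a whole host, with its two Sentinels, is lost
```

With four Sentinels the majority is three, so losing one whole host leaves two Sentinels that can flag the master as down but cannot authorize a failover. The test in this article stops only the redis-server process, so all four Sentinels stay up.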


[Figure: logical topology before and after failover]



Redis and Sentinel service configuration

Installation and deployment on the master server

Install Redis

mkdir -p /usr/local/redis/data

cd /usr/local/src

wget http://download.redis.io/releases/redis-2.8.9.tar.gz

tar zxf redis-2.8.9.tar.gz

cd redis-2.8.9

make && make install

Copy the configuration files

cp redis.conf /usr/local/bin/

cd /usr/local/bin

cp redis.conf redis-slave1

cp redis.conf redis-slave2

Edit the configuration files

[root@master bin]# vi redis.conf

daemonize yes

# run in the background as a daemon

pidfile /var/run/redis.pid

bind 10.0.0.3

dbfilename dump.rdb

dir /usr/local/redis/data

port 6379

[root@master bin]# vi redis-slave1

daemonize yes

pidfile /var/run/redis-slave1.pid

port 63791

bind 10.0.0.3

dbfilename dump-slave1.rdb

dir /usr/local/redis/data

slaveof 10.0.0.3 6379

slave-read-only yes

[root@master bin]# vi redis-slave2

daemonize yes

pidfile /var/run/redis-slave2.pid

port 63792

bind 10.0.0.3

dbfilename dump-slave2.rdb

dir /usr/local/redis/data

slaveof 10.0.0.3 6379
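
The slave files differ only in name, port, and bind address, so they can also be generated. The following is a sketch under assumptions: gen_slave_conf is a hypothetical helper, and it prints a minimal file containing only the directives shown above instead of editing a copy of the full redis.conf, so every other setting falls back to the Redis default.

```shell
# Hypothetical helper: print a slave config in the same shape as redis-slave1.
gen_slave_conf() {
    name=$1; bind_ip=$2; port=$3; master_ip=$4; master_port=$5
    cat <<EOF
daemonize yes
pidfile /var/run/redis-$name.pid
port $port
bind $bind_ip
dbfilename dump-$name.rdb
dir /usr/local/redis/data
slaveof $master_ip $master_port
slave-read-only yes
EOF
}

# Print the config for Redis-Slave3 on the second server.
gen_slave_conf slave3 10.0.0.4 63793 10.0.0.3 6379
```

Redirecting the output, for example to /usr/local/bin/redis-slave3, gives a file that can be passed to redis-server in the same way as the hand-edited ones.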

Configure the redis-sentinel service

mkdir /var/log/redis -p

cp /usr/local/src/redis-2.8.9/src/redis-sentinel /usr/bin/

cp /usr/local/src/redis-2.8.9/sentinel.conf /usr/local/bin/

cd /usr/local/bin

cp sentinel.conf sentinel-s1.conf

Edit the configuration files

[root@master bin]# egrep -v "^#|^$" sentinel.conf

port 26379

daemonize yes

logfile /var/log/redis/sentinel.log

sentinel monitor mymaster 10.0.0.3 6379 2

sentinel down-after-milliseconds mymaster 30000

sentinel parallel-syncs mymaster 1

sentinel failover-timeout mymaster 180000

[root@master bin]# egrep -v "^#|^$" sentinel-s1.conf

port 26378

daemonize yes

logfile /var/log/redis/sentinel-s1.log

sentinel monitor mymaster 10.0.0.3 6379 2

sentinel down-after-milliseconds mymaster 30000

sentinel parallel-syncs mymaster 1

sentinel failover-timeout mymaster 180000

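Repeat the same steps on the slave server (10.0.0.4) to create redis-slave3, redis-slave4, sentinel-s2.conf, and sentinel-s3.conf, changing the ports, bind address, and log file names to match the plan above.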
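
Since the Sentinel files differ only in port and log file name, one edited file can serve as a template for the others. This is a sketch, not the author's procedure: derive_sentinel_conf is a hypothetical helper, and it assumes the input already contains the port and logfile lines shown above.

```shell
# Hypothetical helper: read an edited sentinel.conf on stdin and print a copy
# with a different port and log file name. All other lines pass through.
derive_sentinel_conf() {
    port=$1; name=$2
    sed -e "s|^port .*|port $port|" \
        -e "s|^logfile .*|logfile /var/log/redis/sentinel-$name.log|"
}

# Example with the relevant lines inlined instead of a real file.
printf '%s\n' \
    'port 26379' \
    'daemonize yes' \
    'logfile /var/log/redis/sentinel.log' \
    'sentinel monitor mymaster 10.0.0.3 6379 2' \
    | derive_sentinel_conf 26378 s3
```

On a real host the input would come from the file, for example derive_sentinel_conf 26378 s3 < sentinel.conf > sentinel-s3.conf. The template should be a file that Sentinel has not started with yet, because a running Sentinel rewrites its own configuration file with state that should not be copied to another instance.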


Start the services

Start the Redis services

[root@master bin]# redis-server redis.conf

[root@master bin]# redis-server redis-slave1

[root@master bin]# redis-server redis-slave2

[root@master bin]# ps -ef|grep redis

root 2579 1 0 23:55 ? 00:00:00 redis-server 10.0.0.3:6379

root 2585 1 0 23:55 ? 00:00:00 redis-server 10.0.0.3:63792

root 2590 1 0 23:55 ? 00:00:00 redis-server 10.0.0.3:63791

root 2597 2479 0 23:56 pts/0 00:00:00 grep --color=auto redis

[root@slave bin]# redis-server redis-slave3

[root@slave bin]# redis-server redis-slave4

[root@slave bin]# ps -ef|grep redis

root 2576 1 0 23:56 ? 00:00:00 redis-server 10.0.0.4:63793

root 2580 1 0 23:56 ? 00:00:00 redis-server 10.0.0.4:63794

root 2584 2502 0 23:56 00:00:00 grep --color=auto redis

Start the redis-sentinel services

[root@master bin]# redis-sentinel sentinel.conf

[root@master bin]# redis-sentinel sentinel-s1.conf

[root@master bin]# ps -ef|grep redis-sentinel

root 2638 1 0 01:05 ? 00:00:04 redis-sentinel *:26379

root 2646 1 0 01:13 ? 00:00:00 redis-sentinel *:26378

root 2650 2479 0 01:13 00:00:00 grep --color=auto redis

[root@slave bin]# redis-sentinel sentinel-s2.conf

[root@slave bin]# redis-sentinel sentinel-s3.conf

[root@slave bin]# ps -ef|grep redis-sentinel

root 2644 1 1 01:14 ? 00:00:00 redis-sentinel *:26378

root 2649 1 0 01:14 ? 00:00:00 redis-sentinel *:26379

root 2653 2502 0 01:15 00:00:00 grep --color=auto redis-sentinel


Check the logs to watch the startup

[root@master bin]# tail -f /var/log/redis/sentinel.log

`-.__.-'

[2664] 12 May 01:20:11.036 # Sentinel runid is c327be464ef36e670566a0d76c9dc85bac7f33b1

[2664] 12 May 01:20:11.036 # +monitor master mymaster 10.0.0.3 6379 quorum 2

[2664] 12 May 01:20:11.123 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.3:26378 or fb1fbe73b51a0a6e71a8ceae57d34ef773d086e3

[2664] 12 May 01:20:11.123 * +sentinel sentinel 10.0.0.3:26378 10.0.0.3 26378 @ mymaster 10.0.0.3 6379

[2664] 12 May 01:20:21.410 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.4:26379 or 3d43ddea4d4ba8de7dd30e2d332723508f6d4c19

[2664] 12 May 01:20:21.410 * +sentinel sentinel 10.0.0.4:26379 10.0.0.4 26379 @ mymaster 10.0.0.3 6379

[2664] 12 May 01:20:25.103 * -dup-sentinel master mymaster 10.0.0.3 6379 #duplicate of 10.0.0.4:26378 or 6d134d9a3e53c0cb70de842281de8aaf17a84c00

[2664] 12 May 01:20:25.103 * +sentinel sentinel 10.0.0.4:26378 10.0.0.4 26378 @ mymaster 10.0.0.3 6379

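The +sentinel events show that this Sentinel has discovered the other three Sentinels and that they have joined the cluster.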
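
The discovered peers can be pulled out of such a log with awk. This sketch assumes the Redis 2.8 log layout shown above, where the event name is the sixth whitespace-separated field; other Redis versions format log lines differently. The function name list_sentinels is hypothetical.

```shell
# Hypothetical helper: read a Sentinel log on stdin and print the address
# of every peer announced by a +sentinel event, without duplicates.
list_sentinels() {
    awk '$6 == "+sentinel" { print $9 ":" $10 }' | LC_ALL=C sort -u
}

# Sample input copied from the log above.
list_sentinels <<'EOF'
[2664] 12 May 01:20:11.036 # +monitor master mymaster 10.0.0.3 6379 quorum 2
[2664] 12 May 01:20:11.123 * +sentinel sentinel 10.0.0.3:26378 10.0.0.3 26378 @ mymaster 10.0.0.3 6379
[2664] 12 May 01:20:21.410 * +sentinel sentinel 10.0.0.4:26379 10.0.0.4 26379 @ mymaster 10.0.0.3 6379
[2664] 12 May 01:20:25.103 * +sentinel sentinel 10.0.0.4:26378 10.0.0.4 26378 @ mymaster 10.0.0.3 6379
EOF
```

For the sample this prints the three peers 10.0.0.3:26378, 10.0.0.4:26378, and 10.0.0.4:26379, one per line. Together with the Sentinel that wrote the log, that accounts for all four Sentinels in the plan.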


Check whether the configuration file has changed

[root@master bin]# egrep -v "^#|^$" sentinel-s1.conf

port 26378

daemonize yes

logfile "/var/log/redis/sentinel-s1.log"

sentinel monitor mymaster 10.0.0.3 6379 2

sentinel config-epoch mymaster 0

sentinel leader-epoch mymaster 0

sentinel known-slave mymaster 10.0.0.3 63792

dir "/usr/local/bin"

sentinel known-slave mymaster 10.0.0.4 63793

sentinel known-slave mymaster 10.0.0.4 63794

sentinel known-slave mymaster 10.0.0.3 63791

sentinel known-sentinel mymaster 10.0.0.3 26379 c327be464ef36e670566a0d76c9dc85bac7f33b1

sentinel known-sentinel mymaster 10.0.0.4 26379 3d43ddea4d4ba8de7dd30e2d332723508f6d4c19

sentinel known-sentinel mymaster 10.0.0.4 26378 6d134d9a3e53c0cb70de842281de8aaf17a84c00

sentinel current-epoch 0


Observe the failover in the logs

Simulate a master failure and watch the failover

[root@master bin]# redis-cli -h 10.0.0.3 -p 6379 shutdown

[root@master bin]# ps -ef|grep redis

root 2585 1 0 May11 ? 00:00:07 redis-server 10.0.0.3:63792

root 2590 1 0 May11 ? 00:00:07 redis-server 10.0.0.3:63791

root 2660 1 0 01:20 ? 00:00:02 redis-sentinel *:26378

root 2664 1 0 01:20 ? 00:00:02 redis-sentinel *:26379

root 2676 2479 0 01:30 00:00:00 grep --color=auto redis

The master process (10.0.0.3:6379) is no longer running, which simulates a failed service.

Truncate the old log and watch the failover

[root@slave bin]# > /var/log/redis/sentinel-s3.log

[root@slave bin]# tail -f /var/log/redis/sentinel-s3.log

[2669] 12 May 01:30:55.203 # +sdown master mymaster 10.0.0.3 6379

[2669] 12 May 01:30:55.276 # +new-epoch 1

[2669] 12 May 01:30:55.280 # +vote-for-leader c327be464ef36e670566a0d76c9dc85bac7f33b1 1

[2669] 12 May 01:30:56.329 # +odown master mymaster 10.0.0.3 6379 #quorum 4/2

[2669] 12 May 01:30:57.547 # +switch-master mymaster 10.0.0.3 6379 10.0.0.3 63792

[2669] 12 May 01:30:57.548 * +slave slave 10.0.0.4:63794 10.0.0.4 63794 @ mymaster 10.0.0.3 63792

[2669] 12 May 01:30:57.553 * +slave slave 10.0.0.4:63793 10.0.0.4 63793 @ mymaster 10.0.0.3 63792

[2669] 12 May 01:30:57.556 * +slave slave 10.0.0.3:63791 10.0.0.3 63791 @ mymaster 10.0.0.3 63792

[2669] 12 May 01:30:57.561 * +slave slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792

[2669] 12 May 01:31:27.620 # +sdown slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792

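The log shows the master being marked subjectively down (+sdown) and then objectively down (+odown, quorum 4/2). The Sentinels vote for a leader (+vote-for-leader), 10.0.0.3:63792 is promoted to master (+switch-master), and the other slaves are automatically reconfigured with SLAVEOF to follow it. The failover succeeded.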
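
For scripts that only need the outcome, the +switch-master event carries both the old and the new master address. The sketch below assumes the Redis 2.8 log layout shown above (event name in the sixth field); last_switch is a hypothetical helper name.

```shell
# Hypothetical helper: read a Sentinel log on stdin and print the most
# recent master change as "name old_ip:old_port -> new_ip:new_port".
last_switch() {
    awk '$6 == "+switch-master" { line = $7 " " $8 ":" $9 " -> " $10 ":" $11 }
         END { if (line != "") print line }'
}

# Sample input copied from the failover log above.
last_switch <<'EOF'
[2669] 12 May 01:30:55.203 # +sdown master mymaster 10.0.0.3 6379
[2669] 12 May 01:30:56.329 # +odown master mymaster 10.0.0.3 6379 #quorum 4/2
[2669] 12 May 01:30:57.547 # +switch-master mymaster 10.0.0.3 6379 10.0.0.3 63792
EOF
```

For the sample this prints "mymaster 10.0.0.3:6379 -> 10.0.0.3:63792". If the log contains no +switch-master event, nothing is printed.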


Restore the original master

[root@master bin]# redis-server redis.conf

[root@master bin]# ps -ef|grep redis

root 2585 1 0 May11 ? 00:00:08 redis-server 10.0.0.3:63792

root 2590 1 0 May11 ? 00:00:08 redis-server 10.0.0.3:63791

root 2660 1 0 01:20 ? 00:00:05 redis-sentinel *:26378

root 2664 1 0 01:20 ? 00:00:05 redis-sentinel *:26379

root 2683 1 0 01:36 ? 00:00:00 redis-server 10.0.0.3:6379

root 2689 2479 0 01:36 00:00:00 grep --color=auto redis

[root@slave bin]# tail -f /var/log/redis/sentinel-s3.log

[2673] 12 May 01:36:21.925 # -sdown slave 10.0.0.3:6379 10.0.0.3 6379 @ mymaster 10.0.0.3 63792

When the original master comes back, it automatically rejoins the cluster as a slave and does not take the master role back.
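
The new role can also be confirmed on the node itself with INFO replication. The sketch below only parses that output; role_summary is a hypothetical helper, and the sample input is illustrative text in the INFO field format, not output captured from this cluster.

```shell
# Hypothetical helper: summarize "redis-cli info replication" output from stdin.
role_summary() {
    tr -d '\r' | awk -F: '
        $1 == "role"               { role = $2 }
        $1 == "master_host"        { host = $2 }
        $1 == "master_port"        { port = $2 }
        $1 == "master_link_status" { link = $2 }
        END {
            if (role == "slave") print role " of " host ":" port " (link " link ")"
            else print role
        }'
}

# Against a live node this would be:
#   redis-cli -h 10.0.0.3 -p 6379 info replication | role_summary
# Illustrative sample of what a recovered former master might report:
printf 'role:slave\r\nmaster_host:10.0.0.3\r\nmaster_port:63792\r\nmaster_link_status:up\r\n' | role_summary
```

For the sample this prints "slave of 10.0.0.3:63792 (link up)". INFO lines end in CRLF, which is why the carriage returns are stripped before parsing.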


Test read/write splitting

[root@master bin]# redis-cli -h 10.0.0.3 -p 63792

10.0.0.3:63792> get key

"test"

10.0.0.3:63792> set key file

OK

10.0.0.3:63792> get key

"file"

[root@master bin]# redis-cli -h 10.0.0.3 -p 6379

10.0.0.3:6379> get key

"file"

10.0.0.3:6379> set key file1

(error) READONLY You can't write against a read only slave.

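This shows that the new master was promoted successfully. After recovering, the original master is now a read-only slave, so the existing arrangement of writing to the master and reading from the slaves is preserved.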
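
Because the master address changes after a failover, a client or script should not hard-code 10.0.0.3:6379 for writes. Sentinel answers the question directly with SENTINEL get-master-addr-by-name. The sketch below wraps that call; current_master and the REDIS_CLI variable are hypothetical names, and the error handling is deliberately simple.

```shell
# Command used to talk to Sentinel; overridable, for example for testing.
REDIS_CLI=${REDIS_CLI:-redis-cli}

# Hypothetical helper: ask each Sentinel in turn and print "ip:port" of the
# current master from the first one that gives a usable answer.
current_master() {
    name=$1; shift
    for s in "$@"; do
        host=${s%:*}; port=${s#*:}
        # The reply is two lines (ip, then port); join them with ":".
        addr=$("$REDIS_CLI" -h "$host" -p "$port" \
                   sentinel get-master-addr-by-name "$name" 2>/dev/null \
               | tr -d '\r' | paste -s -d: -)
        case $addr in
            *:*) echo "$addr"; return 0 ;;
        esac
    done
    return 1   # no Sentinel answered
}

# Usage against this article's Sentinels:
#   current_master mymaster 10.0.0.3:26379 10.0.0.3:26378 10.0.0.4:26379 10.0.0.4:26378
```

Listing several Sentinels lets the lookup keep working when one of them is unreachable. Writes then go to the returned address, and reads can go to any slave.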

That completes the deployment: the cluster now has monitoring, automatic failover, and read/write splitting.
