Redis Cluster data loss: no automatic failover after a machine crash, and after a manual takeover the restarted old master took the master role back

1. Background

A machine crashed. On inspection, one node had not failed over automatically. To keep the cluster available to clients, we decided to force a failover manually.

2. Incident description

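We first ran cluster failover force to force the failover, but three attempts all failed. We then ran cluster failover takeover, the more forceful variant. After the takeover, the cluster could serve requests again.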
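
The two commands differ in how much agreement they need. Per the Redis documentation, FORCE skips the handshake with the unreachable master but still needs authorization from a majority of masters, while TAKEOVER skips cluster agreement entirely and the replica assigns itself a new configEpoch. Both must be sent to the replica being promoted. A minimal Python sketch of the escalation as redis-cli invocations follows. The host 127.0.0.1 is a placeholder (the replica's address is not given in this post), and the helper name failover_argv is hypothetical.

```python
from typing import List

# Escalation order used in this incident, weakest to strongest.
FAILOVER_MODES = ("force", "takeover")


def failover_argv(host: str, port: int, mode: str = "") -> List[str]:
    """Build the redis-cli argument list for CLUSTER FAILOVER [FORCE|TAKEOVER].

    The command has to be sent to the replica that should be promoted.
    """
    argv = ["redis-cli", "-h", host, "-p", str(port), "cluster", "failover"]
    if mode:
        if mode not in FAILOVER_MODES:
            raise ValueError("unknown failover mode: " + mode)
        argv.append(mode)
    return argv


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder host; 13808 is the replica port from this post.
    for mode in FAILOVER_MODES:
        print(" ".join(failover_argv("127.0.0.1", 13808, mode)))
```

The sketch only builds the command lines. Whether to escalate to takeover is an operator decision, because takeover promotes the replica without any vote from the other masters.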

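However, once the crashed machine was rebooted and its Redis instances were brought back up, the node that had not failed over automatically became the master again, and the data was lost.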

3. Problem analysis

3.1 Log records

Log of the replica that did not fail over normally (node 13808):

# The machine is down; FAIL (objectively down) notifications for other nodes arrive one after another
6836:S 08 May 11:37:12.204 * Marking node f4edfc38749148be941481e3d17fb14ecdc62947 as failing (quorum reached).
6836:S 08 May 11:37:12.399 * Marking node 6a82a479431cab234bb4b8330ca4dea2d409871e as failing (quorum reached).
...
6836:S 08 May 11:40:15.287 * Marking node d84c1798cf3470cdbd3bd8a2261d59c117ff918e as failing (quorum reached).

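# After the crash the replica keeps trying to get elected, but 11 of this cluster's 30 masters were on the crashed machine, so the voting communication among masters had problems and the election never succeeded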
6836:S 08 May 11:40:15.384 # Start of election delayed for 786 milliseconds (rank #0, offset 2989077250425).
6836:S 08 May 11:40:16.254 # Starting a failover election for epoch 141.
# Communication problems: the masters' votes do not reach a majority, so the failover cannot proceed
6836:S 08 May 11:40:35.345 # Currently unable to failover: Waiting for votes, but majority still not reached.
6836:S 08 May 11:40:46.209 # Currently unable to failover: Failover attempt expired.
6836:S 08 May 11:41:16.242 # Start of election delayed for 751 milliseconds (rank #0, offset 2989077250425).
6836:S 08 May 11:41:16.338 # Currently unable to failover: Waiting the delay before I can start a new failover.
6836:S 08 May 11:41:17.014 # Starting a failover election for epoch 142.
6836:S 08 May 11:41:17.110 # Currently unable to failover: Waiting for votes, but majority still not reached.
6836:S 08 May 11:41:19.977 # MASTER timeout: no data nor PING received...
6836:S 08 May 11:41:19.977 # Connection with master lost.
6836:S 08 May 11:41:19.977 * Caching the disconnected master state.
6836:S 08 May 11:41:19.977 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:41:19.977 * MASTER <-> SLAVE sync started
6836:S 08 May 11:41:28.976 # Error condition on socket for SYNC: Connection timed out
...
6836:S 08 May 11:42:16.096 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:42:16.096 * MASTER <-> SLAVE sync started
6836:S 08 May 11:42:17.033 # Start of election delayed for 754 milliseconds (rank #0, offset 2989077250425).
6836:S 08 May 11:42:17.131 # Currently unable to failover: Waiting the delay before I can start a new failover.
6836:S 08 May 11:42:17.806 # Starting a failover election for epoch 143.
6836:S 08 May 11:42:17.904 # Currently unable to failover: Waiting for votes, but majority still not reached.
6836:S 08 May 11:42:25.096 # Error condition on socket for SYNC: Connection timed out
6836:S 08 May 11:42:25.108 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:42:25.108 * MASTER <-> SLAVE sync started
6836:S 08 May 11:42:34.108 # Error condition on socket for SYNC: Connection timed out
...
6836:S 08 May 11:43:10.162 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:43:10.162 * MASTER <-> SLAVE sync started
6836:S 08 May 11:43:17.883 # Start of election delayed for 886 milliseconds (rank #0, offset 2989077250425).
6836:S 08 May 11:43:17.979 # Currently unable to failover: Waiting the delay before I can start a new failover.
6836:S 08 May 11:43:18.855 # Starting a failover election for epoch 144.
6836:S 08 May 11:43:18.951 # Currently unable to failover: Waiting for votes, but majority still not reached.
6836:S 08 May 11:43:19.162 # Error condition on socket for SYNC: Connection timed out
6836:S 08 May 11:43:19.176 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:43:19.176 * MASTER <-> SLAVE sync started
6836:S 08 May 11:43:28.176 # Error condition on socket for SYNC: Connection timed out
...
6836:S 08 May 11:44:14.260 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:44:14.260 * MASTER <-> SLAVE sync started
6836:S 08 May 11:44:18.837 # Start of election delayed for 532 milliseconds (rank #0, offset 2989077250425).
6836:S 08 May 11:44:18.935 # Currently unable to failover: Waiting the delay before I can start a new failover.
6836:S 08 May 11:44:19.425 # Starting a failover election for epoch 145.
6836:S 08 May 11:44:19.522 # Currently unable to failover: Waiting for votes, but majority still not reached.
6836:S 08 May 11:44:23.260 # Error condition on socket for SYNC: Connection timed out
6836:S 08 May 11:44:23.273 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:44:23.273 * MASTER <-> SLAVE sync started

# First cluster failover force attempt
6836:S 08 May 11:44:28.307 # Forced failover user request accepted.
6836:S 08 May 11:44:32.273 # Error condition on socket for SYNC: Connection timed out
6836:S 08 May 11:44:32.278 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:44:32.278 * MASTER <-> SLAVE sync started
# The first forced failover timed out
6836:S 08 May 11:44:33.384 # Manual failover timed out.

# Second cluster failover force attempt
6836:S 08 May 11:44:37.955 # Forced failover user request accepted.
6836:S 08 May 11:44:41.280 # Error condition on socket for SYNC: Connection timed out
6836:S 08 May 11:44:41.292 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:44:41.292 * MASTER <-> SLAVE sync started
6836:S 08 May 11:44:42.976 # Manual failover timed out.

# Third cluster failover force attempt
6836:S 08 May 11:44:47.440 # Forced failover user request accepted.
6836:S 08 May 11:44:49.371 # Currently unable to failover: Failover attempt expired.
6836:S 08 May 11:44:50.293 # Error condition on socket for SYNC: Connection timed out
6836:S 08 May 11:44:50.312 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 11:44:50.312 * MASTER <-> SLAVE sync started
6836:S 08 May 11:44:52.473 # Manual failover timed out.

# Run cluster failover takeover
6836:S 08 May 11:44:55.699 # Taking over the master (user request).
# configEpoch is set to 146
6836:S 08 May 11:44:55.699 # New configEpoch set to 146
# Discard the previously cached master state and start serving as master
6836:M 08 May 11:44:55.699 * Discarding previously cached master state.

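# The other masters on the crashed machine, which had failed over normally, are now brought back up; after they reconnect the log shows: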
6836:M 08 May 12:05:53.445 * Clear FAIL state for node 71f8ab3c35baceea3415ecc91797b5e6fe9b4136: master without slots is reachable again.
6836:M 08 May 12:13:30.911 * Clear FAIL state for node 6a82a479431cab234bb4b8330ca4dea2d409871e: master without slots is reachable again.

# The old master of this node is brought back up; log after it reconnects:
6836:M 08 May 13:09:30.309 * Clear FAIL state for node d84c1798cf3470cdbd3bd8a2261d59c117ff918e: master without slots is reachable again.
# The old master wins with a higher epoch and takes the master role back; this node becomes a replica of the old master again
6836:M 08 May 13:09:37.532 # Configuration change detected. Reconfiguring myself as a replica of d84c1798cf3470cdbd3bd8a2261d59c117ff918e
6836:S 08 May 13:09:38.051 * Connecting to MASTER 10.142.1.13:13778
6836:S 08 May 13:09:38.051 * MASTER <-> SLAVE sync started
6836:S 08 May 13:09:38.051 * Non blocking connect for SYNC fired the event.
6836:S 08 May 13:09:38.053 * Master replied to PING, replication can continue...
6836:S 08 May 13:09:38.053 * Partial resynchronization not possible (no cached master)
6836:S 08 May 13:09:38.053 * Full resync from master: 731c4a47d04cf0eb2af608bd5ff442850086dbc9:1
6836:S 08 May 13:09:38.353 * MASTER <-> SLAVE sync: receiving 77105 bytes from master
6836:S 08 May 13:09:39.338 * MASTER <-> SLAVE sync: Flushing old data
6836:S 08 May 13:10:23.333 * MASTER <-> SLAVE sync: Loading DB in memory
6836:S 08 May 13:10:23.335 * MASTER <-> SLAVE sync: Finished with success
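
Two things in the replica log above can be checked mechanically: the epoch used by each failed election and by the takeover, and the number of votes an election needs. Below is a minimal Python sketch. The sample lines are copied from the log above, the function names are hypothetical, and the vote count assumes that all 30 masters own slots, since Redis Cluster requires votes from a majority of slot-owning masters.

```python
import re
from typing import Iterable, List, Optional, Tuple

ELECTION_RE = re.compile(r"Starting a failover election for epoch (\d+)\.")
TAKEOVER_RE = re.compile(r"New configEpoch set to (\d+)")


def epoch_timeline(lines: Iterable[str]) -> Tuple[List[int], Optional[int]]:
    """Return (epochs of attempted elections, configEpoch set by takeover)."""
    elections: List[int] = []
    takeover: Optional[int] = None
    for line in lines:
        match = ELECTION_RE.search(line)
        if match:
            elections.append(int(match.group(1)))
            continue
        match = TAKEOVER_RE.search(line)
        if match:
            takeover = int(match.group(1))
    return elections, takeover


def votes_needed(masters_with_slots: int) -> int:
    """Majority of slot-owning masters required to win a replica election."""
    return masters_with_slots // 2 + 1


# Lines copied verbatim from the 13808 replica log.
SAMPLE_LOG = [
    "6836:S 08 May 11:40:16.254 # Starting a failover election for epoch 141.",
    "6836:S 08 May 11:41:17.014 # Starting a failover election for epoch 142.",
    "6836:S 08 May 11:42:17.806 # Starting a failover election for epoch 143.",
    "6836:S 08 May 11:43:18.855 # Starting a failover election for epoch 144.",
    "6836:S 08 May 11:44:19.425 # Starting a failover election for epoch 145.",
    "6836:S 08 May 11:44:55.699 # New configEpoch set to 146",
]

if __name__ == "__main__":
    print(epoch_timeline(SAMPLE_LOG))  # ([141, 142, 143, 144, 145], 146)
    print(votes_needed(30))            # 16 (assumes all 30 masters own slots)
```

Under that assumption an election needs 16 votes, while 30 - 11 = 19 masters were outside the crashed machine, so the head count alone would not block an election. The log only shows that the votes did not arrive before each attempt expired. The takeover then set configEpoch to 146, one above the last attempted election epoch, without any vote.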

Log of the master on the crashed machine (node 13778):

168063:M 08 May 13:09:04.491 * Node configuration loaded, I'm d84c1798cf3470cdbd3bd8a2261d59c117ff918e
168063:M 08 May 13:09:04.493 # Server started, Redis version 3.0.7
# The start command was run more than once; these are warning logs and can probably be ignored?
168096:M 08 May 13:09:04.900 # Creating Server TCP listening socket *:13778: bind: Address already in use
168545:M 08 May 13:09:24.045 # Creating Server TCP listening socket *:13778: bind: Address already in use
# The instance starts successfully and begins serving on port 13778
168063:M 08 May 13:09:30.299 * DB loaded from disk: 25.807 seconds
168063:M 08 May 13:09:30.300 * The server is now ready to accept connections on port 13778
# Prepares to become a replica of 13808
168063:M 08 May 13:09:30.303 # Configuration change detected. Reconfiguring myself as a replica of 56613f3183ce8349dc5f502