Fault 1
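When resharding onto a host newly added to the Redis cluster with
redis-trib.rb reshard 192.168.38.70:6379
the following error appears: [ERR] Calling MIGRATE: ERR Syntax error, try CLIENT (LIST | KILL | GETNAME | SETNAME | PAUSE | REPLY)
A subsequent check shows that the interrupted migration left slot 5798 open: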
redis-trib.rb check 192.168.38.70:6379
>>> Performing Cluster Check (using node 192.168.38.70:6379)
M: c85144dc3834bb4c867e731ab861e1ed5ffcec91 192.168.38.70:6379
slots:0-5460 (5461 slots) master
1 additional replica(s)
M: e3aa5a826153c2aff9092bbb6d3a63869dc8654c 192.168.38.76:6379
slots:5461-5797 (337 slots) master
0 additional replica(s)
S: 21542801ec46b5d7ac8d3fbb27130288632a3f6a 192.168.38.77:6379
slots: (0 slots) slave
replicates e3aa5a826153c2aff9092bbb6d3a63869dc8654c
S: 057ea784225ab0ed967ae92413b9eb4a8b0a05b2 192.168.38.75:6379
slots: (0 slots) slave
replicates d7551746265dc1d6067d1a1897357297c9f4d487
M: 83c1e168afbee21bf812bab4fcab799ff7a0ed59 192.168.38.72:6379
slots:10923-16383 (5461 slots) master
1 additional replica(s)
S: 293f32ae281283faecf66e9da6817ff43f44217b 192.168.38.73:6379
slots: (0 slots) slave
replicates 83c1e168afbee21bf812bab4fcab799ff7a0ed59
M: d7551746265dc1d6067d1a1897357297c9f4d487 192.168.38.71:6379
slots:5798-10922 (5125 slots) master
1 additional replica(s)
S: 7e73872e8cf726348000cdf95e1bb28772d16e82 192.168.38.74:6379
slots: (0 slots) slave
replicates c85144dc3834bb4c867e731ab861e1ed5ffcec91
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.38.76:6379 has slots in importing state (5798).
[WARNING] Node 192.168.38.71:6379 has slots in migrating state (5798).
[WARNING] The following slots are open: 5798
>>> Check slots coverage...
[OK] All 16384 slots covered.
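The open slot reported above can also be picked out of the check output mechanically. A minimal sketch, assuming the output format shown above; the function name `open_slots` is made up for illustration:

```shell
# Print the slot numbers from the "[WARNING] The following slots are open:" line
# of `redis-trib.rb check` output read on stdin. Prints nothing when no slot is open.
open_slots() {
    sed -n 's/^\[WARNING] The following slots are open: //p'
}

# Example with the relevant lines from the output above.
printf '%s\n' \
    '[OK] All nodes agree about slots configuration.' \
    '[WARNING] The following slots are open: 5798' \
    '[OK] All 16384 slots covered.' | open_slots
```

On a live host the function would be fed directly, for example `redis-trib.rb check 192.168.38.70:6379 | open_slots`.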
Solution
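- Installing an earlier version of redis.rb (the redis gem for Ruby) resolves this problem.
- Uninstall the 4.x version and install a 3.x version (3.2.1 through 3.3.5 were tested and work; resharding fails on 4.x and later).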
# Check the current version
gem list redis
*** LOCAL GEMS ***
redis (4.1.3)
# Uninstall
gem uninstall redis
Successfully uninstalled redis-4.1.3
# Install 3.3.5
gem install redis -v 3.3.5
Fetching: redis-3.3.5.gem (100%)
Successfully installed redis-3.3.5
Parsing documentation for redis-3.3.5
Installing ri documentation for redis-3.3.5
Done installing documentation for redis after 0 seconds
1 gem installed
# Check the version again
gem list redis
*** LOCAL GEMS ***
redis (3.3.5)
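The version check and downgrade above can be guarded so the gem is only replaced when needed. A minimal sketch that parses a `gem list` line like the ones shown; `gem_major_version` and `needs_downgrade` are hypothetical helper names, and the 4.x cutoff reflects only the versions tested in this post:

```shell
# Extract the major version from a `gem list` line such as "redis (4.1.3)".
# If several versions are listed, the first one on the line is used.
gem_major_version() {
    printf '%s\n' "$1" | sed -n 's/^redis (\([0-9][0-9]*\)\..*/\1/p'
}

# Succeed when the listed redis gem is 4.x or later.
needs_downgrade() {
    major=$(gem_major_version "$1")
    [ -n "$major" ] && [ "$major" -ge 4 ]
}

# Usage on a live host (not run here):
#   line=$(gem list redis | grep '^redis (')
#   if needs_downgrade "$line"; then
#       gem uninstall redis && gem install redis -v 3.3.5
#   fi
```

Matching on `^redis (` keeps other gems whose names merely contain "redis" out of the check.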
Fault 2
Symptom
- After downgrading the Redis Ruby gem, running
redis-trib.rb fix 192.168.38.70:6379
to repair the cluster fails with [ERR] Sorry, can't connect to node 192.168.38.70:6379
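Cause
- Cluster operations with redis 3.x (via redis-trib.rb) do not support password authentication. Here I first commented out the password settings on every Redis node, then restored them once the cluster changes were complete.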
vi /apps/redis/etc/redis.conf
#masterauth "123456"
#requirepass "123456"
systemctl restart redis
# Run the steps above on every node
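Commenting out the two directives by hand on all eight nodes is error-prone, so the edit can be scripted. A minimal sketch, assuming the config path shown above and a `sed` that accepts `-i.bak` (GNU and BSD both do); `disable_auth` is a hypothetical helper name:

```shell
# Comment out active masterauth/requirepass directives in a redis.conf file.
# A backup of the original is kept next to it with a .bak suffix.
disable_auth() {
    conf=$1
    sed -i.bak \
        -e 's/^[[:space:]]*masterauth[[:space:]]/#&/' \
        -e 's/^[[:space:]]*requirepass[[:space:]]/#&/' \
        "$conf"
}

# Usage on each node (not run here):
#   disable_auth /apps/redis/etc/redis.conf && systemctl restart redis
# Restoring the passwords afterwards, assuming redis.conf was not changed in between:
#   mv /apps/redis/etc/redis.conf.bak /apps/redis/etc/redis.conf && systemctl restart redis
```

Lines that are already commented out are left alone, because the patterns only match directives at the start of a line.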
- Then run
redis-trib.rb fix 192.168.38.70:6379
to repair the cluster
redis-trib.rb fix 192.168.38.70:6379
[ERR] Sorry, can't connect to node 192.168.38.70:6379
redis-trib.rb fix 192.168.38.70:6379
>>> Performing Cluster Check (using node 192.168.38.70:6379)
M: c85144dc3834bb4c867e731ab861e1ed5ffcec91 192.168.38.70:6379
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 21542801ec46b5d7ac8d3fbb27130288632a3f6a 192.168.38.77:6379
slots: (0 slots) slave
replicates e3aa5a826153c2aff9092bbb6d3a63869dc8654c
S: 057ea784225ab0ed967ae92413b9eb4a8b0a05b2 192.168.38.75:6379
slots: (0 slots) slave
replicates d7551746265dc1d6067d1a1897357297c9f4d487
M: 83c1e168afbee21bf812bab4fcab799ff7a0ed59 192.168.38.72:6379
slots:10923-16383 (5461 slots) master
1 additional replica(s)
M: d7551746265dc1d6067d1a1897357297c9f4d487 192.168.38.71:6379
slots:5798-10922 (5125 slots) master
1 additional replica(s)
S: 7e73872e8cf726348000cdf95e1bb28772d16e82 192.168.38.74:6379
slots: (0 slots) slave
replicates c85144dc3834bb4c867e731ab861e1ed5ffcec91
M: e3aa5a826153c2aff9092bbb6d3a63869dc8654c 192.168.38.76:6379
slots:5461-5797 (337 slots) master
0 additional replica(s)
S: 293f32ae281283faecf66e9da6817ff43f44217b 192.168.38.73:6379
slots: (0 slots) slave
replicates 83c1e168afbee21bf812bab4fcab799ff7a0ed59
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.38.71:6379 has slots in migrating state (5798).
[WARNING] Node 192.168.38.76:6379 has slots in importing state (5798).
[WARNING] The following slots are open: 5798
>>> Fixing open slot 5798
Set as migrating in: 192.168.38.71:6379
Set as importing in: 192.168.38.76:6379
Moving slot 5798 from 192.168.38.71:6379 to 192.168.38.76:6379: .
>>> Check slots coverage...
[OK] All 16384 slots covered.
# Run check to view the state after the fix
redis-trib.rb check 192.168.38.70:6379
>>> Performing Cluster Check (using node 192.168.38.70:6379)
M: c85144dc3834bb4c867e731ab861e1ed5ffcec91 192.168.38.70:6379
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 21542801ec46b5d7ac8d3fbb27130288632a3f6a 192.168.38.77:6379
slots: (0 slots) slave
replicates e3aa5a826153c2aff9092bbb6d3a63869dc8654c
S: 057ea784225ab0ed967ae92413b9eb4a8b0a05b2 192.168.38.75:6379
slots: (0 slots) slave
replicates d7551746265dc1d6067d1a1897357297c9f4d487
M: 83c1e168afbee21bf812bab4fcab799ff7a0ed59 192.168.38.72:6379
slots:10923-16383 (5461 slots) master
1 additional replica(s)
M: d7551746265dc1d6067d1a1897357297c9f4d487 192.168.38.71:6379
slots:5799-10922 (5124 slots) master
1 additional replica(s)
S: 7e73872e8cf726348000cdf95e1bb28772d16e82 192.168.38.74:6379
slots: (0 slots) slave
replicates c85144dc3834bb4c867e731ab861e1ed5ffcec91
M: e3aa5a826153c2aff9092bbb6d3a63869dc8654c 192.168.38.76:6379
slots:5461-5798 (338 slots) master
0 additional replica(s)
S: 293f32ae281283faecf66e9da6817ff43f44217b 192.168.38.73:6379
slots: (0 slots) slave
replicates 83c1e168afbee21bf812bab4fcab799ff7a0ed59
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
- Finally, run
redis-trib.rb rebalance 192.168.38.70:6379
to balance the number of slots across the master nodes in the cluster
redis-trib.rb rebalance 192.168.38.70:6379
>>> Performing Cluster Check (using node 192.168.38.70:6379)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 4 nodes. Total weight = 4
Moving 1365 slots from 192.168.38.72:6379 to 192.168.38.76:6379
...
# Run check to view the state
redis-trib.rb check 192.168.38.70:6379
>>> Performing Cluster Check (using node 192.168.38.70:6379)
M: c85144dc3834bb4c867e731ab861e1ed5ffcec91 192.168.38.70:6379
slots:1365-5460 (4096 slots) master
1 additional replica(s)
S: 21542801ec46b5d7ac8d3fbb27130288632a3f6a 192.168.38.77:6379
slots: (0 slots) slave
replicates e3aa5a826153c2aff9092bbb6d3a63869dc8654c
S: 057ea784225ab0ed967ae92413b9eb4a8b0a05b2 192.168.38.75:6379
slots: (0 slots) slave
replicates d7551746265dc1d6067d1a1897357297c9f4d487
M: 83c1e168afbee21bf812bab4fcab799ff7a0ed59 192.168.38.72:6379
slots:12288-16383 (4096 slots) master
1 additional replica(s)
M: d7551746265dc1d6067d1a1897357297c9f4d487 192.168.38.71:6379
slots:6827-10922 (4096 slots) master
1 additional replica(s)
S: 7e73872e8cf726348000cdf95e1bb28772d16e82 192.168.38.74:6379
slots: (0 slots) slave
replicates c85144dc3834bb4c867e731ab861e1ed5ffcec91
M: e3aa5a826153c2aff9092bbb6d3a63869dc8654c 192.168.38.76:6379
slots:0-1364,5461-6826,10923-12287 (4096 slots) master
1 additional replica(s)
S: 293f32ae281283faecf66e9da6817ff43f44217b 192.168.38.73:6379
slots: (0 slots) slave
replicates 83c1e168afbee21bf812bab4fcab799ff7a0ed59
[ERR] Nodes don't agree about configuration!
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
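The 4096 slots per master in the output above are the even share of 16384 slots across 4 masters of equal weight. A small sketch of that arithmetic, plus a helper that reads the per-master counts from check output in the format shown; the function names are hypothetical, and this is not how redis-trib.rb itself computes its rebalance plan:

```shell
# Even share of the 16384 hash slots for a given number of equally weighted masters.
even_share() {
    echo $(( 16384 / $1 ))
}

# Slots a master must give away to land on the even share
# (a negative result means it must receive slots).
slots_to_give() {
    echo $(( $1 - $(even_share "$2") ))
}

# Read `redis-trib.rb check` output on stdin and print the slot count of each master.
master_slot_counts() {
    sed -n 's/.*(\([0-9][0-9]*\) slots) master$/\1/p'
}

even_share 4            # 4096
slots_to_give 5461 4    # 1365, matching "Moving 1365 slots from 192.168.38.72:6379"
slots_to_give 338 4     # -3758, the slots 192.168.38.76 has to receive
```

Piping a live check through `master_slot_counts` should print 4096 four times once the rebalance has finished.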