MongoDB sharding: when the two servers' clocks are out of sync, mongos reports "caught exception while doing balance: error checking clock skew of cluster"


Tue Nov 29 09:16:11 [Balancer] SyncClusterConnection connecting to [192.168.150.116:27012]
Tue Nov 29 09:16:11 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27015]
Tue Nov 29 09:16:11 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27016]
Tue Nov 29 09:16:11 [Balancer] ~ScopedDbConnection: _conn != null
Tue Nov 29 09:16:11 [Balancer] caught exception while doing balance: error checking clock skew of cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 :: caused by :: 13650 clock skew of the cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 is too far out of bounds to allow distributed locking.

mongos log on 192.168.150.116
Tue Nov 29 09:43:33 [Balancer] SyncClusterConnection connecting to [192.168.150.116:27012]
Tue Nov 29 09:43:33 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27015]
Tue Nov 29 09:43:33 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27016]
Tue Nov 29 09:43:33 [Balancer] ~ScopedDbConnection: _conn != null
Tue Nov 29 09:43:33 [Balancer] caught exception while doing balance: error checking clock skew of cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 :: caused by :: 13650 clock skew of the cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 is too far out of bounds to allow distributed locking.
Tue Nov 29 09:44:03 [Balancer] SyncClusterConnection connecting to [192.168.150.116:27012]
Tue Nov 29 09:44:03 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27015]
Tue Nov 29 09:44:03 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27016]
Tue Nov 29 09:44:03 [LockPinger] creating distributed lock ping thread for 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 and process WebServer:27013:1322457524:1804289383 (sleeping for 30000ms)
Tue Nov 29 09:44:03 [LockPinger] SyncClusterConnection connecting to [192.168.150.116:27012]
Tue Nov 29 09:44:03 [LockPinger] SyncClusterConnection connecting to [192.168.150.100:27015]
Tue Nov 29 09:44:03 [LockPinger] SyncClusterConnection connecting to [192.168.150.100:27016]
Tue Nov 29 09:44:04 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed438e30be277fd8006c95a
Tue Nov 29 09:44:04 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 09:44:14 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed438ee0be277fd8006c95b
Tue Nov 29 09:44:14 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 09:44:25 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed438f80be277fd8006c95c
Tue Nov 29 09:44:25 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 09:44:35 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed439030be277fd8006c95d
Tue Nov 29 09:44:35 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 09:44:45 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed4390d0be277fd8006c95e
Tue Nov 29 09:44:45 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 09:44:56 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed439170be277fd8006c95f

mongos log on 192.168.150.100
Tue Nov 29 09:44:11 [Balancer] SyncClusterConnection connecting to [192.168.150.116:27012]
Tue Nov 29 09:44:11 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27015]
Tue Nov 29 09:44:11 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27016]
Tue Nov 29 09:44:11 [Balancer] ~ScopedDbConnection: _conn != null
Tue Nov 29 09:44:11 [Balancer] caught exception while doing balance: error checking clock skew of cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 :: caused by :: 13650 clock skew of the cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 is too far out of bounds to allow distributed locking.
Tue Nov 29 09:44:18 [Balancer] SyncClusterConnection connecting to [192.168.150.116:27012]
Tue Nov 29 09:44:18 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27015]
Tue Nov 29 09:44:18 [Balancer] SyncClusterConnection connecting to [192.168.150.100:27016]
Tue Nov 29 09:44:18 [LockPinger] creating distributed lock ping thread for 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 and process localhost.localdomain:27010:1322457595:1804289383 (sleeping for 30000ms)
Tue Nov 29 09:44:18 [LockPinger] SyncClusterConnection connecting to [192.168.150.116:27012]
Tue Nov 29 09:44:18 [LockPinger] SyncClusterConnection connecting to [192.168.150.100:27015]
Tue Nov 29 09:44:18 [LockPinger] SyncClusterConnection connecting to [192.168.150.100:27016]
Tue Nov 29 09:44:18 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed438f215a9197441ec2236
Tue Nov 29 09:44:18 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 09:44:29 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed438fc15a9197441ec2237
Tue Nov 29 09:44:29 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 09:44:39 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed4390715a9197441ec2238
Tue Nov 29 09:44:39 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.


Before synchronizing the clocks
The clock on 116 was 38 seconds behind the clock on 100.
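
A lag like the 38 seconds noted above can be estimated without logging in to both machines and comparing `date` output by eye. The sketch below uses the standard NTP four-timestamp offset formula. The function names and the way the timestamps are collected are assumptions for illustration; the post does not say how the 38-second figure was measured.

```python
def clock_offset(t0, t1, t2, t3):
    """Estimate how far the remote clock is ahead of the local clock, in seconds.

    t0: local time when the request was sent
    t1: remote time when the request was received
    t2: remote time when the reply was sent
    t3: local time when the reply was received
    A positive result means the remote clock is ahead of the local one.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def round_trip_delay(t0, t1, t2, t3):
    """Network round-trip time in seconds, excluding remote processing time."""
    return (t3 - t0) - (t2 - t1)


if __name__ == "__main__":
    # Hypothetical readings: the local host (116) asks the remote host (100).
    print(clock_offset(100.0, 138.5, 138.5, 101.0))      # remote ahead by this many seconds
    print(round_trip_delay(100.0, 138.5, 138.5, 101.0))  # round trip in seconds
```

Averaging the two one-way differences cancels the network delay as long as the delay is roughly the same in both directions, which is a reasonable assumption on a LAN like this one.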

Synchronize the clocks of the two servers, with 100 taking its time from 116.
On 116, edit /etc/ntp.conf as root and add the following:
## add for rac
server 127.127.1.0
fudge  127.127.1.0 stratum 11
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
Then on 100, edit /etc/ntp.conf as root and add the following:
## add for rac
server 192.168.150.116 prefer
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
Then run the following command on both servers to start the NTP service:
/etc/init.d/ntpd start

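After synchronizing the clocks
Shortly after NTP started, 116 was 15 seconds behind 100, and the logs no longer showed the exception.
Once synchronization had stabilized, 100 was 2 seconds behind 116.
What I still don't understand is how large the time difference between the two servers can be before sharding raises this exception. I have not found the threshold documented anywhere.
It seems best to keep the two servers' clocks synchronized so that the problem does not occur.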
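
The observations above bracket the unknown threshold: a 38-second difference triggered the exception, while 15 seconds and 2 seconds did not. The sketch below checks a set of clock readings against a configurable limit. The 30-second default is a hypothetical placeholder chosen only because it falls between the observed values; it is not taken from MongoDB documentation and should be confirmed against the source code of the mongos version in use. The function names and host labels are also illustrative.

```python
# Hypothetical limit in milliseconds; an assumption, not a documented MongoDB value.
ASSUMED_MAX_SKEW_MS = 30000


def max_skew_ms(clock_ms_by_host):
    """Largest difference between any two clock readings, in milliseconds.

    clock_ms_by_host maps a host label to that host's clock reading in ms,
    with all readings taken at (approximately) the same instant.
    """
    readings = list(clock_ms_by_host.values())
    if not readings:
        return 0
    return max(readings) - min(readings)


def skew_within_limit(clock_ms_by_host, limit_ms=ASSUMED_MAX_SKEW_MS):
    """True if every pair of hosts differs by at most limit_ms."""
    return max_skew_ms(clock_ms_by_host) <= limit_ms


if __name__ == "__main__":
    # The three situations described in this post, relative to host 116.
    for label, lag_ms in [("before sync", 38000), ("during sync", 15000), ("stable", 2000)]:
        clocks = {"192.168.150.116": 0, "192.168.150.100": lag_ms}
        print(label, max_skew_ms(clocks), skew_within_limit(clocks))
```

Passing the limit as a parameter keeps the check usable once the real threshold is known.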

Finally, check the logs again
mongos log on 192.168.150.116
Tue Nov 29 10:08:59 [LockPinger] cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 pinged successfully at Tue Nov 29 10:08:59 2011 by distributed lock pinger '192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016/WebServer:27013:1322457524:1804289383', sleeping for 30000ms
Tue Nov 29 10:09:09 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed43ec50be277fd8006c9ea
Tue Nov 29 10:09:09 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 10:09:19 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed43ecf0be277fd8006c9eb
Tue Nov 29 10:09:19 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 10:09:29 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed43ed90be277fd8006c9ec
Tue Nov 29 10:09:30 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 10:09:40 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed43ee40be277fd8006c9ed
Tue Nov 29 10:09:40 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 10:09:50 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed43eee0be277fd8006c9ee
Tue Nov 29 10:09:50 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 10:10:00 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed43ef80be277fd8006c9ef
Tue Nov 29 10:10:00 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.
Tue Nov 29 10:10:11 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' acquired, ts : 4ed43f020be277fd8006c9f0
Tue Nov 29 10:10:11 [Balancer] distributed lock 'balancer/WebServer:27013:1322457524:1804289383' unlocked.

mongos log on 192.168.150.100
Tue Nov 29 10:08:57 [LockPinger] cluster 192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016 pinged successfully at Tue Nov 29 10:08:57 2011 by distributed lock pinger '192.168.150.116:27012,192.168.150.100:27015,192.168.150.100:27016/localhost.localdomain:27010:1322457595:1804289383', sleeping for 30000ms
Tue Nov 29 10:09:07 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43ec315a9197441ec22c6
Tue Nov 29 10:09:07 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 10:09:18 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43ecd15a9197441ec22c7
Tue Nov 29 10:09:18 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 10:09:28 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43ed815a9197441ec22c8
Tue Nov 29 10:09:28 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 10:09:38 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43ee215a9197441ec22c9
Tue Nov 29 10:09:38 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 10:09:49 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43eec15a9197441ec22ca
Tue Nov 29 10:09:49 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 10:09:59 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43ef715a9197441ec22cb
Tue Nov 29 10:09:59 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 10:10:09 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43f0115a9197441ec22cc
Tue Nov 29 10:10:09 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
Tue Nov 29 10:10:19 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' acquired, ts : 4ed43f0b15a9197441ec22cd
Tue Nov 29 10:10:20 [Balancer] distributed lock 'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.
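
Rather than reading the logs by eye, the check that the exception has stopped can be scripted. The sketch below scans mongos log lines for the balancer's clock-skew error and returns the timestamps at which it occurred. The regular expression is written against the log format shown in this post and may need adjusting for other MongoDB versions; the function name is an assumption.

```python
import re

# Matches lines such as:
# "Tue Nov 29 09:44:11 [Balancer] caught exception while doing balance:
#  error checking clock skew of cluster ..."
SKEW_ERROR = re.compile(
    r"^(\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2}) \[Balancer\] "
    r"caught exception while doing balance: error checking clock skew"
)


def skew_error_times(lines):
    """Return the timestamp prefix of every clock-skew balancer error."""
    times = []
    for line in lines:
        match = SKEW_ERROR.match(line)
        if match:
            times.append(match.group(1))
    return times


if __name__ == "__main__":
    sample = [
        "Tue Nov 29 09:44:11 [Balancer] ~ScopedDbConnection: _conn != null",
        "Tue Nov 29 09:44:11 [Balancer] caught exception while doing balance: "
        "error checking clock skew of cluster 192.168.150.116:27012",
        "Tue Nov 29 09:44:18 [Balancer] distributed lock "
        "'balancer/localhost.localdomain:27010:1322457595:1804289383' unlocked.",
    ]
    print(skew_error_times(sample))
```

If the list is empty for the period after NTP was started, the balancer has stopped reporting the skew error, which matches the final logs above.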
