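The RegionServer kept exiting because its ZooKeeper session expired. This gave me a headache for a long time, so here is my summary of the possible causes:
1. Poor network conditions
2. GC pauses that run too long: the JVM is suspended, no heartbeats are sent, and the session (lease) expires
3. A busy CPU: the thread that maintains the ZooKeeper session doesn't get scheduled in time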
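Solutions:
- On the RS, set zookeeper.session.timeout to a longer value; I use 180000 (milliseconds, i.e. 3 minutes)
- On the RS, set hbase.regionserver.restart.on.zk.expire to true, so that the RS restarts itself after a session expiry instead of just exiting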
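
As a rough sketch, the two settings above might look like this, assuming they go into hbase-site.xml on each RegionServer (the file location is my assumption; the property names and values are the ones from the list):

```xml
<!-- hbase-site.xml on each RegionServer (sketch, values taken from the notes above) -->
<property>
  <name>zookeeper.session.timeout</name>
  <!-- session timeout in milliseconds: 180000 ms = 3 minutes -->
  <value>180000</value>
</property>
<property>
  <name>hbase.regionserver.restart.on.zk.expire</name>
  <!-- restart the RegionServer on session expiry instead of exiting -->
  <value>true</value>
</property>
```

One thing worth checking: the ZooKeeper server negotiates the session timeout and caps it at its own maxSessionTimeout (20 × tickTime by default), so on a separately managed ensemble the larger value may not take effect unless that limit is raised as well.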
For reference, here is the relevant source code:
/**
* We register ourselves as a watcher on the master address ZNode. This is
* called by ZooKeeper when we get an event on that ZNode. When this method
* is called it means either our master has died, or a new one has come up.
* Either way we need to update our knowledge of the master.
* @param event WatchedEvent from ZooKeeper.
*/
public void pr