When Ambari moves a NameNode, both NameNodes end up in standby
The exception is as follows:
2017-11-17 15:38:55,621 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(512)) - EventThread shut down
2017-11-17 15:38:55,621 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:processWatchEvent(549)) - Session connected.
2017-11-17 15:38:55,627 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:fenceOldActive(884)) - Checking for any old active which needs to be fenced...
2017-11-17 15:38:55,627 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:fenceOldActive(905)) - Old node exists: 0a046e6e686112036e6e321a1064656d6f3136302e746573742e636f6d20d43e28d33e
2017-11-17 15:38:55,628 WARN ha.ActiveStandbyElector (ActiveStandbyElector.java:becomeActive(816)) - Exception handling the winning of election
java.lang.RuntimeException: Mismatched address stored in ZK for NameNode at demo153.test.com/10.100.6.216:8020: Stored protobuf was nameserviceId: "nnha"
namenodeId: "nn2"
hostname: "demo160.test.com"
port: 8020
zkfcPort: 8019
, address from our own configuration for this NameNode was demo153.test.com/10.100.6.216:8020
at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.dataToTarget(DFSZKFailoverController.java:84)
at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:499)
at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:60)
at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:888)
at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:909)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:808)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:417)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2017-11-17 15:38:55,628 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:reJoinElection(672)) - Trying to re-establish ZK session
2017-11-17 15:38:55,632 INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x25fc8c545d00f59 closed
2017-11-17 15:38:56,632 INFO zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=demo102.test.com:2181,demo153.test.com:2181,demo96.test.com:2181,demo160.test.com:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@29502e07
2017-11-17 15:38:56,633 INFO zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server demo96.test.com/10.100.6.122:2181. Will not attempt to authenticate using SASL (unknown error)
2017-11-17 15:38:56,634 INFO zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(852)) - Socket connection established, initiating session, client: /10.100.6.122:35869, server: demo96.test.com/10.100.6.122:2181
2017-11-17 15:38:56,640 INFO zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1235)) - Session establishment complete on server demo96.test.com/10.100.6.122:2181, sessionid = 0x45fc8c5430d0fed, negotiated timeout = 5000
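The hex string on the "Old node exists" line is the serialized protobuf that ZKFC read from ZooKeeper. As a minimal sketch, it can be decoded by hand to confirm which host the znode still points to. The field names are taken from the protobuf printed in the exception above. The mapping of field numbers 1 to 5 onto those names is an assumption inferred from the order of the bytes, and the helper names `read_varint` and `decode_active_node_info` are hypothetical.

```python
def read_varint(buf, pos):
    """Decode one protobuf base-128 varint at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            return value, pos
        shift += 7


def decode_active_node_info(hex_payload):
    """Parse the hex payload into a dict of field name -> value."""
    # Assumed field numbering, inferred from the byte order in the log.
    names = {1: "nameserviceId", 2: "namenodeId", 3: "hostname",
             4: "port", 5: "zkfcPort"}
    buf = bytes.fromhex(hex_payload)
    fields, pos = {}, 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 0x07
        if wire_type == 2:  # length-delimited (strings here)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        elif wire_type == 0:  # varint (the two port numbers)
            value, pos = read_varint(buf, pos)
        else:
            raise ValueError("unexpected wire type %d" % wire_type)
        fields[names.get(field_no, field_no)] = value
    return fields


# Payload copied from the "Old node exists" log line.
OLD_NODE_HEX = ("0a046e6e686112036e6e321a1064656d6f3136302e"
                "746573742e636f6d20d43e28d33e")
info = decode_active_node_info(OLD_NODE_HEX)
print(info)
```

The decoded hostname is demo160.test.com, which matches the "Stored protobuf" in the exception and differs from the demo153.test.com address in the local configuration.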
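This exception is caused by the HA state initialized in ZooKeeper: the znode still records nn2 at demo160.test.com, while the configuration for this NameNode now points to demo153.test.com. To recover, do the following: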
- Stop the Failover Controller
- Delete /hadoop-ha through ZooKeeper
- Initialize the ZooKeeper HA state through Cloudera Manager
- Restart the Failover Controller
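
For clusters where these steps are run from a shell rather than a management UI, the sequence might look roughly like the dry-run sketch below. It only prints the commands and executes nothing. The command-line equivalents are assumptions, not part of the original procedure: `hadoop-daemon.sh stop|start zkfc` applies to Hadoop 2.x layouts and must be run on each NameNode host, `rmr` is the recursive delete in the ZooKeeper 3.4 CLI (newer versions use `deleteall`), and `hdfs zkfc -formatZK` is the stock Hadoop command for initializing the HA znode. The ZooKeeper address is taken from the log above, and `build_recovery_plan` is a hypothetical helper.

```python
def build_recovery_plan(zk_server, ha_znode="/hadoop-ha"):
    """Return the recovery steps as a list of argv lists (nothing is executed)."""
    return [
        # 1. Stop the Failover Controller (on every NameNode host).
        ["hadoop-daemon.sh", "stop", "zkfc"],
        # 2. Remove the stale HA znode from ZooKeeper.
        ["zkCli.sh", "-server", zk_server, "rmr", ha_znode],
        # 3. Re-initialize the HA state in ZooKeeper (on one NameNode host).
        ["hdfs", "zkfc", "-formatZK"],
        # 4. Start the Failover Controller again (on every NameNode host).
        ["hadoop-daemon.sh", "start", "zkfc"],
    ]


plan = build_recovery_plan("demo96.test.com:2181")
for step, argv in enumerate(plan, start=1):
    print("%d. %s" % (step, " ".join(argv)))
```

Printing the plan first makes it easy to review each command against the cluster's actual Hadoop and ZooKeeper versions before running anything.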