Analysis report: Hadoop HA does not fail over automatically

Environment:

1, baseA10 (CentOS 6.8 minimal install)
2, hadoop-2.6.0-cdh5.4.0-cent6.5.tar.gz
3, jdk8
4, zookeeper-3.4.5-cdh5.4.0.tar.gz
5, Five machines (dmp32, dmp33, dmp34, dmp35, dmp36) form the cluster; dmp33 and dmp34 are the NameNodes

2: After deploying Hadoop HA:
hdfs haadmin -getServiceState nn1 reports active
hdfs haadmin -getServiceState nn2 reports standby

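hdfs dfsadmin -report shows a healthy cluster, so the deployment itself succeeded.
To verify that the two NameNodes can fail over, I killed the active NameNode. The standby did not automatically transition to active.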
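The before-and-after check described above can be scripted. The following is a minimal sketch, assuming the `hdfs` command is on the PATH and the service IDs are nn1 and nn2 as in this post; the function name `find_active_nn` is made up for illustration.

```shell
# Print the service ID of the first NameNode that reports "active".
# Arguments: the NameNode service IDs to query (nn1 and nn2 in this post).
# Returns non-zero if none of them is active.
find_active_nn() {
    for nn in "$@"; do
        # Same command as used above; stderr is hidden because a killed
        # NameNode makes the command print connection errors.
        state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$nn"
            return 0
        fi
    done
    return 1
}
```

Running `find_active_nn nn1 nn2` before and after killing the active NameNode shows whether the role moved. If it prints nothing after the kill, no NameNode took over, which is the symptom described here.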
Checking the ZKFC log:
hadoop-eversec-zkfc-DMP1.log

2017-08-03 14:18:32,075 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
2017-08-03 14:18:32,075 INFO org.apache.hadoop.ha.NodeFencer: Trying method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
2017-08-03 14:18:32,077 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connecting to dmp33...
2017-08-03 14:18:32,077 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connecting to dmp33 port 22
2017-08-03 14:18:32,078 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connection established
2017-08-03 14:18:32,083 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Remote version string: SSH-2.0-OpenSSH_7.5
2017-08-03 14:18:32,083 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Local version string: SSH-2.0-JSCH-0.1.42
2017-08-03 14:18:32,083 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256
2017-08-03 14:18:32,084 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes256-ctr is not available.
2017-08-03 14:18:32,084 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes192-ctr is not available.
2017-08-03 14:18:32,084 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes256-cbc is not available.
2017-08-03 14:18:32,084 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: aes192-cbc is not available.
2017-08-03 14:18:32,084 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: arcfour256 is not available.
2017-08-03 14:18:32,084 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXINIT sent
2017-08-03 14:18:32,084 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXINIT received
2017-08-03 14:18:32,085 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Disconnecting from dmp33 port 22
2017-08-03 14:18:32,085 WARN org.apache.hadoop.ha.SshFenceByTcpPort: Unable to connect to dmp33 as user root
com.jcraft.jsch.JSchException: Algorithm negotiation fail
        at com.jcraft.jsch.Session.receive_kexinit(Session.java:520)
        at com.jcraft.jsch.Session.connect(Session.java:286)
        at org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
        at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
        at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:527)
        at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:500)
        at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:60)
        at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:887)
        at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:901)
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:800)
        at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2017-08-03 14:18:32,085 WARN org.apache.hadoop.ha.NodeFencer: Fencing method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
2017-08-03 14:18:32,085 ERROR org.apache.hadoop.ha.NodeFencer: Unable to fence service by any configured method.
2017-08-03 14:18:32,085 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
java.lang.RuntimeException: Unable to fence NameNode at dmp33/172.16.100.33:8020
        at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:528)
        at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:500)
        at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:60)
        at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:887)
        at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:901)
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:800)
        at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2017-08-03 14:18:32,085 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
2017-08-03 14:18:32,099 INFO org.apache.zookeeper.ZooKeeper: Session: 0x35da4da3bad0316 closed
2017-08-03 14:18:33,100 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=dmp32:2181,dmp33:2181,dmp34:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@5ede09fa
2017-08-03 14:18:33,101 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server dmp34/172.16.100.34:2181. Will not attempt to authenticate using SASL (unknown error)
2017-08-03 14:18:33,101 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to dmp34/172.16.100.34:2181, initiating session
2017-08-03 14:18:33,111 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server dmp34/172.16.100.34:2181, sessionid = 0x35da4da3bad0317, negotiated timeout = 5000
2017-08-03 14:18:33,111 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2017-08-03 14:18:33,112 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2017-08-03 14:18:33,116 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
2017-08-03 14:18:33,117 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a076c6f675361766512036e6e321a05646d70333320d43e28d33e
2017-08-03 14:18:33,118 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at dmp33/172.16.100.33:8020
2017-08-03 14:18:34,119 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: dmp33/172.16.100.33:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)


====================================================================

The log shows that the SSH fencing connection to dmp33 as user root failed:
2017-08-03 14:18:32,085 WARN org.apache.hadoop.ha.SshFenceByTcpPort: Unable to connect to dmp33 as user root
com.jcraft.jsch.JSchException: Algorithm negotiation fail
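To pull the relevant lines out of a long ZKFC log, a small grep wrapper is enough. This is only a sketch: the function name `show_fence_failures` is hypothetical, and the patterns are taken from the messages in the log excerpt above, so other Hadoop versions may word them differently.

```shell
# Print the fencing failure lines and the SSH version strings from a ZKFC log.
# $1: path to the log file (for example hadoop-eversec-zkfc-DMP1.log)
show_fence_failures() {
    grep -E 'Unable to fence|Algorithm negotiation fail|Unable to connect to .* as user|version string' "$1"
}
```

The two "version string" lines are included because they show both sides of the SSH handshake (OpenSSH_7.5 on the server, JSCH-0.1.42 on the client) next to the failure.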

Comparison with machines in another province where failover works:
Working:
[screenshot not preserved]

Failing:
[screenshot not preserved]

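Conclusion: the two NameNodes cannot fail over automatically because the OpenSSH version installed on the operating system (OpenSSH_7.5 in the log) and the JSch SSH client bundled with Hadoop (JSCH-0.1.42 in the log) fail to agree on SSH algorithms. The fencing SSH connection therefore fails, the old active NameNode cannot be fenced, and the standby never becomes active.
The fix is to upgrade the jar under $hadoop_home/share from jsch-0.1.42.jar to jsch-0.1.54.jar.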
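Before changing anything, it can help to see which jsch jars the installation actually ships. A minimal sketch, assuming the jars live under the `share` directory of the Hadoop installation as described above; the function name `list_jsch_jars` is hypothetical.

```shell
# List every jsch jar under a Hadoop installation directory, sorted by path.
# $1: Hadoop installation directory (the jars are assumed to be under share/)
list_jsch_jars() {
    find "$1/share" -type f -name 'jsch-*.jar' | sort
}
```

For example, `list_jsch_jars "$HADOOP_HOME"` (if that variable points at the installation) prints one path per bundled copy, and the version is visible in each file name.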

find ./ -name 'jsch*' | xargs -t -I {} rm {}

cp jsch-0.1.54.jar ./share/hadoop/common/lib/jsch-0.1.54.jar
cp jsch-0.1.54.jar ./share/hadoop/jsch-0.1.54.jar
cp jsch-0.1.54.jar ./share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jsch-0.1.54.jar
cp jsch-0.1.54.jar ./share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/jsch-0.1.54.jar
cp jsch-0.1.54.jar ./share/hadoop/tools/lib/jsch-0.1.54.jar
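The remove-then-copy steps above can also be done in one pass that keeps the old jars as a backup. This is a sketch under assumptions, not the procedure used in this post: the function name `replace_jsch_jars` and the backup directory are hypothetical, and the backup directory should sit outside `share`.

```shell
# Replace every bundled jsch-*.jar under $1/share with the jar given as $2,
# moving each old jar into the backup directory $3 instead of deleting it.
replace_jsch_jars() {
    hadoop_dir=$1
    new_jar=$2
    backup_dir=$3
    new_name=$(basename "$new_jar")
    mkdir -p "$backup_dir" || return 1
    # Skip copies that already carry the new file name.
    find "$hadoop_dir/share" -type f -name 'jsch-*.jar' ! -name "$new_name" |
    while IFS= read -r old_jar; do
        lib_dir=$(dirname "$old_jar")
        # Name the backup after the full path so copies from different
        # directories do not overwrite each other.
        backup_name=$(echo "$old_jar" | tr '/' '_')
        if mv "$old_jar" "$backup_dir/$backup_name" &&
           cp "$new_jar" "$lib_dir/$new_name"; then
            echo "replaced $old_jar"
        else
            echo "failed to replace $old_jar" >&2
        fi
    done
}
```

A call such as `replace_jsch_jars /path/to/hadoop ./jsch-0.1.54.jar /tmp/jsch-backup` would put the new jar wherever an old one was found. Unlike the `find ... rm` command, it does not delete a jsch-0.1.54.jar that happens to be in the search path.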

After restarting the cluster, the two NameNodes fail over automatically.
