Rebuilding the Standby NameNode in Hadoop HA

Symptom: at first, the NameNode log was flooded with the following error message:

2014-01-27 17:55:59,388 WARN  resources.ExceptionHandler (ExceptionHandler.java:toResponse(92)) - INTERNAL_SERVER_ERROR

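What happened next was similar to the case described in this article: Hadoop运维笔记 之 Namenode异常停止后无法正常启动 (Hadoop Ops Notes: NameNode cannot start normally after stopping abnormally).

It is likewise a bug in the Hadoop-2.10-beta release (testNamenodeRestart fails with NullPointerException in trunk):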

This is actually due to a bug in the NN. The http services are started before the image is loaded, the edits are processed, and the rpc server is started. During image loading and edits processing, webhdfs will NPE on the rpc server.

 

Since the NameNode could not be started, the only option was to rebuild the Standby. The steps are as follows:

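1. First, run the following commands on the Active node, then manually back up the entire name directory:

# Stop the automatic failover controller (ZKFC)
hadoop-daemon.sh stop zkfc

# Enter safe mode
hdfs dfsadmin -safemode enter

# Flush the edit log into the fsimage
hdfs dfsadmin -saveNamespace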
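
The manual backup mentioned in step 1 is not shown in the original notes. A minimal sketch of one way to do it follows, assuming a tar archive is acceptable as the backup format. The helper name backup_dir and the destination /data/backup are hypothetical; /data/hadoop/name is the name directory path that appears later in this post.

```shell
# Hypothetical helper: archive a directory into a timestamped tarball.
# $1: directory to back up (e.g. the name directory)
# $2: directory that receives the archive (must not be inside $1)
backup_dir() {
  src="${1%/}"   # strip one trailing slash so basename/dirname behave
  dest="$2"
  [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
  mkdir -p "$dest" || return 1
  archive="$dest/$(basename "$src")-$(date +%Y%m%d-%H%M%S).tar.gz"
  # -C keeps the paths inside the archive relative to the parent directory
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  # Print the archive path so the caller can record or verify it
  echo "$archive"
}

# Example call (the destination path is an assumption):
# backup_dir /data/hadoop/name /data/backup
```

The same helper could be pointed at the journal directory on the Standby before step 2.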

2. Then, on the Standby node, first back up the entire name and journal directories, then run:

hadoop-daemon.sh stop zkfc
hdfs namenode -bootstrapStandby

If this fails with the following error:

FATAL ha.BootstrapStandby: Unable to read transaction ids 10-100 from the configured shared edits storage qjournal://1.1.1.1:8485;1.1.1.2:8485/sec-hdfs-cluster. Please copy these logs into the shared edits storage or call saveNamespace on the active node.
Error: Gap in transactions. Expected to be able to read up until at least txid 10 but unable to find any edit logs containing txid 10

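then copy the entire name directory from the Active node to the Standby and simply start the NameNode there:

# Run on the Active: copy the name directory to the Standby
scp -r /data/hadoop/name/ $standby_ip:/data/hadoop
# Run on the Standby: start the NameNode
hadoop-daemon.sh start namenode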
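
To see which transactions a "Gap in transactions" error like the one above refers to, it can help to list the finalized edit log segments in a storage directory and look for holes between them. The sketch below assumes the usual segment naming, edits_<firstTxId>-<lastTxId> with zero-padded transaction IDs; the function name find_edit_gaps is hypothetical, and the directory to inspect depends on your own configuration.

```shell
# Hypothetical helper: report gaps between finalized edit log segments.
# $1: a "current" directory holding files named edits_<firstTxId>-<lastTxId>
find_edit_gaps() {
  dir="$1"
  ls "$dir" 2>/dev/null \
    | grep -E '^edits_[0-9]+-[0-9]+$' \
    | sed -e 's/^edits_//' -e 's/-/ /' \
    | sort -n \
    | awk '
        # $1 = first txid of the segment, $2 = last txid.
        # Adding 0 forces awk to treat the zero-padded fields as decimal numbers.
        NR > 1 && ($1 + 0) != prev + 1 {
          printf "gap: txid %d-%d missing\n", prev + 1, $1 - 1
        }
        { prev = $2 + 0 }
      '
}

# Example call (the journal path is an assumption, not taken from the post):
# find_edit_gaps /data/hadoop/journal/sec-hdfs-cluster/current
```

No output means the finalized segments in that directory are contiguous. In-progress segments (edits_inprogress_<firstTxId>) are ignored by this sketch.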

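3. Note: once the name directory has been copied over this way, there is no need to run "bootstrapStandby"; doing so would re-initialize the name directory that was just copied and wipe its contents.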
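
The notes stop once the Standby NameNode is started, but step 1 left the Active in safe mode and ZKFC stopped on both nodes. The following is a sketch of a plausible follow-up, not part of the original procedure: it assumes you want to leave safe mode and restore automatic failover once the Standby is healthy. The function names and the DRY_RUN switch are hypothetical, and nn2 in the example stands in for whatever NameNode ID your dfs.ha.namenodes.<nameservice> setting defines.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumed follow-up on the Active: leave the manually entered safe mode
# and bring the failover controller back.
resume_active() {
  run hdfs dfsadmin -safemode leave
  run hadoop-daemon.sh start zkfc
}

# Assumed follow-up on the Standby: restart the failover controller and
# query the HA state. $1 is the NameNode ID of the Standby.
resume_standby() {
  run hadoop-daemon.sh start zkfc
  run hdfs haadmin -getServiceState "$1"
}

# Example (nn2 is a placeholder NameNode ID):
# DRY_RUN=1; resume_standby nn2
```

Running with DRY_RUN=1 first only prints the commands, which makes it easy to review them before touching a production cluster.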


Reposted from: https://my.oschina.net/cwalet/blog/680572
