NodeManager startup errors

1. NodeManager fails to start

2013-07-25 20:06:22,266 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:196)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:329)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:351)
Caused by: org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:78)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.start(ContainerManagerImpl.java:248)
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
	... 3 more
Caused by: org.apache.hadoop.yarn.YarnException: Failed to check for existence of remoteLogDir [/yarn/apps]
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:179)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.start(LogAggregationService.java:132)
	at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
	... 5 more

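The /yarn/apps directory does in fact exist.

After a restart the NodeManager came up again, with no obvious explanation.

This sometimes happens because the IP address is wrong: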

SHUTDOWN_MSG: Shutting down NodeManager at localhost.localdomain/192.168.1.109

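The IP address in the log is not the machine's current IP. Once the IP has been configured correctly, whether manually or automatically, restart the NodeManager.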
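
The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names `logged_address` and `ip_is_stale` are hypothetical, and the "current" IP passed in is a placeholder that the reader supplies (for example from the output of `ip addr`).

```python
import re

# Matches the tail of a Hadoop SHUTDOWN_MSG line: "... at <hostname>/<ipv4>"
SHUTDOWN_RE = re.compile(
    r"SHUTDOWN_MSG: Shutting down \S+ at (\S+)/(\d+\.\d+\.\d+\.\d+)"
)

def logged_address(line):
    """Return (hostname, ip) from a SHUTDOWN_MSG line, or None if it does not match."""
    m = SHUTDOWN_RE.search(line)
    return (m.group(1), m.group(2)) if m else None

def ip_is_stale(line, current_ip):
    """True when the IP recorded in the log differs from the machine's current IP."""
    parsed = logged_address(line)
    return parsed is not None and parsed[1] != current_ip

line = "SHUTDOWN_MSG: Shutting down NodeManager at localhost.localdomain/192.168.1.109"
print(logged_address(line))
# "192.168.0.137" is only a placeholder for whatever the machine's IP is right now.
print(ip_is_stale(line, "192.168.0.137"))
```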



2. NodeManager fails to start again; this is a more common error

Caused by: java.net.ConnectException: Call From localhost.localdomain/192.168.1.109 to localhost.localdomain:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
SHUTDOWN_MSG: Shutting down NodeManager at localhost.localdomain/192.168.1.109
************************************************************/

Check the hosts file:

192.168.1.109 localhost localhost.localdomain
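
To see at a glance which address each name maps to, the hosts entry can be parsed with a small helper. This is a rough sketch: `parse_hosts` is a hypothetical name, and it assumes that the first entry for a name takes precedence.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to the first IP address listed for it."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # assumption: the first entry for a name wins
    return mapping

hosts = parse_hosts("192.168.1.109 localhost localhost.localdomain\n")
print(hosts["localhost.localdomain"])
```

On a real machine the text would come from reading /etc/hosts; with the single line shown above, both localhost and localhost.localdomain map to 192.168.1.109.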

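The YARN monitoring page at http://192.168.1.109:8088/ cannot be reached.

The system does have an RM (ResourceManager) process running.

The RM log contains no startup entries. Every time the debug parameters are added to the RM process it stops writing logs, so apparently the parameters were not set correctly.


After adjusting the parameters and starting again, and once Eclipse has attached to the debug port, jps no longer reports the cannot sync .. error.

But the NodeManager still did not come up. The log shows the same error as above, once again on port 8031: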

<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>127.0.0.1:8031</value>
  <description>
    host is the hostname of the resource manager and port is the port on which the NodeManagers contact the Resource Manager.
  </description>
</property>
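
As a quick way to confirm which address is actually configured, the property can be read programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper names and the `SAMPLE` text are illustrative, and the sample assumes the property sits inside the usual `<configuration>` root of yarn-site.xml.

```python
import xml.etree.ElementTree as ET

def get_property(xml_text, name):
    """Return the <value> of the named <property> in Hadoop-style config XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None

def split_host_port(address):
    """Split "host:port" into (host, port) with the port as an integer."""
    host, _, port = address.rpartition(":")
    return host, int(port)

# Illustrative sample; a real check would read the text of yarn-site.xml instead.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>127.0.0.1:8031</value>
  </property>
</configuration>"""

addr = get_property(SAMPLE, "yarn.resourcemanager.resource-tracker.address")
print(split_host_port(addr))
```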

The YARN monitoring page is reachable now, and port 8031 is in fact being listened on:

[root@localhost yuming]# netstat -tln | grep 8031
tcp6       0      0 192.168.1.109:8031      :::*                    LISTEN     
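
The netstat output shows the listener bound to 192.168.1.109, while the configuration and the error message refer to 127.0.0.1 and localhost.localdomain. It may therefore be worth probing each address separately. This is a hedged sketch rather than a diagnosis: `can_connect` and `probe_all` are hypothetical helpers that only test whether a TCP connection can be opened.

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or the name failed to resolve
        return False

def probe_all(hosts, port, timeout=2.0):
    """Probe the same port on several hosts and return {host: reachable}."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

For example, `probe_all(["127.0.0.1", "192.168.1.109", "localhost.localdomain"], 8031)` reports which of the three addresses accepts a connection on the resource-tracker port.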


3. After adding debug parameters to the RM, the NM fails to start yet again:

RM log:

2013-07-29 09:40:48,750 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8031

NM log:

2013-07-29 09:36:17,783 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.YarnException: Failed to Start org.apache.hadoop.yarn.server.nodemanager.NodeManager
Caused by: java.net.ConnectException: Call From localhost.localdomain/192.168.0.137 to localhost.localdomain:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

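The timestamps show that when the NM tried to connect to port 8031, the port was not up yet: the NM failed at 09:36:17, while the RM only started its socket reader for port 8031 at 09:40:48, about four and a half minutes later, because the RM was waiting for the debugger to attach.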
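
The gap between the two log lines quoted above can be computed directly from their timestamps. This is a small sketch that assumes the default Hadoop log timestamp layout seen in those lines; `log_time` is a hypothetical helper.

```python
from datetime import datetime

LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # e.g. "2013-07-29 09:40:48,750"

def log_time(line):
    """Parse the leading timestamp (first 23 characters) of a Hadoop log line."""
    return datetime.strptime(line[:23], LOG_TS_FORMAT)

rm_line = "2013-07-29 09:40:48,750 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8031"
nm_line = "2013-07-29 09:36:17,783 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager"

# Positive gap: the RM opened port 8031 after the NM had already given up.
gap = log_time(rm_line) - log_time(nm_line)
print(gap.total_seconds())
```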

Starting the NM again on its own then works.
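
Instead of restarting the NM by hand, one option is to wait until the RM port accepts connections before launching it. The sketch below is only one possible approach: `wait_for_port` is a hypothetical helper, and the timeout and interval values are arbitrary defaults.

```python
import socket
import time

def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll host:port until a TCP connect succeeds; return False if the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)  # wait a little before trying again
```

A wrapper script could call `wait_for_port("localhost.localdomain", 8031, timeout=600)` and start the NM only when it returns True. The start command itself is left out here because the notes do not say how the NM is launched.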


