Error details:
2018-01-09 17:47:22,892 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-09 17:47:23,893 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-09 17:47:24,895 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-09 17:47:25,896 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-09 17:47:26,898 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
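Cause:
The yarn-site.xml configuration of the Hadoop cluster is wrong:
By default, the addresses of the YARN ResourceManager services point to 0.0.0.0.
On a server, 0.0.0.0 refers to the local machine, so the NodeManager looks for the ResourceManager services on its own host. The slave nodes do not run these services; they run on the ResourceManager master node. For a Hadoop cluster, therefore, some settings in yarn-site.xml cannot be left at their defaults.
Note: a pseudo-distributed Hadoop setup can use the default configuration, because all services run on the local machine.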
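Solution:
Edit the yarn-site.xml configuration file on every node of the Hadoop cluster and set the addresses of the ResourceManager master node in it. The properties go inside the <configuration> element, and the YARN daemons need to be restarted for the change to take effect. The detailed configuration is as follows: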
<property>
  <name>yarn.resourcemanager.address</name>
  <value>hadoopMaster:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>hadoopMaster:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>hadoopMaster:8031</value>
</property>
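A shorter alternative may also work. This is a sketch, assuming a Hadoop 2.x release in which the three address properties above default to ${yarn.resourcemanager.hostname} followed by their standard ports (8032, 8030, 8031). In that case, setting only the hostname should be enough, since the individual addresses are derived from it. hadoopMaster is the same master hostname used in the configuration above; check the yarn-default.xml of your own version before relying on this.

```xml
<!-- Assumption: the *.address properties are not set elsewhere in yarn-site.xml,
     so they fall back to ${yarn.resourcemanager.hostname}:<default port>. -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoopMaster</value>
</property>
```

If any of the three address properties is set explicitly, that explicit value takes precedence over the one derived from the hostname.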