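Problem: on a fully distributed Hadoop cluster, the built-in wordcount example hangs at INFO mapreduce.Job: Running job: job_1492509956955_0001 and makes no progress for a long time.
The console output at the point where it hangs: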
17/04/18 18:00:30 INFO client.RMProxy: Connecting to ResourceManager at heres04/192.168.2.113:8032
17/04/18 18:00:32 INFO input.FileInputFormat: Total input paths to process : 1
17/04/18 18:00:32 INFO mapreduce.JobSubmitter: number of splits:1
17/04/18 18:00:32 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
17/04/18 18:00:32 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
17/04/18 18:00:32 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
17/04/18 18:00:32 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
17/04/18 18:00:32 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
17/04/18 18:00:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1492509956955_0001
17/04/18 18:00:33 INFO impl.YarnClientImpl: Submitted application application_1492509956955_0001 to ResourceManager at heres04/192.168.2.113:8032
17/04/18 18:00:33 INFO mapreduce.Job: The url to track the job: http://heres04:8088/proxy/application_1492509956955_0001/
17/04/18 18:00:33 INFO mapreduce.Job: Running job: job_1492509956955_0001
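The solutions found online fall broadly into two kinds:
Solution 1 (rarely the actual cause):
A problem in the hosts file: the IP-to-hostname mappings are incomplete, or contain redundant entries.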
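Both hosts-file failure modes (a missing mapping and a redundant one) can be checked mechanically. The following is only a minimal sketch: the sample hosts text, the second host name heres05, and the 127.0.1.1 line are hypothetical illustrations, not taken from the cluster in this post.

```python
from collections import defaultdict

def check_hosts(hosts_text, expected_hosts):
    """Return (missing, conflicting) for the text of a hosts file.

    missing:     expected host names that have no mapping at all
    conflicting: host names mapped to more than one IP address
    """
    ips_by_name = defaultdict(set)
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            ips_by_name[name].add(ip)
    missing = sorted(h for h in expected_hosts if h not in ips_by_name)
    conflicting = {n: sorted(ips) for n, ips in ips_by_name.items() if len(ips) > 1}
    return missing, conflicting

# Hypothetical hosts content: heres04 is mapped twice, heres05 not at all.
sample = """
127.0.0.1      localhost
192.168.2.113  heres04
127.0.1.1      heres04   # redundant entry
"""

missing, conflicting = check_hosts(sample, ["heres04", "heres05"])
print(missing)      # ['heres05']
print(conflicting)  # {'heres04': ['127.0.1.1', '192.168.2.113']}
```

On a real node, the text would come from reading /etc/hosts, and the expected list would be the host names of every machine in the cluster.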
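Solution 2 (the cause in most cases):
Resource allocation: the YARN memory and virtual-memory settings are the problem, so yarn-site.xml has to be modified.
The original yarn-site.xml configuration: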
<property>
<name>yarn.resourcemanager.hostname</name>
<value>heres04</value>
</property>
<!-- Have the NodeManager load the MapReduce shuffle server as an auxiliary service at startup -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>1024</value>
</property>
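To confirm which values a node is actually configured with, the properties can be read out of the file programmatically. This is a minimal sketch that assumes the properties sit inside a <configuration> root element, as in a complete Hadoop configuration file (the excerpt in this post omits that wrapper); the function name read_properties is made up for illustration.

```python
import xml.etree.ElementTree as ET

def read_properties(xml_text):
    """Map each <property>'s <name> to its <value> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name").strip(): prop.findtext("value").strip()
        for prop in root.iter("property")
    }

# The original settings from this post, wrapped in a <configuration> root.
original = """<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>heres04</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1024</value>
  </property>
</configuration>"""

props = read_properties(original)
print(props["yarn.nodemanager.resource.memory-mb"])  # 1024
```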
After the change, with some additional properties:
<!-- ResourceManager address -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>heres04</value>
</property>
<!-- Have the NodeManager load the MapReduce shuffle server as an auxiliary service at startup -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
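One plausible reading of why the larger memory values help can be sketched as arithmetic. The sketch rests on assumptions that are not stated in this post: that the MapReduce ApplicationMaster requests 1536 MB (the Hadoop 2.x default for yarn.app.mapreduce.am.resource.mb), that the unset yarn.scheduler.minimum-allocation-mb defaulted to 1024, and that the scheduler rounds each request up to a multiple of the minimum allocation. Real schedulers differ in detail, so treat this as a simplified model rather than the scheduler's actual logic.

```python
import math

def normalized_request(request_mb, min_alloc_mb):
    """Round a container request up to a multiple of the minimum allocation."""
    return max(1, math.ceil(request_mb / min_alloc_mb)) * min_alloc_mb

def am_fits(node_mb, min_alloc_mb, am_request_mb=1536):
    """True if the rounded ApplicationMaster container fits on one node.

    am_request_mb=1536 is an assumed default, not a value from this post.
    """
    return normalized_request(am_request_mb, min_alloc_mb) <= node_mb

def vmem_limit_mb(container_mb, vmem_pmem_ratio):
    """Virtual-memory ceiling implied by yarn.nodemanager.vmem-pmem-ratio."""
    return container_mb * vmem_pmem_ratio

# Original config: 1024 MB per node, assumed default minimum allocation of 1024 MB.
print(am_fits(node_mb=1024, min_alloc_mb=1024))  # False: 1536 rounds up to 2048
# Modified config: 2048 MB per node, minimum allocation 2048 MB.
print(am_fits(node_mb=2048, min_alloc_mb=2048))  # True: 1536 rounds up to 2048
print(vmem_limit_mb(2048, 2.1))
```

Under these assumptions the ApplicationMaster container never fits on a 1024 MB node, so the job stays at "Running job" without starting. With the modified values a single 2048 MB container fills a whole node, so the map and reduce tasks have to be placed on other NodeManagers.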
For the meaning of these properties, see this blog post: http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-configurations-resourcemanager-nodemanager/
Testing shows that the following two properties can be removed:
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
The job still runs and produces its result without these two properties.