MapReduce wordcount test stuck at "Running job"

After getting my Hadoop environment set up, I tried to verify it with the wordcount example that ships with MapReduce, but every run got stuck at "Running job":

2018-03-28 08:46:41,855 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.85.3:8032
2018-03-28 08:46:42,341 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hukun/.staging/job_1521944931433_0004
2018-03-28 08:46:42,600 INFO input.FileInputFormat: Total input files to process : 1
2018-03-28 08:46:43,103 INFO mapreduce.JobSubmitter: number of splits:1
2018-03-28 08:46:43,142 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-03-28 08:46:43,238 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1521944931433_0004
2018-03-28 08:46:43,239 INFO mapreduce.JobSubmitter: Executing with tokens: []
2018-03-28 08:46:43,414 INFO conf.Configuration: resource-types.xml not found
2018-03-28 08:46:43,415 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2018-03-28 08:46:43,480 INFO impl.YarnClientImpl: Submitted application application_1521944931433_0004
2018-03-28 08:46:43,521 INFO mapreduce.Job: The url to track the job: http://master:8099/proxy/application_1521944931433_0004/
2018-03-28 08:46:43,522 INFO mapreduce.Job: Running job: job_1521944931433_0004

My Hadoop setup is described in my earlier post hadoop 3.0 集群配置(ubuntu环境) (Hadoop 3.0 cluster configuration on Ubuntu).
My wordcount test steps were as follows.
Take any English .txt file and push it to HDFS; here I simply used Hadoop's LICENSE file:

hdfs dfs -mkdir hdfs://master:9000/wordcount
hdfs dfs -put ~/LICENSE.txt hdfs://master:9000/wordcount/LICENSE.txt
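
A quick way to confirm the upload worked is to list the directory:

hdfs dfs -ls hdfs://master:9000/wordcount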

cd into Hadoop's mapreduce directory, which contains the example programs that ship with Hadoop:

cd ~/soft/hadoop-3.0.0/share/hadoop/mapreduce/

Run the wordcount example:

hadoop jar hadoop-mapreduce-examples-3.0.0.jar wordcount hdfs://master:9000/wordcount/LICENSE.txt hdfs://master:9000/wordcount/result

The results will be written under hdfs://master:9000/wordcount/result.
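For reference, once a run completes successfully, the word counts can be read back like this (part-r-00000 is the standard output file name when the job runs with the default single reducer):

hdfs dfs -cat hdfs://master:9000/wordcount/result/part-r-00000
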
This time, however, execution hung at INFO mapreduce.Job: Running job: job_1521902469523_0005.

Hadoop's default log level is INFO, which hides a lot of detail, so this time I raised the log level to DEBUG:

export HADOOP_ROOT_LOGGER=DEBUG,console
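
Note that this variable only affects client processes started from the current shell, not the daemons that are already running; it can also be set for a single command, e.g.:

HADOOP_ROOT_LOGGER=DEBUG,console hadoop jar hadoop-mapreduce-examples-3.0.0.jar wordcount hdfs://master:9000/wordcount/LICENSE.txt hdfs://master:9000/wordcount/result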

Then I ran the wordcount program again. All I could see was the client making IPC call after IPC call to master; the log gave away nothing else:

2018-03-24 08:03:28,044 DEBUG ipc.Client: The ping interval is 60000 ms.
2018-03-24 08:03:28,047 DEBUG ipc.Client: Connecting to master/192.168.85.3:9000
2018-03-24 08:03:28,097 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun: starting, having connections 1
2018-03-24 08:03:28,104 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo
2018-03-24 08:03:28,240 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #0
2018-03-24 08:03:28,241 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 237ms
2018-03-24 08:03:28,287 DEBUG mapred.ResourceMgrDelegate: getStagingAreaDir: dir=/tmp/hadoop-yarn/staging/hukun/.staging
2018-03-24 08:03:28,288 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #1 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo
2018-03-24 08:03:28,301 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #1
2018-03-24 08:03:28,301 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 13ms
2018-03-24 08:03:28,357 DEBUG ipc.Client: The ping interval is 60000 ms.
2018-03-24 08:03:28,364 DEBUG ipc.Client: Connecting to master/192.168.85.3:8032
2018-03-24 08:03:28,367 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:8032 from hukun sending #2 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getNewApplication
2018-03-24 08:03:28,378 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:8032 from hukun: starting, having connections 2
2018-03-24 08:03:28,412 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:8032 from hukun got value #2
2018-03-24 08:03:28,413 DEBUG ipc.ProtobufRpcEngine: Call: getNewApplication took 58ms
2018-03-24 08:03:28,452 DEBUG mapreduce.JobSubmitter: Configuring job job_1521903779426_0001 with /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001 as the submit dir
2018-03-24 08:03:28,452 DEBUG mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:[hdfs://master:9000]
2018-03-24 08:03:28,830 DEBUG mapreduce.JobResourceUploader: default FileSystem: hdfs://master:9000
2018-03-24 08:03:28,833 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #3 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo
2018-03-24 08:03:28,835 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #3
2018-03-24 08:03:28,835 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 2ms
2018-03-24 08:03:28,838 DEBUG hdfs.DFSClient: /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001: masked={ masked: rwxr-xr-x, unmasked: rwxrwxrwx }
2018-03-24 08:03:28,852 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #4 org.apache.hadoop.hdfs.protocol.ClientProtocol.mkdirs
2018-03-24 08:03:28,884 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #4

Restart Hadoop:

stop-all.sh
start-all.sh

Running wordcount again after the restart, an exception finally showed up in the log:

2018-03-24 08:03:28,910 DEBUG retry.RetryInvocationHandler: Exception while invoking call #4 ClientNamenodeProtocolTranslatorPB.mkdirs over null. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001. Name node is in safe mode.
The reported blocks 48 has reached the threshold 0.9990 of total blocks 48. The number of live datanodes 2 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 11 seconds. NamenodeHostName:master
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1436)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1423)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3029)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1115)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:695)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

The message says Cannot create directory /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001. Name node is in safe mode.

# force the NameNode out of safe mode
hdfs dfsadmin -safemode leave  
stop-all.sh
start-all.sh
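
Per the log above, safe mode would actually have switched off by itself within 11 seconds; forcing it off just skips the wait. The current state can be checked with:

hdfs dfsadmin -safemode get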

Running the wordcount program once more, it was still stuck at the same IPC communication stage.

I later learned that the Cannot delete /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001. Name node is in safe mode. error appears when a MapReduce job has been force-killed with CTRL+C: information about the job is left behind under /tmp, and those leftovers cannot be deleted while the NameNode is in safe mode.
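
So, once the NameNode has left safe mode, the stale staging data can be removed by hand. A sketch, using the path from the log above (substitute your own user name and job id):

hdfs dfs -rm -r /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001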

So the hang had some other cause. I tried every fix I could find online, with no luck. Finally I raised the virtual machine's memory to 4 GB and gave it 2 processors with 2 cores each, and the next run went through fine. It turned out the VM simply had too little memory.
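
This fits the usual explanation for a job that sits forever at Running job: if the NodeManagers advertise less memory than a container requires, YARN never gets to schedule the ApplicationMaster, so the client just waits. Besides enlarging the VM, the memory YARN advertises can also be set explicitly in yarn-site.xml; a rough sketch with illustrative values (not taken from my cluster):

<!-- yarn-site.xml: illustrative values only -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>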

References

1. Enabling and disabling Hadoop debug output (Hadoop开启关闭调试信息)
2. How to fix the Hadoop "Name node is in safe mode" error (Hadoop "Name node is in safe mode" 错误解决方法)
3. Resolving the "Name node is in safe mode" error (错误Name node is in safe mode的解决方法)
