MapReduce wordcount test stuck at "Running job"

After getting my Hadoop environment set up, I tried to verify it with the wordcount example that ships with MapReduce, but every run got stuck at "Running job":

2018-03-28 08:46:41,855 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.85.3:8032
2018-03-28 08:46:42,341 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hukun/.staging/job_1521944931433_0004
2018-03-28 08:46:42,600 INFO input.FileInputFormat: Total input files to process : 1
2018-03-28 08:46:43,103 INFO mapreduce.JobSubmitter: number of splits:1
2018-03-28 08:46:43,142 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2018-03-28 08:46:43,238 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1521944931433_0004
2018-03-28 08:46:43,239 INFO mapreduce.JobSubmitter: Executing with tokens: []
2018-03-28 08:46:43,414 INFO conf.Configuration: resource-types.xml not found
2018-03-28 08:46:43,415 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2018-03-28 08:46:43,480 INFO impl.YarnClientImpl: Submitted application application_1521944931433_0004
2018-03-28 08:46:43,521 INFO mapreduce.Job: The url to track the job: http://master:8099/proxy/application_1521944931433_0004/
2018-03-28 08:46:43,522 INFO mapreduce.Job: Running job: job_1521944931433_0004

My Hadoop setup is described in my earlier post hadoop 3.0 集群配置(ubuntu环境) (Hadoop 3.0 cluster configuration on Ubuntu).
My wordcount test steps were as follows.
Take any English .txt file and push it to HDFS; here I simply used Hadoop's LICENSE file:

hdfs dfs -mkdir hdfs://master:9000/wordcount
hdfs dfs -put ~/LICENSE.txt hdfs://master:9000/wordcount/LICENSE.txt
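
A quick way to confirm the upload worked is to list the directory:

hdfs dfs -ls hdfs://master:9000/wordcount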

cd into Hadoop's mapreduce directory, which contains the example programs that ship with Hadoop:

cd ~/soft/hadoop-3.0.0/share/hadoop/mapreduce/

Run the wordcount example:

hadoop jar hadoop-mapreduce-examples-3.0.0.jar wordcount hdfs://master:9000/wordcount/LICENSE.txt hdfs://master:9000/wordcount/result

The results will be written under hdfs://master:9000/wordcount/result.
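For reference, once a run completes successfully, the word counts can be read back like this (part-r-00000 is the standard output file name when the job runs with the default single reducer):

hdfs dfs -cat hdfs://master:9000/wordcount/result/part-r-00000
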
This time, however, execution hung at INFO mapreduce.Job: Running job: job_1521902469523_0005.

Hadoop's default log level is INFO, which hides a lot of detail, so this time I raised the log level to DEBUG:

export HADOOP_ROOT_LOGGER=DEBUG,console
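
Note that this variable only affects client processes started from the current shell, not the daemons that are already running; it can also be set for a single command, e.g.:

HADOOP_ROOT_LOGGER=DEBUG,console hadoop jar hadoop-mapreduce-examples-3.0.0.jar wordcount hdfs://master:9000/wordcount/LICENSE.txt hdfs://master:9000/wordcount/result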

Then I ran the wordcount program again. All I could see was the client making IPC call after IPC call to master; the log gave away nothing else:

2018-03-24 08:03:28,044 DEBUG ipc.Client: The ping interval is 60000 ms.
2018-03-24 08:03:28,047 DEBUG ipc.Client: Connecting to master/192.168.85.3:9000
2018-03-24 08:03:28,097 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun: starting, having connections 1
2018-03-24 08:03:28,104 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo
2018-03-24 08:03:28,240 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #0
2018-03-24 08:03:28,241 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 237ms
2018-03-24 08:03:28,287 DEBUG mapred.ResourceMgrDelegate: getStagingAreaDir: dir=/tmp/hadoop-yarn/staging/hukun/.staging
2018-03-24 08:03:28,288 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #1 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo
2018-03-24 08:03:28,301 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #1
2018-03-24 08:03:28,301 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 13ms
2018-03-24 08:03:28,357 DEBUG ipc.Client: The ping interval is 60000 ms.
2018-03-24 08:03:28,364 DEBUG ipc.Client: Connecting to master/192.168.85.3:8032
2018-03-24 08:03:28,367 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:8032 from hukun sending #2 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getNewApplication
2018-03-24 08:03:28,378 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:8032 from hukun: starting, having connections 2
2018-03-24 08:03:28,412 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:8032 from hukun got value #2
2018-03-24 08:03:28,413 DEBUG ipc.ProtobufRpcEngine: Call: getNewApplication took 58ms
2018-03-24 08:03:28,452 DEBUG mapreduce.JobSubmitter: Configuring job job_1521903779426_0001 with /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001 as the submit dir
2018-03-24 08:03:28,452 DEBUG mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:[hdfs://master:9000]
2018-03-24 08:03:28,830 DEBUG mapreduce.JobResourceUploader: default FileSystem: hdfs://master:9000
2018-03-24 08:03:28,833 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #3 org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo
2018-03-24 08:03:28,835 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #3
2018-03-24 08:03:28,835 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 2ms
2018-03-24 08:03:28,838 DEBUG hdfs.DFSClient: /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001: masked={ masked: rwxr-xr-x, unmasked: rwxrwxrwx }
2018-03-24 08:03:28,852 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun sending #4 org.apache.hadoop.hdfs.protocol.ClientProtocol.mkdirs
2018-03-24 08:03:28,884 DEBUG ipc.Client: IPC Client (658532887) connection to master/192.168.85.3:9000 from hukun got value #4

Restart Hadoop:

stop-all.sh
start-all.sh

Running wordcount again after the restart, an exception finally showed up in the log:

2018-03-24 08:03:28,910 DEBUG retry.RetryInvocationHandler: Exception while invoking call #4 ClientNamenodeProtocolTranslatorPB.mkdirs over null. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001. Name node is in safe mode.
The reported blocks 48 has reached the threshold 0.9990 of total blocks 48. The number of live datanodes 2 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 11 seconds. NamenodeHostName:master
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1436)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1423)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3029)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1115)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:695)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

The message says Cannot create directory /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001. Name node is in safe mode.

# force the NameNode out of safe mode
hdfs dfsadmin -safemode leave  
stop-all.sh
start-all.sh
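
Per the log above, safe mode would actually have switched off by itself within 11 seconds; forcing it off just skips the wait. The current state can be checked with:

hdfs dfsadmin -safemode get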

Running the wordcount program once more, it was still stuck at the same IPC communication stage.

I later learned that the Cannot delete /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001. Name node is in safe mode. error appears when a MapReduce job has been force-killed with CTRL+C: information about the job is left behind under /tmp, and those leftovers cannot be deleted while the NameNode is in safe mode.
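
So, once the NameNode has left safe mode, the stale staging data can be removed by hand. A sketch, using the path from the log above (substitute your own user name and job id):

hdfs dfs -rm -r /tmp/hadoop-yarn/staging/hukun/.staging/job_1521903779426_0001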

So the hang had some other cause. I tried every fix I could find online, with no luck. Finally I raised the virtual machine's memory to 4 GB and gave it 2 processors with 2 cores each, and the next run went through fine. It turned out the VM simply had too little memory.
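
This fits the usual explanation for a job that sits forever at Running job: if the NodeManagers advertise less memory than a container requires, YARN never gets to schedule the ApplicationMaster, so the client just waits. Besides enlarging the VM, the memory YARN advertises can also be set explicitly in yarn-site.xml; a rough sketch with illustrative values (not taken from my cluster):

<!-- yarn-site.xml: illustrative values only -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>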

References

1. Enabling and disabling Hadoop debug output (Hadoop开启关闭调试信息)
2. How to fix the Hadoop "Name node is in safe mode" error (Hadoop "Name node is in safe mode" 错误解决方法)
3. Resolving the "Name node is in safe mode" error (错误Name node is in safe mode的解决方法)
