java.net.ConnectException: to 0.0.0.0:10020 failed while running Hadoop jobs
When running Hive queries, the following error has recently been occurring frequently:
java.io.IOException: java.net.ConnectException: Call From hadoop-001/192.168.1.101 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:334)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:419)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:532)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:578)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:578)
at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:596)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:742)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
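The error message shows that the connection to port 10020 failed. This means the node issuing the call (hadoop-001 in the trace) needs to reach the MapReduce JobHistory Server, and the address it is using is still the default value: 0.0.0.0:10020.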
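A "connection refused" error means that nothing accepted the TCP connection at the target address. As a quick way to confirm this from the failing node, here is a minimal Python sketch that tests whether a host and port accept connections. The function name `port_is_open` is hypothetical and not part of Hadoop.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For example, `port_is_open("namenode", 10020)` (with "namenode" standing in for the real JobHistory Server host) would be expected to return False while the service is down or unreachable.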
Modify the configuration file
Locate the {HADOOP_HOME}/etc/hadoop/mapred-site.xml configuration file and add the following configuration:
<property>
<name>mapreduce.jobhistory.address</name>
<!-- Set the actual hostname and port -->
<value>{namenode}:10020</value>
</property>
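To check that the property actually made it into the file, a small Python sketch such as the one below can read a mapred-site.xml and report whether mapreduce.jobhistory.address is set to something other than the 0.0.0.0 default. The helper names `read_property` and `jobhistory_address_is_usable` are hypothetical, and the sketch inspects only the one file it is given, not the merged configuration that Hadoop builds at runtime.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_property(xml_text: str, name: str) -> Optional[str]:
    """Return the <value> of the named <property> in a *-site.xml, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def jobhistory_address_is_usable(xml_text: str) -> bool:
    """True if mapreduce.jobhistory.address is set and does not point at 0.0.0.0."""
    address = read_property(xml_text, "mapreduce.jobhistory.address")
    return address is not None and not address.startswith("0.0.0.0:")
```

The file contents can be passed in with something like `open(path, encoding="utf-8").read()`, where `path` points at the mapred-site.xml edited above.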
Start the JobHistory service
Run the following command on the namenode:
{HADOOP_HOME}/sbin/mr-jobhistory-daemon.sh start historyserver
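After that, the status of each job can be checked in the historyserver log.
Troubleshooting
Shortly after the service started, it reported the following errors: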
2016-07-29 14:36:44,418 ERROR org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Error while trying to scan the directory hdfs://namenode:9000/tmp/hadoop-yarn/staging/history/done_intermediate/hadoop
java.io.IOException: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space
......
Caused by: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:242)
......
2016-07-29 14:40:07,068 ERROR org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Error while trying to scan the directory hdfs://namenode:9000/tmp/hadoop-yarn/staging/history/done_intermediate/hadoop
java.io.IOException: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: GC overhead limit exceeded
I then edited the configuration file to increase the heap size (the value is in MB):
{HADOOP_HOME}/etc/hadoop/mapred-env.sh
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=2000
After restarting the JobHistory service, the problem was resolved.
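To confirm after the restart that the heap errors are gone, the historyserver log can be scanned for OutOfMemoryError entries. The sketch below is one minimal way to do that in Python. The function name `find_oom_lines` is hypothetical, and the location of the log file depends on the installation, so it is left to the caller.

```python
from typing import Iterable, List

# Marker that appears in both heap-related errors shown above.
OOM_MARKER = "java.lang.OutOfMemoryError"


def find_oom_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention java.lang.OutOfMemoryError."""
    return [line.rstrip("\n") for line in lines if OOM_MARKER in line]
```

Passing an open log file object (for example inside `with open(log_path) as f:`) works because file objects iterate line by line. An empty result for the period after the restart suggests the new heap size is sufficient.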