1. Retrying connect to server: master/192.168.1.200:9000
2015-08-22 21:44:19,478 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.1.200:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-08-22 21:44:20,479 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.1.200:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Causes:
- (1) Incorrect NameNode address in core-site.xml. The value of fs.default.name (deprecated since Hadoop 2.x in favor of fs.defaultFS) must match the host and port the NameNode actually listens on:
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
- (2) Hostname mismatch between nodes. Check /etc/hosts and the network configuration:
>vim /etc/hosts
hosts:
127.0.0.1 localhost
192.168.11.3 node1
192.168.11.4 node2
192.168.11.5 node3
>vim /etc/sysconfig/network
network:
NETWORKING=yes
HOSTNAME=node1
- (3) The firewall was not turned off, so the worker nodes cannot reach the master.
Start: # systemctl start firewalld
Check status: # systemctl status firewalld, or firewall-cmd --state
Stop: # systemctl stop firewalld
Disable at boot: # systemctl disable firewalld
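A quick way to tell whether cause (1) or (3) applies is to test the NameNode port directly from a worker node. A minimal sketch, assuming the NameNode host is master and the port is 9000 as in the log above (port_open is a hypothetical helper built on bash's /dev/tcp):

```shell
# Hypothetical helper: succeeds only if a TCP connection to host:port works.
# Uses bash's built-in /dev/tcp pseudo-device; gives up after 2 seconds.
port_open() {
  timeout 2 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null
}

# Host "master" and port 9000 are assumptions taken from the log above.
if port_open master 9000; then
  echo "NameNode port is reachable"
else
  echo "cannot reach master:9000 - check core-site.xml and firewalld"
fi
```

If the port is closed from a worker but open on the master itself, the firewall is the likely culprit; if it is closed everywhere, check the core-site.xml address and whether the NameNode is running.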
2. org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:9000/user/sunpeng/test.txt
When testing the cluster, the input path could not be found. Cause:
the input was never uploaded to HDFS. Upload it with:
hadoop fs -put conf input
In this case: hadoop fs -put conf /user/sunpeng/test.txt
After fixing the path problem, run:
[sunpeng@master bin]$ ./hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount file:///opt/hadoop/tmp/README.txt /opt/hadoop/tmp/out
A new error appears:
3. Call From master/192.168.0.49 to master:46587 failed on connection exception: java.net.ConnectException: Connection refused
16/10/09 10:13:34 INFO mapreduce.Job: Running job: job_1475915355491_0005
16/10/09 10:19:36 INFO mapreduce.Job: Job job_1475915355491_0005 running in uber mode : false
16/10/09 10:19:36 INFO mapreduce.Job: map 0% reduce 0%
16/10/09 10:19:36 INFO mapreduce.Job: Job job_1475915355491_0005 failed with state FAILED due to: Application application_1475915355491_0005 failed 2 times due to Error launching appattempt_1475915355491_0005_000002. Got exception: java.net.ConnectException: Call From master/192.168.0.49 to master:46587 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor40.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
Solution: this setup uses one master and one worker (node1), and:
the hostname (/etc/hostname) on the master node was master
the hostname (/etc/hostname) on the node1 node was also master
Because the node1 VM was cloned from the master VM, its hostname was never changed. After setting the hostname on node1 to node1 and restarting, a new error appears:
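The rename and a sanity check can be sketched as follows. This is a hedged sketch: "node1" is the expected name from this particular setup, and check_hostname is a hypothetical helper, not part of Hadoop:

```shell
# On a cloned VM, rename the host first (systemd distros; run as root):
#   hostnamectl set-hostname node1      # also updates /etc/hostname
# Then verify the running hostname matches what the cluster expects.
check_hostname() {  # usage: check_hostname <expected> <actual>
  if [ "$2" = "$1" ]; then
    echo "hostname ok: $2"
  else
    echo "hostname mismatch: got $2, expected $1"
  fi
}

check_hostname node1 "$(hostname)"
```

Running this on every cloned VM right after cloning would have caught the duplicate hostname before Hadoop ever started.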
4. java.net.UnknownHostException: node1: Name or service not known
java.net.UnknownHostException: node1 : node1: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.getHostname(MetricsSystemImpl.java:463)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configureSystem(MetricsSystemImpl.java:394)
at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:390)
Solution: on the node1 node, add a line to its hosts file (/etc/hosts): 192.168.0.53 node1 (where 192.168.0.53 is that host's own IP).
The files on the master and node1 nodes are now:
- master node
hostname:
master
hosts:
192.168.0.49 master
192.168.0.53 node1
- node1:
hostname:
node1
hosts:
192.168.0.49 master
192.168.0.53 node1
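The hostname/hosts pairing above can be sanity-checked with a small script run on each node. A minimal sketch, assuming the layout shown (check_hosts is a hypothetical helper, not a Hadoop tool):

```shell
# Hypothetical helper: verify that a hostname appears in a hosts file,
# so that lookups like InetAddress.getLocalHost() can resolve it.
check_hosts() {  # usage: check_hosts <hostname> <hosts-file>
  if grep -qw "$1" "$2"; then
    echo "ok: $1 found in $2"
  else
    echo "missing: add '<ip> $1' to $2"
  fi
}

# On each node, check its own name against /etc/hosts:
check_hosts "$(hostname)" /etc/hosts
```

If every node passes this check for its own name and for the names of all other nodes, both the UnknownHostException (error 4) and the clone-related Connection refused (error 3) should be resolved.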