I set up a two-node cluster, one machine as the master and one as the slave. When the master launched a wordcount job, the following error occurred:
java.io.IOException: Could not obtain block: blk_3096925682732876688_1028 file=/home/hadoop/tmp/mapred/system/job_201308300007_0001/jobToken
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2460)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2252)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2415)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:68)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:100)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:230)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:163)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1248)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1229)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4502)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1285)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1226)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2603)
at java.lang.Thread.run(Thread.java:722)
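Before digging into the logs, it can help to confirm whether the block named in the error is actually readable from HDFS. A check along these lines (the file path is copied from the IOException above; these are Hadoop 1.x-era commands, matching the TaskTracker/JobTracker classes in the stack trace) reports missing or under-replicated blocks and which DataNodes hold the replicas:

```shell
# Check block health and replica locations of the jobToken file
# (path copied from the IOException above)
hadoop fsck /home/hadoop/tmp/mapred/system/job_201308300007_0001/jobToken \
    -files -blocks -locations

# List which DataNodes the NameNode currently considers live
hadoop dfsadmin -report
```

If `fsck` reports the block as existing but the reading node still cannot obtain it, the problem is usually connectivity between that node and the DataNode holding the replica, not the block itself.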
I then analyzed the JobTracker log on the master. The job had been split into two map tasks and one reduce task: the first map task was assigned to the TaskTracker on the master, and the second to the TaskTracker on the slave, which kept failing with:
2013-08-30 00:11:56,412 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201308300007_0001_m_000001_0: Error initializing attempt_201308300007_0001_m_000001_0:
java.io.IOException: Could not obtain block: blk_3096925682732876688_1028 file=/home/hadoop/tmp/mapred/system/job_201308300007_0001/jobToken
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2460)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2252)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2415)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:68)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:100)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:230)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:163)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1248)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1229)
at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4502)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1285)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1226)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2603)
at java.lang.Thread.run(Thread.java:722)
In other words, the slave failed to fetch the job's localized files (the jobToken). Next I examined the DataNode log:
2013-08-30 00:11:46,860 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.108.114.188:50010, storageID=DS-1980704826-10.108.114.188-50010-1377781247100, infoPort=50075, ipcPort=50020):Failed to transfer blk_-761190158981708426_1023 to 10.210.37.175:50010 got java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1500)
at java.lang.Thread.run(Thread.java:722)
So communication between the master and the slave was broken. I checked for a long time without finding the cause, until it suddenly occurred to me that the Linux firewall, iptables, might be blocking the traffic. After shutting down iptables, the problem was solved.
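For reference, the fix can be sketched roughly as follows. These commands assume a RHEL/CentOS-style system where iptables is managed via `service`; run them as root on both nodes. Disabling the firewall entirely is only appropriate on a trusted network, so an alternative that opens just the DataNode ports seen in the log is shown as well:

```shell
# First verify connectivity to the slave's DataNode port
# (10.210.37.175:50010 is the address from the NoRouteToHostException above)
telnet 10.210.37.175 50010

# Option 1: stop iptables entirely (trusted network only)
service iptables stop
chkconfig iptables off   # keep it disabled across reboots

# Option 2: keep iptables but open the Hadoop DataNode ports
# (50010 = data transfer, 50020 = ipc, 50075 = http, from the log above;
#  the NameNode/JobTracker ports in your config need opening as well)
iptables -I INPUT -p tcp --dport 50010 -j ACCEPT
iptables -I INPUT -p tcp --dport 50020 -j ACCEPT
iptables -I INPUT -p tcp --dport 50075 -j ACCEPT
service iptables save
```

Once the ports are reachable, the `NoRouteToHostException` disappears and the slave's TaskTracker can localize the jobToken file.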