Reposted from: http://hi.baidu.com/itdreams2009/blog/item/62a5ef18fbbe854e42a9ad13.html

1. Problem description: a Hadoop cluster built from three machines, one namenode and two datanodes. Today, running hadoop fs -copyFromLocal failed with: File /home/hexianghui/tmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1. From this it looks like no datanode could be found. jps showed all the processes running normally, but the web UI showed 0 live nodes. So the datanodes did not come up properly even though the datanode processes are running; why is that? Readers are welcome to reply with suggestions or pointers...

2009-12-30 22:02:19,190 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
at org.apache.hadoop.ipc.Client.call(Client.java:739)
2009-12-30 22:04:01,555 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
at org.apache.hadoop.ipc.Client.call(Client.java:739)

Below is a collection of fixes I found online:

Solution 1: Check whether the firewall was left running. I had indeed forgotten to turn it off: I had switched machines, from RedHat in a VM to Ubuntu 8.04, so this was very likely related. After shutting down iptables there were far fewer error messages, but errors remained, such as:

2010-01-03 22:08:25,073 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call addBlock(/user/hexianghui/input/file01, DFSClient_105866075) from 192.168.0.4:53604: error: java.io.IOException: File /user/hexianghui/input/file01 could only be replicated to 0 nodes, instead of 1

The message is similar ("could only be replicated to 0 nodes, instead of 1"), but the situation is different; compare it with the error above. Keep digging...

Solution 2: the Hadoop DFSClient warns with NotReplicatedYetException. Sometimes, when you try to upload files to HDFS right after acquiring a HOD cluster, the DFSClient warns with a NotReplicatedYetException. There is usually a message like this -
This happens when you upload files to a cluster whose DataNodes are still in the process of contacting the NameNode. Waiting a while before uploading new files to HDFS solves it, because it gives enough DataNodes time to start up and register with the NameNode.
PS: In my case I waited several minutes and it still failed, so this route was a dead end.
Solution 3: http://trac.nchc.org.tw/cloud/wiki/waue/2009/0709 - the error message means HDFS wants to place the file but not a single node is available to store it, so we need to check:
PS: I went through those checks. 1) Enough disk space: verified with df -hl. 2) The datanode count is 2: jps on the datanodes showed both processes running. 3) Whether HDFS is in safe mode: after running hadoop dfsadmin -safemode leave, copying worked normally. Perhaps the earlier steps together with this last one did the trick. (The sketch below collects these checks.)
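A minimal shell sketch collecting the checks above, as they might be run from the Hadoop home directory on the namenode; the iptables commands and the per-node notes are illustrative additions, not from the original posts:

sudo iptables -L -n                  # 1) any firewall rules blocking the Hadoop ports?
sudo iptables -F                     #    flush them if so (the "shut down iptables" step)
df -hl                               # 2) enough free disk space? run on every node
jps                                  # 3) run on each datanode: is a DataNode process up?
bin/hadoop dfsadmin -safemode get    # 4) is the namenode stuck in safe mode?
bin/hadoop dfsadmin -safemode leave  #    force it out if it is
bin/hadoop dfsadmin -report          # 5) does the namenode now report live datanodes?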
Reposted from: http://blog.csdn.net/wh62592855/archive/2010/07/18/5744158.aspx
10/07/18 12:31:11 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
10/07/18 12:31:11 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/root/input/log4j.properties" - Aborting...
put: java.io.IOException: File /user/root/input/log4j.properties could only be replicated to 0 nodes, instead of 1
That is a long stretch of error output. When I first hit this problem I searched the web, and there was no single standard fix; the rough consensus was that it is caused by HDFS being left in an inconsistent state.
There is one workaround, but it throws away all existing data, so use it with caution:
1. Stop all the services
2. Format the namenode
3. Restart all the services
4. Normal operations work again (a sketch of the full procedure follows below)
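A minimal sketch of that reset, assuming the default storage location under /tmp/hadoop-root that appears in the transcript below. One caveat the post does not mention: after the namenode is reformatted, any old datanode block directories still carry the previous namespaceID, and those datanodes can then die on startup with an "Incompatible namespaceIDs" IOException, so on a multi-node cluster the datanode data directories should be wiped as well (the rm path is an assumption based on the default dfs.data.dir):

bin/stop-all.sh                   # 1) stop every daemon first
rm -rf /tmp/hadoop-root/dfs/data  #    on each datanode; avoids "Incompatible namespaceIDs"
bin/hadoop namenode -format       # 2) destructive: wipes all HDFS metadata (answer Y)
bin/start-all.sh                  # 3) bring all the daemons back up
                                  # 4) normal operations should now work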
Here are the steps I took:
root@scutshuxue-desktop:/home/root/hadoop-0.19.2# bin/stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
no namenode to stop
localhost: no datanode to stop
localhost: stopping secondarynamenode
root@scutshuxue-desktop:/home/root/hadoop-0.19.2# bin/hadoop namenode -format
10/07/18 12:46:23 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = scutshuxue-desktop/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.19.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.19 -r 789657; compiled by 'root' on Tue Jun 30 12:40:50 EDT 2009
************************************************************/
Re-format filesystem in /tmp/hadoop-root/dfs/name ? (Y or N) Y
10/07/18 12:46:24 INFO namenode.FSNamesystem: fsOwner=root,root
10/07/18 12:46:24 INFO namenode.FSNamesystem: supergroup=supergroup
10/07/18 12:46:24 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/07/18 12:46:25 INFO common.Storage: Image file of size 94 saved in 0 seconds.
10/07/18 12:46:25 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
10/07/18 12:46:25 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at scutshuxue-desktop/127.0.1.1
************************************************************/
root@scutshuxue-desktop:/home/root/hadoop-0.19.2# ls
bin docs lib README.txt
build.xml hadoop-0.19.2-ant.jar libhdfs src
c++ hadoop-0.19.2-core.jar librecordio test-txt
CHANGES.txt hadoop-0.19.2-examples.jar LICENSE.txt webapps
conf hadoop-0.19.2-test.jar logs
contrib hadoop-0.19.2-tools.jar NOTICE.txt
root@scutshuxue-desktop:/home/root/hadoop-0.19.2# bin/start-all.sh
starting namenode, logging to /home/root/hadoop-0.19.2/bin/../logs/hadoop-root-namenode-scutshuxue-desktop.out
localhost: starting datanode, logging to /home/root/hadoop-0.19.2/bin/../logs/hadoop-root-datanode-scutshuxue-desktop.out
localhost: starting secondarynamenode, logging to /home/root/hadoop-0.19.2/bin/../logs/hadoop-root-secondarynamenode-scutshuxue-desktop.out
starting jobtracker, logging to /home/root/hadoop-0.19.2/bin/../logs/hadoop-root-jobtracker-scutshuxue-desktop.out
localhost: starting tasktracker, logging to /home/root/hadoop-0.19.2/bin/../logs/hadoop-root-tasktracker-scutshuxue-desktop.out
root@scutshuxue-desktop:/home/root/hadoop-0.19.2# bin/hadoop fs -put conf input
root@scutshuxue-desktop:/home/root/hadoop-0.19.2# bin/hadoop dfs -ls
Found 1 items
drwxr-xr-x - root supergroup 0 2010-07-18 12:47 /user/root/input
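Since the original symptom in the first article was the web UI showing 0 live nodes, it is worth confirming after a reset like this that the namenode really does see the datanodes again; a quick check, assuming the same working directory:

bin/hadoop dfsadmin -report   # "Datanodes available" should match the cluster size
bin/hadoop fs -ls input       # the freshly uploaded conf directory should be listed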