File *** could only be replicated to 0 nodes instead of minReplication (=1)

1. After the cluster deployment was finished, a test upload of a file to HDFS failed with the exception: File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).

[hadoop@abcd08 chx]$ hadoop fs -put hdfs_test.txt /user
14/11/30 21:43:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/30 21:43:06 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1331)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:480)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)
        at org.apache.hadoop.ipc.Client.call(Client.java:1225)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1176)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1029)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
put: File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
14/11/30 21:43:06 ERROR hdfs.DFSClient: Failed to close file /user/hdfs_test.txt._COPYING_
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs_test.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1331)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:480)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)
        at org.apache.hadoop.ipc.Client.call(Client.java:1225)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1176)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1029)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
[hadoop@abcd08 chx]$ 


2. Run hadoop dfsadmin -report to check the cluster status:
[hadoop@abcd08 ~]$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

14/11/30 22:09:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)
[hadoop@abcd08 ~]$    

Clearly, the datanodes are not running properly.
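
A quick way to confirm is to check for a DataNode JVM with jps on each slave host. A minimal sketch, assuming passwordless ssh as the hadoop user (which start-dfs.sh already relies on) and the slave hostnames abcd01 through abcd07:

# Run from the namenode host: is a DataNode JVM running on each slave?
for host in abcd01 abcd02 abcd03 abcd04 abcd05 abcd06 abcd07; do
    echo "== $host =="
    ssh "$host" 'jps | grep DataNode || echo "no DataNode process"'
done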

3. Re-run sh start-dfs.sh (executed twice in a row; both runs printed the same output):
[hadoop@abcd08 sbin]$ sh start-dfs.sh 
14/11/30 23:30:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [abcd08]
abcd08: starting namenode, logging to /home/hadoop/cdh4/hadoop-2.0.0-cdh4.3.0/logs/hadoop-hadoop-namenode-abcd08.out
abcd05: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd05.out
abcd06: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd06.out
abcd02: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd02.out
abcd01: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd01.out
abcd07: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd07.out
abcd04: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd04.out
abcd03: starting datanode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-abcd03.out
Starting secondary namenodes [abcd01]
abcd01: starting secondarynamenode, logging to /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-secondarynamenode-abcd01.out
14/11/30 23:30:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@abcd08 sbin]$ 

This shows that each datanode process exits almost immediately after it starts.


4. Check the log file on a datanode:
[hadoop@abcd07 logs]$ pwd
/home/hadoop/cdh4/hadoop/logs
[hadoop@abcd07 logs]$ tail -50 hadoop-hadoop-datanode-abcd07.log
2014-11-30 22:51:36,896 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-434621394-100.101.138.60-1417342220181 (storage id DS-139576203-100.101.138.59-50010-1416757117263) service to abcd08/100.101.138.60:8020
java.io.IOException: Incompatible clusterIDs in /data1/hadoop: namenode clusterID = CID-2d09836b-d546-4066-9deb-b28cc55de11a; datanode clusterID = CID-b664bda9-3f22-412f-86bd-372f98a73a52
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:911)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:882)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:308)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
        at java.lang.Thread.run(Thread.java:662)
2014-11-30 22:51:36,898 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-434621394-100.101.138.60-1417342220181 (storage id DS-139576203-100.101.138.59-50010-1416757117263) service to abcd08/100.101.138.60:8020
2014-11-30 22:51:36,999 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-434621394-100.101.138.60-1417342220181 (storage id DS-139576203-100.101.138.59-50010-1416757117263)
2014-11-30 22:51:39,000 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-11-30 22:51:39,002 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-11-30 22:51:39,007 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at abcd07/100.101.138.59
************************************************************/


The error reported is: java.io.IOException: Incompatible clusterIDs in /data1/hadoop: namenode clusterID = CID-2d09836b-d546-4066-9deb-b28cc55de11a; datanode clusterID = CID-b664bda9-3f22-412f-86bd-372f98a73a52.
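
To confirm that every datanode is failing for the same reason, the other datanode logs can be scanned for this message from the namenode host; a sketch, assuming the log directory shown above exists on every node and passwordless ssh is set up:

# Count occurrences of the fatal message in each datanode's log
for host in abcd01 abcd02 abcd03 abcd04 abcd05 abcd06 abcd07; do
    echo "== $host =="
    ssh "$host" "grep -c 'Incompatible clusterIDs' /home/hadoop/cdh4/hadoop/logs/hadoop-hadoop-datanode-$host.log"
done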


5. Check the hdfs-site.xml configuration file on each node to find the configured namenode and datanode directories:
[hadoop@abcd07 hadoop]$ more hdfs-site.xml 
<!-- Put site-specific property overrides in this file. -->
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/home/hadoop/cdh4/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/data1/hadoop</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>abcd01:50090</value>
                <description></description>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
</configuration>
[hadoop@abcd07 hadoop]$


On each datanode, open the data directory configured above (dfs.datanode.data.dir = /data1/hadoop) and compare the VERSION file under its current subdirectory with the VERSION file under /home/hadoop/cdh4/hadoop/dfs/name/current on the namenode. Their clusterIDs are indeed different (a way to do this comparison in one pass is sketched after the listings below).


For example:

On the namenode:
[hadoop@abcd08 current]$ more VERSION 
#Sun Nov 30 18:10:20 CST 2014
namespaceID=1651811630
clusterID=CID-2d09836b-d546-4066-9deb-b28cc55de11a
cTime=0
storageType=NAME_NODE
blockpoolID=BP-434621394-100.101.138.60-1417342220181
layoutVersion=-40
[hadoop@abcd08 current]$

On a datanode:
[hadoop@abcd07 current]$ more VERSION 
#Mon Nov 24 00:31:56 CST 2014
namespaceID=1422355035
clusterID=CID-597dcc33-c77e-4d18-a8b9-6e940b63d3e6
cTime=0
storageType=NAME_NODE
blockpoolID=BP-178246317-100.101.138.60-1416760316740
layoutVersion=-40
[hadoop@abcd07 current]$  
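
To compare all of the clusterIDs in one pass, something along these lines works (a sketch, assuming the directory layout from hdfs-site.xml above and passwordless ssh from the namenode host):

# clusterID recorded by the namenode (run on abcd08):
grep clusterID /home/hadoop/cdh4/hadoop/dfs/name/current/VERSION

# clusterID recorded by each datanode:
for host in abcd01 abcd02 abcd03 abcd04 abcd05 abcd06 abcd07; do
    echo "== $host =="
    ssh "$host" 'grep clusterID /data1/hadoop/current/VERSION'
done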


Why the clusterIDs differ: after HDFS was formatted the first time, the cluster was started and used; later the format command (hdfs namenode -format) was run again. Reformatting generates a new clusterID for the namenode, while the clusterID stored on each datanode stays unchanged.


6. On each datanode, edit the VERSION file under the current directory of the configured data directory (/data1/hadoop/current/VERSION) and change its clusterID to match the namenode's clusterID, then start HDFS again. Problem solved.
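
A minimal sketch of that fix, run from the namenode host; the sed edit is just one way to apply the change, and the hostnames, data directory, and clusterID value are taken from the output above (consider backing up each VERSION file before editing it):

# NEW_CID is the clusterID from the namenode's VERSION file
NEW_CID=CID-2d09836b-d546-4066-9deb-b28cc55de11a
for host in abcd01 abcd02 abcd03 abcd04 abcd05 abcd06 abcd07; do
    ssh "$host" "sed -i 's/^clusterID=.*/clusterID=$NEW_CID/' /data1/hadoop/current/VERSION"
done

# Start HDFS again and verify that the datanodes register:
sh start-dfs.sh
hadoop dfsadmin -report

Once the datanodes register again, the original hadoop fs -put test should succeed.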
