Hadoop Installation: Common Exceptions and Solutions (1)

Exception 1:

2014-03-13 11:10:23,665 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

2014-03-13 11:10:24,667 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:25,667 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:26,669 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:27,670 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:28,671 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:29,672 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:30,674 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:31,675 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:32,676 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Linux-hadoop-38/10.10.208.38:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2014-03-13 11:10:32,677 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: Linux-hadoop-38/10.10.208.38:9000
Solution:
1. ping Linux-hadoop-38 succeeds, but telnet Linux-hadoop-38 9000 fails, which means a firewall is blocking the port.
2. On the Linux-hadoop-38 host, stop the firewall with /etc/init.d/iptables stop, which prints:
iptables: Flushing firewall rules: [ OK ]
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Unloading modules: [ OK ]

3. Restart. (See the sketch below for making the fix persistent.)
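A minimal sketch of the connectivity checks and a persistent fix, assuming a RHEL/CentOS-style init system as the output above suggests; chkconfig keeps iptables from coming back after a reboot, which is what later causes Exception 6:

# run from the DataNode host to confirm the NameNode port is reachable
ping -c 3 Linux-hadoop-38
telnet Linux-hadoop-38 9000

# on Linux-hadoop-38: stop the firewall now and keep it off across reboots
/etc/init.d/iptables stop
chkconfig iptables off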

Exception 2:

2014-03-13 11:26:30,788 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-1257313099-10.10.208.38-1394679083528 (storage id DS-743638901-127.0.0.1-50010-1394616048958) service to Linux-hadoop-38/10.10.208.38:9000
java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop/tmp/dfs/data: namenode clusterID = CID-8e201022-6faa-440a-b61c-290e4ccfb006; datanode clusterID = clustername
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:391)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:916)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:887)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:309)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
at java.lang.Thread.run(Thread.java:662)
Solution:
1. hdfs-site.xml sets dfs.namenode.name.dir. On the master, that directory contains a current folder holding a VERSION file with the following contents:
#Thu Mar 13 10:51:23 CST 2014
namespaceID=1615021223
clusterID=CID-8e201022-6faa-440a-b61c-290e4ccfb006
cTime=0
storageType=NAME_NODE
blockpoolID=BP-1257313099-10.10.208.38-1394679083528
layoutVersion=-40
2. core-site.xml sets hadoop.tmp.dir. On the slave, that directory contains dfs/data/current, which also holds a VERSION file:
#Wed Mar 12 17:23:04 CST 2014
storageID=DS-414973036-10.10.208.54-50010-1394616184818
clusterID=clustername
cTime=0
storageType=DATA_NODE
layoutVersion=-40
3. The cause is plain to see: the two clusterID values do not match. Delete the stale data on the slave (as sketched below), restart, done!
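A minimal cleanup sketch, assuming hadoop.tmp.dir is /usr/local/hadoop/tmp as in the log above. Wiping the data directory discards this DataNode's blocks, which is fine on a fresh install; editing VERSION instead keeps them:

# option 1: wipe the stale DataNode storage (loses this node's blocks)
sbin/hadoop-daemon.sh stop datanode
rm -rf /usr/local/hadoop/tmp/dfs/data
sbin/hadoop-daemon.sh start datanode

# option 2: keep the blocks and just align clusterID with the NameNode's value
sed -i 's/^clusterID=.*/clusterID=CID-8e201022-6faa-440a-b61c-290e4ccfb006/' /usr/local/hadoop/tmp/dfs/data/current/VERSION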

Reference: http://www.linuxidc.com/Linux/2014-03/98598.htm

Exception 3:

2014-03-13 12:34:46,828 FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce_shuffle
java.lang.RuntimeException: No class defiend for mapreduce_shuffle
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.init(AuxServices.java:94)
at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.init(ContainerManagerImpl.java:181)
at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.init(NodeManager.java:185)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:328)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:351)
2014-03-13 12:34:46,830 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
java.lang.RuntimeException: No class defiend for mapreduce_shuffle
at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.init(AuxServices.java:94)
at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.init(ContainerManagerImpl.java:181)
at org.apache.hadoop.yarn.service.CompositeService.init(CompositeService.java:58)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.init(NodeManager.java:185)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:328)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:351)
2014-03-13 12:34:46,846 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ResourceCalculatorPlugin is unavailable on this system. org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is disabled.
Solution:
1. yarn-site.xml was misconfigured:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

2. Change it to:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
</property>

3. Restart the service. (See the version note below.)
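Note that the accepted value differs between releases: some 2.0.x builds expect mapreduce.shuffle (dot), while Hadoop 2.2.0 and later require mapreduce_shuffle (underscore) together with a class mapping, so check the release you actually run. A sketch of the 2.2.0+ form and the restart:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

sbin/stop-yarn.sh && sbin/start-yarn.sh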

Warning:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable

Solution:

See http://www.linuxidc.com/Linux/2014-03/98599.htm, and the sketch below.
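This warning means the JVM cannot locate libhadoop.so. Besides rebuilding the native library as the link describes, pointing java.library.path at the native directory is often enough; a sketch for hadoop-env.sh, assuming the default $HADOOP_HOME/lib/native layout:

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"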

Exception 4:

14/03/13 17:25:41 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1734)
at java.lang.Runtime.loadLibrary0(Runtime.java:823)
at java.lang.System.loadLibrary(System.java:1028)
at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:67)
at com.hadoop.compression.lzo.LzoIndexer.<init>(LzoIndexer.java:36)
at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:134)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
14/03/13 17:25:41 ERROR lzo.LzoCodec: Cannot load native-lzo without native-hadoop
14/03/13 17:25:43 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file /test2.lzo, size 0.00 GB…
Exception in thread "main" java.lang.RuntimeException: native-lzo library not available
at com.hadoop.compression.lzo.LzopCodec.createDecompressor(LzopCodec.java:91)
at com.hadoop.compression.lzo.LzoIndex.createIndex(LzoIndex.java:222)
at com.hadoop.compression.lzo.LzoIndexer.indexSingleFile(LzoIndexer.java:117)
at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:98)
at com.hadoop.compression.lzo.LzoIndexer.index(LzoIndexer.java:52)
at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:137)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

**Solution:** clearly, the native-lzo library is missing.
Build and install lzo / hadoop-lzo (outlined below): http://www.linuxidc.com/Linux/2014-03/98601.htm
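A rough outline of the build, assuming RHEL-style packages, Maven, and the twitter/hadoop-lzo sources; exact paths vary by platform and version, so treat this as a sketch and follow the linked article for details:

yum install -y lzo lzo-devel                                      # native LZO runtime and headers
git clone https://github.com/twitter/hadoop-lzo.git
cd hadoop-lzo && mvn clean package -Dmaven.test.skip=true
cp target/native/Linux-amd64-64/lib/* $HADOOP_HOME/lib/native/    # native gplcompression libs
cp target/hadoop-lzo-*.jar $HADOOP_HOME/share/hadoop/common/lib/  # the codec jar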

Exception 5:

14/03/17 10:23:59 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1394702706596_0003
java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:134)
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:174)
at org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:58)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:276)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:468)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:485)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1680)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:127)
… 26 more
Temporary workaround:
Copy /usr/local/hadoop/lib/hadoop-lzo-0.4.10.jar into /usr/local/jdk/lib and reboot Linux. (A cleaner alternative is sketched below.)
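The cleaner alternative (a sketch, assuming the jar path above) is to put hadoop-lzo on Hadoop's classpath instead of the JDK's, and make sure the codec is registered in core-site.xml:

# in hadoop-env.sh
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/hadoop/lib/hadoop-lzo-0.4.10.jar

# in core-site.xml
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>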

Exception 6:

14/03/17 10:35:03 ERROR security.UserGroupInformation: PriviledgedActionException as:Hadoop(auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot delete /tmp/hadoop-yarn/staging/hadoop/.staging/job_1395023531587_0001. Name node is in safe mode.
The reported blocks 0 needs additional 12 blocks to reach the threshold 0.9990 of total blocks 12. Safe mode will be turned off automatically.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2905)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot delete /tmp/hadoop-yarn/staging/hadoop/.staging/job_1395023531587_0001. Name node is in safe mode.
The reported blocks 0 needs additional 12 blocks to reach the threshold 0.9990 of total blocks 12. Safe mode will be turned off automatically.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:2905)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:2872)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2859)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:642)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:408)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44968)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)
at org.apache.hadoop.ipc.Client.call(Client.java:1238)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy9.delete(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy9.delete(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:408)
at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1487)
at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:355)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:418)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Solution:
This is clearly safe mode.
The cause: when the machine was rebooted, the firewall came back up, so the DataNodes could not report their blocks, and the NameNode stayed in safe mode. Stop the firewall and restart Hadoop, done!
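While debugging, the standard dfsadmin commands can report or force-exit safe mode; forcing it out is safe here only because the missing block reports were caused by the firewall, not by lost data:

hdfs dfsadmin -safemode get      # prints whether safe mode is ON or OFF
hdfs dfsadmin -safemode leave    # force the NameNode out of safe mode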

Exception 7:

2014-03-20 10:35:10,447 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2014-03-20 10:35:10,450 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2014-03-20 10:35:10,450 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2014-03-20 10:35:10,450 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
2014-03-20 10:35:10,476 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /usr/local/hadoop/hdfs/name/in_use.lock acquired by nodename 9580@Linux-hadoop-38
2014-03-20 10:35:10,479 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system…
2014-03-20 10:35:10,480 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2014-03-20 10:35:10,480 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2014-03-20 10:35:10,480 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:217)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:728)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:521)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:403)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:437)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:613)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:598)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1169)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1233)
2014-03-20 10:35:10,484 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-03-20 10:35:10,501 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
Solution:
Run hadoop namenode -format, not hdfs namenode -format. (A quick sanity check follows.)
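After formatting, the name directory from the in_use.lock line above should contain a fresh current/VERSION; a quick check (answer Y at the re-format prompt):

hadoop namenode -format
ls /usr/local/hadoop/hdfs/name/current/VERSION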

Starting the JobHistoryServer:
sbin/mr-jobhistory-daemon.sh start historyserver
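After starting it, jps should list a JobHistoryServer process, and its web UI listens on port 19888 by default (mapreduce.jobhistory.webapp.address):

jps | grep JobHistoryServer
curl http://localhost:19888/    # assuming the default webapp address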

Exception 8:

Datanode denied communication with namenode: DatanodeRegistration, and how to fix it

Hadoop version: 2.2.0

Single-machine pseudo-distributed setup.

After running start-dfs.sh, there was no local datanode process; the datanode log showed:

2014-03-24 23:48:11,357 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1316134947-127.0.0.1-1395699640023 (storage id DS-2053126411-127.0.0.1-50010-1395699655564) service to localhost/192.168.1.101:9000 beginning handshake with NN

2014-03-24 23:48:11,381 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-1316134947-127.0.0.1-1395699640023 (storage id DS-2053126411-127.0.0.1-50010-1395699655564) service to localhost/192.168.1.101:9000
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException): Datanode denied communication with namenode: DatanodeRegistration(0.0.0.0, storageID=DS-2053126411-127.0.0.1-50010-1395699655564, infoPort=50075, ipcPort=50020, storageInfo=lv=-47;cid=CID-ba95b66c-d94b-4390-8d44-adb486ba5682;nsid=2073699254;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:739)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3929)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:948)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:90)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:24079)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy9.registerDatanode(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy9.registerDatanode(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.registerDatanode(DatanodeProtocolClientSideTranslatorPB.java:146)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:623)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664)
at java.lang.Thread.run(Thread.java:695)
2014-03-24 23:48:11,382 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-1316134947-127.0.0.1-1395699640023 (storage id DS-2053126411-127.0.0.1-50010-1395699655564) service to localhost/192.168.1.101:9000
2014-03-24 23:48:11,484 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-1316134947-127.0.0.1-1395699640023 (storage id DS-2053126411-127.0.0.1-50010-1395699655564)
2014-03-24 23:48:11,484 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Removed bpid=BP-1316134947-127.0.0.1-1395699640023 from blockPoolScannerMap
2014-03-24 23:48:11,484 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-1316134947-127.0.0.1-1395699640023
2014-03-24 23:48:13,485 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-03-24 23:48:13,486 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-03-24 23:48:13,487 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at localhost/127.0.0.1

************************************************************/

This problem has many possible causes, and stackoverflow offers plenty of fixes, but none of them worked for me. After some digging, I solved it by editing /etc/hosts.

/etc/hosts originally contained:

127.0.0.1 localhost

Since my machine's IP address is 192.168.1.101, I changed the hosts entry to that address:

192.168.1.101 localhost

In theory 127.0.0.1 should work too, but for whatever reason it did not in my case. (An alternative workaround is sketched below.)
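As an alternative that others report working (an assumption worth verifying on your release, though the property does exist in the 2.x line), you can relax the NameNode's hostname check for DataNode registration in hdfs-site.xml:

<property>
  <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
  <value>false</value>
</property>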

After the change, delete the dfs.namenode.name.dir, dfs.datanode.data.dir, and hadoop.tmp.dir directories, then run, in order:

hdfs namenode -format

start-dfs.sh

start-yarn.sh

Afterwards, jps shows all services running normally:

37619 NodeManager

37798 Jps

37247 NameNode

37330 DataNode

37536 ResourceManager

37432 SecondaryNameNode
