Hadoop 2 installation error notes


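Error 1: This happened while uploading a file to HDFS. The upload appeared to stay in the copying state indefinitely and took a very long time. The NameNode and JournalNode logs at the time looked like this: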

2015-06-30 09:29:45,020 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/tmp/dfs/name/current/edits_inprogress_0000000000000000114 -> /home/lin/hadoop-2.5.2/tmp/dfs/name/current/edits_0000000000000000114-0000000000000000127

2015-06-30 09:29:45,020 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 128
2015-06-30 09:29:48,876 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30002 milliseconds
2015-06-30 09:29:48,877 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 2 millisecond(s).
2015-06-30 09:30:18,876 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-06-30 09:30:18,876 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2015-06-30 09:30:48,876 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2015-06-30 09:30:48,876 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2015-06-30 09:31:18,878 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30002 milliseconds
2015-06-30 09:31:18,879 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).


2015-06-30 09:25:43,935 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000001 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000001-0000000000000000002
2015-06-30 09:27:44,814 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000003 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000003-0000000000000000113
2015-06-30 09:29:45,016 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000114 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000114-0000000000000000127
2015-06-30 09:31:45,158 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000128 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000128-0000000000000000129
2015-06-30 09:33:45,331 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000130 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000130-0000000000000000131
2015-06-30 09:35:45,457 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000132 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000132-0000000000000000133
2015-06-30 09:37:45,575 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000134 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000134-0000000000000000136
2015-06-30 09:39:45,779 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000137 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000137-0000000000000000138
2015-06-30 09:41:47,915 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000139 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000139-0000000000000000140
2015-06-30 09:43:48,061 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000141 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000141-0000000000000000142

2015-06-30 09:45:48,365 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_inprogress_0000000000000000143 -> /home/lin/hadoop-2.5.2/journal/mycluster/current/edits_0000000000000000143-0000000000000000144

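Analysis: the problem mainly lies in the JournalNode process. The usual cause is a mistake in the JournalNode configuration, or some of the JournalNode processes not having been started. Checking both carefully avoids this error.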
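One way to do that check quickly is to probe every JournalNode listed in the shared edits URI. The following is only a sketch under some assumptions: the `qjournal://` URI is the value of `dfs.namenode.shared.edits.dir` from `hdfs-site.xml`, 8485 is the default JournalNode RPC port, and the host names `jn1.example` to `jn3.example` are hypothetical placeholders, not the hosts of the cluster above. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

class JournalNodeCheck {

    /** Splits a URI like qjournal://a:8485;b:8485;c:8485/mycluster into "host:port" entries. */
    static List<String> parseJournalNodes(String sharedEditsDir) {
        String prefix = "qjournal://";
        if (!sharedEditsDir.startsWith(prefix)) {
            throw new IllegalArgumentException("not a qjournal URI: " + sharedEditsDir);
        }
        String rest = sharedEditsDir.substring(prefix.length());
        int slash = rest.indexOf('/');
        String authority = slash >= 0 ? rest.substring(0, slash) : rest;
        List<String> nodes = new ArrayList<>();
        for (String node : authority.split(";")) {
            if (!node.trim().isEmpty()) {
                nodes.add(node.trim());
            }
        }
        return nodes;
    }

    /** Returns true if a TCP connection to "host:port" succeeds within the timeout. */
    static boolean isReachable(String hostPort, int timeoutMillis) {
        int colon = hostPort.lastIndexOf(':');
        String host = colon >= 0 ? hostPort.substring(0, colon) : hostPort;
        // Assumption: fall back to 8485, the default JournalNode RPC port.
        int port = colon >= 0 ? Integer.parseInt(hostPort.substring(colon + 1)) : 8485;
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or host could not be resolved
        }
    }

    public static void main(String[] args) {
        // Hypothetical host names; pass the real URI as the first argument.
        String uri = args.length > 0
                ? args[0]
                : "qjournal://jn1.example:8485;jn2.example:8485;jn3.example:8485/mycluster";
        for (String node : parseJournalNodes(uri)) {
            System.out.println(node + " -> "
                    + (isReachable(node, 2000) ? "reachable" : "NOT reachable"));
        }
    }
}
```

A node reported as not reachable is either not running or is listed with the wrong host or port, which are the two causes named above. An open port only shows that something is listening there, so the JournalNode's own log is still worth a look.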

Error 2: When starting HBase, the HMaster process died with the following error:

2015-06-29 20:38:39,287 INFO  [master1:60000.activeMasterManager] master.RegionStates: Onlined 1588230740 on ubuntu.slave1,16020,1435580161840
2015-06-29 20:38:39,288 INFO  [master1:60000.activeMasterManager] master.ServerManager: AssignmentManager hasn't finished failover cleanup; waiting
2015-06-29 20:38:39,289 INFO  [master1:60000.activeMasterManager] master.HMaster: hbase:meta assigned=0, rit=false, location=ubuntu.slave1,16020,1435580161840
2015-06-29 20:38:39,532 INFO  [master1:60000.activeMasterManager] hbase.MetaMigrationConvertingToPB: hbase:meta doesn't have any entries to update.
2015-06-29 20:38:39,532 INFO  [master1:60000.activeMasterManager] hbase.MetaMigrationConvertingToPB: META already up-to date with PB serialization
2015-06-29 20:38:39,771 INFO  [master1:60000.activeMasterManager] master.AssignmentManager: Clean cluster startup. Assigning user regions
2015-06-29 20:38:39,991 INFO  [master1:60000.activeMasterManager] master.AssignmentManager: Joined the cluster in 459ms, failover=false
2015-06-29 20:38:40,006 INFO  [master1:60000.activeMasterManager] master.TableNamespaceManager: Namespace table not found. Creating...
2015-06-29 20:38:40,093 FATAL [master1:60000.activeMasterManager] master.HMaster: Failed to become active master
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.checkAndSetEnablingTable(CreateTableHandler.java:151)
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:124)
at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:233)
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:871)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:722)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:165)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1428)
at java.lang.Thread.run(Thread.java:745)
2015-06-29 20:38:40,095 FATAL [master1:60000.activeMasterManager] master.HMaster: Master server abort: loaded coprocessors are: []
2015-06-29 20:38:40,095 FATAL [master1:60000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.checkAndSetEnablingTable(CreateTableHandler.java:151)
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:124)
at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:233)
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:871)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:722)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:165)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1428)
at java.lang.Thread.run(Thread.java:745)
2015-06-29 20:38:40,095 INFO  [master1:60000.activeMasterManager] regionserver.HRegionServer: STOPPED: Unhandled exception. Starting shutdown.
2015-06-29 20:38:40,096 INFO  [master/master1/192.168.1.107:60000] regionserver.HRegionServer: Stopping infoServer
2015-06-29 20:38:40,110 INFO  [master/master1/192.168.1.107:60000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16010
2015-06-29 20:38:40,212 INFO  [master/master1/192.168.1.107:60000] regionserver.HRegionServer: stopping server master1,60000,1435581499646
2015-06-29 20:38:40,218 INFO  [master/master1/192.168.1.107:60000] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x14e3f345e250008
2015-06-29 20:38:40,234 INFO  [master/master1/192.168.1.107:60000] zookeeper.ZooKeeper: Session: 0x14e3f345e250008 closed
2015-06-29 20:38:40,234 INFO  [master/master1/192.168.1.107:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-06-29 20:38:40,295 INFO  [master1,60000,1435581499646.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: master1,60000,1435581499646.splitLogManagerTimeoutMonitor exiting
2015-06-29 20:38:40,338 INFO  [master/master1/192.168.1.107:60000] regionserver.HRegionServer: stopping server master1,60000,1435581499646; all regions closed.
2015-06-29 20:38:40,338 INFO  [master1,60000,1435581499646-BalancerChore] balancer.BalancerChore: master1,60000,1435581499646-BalancerChore exiting
2015-06-29 20:38:40,339 INFO  [master1,60000,1435581499646-ClusterStatusChore] balancer.ClusterStatusChore: master1,60000,1435581499646-ClusterStatusChore exiting

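Analysis: the error is in HBase's namespace handling. HBase keeps its own system tables to manage its metadata. The log shows the master deciding that the namespace table is missing ("Namespace table not found. Creating...") and then failing with TableExistsException: hbase:namespace, so the table does exist in some form but conflicts with something. Conflicts with what? HBase relies on ZooKeeper to help manage metadata at runtime, so my guess was that ZooKeeper was involved. I deleted all of the system data and reformatted Hadoop, and the problem went away.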

Error 3: An error that occurred while the client was reading data.

ERROR: java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions:
Thu May 07 17:38:13 CST 2015, org.apache.hadoop.hbase.client.ScannerCallable@16e7eff, java.net.ConnectException: Connection refused
Thu May 07 17:39:45 CST 2015, org.apache.hadoop.hbase.client.ScannerCallable@16e7eff, java.net.ConnectException: Connection refused
Thu May 07 17:40:04 CST 2015, org.apache.hadoop.hbase.client.ScannerCallable@16e7eff, org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for sec,508872:12:12:6030,99999999999999 after 7 tries.
Thu May 07 17:40:23 CST 2015, org.apache.hadoop.hbase.client.ScannerCallable@16e7eff, org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for sec,508872:12:12:6030,99999999999999 after 7 tries.
Thu May 07 17:40:43 CST 2015, org.apache.hadoop.hbase.client.ScannerCallable@16e7eff, org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for sec,508872:12:12:6030,99999999999999 after 7 tries.
Thu May 07 17:41:03 CST 2015, org.apache.hadoop.hbase.client.ScannerCallable@16e7eff, org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for sec,508872:12:12:6030,99999999999999 after 7 tries.
Thu May 07 17:41:26 CST 2015, org.apache.hadoop.hbase.client.ScannerCallable@16e7eff, org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for sec,508872:12:12:6030,99999999999999 after 7 tries.




INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14d4b624e860006, likely server has closed socket, closing socket connection and attempting reconnect
15/05/13 18:39:48 INFO zookeeper.ClientCnxn: Opening socket connection to server ubuntu.slave6/192.168.1.122:2181
15/05/13 18:39:48 INFO zookeeper.ClientCnxn: Socket connection established to ubuntu.slave6/192.168.1.122:2181, initiating session
15/05/13 18:40:01 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 13335ms for sessionid 0x14d4b624e860006, closing socket connection and attempting reconnect
15/05/13 18:40:02 INFO zookeeper.ClientCnxn: Opening socket connection to server ubuntu.slave5/192.168.1.124:2181
15/05/13 18:40:02 INFO zookeeper.ClientCnxn: Socket connection established to ubuntu.slave5/192.168.1.124:2181, initiating session
15/05/13 18:40:07 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14d4b624e860006, likely server has closed socket, closing socket connection and attempting reconnect
15/05/13 18:40:07 INFO zookeeper.ClientCnxn: Opening socket connection to server ubuntu.master/192.168.1.103:2181
15/05/13 18:40:07 INFO zookeeper.ClientCnxn: Socket connection established to ubuntu.master/192.168.1.103:2181, initiating session
15/05/13 18:40:08 INFO zookeeper.ClientCnxn: Session establishment complete on server ubuntu.master/192.168.1.103:2181, sessionid = 0x14d4b624e860006, negotiated timeout = 40000




15/05/17 09:39:56 INFO zookeeper.ClientCnxn: Socket connection established to ubuntu.master/192.168.1.103:2181, initiating session
15/05/17 09:39:56 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14d5f7dc853000c, likely server has closed socket, closing socket connection and attempting reconnect
15/05/17 09:39:57 INFO zookeeper.ClientCnxn: Opening socket connection to server ubuntu.master/192.168.1.103:2181. Will not attempt to authenticate using SASL (无法定位登录配置)

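Analysis: the client's connection was being refused. Checking the system processes showed that one of the RegionServer processes had died while the client was connecting to the RegionServer on that node, which caused the error. Restarting the process resolved the problem.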

Error 4: A problem encountered while programming

org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 634 actions: servers with issues: ubuntu.slave5:60020, 
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1674)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1450)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:772)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:748)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:123)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:84)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at newFrame.NewFrametwo$Reduce.reduce(NewFrametwo.java:176)
at newFrame.NewFrametwo$Reduce.reduce(NewFrametwo.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

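The main cause was that I was working against version 2 while still using code written for version 1, and some interfaces had changed in version 2. Looking up the specific function interfaces and correcting the calls resolved the problem.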

Error 5: A jar could not be found



Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
... 12 more
Caused by: java.lang.NoClassDefFoundError: io/netty/channel/EventLoopGroup
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1929)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:631)
... 17 more
Caused by: java.lang.ClassNotFoundException: io.netty.channel.EventLoopGroup
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 24 more

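The program was missing the netty jar, so the class it provides (io.netty.channel.EventLoopGroup) could not be found. Adding the jar to the program's classpath fixes it.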
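To confirm a missing jar before the HBase client fails deep inside `ConnectionFactory.createConnection`, the class named in the stack trace can be looked up directly at startup. This is a minimal sketch using only the Java standard library; the class name `ClasspathCheck` and its method are hypothetical helpers, and the only name taken from the log above is `io.netty.channel.EventLoopGroup`.

```java
class ClasspathCheck {

    /** Returns true if the named class can be located by this program's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class reported as missing in the stack trace above.
        String required = "io.netty.channel.EventLoopGroup";
        if (isOnClasspath(required)) {
            System.out.println(required + " found");
        } else {
            System.out.println(required + " missing: add the netty jar to the classpath");
        }
    }
}
```

Run it with the same classpath as the failing program. If it reports the class as missing, the jar is absent or not on that classpath.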

Error 6: An error caused by not specifying the ZooKeeper parameters.

create table :sed
15/04/29 22:32:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/29 22:32:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
15/04/29 22:32:56 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
15/04/29 22:32:56 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 6240@DELL-PC
15/04/29 22:32:56 INFO zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (无法定位登录配置)
15/04/29 22:32:57 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
15/04/29 22:32:57 WARN zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
15/04/29 22:32:57 INFO util.RetryCounter: Sleeping 2000ms before retry #1...
15/04/29 22:32:58 INFO zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (无法定位登录配置)
15/04/29 22:32:59 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

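The ZooKeeper parameters had not been configured in the program, so the client fell back to connectString=localhost:2181 (visible in the log above) and was refused. Setting the actual ZooKeeper parameters in the program through the Configuration class resolved the problem.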
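As a sketch of what "configuring the ZooKeeper parameters" amounts to, the helper below builds the two client settings involved. It uses only the Java standard library so that it runs on its own. The property names `hbase.zookeeper.quorum` and `hbase.zookeeper.property.clientPort` are the standard HBase ones, but the host names, the class `ZkClientSettings` and its method are hypothetical, and I did not keep the exact code from the original program.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ZkClientSettings {

    /** Builds the HBase client properties that point at a ZooKeeper quorum. */
    static Map<String, String> forQuorum(List<String> hosts, int clientPort) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one ZooKeeper host is required");
        }
        Map<String, String> settings = new LinkedHashMap<>();
        // Comma-separated host names, without ports.
        settings.put("hbase.zookeeper.quorum", String.join(",", hosts));
        settings.put("hbase.zookeeper.property.clientPort", Integer.toString(clientPort));
        return settings;
    }

    public static void main(String[] args) {
        // Hypothetical host names; use the real ZooKeeper hosts of the cluster.
        Map<String, String> settings =
                forQuorum(Arrays.asList("zk1.example", "zk2.example", "zk3.example"), 2181);

        // With hbase-client on the classpath, each entry would be applied as
        //   Configuration conf = HBaseConfiguration.create();
        //   conf.set(key, value);
        // before the connection or table object is created.
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Placing an `hbase-site.xml` with the same two properties on the client's classpath should have the same effect without code changes, since `HBaseConfiguration.create()` loads that file when it is present.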
