Fixing a YARN resourcemanager that fails to start and an hregionserver that dies right after starting

Problem 1: When starting YARN in Hadoop-2.2.0, the resourcemanager process never comes up.

Check the end of the log file with tail -n 50 yarn-dell-resourcemanager-master1.log

The following exception appears:

2016-09-09 14:41:09,341 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state STARTED; cause: org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
    at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:262)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:623)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:655)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:872)
Caused by: java.net.BindException: Port in use: 192.168.1.120:8088
    at org.apache.hadoop.http.HttpServer.openListener(HttpServer.java:742)
    at org.apache.hadoop.http.HttpServer.start(HttpServer.java:686)
    at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:257)
    ... 4 more
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:444)
    at sun.nio.ch.Net.bind(Net.java:436)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
    at org.apache.hadoop.http.HttpServer.openListener(HttpServer.java:738)
    ... 6 more
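Before killing anything, it can help to confirm which process actually holds the port named in the BindException. The following is a minimal sketch, not part of Hadoop: extract_bind_address and show_listener are hypothetical helper names, and it assumes lsof is installed on the master host.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not Hadoop tooling.

extract_bind_address() {
    # Reads log text on stdin and prints the last "host:port"
    # that was reported in a "Port in use: ..." line.
    grep -o 'Port in use: [^ ]*' | tail -n 1 | sed 's/^Port in use: //'
}

show_listener() {
    # $1 is a TCP port number. lsof prints the process that holds a
    # listening socket on it. Run as the owning user or root, otherwise
    # the PID column may be empty.
    lsof -nP -iTCP:"$1" -sTCP:LISTEN
}

# Usage (log file name taken from this post):
#   addr=$(tail -n 200 yarn-dell-resourcemanager-master1.log | extract_bind_address)
#   show_listener "${addr##*:}"     # ${addr##*:} keeps only the port part
```

If the listener turns out to be an old ResourceManager JVM, the steps in the solution apply directly; if it is some other service, killing resourcemanager processes will not free the port.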

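Solution:

1. Run ps aux | grep -i resourcemanager to see how many resourcemanager processes are running on the master host. A resourcemanager left over from an earlier start is the likely holder of port 8088.

2. Kill each leftover process with kill -9 <RESOURCE_MANAGER_PID>.

3. From the sbin directory, restart YARN to bring the process back:

   ./stop-yarn.sh
   ./start-yarn.sh

The resourcemanager process should now appear on the master node.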
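The kill step can be sketched as a small helper. Note that this is a variation on the post's approach, not the same thing: it sends SIGTERM first and only falls back to kill -9 if the process is still alive after a timeout. stop_gracefully is a hypothetical name, and the pgrep pattern in the usage comment assumes the ResourceManager JVM was launched with the main class shown in the stack trace.

```shell
#!/usr/bin/env bash
# Sketch only: stop_gracefully is a hypothetical helper, not a Hadoop script.

stop_gracefully() {
    # $1 = PID to stop, $2 = seconds to wait before escalating (default 10)
    _pid=$1
    _left=${2:-10}

    # Ask the process to exit; if the signal cannot be sent it is already gone.
    kill "$_pid" 2>/dev/null || return 0

    # Poll once per second until the process disappears or time runs out.
    while [ "$_left" -gt 0 ]; do
        kill -0 "$_pid" 2>/dev/null || return 0
        sleep 1
        _left=$((_left - 1))
    done

    # Still alive after the timeout: force it, as the post does with kill -9.
    kill -9 "$_pid" 2>/dev/null
    return 0
}

# Usage on the master host, then restart YARN from the sbin directory:
#   for pid in $(pgrep -f org.apache.hadoop.yarn.server.resourcemanager.ResourceManager); do
#       stop_gracefully "$pid" 15
#   done
#   ./stop-yarn.sh
#   ./start-yarn.sh
```

Trying SIGTERM first gives the JVM a chance to run its shutdown hooks; the forced kill remains as the fallback.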


Problem 2: Sometimes hregionserver starts and then dies again. Check the HBase startup log:

dell@master1:/usr/local/hbase-0.98.7-hadoop2/logs$ tail -n 100  hbase-dell-regionserver-master1.log
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1286)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:862)
    at java.lang.Thread.run(Thread.java:745)
2017-01-12 10:02:23,347 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server master1,60020,1484186540447: Unhandled: Cannot create directory /hbase/WALs/master1,60020,1484186540447. Name node is in safe mode.
Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE:  If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3355)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3330)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:724)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59598)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /hbase/WALs/master1,60020,1484186540447. Name node is in safe mode.
Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE:  If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3355)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3330)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:724)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:502)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59598)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:467)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)
    at com.sun.proxy.$Proxy15.mkdirs(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2394)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2365)
    at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:817)
    at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:813)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:813)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:806)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1933)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:408)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.<init>(FSHLog.java:334)
    at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createHLog(HLogFactory.java:58)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateHLog(HRegionServer.java:1552)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1531)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1286)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:862)
    at java.lang.Thread.run(Thread.java:745)
2017-01-12 10:02:23,350 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
2017-01-12 10:02:23,367 INFO  [regionserver60020] ipc.RpcServer: Stopping server on 60020
2017-01-12 10:02:23,368 INFO  [regionserver60020] regionserver.HRegionServer: Stopping infoServer
2017-01-12 10:02:23,373 INFO  [regionserver60020] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60030
2017-01-12 10:02:23,475 INFO  [regionserver60020] snapshot.RegionServerSnapshotManager: Stopping RegionServerSnapshotManager abruptly.
2017-01-12 10:02:23,475 INFO  [regionserver60020] regionserver.HRegionServer: aborting server master1,60020,1484186540447
2017-01-12 10:02:23,475 DEBUG [regionserver60020] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@58465d50
2017-01-12 10:02:23,475 INFO  [regionserver60020] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x358d3e5582442fb
2017-01-12 10:02:23,485 INFO  [regionserver60020] zookeeper.ZooKeeper: Session: 0x358d3e5582442fb closed
2017-01-12 10:02:23,485 INFO  [regionserver60020-EventThread] zookeeper.ClientCnxn: EventThread shut down
2017-01-12 10:02:23,488 INFO  [regionserver60020] regionserver.HRegionServer: stopping server master1,60020,1484186540447; all regions closed.
2017-01-12 10:02:23,588 INFO  [regionserver60020] regionserver.Leases: regionserver60020 closing leases
2017-01-12 10:02:23,588 INFO  [regionserver60020] regionserver.Leases: regionserver60020 closed leases
2017-01-12 10:02:23,589 INFO  [regionserver60020] regionserver.CompactSplitThread: Waiting for Split Thread to finish...
2017-01-12 10:02:23,589 INFO  [regionserver60020] regionserver.CompactSplitThread: Waiting for Merge Thread to finish...
2017-01-12 10:02:23,589 INFO  [regionserver60020] regionserver.CompactSplitThread: Waiting for Large Compaction Thread to finish...
2017-01-12 10:02:23,589 INFO  [regionserver60020] regionserver.CompactSplitThread: Waiting for Small Compaction Thread to finish...
2017-01-12 10:02:23,636 INFO  [regionserver60020] zookeeper.ZooKeeper: Session: 0x558d3e6026242f9 closed
2017-01-12 10:02:23,636 INFO  [regionserver60020-EventThread] zookeeper.ClientCnxn: EventThread shut down
2017-01-12 10:02:23,636 INFO  [regionserver60020] regionserver.HRegionServer: stopping server master1,60020,1484186540447; zookeeper connection closed.
2017-01-12 10:02:23,636 INFO  [regionserver60020] regionserver.HRegionServer: regionserver60020 exiting
2017-01-12 10:02:23,636 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2489)
2017-01-12 10:02:23,639 INFO  [Thread-10] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@68ee3eb2
2017-01-12 10:02:23,640 INFO  [Thread-10] regionserver.ShutdownHook: Starting fs shutdown hook thread.
2017-01-12 10:02:23,641 INFO  [Thread-10] regionserver.ShutdownHook: Shutdown hook finished.
Solution:

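1. Run hdfs dfsadmin -safemode leave to take the NameNode out of safe mode. As the log message warns, free up resources on the NameNode first; otherwise it will immediately return to safe mode.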

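2. Then start the region servers from the HBase bin directory.

To start every regionserver in the cluster:

./hbase-daemons.sh start regionserver

Or, to start a single regionserver on the current node:

./hbase-daemon.sh start regionserver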
3. Open the HBase web UI:
http://192.168.1.120:60010/master-status
The page shows how many Region Servers are alive.
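The first two steps can be guarded so that safe mode is only turned off when it is actually on. This is a minimal sketch under stated assumptions: safemode_is_on is a hypothetical helper, and it assumes that hdfs dfsadmin -safemode get prints a line containing "Safe mode is ON" or "Safe mode is OFF".

```shell
#!/usr/bin/env bash
# Sketch only: safemode_is_on is a hypothetical helper, not an HDFS command.

safemode_is_on() {
    # $1 = text printed by "hdfs dfsadmin -safemode get".
    # Returns 0 (true) only when that text reports safe mode as ON.
    case "$1" in
        *"Safe mode is ON"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, after freeing space on the NameNode host:
#   df -h                                   # confirm the NameNode volume has room again
#   status=$(hdfs dfsadmin -safemode get)
#   if safemode_is_on "$status"; then
#       hdfs dfsadmin -safemode leave
#   fi
#   ./hbase-daemon.sh start regionserver    # from the HBase bin directory
```

Checking first avoids issuing a leave command against a NameNode that is already out of safe mode, and running the get command again afterwards shows whether the NameNode has fallen back into it.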






Reference: http://stackoverflow.com/questions/26704763/yarn-resourcetrackerservice-failed-in-state-started
