Avoiding RegionServer crashes

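Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/map_lixiupeng/article/details/42420603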

A RegionServer's management information is kept mainly in ZooKeeper, so a RegionServer is judged to be down when its ZooKeeper session expires.

So what causes the session between a RegionServer and ZooKeeper to expire?

1. An unreliable network.
2. A Java full GC, which blocks all threads. If the pause lasts long enough, it also causes the session to expire.
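To make the second cause concrete, here is a minimal sketch that checks a list of observed stop-the-world pause durations against a session timeout. It is a simplification: the function name `pauses_risking_expiry` and the sample numbers are made up for illustration, and it assumes no heartbeat reaches ZooKeeper while the JVM is paused.

```python
def pauses_risking_expiry(pause_ms_list, session_timeout_ms):
    """Return the pauses that last at least as long as the session timeout.

    Assumption: during a stop-the-world pause no heartbeat is sent, so a
    pause this long gives ZooKeeper enough time to expire the session.
    """
    return [p for p in pause_ms_list if p >= session_timeout_ms]


# Hypothetical pause durations in milliseconds, e.g. read from a GC log.
sample_pauses_ms = [120, 850, 45000, 3000]
print(pauses_risking_expiry(sample_pauses_ms, 40000))
```

Only the pauses returned by the function are long enough, on their own, to put the session at risk under this simplified model.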
Solutions:
1. Increase the ZooKeeper session timeout.
2. Set "hbase.regionserver.restart.on.zk.expire" to true. With this setting, a RegionServer whose ZooKeeper session expires restarts instead of aborting.
Concretely, add the following to hbase-site.xml:
<property>
<name>zookeeper.session.timeout</name>
<value>90000</value>
<description>ZooKeeper session timeout.
HBase passes this to the zk quorum as suggested maximum time for a
session.  See http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
“The client sends a requested timeout, the server responds with the
timeout that it can give the client. The current implementation
requires that the timeout be a minimum of 2 times the tickTime
(as set in the server configuration) and a maximum of 20 times
the tickTime.” Set the zk ticktime with hbase.zookeeper.property.tickTime.
In milliseconds.
</description>
</property>
<property>
<name>hbase.regionserver.restart.on.zk.expire</name>
<value>true</value>
<description>
Zookeeper session expired will force regionserver exit.
Enable this will make the regionserver restart.
</description>
</property>
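The description of zookeeper.session.timeout above says the server bounds the timeout it grants by the tickTime. A minimal sketch of that clamping, assuming the server applies exactly the quoted bounds of 2 and 20 times tickTime. The function name is hypothetical and the tickTime values are examples only, so check your own setting.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms):
    """Clamp a requested session timeout to [2 * tickTime, 20 * tickTime].

    Assumption: the bounds are the ones quoted in the property description.
    """
    lower = 2 * tick_time_ms
    upper = 20 * tick_time_ms
    return max(lower, min(requested_ms, upper))


# With a 2000 ms tickTime, a 90000 ms request is cut down to 40000 ms.
print(negotiated_timeout_ms(90000, 2000))
# With a 6000 ms tickTime, the full 90000 ms is granted.
print(negotiated_timeout_ms(90000, 6000))
```

Under these assumptions, the 90000 ms value only takes full effect when tickTime (set through hbase.zookeeper.property.tickTime) is at least 4500 ms.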
3. To keep full-GC thread suspension from disrupting the ZooKeeper heartbeat, hbase-env.sh also needs to be configured.
  Set the JVM garbage-collection options, in particular:
     -XX:+CMSParallelRemarkEnabled
For example:

export HBASE_OPTS="-Xms16g -Xmx16g -Xmn2g -Xss200k -XX:MaxNewSize=2g -XX:SurvivorRatio=2 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseConcMarkSweepGC -XX:+DisableExplicitGC  -XX:+CMSParallelRemarkEnabled   -XX:+UseFastAccessorMethods  -XX:+UseParNewGC -XX:MaxPermSize=300m -XX:MaxTenuringThreshold=5  -XX:GCTimeRatio=19 -XX:ParallelGCThreads=10 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:-UseGCOverheadLimit "
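As a rough reading aid for the sizes in that line, the sketch below works out the approximate heap layout they imply. It assumes the usual HotSpot conventions: the old generation is the heap minus the young generation, the young generation is one eden plus two survivor spaces, and SurvivorRatio is eden divided by one survivor space. The helper names are hypothetical, only a subset of the flags is parsed, and the JVM may round the real sizes.

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a JVM size such as '16g' or '200k' to bytes."""
    suffix = text[-1].lower()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def heap_layout(opts):
    """Estimate generation sizes from -Xmx, -Xmn, SurvivorRatio and the CMS fraction."""
    values = {}
    for flag in opts.split():
        if flag.startswith("-Xmx"):
            values["heap"] = parse_size(flag[4:])
        elif flag.startswith("-Xmn"):
            values["young"] = parse_size(flag[4:])
        elif flag.startswith("-XX:SurvivorRatio="):
            values["ratio"] = int(flag.split("=", 1)[1])
        elif flag.startswith("-XX:CMSInitiatingOccupancyFraction="):
            values["fraction"] = int(flag.split("=", 1)[1])
    old = values["heap"] - values["young"]
    # young = eden + 2 * survivor and eden = ratio * survivor
    survivor = values["young"] // (values["ratio"] + 2)
    return {
        "old_bytes": old,
        "survivor_bytes": survivor,
        "eden_bytes": values["young"] - 2 * survivor,
        # Old-generation occupancy at which a CMS cycle is started.
        "cms_trigger_bytes": old * values["fraction"] // 100,
    }


# Subset of the flags from the HBASE_OPTS line above.
OPTS = "-Xms16g -Xmx16g -Xmn2g -XX:SurvivorRatio=2 -XX:CMSInitiatingOccupancyFraction=70"
GIB = 1024 ** 3
for name, size in heap_layout(OPTS).items():
    print(f"{name}: {size / GIB:.2f} GiB")
```

With these flags the estimate is a 14 GiB old generation, with a CMS cycle starting once it is roughly 70% full (about 9.8 GiB), which leaves headroom for the concurrent collection to finish before a full GC is forced.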

Finally, start the RegionServer:
Command: hbase-daemon.sh start regionserver
To enable the balancer, run in the HBase shell: balance_switch true
