Recently, while upgrading our Hadoop cluster, HBase occasionally became unable to serve reads or writes. The region server process had not died, so I ran jstack against it (jstack <pid>) to see what was going on:
Found one Java-level deadlock:
=============================
"IPC Server handler 127 on 60020":
  waiting to lock monitor 0x00007fa185def330 (object 0x000000057becf198, a java.lang.Object),
  which is held by "IPC Server handler 21 on 60020"
"IPC Server handler 21 on 60020":
  waiting to lock monitor 0x00007fa1984780b0 (object 0x000000057b96e178, a java.lang.Object),
  which is held by "regionserver60020.logRoller"
"regionserver60020.logRoller":
  waiting to lock monitor 0x00007fa185def330 (object 0x000000057becf198, a java.lang.Object),
  which is held by "IPC Server handler 21 on 60020"

Java stack information for the threads listed above:
===================================================
"IPC Server handler 127 on 60020":
        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1305)
        - waiting to lock <0x000000057becf198> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1460)
        at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2231)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:1968)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3428)
        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1389)
"IPC Server handler 21 on 60020":
        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1314)
        - waiting to lock <0x000000057b96e178> (a java.lang.Object)
        - locked <0x000000057becf198> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1460)
        at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2231)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:1968)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3428)
        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1389)
"regionserver60020.logRoller":
        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1305)
        - waiting to lock <0x000000057becf198> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1283)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1456)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.cleanupCurrentWriter(HLog.java:876)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:657)
        - locked <0x000000057b96e178> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:94)
        at java.lang.Thread.run(Thread.java:662)
Clearly, the logRoller thread and an HLog.syncer thread have deadlocked against each other.
HBase uses a WAL (Write Ahead Log) to guarantee that no data is lost. Each region server has a single HLog: every edit is written to the HLog first and only then to the MemStore. HLog files are rolled: once the current file reaches its maximum size (by default roughly block size * 0.95), a new HLog file is started, and the logRoller thread is responsible for this. The buffered edits, in turn, are synced to the HLog file about once per second, which is the job of HLog.syncer. logRoller and HLog.syncer must not run at the same time, or the log would be corrupted, so the two threads exclude each other with locks. The sketch below shows how exactly that mutual exclusion can go wrong.
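To make the failure mode concrete, here is a minimal, self-contained Java sketch of the same lock-order inversion the jstack dump shows. The names (LogDeadlockDemo, SYNC_LOCK, ROLL_LOCK, syncer, rollWriter) are illustrative stand-ins for the two monitors in the dump, not the actual HLog fields: one path takes the sync monitor and then the roll monitor, while the other takes them in the opposite order.

// Minimal sketch of the lock-order inversion from the jstack dump above.
// All names here are hypothetical stand-ins, not real HLog code.
public class LogDeadlockDemo {
    // Stand-ins for the two monitors in the dump (0x...f198 and 0x...e178).
    private static final Object SYNC_LOCK = new Object();
    private static final Object ROLL_LOCK = new Object();

    // Sync path: takes SYNC_LOCK first, then ROLL_LOCK
    // (like handler 21: locked 0x...f198, waiting on 0x...e178).
    static void syncer() {
        synchronized (SYNC_LOCK) {
            pause(100); // widen the race window
            synchronized (ROLL_LOCK) {
                // flush buffered edits to the WAL file
            }
        }
    }

    // Roll path: takes ROLL_LOCK first, then calls back into the sync path
    // (like logRoller: locked 0x...e178 in rollWriter, then blocked in
    // syncer waiting on 0x...f198).
    static void rollWriter() {
        synchronized (ROLL_LOCK) {
            pause(100);
            synchronized (SYNC_LOCK) {
                // close the old writer, open a new WAL file
            }
        }
    }

    static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        new Thread(LogDeadlockDemo::syncer, "handler").start();
        new Thread(LogDeadlockDemo::rollWriter, "logRoller").start();
        // Both threads block forever; running jstack against this JVM
        // prints a "Found one Java-level deadlock" report of the same
        // shape as the dump above.
    }
}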
Reading the code shows that the deadlock is triggered when HLog.syncer hits an IOException while writing to HDFS. After some googling, it turned out to be a known bug in HBase 0.94.1, the version our cluster runs; it has since been fixed upstream: https://issues.apache.org/jira/browse/HBASE-7728.
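The general cure for this class of bug is to make every code path acquire the locks in one global order, or to let one side back off instead of blocking forever. Below is a minimal sketch of the back-off variant using java.util.concurrent.locks.ReentrantLock; note that HLog 0.94 uses plain synchronized monitors, and this is only an illustration of the pattern, not the actual HBASE-7728 patch:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Generic back-off pattern for avoiding lock-order deadlocks.
// Hypothetical illustration only; not the HBASE-7728 fix.
public class BackoffLocking {
    private final ReentrantLock syncLock = new ReentrantLock();
    private final ReentrantLock rollLock = new ReentrantLock();

    // Take both locks; if the second cannot be acquired quickly,
    // release the first and retry, so no thread ever holds one lock
    // while blocking forever on the other.
    void withBothLocks(Runnable criticalSection) throws InterruptedException {
        while (true) {
            syncLock.lock();
            try {
                if (rollLock.tryLock(50, TimeUnit.MILLISECONDS)) {
                    try {
                        criticalSection.run();
                        return;
                    } finally {
                        rollLock.unlock();
                    }
                }
            } finally {
                syncLock.unlock();
            }
            Thread.sleep(10); // back off briefly before retrying to avoid livelock
        }
    }
}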
The real reason this HBase bug fired so often is the Hadoop cluster upgrade itself. The upgrade drove up the cluster's disk I/O, and with some large jobs being tested on top of that, IOExceptions in HLog.syncer became far more likely, so the region server appeared dead (while the process stayed alive) far more frequently.