Symptoms
As the hbase user, run the following command:
hbase hbck -details
This checks the state of all regions and RegionServers. The output reports a hole in the stored data:
“ERROR: There is a hole in the region chain between …… You need to create a new .regioninfo and region dir in hdfs to plug the hole. ERROR: Found inconsistency in table TestTable”.
The full error looks like this:
ERROR: There is a hole in the region chain between TestTable,2,1415170922328.3c1b2a210888171d142059912e2faba1. and TestTable,3,1415171044919.da852e5b0034a2ca83f6966280454b4a. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: Found inconsistency in table TestTable
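When hbck prints many lines, it can help to pull out just the region pairs that surround each hole. The following is a minimal sketch, not part of HBase: the function names are hypothetical, and it assumes region names contain no whitespace and start keys contain no commas, as in the sample error above.

```python
import re

# Matches the hole message printed by `hbase hbck -details`.
# Assumption: region names contain no whitespace, as in the sample above.
HOLE_RE = re.compile(
    r"ERROR: There is a hole in the region chain between (\S+) and (\S+)"
)


def find_holes(hbck_output):
    """Return a (left_region, right_region) pair for each reported hole."""
    return HOLE_RE.findall(hbck_output)


def split_region_name(region_name):
    """Split '<table>,<start key>,<timestamp>.<encoded name>.' into its parts.

    Assumption: the start key itself contains no comma.
    """
    table, start_key, rest = region_name.split(",", 2)
    timestamp, encoded = rest.rstrip(".").split(".", 1)
    return table, start_key, timestamp, encoded


if __name__ == "__main__":
    import sys

    # Read saved hbck output from standard input and list each hole.
    for left, right in find_holes(sys.stdin.read()):
        table, left_key, _, _ = split_region_name(left)
        _, right_key, _, _ = split_region_name(right)
        print(table, "hole between start keys", left_key, "and", right_key)
```

Saving the hbck output to a file and feeding it to this script on standard input should list one line per hole.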
Cause analysis
This problem occurs when the meta regions have not been assigned yet and the preScannerOpen coprocessor waits to read the meta table for local indexes. The region-opening threads (openregion threads) then wait forever, which results in a deadlock.
It can be resolved by increasing the number of threads available for opening regions, so that the meta regions can still be assigned while the threads for the local index table are waiting, which breaks the deadlock.
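The fix is to change the number of region-opening threads in the RegionServer advanced configuration snippet for hbase-site.xml, which increases the RegionServer's capacity to process regions: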
<property>
<name>hbase.regionserver.executor.openregion.threads</name>
<value>100</value>
</property>
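Before restarting the RegionServers, it may be worth confirming that the property actually made it into the file. The sketch below is an illustrative helper, not an HBase tool: get_property is a hypothetical name, and it assumes the file content is well-formed XML in the usual property/name/value layout.

```python
import xml.etree.ElementTree as ET


def get_property(site_xml_text, name):
    """Return the value of the named property as a string, or None if absent."""
    root = ET.fromstring(site_xml_text)
    # iter() also visits the root itself, so a bare <property> fragment works too.
    for prop in root.iter("property"):
        if (prop.findtext("name", default="") or "").strip() == name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    import sys

    # Usage: pass the path to hbase-site.xml as the first argument.
    with open(sys.argv[1], encoding="utf-8") as handle:
        value = get_property(
            handle.read(), "hbase.regionserver.executor.openregion.threads"
        )
    print("openregion threads:", value if value is not None else "not set")
```

A result of "not set" means the file does not override the value, so the RegionServer falls back to its built-in default.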
Further tuning the HBase cluster
The thread parameters related to the HBase RegionServer are the following:
hbase.regionserver.executor.openregion.threads (default: 3)
hbase.regionserver.executor.openroot.threads (default: 1)
hbase.regionserver.executor.openmeta.threads (default: 1)
hbase.regionserver.executor.closeregion.threads (default: 3)
hbase.regionserver.executor.closeroot.threads (default: 1)
hbase.regionserver.executor.closemeta.threads (default: 1)
All of these parameters should be tuned. For the specific values to use, refer to community feedback.
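To avoid typing six property blocks by hand, the fragment can be generated. This is a minimal sketch under stated assumptions: render_properties is a hypothetical helper, the starting values are simply the defaults listed above, and the only override shown is the openregion value of 100 used earlier in this article. It does not recommend values for the other parameters.

```python
from xml.sax.saxutils import escape

# Defaults as listed in this article; they are placeholders, not tuned values.
DEFAULT_THREADS = {
    "hbase.regionserver.executor.openregion.threads": 3,
    "hbase.regionserver.executor.openroot.threads": 1,
    "hbase.regionserver.executor.openmeta.threads": 1,
    "hbase.regionserver.executor.closeregion.threads": 3,
    "hbase.regionserver.executor.closeroot.threads": 1,
    "hbase.regionserver.executor.closemeta.threads": 1,
}


def render_property(name, value):
    """Render one property block in the layout used by hbase-site.xml."""
    return "<property>\n<name>{}</name>\n<value>{}</value>\n</property>".format(
        escape(name), escape(str(value))
    )


def render_properties(overrides):
    """Merge overrides into the defaults and render all six property blocks."""
    unknown = set(overrides) - set(DEFAULT_THREADS)
    if unknown:
        raise KeyError("unknown parameter(s): " + ", ".join(sorted(unknown)))
    merged = {**DEFAULT_THREADS, **overrides}
    return "\n".join(render_property(n, v) for n, v in merged.items())


if __name__ == "__main__":
    # Only openregion is overridden here, using the value from this article.
    print(render_properties({"hbase.regionserver.executor.openregion.threads": 100}))
```

Rejecting unknown names is a deliberate choice: a misspelled parameter in hbase-site.xml is silently ignored by HBase, so catching it at generation time is cheaper.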
References:
1. https://cloud.tencent.com/developer/article/1359221
2. https://community.cloudera.com/t5/Support-Questions/Can-phoenix-local-indexes-create-a-deadlock-during-an-HBase/m-p/101396