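Our HBase 2 cluster had an outage a while ago that took a whole weekend to fix, so I went and collected some notes on HBase failures and how to repair them. The incident itself is described in an earlier post: "A Harrowing Experience of an HBase Cluster Going Down" (Hbase集群挂掉的一次惊险经历).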
1. HBCK consistency
Consistency means that a Region's information agrees in three places: the hbase:meta table, the online RegionServers, and the .regioninfo files on HDFS.
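The three-way agreement above can be pictured with a small shell sketch. The function name `region_consistency` and its y/n arguments are hypothetical, made up for illustration; this is not an hbck interface. It is also a simplification: a region that is legitimately offline (for example, in a disabled table) would be reported here as missing from a RegionServer.

```shell
# Report where a region is missing, given three yes/no observations.
# Hypothetical helper for illustration only (not part of hbck):
#   $1 = present in hbase:meta?   $2 = present on HDFS?
#   $3 = deployed on a RegionServer?   (each "y" or "n")
region_consistency() {
  missing=""
  [ "$1" = y ] || missing="$missing meta"
  [ "$2" = y ] || missing="$missing hdfs"
  [ "$3" = y ] || missing="$missing regionserver"
  if [ -z "$missing" ]; then
    echo "consistent"
  else
    # $missing starts with a space, so no extra separator is needed here.
    echo "inconsistent: missing from$missing"
  fi
}

region_consistency y y y   # all three sources agree
region_consistency n n y   # deployed, but unknown to meta and HDFS
```

The second call corresponds to a region that is running on a RegionServer while neither meta nor HDFS knows about it.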
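2. HBCK2 and hbck1
HBCK2 is the successor to hbck, the repair tool that shipped with hbase-1.x (a.k.a. hbck1). Use HBCK2 instead of hbck1 to repair hbase-2.x clusters. hbck1 should not be run against an hbase-2.x install; it may do damage. hbck1 is still bundled with hbase-2.x (to minimize surprise), but it is deprecated and will be removed in hbase-3.x. Its write facility (-fix) has been removed. It can still report on the state of an hbase-2.x cluster, but its assessment will be inaccurate because it does not understand the internals of hbase-2.x.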
My HBase version here is 2.0.0-cdh6.0.1, and hbase hbck -h prints:
-----------------------------------------------------------------------
NOTE: As of HBase version 2.0, the hbck tool is significantly changed.
In general, all Read-Only options are supported and can be be used
safely. Most -fix/ -repair options are NOT supported. Please see usage
below for details on which options are not supported.
-----------------------------------------------------------------------
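In HBase 2.0.x, hbck no longer works as a repair tool: most read-only options still run, but the fix options do not run at all. For HBase 2 you have to download the repair tool from the official site and build it yourself. I don't know what the HBase team was thinking; bundling it into the shell commands would have been much nicer than making users compile it, especially as more and more companies upgrade from 1.x to 2.x.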
NOTE: Following options are NOT supported as of HBase version 2.0+.
UNSUPPORTED Metadata Repair options: (expert features, use with caution!)
-fix Try to fix region assignments. This is for backwards compatiblity
-fixAssignments Try to fix region assignments. Replaces the old -fix
-fixMeta Try to fix meta problems. This assumes HDFS region info is good.
-fixHdfsHoles Try to fix region holes in hdfs.
-fixHdfsOrphans Try to fix region dirs with no .regioninfo file in hdfs
-fixTableOrphans Try to fix table dirs with no .tableinfo file in hdfs (online mode only)
-fixHdfsOverlaps Try to fix region overlaps in hdfs.
-maxMerge <n> When fixing region overlaps, allow at most <n> regions to merge. (n=5 by default)
-sidelineBigOverlaps When fixing region overlaps, allow to sideline big overlaps
-maxOverlapsToSideline <n> When fixing region overlaps, allow at most <n> regions to sideline per group. (n=2 by default)
-fixSplitParents Try to force offline split parents to be online.
-removeParents Try to offline and sideline lingering parents and keep daughter regions.
-fixEmptyMetaCells Try to fix hbase:meta entries not referencing any region (empty REGIONINFO_QUALIFIER rows)
UNSUPPORTED Metadata Repair shortcuts
-repair Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps -fixReferenceFiles-fixHFileLinks
-repairHoles Shortcut for -fixAssignments -fixMeta -fixHdfsHoles
In HBase 2, the hbck repair options are not supported; use the HBCK2 commands instead, which are covered later.
3. hbck consistency check and repair commands
Consistency check command
hbase hbck <-details> <table name>
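When the read-only check runs from a script, it helps to pull out just the final verdict. The sketch below assumes the hbck report ends with a line of the form "Status: OK" or "Status: INCONSISTENT"; `hbck_status` is a hypothetical helper name and `my_table` is a placeholder.

```shell
# Extract the final verdict from an hbck report read on stdin.
# Assumption: the report contains a line "Status: OK" or
# "Status: INCONSISTENT"; if no such line exists, nothing is printed.
hbck_status() {
  grep '^Status: ' | tail -n 1 | cut -d' ' -f2
}

# Typical use against a live cluster (read-only; my_table is a placeholder):
#   hbase hbck -details my_table 2>&1 | hbck_status
```

Only the read-only check is piped through here, so this is safe to use on an HBase 2 cluster, with the caveat noted earlier that hbck1's assessment of hbase-2.x may be inaccurate.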
Consistency repair
hbase hbck <-fixMeta> <-fixAssignments> <table name>
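Command details
-fixMeta: Try to fix meta problems. This assumes HDFS region info is good.
Repairs meta with HDFS as the source of truth: a region that exists on HDFS is added to meta, and a meta entry whose region does not exist on HDFS is deleted.
-fixAssignments: Try to fix region assignments. Replaces the old -fix
The action depends on the situation and includes taking a region offline, closing it, and bringing it back online.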
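The -fixMeta rule (HDFS wins) comes down to a set difference between the regions listed in meta and the regions found on HDFS. The sketch below only illustrates that rule on two plain text files. `plan_fix_meta` and both input files are hypothetical; it does not touch a cluster and is not how hbck is implemented.

```shell
# Print the meta changes implied by "HDFS is the source of truth".
#   $1 = file listing regions found in hbase:meta (one name per line)
#   $2 = file listing regions found on HDFS       (one name per line)
# Both files are hypothetical inputs prepared by the operator.
plan_fix_meta() {
  meta_sorted=$(mktemp)
  hdfs_sorted=$(mktemp)
  sort -u "$1" > "$meta_sorted"   # comm requires sorted input
  sort -u "$2" > "$hdfs_sorted"
  # Lines only in the HDFS list: the region should be added to meta.
  comm -13 "$meta_sorted" "$hdfs_sorted" | sed 's/^/add-to-meta /'
  # Lines only in the meta list: the entry should be deleted from meta.
  comm -23 "$meta_sorted" "$hdfs_sorted" | sed 's/^/delete-from-meta /'
  rm -f "$meta_sorted" "$hdfs_sorted"
}

# Usage (file names are placeholders):
#   plan_fix_meta meta_regions.txt hdfs_regions.txt
```

Printing a plan instead of acting on it keeps the sketch read-only, which matters because these fix options are not supported on HBase 2.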
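4. Locating and repairing hbck inconsistencies
In what ways can a region be inconsistent across meta, the RegionServers, and HDFS, and how is each case repaired? Use the list of errors below to identify the problem and its fix: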
Inconsistency | Error message | Fix |
---|---|---|
Case 1 | Region Is Not In Hbase:Meta | |
The Region's information exists neither in meta nor on HDFS, yet the Region is deployed on a RegionServer. | errors.reportError(ERROR_CODE.NOT_IN_META_HDFS, "Region " + descriptiveName + ", key=" + key + ", not on HDFS or in hbase:meta but " + "deploye |