1. Create the command script compact.sh
Contents:
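#!/bin/bash
# Request a major compaction of every HBase table, with the load balancer switched off meanwhile.
time_start=$(date "+%Y-%m-%d %H:%M:%S")
echo "Starting HBase major compaction. Time: ${time_start}"
# The last line printed by `list` is expected to be the array of table names, e.g. ["a", "b", "c"]
str=$(echo list | hbase shell | sed -n '$p')
#str="a,b,c"
str=${str//,/ }
arr=($str)
length=${#arr[@]}
current=1
echo "There are ${length} HBase tables to compact."
# Redirect with '>' only; the original '| > /dev/null' piped into an empty command
echo "balance_switch false" | hbase shell > /dev/null
echo "The HBase load balancer has been turned off."
for each in "${arr[@]}"
do
# Strip the brackets; the double quotes stay so that hbase shell receives a string literal
table=$(echo "$each" | sed 's/]//g' | sed 's/\[//g')
echo "Compacting table ${current}/${length}, table name: ${table}"
# major_compact only queues the compaction; the region servers carry it out asynchronously
echo "major_compact ${table}" | hbase shell > /dev/null
let current=current+1
done
echo "balance_switch true" | hbase shell > /dev/null
echo "The HBase load balancer has been turned on."
time_end=$(date "+%Y-%m-%d %H:%M:%S")
echo "HBase major compaction requests finished. Time: ${time_end}"
# Elapsed seconds; 'date -d' requires GNU date
((duration=$(date +%s -d "$time_end")-$(date +%s -d "$time_start")))
echo "Elapsed: ${duration}s"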
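The table-name parsing in compact.sh can be tried without a cluster. The sketch below assumes the last line printed by `echo list | hbase shell` is a Ruby-style array such as ["t1", "t2"]. The function name parse_table_list is made up for this illustration and is not part of HBase.

```shell
#!/bin/bash
# Hypothetical helper: turn the array line printed by `list` into one table name per line.
# Assumption: the input looks like ["t1", "t2"], optionally preceded by other text such as "=> ".
parse_table_list() {
  local line=$1
  line=${line#*\[}     # drop everything up to and including the first '['
  line=${line%\]*}     # drop the last ']' and anything after it
  line=${line//\"/}    # drop the double quotes
  line=${line//,/ }    # commas become spaces
  local name
  for name in $line; do   # rely on word splitting; table names contain no whitespace
    printf '%s\n' "$name"
  done
}
```

Because this helper removes the double quotes, a caller has to add quotes back when building the shell command, for example echo "major_compact '${table}'" | hbase shell.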
2. Create the background launcher script start.sh
Contents:
nohup ./compact.sh > log 2>&1 &
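A possible variant of start.sh also records the process ID, so the run can be checked or stopped later. This is only a sketch: the function name start_background and the pid file are assumptions of this example, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: run a command under nohup, send its output to a log file,
# and store the background process ID in a pid file.
start_background() {
  local log_file=$1 pid_file=$2
  shift 2
  nohup "$@" > "$log_file" 2>&1 &
  echo $! > "$pid_file"    # $! is the PID of the job just started
}
```

For example, start_background log compact.pid ./compact.sh starts the compaction script; tail -f log follows its progress, and kill "$(cat compact.pid)" stops it.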
Query the key ranges of the regions in HBase
scan "hbase:meta", {COLUMNS => ['info:regioninfo'] }
Output:
regionName start-rowkey end-key
one,,4343.6587. value={STARTKEY => '', ENDKEY => '1000000'}
one,100,bb9b6a. value={STARTKEY => '1000000', ENDKEY => '2000000'}
one,200,bff9ac. value={STARTKEY => '2000000', ENDKEY => '3000000'}
one,300,832e19. value={STARTKEY => '3000000', ENDKEY => ''}
-------------------------------------------------------------------------------------------
one,200,bff9ac. full region name
bff9ac encoded region name
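The encoded name can be cut out of a full region name with shell parameter expansion. The sketch assumes the layout shown in the listing above, where the encoded name is the last component before the trailing dot. The function name encoded_region_name is made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: extract the encoded region name from a full region name.
# Assumption: the full name ends with '.', and the encoded name is whatever
# follows the last '.' or ',' before that trailing dot.
encoded_region_name() {
  local full=${1%.}                  # drop the trailing '.'
  printf '%s\n' "${full##*[.,]}"     # keep the text after the last '.' or ','
}
```

With the names from the listing, encoded_region_name 'one,200,bff9ac.' prints bff9ac.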
---------------------------------------------------------------------------------------------------------
Merge regions from the command line (the HBase cluster must be stopped for this merge, and the two regions must be adjacent). Use the full region names.
hbase org.apache.hadoop.hbase.util.Merge one \
one,100,bb9b6a. \
one,200,bff9ac.
--------------------------------------------------------------------------------------------------------
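Regions can also be merged with the HBase cluster running, from the hbase shell. Use the encoded region names.
hbase(main):006:0> merge_region 'ENCODED_NAME_OF_REGION1', 'ENCODED_NAME_OF_REGION2', true
(the third argument, true, is optional: whether to force the merge)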
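The online merge can also be scripted by piping the command into hbase shell, the same way compact.sh does. The sketch below only builds the command text; the function name merge_region_cmd is an assumption of this example.

```shell
#!/bin/bash
# Hypothetical helper: build a merge_region command for hbase shell
# from two encoded region names; pass "true" as third argument to force the merge.
merge_region_cmd() {
  local first=$1 second=$2 force=${3:-}
  if [ "$force" = "true" ]; then
    printf "merge_region '%s', '%s', true\n" "$first" "$second"
  else
    printf "merge_region '%s', '%s'\n" "$first" "$second"
  fi
}
```

For example, merge_region_cmd bb9b6a bff9ac | hbase shell sends the merge request for the two adjacent regions from the listing. Printing the command first, without the pipe, is a cheap dry run.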
------------------------------------------------------------------------------------------------------