1, Check which compression codecs the system supports
[root@cdh-node1 ~]# hbase org.apache.hadoop.util.NativeLibraryChecker
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
20/03/26 01:00:28 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
20/03/26 01:00:28 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
snappy: true /cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/hadoop/lib/native/libsnappy.so.1
lz4: true revision:10301
bzip2: true /lib64/libbz2.so.1
openssl: false Cannot load libcrypto.so (libcrypto.so: cannot open shared object file: No such file or directory)!
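NativeLibraryChecker only shows which native libraries load; before enabling a codec on a table it is also worth confirming that HBase itself can write and read an HFile with it, using the CompressionTest tool shipped with HBase (the target path below is just scratch space and is illustrative):

```shell
# Run on a node with the hbase CLI available; the tool writes a test file
# with the given codec and reads it back.
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/compression-test snappy
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/compression-test lz4
```

If the codec is usable the tool ends with SUCCESS; a missing native library shows up here immediately instead of failing later at table-creation or compaction time.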
2, Compare storefile sizes with and without compression and DATA_BLOCK_ENCODING
a, Small table
#################### Before/after SNAPPY compression ######################
#create 'test', {NAME=>'f', COMPRESSION=>'LZ4', DATA_BLOCK_ENCODING=>'FAST_DIFF'},{SPLITS=>['A','B']}
create 'ttt','f'
put 'ttt','r1','f:name','a'
put 'ttt','r2','f:name','a'
put 'ttt','r3','f:name','a'
put 'ttt','r4','f:name','a'
put 'ttt','r5','f:name','a'
put 'ttt','r6','f:name','a'
put 'ttt','r7','f:name','a'
put 'ttt','r8','f:name','a'
flush 'ttt'
# Check the storefile size
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/ttt/*/f
Found 1 items
-rw-r--r-- 3 hbase hbase 5111 2020-03-26 00:01 /hbase/data/default/ttt/ed861da62c3d1a366255c695b355c748/f/b8650952ccba48f5a80cfe7fd8185dba
##### Switch to SNAPPY compression
alter 'ttt',{NAME=>'f',COMPRESSION=>'SNAPPY'}
major_compact 'ttt'
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/ttt/*/f
Found 1 items
-rw-r--r-- 3 hbase hbase 4948 2020-03-26 00:03 /hbase/data/default/ttt/ed861da62c3d1a366255c695b355c748/f/ad7a2bfd6c8641629710865761e46ba4
alter 'ttt',{NAME=>'f',DATA_BLOCK_ENCODING=>'FAST_DIFF'}
major_compact 'ttt'
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/ttt/*/f
Found 1 items
-rw-r--r-- 3 hbase hbase 4960 2020-03-26 00:05 /hbase/data/default/ttt/ed861da62c3d1a366255c695b355c748/f/134b159b41334d418151ba1c1dbf7f27
#################### Before/after LZ4 compression ######################
create 'ttt2','f'
put 'ttt2','r1','f:name','a'
put 'ttt2','r2','f:name','a'
put 'ttt2','r3','f:name','a'
put 'ttt2','r4','f:name','a'
put 'ttt2','r5','f:name','a'
put 'ttt2','r6','f:name','a'
put 'ttt2','r7','f:name','a'
put 'ttt2','r8','f:name','a'
flush 'ttt2'
# Check the storefile size
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/ttt2/*/f
Found 1 items
-rw-r--r-- 3 hbase hbase 5111 2020-03-26 00:07 /hbase/data/default/ttt2/1997dc851df78b546a671f617a1901bf/f/d105d3848dfc44a9b455f7e7d06c8e39
##### Switch to LZ4 compression
alter 'ttt2',{NAME=>'f',COMPRESSION=>'LZ4'}
major_compact 'ttt2'
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/ttt2/*/f
Found 1 items
-rw-r--r-- 3 hbase hbase 4940 2020-03-26 00:09 /hbase/data/default/ttt2/1997dc851df78b546a671f617a1901bf/f/bcb4fe47eb78406599b1a0c744da3c95
alter 'ttt2',{NAME=>'f',DATA_BLOCK_ENCODING=>'FAST_DIFF'}
major_compact 'ttt2'
[root@cdh-node1 ~]# hdfs dfs -ls /hbase/data/default/ttt2/*/f
Found 1 items
-rw-r--r-- 3 hbase hbase 4955 2020-03-26 00:11 /hbase/data/default/ttt2/1997dc851df78b546a671f617a1901bf/f/b5d774a1aa94491984803af1327b6614
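On these tiny 8-row tables compression barely helps, and FAST_DIFF even adds a few bytes back (4948 → 4960 and 4940 → 4955): an HFile's fixed metadata (trailer, block index, file info) dominates a file this small. A quick sanity check on the measured sizes from the listings above:

```shell
# Sizes (bytes) copied from the hdfs dfs -ls output above.
# Both codecs save only about 3% on the 8-row table.
awk 'BEGIN {
  base = 5111
  printf "snappy saving: %.1f%%\n", (1 - 4948/base) * 100
  printf "lz4 saving:    %.1f%%\n", (1 - 4940/base) * 100
}'
```

This is why the large-table test below is the meaningful one: only with enough data per block does the codec get something to work with.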
b, Large table (100,000 rows): 900 MB --> 200 MB --> 100 MB
==== Default format: no compression
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG
948.3 M 948.3 M /hbase/data/default/GLYY_DIAG
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/*/f
486.7 M 486.7 M /hbase/data/default/GLYY_DIAG/17e36f9e1b7da30f120e9221199235c0/f
461.6 M 461.6 M /hbase/data/default/GLYY_DIAG/ba4d20807af0ee98fbd9427bc7f4c69f/f
==== LZ4 compression
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG
198.9 M 198.9 M /hbase/data/default/GLYY_DIAG
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/*/f
103.4 M 103.4 M /hbase/data/default/GLYY_DIAG/17e36f9e1b7da30f120e9221199235c0/f
95.5 M 95.5 M /hbase/data/default/GLYY_DIAG/ba4d20807af0ee98fbd9427bc7f4c69f/f
==== SNAPPY compression
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/
177.2 M 177.2 M /hbase/data/default/GLYY_DIAG
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/*/f
91.6 M 91.6 M /hbase/data/default/GLYY_DIAG/17e36f9e1b7da30f120e9221199235c0/f
85.5 M 85.5 M /hbase/data/default/GLYY_DIAG/ba4d20807af0ee98fbd9427bc7f4c69f/f
==== LZ4 compression + DATA_BLOCK_ENCODING='FAST_DIFF'
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/
96.3 M 96.3 M /hbase/data/default/GLYY_DIAG
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/*/f
50.0 M 50.0 M /hbase/data/default/GLYY_DIAG/17e36f9e1b7da30f120e9221199235c0/f
46.4 M 46.4 M /hbase/data/default/GLYY_DIAG/ba4d20807af0ee98fbd9427bc7f4c69f/f
==== SNAPPY compression + DATA_BLOCK_ENCODING='FAST_DIFF'
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/
98.6 M 98.6 M /hbase/data/default/GLYY_DIAG
[test@eadage ~]$ hdfs dfs -du -h -s /hbase/data/default/GLYY_DIAG/*/f
51.3 M 51.3 M /hbase/data/default/GLYY_DIAG/17e36f9e1b7da30f120e9221199235c0/f
47.4 M 47.4 M /hbase/data/default/GLYY_DIAG/ba4d20807af0ee98fbd9427bc7f4c69f/f
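Putting the large-table numbers together, as a percentage of the 948.3 MB uncompressed baseline:

```shell
# Table totals (MB) copied from the hdfs dfs -du output above.
awk 'BEGIN {
  base = 948.3
  printf "lz4:                %.1f%%\n", 198.9/base * 100
  printf "snappy:             %.1f%%\n", 177.2/base * 100
  printf "lz4 + FAST_DIFF:    %.1f%%\n",  96.3/base * 100
  printf "snappy + FAST_DIFF: %.1f%%\n",  98.6/base * 100
}'
```

Compression alone cuts the table to roughly a fifth of its size, adding FAST_DIFF block encoding roughly halves it again, and LZ4 vs SNAPPY is essentially a wash at this scale.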
3, Service tuning
### Increase the memstore percentage: raise hbase.regionserver.global.memstore.upperLimit from 0.4 to 0.6
Per the official recommendation, on write-heavy RegionServers you can raise the memstore percentage and lower the block cache share (hfile.block.cache.size) accordingly; the extra memstore headroom lets each server host more regions.
### Increase the blocking multiplier: raise hbase.hregion.memstore.block.multiplier from 2 to 4
This is a safety threshold for blocking client update requests: when a region's memstore reaches multiplier × hbase.hregion.memstore.flush.size, further writes to it are blocked until a flush completes.
With enough memory headroom, raising this value lets the server absorb bursts of write traffic more smoothly.
### Increase the blocking StoreFiles threshold: raise hbase.hstore.blockingStoreFiles (set equal to hbase.hstore.compaction.max) from 10 to 100
When a store accumulates 10 storefiles, writes to that region are blocked until compaction catches up; raising this value is strongly recommended for write-heavy workloads.
### Reduce the maximum WAL file count: lower hbase.regionserver.maxlogs from 32 to 16
This controls flush frequency; for write-heavy applications the default is too high. Lowering it forces the server to flush memstores to disk more often, so the WAL files whose edits are already persisted can be deleted sooner.
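The four changes above map to hbase-site.xml entries like the following (a sketch with the tuned values from this section; on a CDH cluster these are normally set through Cloudera Manager safety valves rather than by editing the file directly):

```xml
<!-- hbase-site.xml: tuned values discussed above -->
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.6</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>100</value>
</property>
<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>16</value>
</property>
```

RegionServers must be restarted (or rolling-restarted) for these to take effect.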