After disabling HBase's automatic region splitting in the configuration, the table ended up with only two regions of roughly 100 GB each, and query performance was poor.
There are four built-in split policies. I tried KeyPrefixRegionSplitPolicy, but it did not carve the table directly into 100 regions as expected (this policy only adjusts where a split point falls so that rows sharing a prefix stay in one region; it does not by itself trigger splits of existing regions).
The settings were as follows, taken from http://blog.javachen.com/2014/01/16/hbase-region-split-policy/ (KeyPrefixRegionSplitPolicy):
// Update the split policy of an existing table
HBaseAdmin admin = new HBaseAdmin(conf);
HTable hTable = new HTable(conf, "test");
HTableDescriptor htd = hTable.getTableDescriptor();
HTableDescriptor newHtd = new HTableDescriptor(htd);
// Specify the split policy
newHtd.setValue(HTableDescriptor.SPLIT_POLICY, KeyPrefixRegionSplitPolicy.class.getName());
// Group rows by the first 2 bytes of the row key
newHtd.setValue("prefix_split_key_policy.prefix_length", "2");
newHtd.setValue("MEMSTORE_FLUSHSIZE", "5242880"); // 5 MB
admin.disableTable("test");
admin.modifyTable(Bytes.toBytes("test"), newHtd);
admin.enableTable("test");
In the end I switched to pre-splitting the table at creation time:
create 'test', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '1', TTL => '5184000', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '8192', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'},{SPLITS_FILE => 'splits.txt'}
splits.txt contains:
1000000000000
1100000000000
1200000000000
1300000000000
1400000000000
1500000000000
1600000000000
1700000000000
1800000000000
1900000000000
2000000000000
2100000000000
2200000000000
.....
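The split points above all follow one pattern: a two-digit prefix (10, 11, 12, ...) right-padded with zeros to 13 characters. A minimal sketch that could generate such a splits file; the class and method names are mine, and the prefix range 10–99 is an assumption based on the visible entries:

```java
import java.util.ArrayList;
import java.util.List;

public class GenerateSplits {
    // Build split keys: each two-digit prefix padded with '0' to keyLength characters.
    public static List<String> splitKeys(int fromPrefix, int toPrefix, int keyLength) {
        List<String> keys = new ArrayList<>();
        for (int p = fromPrefix; p <= toPrefix; p++) {
            StringBuilder sb = new StringBuilder(Integer.toString(p));
            while (sb.length() < keyLength) {
                sb.append('0');
            }
            keys.add(sb.toString());
        }
        return keys;
    }

    public static void main(String[] args) {
        // One line per split point, matching the splits.txt format above
        for (String key : splitKeys(10, 99, 13)) {
            System.out.println(key);
        }
    }
}
```

Writing these lines to splits.txt before running `create` pre-creates one region per prefix, so writes spread across the region servers from the start instead of hammering two oversized regions.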