Investigation notes before building an HBase cluster

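1. HBase cluster, local test (the related operations were done in Java):
Four virtual machines are set up as 2 HMasters and 3 HRegionServers, which gives high availability and also makes full use of the distributed capacity of the four machines.
Dependencies:
(1) ZooKeeper, for coordinating the HBase cluster
(2) HDFS, as the storage engine: one NameNode and four DataNodes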

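2. Partition key design (planning for 5 years of data):
Raw data: one record takes 60 KB on disk (57 KB for Shenzhen), and one record is produced every 5 seconds (12 per minute).
One year of data: 60 KB * 12 * 60 * 24 * 30 * 12 = 373,248,000 KB, roughly 360 GB.
Each record is stored as three replicas, so the five-year estimate is 3 * 5 * 360 GB = 5400 GB.
Leaving a 30% buffer (5400 GB * 1.3 = 7020 GB), the plan uses 7200 GB in total.
One HBase region holds 30 GB of data, so the table is pre-split into 7200 / 30 = 240 regions.
The split keys are set to 000| 001| 002| … 237| 238| (239 split points, which yields 240 regions).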
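The arithmetic and the split-key list above can be checked with a small sketch. The class and method names (`RegionPlan`, `yearlyKb`, `splitKeys`) are hypothetical and only illustrate the calculation; a year is counted as 12 months of 30 days, as in the estimate above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reproduces the sizing estimate and the split-key list.
final class RegionPlan {
    static final long RECORD_KB = 60;          // size of one record on disk
    static final long RECORDS_PER_MINUTE = 12; // one record every 5 seconds
    static final int REGION_COUNT = 240;       // 7200 GB / 30 GB per region

    /** Raw data per year in KB, counting a year as 12 months of 30 days. */
    static long yearlyKb() {
        return RECORD_KB * RECORDS_PER_MINUTE * 60 * 24 * 30 * 12;
    }

    /** Split keys "000|", "001|", ...: n regions need n - 1 split points. */
    static List<String> splitKeys(int regionCount) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < regionCount - 1; i++) {
            keys.add(String.format("%03d|", i));
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(yearlyKb() + " KB per year");
        List<String> keys = splitKeys(REGION_COUNT);
        System.out.println(keys.size() + " split keys: "
                + keys.get(0) + " .. " + keys.get(keys.size() - 1));
    }
}
```

To create the table, these strings would typically be converted to a `byte[][]` and passed as the split keys of the table-creation call in the HBase client; that step is left out here so the sketch runs without HBase on the classpath.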

(Figure: resulting region layout)

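3. Region number design (rowkey design)
Take the hash of the year-month-day modulo the number of regions and use it as the rowkey prefix, so that one day's data stays in a single region (useful for batch queries). Join the prefix with an underscore "_" to the full timestamp (year, month, day, hour, minute, second). The prefix is zero-padded to three digits so that it sorts consistently with the split keys 000| … 238|.
String time = "20200102121010"; // yyyyMMddHHmmss
String yearMonthDay = time.substring(0, 8); // yyyyMMdd, taken from the same timestamp
int region = Math.floorMod(yearMonthDay.hashCode(), 240); // floorMod never returns a negative value
String key = String.format("%03d", region) + "_" + time;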
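As a self-contained illustration of this rowkey scheme, the sketch below wraps the steps in a hypothetical `RowKeys` helper (the class and method names are not from the original notes). It assumes a three-digit zero-padded prefix; the `main` method shows why the padding matters, since an unpadded prefix such as `5_` compares greater than the last split key `238|` and would land in the final region.

```java
// Hypothetical helper illustrating the rowkey layout "<prefix>_<yyyyMMddHHmmss>".
final class RowKeys {
    static final int REGION_COUNT = 240;

    /** Three-digit, zero-padded region prefix derived from a yyyyMMdd day string. */
    static String prefix(String day) {
        // floorMod keeps the result in [0, REGION_COUNT) even for negative hash codes.
        int region = Math.floorMod(day.hashCode(), REGION_COUNT);
        return String.format("%03d", region);
    }

    /** Builds the rowkey from a 14-character yyyyMMddHHmmss timestamp. */
    static String rowKey(String timestamp) {
        if (timestamp == null || timestamp.length() != 14) {
            throw new IllegalArgumentException("expected yyyyMMddHHmmss, got: " + timestamp);
        }
        return prefix(timestamp.substring(0, 8)) + "_" + timestamp;
    }

    public static void main(String[] args) {
        System.out.println(rowKey("20200102121010"));
        // Byte-wise ordering of ASCII keys matches String.compareTo:
        System.out.println("5_20200102121010".compareTo("238|") > 0);   // unpadded sorts after the last split key
        System.out.println("005_20200102121010".compareTo("238|") < 0); // padded sorts before it
    }
}
```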

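4. Inserted data is stored under a single column family. Each day at 00:30 the previous day's data can be loaded; in testing, batch inserts completed within seconds.
5. When reading, a single scan can only fetch one day's data, so fetching several days requires a helper utility. Fetching 10,000 records took about 15 seconds in this local test; it should be much faster on real servers.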
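One possible shape for such a multi-day helper is sketched below. It only computes the start row (inclusive) and stop row (exclusive) for each day; every name in it (`DayScanRanges`, `Range`, `forDay`, `between`) is hypothetical, and it assumes a three-digit zero-padded prefix followed by `_` and a yyyyMMddHHmmss timestamp. Each returned range would then drive one scan in the HBase client, for example through the start-row and stop-row setters of `Scan`.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one scan range per day, since each day has its own prefix.
final class DayScanRanges {
    static final int REGION_COUNT = 240;
    static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE; // yyyyMMdd

    /** Start row (inclusive) and stop row (exclusive) of one day's scan. */
    static final class Range {
        final String startRow;
        final String stopRow;

        Range(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    static Range forDay(LocalDate date) {
        String day = date.format(DAY);
        String prefix = String.format("%03d", Math.floorMod(day.hashCode(), REGION_COUNT));
        // All keys of the day start with "<prefix>_<day>" and sort before
        // "<prefix>_<next day>", because yyyyMMdd strings sort chronologically.
        String start = prefix + "_" + day;
        String stop = prefix + "_" + date.plusDays(1).format(DAY);
        return new Range(start, stop);
    }

    /** Ranges for every day from {@code from} to {@code to}, both inclusive. */
    static List<Range> between(LocalDate from, LocalDate to) {
        List<Range> ranges = new ArrayList<>();
        for (LocalDate d = from; !d.isAfter(to); d = d.plusDays(1)) {
            ranges.add(forDay(d));
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range r : between(LocalDate.of(2020, 1, 1), LocalDate.of(2020, 1, 3))) {
            System.out.println(r.startRow + " -> " + r.stopRow);
        }
    }
}
```

Because the days hash to unrelated prefixes, the per-day scans touch different regions and can be issued in parallel if the 15-second read time needs to come down.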
Provisional hbase-site.xml configuration:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hdp-01:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hdp-01,hdp-02,hdp-03</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
    <property>
        <name>hbase.tmp.dir</name>
        <value>/opt/module/data/hbase/tmp</value>
    </property>
    <property>
        <name>hbase.master</name>
        <value>hdp-01:60000</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/opt/module/data/zookeeper/zkdata</value>
    </property>
    <property>
        <!--htable.setWriteBufferSize(5242880);//5M -->
        <name>hbase.client.write.buffer</name>
        <value>5242880</value>
    </property>
    <property>
        <name>hbase.regionserver.handler.count</name>
        <value>300</value>
        <description>
           Count of RPC Listener instances spun up on RegionServers. The same property is used by the Master for the count of master handlers.
        </description>
    </property>
    <property>
        <name>hbase.table.sanity.checks</name>
        <value>false</value>
    </property>
    <property>
        <!-- ZooKeeper session timeout in ms: a RegionServer whose session stays silent for 30s is treated as dead -->
        <name>zookeeper.session.timeout</name>
        <value>30000</value>
    </property>
    <property>
        <!--every region max file size set to 30G -->
        <name>hbase.hregion.max.filesize</name>
        <value>32212254720</value>
    </property>
    <property>
        <name>hbase.hregion.majorcompaction</name>
        <value>0</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>hbase.regionserver.region.split.policy</name>
        <value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
    </property>
   <property>
        <name>hbase.regionserver.optionalcacheflushinterval</name>
        <value>7200000</value>
        <description>
            Maximum amount of time an edit lives in memory before being automatically
            flushed.
            Default 1 hour. Set it to 0 to disable automatic flushing.</description>
    </property>
    <property>
        <name>hfile.block.cache.size</name>
        <value>0.3</value>
        <description>Percentage of maximum heap (-Xmx setting) to allocate to
            block cache
            used by HFile/StoreFile. Default of 0.4 means allocate 40%.
            Set to 0 to disable but it's not recommended; you need at least
            enough cache to hold the storefile indices.</description>
    </property>
    <property>
        <name>hbase.hregion.memstore.flush.size</name>
        <value>52428800</value>
    </property>
    <property>
        <name>hbase.regionserver.global.memstore.size</name>
        <value>0.5</value>
    </property>
    <property>
        <name>hbase.regionserver.global.memstore.size.lower.limit</name>
        <value>0.5</value>
    </property>
    <property>
      <name>dfs.client.socket-timeout</name>
      <value>600000</value>
    </property>

<property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
</property>

<property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>10</value>
</property>
<property>
    <name>hbase.regionserver.hlog.splitlog.writer.threads</name>
    <value>10</value>
</property>
<property>
    <name>hbase.hstore.compaction.min</name>
    <value>8</value>
</property>
<property>
    <name>hbase.regionserver.thread.compaction.small</name>
    <value>5</value>
</property>
<property>
    <name>hbase.regionserver.thread.compaction.large</name>
    <value>8</value>
</property>
<property>
    <name>dfs.socket.timeout</name>
    <value>900000</value>
</property>
</configuration>