1. Cluster Deployment Architecture
The HBase Master runs as an active/standby pair, mirroring the Hadoop NameNode layout: the active and backup HMasters are co-located with the active and standby NameNodes. A RegionServer runs on every DataNode; with three DataNodes, each of them also hosts a RegionServer.
Node planning:
hadoop-all-01: NameNode (active), HMaster (active), DataNode, HRegionServer, ZooKeeper
hadoop-all-02: NameNode (standby), HMaster (backup), DataNode, HRegionServer, ZooKeeper
hadoop-all-03: DataNode, HRegionServer, ZooKeeper
2. Production System Parameters
First check the Linux limits on the maximum number of processes and open files.
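These limits can be read with ulimit; a quick sketch (values differ per system, and the "hadoop" user in the commented lines is an assumption matching this cluster):

```shell
# Show the current per-user limits for this shell
ulimit -u   # max user processes
ulimit -n   # max open file descriptors
# To raise them persistently, add lines like these to /etc/security/limits.conf:
# hadoop  soft  nofile  65535
# hadoop  hard  nofile  65535
```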
Then tune the following kernel parameters as appropriate for your environment:
net.ipv4.ip_forward=0
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.default.accept_source_route=0
kernel.core_uses_pid=1
net.ipv4.tcp_syncookies=1
net.bridge.bridge-nf-call-ip6tables=0
net.bridge.bridge-nf-call-iptables=0
net.bridge.bridge-nf-call-arptables=0
kernel.msgmnb=65536
kernel.msgmax=65536
kernel.shmmax=68719476736
kernel.shmall=268435456
net.ipv4.tcp_max_syn_backlog=65000
net.core.netdev_max_backlog=32768
net.core.somaxconn=32768
fs.file-max=65000
net.core.wmem_default=8388608
net.core.rmem_default=8388608
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_syn_retries=2
net.ipv4.tcp_mem=94500000 915000000 927000000
net.ipv4.tcp_max_orphans=3276800
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_tw_recycle=1
net.ipv4.tcp_keepalive_time=1200
net.ipv4.tcp_fin_timeout=10
net.ipv4.tcp_keepalive_intvl=15
net.ipv4.tcp_keepalive_probes=3
net.ipv4.ip_local_port_range=1024 65535
net.ipv4.conf.em1.send_redirects=0
net.ipv4.conf.lo.send_redirects=0
net.ipv4.conf.default.send_redirects=0
net.ipv4.conf.all.send_redirects=0
net.ipv4.icmp_echo_ignore_broadcasts=1
net.ipv4.conf.em1.accept_source_route=0
net.ipv4.conf.lo.accept_source_route=0
net.ipv4.conf.default.accept_source_route=0
net.ipv4.conf.all.accept_source_route=0
net.ipv4.icmp_ignore_bogus_error_responses=1
kernel.core_pattern=/tmp/core
vm.overcommit_memory=1
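One way to apply the list above is a drop-in file (a sketch; the file name is arbitrary, only three sample entries are shown, and the install/reload steps need root):

```shell
# Write the settings to a drop-in file (three entries as a sample)
cat > /tmp/90-hbase.conf <<'EOF'
net.core.somaxconn=32768
fs.file-max=65000
vm.overcommit_memory=1
EOF
# As root, install and load it:
# cp /tmp/90-hbase.conf /etc/sysctl.d/90-hbase.conf && sysctl --system
# Or append the lines to /etc/sysctl.conf and run: sysctl -p
grep -c '=' /tmp/90-hbase.conf   # sanity check: counts the 3 settings written
```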
3. Extract the Tarball
tar -zxf ./hbase-1.0.0-cdh5.5.0.tar.gz -C /opt/app/
4. Configure hbase-env.sh
export JAVA_HOME=/opt/app/java
export HADOOP_HOME=/opt/app/hadoop
export HBASE_HEAPSIZE=1024
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export HBASE_MASTER_OPTS="${HBASE_MASTER_OPTS} -Xmx512m"
export HBASE_REGIONSERVER_OPTS="${HBASE_REGIONSERVER_OPTS} -Xmx1024m"
export HBASE_LOG_DIR=${HBASE_HOME}/logs
export HBASE_PID_DIR=${HBASE_HOME}/pids
# Do not let HBase manage its own ZooKeeper; use the external ensemble
export HBASE_MANAGES_ZK=false
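Since hbase-env.sh is an ordinary shell fragment, a quick way to catch typos is to source it in a subshell and echo a value back. A minimal sketch (the temp path and the two-line fragment are just for illustration):

```shell
# Recreate a trimmed copy of the fragment and source it in a clean subshell
cat > /tmp/hbase-env-check.sh <<'EOF'
export HBASE_HEAPSIZE=1024
export HBASE_MANAGES_ZK=false
EOF
sh -c '. /tmp/hbase-env-check.sh && echo "heap=${HBASE_HEAPSIZE}MB zk-managed=${HBASE_MANAGES_ZK}"'
```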
5. Configure hbase-site.xml
Role assignment for this cluster:
HMaster: hadoop-all-01, hadoop-all-02
HRegionServer: hadoop-all-01, hadoop-all-02, hadoop-all-03
<configuration>
<!-- Local directory for temporary/cache files -->
<property>
<name>hbase.tmp.dir</name>
<value>/opt/app/hbase/data</value>
</property>
<!-- Shared root directory for the HRegionServers, on HDFS -->
<property>
<name>hbase.rootdir</name>
<value>hdfs://hdfs-cluster/hbase</value>
</property>
<!-- HMaster RPC port -->
<property>
<name>hbase.master.port</name>
<value>16000</value>
</property>
<!-- HMaster HTTP (web UI) port -->
<property>
<name>hbase.master.info.port</name>
<value>16010</value>
</property>
<!-- Enable fully distributed mode -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- ZooKeeper ensemble hosts -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop-all-01,hadoop-all-02,hadoop-all-03</value>
</property>
<!-- ZooKeeper client port -->
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<!-- ZooKeeper data directory; must match the dataDir configured on the ZooKeeper ensemble itself -->
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/app/zookeeper-3.4.5/data</value>
</property>
<!-- Rows fetched per client scanner RPC (default 2147483647) -->
<property>
<name>hbase.client.scanner.caching</name>
<value>2000</value>
</property>
<!-- Maximum region size before a split (set to 10 GB, which is also the default in HBase 0.94+) -->
<property>
<name>hbase.hregion.max.filesize</name>
<value>10737418240</value>
</property>
<!-- Upper limit on the number of regions per RegionServer; beyond this no further splits are attempted -->
<property>
<name>hbase.regionserver.regionSplitLimit</name>
<value>2000</value>
</property>
<!-- Number of StoreFiles that triggers a compaction -->
<property>
<name>hbase.hstore.compactionThreshold</name>
<value>6</value>
</property>
<!-- Block writes to a region once it has this many StoreFiles, until a compaction catches up -->
<property>
<name>hbase.hstore.blockingStoreFiles</name>
<value>14</value>
</property>
<!-- Block updates to a region and force a flush once its MemStore reaches this multiple of
hbase.hregion.memstore.flush.size; a background chore re-checks every
hbase.server.thread.wakefrequency milliseconds -->
<property>
<name>hbase.hregion.memstore.block.multiplier</name>
<value>4</value>
</property>
<!-- Sleep interval (ms) between runs of the background service threads -->
<property>
<name>hbase.server.thread.wakefrequency</name>
<value>500</value>
</property>
<!-- Limit on concurrent client connections per ZooKeeper server -->
<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>300</value>
</property>
<!-- Upper bound: when the combined memstores on a RegionServer reach this fraction of the heap, updates are blocked and flushes are forced -->
<property>
<name>hbase.regionserver.global.memstore.size</name>
<value>0.4</value>
</property>
<!-- Lower bound at which flushes are triggered. Note: in HBase 1.0+ this value is interpreted as a fraction of hbase.regionserver.global.memstore.size, not of the heap -->
<property>
<name>hbase.regionserver.global.memstore.size.lower.limit</name>
<value>0.3</value>
</property>
<!-- Fraction of the heap given to the block cache -->
<property>
<name>hfile.block.cache.size</name>
<value>0.4</value>
</property>
<!-- Number of RPC handler threads on each HRegionServer -->
<property>
<name>hbase.regionserver.handler.count</name>
<value>100</value>
</property>
<!-- Maximum number of client retries -->
<property>
<name>hbase.client.retries.number</name>
<value>5</value>
</property>
<!-- Pause (ms) between client retries -->
<property>
<name>hbase.client.pause</name>
<value>100</value>
</property>
</configuration>
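Before distributing the file it is worth confirming that the keys you intended actually made it in. A minimal sketch using grep against a trimmed copy (the temp path and the three-property subset are illustrative, not the full config):

```shell
# Write a trimmed sample of hbase-site.xml, then verify required keys are present
cat > /tmp/hbase-site-sample.xml <<'EOF'
<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://hdfs-cluster/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.zookeeper.quorum</name><value>hadoop-all-01,hadoop-all-02,hadoop-all-03</value></property>
</configuration>
EOF
for key in hbase.rootdir hbase.cluster.distributed hbase.zookeeper.quorum; do
  grep -q "<name>$key</name>" /tmp/hbase-site-sample.xml && echo "$key: ok"
done
```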
6. Configure regionservers
Edit conf/regionservers (vim regionservers) and add the host of every HRegionServer:
hadoop-all-01
hadoop-all-02
hadoop-all-03
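The file is plain text with one hostname per line, so it can also be written non-interactively. A sketch (the path below targets a temp copy for illustration; the real file lives under $HBASE_HOME/conf):

```shell
# Write the RegionServer host list, one hostname per line
cat > /tmp/regionservers <<'EOF'
hadoop-all-01
hadoop-all-02
hadoop-all-03
EOF
wc -l < /tmp/regionservers   # counts the 3 RegionServer hosts
```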
7. Configure the backup-masters File
hadoop-all-02
8. Create the Required Directories
mkdir -p /opt/app/hbase/data /opt/app/hbase/logs /opt/app/hbase/pids
9. Distribute the Configured HBase to the Other Nodes
Before copying, rename HBase's slf4j-log4j12-1.7.5.jar (under lib/) to resolve the SLF4J binding conflict between HBase and Hadoop:
mv slf4j-log4j12-1.7.5.jar slf4j-log4j12-1.7.5.jar.bak
Also delete /opt/app/hbase-1.0.0-cdh5.5.0/docs to shrink the copy, then distribute:
scp -r hbase-1.0.0-cdh5.5.0/ hadoop@hadoop-all-02:/opt/app/
scp -r hbase-1.0.0-cdh5.5.0/ hadoop@hadoop-all-03:/opt/app/
Then, on every node, create a symlink named /opt/app/hbase pointing at the versioned directory:
ln -s /opt/app/hbase-1.0.0-cdh5.5.0 /opt/app/hbase
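The per-node steps can be scripted in one loop. A sketch (hostnames match this guide; the scp/ssh lines are commented out so the symlink step can be tried locally under /tmp):

```shell
# for host in hadoop-all-02 hadoop-all-03; do
#   scp -r /opt/app/hbase-1.0.0-cdh5.5.0/ hadoop@"$host":/opt/app/
#   ssh hadoop@"$host" 'ln -sfn /opt/app/hbase-1.0.0-cdh5.5.0 /opt/app/hbase'
# done
# Local demonstration of the symlink step (under /tmp instead of /opt):
mkdir -p /tmp/opt/app/hbase-1.0.0-cdh5.5.0
ln -sfn /tmp/opt/app/hbase-1.0.0-cdh5.5.0 /tmp/opt/app/hbase
readlink /tmp/opt/app/hbase   # shows the symlink target
```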
10. Add Environment Variables on Every Node
vim ~/.bashrc
# HBASE_HOME
export HBASE_HOME=/opt/app/hbase
export PATH=$PATH:$HBASE_HOME/bin
After saving, run source ~/.bashrc to apply the changes.
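A quick check that the variables took effect after sourcing (the HBASE_HOME value is the one from this guide):

```shell
export HBASE_HOME=/opt/app/hbase
export PATH=$PATH:$HBASE_HOME/bin
# Confirm the bin directory is on PATH
echo ":$PATH:" | grep -q ":$HBASE_HOME/bin:" && echo "PATH ok"
```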
11. Start the HBase Cluster
First make sure ZooKeeper and all Hadoop daemons (NameNode, DataNode, ResourceManager, NodeManager) are running.
Running start-hbase.sh then starts every HMaster and HRegionServer.
Then open the web UI:
http://hadoop-all-02:16010
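If the UI does not load right away, a small helper can poll the port before you dig into logs. A sketch (the host/port are the ones from this guide; the probe uses bash's /dev/tcp redirection, and under a plain POSIX sh it simply reports "down"):

```shell
# Poll a TCP port until it opens or the attempts run out
wait_for_port() {
  host=$1; port=$2; tries=${3:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      echo up; return 0
    fi
    i=$((i + 1)); sleep 1
  done
  echo down; return 1
}
# On the cluster: wait_for_port hadoop-all-02 16010
wait_for_port 127.0.0.1 1 1 || true   # local example: port 1 is almost certainly closed
```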