1. config conf/hbase-env.sh
export JAVA_HOME=/opt/jdk1
export HBASE_CLASSPATH=/opt/hadoop-2.2.0/etc/hadoop
export HBASE_MANAGES_ZK=true # use the bundled ZooKeeper
2. config conf/hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ip:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/opt/hbasetmp</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ip1,ip2,ip3</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/zookeeperdata</value>
</property>
</configuration>
3. config conf/regionservers
ip1
ip2
ip3
ip4
...
4. deploy it to all related nodes
5. execute bin/start-hbase.sh, then bin/hbase shell
6. web interface : http://hmasterip:60010
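Steps 5 and 6 can be scripted roughly as below. A minimal sketch, assuming it is run from the HBase install directory on the master and that curl is installed. The function names are made up for illustration; hmasterip and port 60010 come from the steps above.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of HBase.

# Build the master web UI URL; the port defaults to 60010 as in step 6.
master_ui_url() {
    echo "http://$1:${2:-60010}/"
}

# Print the HTTP status code returned by the master web UI.
# curl prints 000 when it cannot connect; any other code means
# something answered on that port.
master_ui_status() {
    curl -s -o /dev/null -w '%{http_code}' "$(master_ui_url "$@")"
}

# Start the cluster, then ask the hbase shell for the cluster status.
start_and_check() {
    bin/start-hbase.sh &&
    echo "status" | bin/hbase shell
}

# Usage (not run here):
#   start_and_check
#   master_ui_status hmasterip
```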
other: optimization
mount file systems with the noatime option so that file access timestamps are not recorded; set noatime in /etc/fstab:
/dev/sdd1 /data ext3 defaults,noatime 0 0
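To check that noatime is active, the mount options can be read back from /proc/mounts. A small sketch, assuming a Linux host and the /data mount point from the fstab line above; the helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Succeed if the comma-separated option list in $1 contains noatime.
has_noatime() {
    case ",$1," in
        *,noatime,*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Print the mount options of mount point $1 as listed in /proc/mounts
# (field 2 is the mount point, field 4 the options).
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }' /proc/mounts
}

# Usage (not run here):
#   mount -o remount,noatime /data        # apply without a reboot
#   has_noatime "$(mount_opts /data)" && echo "noatime is set"
```

The option list in fstab must not contain spaces: a space after the comma starts a new fstab field, so noatime would no longer be read as a mount option.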
time synchronization: ntp
file handle and process limits:
In file /etc/security/limits.conf
root - nofile 32768
if hadoop and hbase run under different accounts, config:
in file /etc/pam.d/common-session add
session required pam_limits.so
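After logging in again, the new limit can be confirmed with ulimit -n. A minimal sketch; the helper name is an assumption, and 32768 is the value from limits.conf above.

```shell
#!/bin/sh
# Sketch only: nofile_ok is a hypothetical helper.

# Succeed if the limit in $1 is at least the wanted value in $2
# (default 32768). "unlimited" always passes.
nofile_ok() {
    [ "$1" = "unlimited" ] && return 0
    [ "$1" -ge "${2:-32768}" ]
}

# Usage (not run here), as the account that runs hadoop/hbase:
#   nofile_ok "$(ulimit -n)" || echo "nofile limit too low: $(ulimit -n)"
```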
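datanode handler thread count: conf/hdfs-site.xml (the property name really is spelled xcievers; Hadoop 2.x also accepts the newer name dfs.datanode.max.transfer.threads)
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>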
avoid swapping: in /etc/sysctl.conf set
vm.swappiness=5
then apply with: sysctl -p
configure scanner caching:
setScannerCaching(int scannerCaching), or in conf/hbase-site.xml:
<property>
<name>hbase.client.scanner.caching</name>
<value>10</value>
</property>
setting a batch size with setBatch(int) can optimize the number of rpcs
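How caching and batch interact can be estimated with a little arithmetic. This is a rough sketch, not HBase's own accounting: it assumes every row has the same number of columns, that a batch smaller than the row splits it into several Results, and that each RPC carries up to `caching` Results. It ignores the extra calls for opening and closing the scanner. The function name and its arguments are made up for illustration.

```shell
#!/bin/sh
# Sketch only: estimate_scan_rpcs is a hypothetical helper.
# Arguments: rows, columns per row, caching, batch (0 or omitted = no batch).
estimate_scan_rpcs() {
    rows=$1 cols=$2 caching=$3 batch=${4:-0}
    # Without a batch size, or with one at least as wide as the row,
    # each row comes back as a single Result.
    if [ "$batch" -le 0 ] || [ "$batch" -ge "$cols" ]; then
        per_row=1
    else
        # Otherwise a row is split into ceil(cols / batch) Results.
        per_row=$(( (cols + batch - 1) / batch ))
    fi
    results=$(( rows * per_row ))
    # Each RPC is assumed to carry up to $caching Results.
    echo $(( (results + caching - 1) / caching ))
}

# Usage (not run here):
#   estimate_scan_rpcs 10 20 1 1      # small caching, small batch
#   estimate_scan_rpcs 10 20 5 100    # larger caching, batch wider than the row
```

Under these assumptions, 10 rows of 20 columns need about 200 calls with caching 1 and batch 1, and about 2 with caching 5 and batch 100, so caching is what mainly drives the rpc count while batch bounds the size of each Result.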
after changing the configuration, rsync can be used to sync the files to the other nodes
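Syncing the conf directory to every host listed in conf/regionservers can be sketched as below. Assumptions: HBase sits at the same path on every node (the HBASE_HOME default here is only a placeholder), passwordless ssh is set up, and the function name is made up for illustration. The function only prints the rsync commands so they can be reviewed before running.

```shell
#!/bin/sh
# Sketch only: sync_conf_cmds and the HBASE_HOME default are hypothetical.
HBASE_HOME=${HBASE_HOME:-/opt/hbase}

# Print one rsync command per host listed in the file $1
# (default: conf/regionservers), skipping blank lines and # comments.
sync_conf_cmds() {
    while read -r host; do
        case "$host" in
            ''|'#'*) continue ;;
        esac
        echo "rsync -az $HBASE_HOME/conf/ $host:$HBASE_HOME/conf/"
    done < "${1:-$HBASE_HOME/conf/regionservers}"
}

# Usage (not run here): review the output, then pipe it to sh.
#   sync_conf_cmds
#   sync_conf_cmds | sh
```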