Download HBase: https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/stable/hbase-1.2.6-bin.tar.gz
Extract it to /home/hadoop/hbase-1.2.6.
HBase ships with a built-in ZooKeeper to support HA; when configuring you can use the bundled ZooKeeper or an independently installed external one. For storage, HBase can use HDFS as its underlying distributed file system, so the setup also depends on whether HDFS itself is configured for HA: the HBase configuration files differ between non-HA and HA HDFS. In either case, ZooKeeper must be started before HBase.
1: Set environment variables and the temp directory (.bash_profile)
export PATH
export HIVE_HOME=/home/hadoop/apache-hive-2.3.3-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export CLASSPATH=$CLASSPATH:${HIVE_HOME}/lib
export PGHOME=/usr/local/pgsql
export PGDATA=/data/pgdata
export PATH=$PGHOME/bin:$PATH
export MANPATH=$PGHOME/share/man:$MANPATH
export LANG=en_US.utf8
export DATE=`date +"%Y-%m-%d %H:%M:%S"`
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export HBASE_HOME=/home/hadoop/hbase-1.2.6
alias rm='rm -i'
alias ll='ls -lh'
export PATH=.:${HIVE_HOME}/bin:$HBASE_HOME/bin:$PATH
Create the temp directory: mkdir -p /home/hadoop/hbase-1.2.6/tmp
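Before pointing hbase.tmp.dir at the directory just created, a small guard can confirm it is usable. A sketch; the helper name `ensure_tmpdir` is mine, not part of HBase:

```shell
# ensure_tmpdir: create the directory if needed and confirm it is
# writable, so hbase.tmp.dir does not silently point at a bad path.
ensure_tmpdir() {
  mkdir -p "$1" && [ -w "$1" ]
}

# On the real node you would run:
# ensure_tmpdir /home/hadoop/hbase-1.2.6/tmp || echo "fix permissions first"
```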
Cluster plan, with 4 Linux machines:
hadoop-m 192.168.31.50 namenode datanode zookeeper hive HBase
hadoop-sm 192.168.31.51 namenode datanode zookeeper HBase
slave1 192.168.31.52 namenode datanode zookeeper HBase
slave2 192.168.31.53 datanode HBase
2: Edit the configuration files
- hbase-env.sh
HBase ships with its own ZooKeeper. To use an external ZooKeeper instead of the bundled one, set HBASE_MANAGES_ZK below to false; this walkthrough uses an independently installed ZooKeeper.
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_161
export HADOOP_HOME=/usr/local/hadoop-2.7.5
export HBASE_HOME=/home/hadoop/hbase-1.2.6
export HBASE_CLASSPATH=/usr/local/hadoop-2.7.5/etc/hadoop
export HBASE_PID_DIR=/home/hadoop/hbase-1.2.6/pids
export HBASE_MANAGES_ZK=false
- hbase-site.xml
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://mycluster/hbase</value>
  <!-- must match fs.defaultFS in core-site.xml -->
  <description>The directory shared by region servers.</description>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
  <description>Property from ZooKeeper's config zoo.cfg. The port at which the clients will connect.</description>
</property>
<property>
  <name>hbase.master</name>
  <value>60000</value>
  <!-- in HBase HA mode, only the port is needed -->
</property>
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hadoop-m,hadoop-sm,slave1</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/usr/local/zookeeper-3.4.10/data/zkData</value>
</property>
<property>
  <name>hbase.tmp.dir</name>
  <value>/home/hadoop/hbase-1.2.6/tmp</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
- regionservers (note: regionservers should normally run on the same servers as the datanodes, so that storage stays local and performance improves)
hadoop-m
hadoop-sm
slave1
slave2
Save the configuration, then sync the install directory to all 4 machines, using scp:
scp -r hbase-1.2.6 hadoop@slave2:/home/hadoop
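The manual scp can also be looped over the whole node list from the plan above. A dry-run sketch; the function name and the echo are mine, so pipe the output to sh, or drop the echo, to actually copy:

```shell
# Print the scp commands that would sync the HBase install
# directory to every other node in the cluster.
sync_hbase() {
  dir=$1; shift
  for node in "$@"; do
    echo "scp -r $dir hadoop@$node:/home/hadoop"
  done
}

sync_hbase /home/hadoop/hbase-1.2.6 hadoop-sm slave1 slave2
```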
hbase-site.xml parameter notes
- hbase.rootdir
This is the directory shared by the RegionServers, where HBase persists its data. Note in particular that the HDFS address in hbase.rootdir must match fs.defaultFS in Hadoop's core-site.xml exactly (IP or hostname, and port). (In an HA environment the address uses the dfs.nameservices logical name, and which NameNode is active is resolved via ZooKeeper.)
- hbase.cluster.distributed
The HBase run mode: false means standalone mode, true means distributed mode. If false, HBase and ZooKeeper run in the same JVM.
- hbase.master
With a single HMaster, the hbase.master property must be set to master:60000 (hostname:60000).
With multiple HMasters, only the port 60000 needs to be given, because electing the actual active master is handled by ZooKeeper.
- hbase.tmp.dir
A temporary directory on the local file system. Change it to a more persistent location, since /tmp is cleared on reboot.
- hbase.zookeeper.quorum
ZooKeeper configuration: hbase.zookeeper.quorum must list all of the ZooKeeper hosts, comma-separated. The default value is localhost, which obviously will not work for a distributed deployment.
- hbase.zookeeper.property.dataDir
This parameter sets where ZooKeeper stores its snapshots; the default is /tmp, which is cleared on reboot. Since ZooKeeper here is installed independently, this path points to the location set by dataDir in $ZOOKEEPER_HOME/conf/zoo.cfg.
- hbase.zookeeper.property.clientPort
The port clients use to connect to ZooKeeper.
- zookeeper.session.timeout
The ZooKeeper session timeout. HBase passes this value to the ZooKeeper ensemble, suggesting it as the maximum session timeout.
- hbase.regionserver.restart.on.zk.expire
When a regionserver hits a ZooKeeper session expiry, it will restart rather than abort.
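Note that this last parameter does not appear in the hbase-site.xml shown earlier; if you want the restart-instead-of-abort behavior, the entry would look like the sketch below (by default the regionserver aborts on session expiry):

```
<property>
  <name>hbase.regionserver.restart.on.zk.expire</name>
  <value>true</value>
</property>
```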
3 Startup and testing
3.1 Startup
HBase is built on Hadoop's distributed file system, so before starting HBase make sure Hadoop is running normally. HBase also depends on ZooKeeper; we could use the ZooKeeper bundled with HBase, but the configuration above enables our own ZooKeeper cluster, so make sure that ZooKeeper is running as well before starting HBase.
HBase can be installed on just one Hadoop namenode or on all of the Hadoop nodes; either way it only needs to be started from one node. In this example HBase is installed on all 4 nodes, and starting it from hadoop-m is enough.
On hadoop-m, run the following command from HBase's bin directory:
start-hbase.sh
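Since a failed start here usually means HDFS or ZooKeeper was not actually up, a small pre-flight check can help. A sketch that scans `jps`-style output for the standard daemon class names (the function name `check_prereqs` is mine; adjust the daemon list to each node's role):

```shell
# check_prereqs: succeed only if every required daemon appears
# in the given jps output; report each missing one.
check_prereqs() {
  jps_out=$1
  rc=0
  for daemon in NameNode DataNode QuorumPeerMain; do
    if ! printf '%s\n' "$jps_out" | grep -q "$daemon"; then
      echo "missing: $daemon"
      rc=1
    fi
  done
  return $rc
}

# On a real node: check_prereqs "$(jps)" && start-hbase.sh
check_prereqs "1234 NameNode
2345 DataNode
3456 QuorumPeerMain" && echo "prerequisites OK"
```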
3.2 Testing
View the HBase status page in a browser.
Open: http://192.168.31.50:16010/ (the HMaster web UI; each regionserver serves its own UI on port 16030)
3.2.2 Starting the HBase shell
Change into HBase's bin directory:
cd /home/hadoop/hbase-1.2.6/bin
Then launch the HBase shell:
./hbase shell
The full output is:
[hadoop@hadoop-m ~]$ hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017
hbase(main):001:0> status
1 active master, 1 backup masters, 4 servers, 0 dead, 0.5000 average load
hbase(main):002:0>
In HBase shell mode you can run a series of HBase commands to test the cluster.
Wait ~~ what about the promised high availability?
The backup masters simply have not been started yet.
On hadoop-sm, slave1 and slave2, run:
hbase-daemon.sh start master
Then run status again to check the state.
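The per-node step above can be driven from hadoop-m in one loop. A dry-run sketch (it assumes passwordless ssh and hbase-daemon.sh on each node's PATH; remove the echo to actually execute):

```shell
# Print the ssh commands that would start a backup HMaster on
# each of the remaining nodes from the cluster plan.
start_backup_masters() {
  for node in "$@"; do
    echo "ssh $node hbase-daemon.sh start master"
  done
}

start_backup_masters hadoop-sm slave1 slave2
```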
- Check the processes with jps:
On the active master and the backup master nodes you can see the HMaster and HRegionServer processes; since all 4 machines are configured as masters here, all 4 run HMaster.
On the remaining nodes you can see HRegionServer.
- Check in a browser: hadoop-m:16010, slave1:16010
You should see one Active Master and 3 Backup Masters.
To leave the HBase shell, type: exit
Shell operations:
1. Commonly used HBase commands
--enter the hbase shell
[grid@gc ~]$ hbase-0.90.5/bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
hbase(main):001:0>
--check the database status
hbase(main):002:0> status
2 servers, 0 dead, 1.0000 average load
--check the database version
hbase(main):004:0> version
0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
--help command
hbase(main):003:0> help
HBase Shell, version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: status, version
Group name: ddl
Commands: alter, create, describe, disable, drop, enable, exists, is_disabled, is_enabled, list
Group name: dml
Commands: count, delete, deleteall, get, get_counter, incr, put, scan, truncate
Group name: tools
Commands: assign, balance_switch, balancer, close_region, compact, flush, major_compact, move, split, unassign, zk_dump
Group name: replication
Commands: add_peer, disable_peer, enable_peer, remove_peer, start_replication, stop_replication
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
and are opened and closed with curley-braces. Key/values are delimited by the
'=>' character combination. Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type
'Object.constants' to see a (messy) list of all constants in the environment.
If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:
hbase> get 't1', "key\x03\x3f\xcd"
hbase> get 't1', "key\003\023\011"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/docs/current/book.html
2. HBase table operation commands
--create a table
Logical model of the resume table:

Row key    | Timestamp | Family binfo           | Family edu              | Family work
lichangzai | T2        | binfo:age='1980-1-1'   |                         |
           | T3        | binfo:sex='man'        |                         |
           | T5        |                        | edu:mschool='rq no.1'   |
           | T6        |                        | edu:university='qhddx'  |
           | T7        |                        |                         | work:company1='12580'
changfei   | T10       | binfo:age='1986-2-1'   |                         |
           | T11       |                        | edu:university='bjdx'   |
           | T12       |                        |                         | work:company1='LG'
......     | Tn        |                        |                         |
--create the table
hbase(main):005:0> create 'resume','binfo','edu','work'
0 row(s) in 16.5710 seconds
--list tables
hbase(main):006:0> list
TABLE
resume
1 row(s) in 1.6080 seconds
--describe the table structure
hbase(main):007:0> describe 'resume'
DESCRIPTION ENABLED
{NAME => 'resume', FAMILIES => [{NAME => 'binfo', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', C true
OMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'fals
e', BLOCKCACHE => 'true'}, {NAME => 'edu', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESS
ION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLO
CKCACHE => 'true'}, {NAME => 'work', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION =>
'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACH
E => 'true'}]}
1 row(s) in 1.8590 seconds
--add a column family
hbase(main):014:0> disable 'resume'
0 row(s) in 4.2630 seconds
hbase(main):015:0> alter 'resume', NAME => 'f1'
0 row(s) in 4.6990 seconds
--delete a column family
hbase(main):017:0> alter 'resume',{NAME=>'f1',METHOD=>'delete'}
0 row(s) in 1.1390 seconds
--or
hbase(main):021:0> alter 'resume','delete' => 'f1'
0 row(s) in 1.9310 seconds
hbase(main):022:0> enable 'resume'
0 row(s) in 5.9060 seconds
Note:
(1) ddl commands are case-sensitive: commands such as alter, create, drop, enable must be lowercase, while attribute names inside {} must be uppercase.
(2) A table must be disabled before alter or drop and enabled again afterwards, otherwise the command fails with an error.
--check the disabled status
hbase(main):024:0> is_disabled 'resume'
false
0 row(s) in 0.4930 seconds
hbase(main):021:0> is_enabled 'resume'
true
0 row(s) in 0.2450 seconds
--drop a table
hbase(main):015:0> create 't1','f1'
0 row(s) in 15.3730 seconds
hbase(main):016:0> disable 't1'
0 row(s) in 6.4840 seconds
hbase(main):017:0> drop 't1'
0 row(s) in 7.3730 seconds
--check whether a table exists
hbase(main):018:0> exists 'resume'
Table resume does exist
0 row(s) in 2.3900 seconds
hbase(main):019:0> exists 't1'
Table t1 does not exist
0 row(s) in 1.3270 seconds
--insert data
put 'resume','lichangzai','binfo:age','1980-1-1'
put 'resume','lichangzai','binfo:sex','man'
put 'resume','lichangzai','edu:mschool','rq no.1'
put 'resume','lichangzai','edu:university','qhddx'
put 'resume','lichangzai','work:company1','12580'
put 'resume','lichangzai','work:company2','china mobile'
put 'resume','lichangzai','binfo:site','blog.csdn.net/lichangzai'
put 'resume','lichangzai','binfo:mobile','13712345678'
put 'resume','changfei','binfo:age','1986-2-1'
put 'resume','changfei','edu:university','bjdx'
put 'resume','changfei','work:company1','LG'
put 'resume','changfei','binfo:mobile','13598765401'
put 'resume','changfei','binfo:site','hi.baidu/lichangzai'
--get all data for a row key
hbase(main):014:0> get 'resume','lichangzai'
COLUMN CELL
binfo:age timestamp=1356485720612, value=1980-1-1
binfo:mobile timestamp=1356485865523, value=13712345678
binfo:sex timestamp=1356485733603, value=man
binfo:site timestamp=1356485859806, value=blog.csdn.net/lichangzai
edu:mschool timestamp=1356485750361, value=rq no.1
edu:university timestamp=1356485764211, value=qhddx
work:company1 timestamp=1356485837743, value=12580
work:company2 timestamp=1356485849365, value=china mobile
8 row(s) in 2.1090 seconds
Note: data must be looked up by Row Key
--get all data for one row key and one column family
hbase(main):015:0> get 'resume','lichangzai','binfo'
COLUMN CELL
binfo:age timestamp=1356485720612, value=1980-1-1
binfo:mobile timestamp=1356485865523, value=13712345678
binfo:sex timestamp=1356485733603, value=man
binfo:site timestamp=1356485859806, value=blog.csdn.net/lichangzai
4 row(s) in 1.6010 seconds
--get the data for one row key, one column of one column family
hbase(main):017:0> get 'resume','lichangzai','binfo:sex'
COLUMN CELL
binfo:sex timestamp=1356485733603, value=man
1 row(s) in 0.8980 seconds
--update a record
hbase(main):018:0> put 'resume','lichangzai','binfo:mobile','13899999999'
0 row(s) in 1.7640 seconds
hbase(main):019:0> get 'resume','lichangzai','binfo:mobile'
COLUMN CELL
binfo:mobile timestamp=1356486691591, value=13899999999
1 row(s) in 1.5710 seconds
Note: an update is really just inserting a new record with a newer timestamp; get shows only the record with the latest timestamp
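Because the column families here keep VERSIONS => '3' (see the describe output above), the overwritten cell is still stored; a sketch of asking the shell for several versions at once (the output depends on your data, so it is not shown):

```
get 'resume','lichangzai',{COLUMN=>'binfo:mobile',VERSIONS=>3}
```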
--fetch data by timestamp
------query the data with the latest timestamp
hbase(main):020:0> get 'resume','lichangzai',{COLUMN=>'binfo:mobile',TIMESTAMP=>1356486691591}
COLUMN CELL
binfo:mobile timestamp=1356486691591, value=13899999999
1 row(s) in 0.4060 seconds
------query the data with an earlier timestamp (i.e. the superseded version)
hbase(main):021:0> get 'resume','lichangzai',{COLUMN=>'binfo:mobile',TIMESTAMP=>1356485865523}
COLUMN CELL
binfo:mobile timestamp=1356485865523, value=13712345678
1 row(s) in 0.7780 seconds
--full table scan
hbase(main):022:0> scan 'resume'
ROW COLUMN+CELL
changfei column=binfo:age, timestamp=1356485874056, value=1986-2-1
changfei column=binfo:mobile, timestamp=1356485897477, value=13598765401
changfei column=binfo:site, timestamp=1356485906106, value=hi.baidu/lichangzai
changfei column=edu:university, timestamp=1356485880977, value=bjdx
changfei column=work:company1, timestamp=1356485888939, value=LG
lichangzai column=binfo:age, timestamp=1356485720612, value=1980-1-1
lichangzai column=binfo:mobile, timestamp=1356486691591, value=13899999999
lichangzai column=binfo:sex, timestamp=1356485733603, value=man
lichangzai column=binfo:site, timestamp=1356485859806, value=blog.csdn.net/lichangzai
lichangzai column=edu:mschool, timestamp=1356485750361, value=rq no.1
lichangzai column=edu:university, timestamp=1356485764211, value=qhddx
lichangzai column=work:company1, timestamp=1356485837743, value=12580
lichangzai column=work:company2, timestamp=1356485849365, value=china mobile
2 row(s) in 3.6300 seconds
--delete a column-family column for a given row key
hbase(main):023:0> put 'resume','changfei','binfo:sex','man'
0 row(s) in 1.2630 seconds
hbase(main):024:0> delete 'resume','changfei','binfo:sex'
0 row(s) in 0.5890 seconds
hbase(main):026:0> get 'resume','changfei','binfo:sex'
COLUMN CELL
0 row(s) in 0.5560 seconds
--delete an entire row
hbase(main):028:0> create 't1','f1','f2'
0 row(s) in 8.3950 seconds
hbase(main):029:0> put 't1','a','f1:col1','xxxxx'
0 row(s) in 2.6790 seconds
hbase(main):030:0> put 't1','a','f1:col2','xyxyx'
0 row(s) in 0.5130 seconds
hbase(main):031:0> put 't1','b','f2:cl1','ppppp'
0 row(s) in 1.2620 seconds
hbase(main):032:0> deleteall 't1','a'
0 row(s) in 1.2030 seconds
hbase(main):033:0> get 't1','a'
COLUMN CELL
0 row(s) in 0.8980 seconds
--count the rows in a table
hbase(main):035:0> count 'resume'
2 row(s) in 2.8150 seconds
hbase(main):036:0> count 't1'
1 row(s) in 0.9500 seconds
--truncate a table
hbase(main):034:0> truncate 't1'
Truncating 't1' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 4.1070 seconds
Note on how truncate works: since the HDFS file system does not allow files to be modified in place, the only way to empty a table is to drop it and recreate it.
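In other words, truncate 't1' is roughly shorthand for the following manual sequence (family names taken from the create further above):

```
disable 't1'
drop 't1'
create 't1','f1','f2'
```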