HBase Installation and Programming Practice
1. HBase 2.4.2 Installation
1.1 Extract the installation package
Extract the archive to /usr/local:
[root@hadoop-master ~]# tar -zvxf hbase-2.4.2-bin.tar.gz -C /usr/local/
1.2 Rename the extracted directory hbase-2.4.2 to hbase
[root@hadoop-master local]# mv hbase-2.4.2/ hbase
1.3 Configure environment variables
[root@hadoop-master hbase]# vim ~/.bashrc
[root@hadoop-master hbase]# cat ~/.bashrc
# .bashrc
# User specific aliases and functions
alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
export PATH=$PATH:/usr/local/hbase/bin
[root@hadoop-master hbase]# source ~/.bashrc
1.4 Check the HBase version to confirm that the installation succeeded
[root@hadoop-master ~]# /usr/local/hbase/bin/hbase version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.4.2
Source code repository git://apurtell-ltm.internal.salesforce.com/Users/apurtell/src/hbase revision=3e98c51c512cbd5ef779ae6bcef178ce89c46e37
Compiled by apurtell on Mon Mar 8 16:49:11 PST 2021
From source with checksum 01fd6e6591e3e79b34b4921861434ed0c39f8d69994ca9a59284532a6608703533601371c8797bef261788cd812480da37a007f19c8590feec1dbeed85e4ad5d
2. HBase Configuration
2.1 Pseudo-distributed configuration
1. Configure /usr/local/hbase/conf/hbase-env.sh
[root@hadoop-master ~]# vim /usr/local/hbase/conf/hbase-env.sh
Set JAVA_HOME, HBASE_CLASSPATH, and HBASE_MANAGES_ZK:
export JAVA_HOME=/usr/java/jdk1.8.0_281-amd64    # this is the JDK installation path on my machine
export HBASE_CLASSPATH=/usr/local/hbase/conf
export HBASE_MANAGES_ZK=true
2. Configure /usr/local/hbase/conf/hbase-site.xml
[root@hadoop-master ~]# vim /usr/local/hbase/conf/hbase-site.xml
[root@hadoop-master ~]# cat /usr/local/hbase/conf/hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<configuration>
<!--
The following properties are set for running HBase as a single process on a
developer workstation. With this configuration, HBase is running in
"stand-alone" mode and without a distributed file system. In this mode, and
without further configuration, HBase and ZooKeeper data are stored on the
local filesystem, in a path under the value configured for `hbase.tmp.dir`.
This value is overridden from its default value of `/tmp` because many
systems clean `/tmp` on a regular basis. Instead, it points to a path within
this HBase installation directory.
Running against the `LocalFileSystem`, as opposed to a distributed
filesystem, runs the risk of data integrity issues and data loss. Normally
HBase will refuse to run in such an environment. Setting
`hbase.unsafe.stream.capability.enforce` to `false` overrides this behavior,
permitting operation. This configuration is for the developer workstation
only and __should not be used in production!__
See also https://hbase.apache.org/book.html#standalone_dist
-->
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop-master:9001/hbase</value> <!-- my HDFS path; the host and port must match fs.defaultFS in Hadoop's core-site.xml -->
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
</configuration>
3. Test-run HBase
Step 1: start Hadoop. My Hadoop cluster is already running, so I skip this step.
start-dfs.sh    # command to start Hadoop (HDFS)
Step 2: start HBase.
[root@hadoop-master ~]# /usr/local/hbase/bin/start-hbase.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:hB8k+zi77qTN95rlFZZv8PFs1ud36IOvjpWk0I5JOpQ.
ECDSA key fingerprint is MD5:79:0d:2b:eb:28:f8:52:7c:49:49:91:cd:7c:77:d6:3e.
Are you sure you want to continue connecting (yes/no)? yes
127.0.0.1: Warning: Permanently added '127.0.0.1' (ECDSA) to the list of known hosts.
127.0.0.1: running zookeeper, logging to /usr/local/hbase/bin/../logs/hbase-root-zookeeper-hadoop-master.out
127.0.0.1: ERROR: JAVA_HOME /usr/java/jre1.8.0_281-amd64 does not exist.
running master, logging to /usr/local/hbase/bin/../logs/hbase-root-master-hadoop-master.out
: running regionserver, logging to /usr/local/hbase/bin/../logs/hbase-root-regionserver-hadoop-master.out
[root@hadoop-master ~]#
[root@hadoop-master ~]# jps
15938 NodeManager
13378 jar
15782 ResourceManager
15116 DataNode
14959 NameNode
13713 Bootstrap
13745 Bootstrap
563 HRegionServer
788 Jps
13398 jar
18521 AzkabanWebServer
32665 HQuorumPeer
These three processes are the ones started by HBase:
563 HRegionServer
32665 HQuorumPeer
348 HMaster
Enter the HBase shell:
[root@hadoop-master ~]# /usr/local/hbase/bin/hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.2, r3e98c51c512cbd5ef779ae6bcef178ce89c46e37, Mon Mar 8 16:49:11 PST 2021
Took 0.0019 seconds
hbase:001:0>
3. Programming Practice
1. Using shell commands
1.1 Create a table in HBase
hbase:001:0> create 'student','Sname','Ssex','Sage','Sdept','course'
Created table student
Took 1.8002 seconds
=> Hbase::Table - student
This creates a "student" table with the column families Sname, Ssex, Sage, Sdept, and course. An HBase table always has a built-in row key, so there is no need to declare one yourself; by default it is the first value after the table name in a put command. After the "student" table is created, you can check its basic information with the describe command.
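The same table can also be created through the HBase Java client API. The following is a minimal sketch rather than a definitive implementation; it assumes the HBase 2.x client library is on the classpath and that ZooKeeper runs on hadoop-master, and the class name CreateStudentTable is only illustrative.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class CreateStudentTable {
    public static void main(String[] args) throws IOException {
        // Picks up hbase-site.xml from the classpath; the quorum can also be set explicitly.
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "hadoop-master");
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            TableName name = TableName.valueOf("student");
            TableDescriptorBuilder table = TableDescriptorBuilder.newBuilder(name);
            // One column family per attribute, mirroring the shell command above.
            for (String cf : new String[] {"Sname", "Ssex", "Sage", "Sdept", "course"}) {
                table.setColumnFamily(ColumnFamilyDescriptorBuilder.of(cf));
            }
            if (!admin.tableExists(name)) {
                admin.createTable(table.build());
            }
        }
    }
}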
1.2 Basic HBase database operations
- Adding data
HBase uses the put command to add data. Note that each put writes only a single column of a single row of a single table, i.e. one cell at a time, so inserting data directly through shell commands is inefficient. In real applications, data is usually manipulated programmatically through the client API (see the Java sketch after these examples).
hbase:003:0> put 'student','95001','Sname','Mirko'
Took 0.1520 seconds
This adds a row with row key 95001 and the name Mirko to the student table.
hbase:004:0> put 'student','95001','course:math','100'
Took 0.0113 seconds
This adds a math score (course:math) to the same row.
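As noted above, the shell writes one cell per put, so bulk writes are normally done through the Java client API. Below is a minimal sketch of the same two puts, assuming an open Connection conn created as in the table-creation sketch; the class and method names are only illustrative.
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
    // Equivalent of:
    //   put 'student','95001','Sname','Mirko'
    //   put 'student','95001','course:math','100'
    static void addRow(Connection conn) throws IOException {
        try (Table table = conn.getTable(TableName.valueOf("student"))) {
            Put p = new Put(Bytes.toBytes("95001"));  // row key
            p.addColumn(Bytes.toBytes("Sname"), Bytes.toBytes(""), Bytes.toBytes("Mirko"));
            p.addColumn(Bytes.toBytes("course"), Bytes.toBytes("math"), Bytes.toBytes("100"));
            table.put(p);  // a single RPC writes both cells
        }
    }
}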
- Deleting data
HBase uses the delete and deleteall commands to remove data. The difference between them: 1. delete removes a single piece of data and is the reverse of put; 2. deleteall removes an entire row.
1. The delete command
hbase:005:0> get 'student','95001'
COLUMN                         CELL
 Sname:                        timestamp=2021-04-01T15:40:58.123, value=Mirko
 course:math                   timestamp=2021-04-01T15:42:34.060, value=100
1 row(s)
Took 0.0395 seconds
hbase:006:0> delete 'student','95001','course:math'
Took 0.0423 seconds
hbase:007:0> get 'student','95001'
COLUMN                         CELL
 Sname:                        timestamp=2021-04-01T15:40:58.123, value=Mirko
1 row(s)
Took 0.0201 seconds
2. The deleteall command
hbase:008:0> deleteall 'student','95001'
Took 0.0110 seconds
hbase:009:0> get 'student','95001'
COLUMN                         CELL
0 row(s)
Took 0.0125 seconds
This confirms that all data in row 95001 of the student table has been deleted.
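For reference, roughly the same two deletions expressed through the Java client API. This is a sketch under the same assumptions as the earlier examples (an open Connection conn, illustrative class and method names); note that Delete.addColumns removes all versions of the column, which is close to, but not byte-for-byte identical with, the shell delete marker.
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteExamples {
    // Roughly equivalent of: delete 'student','95001','course:math'
    static void deleteOneColumn(Connection conn) throws IOException {
        try (Table table = conn.getTable(TableName.valueOf("student"))) {
            Delete d = new Delete(Bytes.toBytes("95001"));
            d.addColumns(Bytes.toBytes("course"), Bytes.toBytes("math")); // all versions of course:math
            table.delete(d);
        }
    }

    // Equivalent of: deleteall 'student','95001'
    static void deleteWholeRow(Connection conn) throws IOException {
        try (Table table = conn.getTable(TableName.valueOf("student"))) {
            table.delete(new Delete(Bytes.toBytes("95001"))); // no column given: the whole row is deleted
        }
    }
}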
- Viewing data
HBase has two commands for viewing data: 1. get, which returns a single row of a table; 2. scan, which returns all the data in a table.
1. The get command
hbase:014:0> get 'student','95001'
2. The scan command
hbase:015:0> scan 'student'
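The get and scan commands map to the Get and Scan classes of the Java client API. A minimal sketch, again assuming an open Connection conn from the earlier examples:
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ReadExamples {
    // Equivalent of: get 'student','95001'
    static void getRow(Connection conn) throws IOException {
        try (Table table = conn.getTable(TableName.valueOf("student"))) {
            Result r = table.get(new Get(Bytes.toBytes("95001")));
            for (Cell cell : r.rawCells()) {
                System.out.println(Bytes.toString(CellUtil.cloneFamily(cell)) + ":"
                        + Bytes.toString(CellUtil.cloneQualifier(cell)) + " = "
                        + Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
    }

    // Equivalent of: scan 'student'
    static void scanTable(Connection conn) throws IOException {
        try (Table table = conn.getTable(TableName.valueOf("student"));
             ResultScanner scanner = table.getScanner(new Scan())) {
            for (Result r : scanner) {
                System.out.println(Bytes.toString(r.getRow()) + " -> " + r);
            }
        }
    }
}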
- Dropping a table
Dropping a table takes two steps: first disable the table to take it offline, then drop it.
hbase:016:0> disable 'student'
hbase:018:0> drop 'student'
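Through the Java client API the same two steps are performed with Admin.disableTable and Admin.deleteTable. A minimal sketch under the same assumptions as the earlier examples:
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;

public class DropTableExample {
    // Equivalent of: disable 'student' followed by drop 'student'
    static void dropTable(Connection conn) throws IOException {
        try (Admin admin = conn.getAdmin()) {
            TableName name = TableName.valueOf("student");
            if (admin.tableExists(name)) {
                admin.disableTable(name);  // step 1: take the table offline
                admin.deleteTable(name);   // step 2: delete it
            }
        }
    }
}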
1.3 Query historical versions of table data
Querying the historical versions of a table involves the following steps.
1. When creating the table, specify how many versions to keep (here, 5):
create 'teacher',{NAME=>'username',VERSIONS=>5}
2. Insert data and then update it so that historical versions are generated. Note that both inserting and updating use the put command:
put 'teacher','91001','username','Mary'
put 'teacher','91001','username','Mary1'
put 'teacher','91001','username','Mary2'
put 'teacher','91001','username','Mary3'
put 'teacher','91001','username','Mary4'
put 'teacher','91001','username','Mary5'
3. When querying, specify how many historical versions to return; by default only the latest version is returned:
get 'teacher','91001',{COLUMN=>'username',VERSIONS=>5}
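In the Java client API, the counterpart of VERSIONS=>5 at table-creation time is ColumnFamilyDescriptorBuilder.newBuilder(...).setMaxVersions(5), and historical versions are read with Get.readVersions. A minimal sketch under the same assumptions as the earlier examples:
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class VersionExample {
    // Equivalent of: get 'teacher','91001',{COLUMN=>'username',VERSIONS=>5}
    static void readHistory(Connection conn) throws IOException {
        try (Table table = conn.getTable(TableName.valueOf("teacher"))) {
            Get g = new Get(Bytes.toBytes("91001"));
            g.addFamily(Bytes.toBytes("username"));
            g.readVersions(5);  // ask for up to 5 versions instead of only the latest
            Result r = table.get(g);
            for (Cell cell : r.rawCells()) {
                System.out.println(cell.getTimestamp() + " -> " + Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
    }
}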
1.4 Exit the HBase shell
Finally, type the exit command to quit. Note that this only exits the interactive session for operating on tables; it does not stop the HBase service running in the background (to stop HBase itself, run the stop-hbase.sh script in the HBase bin directory).
exit