HBase Basics: Getting Started

XK&RM

Category: HBase. Tags: big data, real-time big data, hbase, java

Article link: https://blog.csdn.net/qq_41301707/article/details/113555861

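HBase official website

This walkthrough installs and deploys HBase from the CDH 5.16.2 series.

1. HBase Deployment

HBase download location

1.1 Prerequisites

Hadoop must be deployed: HBase ultimately stores its data on HDFS.
ZooKeeper must be deployed: HBase relies on it for cluster coordination, including the location of the hbase:meta table and the address of the active Master.

1.2 Download HBase and inspect the configuration files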

[root@bigdatatest01 ~]# cd software/
[root@bigdatatest01 software]# wget http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.16.2.tar.gz
[root@bigdatatest01 software]# tar -xzvf hbase-1.2.0-cdh5.16.2.tar.gz -C ~/app/
[root@bigdatatest01 software]# cd ~/app/
[root@bigdatatest01 app]# cd hbase-1.2.0-cdh5.16.2/conf/
[root@bigdatatest01 conf]# ll
total 40
-rw-r--r-- 1 1106 4001 1811 Jun  3  2019 hadoop-metrics2-hbase.properties
-rw-r--r-- 1 1106 4001 4603 Jun  3  2019 hbase-env.cmd
-rw-r--r-- 1 1106 4001 7530 Jun  3  2019 hbase-env.sh
-rw-r--r-- 1 1106 4001 2257 Jun  3  2019 hbase-policy.xml
-rw-r--r-- 1 1106 4001  934 Jun  3  2019 hbase-site.xml
-rw-r--r-- 1 1106 4001 4603 Jun  3  2019 log4j.properties
-rw-r--r-- 1 1106 4001   10 Jun  3  2019 regionservers

1.3 Edit the configuration files

[root@bigdatatest01 conf]# vim hbase-env.sh 
export JAVA_HOME=/usr/java/jdk1.8.0_151
export HBASE_MANAGES_ZK=false

[root@bigdatatest01 conf]# vim hbase-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
        <!-- The prefix of hbase.rootdir must match fs.defaultFS in $HADOOP_HOME/conf/core-site.xml -->
        <property>
                <name>hbase.rootdir</name>
                <value>hdfs://bigdatatest02:8020/hbase</value>
        </property>
        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>

        <!-- Temporary directory on the local file system. Point it at a more durable location, because /tmp is cleared on reboot. -->
        <property>
                <name>hbase.tmp.dir</name>
                <value>/root/tmp/hbase</value>
        </property>

        <!-- With a single HMaster, hbase.master would be set to host:port, for example master5:60000 -->
        <!-- With multiple HMasters, only the port 60000 is given, because ZooKeeper elects the active master -->
        <property>
                <name>hbase.master</name>
                <value>60000</value>
        </property>

        <!-- Where ZooKeeper snapshots are stored. The default is under /tmp, which is cleared on reboot. ZooKeeper is installed separately here, so this path points to the dataDir set in $ZOOKEEPER_HOME/conf/zoo.cfg -->
        <property>
                <name>hbase.zookeeper.property.dataDir</name>
                <value>/root/tmp/zk1</value>
        </property>

        <property>
                <name>hbase.zookeeper.quorum</name>
                <value>bigdatatest01</value>
        </property>
        <!-- Port that clients use to connect to ZooKeeper -->
        <property>
                <name>hbase.zookeeper.property.clientPort</name>
                <value>2181</value>
        </property>
        <!-- ZooKeeper session timeout in milliseconds. HBase passes this value to the ZooKeeper ensemble as the suggested maximum session timeout -->
        <property>
                <name>zookeeper.session.timeout</name>
                <value>120000</value>
        </property>

        <!-- When a RegionServer hits a ZooKeeper session expiry, it restarts instead of aborting -->
        <property>
                <name>hbase.regionserver.restart.on.zk.expire</name>
                <value>true</value>
        </property>
        <property>
                <name>hbase.online.schema.update.enable</name>
                <value>true</value>
        </property>

        <property>
                <name>hbase.coprocessor.abortonerror</name>
                <value>false</value>
        </property>
</configuration>

[root@bigdatatest01 conf]# vim regionservers
bigdatatest01

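HBASE_MANAGES_ZK: whether HBase starts and manages its bundled ZooKeeper. It is set to false here because ZooKeeper is installed separately.

1.4 Start HBase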

[root@bigdatatest01 conf]# cd ../
[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# bin/start-hbase.sh 
starting master, logging to /root/app/hbase-1.2.0-cdh5.16.2/bin/../logs/hbase-root-master-bigdatatest01.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
The authenticity of host 'bigdatatest01 (192.168.20.66)' can't be established.
ECDSA key fingerprint is SHA256:ubGg3eXeUXlLZLkDjezmag/lpWFbRFEl30lMiQ/Is6M.
ECDSA key fingerprint is MD5:eb:fa:c5:d9:2a:2e:d5:18:39:fc:41:18:8c:4a:76:f6.
Are you sure you want to continue connecting (yes/no)? yes
bigdatatest01: Warning: Permanently added 'bigdatatest01' (ECDSA) to the list of known hosts.
bigdatatest01: starting regionserver, logging to /root/app/hbase-1.2.0-cdh5.16.2/bin/../logs/hbase-root-regionserver-bigdatatest01.out
bigdatatest01: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
bigdatatest01: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

The startup fails with the following error:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/hbase":hbase:hbase:drwxr-xr-x

Fix the error by granting write access on /hbase as the hdfs user:

[root@bigdatatest01 ~]# su - hdfs
Last login: Tue Feb  2 10:31:56 CST 2021 on pts/1
[hdfs@bigdatatest01 ~]$ hadoop fs -chmod 777 /hbase

Restart the service

[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# bin/stop-hbase.sh 
stopping hbasecat: /tmp/hbase-root-master.pid: No such file or directory

[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# bin/start-hbase.sh 
starting master, logging to /root/app/hbase-1.2.0-cdh5.16.2/bin/../logs/hbase-root-master-bigdatatest01.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
bigdatatest01: regionserver running as process 26211. Stop it first.

The log shows that permissions are still insufficient:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=ALL, inode="/hbase/.tmp":hbase:hbase:drwxr-xr-x

Grant permissions on /hbase/.tmp as well. The final recursive chmod on /hbase covers everything beneath it:

[hdfs@bigdatatest01 ~]$ hadoop fs -chmod 777 /hbase/.tmp
[hdfs@bigdatatest01 ~]$ hadoop fs -chmod -R  777 /hbase/.tmp
[hdfs@bigdatatest01 ~]$ hadoop fs -chmod -R  777 /hbase

Restart the HBase service, then check the running processes with jps:

[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# jps
800 QuorumPeerMain
26211 HRegionServer
1060 Jps
525 QuorumPeerMain
18994 DataNode
32374 HMaster
19354 NodeManager
669 QuorumPeerMain

HBase uses a master/worker architecture:
HMaster: the master node
HRegionServer: the worker node

2. HBase Architecture

[Figure: HBase architecture]

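Client:
- HBase has a special catalog table, hbase:meta (historically written .META.).
  - hbase:meta records the Region mapping for every user table, that is, which RegionServer hosts each Region. In HBase 1.x, the version used here, hbase:meta itself is stored in a single Region.
- Before accessing user data, the Client first asks ZooKeeper where the Region of hbase:meta is located, then reads hbase:meta to find the Region that holds the target data. This takes several network round trips, but the Client caches the locations it has looked up.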
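ZooKeeper:
- Provides the failover mechanism for HBase: it elects the active Master, which avoids a single point of failure at the Master.
- Monitors the state of each RegionServer and notifies the Master when a RegionServer comes online or goes offline.
- Is often described as storing the HBase schema (which tables exist and which Column Families each has). In HBase 1.x the table descriptors themselves live on HDFS, and ZooKeeper tracks table state such as enabled or disabled.
- Stores the location of the hbase:meta table and the address of the Master.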
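Master:
- Assigns Regions to RegionServers.
- Balances load across RegionServers.
- Detects failed RegionServers and reassigns their Regions.
- Cleans up obsolete HBase files on HDFS.
- Handles schema change requests (creating, deleting, and altering tables, adding Column Families, and so on).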
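RegionServer:
- Maintains the Regions that the Master assigns to it and handles the I/O requests for those Regions.
- Splits Regions that grow too large at runtime and runs compactions.
  - Client access to data does not involve the Master: addressing goes through ZooKeeper and RegionServers, and reads and writes go to RegionServers. The Master only maintains the metadata of Tables and Regions, so its load is low.
  - hbase:meta stores the location of every Region. After a Region splits, the Master decides which RegionServer hosts each new Region, so the Master is the component that knows the new locations and manages the updates to hbase:meta.
  - Taken together, these two points mean that a Master outage can be tolerated for a while, as long as no Region split occurs in the meantime.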
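HRegion:
- A Table is divided along the row dimension into multiple Regions. The Region is the smallest unit of distributed storage and load balancing in HBase: different Regions can live on different RegionServers, but a single Region is never split across servers.
- Regions are divided by size. Each table starts with a single Region. As data is inserted the Region grows, and when a Column Family of the Region reaches a threshold the Region splits into two new Regions.
- Each Region is identified by <table name, startRowkey, creation time>.
- The catalog table (hbase:meta) records the endRowkey of the Region.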
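Store:
- Each Region consists of one or more Stores. HBase keeps data that is accessed together in the same Store by creating one Store per ColumnFamily, so a Region has as many Stores as the table has ColumnFamilies. A Store consists of one MemStore and zero or more StoreFiles. HBase uses the size of a Store to decide whether a Region needs to be split.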
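MemStore:
- The MemStore lives in memory and holds modified data as KeyValues. When it reaches a threshold (128 MB by default), it is flushed to a file: a snapshot of the MemStore is taken and written out. HBase runs a dedicated thread to perform MemStore flushes.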
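StoreFile:
- Data flushed from the MemStore to disk becomes a StoreFile, which is stored in the HFile format. When the number of StoreFiles exceeds a threshold, the system merges them through minor and major compactions into larger StoreFiles. Version merging and removal of deleted data take place during major compaction.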
HFile:
- The storage format for KeyValue data in HBase. An HFile is a binary file stored on Hadoop. A StoreFile is a lightweight wrapper around an HFile, so underneath every StoreFile is an HFile.
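HLog:
- HLog (WAL): WAL stands for Write Ahead Log and is used for disaster recovery. The HLog records every change to the data, so if a RegionServer crashes, its data can be recovered by replaying the HLog.
- In the classic design, the HLog file is an ordinary Hadoop SequenceFile. The key of each entry is an HLogKey object that records where the write belongs: the Table and Region names plus a sequence number and a timestamp. The timestamp is the write time, and the sequence number starts at 0 or at the last sequence number persisted to the file system. The value of each entry is an HBase KeyValue object, matching the KeyValue stored in the HFile.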
BlockCache: the read cache.
- It lives inside the RegionServer, and each RegionServer has exactly one BlockCache.
- It is initialized when the RegionServer starts.
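
The addressing step described above (find the Region whose row key range covers a given RowKey) can be illustrated with a sorted map. This is only a toy sketch, not the HBase implementation: the class MetaLookupSketch, its methods, and the server names rs1 to rs3 are hypothetical, and it assumes ASCII row keys so that String ordering matches byte ordering.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy stand-in for the hbase:meta lookup; all names here are hypothetical.
class MetaLookupSketch {
    // Region start row key -> hosting server. "" is the start key of the first Region.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String server) {
        regionsByStartKey.put(startKey, server);
    }

    // The Region that holds rowKey is the one with the greatest start key <= rowKey.
    String locate(String rowKey) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        MetaLookupSketch meta = new MetaLookupSketch();
        meta.addRegion("", "rs1");   // rows before "g"
        meta.addRegion("g", "rs2");  // rows from "g" up to "p"
        meta.addRegion("p", "rs3");  // rows from "p" onward
        System.out.println(meta.locate("apple")); // rs1
        System.out.println(meta.locate("grape")); // rs2
        System.out.println(meta.locate("zebra")); // rs3
    }
}
```

A client-side cache of such lookups is what lets most requests skip ZooKeeper and hbase:meta entirely.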

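3. HBase Physical Storage Model

[Figure: HBase physical storage model]

Each HBase cell carries the following data:
- RowKey: the primary key.
- Column Family: splits the table vertically. A table can have several Column Families, and each Column Family contains several columns.
- Column: the column (qualifier).
- Version Number: a long. It defaults to the system timestamp and can also be set by the user.
- value: the stored value.
All rows in a Table are sorted in lexicographic order of RowKey.
A Table is divided along the row dimension into multiple HRegions.
HRegions are divided by size. Each table starts with a single HRegion. As data is inserted the HRegion grows, and when it reaches a threshold it splits into two new HRegions of roughly equal size. As the number of rows grows, so does the number of HRegions.
The HRegion is the smallest unit of distributed storage and load balancing in HBase: different HRegions can be placed on different HRegionServers, but a single HRegion is never split across servers.
Although the HRegion is the smallest unit of load balancing, it is not the smallest unit of physical storage. An HRegion consists of one or more Stores, each holding one Column Family, and each Store consists of one MemStore and zero or more StoreFiles.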
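
Lexicographic RowKey ordering has a practical consequence that is easy to miss: numeric suffixes do not sort numerically. The sketch below shows this with plain Java strings. It assumes ASCII keys (for which String ordering matches byte ordering), and the padded helper is a hypothetical illustration of one common workaround, not something HBase provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class RowKeyOrderSketch {
    // Return the keys in the order a lexicographically sorted table would store them.
    static List<String> sortedKeys(String... keys) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String key : keys) {
            sorted.add(key);
        }
        return new ArrayList<>(sorted);
    }

    // Hypothetical helper: zero-pad the id so lexicographic order equals numeric order.
    static String padded(int id) {
        return String.format("row%05d", id);
    }

    public static void main(String[] args) {
        // "row10" sorts before "row2" because '1' < '2' at the first differing character.
        System.out.println(sortedKeys("row1", "row2", "row10"));
        // prints [row1, row10, row2]
        System.out.println(sortedKeys(padded(1), padded(2), padded(10)));
        // prints [row00001, row00002, row00010]
    }
}
```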

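4. HBase Read and Write Paths

A single range query in HBase may touch multiple Regions, multiple MemStores, and even multiple StoreFiles.
Updates and deletes are both implemented as inserts. An update does not modify the existing data; it writes a new version distinguished by timestamp. A delete does not remove the original data either; it writes a delete marker, similar to what is commonly called a logical delete.
This greatly simplifies updates and deletes but adds work on the read side, which has to filter the results by version and by delete marker.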
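
The idea that updates and deletes are just more writes can be modeled for a single cell in a few lines. This is a toy sketch under simplifying assumptions, not how HBase stores data: the class VersionedCellSketch and its methods are hypothetical, and the delete here masks only the newest visible version.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of one cell; all names are hypothetical.
class VersionedCellSketch {
    // timestamp -> value, newest timestamp first
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.<Long>reverseOrder());
    // timestamps hidden by a delete marker; the stored values stay in place
    private final Set<Long> deleteMarkers = new HashSet<>();

    // An update is simply another put with a newer timestamp.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // A delete marks the newest visible version instead of removing it.
    void deleteLatest() {
        for (Long timestamp : versions.keySet()) {
            if (!deleteMarkers.contains(timestamp)) {
                deleteMarkers.add(timestamp);
                return;
            }
        }
    }

    // A read walks versions from newest to oldest and skips masked ones.
    String get() {
        for (Map.Entry<Long, String> entry : versions.entrySet()) {
            if (!deleteMarkers.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    // Number of versions physically kept, masked or not.
    int storedVersions() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCellSketch cell = new VersionedCellSketch();
        cell.put(100L, "1");
        cell.put(200L, "2");
        System.out.println(cell.get());            // 2
        cell.deleteLatest();
        System.out.println(cell.get());            // 1
        System.out.println(cell.storedVersions()); // 2
    }
}
```

The read has to do the filtering, which is the extra read-side cost mentioned above. The masked data is only physically removed later, during compaction.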

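4.1 HBase Read Path

[Figure: HBase read path]

The Client issues a read request.
It first asks ZooKeeper which RegionServer hosts the Region of the hbase:meta table.
It looks up the RowKey in hbase:meta to find the target Region and the RegionServer that hosts it.
It wraps the read request and sends it to that RegionServer. The RegionServer parses the request, collects the matching data, and returns it.
On the RegionServer the lookup order is: MemStore --> BlockCache --> HFile.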
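
The lookup order on the RegionServer can be sketched with three maps. This is a deliberately simplified model with hypothetical names: the real BlockCache holds HFile blocks rather than individual rows, and a real read merges results from several sources instead of stopping at the first hit.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the read order; all names are hypothetical.
class ReadPathSketch {
    final Map<String, String> memStore = new HashMap<>();
    final Map<String, String> blockCache = new HashMap<>();
    final Map<String, String> hfile = new HashMap<>();
    String lastSource = "none";

    String read(String rowKey) {
        // 1. Recent writes that have not been flushed yet
        if (memStore.containsKey(rowKey)) {
            lastSource = "MemStore";
            return memStore.get(rowKey);
        }
        // 2. Data cached by an earlier read
        if (blockCache.containsKey(rowKey)) {
            lastSource = "BlockCache";
            return blockCache.get(rowKey);
        }
        // 3. Fall back to the file and cache what was found
        String value = hfile.get(rowKey);
        if (value != null) {
            lastSource = "HFile";
            blockCache.put(rowKey, value);
            return value;
        }
        lastSource = "none";
        return null;
    }

    public static void main(String[] args) {
        ReadPathSketch store = new ReadPathSketch();
        store.hfile.put("row1", "v1");
        store.read("row1");
        System.out.println(store.lastSource); // HFile
        store.read("row1");
        System.out.println(store.lastSource); // BlockCache
        store.memStore.put("row1", "v2");
        System.out.println(store.read("row1") + " from " + store.lastSource); // v2 from MemStore
    }
}
```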

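4.2 HBase Write Path

[Figure: HBase write path]

The Client issues a write request.
It first asks ZooKeeper which RegionServer hosts the Region of the hbase:meta table.
It looks up the RowKey in hbase:meta to find the target Region and the RegionServer that hosts it.
It sends the write request to that RegionServer. The RegionServer parses the request, writes it to the HLog first, and then writes it to the MemStore of the Store for the target Column Family in the target Region.
When the MemStore reaches its flush threshold, an asynchronous flush writes its contents to a StoreFile.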
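
The ordering on the write side (log first, then memory, then flush to a file) can be sketched as follows. The class and its fields are hypothetical, and two things are simplified on purpose: the flush here is triggered by entry count and runs synchronously, whereas HBase flushes by size in bytes and does so asynchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of the write path; all names are hypothetical.
class WritePathSketch {
    final List<String> wal = new ArrayList<>();                       // stands in for the HLog
    TreeMap<String, String> memStore = new TreeMap<>();               // sorted in-memory buffer
    final List<TreeMap<String, String>> storeFiles = new ArrayList<>(); // flushed, immutable files
    final int flushThreshold;                                         // entries, not bytes

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String rowKey, String value) {
        wal.add(rowKey + "=" + value);   // 1. append to the log so the write survives a crash
        memStore.put(rowKey, value);     // 2. apply to the in-memory store
        if (memStore.size() >= flushThreshold) {
            flush();                     // 3. spill to a new file once the buffer is full
        }
    }

    void flush() {
        storeFiles.add(memStore);        // the old buffer becomes a StoreFile
        memStore = new TreeMap<>();      // new writes go to a fresh buffer
    }

    public static void main(String[] args) {
        WritePathSketch store = new WritePathSketch(2);
        store.put("a", "1");
        store.put("b", "2");             // reaches the threshold and triggers a flush
        store.put("c", "3");
        System.out.println(store.wal.size());        // 3
        System.out.println(store.storeFiles.size()); // 1
        System.out.println(store.memStore.size());   // 1
    }
}
```

Because every put is logged before it is buffered, the contents of a lost MemStore could be rebuilt by replaying the log, which is the recovery role the HLog plays.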

5. HBase Shell

5.1 View the help

hbase(main):005:0> help

5.2 Create a namespace

A namespace in HBase is similar to a schema in Oracle or a database in MySQL.

Group name: namespace
  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

alter_namespace: modify a namespace
create_namespace: create a namespace
describe_namespace: describe a namespace
drop_namespace: drop a namespace
list_namespace: list namespaces
list_namespace_tables '<namespace name>': list all tables in a namespace

hbase(main):014:0> create_namespace 'test'
Took 0.7759 seconds
hbase(main):015:0> list_namespace
NAMESPACE
SYSTEM
bigdata
default
hbase
test
8 row(s)
Took 0.0082 seconds
hbase(main):016:0> list_namespace_tables 'test'
TABLE
0 row(s)
Took 0.0060 seconds

5.3 Create a table

 Group name: ddl
  Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters

View the help for the create command

hbase(main):020:0> help 'create'
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
  hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'}

Table configuration options can be put at the end.
Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
  hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then

Create the table

hbase(main):021:0> create 'test:demo', 'o'
Created table test:demo
Took 1.3342 seconds
=> Hbase::Table - test:demo

5.4 CRUD

5.4.1 Insert data

hbase(main):028:0> put 'test:demo', 'row1' , 'o:id', '1'
Took 0.0427 seconds

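5.4.2 View data

Scan all data in the table (the prompt number shows that this scan ran before the put above, so it returns 0 rows):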

hbase(main):022:0> scan 'test:demo'
ROW                                                         COLUMN+CELL
0 row(s)
Took 0.0342 seconds

Get the data of a single row

hbase(main):029:0> get 'test:demo','row1'
COLUMN                                                      CELL
 o:id                                                       timestamp=1612256670990, value=1
 1 row(s)
Took 0.0230 seconds

5.4.3 Update data

hbase(main):030:0> put 'test:demo', 'row1' , 'o:id', '2'
Took 0.0094 seconds
hbase(main):031:0> get 'test:demo','row1'
COLUMN                                                      CELL
 o:id                                                       timestamp=1612256869333, value=2
 1 row(s)
Took 0.0123 seconds

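5.4.4 Delete data

Deleting data: in this session, delete removes only the latest version of the cell. If the cell has several versions, reading it after the delete returns the previous version.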

hbase(main):032:0> delete 'test:demo', 'row1', 'o:id'
Took 0.0161 seconds
hbase(main):033:0> get 'test:demo','row1'
COLUMN                                                      CELL
 o:id                                                       timestamp=1612256670990, value=1
 1 row(s)
Took 0.0072 seconds

6. HBase API CRUD

6.1 POM file

<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.0-cdh5.16.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.2.0-cdh5.16.2</version>
</dependency>

6.2 HBaseUtils Code

package com.xk.bigdata.hbase.basic;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class HBaseUtils {

    public static Connection connection;

    /**
     * Create the connection.
     *
     * @param zookeeperQuorum ZooKeeper quorum address
     * @throws Exception if the connection cannot be created
     */
    public static void init(String zookeeperQuorum) throws Exception {
        Configuration hbaseConf = new Configuration();
        hbaseConf.set(HConstants.ZOOKEEPER_QUORUM, zookeeperQuorum);
        Configuration conf = HBaseConfiguration.create(hbaseConf);
        connection = ConnectionFactory.createConnection(conf);
    }

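    /**
     * Close the connection.
     *
     * @throws Exception if closing fails
     */
    public static void close() throws Exception {
        // Guard against close() being called when init() never succeeded
        if (connection != null && !connection.isClosed()) {
            connection.close();
        }
    }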

    /**
     * Create a table.
     *
     * @param tableName table name
     * @param familys   array of column family names
     * @throws IOException on HBase access failure
     */
    public static void createTable(String tableName, String[] familys) throws IOException {
        // Admin is Closeable; try-with-resources releases it when done
        try (Admin admin = connection.getAdmin()) {
            if (admin.tableExists(TableName.valueOf(tableName))) {
                System.out.println(tableName + " already exists");
            } else {
                HTableDescriptor tableDescriptor = new HTableDescriptor(TableName.valueOf(tableName));
                for (String family : familys) {
                    tableDescriptor.addFamily(new HColumnDescriptor(family));
                }
                admin.createTable(tableDescriptor);
                System.out.println(tableName + " created");
            }
        }
    }

    /**
     * Insert one cell.
     *
     * @param tableName table name
     * @param rowKey    row key
     * @param family    column family
     * @param qualifier column qualifier
     * @param value     cell value
     * @throws IOException on HBase access failure
     */
    public static void putRecord(String tableName, String rowKey, String family, String qualifier, String value) throws IOException {
        // Table instances are lightweight and Closeable; close them after use
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
            table.put(put);
            System.out.println("Inserted column " + qualifier + " into " + tableName);
        }
    }

    /**
     * Read one row from HBase and print its cells.
     *
     * @param tableName table name
     * @param rowKey    row key
     * @throws IOException on HBase access failure
     */
    public static void getOneRecord(String tableName, String rowKey) throws IOException {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Get get = new Get(Bytes.toBytes(rowKey));
            Result result = table.get(get);
            // Print each cell as "row : family:qualifier : value"
            for (Cell cell : result.rawCells()) {
                System.out.println(Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength())
                        + " : " + Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength())
                        + ":" + Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength())
                        + " : " + Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
            }
        }
    }

    /**
     * Scan the whole table and print every cell.
     *
     * @param tableName table name
     * @throws IOException on HBase access failure
     */
    public static void getAllRecord(String tableName) throws IOException {
        Scan scan = new Scan();
        // Close both the table and the scanner so server-side scanner resources are released
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner scanner = table.getScanner(scan)) {
            for (Result result : scanner) {
                for (Cell cell : result.rawCells()) {
                    System.out.println(Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength())
                            + " : " + Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength())
                            + ":" + Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength())
                            + " : " + Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
                }
            }
        }
    }

    /**
     * Delete one row.
     *
     * @param tableName table name
     * @param rowKey    row key
     * @throws IOException on HBase access failure
     */
    public static void deleteRecord(String tableName, String rowKey) throws IOException {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            // The Delete must be built from the row key (the original passed the table name by mistake)
            Delete delete = new Delete(Bytes.toBytes(rowKey));
            table.delete(delete);
            System.out.println("Deleted row " + rowKey + " from " + tableName);
        }
    }
}

6.3 HBaseUtilsTest Code

package com.xk.bigdata.hbase.basic;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;

public class HBaseUtilsTest {

    final String zookeeperQuorum = "bigdatatest02";

    @Before
    public void setUp() throws Exception {
        HBaseUtils.init(zookeeperQuorum);
    }

    @After
    public void cleanUp() throws Exception {
        HBaseUtils.close();
    }

    @Test
    public void testCreateTable() throws Exception {
        HBaseUtils.createTable("test:demo1", new String[]{"o"});
    }

    @Test
    public void testPutRecord() throws Exception {
        HBaseUtils.putRecord("test:demo1", "row2", "o", "id", "2");
    }

    @Test
    public void testGetOneRecord() throws IOException {
        HBaseUtils.getOneRecord("test:demo1", "row1");
    }

    @Test
    public void testGetAllRecord() throws IOException {
        HBaseUtils.getAllRecord("test:demo1");
    }

    @Test
    public void testDeleteRecord() throws IOException {
        HBaseUtils.deleteRecord("test:demo1", "row2");
    }
}

7. HBase API (Multi-Version Control)

7.1 HBaseMultiVersion Code

package com.xk.bigdata.hbase.basic;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class HBaseMultiVersion {

    public static Connection connection;

    /**
     * Create the connection.
     *
     * @param zookeeperQuorum ZooKeeper quorum address
     * @throws Exception if the connection cannot be created
     */
    public static void init(String zookeeperQuorum) throws Exception {
        Configuration hbaseConf = new Configuration();
        hbaseConf.set(HConstants.ZOOKEEPER_QUORUM, zookeeperQuorum);
        Configuration conf = HBaseConfiguration.create(hbaseConf);
        connection = ConnectionFactory.createConnection(conf);
    }

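    /**
     * Close the connection.
     *
     * @throws Exception if closing fails
     */
    public static void close() throws Exception {
        // Guard against close() being called when init() never succeeded
        if (connection != null && !connection.isClosed()) {
            connection.close();
        }
    }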

    /**
     * Scan the whole table, returning up to the given number of versions of each cell.
     *
     * @param tableName table name
     * @param version   maximum number of versions to return per cell
     * @throws IOException on HBase access failure
     */
    public static void getAllVersionRecord(String tableName, Integer version) throws IOException {
        Scan scan = new Scan();
        scan.setMaxVersions(version);
        // Close both the table and the scanner so server-side scanner resources are released
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner scanner = table.getScanner(scan)) {
            for (Result result : scanner) {
                for (Cell cell : result.rawCells()) {
                    System.out.println(Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength())
                            + " : " + Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength())
                            + ":" + Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength())
                            + " : " + Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
                }
            }
        }
    }

}

7.2 HBaseMultiVersionTest Code

package com.xk.bigdata.hbase.basic;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;

public class HBaseMultiVersionTest {

    final String zookeeperQuorum = "bigdatatest02";

    @Before
    public void setUp() throws Exception {
        HBaseMultiVersion.init(zookeeperQuorum);
    }

    @After
    public void cleanUp() throws Exception {
        HBaseMultiVersion.close();
    }

    @Test
    public void testGetAllVersionRecord() throws IOException {
        HBaseMultiVersion.getAllVersionRecord("test:demo1", -1);
    }

}
