HBase Study Notes (Atguigu 尚硅谷): Java API, Shell Commands, and the Guli Weibo Case Study

HBase

I. Introduction to HBase

1. What is HBase

HBase is a distributed, scalable NoSQL database built for massive data volumes. Logically, its data model resembles a relational database: data lives in tables with rows and columns. But viewed from its underlying physical storage structure (key-value), HBase is better described as a multi-dimensional map.
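As a rough illustration (a toy model, not HBase's actual API), that multi-dimensional K-V view can be sketched as nested sorted maps in Java, with the timestamp as the innermost key:

```java
import java.util.TreeMap;

public class KvModel {
    // rowkey -> (cf:qualifier -> (timestamp -> value)): a toy model of the
    // sorted, multi-dimensional K-V view described above.
    static final TreeMap<String, TreeMap<String, TreeMap<Long, String>>> TABLE = new TreeMap<>();

    static void put(String row, String col, long ts, String value) {
        TABLE.computeIfAbsent(row, r -> new TreeMap<>())
             .computeIfAbsent(col, c -> new TreeMap<>())
             .put(ts, value);
    }

    // Like HBase, a read returns the value with the highest timestamp by default.
    static String get(String row, String col) {
        return TABLE.get(row).get(col).lastEntry().getValue();
    }

    public static void main(String[] args) {
        put("1001", "info:name", 1L, "Nick");
        put("1001", "info:name", 2L, "Janna"); // newer version wins
        System.out.println(get("1001", "info:name")); // Janna
    }
}
```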

2. Logical and Physical Structure

(figures: HBase logical structure and physical storage structure)

3. Data Model

1) Name Space
A namespace is similar to the database concept in a relational database; each namespace holds multiple tables. HBase ships with two built-in namespaces, hbase and default: the hbase namespace holds HBase's internal tables, while default is the namespace user tables go into when none is specified.
2) Region
Similar to the table concept in a relational database, except that an HBase table only declares column families at creation time, not individual columns. Fields can therefore be specified dynamically, on demand, at write time, which lets HBase handle schema changes far more easily than a relational database.
3) Row
Each row in an HBase table consists of a RowKey and one or more columns. Rows are stored in lexicographic order of the RowKey, and lookups can only be done by RowKey, so RowKey design is critical.
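Because that ordering is lexicographic over bytes rather than numeric, ids of different lengths do not sort the way you might expect, and fixed-width (e.g. zero-padded) rowkeys are a common fix. A quick standalone demonstration, using plain Java string ordering, which matches HBase's byte ordering for ASCII keys:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RowKeyOrder {
    // Sorts candidate rowkeys the way HBase stores rows (for ASCII keys):
    // lexicographically, byte by byte.
    static List<String> hbaseOrder(String... keys) {
        return Arrays.stream(keys).sorted().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Plain numeric ids: "9" sorts AFTER "10" and "100".
        System.out.println(hbaseOrder("9", "10", "100"));    // [10, 100, 9]
        // Zero-padding to a fixed width restores numeric order.
        System.out.println(hbaseOrder("009", "010", "100")); // [009, 010, 100]
    }
}
```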
4) Column
Every column in HBase is identified by a Column Family plus a Column Qualifier (the column name), e.g. info:name, info:age (the name column in the info family). Only column families must be declared at table creation; qualifiers need no predefinition.
5) Time Stamp
Identifies different versions of a value. If no timestamp is supplied on write, the system assigns one automatically, set to the time the data was written to HBase. Each column stores multiple versions of its data, distinguished by timestamp.

6) Cell
The unit uniquely determined by {rowkey, column family:column qualifier, timestamp}. Data in a cell is untyped and stored entirely as raw bytes. Compared with a relational database's two-dimensional table, a cell is more like one block in a three-dimensional table, the timestamp being the extra dimension.

4. Basic Architecture

(figure: HBase basic architecture)

Architecture roles:
1) Region Server
The Region Server manages Regions; its implementing class is HRegionServer. Its main responsibilities are:
data operations: get, put, delete;
Region operations: splitRegion, compactRegion.
2) Master
The Master manages all Region Servers; its implementing class is HMaster. Its main responsibilities are:
table operations: create, delete, alter;
RegionServer operations: assigning regions to each RegionServer, monitoring each RegionServer's state, load balancing, and failover.
3) Zookeeper
HBase uses Zookeeper for Master high availability, RegionServer monitoring, the entry point to metadata, and cluster configuration maintenance.
4) HDFS
HDFS provides HBase's underlying data storage and supports its high availability.

II. Quick Start

1. Configuring HBase

Make sure the Zookeeper cluster is deployed, then start it:

sk.sh start

Make sure the Hadoop cluster is deployed, then start it:

myhadoop.sh start

Extract HBase to /opt/module:

tar -zxvf hbase-1.3.1-bin.tar.gz -C /opt/module

Edit conf/hbase-env.sh:

export JAVA_HOME=/opt/module/jdk1.6.0_144   # line 27
export HBASE_MANAGES_ZK=false               # line 187

Edit hbase-site.xml:

<configuration>
    <!-- HDFS endpoint; it must match the port configured in Hadoop's core-site.xml -->
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop102:8020/HBase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <!-- New in 0.98; earlier versions had no .port property and the default port was 60000 -->
    <property>
        <name>hbase.master.port</name>
        <value>16000</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop102,hadoop103,hadoop104</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/opt/module/zookeeper-3.5.7/zkData</value>
    </property>

</configuration>

Edit conf/regionservers:

hadoop102
hadoop103
hadoop104

Symlink the Hadoop config files into HBase's conf directory:

ln -s /opt/module/hadoop-3.1.3/etc/hadoop/core-site.xml /opt/module/hbase-1.3.1/conf/core-site.xml
ln -s /opt/module/hadoop-3.1.3/etc/hadoop/hdfs-site.xml /opt/module/hbase-1.3.1/conf/hdfs-site.xml

Distribute HBase to the other nodes:

xsync hbase-1.3.1/

Starting HBase:

Option 1: start daemons individually:

bin/hbase-daemon.sh start master
bin/hbase-daemon.sh start regionserver

Tip: if the nodes' clocks are out of sync, the regionservers will fail to start with a ClockOutOfSyncException. Accurate time matters a great deal to HBase.

Either synchronize the cluster clocks, or allow a larger skew by adding the following to hbase-site.xml:

<property>
    <name>hbase.master.maxclockskew</name>
    <value>180000</value>
    <description>Time difference of regionserver from master</description>
</property>

Option 2: start/stop the whole cluster:

bin/start-hbase.sh   # start all daemons
bin/stop-hbase.sh

2. Shell Commands

Create a namespace:

hbase(main):003:0> create_namespace 'bigdata'

# create a table inside the namespace
hbase(main):006:0> create 'bigdata:stu','info'

Without a namespace prefix, tables go into the default namespace; see chapter 2 of the Atguigu docs for details.

1. Enter the HBase shell:

[atguigu@hadoop102 hbase]$ bin/hbase shell

2. Show the help:
hbase(main):001:0> help
3. List the tables in the current database:
hbase(main):002:0> list
2.2.2 Table operations
1. Create a table:
hbase(main):002:0> create 'student','info'
2. Insert data into the table:
hbase(main):003:0> put 'student','1001','info:sex','male'
hbase(main):004:0> put 'student','1001','info:age','18'
hbase(main):005:0> put 'student','1002','info:name','Janna'
hbase(main):006:0> put 'student','1002','info:sex','female'
hbase(main):007:0> put 'student','1002','info:age','20'
3. Scan the table:
hbase(main):008:0> scan 'student'
hbase(main):009:0> scan 'student',{STARTROW => '1001', STOPROW => '1001'}
hbase(main):010:0> scan 'student',{STARTROW => '1001'}
4. Describe the table:
hbase(main):011:0> describe 'student'
5. Update specific fields:
hbase(main):012:0> put 'student','1001','info:name','Nick'
hbase(main):013:0> put 'student','1001','info:age','100'
6. Get a row, or a specific "family:column" of a row:
hbase(main):014:0> get 'student','1001'
hbase(main):015:0> get 'student','1001','info:name'
7. Count the rows:
hbase(main):021:0> count 'student'
8. Delete data
Delete all data for a rowkey:
hbase(main):016:0> deleteall 'student','1001'
Delete one column of a rowkey:

hbase(main):017:0> delete 'student','1002','info:sex'
9. Truncate the table:
hbase(main):018:0> truncate 'student'
Tip: truncate disables the table first, then truncates it.
10. Drop a table:
First put the table into the disabled state:
hbase(main):019:0> disable 'student'
Only then can it be dropped:
hbase(main):020:0> drop 'student'
Tip: dropping an enabled table fails with: ERROR: Table student is enabled. Disable it first.
11. Alter table metadata
Keep 3 versions of the data in the info column family:
hbase(main):022:0> alter 'student',{NAME=>'info',VERSIONS=>3}
hbase(main):022:0> get 'student','1001',{COLUMN=>'info:name',VERSIONS=>3}

III. Java API

1. Getting an HBase Connection

Get an HBase connection and check whether a table exists:

    //check whether a table exists
    public static boolean isTableExist(String tablename) throws IOException {

        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum","hadoop102,hadoop103,hadoop104");

        //HBaseAdmin admin = new HBaseAdmin(configuration);
        Connection connection = ConnectionFactory.createConnection(configuration);
        Admin admin = connection.getAdmin();

        boolean exists = admin.tableExists(TableName.valueOf(tablename));
        admin.close();
        connection.close();
        return exists;
    }

Factoring out the repeated setup:


public class TestAPI {

    private  static  Connection connection = null;
    private static Admin admin = null;
	//get the connection and admin objects
    static {
        try{
            Configuration configuration = HBaseConfiguration.create();
            configuration.set("hbase.zookeeper.quorum","hadoop102,hadoop103,hadoop104");
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        }catch (IOException e){
            e.printStackTrace();
        }

    }
    
    //check whether a table exists
    public static boolean isTableExist(String tablename) throws IOException {
        
        boolean exists = admin.tableExists(TableName.valueOf(tablename));
        return exists;
    }
    //release resources
    public static void close(){
        if(admin != null) {
            try {
                admin.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if(connection != null) {
            try {
                connection.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        
    }

    public static void main(String[] args) throws IOException {
        System.out.println(isTableExist("stu"));
        close();
    }
}

2. Creating a Table

//create a table
public static void createTable(String tableName, String... cfs) throws IOException {
    if(cfs.length <= 0){
        System.out.println("Please specify at least one column family");
        return;
    }
    if(isTableExist(tableName)){
        System.out.println(tableName + " already exists");
        return;
    }
    //create the table descriptor
    HTableDescriptor hTableDescriptor = new HTableDescriptor(TableName.valueOf(tableName));
    //add each column family
    for(String cf : cfs){
        //create the column family descriptor
        HColumnDescriptor hColumnDescriptor = new HColumnDescriptor(cf);
        //number of versions to keep (default is 1)
        hColumnDescriptor.setMaxVersions(5);
        //add the column family to the table descriptor
        hTableDescriptor.addFamily(hColumnDescriptor);
    }
    admin.createTable(hTableDescriptor);
}

3. Deleting a Table

//delete a table
    public static void dropTable(String tableName) throws IOException {
        if(!isTableExist(tableName)){
            System.out.println(tableName + " does not exist!");
            return;
        }

        //take the table offline
        admin.disableTable(TableName.valueOf(tableName));
        //delete the table
        admin.deleteTable(TableName.valueOf(tableName));
    }

4. Creating a Namespace

//create a namespace
    public static void createNameSpace(String ns) throws IOException {

        //build the namespace descriptor
        NamespaceDescriptor namespaceDescriptor = NamespaceDescriptor.create(ns).build();

        //create the namespace
        try {
            admin.createNamespace(namespaceDescriptor);
        }catch (NamespaceExistException e){
            System.out.println(ns + " namespace already exists!");
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        //even if the namespace exists, execution continues here, so tables can still be created
        System.out.println("reached even though the namespace already exists");
        createTable("test:stu2","info");
    }

5. Inserting Data

 //insert data
    public static void putData(String tableName, String rowKey, String cf, String cn, String value) throws IOException {

        //get the table object
        Table table = connection.getTable(TableName.valueOf(tableName));

        //create the put object
        Put put = new Put(Bytes.toBytes(rowKey));

        //populate the put; several columns can go into one put
        put.addColumn(Bytes.toBytes(cf),Bytes.toBytes(cn),Bytes.toBytes(value));
        put.addColumn(Bytes.toBytes(cf),Bytes.toBytes("sex"),Bytes.toBytes("male"));

        //insert the data
        table.put(put);

        //close the table
        table.close();
    }

6. Reading Data

 //read data with get
    public static void getData(String tableName, String rowKey, String cf, String cn) throws IOException {

        //get the table object
        Table table = connection.getTable(TableName.valueOf(tableName));

        //create the get object
        Get get = new Get(Bytes.toBytes(rowKey));
        //get.addFamily(Bytes.toBytes(cf));
        get.addColumn(Bytes.toBytes(cf),Bytes.toBytes(cn));
        //number of historical versions to fetch
        get.setMaxVersions(10);
        //fetch the data
        Result result = table.get(get);

        //parse the result
        for (Cell cell : result.rawCells()) {
            System.out.println("CF:"+Bytes.toString(CellUtil.cloneFamily(cell))+"  CN:"+Bytes.toString(CellUtil.cloneQualifier(cell))+
                    "  VALUE:"+Bytes.toString(CellUtil.cloneValue(cell)));
        }

        //close the table
        table.close();

    }


 //read data with scan
    public static void scanData(String tableName, String rowKey, String cf, String cn) throws IOException {
        //get the table object
        Table table = connection.getTable(TableName.valueOf(tableName));
        //build the scan object, starting from rowKey
        Scan scan = new Scan(Bytes.toBytes(rowKey));
		//scan.addColumn(Bytes.toBytes(cf),Bytes.toBytes(cn));
        //scan the table
        ResultScanner scanner = table.getScanner(scan);
        //parse the ResultScanner
        for (Result result : scanner) {
            for (Cell cell : result.rawCells()) {
                System.out.println("CF:"+Bytes.toString(CellUtil.cloneFamily(cell))+"  CN:"+Bytes.toString(CellUtil.cloneQualifier(cell))+
                        "  VALUE:"+Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
        table.close();
    }

7. Deleting Data

//delete data
    public static void deleteData(String tableName, String rowKey, String cf, String cn) throws IOException {
        Table table = connection.getTable(TableName.valueOf(tableName));

        //build the delete object
        Delete delete = new Delete(Bytes.toBytes(rowKey));
        //addColumns deletes all versions of the column
        delete.addColumns(Bytes.toBytes(cf),Bytes.toBytes(cn));
        //addColumn deletes only the latest version; a subsequent scan surfaces the previous version
        //delete.addColumn(Bytes.toBytes(cf),Bytes.toBytes(cn));
        //execute the delete
        table.delete(delete);

        table.close();
   }

IV. HBase and MapReduce

1. Configuration

Give Hadoop the HBase jars it needs by adding the line below to /etc/profile.d/my_env.sh, distributing the file, then running source /etc/profile:

export HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase mapredcp`

Loading data with the bundled MapReduce tool:

# upload the file to HDFS first
hdfs dfs -put fruit.tsv /

1001	Apple	Red
1002	Pear	Yellow
1003	Pineapple	Yellow

# then run the MapReduce job

/opt/module/hadoop-3.1.3/bin/yarn jar lib/hbase-server-1.3.1.jar importtsv -Dimporttsv.columns=HBASE_ROW_KEY,info:name,info:color fruit hdfs://hadoop102:8020/fruit.tsv

2. MR: Writing Data into HBase

mapper:

package com.bert.mr1;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class FruitMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        context.write(key, value);
    }
}

reducer:

package com.bert.mr1;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;

import java.io.IOException;

public class FruitReducer extends TableReducer<LongWritable, Text, NullWritable> {


    @Override
    protected void reduce(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {

        //1. iterate over the values for this key
        for (Text value : values) {
            //split the line into its fields
            String[] fields = value.toString().split("\t");

            //build the put object; the first field is the rowkey
            Put put = new Put(Bytes.toBytes(fields[0]));

            put.addColumn(Bytes.toBytes("info"),Bytes.toBytes("name"),Bytes.toBytes(fields[1]));
            put.addColumn(Bytes.toBytes("info"),Bytes.toBytes("color"),Bytes.toBytes(fields[2]));

            //write out
            context.write(NullWritable.get(),put);

        }
    }
}

driver:

package com.bert.mr1;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;


public class FruitDriver implements Tool {

    private Configuration configuration  = null;

    public int run(String[] args) throws Exception {
        //get the job object
        Job job = Job.getInstance(configuration);

        //set the driver class
        job.setJarByClass(FruitDriver.class);
        //set the Mapper and its output KV types
        job.setMapperClass(FruitMapper.class);
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);
        //set the Reducer via TableMapReduceUtil
        TableMapReduceUtil.initTableReducerJob(args[1],FruitReducer.class,job);
        //set the input path
        FileInputFormat.setInputPaths(job, new Path(args[0]));

        //submit the job

        boolean result = job.waitForCompletion(true);

        return result?0:1;
    }

    public void setConf(Configuration conf) {
        configuration = conf;
    }

    public Configuration getConf() {
        return configuration;
    }

    public static void main(String[] args) {
        Configuration configuration = new Configuration();
        try {
            int run = ToolRunner.run(configuration, new FruitDriver(), args);
            System.exit(run);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Package the jar and run it on the cluster:

yarn jar hbase-1.0-SNAPSHOT.jar com.bert.mr1.FruitDriver fruit.tsv fruit1

3. MR: Reading and Writing HBase Tables

mapper:

package com.bert.mr2;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class Fruit2Mapper extends TableMapper<ImmutableBytesWritable, Put> {

    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context) throws IOException, InterruptedException {

        Put put = new Put(key.get());
        //iterate over the row's cells
        for (Cell cell : value.rawCells()) {
            //keep only the name column; filter out the color column
            if("name".equals(Bytes.toString(CellUtil.cloneQualifier(cell)))){
                //add the cell to the put
                put.add(cell);
            }
        }
        context.write(key,put);
    }
}

reducer:

package com.bert.mr2;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class Fruit2Reducer extends TableReducer<ImmutableBytesWritable, Put, NullWritable> {
    @Override
    protected void reduce(ImmutableBytesWritable key, Iterable<Put> values, Context context) throws IOException, InterruptedException {

        for (Put value : values) {
            context.write(NullWritable.get(),value);
        }

    }
}

driver:

package com.bert.mr2;


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class Fruit2Driver implements Tool {

    private Configuration configuration  = null;

    public int run(String[] args) throws Exception {
        //get the job object
        Job job = Job.getInstance(configuration);

        //set the driver class
        job.setJarByClass(Fruit2Driver.class);
        //set the Mapper and its output KV types
        TableMapReduceUtil.initTableMapperJob("fruit1", new Scan(), Fruit2Mapper.class, ImmutableBytesWritable.class, Put.class,job);
        //set the Reducer class
        TableMapReduceUtil.initTableReducerJob("fruit2", Fruit2Reducer.class,job);

        //submit the job
        boolean result = job.waitForCompletion(true);

        return result?0:1;
    }

    public void setConf(Configuration conf) {
        configuration = conf;
    }

    public Configuration getConf() {
        return configuration;
    }

    public static void main(String[] args) {

        try {
            //Configuration configuration = new Configuration();
            Configuration configuration = HBaseConfiguration.create();
            int run = ToolRunner.run(configuration, new Fruit2Driver(), args);
            System.exit(run);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


Create an hbase-site.xml under the resources directory for local testing (the imported HBase dependency bundles the Hadoop jars):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>

    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop102:8020/HBase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <!-- New in 0.98; earlier versions had no .port property and the default port was 60000 -->
    <property>
        <name>hbase.master.port</name>
        <value>16000</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop102,hadoop103,hadoop104</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/opt/module/zookeeper-3.5.7/zkData</value>
    </property>
    <property>
        <name>hbase.master.maxclockskew</name>
        <value>180000</value>
        <description>Time difference of regionserver from master</description>
    </property>

</configuration>

V. Guli Weibo Case Study

1. Table Design

(figure: table design)

Project layout:

(figure: project directory)

2. The Constants Class

A static configuration class:

package com.bert.guliweibo.constants;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;



public class Constants {
    //HBase configuration
    public static final Configuration CONFIGURATION = HBaseConfiguration.create();

    //namespace
    public static final String NAMESPACE = "weibo";

    //weibo content table
    public static final String CONTENT_TABLE = "weibo:content";
    public static final String CONTENT_TABLE_CF = "info";
    public static final int CONTENT_TABLE_VERSIONS = 1;

    //user relation table
    public static final String RELATION_TABLE = "weibo:relation";
    public static final String RELATION_TABLE_CF1= "attends";
    public static final String RELATION_TABLE_CF2= "fans";
    public static final int RELATION_TABLE_VERSIONS = 1;

    //inbox table
    public static final String INBOX_TABLE = "weibo:inbox";
    public static final String INBOX_TABLE_CF = "info";
    public static final int INBOX_TABLE_VERSIONS = 2;

}

3. The HBaseUtil Class

Creates namespaces and tables:

package com.bert.guliweibo.utils;

import com.bert.guliweibo.constants.Constants;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

import java.io.IOException;

/*
1. create a namespace
2. check whether a table exists
3. create a table
 */
public class HBaseUtil {

    public static void createNameSpace(String nameSpace) throws IOException {
        //get the Connection object
        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);

        //build the Admin object
        Admin admin = connection.getAdmin();

        //build the namespace descriptor
        NamespaceDescriptor namespaceDescriptor = NamespaceDescriptor.create(nameSpace).build();

        //create the namespace
        admin.createNamespace(namespaceDescriptor);

        //release resources
        admin.close();
        connection.close();
    }

    public static boolean isTableExist(String tableName) throws IOException {
        //get the Connection object
        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);

        //build the Admin object
        Admin admin = connection.getAdmin();

        boolean exists = admin.tableExists(TableName.valueOf(tableName));

        admin.close();
        connection.close();

        return exists;
    }

    public static void createTable(String tableName,int versions,String...cfs) throws IOException {
        //make sure at least one column family was passed in
        if(cfs.length<=0){
            System.out.println("Please specify at least one column family");
            return;
        }
        //check whether the table already exists
        if(isTableExist(tableName)){
            System.out.println("Table already exists");
            return;
        }
        //get the Connection object
        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);
        //get the Admin object
        Admin admin = connection.getAdmin();
        //create the table descriptor
        HTableDescriptor hTableDescriptor = new HTableDescriptor(TableName.valueOf(tableName));
        //add the column families
        for(String cf:cfs){
            HColumnDescriptor hColumnDescriptor = new HColumnDescriptor(cf);

            //set the number of versions to keep
            hColumnDescriptor.setMaxVersions(versions);

            hTableDescriptor.addFamily(hColumnDescriptor);
        }
        //create the table
        admin.createTable(hTableDescriptor);

        admin.close();
        connection.close();

    }


}

4. The HBaseDao Class

The business operations: follow a user, publish a weibo, unfollow, view the initial feed, and view one user's weibos.

package com.bert.guliweibo.dao;

import com.bert.guliweibo.constants.Constants;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class HBaseDao {

    //publish a weibo
    public static void publishWeibo(String uid, String content) throws IOException {
        //get the Connection object
        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);
        //------------------------------- content table ---------------------------
        //get the content table object
        Table contTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));
        //build the rowkey
        long ts = System.currentTimeMillis();
        String rowKey = uid + "_"+ts;
        //create the put object
        Put contPut = new Put(Bytes.toBytes(rowKey));

        contPut.addColumn(Bytes.toBytes(Constants.CONTENT_TABLE_CF),Bytes.toBytes("content"),Bytes.toBytes(content));
        //insert the data
        contTable.put(contPut);

        //------------------------------- relation table ---------------------------
        Table relaTable = connection.getTable(TableName.valueOf(Constants.RELATION_TABLE));

        Get get = new Get(Bytes.toBytes(uid));
        get.addFamily(Bytes.toBytes(Constants.RELATION_TABLE_CF2));
        Result result = relaTable.get(get);

        //collect the inbox-table puts; when touching several rows, each row needs its own put object
        ArrayList<Put> inboxPuts = new ArrayList<Put>();

        for (Cell cell : result.rawCells()) {

            //build a put for each fan's inbox row
            Put inboxPut = new Put(CellUtil.cloneQualifier(cell));
            inboxPut.addColumn(Bytes.toBytes(Constants.INBOX_TABLE_CF),Bytes.toBytes(uid),Bytes.toBytes(rowKey));
            inboxPuts.add(inboxPut);
        }


        if(inboxPuts.size()>0){
            Table inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));
            inboxTable.put(inboxPuts);
            inboxTable.close();
        }

        relaTable.close();
        contTable.close();
        connection.close();


    }

    //follow users
    public static void addAttends(String uid, String ... attends) throws IOException {

        if(attends.length <= 0){
            System.out.println("Please choose at least one user to follow");
            return;
        }

        //get the Connection object
        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);
        //------------------------------- relation table ---------------------------
        //get the relation table object
        Table relaTable = connection.getTable(TableName.valueOf(Constants.RELATION_TABLE));

        //when touching several rows, each row needs its own put object
        ArrayList<Put> relaPuts = new ArrayList<Put>();

        //create the follower's put object
        Put uidPut = new Put(Bytes.toBytes(uid));

        for (String attend : attends) {
            uidPut.addColumn(Bytes.toBytes(Constants.RELATION_TABLE_CF1),Bytes.toBytes(attend),Bytes.toBytes(attend));
            //create the followed user's put object
            Put attendPut = new Put(Bytes.toBytes(attend));
            attendPut.addColumn(Bytes.toBytes(Constants.RELATION_TABLE_CF2), Bytes.toBytes(uid), Bytes.toBytes(uid));
            relaPuts.add(attendPut);
        }
        relaPuts.add(uidPut);

        relaTable.put(relaPuts);

        //------------------------------- inbox table ---------------------------
        //get the content table object

        Table contTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));

        //create the inbox-table put object
        Put inboxPut = new Put(Bytes.toBytes(uid));

        //populate the inbox put
        for (String attend : attends) {

            Scan scan = new Scan(Bytes.toBytes(attend + "_"), Bytes.toBytes(attend + "|"));
            ResultScanner resultScanner = contTable.getScanner(scan);

            //assign timestamps by hand: the unit is ms, one ms may hold several inserts, so automatic timestamps could collide
            long ts = System.currentTimeMillis();

            for (Result result : resultScanner) {
                inboxPut.addColumn(Bytes.toBytes(Constants.INBOX_TABLE_CF),Bytes.toBytes(attend),++ts, result.getRow());
                ts = ts + 1000;  //when +1 is not enough, step by +1000

            }

        }

        if(!inboxPut.isEmpty()){
            Table inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));
            inboxTable.put(inboxPut);
            inboxTable.close();
        }

        relaTable.close();
        contTable.close();
        connection.close();
    }

    //unfollow users
    public static void deleteAttends(String uid, String...dels) throws IOException {

        //get the Connection object
        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);
        //get the relation table object
        Table relaTable = connection.getTable(TableName.valueOf(Constants.RELATION_TABLE));

        //when touching several rows, each row needs its own delete object
        ArrayList<Delete> relaDelete = new ArrayList<Delete>();

        //the follower's delete object
        Delete uidDelete = new Delete(Bytes.toBytes(uid));

        //build a delete object for each unfollowed user
        for(String del:dels){

            uidDelete.addColumns(Bytes.toBytes(Constants.RELATION_TABLE_CF1),Bytes.toBytes(del));

            //the unfollowed user's delete object
            Delete delDelete = new Delete(Bytes.toBytes(del));
            delDelete.addColumns(Bytes.toBytes(Constants.RELATION_TABLE_CF2),Bytes.toBytes(uid));

            relaDelete.add(delDelete);
        }
        relaDelete.add(uidDelete);
        relaTable.delete(relaDelete);

        //----------------------- inbox table -----------------------------
        Table inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));

        Delete inboxDel = new Delete(Bytes.toBytes(uid));

        for (String del : dels) {
            inboxDel.addColumns(Bytes.toBytes(Constants.INBOX_TABLE_CF),Bytes.toBytes(del));
        }
        inboxTable.delete(inboxDel);

        relaTable.close();
        inboxTable.close();
        connection.close();

    }

    //fetch the initial feed
    public static void getInit(String uid) throws IOException {
        //get the Connection object
        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);

        //get the inbox table object
        Table inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));

        //get the content table object
        Table contTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));

        Get inboxGet = new Get(Bytes.toBytes(uid));
        inboxGet.setMaxVersions(2);
        Result result = inboxTable.get(inboxGet);

        for (Cell cell : result.rawCells()) {

            //the inbox value is a content-table rowkey; build a get from it
            Get contGet = new Get(CellUtil.cloneValue(cell));

            Result contResult = contTable.get(contGet);

            for (Cell rawCell : contResult.rawCells()) {
                System.out.println("RK:"+Bytes.toString(CellUtil.cloneRow(rawCell))
                        +",CF:"+Bytes.toString(CellUtil.cloneFamily(rawCell))
                        +",CN:"+Bytes.toString(CellUtil.cloneQualifier(rawCell))
                        +",Value:"+Bytes.toString(CellUtil.cloneValue(rawCell)));
            }

        }

        //release resources
        inboxTable.close();
        contTable.close();
        connection.close();
    }

    //fetch all weibos of one user, via a filter
    public static void getWeiBo(String uid) throws IOException {

        Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);

        Table contTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));

        Scan scan = new Scan();
        //build a filter matching rowkeys that contain uid + "_" as a substring
        RowFilter rowFilter = new RowFilter(CompareFilter.CompareOp.EQUAL,new SubstringComparator(uid+"_"));
        scan.setFilter(rowFilter);
        ResultScanner scanner = contTable.getScanner(scan);

        for (Result result : scanner) {
            for (Cell rawCell : result.rawCells()) {
                System.out.println("RK:"+Bytes.toString(CellUtil.cloneRow(rawCell))
                        +",CF:"+Bytes.toString(CellUtil.cloneFamily(rawCell))
                        +",CN:"+Bytes.toString(CellUtil.cloneQualifier(rawCell))
                        +",Value:"+Bytes.toString(CellUtil.cloneValue(rawCell)));
            }
        }
        contTable.close();
        connection.close();
    }

}
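
A side note on the scan range used in addAttends above: scanning from attend + "_" to attend + "|" covers exactly the rowkeys of the form uid_timestamp for that uid, because '|' (0x7C) sorts after '_' (0x5F) in byte order, while any other character that could follow the uid sorts below '_'. A standalone sketch of the idea, using plain string comparison, which matches HBase's byte ordering for ASCII keys:

```java
public class PrefixScanRange {
    // Returns true if rowKey falls inside HBase's half-open [start, stop) scan range.
    static boolean inRange(String rowKey, String start, String stop) {
        return rowKey.compareTo(start) >= 0 && rowKey.compareTo(stop) < 0;
    }

    public static void main(String[] args) {
        String start = "1001_";   // attend + "_"
        String stop  = "1001|";   // '|' (0x7C) sorts just after '_' (0x5F)
        System.out.println(inRange("1001_1639985020184", start, stop)); // true
        System.out.println(inRange("10011_x", start, stop));            // false: different uid
        System.out.println(inRange("1002_1639985020184", start, stop)); // false
    }
}
```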

5. Testing

package com.bert.guliweibo.test;

import com.bert.guliweibo.constants.Constants;
import com.bert.guliweibo.dao.HBaseDao;
import com.bert.guliweibo.utils.HBaseUtil;

import java.io.IOException;

public class TestWeiBo {
    public static void init(){
        //create the namespace and the three tables
        try {
            HBaseUtil.createNameSpace(Constants.NAMESPACE);

            HBaseUtil.createTable(Constants.CONTENT_TABLE, Constants.CONTENT_TABLE_VERSIONS,Constants.CONTENT_TABLE_CF);

            HBaseUtil.createTable(Constants.RELATION_TABLE,Constants.RELATION_TABLE_VERSIONS,Constants.RELATION_TABLE_CF1
                    ,Constants.RELATION_TABLE_CF2);

            HBaseUtil.createTable(Constants.INBOX_TABLE,Constants.INBOX_TABLE_VERSIONS,Constants.INBOX_TABLE_CF);


        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    public static void main(String[] args) throws IOException, InterruptedException {

//        HBaseDao.deleteAttends("1002","1003");
//        HBaseDao.getInit("1002");
//        Thread.sleep(10);
//        HBaseDao.addAttends("1002","1003");
//        HBaseDao.getInit("1002");


        init();

        HBaseDao.publishWeibo("1001","test posted weibo 1111111111");

        HBaseDao.addAttends("1002","1001","1003");

        HBaseDao.getInit("1002");

        System.out.println("-----------------------11111-----------------------" );

        HBaseDao.publishWeibo("1003","1003 posted weibo 11111111111");
        Thread.sleep(10);
        HBaseDao.publishWeibo("1001","1001 posted another weibo 22222222222222");
        Thread.sleep(10);
        HBaseDao.publishWeibo("1003","1003 posted another weibo 2222222222222");
        Thread.sleep(10);
        HBaseDao.publishWeibo("1001","1001 posted another weibo 3333333333333333333");
        Thread.sleep(10);
        HBaseDao.publishWeibo("1003","1003 posted another weibo 23333333333333333");

        HBaseDao.getInit("1002");

        System.out.println("--------------2222222222222222------------------------");
        HBaseDao.deleteAttends("1002","1003");
        HBaseDao.getInit("1002");

        System.out.println("--------------333333------------------------");
        HBaseDao.addAttends("1002","1003");
        HBaseDao.getInit("1002");

        System.out.println("--------------44444444444444------------------------");
        HBaseDao.getWeiBo("1001");
    }
}
