HBase rowkey example: common HBase classes, basic operations, and a worked case

Latest recommended article published 2022-03-21 16:42:58

weixin_39928099


Article link: https://blog.csdn.net/weixin_39928099/article/details/111628193


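This post walks through the key HBase client classes: HBaseConfiguration, HBaseAdmin, HTableDescriptor, HColumnDescriptor, HTable, Put, Get, and ResultScanner. It then covers client-side request filters such as FilterList and SingleColumnValueFilter, and closes with a worked example that shows how to write, read, and filter data in HBase.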


1. HBaseConfiguration

Class: org.apache.hadoop.hbase.HBaseConfiguration
Purpose: holds the configuration used to connect to HBase
Usage: Configuration config = HBaseConfiguration.create();
Note: HBaseConfiguration.create() looks for hbase-site.xml on the classpath and uses the settings it finds there to initialize the Configuration.

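2. HBaseAdmin

Class: org.apache.hadoop.hbase.client.HBaseAdmin
Purpose: provides an interface for managing the tables in an HBase database (creating, disabling, deleting)
Usage: HBaseAdmin admin = new HBaseAdmin(config);
Note: this constructor is deprecated since HBase 1.0; newer clients obtain an Admin from Connection.getAdmin() instead.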

3. HTableDescriptor

Class: org.apache.hadoop.hbase.HTableDescriptor
Purpose: HTableDescriptor holds the table name and the table's column families
Usage: HTableDescriptor htd = new HTableDescriptor(tablename);
htd.addFamily(new HColumnDescriptor("myFamily"));

4. HColumnDescriptor

Class: org.apache.hadoop.hbase.HColumnDescriptor
Purpose: HColumnDescriptor holds the settings of a single column family
Usage: HTableDescriptor htd = new HTableDescriptor(tablename);
htd.addFamily(new HColumnDescriptor("myFamily"));

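5. HTable

Class: org.apache.hadoop.hbase.client.HTable
Purpose: HTable is the client-side handle used to read from and write to one HBase table
Usage: HTable tab = new HTable(config, Bytes.toBytes(tablename));
ResultScanner sc = tab.getScanner(Bytes.toBytes("familyName"));
Note: this fetches all data in the column family familyName. The HTable constructors are deprecated since HBase 1.0; newer clients use Connection.getTable(TableName.valueOf(tablename)).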

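6. Put

Class: org.apache.hadoop.hbase.client.Put
Purpose: writes data to a single row
Usage: HTable table = new HTable(config, Bytes.toBytes(tablename));
Put put = new Put(row);
put.add(family, qualifier, value);
table.put(put);
Note: writes the cell given by family, qualifier, and value into row of table tablename (row, family, qualifier, and value are all byte[]).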

7. Get

Class: org.apache.hadoop.hbase.client.Get
Purpose: reads the data of a single row
Usage: HTable table = new HTable(config, Bytes.toBytes(tablename));
Get get = new Get(Bytes.toBytes(row));
Result result = table.get(get);
Note: fetches the data stored under rowkey row in table tablename.

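8. ResultScanner

Class: org.apache.hadoop.hbase.client.ResultScanner (an interface)
Purpose: iterates over the rows returned by a scan
Usage: ResultScanner scanner = table.getScanner(family);
for (Result rowResult : scanner) {
    byte[] value = rowResult.getValue(family, column);
}
Note: loops over the rows and reads one column value from each (family and column are byte[]).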

Client request filters:

1. FilterList

FilterList represents a list of filters that are combined with either
FilterList.Operator.MUST_PASS_ALL (logical AND) or
FilterList.Operator.MUST_PASS_ONE (logical OR). The example below shows the OR case:
the FilterList checks whether the same column equals 'value1' or 'value2'.

FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ONE);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(Bytes.toBytes("cfamily"), Bytes.toBytes("column"), CompareOp.EQUAL, Bytes.toBytes("value1"));
list.addFilter(filter1);
SingleColumnValueFilter filter2 = new SingleColumnValueFilter(Bytes.toBytes("cfamily"), Bytes.toBytes("column"), CompareOp.EQUAL, Bytes.toBytes("value2"));
list.addFilter(filter2);
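
To make the two operators concrete, here is a minimal plain-Java sketch that models MUST_PASS_ALL as a logical AND and MUST_PASS_ONE as a logical OR. It deliberately does not use the HBase API: the class FilterListModel, its methods, and the Map-based row are hypothetical stand-ins chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model, not HBase code: a "row" is a map from column name to value.
class FilterListModel {
    // MUST_PASS_ALL: the row is kept only if every filter accepts it (logical AND).
    static boolean mustPassAll(List<Predicate<Map<String, String>>> filters, Map<String, String> row) {
        return filters.stream().allMatch(f -> f.test(row));
    }

    // MUST_PASS_ONE: the row is kept if at least one filter accepts it (logical OR).
    static boolean mustPassOne(List<Predicate<Map<String, String>>> filters, Map<String, String> row) {
        return filters.stream().anyMatch(f -> f.test(row));
    }

    // Stand-in for a SingleColumnValueFilter with CompareOp.EQUAL.
    static Predicate<Map<String, String>> columnEquals(String column, String value) {
        return row -> value.equals(row.get(column));
    }

    public static void main(String[] args) {
        List<Predicate<Map<String, String>>> filters = Arrays.asList(
                columnEquals("column", "value1"),
                columnEquals("column", "value2"));
        Map<String, String> row = Collections.singletonMap("column", "value2");
        System.out.println(mustPassOne(filters, row)); // true: the second filter accepts the row
        System.out.println(mustPassAll(filters, row)); // false: the first filter rejects it
    }
}
```

One difference from the real API is worth keeping in mind: this model rejects a row that lacks the column, whereas SingleColumnValueFilter keeps such rows unless setFilterIfMissing(true) is called.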

2. SingleColumnValueFilter

SingleColumnValueFilter tests a column value for equality (CompareOp.EQUAL), inequality (CompareOp.NOT_EQUAL), or a range (e.g., CompareOp.GREATER). The example below checks whether the column value equals the string 'my values'.

SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("cFamily"), Bytes.toBytes("column"), CompareOp.EQUAL, Bytes.toBytes("my values"));
scan.setFilter(filter);

3. ColumnPrefixFilter

ColumnPrefixFilter keeps only the columns whose name (qualifier) starts with the given prefix.

byte[] prefix = Bytes.toBytes("values");
Filter f = new ColumnPrefixFilter(prefix);
scan.setFilter(f);

4. MultipleColumnPrefixFilter

MultipleColumnPrefixFilter behaves like ColumnPrefixFilter but accepts several prefixes.

byte[][] prefixes = new byte[][] {Bytes.toBytes("value1"), Bytes.toBytes("value2")};
Filter f = new MultipleColumnPrefixFilter(prefixes);
scan.setFilter(f);
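
What "prefix" means for these two filters can be sketched in plain Java. The class ColumnPrefixModel and its methods below are hypothetical and do not come from HBase; the sketch only assumes that qualifiers are compared byte by byte.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical model, not HBase code.
class ColumnPrefixModel {
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // ColumnPrefixFilter analogue: true if the qualifier starts with the prefix, byte by byte.
    static boolean hasPrefix(byte[] qualifier, byte[] prefix) {
        if (qualifier.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (qualifier[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // MultipleColumnPrefixFilter analogue: true if any of the prefixes matches.
    static boolean hasAnyPrefix(byte[] qualifier, byte[][] prefixes) {
        for (byte[] prefix : prefixes) {
            if (hasPrefix(qualifier, prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[][] prefixes = { utf8("value1"), utf8("value2") };
        System.out.println(hasPrefix(utf8("values_total"), utf8("values"))); // true
        System.out.println(hasAnyPrefix(utf8("value2_x"), prefixes));        // true
        System.out.println(hasAnyPrefix(utf8("other"), prefixes));           // false
    }
}
```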

5. QualifierFilter

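QualifierFilter filters on the column name (qualifier). Like the other comparison filters, it takes a comparison operator and a comparator.

Filter f = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("QualifierName")));
scan.setFilter(f);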

6. RowFilter

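RowFilter filters on the rowkey. When you want a range of rowkeys, it is usually better to set the start row and stop row on the Scan, but a RowFilter can also be used.

Filter f = new RowFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("rowkey")));
scan.setFilter(f);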
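
The reason a start/stop row range is usually preferable is that HBase keeps rows sorted by rowkey, so a range can be located directly, while a RowFilter on its own is checked against every row the scan visits. The sketch below models a table as a sorted map to show the half-open range [startRow, stopRow) that a Scan uses by default. RowRangeModel is a hypothetical name, and the String keys are an assumption that only matches HBase's byte ordering for ASCII rowkeys.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model, not HBase code: a table is a map sorted by rowkey.
class RowRangeModel {
    // Returns the rows with startRow <= key < stopRow (start inclusive, stop exclusive).
    static SortedMap<String, String> range(TreeMap<String, String> table, String startRow, String stopRow) {
        return table.subMap(startRow, true, stopRow, false);
    }

    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("row1", "a");
        table.put("row2", "b");
        table.put("row3", "c");
        table.put("row4", "d");
        return table;
    }

    public static void main(String[] args) {
        System.out.println(range(sampleTable(), "row2", "row4")); // {row2=b, row3=c}
    }
}
```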

7. RegexStringComparator

RegexStringComparator is a comparator that supports regular expressions.
Filters become very convenient when combined with comparators. See the code below.

HTable table = new HTable(cfg, "datainfo");
Scan scan = new Scan();
String reg = "^136([0-9]{8})$"; // matches phone numbers that start with 136
RowFilter filter = new RowFilter(CompareOp.EQUAL,
        new RegexStringComparator(reg));
scan.setFilter(filter);
ResultScanner rs = table.getScanner(scan);
for (Result rr : rs) {
    for (KeyValue kv : rr.raw()) {
        ...
    }
}
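
Before putting a pattern into a filter, it can be checked locally with java.util.regex. The sketch below tests the same expression against a few sample rowkeys; the class PhoneRegexCheck and the sample values are made up for illustration. Because the pattern is anchored with ^ and $, it only accepts rowkeys that consist of exactly 136 followed by eight digits.

```java
import java.util.regex.Pattern;

// Local check of the rowkey pattern; class name and sample rowkeys are hypothetical.
class PhoneRegexCheck {
    static final Pattern PHONE_136 = Pattern.compile("^136([0-9]{8})$");

    static boolean matches(String rowkey) {
        return PHONE_136.matcher(rowkey).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("13612345678"));      // true: 136 plus 8 digits
        System.out.println(matches("13712345678"));      // false: wrong prefix
        System.out.println(matches("136123456789"));     // false: one digit too many
        System.out.println(matches("13612345678_2018")); // false: trailing suffix breaks the $ anchor
    }
}
```

The last case matters for composite rowkeys: if the rowkey carries a suffix after the phone number, the trailing $ has to be dropped or the pattern extended.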

8. SubstringComparator

SubstringComparator checks whether a substring occurs in a value. The comparison is case-insensitive.

// check whether "values" occurs in the value of the queried column
SubstringComparator comp = new SubstringComparator("values");
SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("family"), Bytes.toBytes("column"), CompareOp.EQUAL, comp);
scan.setFilter(filter);
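
The matching rule can be modeled in a few lines of plain Java. This is only a sketch of case-insensitive substring matching under the assumption that both sides are lower-cased before comparing; SubstringModel is a hypothetical name and not part of HBase.

```java
import java.util.Locale;

// Hypothetical model of case-insensitive substring matching, not HBase code.
class SubstringModel {
    static boolean containsIgnoreCase(String value, String substr) {
        return value.toLowerCase(Locale.ROOT).contains(substr.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("My VALUES here", "values")); // true
        System.out.println(containsIgnoreCase("nothing to see", "values")); // false
    }
}
```

With a SubstringComparator, CompareOp.EQUAL therefore reads as "contains" and CompareOp.NOT_EQUAL as "does not contain".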

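Example:

Simple table creation and inserts.

Randomly generate 1000 call records, with caller numbers starting with 186 and callee numbers starting with 158.

The comments below are my own; if anything is wrong, please point it out and I will fix it.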

package cn.msk.HBase;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;
import java.io.InterruptedIOException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;


public class HBaseDemo {
    /**
     * HBaseAdmin
     * Class: org.apache.hadoop.hbase.client.HBaseAdmin
     * Purpose: provides an interface for managing the tables in an HBase database
     * Usage: HBaseAdmin admin = new HBaseAdmin(config);
     */

    // Manages tables (create, disable, delete)
    HBaseAdmin admin;
    // Reads and writes the data inside a table
    HTable hTable;
    // Table name (lower-case so it does not shadow the HBase class TableName)
    String tableName = "phone";

    @Before // JUnit annotation: runs before every test
    public void init() throws Exception {
        // HBaseConfiguration.create() also picks up hbase-site.xml from the classpath
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node02,node03,node04");
        admin = new HBaseAdmin(conf);
        hTable = new HTable(conf, tableName.getBytes());
    }

    @Test
    public void createTable() throws Exception {
        // Check whether the table already exists
        if (admin.tableExists(tableName)) {
            // If it does, disable it first
            admin.disableTable(tableName);
            // and then delete it
            admin.deleteTable(tableName);
        }


        /**
         * HTableDescriptor:
         * Class: org.apache.hadoop.hbase.HTableDescriptor
         * Purpose: HTableDescriptor holds the table name and the table's column families
         * Usage: HTableDescriptor htd = new HTableDescriptor(tablename);
         *        htd.addFamily(new HColumnDescriptor("myFamily"));
         */
        // Table descriptor
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf(tableName));

        /**
         * HColumnDescriptor
         * Class: org.apache.hadoop.hbase.HColumnDescriptor
         * Purpose: HColumnDescriptor holds the settings of a single column family
         * Usage: HTableDescriptor htd = new HTableDescriptor(tablename);
         *        htd.addFamily(new HColumnDescriptor("myFamily"));
         */
        HColumnDescriptor cf = new HColumnDescriptor("cf".getBytes());

        // The table and column family descriptors are ready; attach the family to the table
        desc.addFamily(cf);
        admin.createTable(desc);
    }


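    /**
     * The table has been created; now add data to it.
     * admin manages tables but not the data inside them, so an HTable is needed as well.
     * To add more data to this row later, use a different column name;
     * otherwise the earlier value is overwritten.
     * @throws Exception
     */
    @Test
    public void put() throws Exception {
        // Rowkey: the unique key of the row
        String rowKey = "1";
        Put put = new Put(rowKey.getBytes());
        // Arguments: column family, column name (qualifier), value
        put.add("cf".getBytes(), "name".getBytes(), "xiaoxue".getBytes());
        put.add("cf".getBytes(), "age".getBytes(), "20".getBytes());
        put.add("cf".getBytes(), "sex".getBytes(), "nv".getBytes());
        hTable.put(put);
    }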

    /**
     * Read data with Get.
     * @throws Exception
     */
    @Test
    public void get() throws Exception {
        // Rowkey of the row to read
        String rowKey = "1";
        // Build the Get from the rowkey and restrict it to three columns
        Get get = new Get(rowKey.getBytes());
        get.addColumn("cf".getBytes(), "name".getBytes());
        get.addColumn("cf".getBytes(), "age".getBytes());
        get.addColumn("cf".getBytes(), "sex".getBytes());
        // Execute the Get and receive the result
        Result result = hTable.get(get);
        // Pick the latest cell for each column family and column name
        Cell cell = result.getColumnLatestCell("cf".getBytes(), "name".getBytes());
        Cell cell1 = result.getColumnLatestCell("cf".getBytes(), "age".getBytes());
        Cell cell2 = result.getColumnLatestCell("cf".getBytes(), "sex".getBytes());
        // Convert the values to strings and print them
        System.out.println(new String(CellUtil.cloneValue(cell)));
        System.out.println(new String(CellUtil.cloneValue(cell1)));
        System.out.println(new String(CellUtil.cloneValue(cell2)));
    }


    /**
     * Second way to read data: Scan.
     * Lists the call records of one number between February 1 and March 1, 2018.
     * @throws Exception
     */
    @Test
    public void scan() throws Exception {
        String phoneNum = "18691483287";
        // Start row: March 1. Rowkeys use Long.MAX_VALUE - timestamp, so the later date gives the smaller key
        String startRow = phoneNum + "_" + (Long.MAX_VALUE - sdf.parse("20180301000000").getTime());
        // Stop row: February 1 (exclusive). It must use Long.MAX_VALUE as well, not Long.MIN_VALUE
        String stopRow = phoneNum + "_" + (Long.MAX_VALUE - sdf.parse("20180201000000").getTime());
        Scan scan = new Scan();
        scan.setStartRow(startRow.getBytes());
        scan.setStopRow(stopRow.getBytes());
        ResultScanner results = hTable.getScanner(scan);
        for (Result rs : results) {
            System.out.print(new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "dnum".getBytes()))));
            System.out.print("-" + new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "length".getBytes()))));
            System.out.print("-" + new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "type".getBytes()))));
            System.out.println("-" + new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "date".getBytes()))));
        }
    }


    // Example: generating call records

    SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");

    /**
     * 10 users, each with 100 randomly generated records (1000 records in total).
     *
     * @throws Exception
     */
    @Test
    public void insertDB2() throws Exception {
        List<Put> puts = new ArrayList<Put>();
        for (int i = 0; i < 10; i++) {
            // Own phone number: fixed 186 prefix plus 8 random digits
            String phoneNum = getPhoneNum("186");
            for (int j = 0; j < 100; j++) {
                // The other party's phone number
                String dnum = getPhoneNum("158");
                // Call duration (0-98)
                String length = r.nextInt(99) + "";
                // Call type: 0 or 1 (caller or callee)
                String type = r.nextInt(2) + "";
                // Date: the year is fixed
                String dateStr = getDate("2018");
                // Rowkey: phone number + "_" + (Long.MAX_VALUE - timestamp), so newer records sort first
                String rowkey = phoneNum + "_" + (Long.MAX_VALUE - sdf.parse(dateStr).getTime());
                Put put = new Put(rowkey.getBytes());
                put.add("cf".getBytes(), "dnum".getBytes(), dnum.getBytes());
                put.add("cf".getBytes(), "length".getBytes(), length.getBytes());
                put.add("cf".getBytes(), "type".getBytes(), type.getBytes());
                put.add("cf".getBytes(), "date".getBytes(), dateStr.getBytes());
                puts.add(put);
            }
        }
        // Write all records in one batch
        hTable.put(puts);
    }

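    // The year is fixed; month, day, hour, minute, and second are random.
    // The day is drawn from 1-31 regardless of the month, so strings such as 20180231... can occur;
    // the lenient SimpleDateFormat rolls them over into the next month when the rowkey is built.
    private String getDate(String year) {
        return year + String.format("%02d%02d%02d%02d%02d", new Object[] {
                // month 1-12, day 1-31, hour 0-23, minute 0-59, second 0-59
                r.nextInt(12) + 1, r.nextInt(31) + 1, r.nextInt(24), r.nextInt(60), r.nextInt(60) });
    }

    Random r = new Random();

    /**
     * Generates a random phone number.
     * @param string the fixed prefix, e.g. "186"
     * @return the prefix followed by 8 random digits
     */
    private String getPhoneNum(String string) {
        return string + String.format("%08d", r.nextInt(99999999));
    }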


    /**
     * Query all records of one phone number whose type is 1.
     * @throws Exception
     */
    @Test
    public void scan2() throws Exception {
        // MUST_PASS_ALL: a row must pass both filters
        FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        // PrefixFilter filters on the rowkey prefix
        PrefixFilter filter = new PrefixFilter("18691483287".getBytes());
        // Keep only rows whose cf:type column equals "1"
        SingleColumnValueFilter filter1 = new SingleColumnValueFilter("cf".getBytes(), "type".getBytes(), CompareFilter.CompareOp.EQUAL, "1".getBytes());
        list.addFilter(filter);
        list.addFilter(filter1);
        Scan scan = new Scan();
        scan.setFilter(list);
        ResultScanner results = hTable.getScanner(scan);
        for (Result rs : results) {
            System.out.print(new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "dnum".getBytes()))));
            System.out.print("-" + new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "length".getBytes()))));
            System.out.print("-" + new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "type".getBytes()))));
            System.out.println("-" + new String(CellUtil.cloneValue(rs.getColumnLatestCell("cf".getBytes(), "date".getBytes()))));
        }
    }

    // Release resources after every test
    @After
    public void destroy() throws IOException {
        if (hTable != null) {
            hTable.close();
        }
        if (admin != null) {
            admin.close();
        }
    }
}

Adding the call records
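
One detail of the generated data is easy to miss: getDate draws the day from 1 to 31 for every month, so date strings such as 20180231101500 can appear. The default SimpleDateFormat is lenient and silently rolls such a date forward, which means the timestamp in the rowkey and the text stored in the date column can disagree. The sketch below shows the effect; LenientDateDemo and its methods are hypothetical helpers, and the UTC time zone is fixed only to keep the output reproducible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical helper class, written for illustration only.
class LenientDateDemo {
    static SimpleDateFormat newFormat(boolean lenient) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
        sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone so the demo is reproducible
        sdf.setLenient(lenient);
        return sdf;
    }

    // Parses leniently (the SimpleDateFormat default) and formats the result back.
    static String normalize(String dateStr) {
        SimpleDateFormat sdf = newFormat(true);
        try {
            return sdf.format(sdf.parse(dateStr));
        } catch (ParseException e) {
            throw new IllegalArgumentException("unparseable date: " + dateStr, e);
        }
    }

    // Strict check: rejects dates that do not exist in the calendar.
    static boolean isValid(String dateStr) {
        try {
            newFormat(false).parse(dateStr);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("20180231101500")); // 20180303101500: February 31 rolls into March 3
        System.out.println(isValid("20180231101500"));   // false
        System.out.println(isValid("20180228101500"));   // true
    }
}
```

If that mismatch matters, one option is to regenerate any date for which isValid returns false before building the rowkey.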

Printing all call records for February, in sorted order
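
Why the scan range for February starts at March 1 and stops at February 1 follows from the rowkey scheme phone + "_" + (Long.MAX_VALUE - timestamp): a later date yields a smaller key. The plain-Java sketch below rebuilds those keys and checks a few dates against the half-open range [startRow, stopRow). RowKeyRangeDemo and its methods are hypothetical helpers; string comparison stands in for HBase's byte ordering, which works here because every key has the same length and is pure ASCII.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Hypothetical helper class, written for illustration only.
class RowKeyRangeDemo {
    // Same scheme as the example: phone + "_" + (Long.MAX_VALUE - epoch milliseconds).
    static String rowKey(String phoneNum, String dateStr) {
        try {
            long ts = new SimpleDateFormat("yyyyMMddHHmmss").parse(dateStr).getTime();
            return phoneNum + "_" + (Long.MAX_VALUE - ts);
        } catch (ParseException e) {
            throw new IllegalArgumentException("unparseable date: " + dateStr, e);
        }
    }

    // Half-open range check: startRow <= rowKey < stopRow.
    static boolean inRange(String rowKey, String startRow, String stopRow) {
        return rowKey.compareTo(startRow) >= 0 && rowKey.compareTo(stopRow) < 0;
    }

    public static void main(String[] args) {
        String phone = "18691483287";
        String startRow = rowKey(phone, "20180301000000"); // later date, smaller key
        String stopRow = rowKey(phone, "20180201000000");  // earlier date, larger key
        System.out.println(startRow.compareTo(stopRow) < 0);                             // true
        System.out.println(inRange(rowKey(phone, "20180215120000"), startRow, stopRow)); // true
        System.out.println(inRange(rowKey(phone, "20180115120000"), startRow, stopRow)); // false
        System.out.println(inRange(rowKey(phone, "20180201000000"), startRow, stopRow)); // false
    }
}
```

The last line shows a boundary effect: a call at exactly 2018-02-01 00:00:00 is excluded because the stop row is exclusive, while one at exactly 2018-03-01 00:00:00 is included.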

Printing all call records whose type is 1
