HBase in Practice: Retrieving Data with a Scanner


1. Java API overview
1.1 getScanner()

The getScanner method has three overloads:

  • getScanner(Scan scan)
  /**
   * Returns a scanner on the current table as specified by the {@link Scan}
   * object.
   *
   * Note that the passed {@link Scan}'s start row and caching properties
   * maybe changed.
   * (Author's note: it is unclear to me what this implies in practice.)
   *
   * @param scan A configured {@link Scan} object.
   * @return A scanner.
   * @throws IOException if a remote or network exception occurs.
   * @since 0.20.0
   */
  ResultScanner getScanner(Scan scan) throws IOException;
  • getScanner(byte[] family)
 /**
   * Gets a scanner on the current table for the given family.
   *
   * @param family The column family to scan.
   * @return A scanner.
   * @throws IOException if a remote or network exception occurs.
   * @since 0.20.0
   */
  ResultScanner getScanner(byte[] family) throws IOException;
  • getScanner(byte[] family, byte[] qualifier)
  /**
   * Gets a scanner on the current table for the given family and qualifier.
   *
   * @param family The column family to scan.
   * @param qualifier The column qualifier to scan.
   * @return A scanner.
   * @throws IOException if a remote or network exception occurs.
   * @since 0.20.0
   */
  ResultScanner getScanner(byte[] family, byte[] qualifier) throws IOException;
2. Hands-on code
2.1 Test each of the APIs above. Before testing, here is the data in the tsdb-uid table:
 \x00                                               column=id:metrics, timestamp=1541500656882, value=\x00\x00\x00\x00\x00\x00\x00\x05                                                                    
 \x00                                               column=id:tagk, timestamp=1535982247222, value=\x00\x00\x00\x00\x00\x00\x00\x03                                                                       
 \x00                                               column=id:tagv, timestamp=1541425665699, value=\x00\x00\x00\x00\x00\x00\x00\x08                                                                       
 \x00\x00\x01                                       column=name:metrics, timestamp=1531479245132, value=mytest.cpu                                                                                        
 \x00\x00\x01                                       column=name:tagk, timestamp=1531479245162, value=host                                                                                                 
 \x00\x00\x01                                       column=name:tagv, timestamp=1531479245189, value=server4                                                                                              
 \x00\x00\x02                                       column=name:metrics, timestamp=1535891521172, value=metric-t                                                                                          
 \x00\x00\x02                                       column=name:tagk, timestamp=1535891521198, value=chl                                                                                                  
 \x00\x00\x02                                       column=name:tagv, timestamp=1531479264404, value=server5                                                                                              
 \x00\x00\x03                                       column=name:metrics, timestamp=1535982247205, value=csdn                                                                                              
 \x00\x00\x03                                       column=name:tagk, timestamp=1535982247230, value=accessNumber                                                                                         
 \x00\x00\x03                                       column=name:tagv, timestamp=1531485413194, value=s485276                                                                                              
 \x00\x00\x04                                       column=name:metrics, timestamp=1541426336083, value=test                                                                                              
 \x00\x00\x04                                       column=name:tagv, timestamp=1535891521217, value=hqdApp                                                                                               
 \x00\x00\x05                                       column=name:metrics, timestamp=1541500656917, value=test_meta                                                                                         
 \x00\x00\x05                                       column=name:tagv, timestamp=1535982247253, value=cs                                                                                                   
 \x00\x00\x06                                       column=name:tagv, timestamp=1537103490275, value=Firminal                                                                                             
 \x00\x00\x07                                       column=name:tagv, timestamp=1541425665353, value=lawson                                                                                               
 \x00\x00\x08                                       column=name:tagv, timestamp=1541425665725, value=firminal                                                                                             
 Firminal                                           column=id:tagv, timestamp=1537103490289, value=\x00\x00\x06                                                                                           
 accessNumber                                       column=id:tagk, timestamp=1535982247235, value=\x00\x00\x03                                                                                           
 chl                                                column=id:tagk, timestamp=1535891521203, value=\x00\x00\x02                                                                                           
 cs                                                 column=id:tagv, timestamp=1535982247259, value=\x00\x00\x05                                                                                           
 csdn                                               column=id:metrics, timestamp=1535982247213, value=\x00\x00\x03                                                                                        
 firminal                                           column=id:tagv, timestamp=1541425665756, value=\x00\x00\x08                                                                                           
 host                                               column=id:tagk, timestamp=1531479245177, value=\x00\x00\x01                                                                                           
 hqdApp                                             column=id:tagv, timestamp=1535891521224, value=\x00\x00\x04                                                                                           
 lawson                                             column=id:tagv, timestamp=1541425665366, value=\x00\x00\x07                                                                                           
 metric-t                                           column=id:metrics, timestamp=1535891521182, value=\x00\x00\x02                                                                                        
 mytest.cpu                                         column=id:metrics, timestamp=1531479245145, value=\x00\x00\x01                                                                                        
 s485276                                            column=id:tagv, timestamp=1531485413204, value=\x00\x00\x03                                                                                           
 server4                                            column=id:tagv, timestamp=1531479245192, value=\x00\x00\x01                                                                                           
 server5                                            column=id:tagv, timestamp=1531479264407, value=\x00\x00\x02                                                                                           
 test                                               column=id:metrics, timestamp=1541426336086, value=\x00\x00\x04                                                                                        
 test_meta                                          column=id:metrics, timestamp=1541500656927, value=\x00\x00\x05                                                                                        
25 row(s) in 0.7650 seconds
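  • Using columnFamily as the parameter
// Assumes `connection` is a static, already-open org.apache.hadoop.hbase.client.Connection.
public static void getRowByScan(String tableName, String columnFamily) {
    // try-with-resources closes the Table and the ResultScanner when the scan is done
    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner = table.getScanner(Bytes.toBytes(columnFamily))) {
        for (Result res : resultScanner) {
            System.out.println(res); // one Result per row
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}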

The output (for the name column family) is as follows:

keyvalues={\x00\x00\x01/name:metrics/1531479245132/Put/vlen=10/seqid=0, \x00\x00\x01/name:tagk/1531479245162/Put/vlen=4/seqid=0, \x00\x00\x01/name:tagv/1531479245189/Put/vlen=7/seqid=0}
keyvalues={\x00\x00\x02/name:metrics/1535891521172/Put/vlen=8/seqid=0, \x00\x00\x02/name:tagk/1535891521198/Put/vlen=3/seqid=0, \x00\x00\x02/name:tagv/1531479264404/Put/vlen=7/seqid=0}
keyvalues={\x00\x00\x03/name:metrics/1535982247205/Put/vlen=4/seqid=0, \x00\x00\x03/name:tagk/1535982247230/Put/vlen=12/seqid=0, \x00\x00\x03/name:tagv/1531485413194/Put/vlen=7/seqid=0}
keyvalues={\x00\x00\x04/name:metrics/1541426336083/Put/vlen=4/seqid=0, \x00\x00\x04/name:tagv/1535891521217/Put/vlen=6/seqid=0}
keyvalues={\x00\x00\x05/name:metrics/1541500656917/Put/vlen=9/seqid=0, \x00\x00\x05/name:tagv/1535982247253/Put/vlen=2/seqid=0}
keyvalues={\x00\x00\x06/name:tagv/1537103490275/Put/vlen=8/seqid=0}
keyvalues={\x00\x00\x07/name:tagv/1541425665353/Put/vlen=6/seqid=0}
keyvalues={\x00\x00\x08/name:tagv/1541425665725/Put/vlen=8/seqid=0}

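As you can see, each res in the code is a Result that holds all the KeyValues of one row. Rows contain different numbers of cells, but the scanner returns one Result per row, so 8 rows are returned in total.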
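The grouping described above can be illustrated without a cluster. The following is a minimal sketch in plain Java, not HBase code: the class `RowGroupingDemo`, the method `groupByRow`, and the "rowKey/column" string encoding are all hypothetical, and the row keys are abbreviated to 001 through 008. It only models how 16 cells of the name family collapse into one result per row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of per-row grouping; it does not talk to HBase. */
final class RowGroupingDemo {

    /** Groups "rowKey/column" cell strings by row key, preserving the input order. */
    static Map<String, List<String>> groupByRow(List<String> cells) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        for (String cell : cells) {
            int slash = cell.indexOf('/');
            String rowKey = cell.substring(0, slash);
            String column = cell.substring(slash + 1);
            // All cells that share a row key end up in the same list
            rows.computeIfAbsent(rowKey, k -> new ArrayList<>()).add(column);
        }
        return rows;
    }

    public static void main(String[] args) {
        // The 16 cells of the name family from the table listing (row keys abbreviated)
        List<String> cells = Arrays.asList(
                "001/name:metrics", "001/name:tagk", "001/name:tagv",
                "002/name:metrics", "002/name:tagk", "002/name:tagv",
                "003/name:metrics", "003/name:tagk", "003/name:tagv",
                "004/name:metrics", "004/name:tagv",
                "005/name:metrics", "005/name:tagv",
                "006/name:tagv", "007/name:tagv", "008/name:tagv");
        Map<String, List<String>> rows = groupByRow(cells);
        System.out.println(rows.size() + " results from " + cells.size() + " cells");
        rows.forEach((rowKey, columns) -> System.out.println(rowKey + " -> " + columns));
    }
}
```

Each map entry plays the role of one Result, which is why the number of printed lines follows the number of distinct row keys rather than the number of cells.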

  • Using a Scan as the parameter
public static void getRowByScan(String tableName) {
    Scan scan = new Scan();
    // Start at row key "server4" (inclusive); on HBase 2.x, withStartRow replaces the deprecated setStartRow
    scan.setStartRow(Bytes.toBytes("server4"));

    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner = table.getScanner(scan)) {
        for (Result res : resultScanner) {
            System.out.println(res);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

The output is as follows:

keyvalues={server4/id:tagv/1531479245192/Put/vlen=3/seqid=0}
keyvalues={server5/id:tagv/1531479264407/Put/vlen=3/seqid=0}
keyvalues={test/id:metrics/1541426336086/Put/vlen=3/seqid=0}
keyvalues={test_meta/id:metrics/1541500656927/Put/vlen=3/seqid=0}
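Only four rows come back because HBase keeps rows sorted by the bytes of their row keys, and a scan with a start row returns that row and everything that sorts after it. The sketch below models this in plain Java. It is an illustration under stated assumptions, not HBase code: `StartRowDemo` and `rowsFrom` are hypothetical names, and a `TreeSet` with an unsigned byte comparator stands in for the table's sort order.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical in-memory model of a start-row scan; it does not talk to HBase. */
final class StartRowDemo {

    /** Returns all row keys >= startRow in unsigned lexicographic byte order. */
    static List<String> rowsFrom(List<String> rowKeys, String startRow) {
        // Arrays.compareUnsigned (Java 9+) compares byte arrays as unsigned values
        Comparator<byte[]> unsignedOrder = Arrays::compareUnsigned;
        TreeSet<byte[]> sorted = new TreeSet<>(unsignedOrder);
        for (String key : rowKeys) {
            sorted.add(key.getBytes(StandardCharsets.UTF_8));
        }
        List<String> result = new ArrayList<>();
        // tailSet(start, true) includes the start key itself, like a scan's start row
        for (byte[] key : sorted.tailSet(startRow.getBytes(StandardCharsets.UTF_8), true)) {
            result.add(new String(key, StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("test", "host", "server5", "lawson",
                "test_meta", "server4", "s485276", "mytest.cpu");
        System.out.println(rowsFrom(keys, "server4"));
    }
}
```

Note that s485276 is excluded even though it starts with the same letter: its second byte, '4', sorts before 'e'.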
  • Using columnFamily and qualifier as the parameters
public static void getRowByScanThree(String tableName, String family, String qualifier) {
    // Bytes.toBytes encodes as UTF-8, unlike String.getBytes(), which uses the platform default charset
    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner = table.getScanner(Bytes.toBytes(family), Bytes.toBytes(qualifier))) {
        for (Result res : resultScanner) {
            System.out.println(res); // only the family:qualifier column of each row
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

The output is as follows:

keyvalues={\x00\x00\x01/name:metrics/1531479245132/Put/vlen=10/seqid=0}
keyvalues={\x00\x00\x02/name:metrics/1535891521172/Put/vlen=8/seqid=0}
keyvalues={\x00\x00\x03/name:metrics/1535982247205/Put/vlen=4/seqid=0}
keyvalues={\x00\x00\x04/name:metrics/1541426336083/Put/vlen=4/seqid=0}
keyvalues={\x00\x00\x05/name:metrics/1541500656917/Put/vlen=9/seqid=0}
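2.2 Printing the values of a KeyValue

The output above prints each row as one Result holding its KeyValues. How can the individual fields of a KeyValue be extracted, for example the row key, the value, and the timestamp? The code is as follows: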

// Needs org.apache.hadoop.hbase.Cell and org.apache.hadoop.hbase.CellUtil in addition to the client classes.
public static void getRowValue(String tableName, String family, String qualifier) {
    try (Table table = connection.getTable(TableName.valueOf(tableName));
         ResultScanner resultScanner = table.getScanner(Bytes.toBytes(family), Bytes.toBytes(qualifier))) {
        for (Result res : resultScanner) {
            // rawCells() and CellUtil replace the deprecated res.raw(), kv.getRow() and kv.getValue()
            for (Cell cell : res.rawCells()) {
                byte[] row = CellUtil.cloneRow(cell);
                System.out.print("rowKey: ");
                // The row key is raw binary, so print each byte as a decimal number
                for (int i = 0; i < row.length; i++) {
                    System.out.print(row[i]);
                }
                System.out.println(", value: " + Bytes.toString(CellUtil.cloneValue(cell))
                        + ", timestamp: " + cell.getTimestamp());
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

The output is as follows:

rowKey: 001, value: mytest.cpu, timestamp: 1531479245132
rowKey: 002, value: metric-t, timestamp: 1535891521172
rowKey: 003, value: csdn, timestamp: 1535982247205
rowKey: 004, value: test, timestamp: 1541426336083
rowKey: 005, value: test_meta, timestamp: 1541500656917

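Because the row keys scanned here in the tsdb-uid table are raw binary byte arrays (for example \x00\x00\x01) rather than encoded text, they cannot be converted directly into a readable String, so the code above prints the row key byte by byte in a for() loop.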
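An alternative to printing each byte as a decimal number is to render the key the way the HBase shell listing does, with non-printable bytes escaped as \xNN. HBase itself provides Bytes.toStringBinary(byte[]) for this purpose. The plain-Java sketch below only approximates that behavior so that it runs without HBase on the classpath: `RowKeyFormat` and `toEscapedString` are hypothetical names, and the exact set of characters left unescaped is an assumption that may differ from HBase's own rules.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of HBase): renders binary row keys in a shell-like escaped form. */
final class RowKeyFormat {

    /** Printable ASCII bytes are kept as characters; every other byte becomes \xNN. */
    static String toEscapedString(byte[] rowKey) {
        StringBuilder sb = new StringBuilder();
        for (byte b : rowKey) {
            int unsigned = b & 0xFF; // Java bytes are signed; mask to the range 0..255
            if (unsigned >= 0x20 && unsigned < 0x7F && unsigned != '\\') {
                sb.append((char) unsigned);
            } else {
                sb.append(String.format("\\x%02X", unsigned));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A binary UID key and a textual key, as both kinds appear in tsdb-uid
        System.out.println(toEscapedString(new byte[] {0, 0, 1}));
        System.out.println(toEscapedString("server4".getBytes(StandardCharsets.UTF_8)));
    }
}
```

With this form, both the binary keys and the textual keys in the table print in one consistent, readable format.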
