HBase Filters Explained, with Some Test Code Examples


xuguokun1986

Article link: https://blog.csdn.net/xuguokun1986/article/details/50629161


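I. Filters (Filter)

The query operations in the basic API fall short when facing large volumes of data, so HBase provides a more advanced query mechanism: Filter. Filters can select data by column family, column, version, and other conditions. Because HBase keeps data sorted in three dimensions (by row key, by column, and by version), filters can do this work efficiently. An RPC query that carries a filter ships the filter to every RegionServer, so the filtering happens on the server side, which also reduces the load on the network.

A filter operation needs at least two arguments. One is an abstract comparison operator; HBase provides an enum for these: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, and so on. The other is a concrete comparator (Comparator) that carries the actual comparison logic, for example byte-level comparison or string-level comparison. With these two arguments the filter condition can be defined precisely.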

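    Java code
    
 CompareFilter(CompareOp compareOp, WritableByteArrayComparable valueComparator)

CompareFilter is the high-level abstract class. Below we look at its subclasses and the filter conditions they represent. Each subclass decides what the operator and comparator are applied to: the row key, the column family name, the column value, and so on.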
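To make the "operator plus comparator" idea concrete, here is a minimal plain-Java sketch. It does not use the HBase classes; `CompareSketch`, `Op`, `compareBytes`, and `matches` are hypothetical names introduced only for illustration, and the byte-wise ordering is assumed to be lexicographic over unsigned bytes.

```java
import java.nio.charset.StandardCharsets;

// Plain-Java model of "operator + comparator"; these are not the HBase classes.
class CompareSketch {
    // Mirrors the operator names mentioned in the text.
    enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER }

    // Byte-level comparison: lexicographic, each byte treated as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    // True when "candidate OP reference" holds, e.g. candidate <= reference.
    static boolean matches(Op op, byte[] candidate, byte[] reference) {
        int c = compareBytes(candidate, reference);
        switch (op) {
            case LESS:             return c < 0;
            case LESS_OR_EQUAL:    return c <= 0;
            case EQUAL:            return c == 0;
            case NOT_EQUAL:        return c != 0;
            case GREATER_OR_EQUAL: return c >= 0;
            default:               return c > 0; // GREATER
        }
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ref = utf8("hello");
        System.out.println(matches(Op.LESS_OR_EQUAL, utf8("hello"), ref)); // true
        System.out.println(matches(Op.LESS_OR_EQUAL, utf8("Hello"), ref)); // true: 'H' (0x48) sorts before 'h' (0x68)
        System.out.println(matches(Op.LESS_OR_EQUAL, utf8("help"), ref));  // false: 'p' sorts after 'l'
    }
}
```

One consequence worth noting: with byte-wise ordering, "row-3" sorts after "row-23", so numeric row keys usually need zero padding to sort as numbers.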

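Row filter (RowFilter)

A row filter compares against the row key.

    Java code
    
  
 Scan scan = new Scan();  
 Filter filter1 = new RowFilter(CompareFilter.CompareOp.LESS_OR_EQUAL, new BinaryComparator(Bytes.toBytes("hello")));  
 scan.setFilter(filter1);  
 // Scan has no close() method; pass the scan to table.getScanner(scan) and close the returned ResultScanner instead.

The filter in this example selects every row whose key is less than or equal to "hello".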

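Family filter (FamilyFilter)

A family filter compares against the column family name.

Qualifier filter (QualifierFilter)

A qualifier filter compares against the column name (qualifier).

Value filter (ValueFilter)

A value filter compares against the values of the cells being scanned.

Single-column value filter (SingleColumnValueFilter)

A single-column value filter tests the value of one specific column, whereas a value filter compares every column in the row. So when using it you have to give the coordinates (family and qualifier) of the column to compare.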

    Java code
    
 SingleColumnValueFilter(byte[] family, byte[] qualifier, CompareOp compareOp, WritableByteArrayComparable comparator)

Rows that do not contain the column can be handled specially:

    Java code
    
 void setFilterIfMissing(boolean filterIfMissing)

By default, rows that lack the column are included in the result set.
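The effect of the missing-column flag can be sketched in plain Java. This is only a model of the semantics described above, not HBase code: rows are represented as maps from column name to value, and `MissingColumnSketch` and `select` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the filterIfMissing semantics; a row is a map from column name to value.
class MissingColumnSketch {
    // Keeps rows whose `column` equals `expected`. A row without the column is
    // kept when filterIfMissing is false (the default) and dropped when it is true.
    static List<String> select(Map<String, Map<String, String>> rows,
                               String column, String expected, boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            String value = row.getValue().get(column);
            boolean keep = (value == null) ? !filterIfMissing : value.equals(expected);
            if (keep) kept.add(row.getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        rows.put("row-1", Collections.singletonMap("student:age", "20"));
        rows.put("row-2", Collections.singletonMap("student:age", "21"));
        rows.put("row-3", Collections.singletonMap("student:name", "li")); // no age column
        System.out.println(select(rows, "student:age", "20", false)); // [row-1, row-3]
        System.out.println(select(rows, "student:age", "20", true));  // [row-1]
    }
}
```

If rows without the column should not appear in the results, the flag has to be set to true explicitly.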

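Prefix filter (PrefixFilter)

A prefix filter drops every record that does not match; what it matches against is the row key.

    Java code
    
 PrefixFilter(byte[] prefix)

Page filter (PageFilter)

A page filter returns a fixed number of rows in row-key order. The client has to remember where the next page starts while iterating, and use that position as the scan's start key.

    Java code
    
 PageFilter(int size)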
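The client-side bookkeeping for paging can be sketched as follows. This is a plain-Java model under stated assumptions: a `TreeMap` stands in for the sorted table, and `PagingSketch` and `page` are hypothetical names. The trick of appending a zero character to the last key seen, so that the next page starts strictly after it, is one common approach and not something the text above prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Plain-Java model of paging over sorted row keys; a TreeMap stands in for the table.
class PagingSketch {
    // Returns up to pageSize row keys that are >= startKey, in sorted order.
    static List<String> page(TreeMap<String, String> table, String startKey, int pageSize) {
        List<String> keys = new ArrayList<>();
        for (String key : table.tailMap(startKey, true).keySet()) {
            if (keys.size() == pageSize) break;
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 40; i < 50; i++) table.put("row-" + i, "value" + i);

        String startKey = ""; // empty start key: begin at the first row
        while (true) {
            List<String> keys = page(table, startKey, 4);
            if (keys.isEmpty()) break;
            System.out.println(keys);
            // Smallest key strictly greater than the last one seen, so the
            // last row of this page is not repeated on the next page.
            startKey = keys.get(keys.size() - 1) + "\u0000";
        }
    }
}
```

With ten rows and a page size of four, the loop prints pages of four, four, and two keys.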

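Key-only filter (KeyOnlyFilter)

A key-only filter makes the result set contain only the keys and drop the values. There is an option to store the length of each value in place of the value itself.

FirstKeyOnlyFilter

Building on the key-only idea and relying on the column ordering, this filter returns only the first matching key of each row.

ColumnPrefixFilter

This filter matches on the column name (qualifier): only columns whose name starts with the given prefix are kept.

TimestampsFilter

    Java code
    
 TimestampsFilter(List<Long> times)

The argument is a list of timestamps; only versions whose timestamp is in the list are included in the result set.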

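Wrapper filters: these filters only make sense when they wrap another filter, and they enhance the filter they wrap.

SkipFilter

    Java code
    
 SkipFilter(Filter filter)

Filter list (FilterList)

HBase's filter design follows the Composite design pattern: all of the filters above can be stacked together to act on a single query.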
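The composite idea can be shown with a small plain-Java sketch in which a list of filters is itself a filter and can be nested. `FilterListSketch` and `filterList` are hypothetical names; the operator names `MUST_PASS_ALL` and `MUST_PASS_ONE` are borrowed from the test code at the end of this post. The sketch assumes a non-empty list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of the composite idea: a list of filters is itself a filter.
class FilterListSketch {
    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    // Combines filters into one predicate; the result can be nested in another list.
    static <T> Predicate<T> filterList(Operator op, List<Predicate<T>> filters) {
        return item -> {
            for (Predicate<T> f : filters) {
                boolean passed = f.test(item);
                if (op == Operator.MUST_PASS_ALL && !passed) return false;
                if (op == Operator.MUST_PASS_ONE && passed) return true;
            }
            // Reached only if every filter passed (ALL) or none passed (ONE).
            return op == Operator.MUST_PASS_ALL;
        };
    }

    public static void main(String[] args) {
        Predicate<String> hasPrefix = key -> key.startsWith("row-2");
        Predicate<String> atMost = key -> key.compareTo("row-23") <= 0;
        Predicate<String> isRow40 = key -> key.equals("row-40");

        Predicate<String> both = filterList(Operator.MUST_PASS_ALL, Arrays.asList(hasPrefix, atMost));
        Predicate<String> root = filterList(Operator.MUST_PASS_ONE, Arrays.asList(both, isRow40));

        for (String key : Arrays.asList("row-21", "row-25", "row-40")) {
            System.out.println(key + " -> " + root.test(key)); // true, false, true
        }
    }
}
```

Nesting an "all" list inside a "one" list gives the equivalent of `(A AND B) OR C`, which is the same shape the scanAdvance test below builds.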

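II. Counters (Counter)

HBase provides counters for fast, convenient counting without explicit locking to guarantee atomicity. Underneath, though, a counter is still just a column, with its own family and column name. Note that counter values are best maintained through the API HBase provides; updating them directly can easily corrupt the data.

The increment can be positive or negative: a positive number adds, a negative number subtracts.

    Java code
    
  
 long incrementColumnValue(byte[] row, byte[] family, byte[] qualifier, long amount)  
 Result increment(Increment increment)  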
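The contract of a counter, "add an amount atomically and get the new value back", can be modelled in plain Java. This is only an analogy and not how HBase implements counters: `CounterSketch` and its `increment` method are hypothetical, and a `ConcurrentHashMap` stands in for the table.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Plain-Java model of a counter cell: one atomic "add amount, return new value" step.
class CounterSketch {
    private final ConcurrentMap<String, Long> cells = new ConcurrentHashMap<>();

    // Adds amount (which may be negative) to the counter and returns the new value.
    // merge() performs the read-modify-write atomically for the given key.
    long increment(String row, String column, long amount) {
        return cells.merge(row + "/" + column, amount, Long::sum);
    }

    public static void main(String[] args) throws InterruptedException {
        CounterSketch counters = new CounterSketch();
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) counters.increment("page-1", "daily:hits", 1);
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        // Two threads added 1000 each; subtracting 500 leaves 1500.
        System.out.println(counters.increment("page-1", "daily:hits", -500)); // prints 1500
    }
}
```

The caller never reads the value, adds to it, and writes it back in separate steps, which is exactly the pattern that would otherwise need a lock.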

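III. Coprocessors (Coprocessor)

The idea of coprocessors is to ship complex processing code to every RegionServer so that most of the computation happens on the server side, or during the scan itself, which improves efficiency. In form they resemble stored procedures in an RDBMS. The difference is that stored procedures work through pre-processing and similar optimizations on the server, while coprocessors are simply server-side processing, which makes them somewhat like the Map phase of MapReduce.

There are two kinds of coprocessor: Observer and Endpoint (a term I honestly could not find a good translation for, so it stays as-is).

Every coprocessor has a priority, either USER or SYSTEM. The priority decides execution order: SYSTEM-level coprocessors always run before USER-level ones.

Every coprocessor has its own execution environment (CoprocessorEnvironment), which holds information such as the state of the cluster and of the current request. It is an important part of the processing and is handed to the coprocessor as a parameter when it is started.

Then there is CoprocessorHost, the class HBase uses to manage coprocessors; it maintains all the coprocessors and their environments.

The abstraction is shown in a figure in the original post (image not preserved).

Coprocessors can be loaded in two ways. One is through the configuration file, which specifies the load path, class name, and so on; coprocessors loaded this way are all SYSTEM level and apply to every request and every table. The other is to declare them on the table when the table is created; this way can create global SYSTEM-level coprocessors as well as USER-level ones, and USER-level coprocessors apply to that table only.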

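    Java code
    
  
 Path path = new Path("test.jar");  
 HTableDescriptor htd = new HTableDescriptor("test");  
 htd.addFamily(new HColumnDescriptor("family1"));  
 // className holds the fully qualified name of the coprocessor class
 htd.setValue("Coprocessor$1", path.toString() + "|" + className + "|" + Coprocessor.Priority.USER);  
 HBaseAdmin admin = new HBaseAdmin(conf);  
 admin.createTable(htd);  

The setValue method takes two arguments here. The first is the name of the coprocessor attribute; the number after the $ is a sequence number that affects execution order. The second is <path>|<classname>|<priority>.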
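The key and value format described above is plain string handling, so it can be sketched without any HBase classes. `CoprocessorSpecSketch`, `key`, `value`, and `parse` are hypothetical helpers, and `com.example.MyObserver` is a made-up class name used only for the demonstration.

```java
// Plain-Java helper for the "<path>|<classname>|<priority>" value described above.
class CoprocessorSpecSketch {
    // Attribute key: "Coprocessor$" followed by the sequence number.
    static String key(int sequence) {
        return "Coprocessor$" + sequence;
    }

    // Joins the three parts with '|' in the documented order.
    static String value(String jarPath, String className, String priority) {
        return jarPath + "|" + className + "|" + priority;
    }

    // Splits a value back into its three parts; '|' must be escaped in the regex.
    static String[] parse(String value) {
        String[] parts = value.split("\\|");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected <path>|<classname>|<priority>: " + value);
        }
        return parts;
    }

    public static void main(String[] args) {
        String v = value("test.jar", "com.example.MyObserver", "USER"); // hypothetical class name
        System.out.println(key(1) + " = " + v); // Coprocessor$1 = test.jar|com.example.MyObserver|USER
        System.out.println(parse(v)[1]);        // com.example.MyObserver
    }
}
```

Keeping the format in one helper avoids mistakes such as forgetting a separator when several coprocessors are declared on the same table.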

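Observer

This is the first kind of coprocessor, the observer. There are three kinds of observer: RegionObserver, MasterObserver, and WALObserver.

RegionObserver watches operations on the RegionServer, such as Get and Put. A region goes through a lifecycle: pending open, open, pending close.

An observer can hook into each stage of the lifecycle and act on it.

Every observer method takes a context parameter (Context), through which the lifecycle of the request can be controlled directly.

    Java code
    
  
 void bypass();  
 void complete();  

MasterObserver watches operations on the Master server. These resemble DDL operations in an RDBMS, such as table operations and column operations.

The details are similar to RegionObserver.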

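Endpoint

This is the second kind of coprocessor. An Endpoint is like a stored procedure shipped to every RegionServer: a method that the client can call remotely. Endpoints make server-side computation possible, such as server-side aggregation and summation, which fills a gap in the query API. The advantage of computing on the server is clear: less data crosses the network and server resources are put to good use.

As its function suggests, an Endpoint is a module built on RPC calls, so implementing one means defining our own communication protocol. In HBase the protocol is abstracted as the CoprocessorProtocol interface. To implement our protocol, we create a protocol interface that extends CoprocessorProtocol and then write the class that implements it.

    Java code
    
  
 public interface MyProtocol extends CoprocessorProtocol {  
     public int work();  
 }  

The protocol implementation is itself a coprocessor, so it also has to extend the BaseEndpointCoprocessor class.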

    Java code
    
  
 public class MyEndpoint extends BaseEndpointCoprocessor implements MyProtocol {  
     public int work() {  
         System.out.println("hello");  
         return 0; // work() is declared to return int, so it must return a value
     }  
 }  

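The abstract parent class BaseEndpointCoprocessor also provides some useful methods; for example, we can get hold of the corresponding environment object.

    Java code
    
 RegionCoprocessorEnvironment getEnvironment()

Once the Endpoint is configured and the cluster has been restarted, our implementation class is distributed to every RegionServer, and the Endpoint can be invoked through a method on the HTable instance.

    Java code
    
 <T extends CoprocessorProtocol, R> Map<byte[], R> coprocessorExec(Class<T> protocol, byte[] startKey, byte[] endKey, Batch.Call<T, R> callable);

startKey and endKey determine which RegionServers will execute the Endpoint, and the inner classes of Batch determine which protocol method gets called.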
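The shape of an Endpoint call, one partial result per region that the client then merges, can be modelled in plain Java. This sketch does not use the HBase API: `EndpointSketch`, `exec`, and `sum` are hypothetical names, each "region" is just a list of numbers, and `exec` merely imitates the per-region result map that `coprocessorExec` returns.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of an Endpoint call: each "region" computes a partial result
// next to its own data, and the client only merges the small per-region results.
class EndpointSketch {
    // Stand-in for coprocessorExec: runs `call` against every region and
    // returns one result per region, keyed by region name.
    static <R> Map<String, R> exec(Map<String, List<Long>> regions, Function<List<Long>, R> call) {
        Map<String, R> results = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> region : regions.entrySet()) {
            results.put(region.getKey(), call.apply(region.getValue()));
        }
        return results;
    }

    // The "server-side" work: sum the values stored in one region.
    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) total += v;
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> regions = new LinkedHashMap<>();
        regions.put("region-a", Arrays.asList(1L, 2L, 3L));
        regions.put("region-b", Arrays.asList(10L, 20L));

        Map<String, Long> partial = exec(regions, EndpointSketch::sum);
        long total = 0;
        for (long p : partial.values()) total += p; // client-side merge
        System.out.println(partial + " total=" + total); // {region-a=6, region-b=30} total=36
    }
}
```

Only two numbers travel back to the client here instead of five, which is the network saving the text describes, scaled down.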

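IV. The HTablePool connection pool

In HBase, creating an HTable instance to represent a table is slow and resource-hungry. As with database access, we need a connection pool, and so there is a class that represents one: HTablePool.

    Java code
    
  
 HTablePool(Configuration conf, int maxSize)  
 HTablePool(Configuration conf, int maxSize, HTableInterfaceFactory factory)  

Creating an HTablePool requires a Configuration instance, and the maximum number of connections is also set in the constructor. If you want to control how HTable instances are created, you need to supply your own factory. The meaning of maxSize is the maximum number of connections the pool manages. When more connections are needed than that, extra ones are created on the spot to meet demand, but they are released and discarded after use rather than going into the pool. So maxSize is the maximum number of connections kept in the pool, not the maximum number of connections that can be in use through the pool.

    Java code
    
  
 HTableInterface getTable(String tableName)  
 HTableInterface getTable(byte[] tableName)  
 void putTable(HTableInterface table)  

Note that after using a connection you have to call putTable yourself to return it to the pool.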
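The maxSize rule is easy to misread, so here is a minimal plain-Java model of it. It is a sketch of the behaviour described above and not the HTablePool implementation; `PoolSketch`, `get`, `put`, and `idleCount` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Plain-Java model of the maxSize rule: the pool never refuses a request,
// it only limits how many idle instances it keeps.
class PoolSketch<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final int maxSize;
    private final Supplier<T> factory;

    PoolSketch(int maxSize, Supplier<T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
    }

    // Reuses an idle instance if there is one, otherwise creates a new one.
    synchronized T get() {
        T pooled = idle.poll();
        return pooled != null ? pooled : factory.get();
    }

    // Returns true if the instance went back into the pool, false if it was discarded.
    synchronized boolean put(T item) {
        if (idle.size() >= maxSize) return false;
        idle.push(item);
        return true;
    }

    synchronized int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        PoolSketch<Object> pool = new PoolSketch<>(2, Object::new);
        Object a = pool.get();
        Object b = pool.get();
        Object c = pool.get(); // three in use, more than maxSize
        System.out.println(pool.put(a)); // true
        System.out.println(pool.put(b)); // true
        System.out.println(pool.put(c)); // false: pool already holds maxSize idle instances
        System.out.println(pool.idleCount()); // 2
    }
}
```

The design choice to create extra instances on demand means callers are never blocked, at the cost of paying the creation price again whenever demand exceeds maxSize.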

Below are some test code examples for the filters described above:

package com.test.junit;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Row;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.BinaryPrefixComparator;
import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.filter.PageFilter;
import org.apache.hadoop.hbase.filter.ColumnCountGetFilter;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FilterList.Operator;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.filter.ValueFilter;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Test;

public class JunitHbaseTest {

	private static org.apache.hadoop.conf.Configuration conf = null;
	private static HConnection connection = null;
    private static HBaseAdmin admin = null;
    private static final int MAX_TABLE_COUNT = 10;
    
	static {
       
		Configuration HBASE_CONFIG = new Configuration();
	    HBASE_CONFIG.set("hbase.master", "192.168.174.129:60000");
	    HBASE_CONFIG.set("hbase.zookeeper.quorum", "192.168.174.129,192.168.174.130,192.168.174.131");
	    HBASE_CONFIG.set("hbase.zookeeper.property.clientPort", "2181");
	    conf = HBaseConfiguration.create(HBASE_CONFIG);  
	    
	    try {
			
	    	// Use the merged HBase configuration (conf), not the bare Hadoop Configuration
	    	connection = HConnectionManager.createConnection(conf);
		
	    } catch (IOException e1) {
			e1.printStackTrace();
		}
	    
        try {
			
        	admin = new HBaseAdmin(conf);
        	
		} catch (IOException e) {

			e.printStackTrace();
		}
    }
	
	 @Test 
	 public void putTest() throws IOException, InterruptedException{
	      
		 HTableInterface  table = connection.getTable("myTest");
		
	     table.setAutoFlush(false);// turn off auto-flush so puts are buffered on the client
	   
	     List<Put> puts = new ArrayList<Put>(10);

	     for (int i = 40,len=50; i < len; i++) {
	           
	    	 Put put = new Put(Bytes.toBytes("row-"+i),new Date().getTime());	            
	    	 put.add(Bytes.toBytes("student"), Bytes.toBytes("name1"), Bytes.toBytes("value"+i));	  
	    	 put.add(Bytes.toBytes("student"), Bytes.toBytes("age"), Bytes.toBytes(i+""));	  
	    	 put.add(Bytes.toBytes("student"), Bytes.toBytes("grade"), Bytes.toBytes("grade"+i));	  

	         puts.add(put);
	           
	         put.heapSize();	            
	         put.size();

	         // the methods below are self-explanatory from their names
	         put.isEmpty();
	         put.getRow();	     
	         put.numFamilies();  
	     }
        
	     table.put(puts);
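	     table.flushCommits();// flush manually; otherwise buffered puts are only sent once their total size exceeds writeBufferSize
	     //table.setWriteBufferSize(1024*1024*5);// without a call to table.flushCommits(), puts are sent automatically only when the buffer reaches this size	     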
	     admin.flush("myTest");

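	     // Batch approach 1: use batch(), which can mix operation types (Put, Delete and Get all implement the Row interface)
	     // Note: Puts sent through batch() do not use the client-side write buffer; they are sent straight to the server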
	     
		 /*List<Row> rows = new ArrayList<Row>(10);	     
	     for (int i = 20,len=30; i < len; i++) {
	        
	    	 Put put = new Put(Bytes.toBytes(("row-"+i)));
	         put.add(Bytes.toBytes("data"), Bytes.toBytes("name"), Bytes.toBytes(("value"+i)));
	         put.add(Bytes.toBytes("data"), Bytes.toBytes("email"), Bytes.toBytes(("value"+i+"@sina.com")));
	         rows.add(put);
	     }
	        
	     rows.add(new Delete(Bytes.toBytes("row-9")));
	     table.batch(rows);	 */
	     
	     table.close();	     
	     
	 }

	@Test
	public void getAllRecords(){
		  
		String tableName = "myTest";
			
		try{
	            
			HTable table = new HTable(conf, tableName);
	        Scan s = new Scan();
	            
	        ResultScanner ss = table.getScanner(s);
	            
	        for(Result r:ss){
	                
	           for(KeyValue kv : r.raw()){
	                    
	              //System.out.println(new String(kv.getRow()));//rowkey
	              //System.out.println(new String(kv.getFamily()));//family
		          System.out.println(new String(kv.getQualifier()));//qualifier
		          
		          /*if((new String(kv.getQualifier())).equals("teacher")){
		        	  
		        	  System.out.println(new String(kv.getValue()));
		          }*/
	           }
	       } 
	            
		} catch (IOException e){
	            				
			e.printStackTrace();
	    }		
	}
	
	/**
	* Delete works the same way as Put: change every Put to Delete and table.put to table.delete.
	* There are a few things to watch out for, though; see below.
	* @throws IOException 
	*/	   
	@Test public void deleteTest() throws IOException{
	    	HTableInterface table = connection.getTable("myTable");
	        try {
	            // If you remember the KeyValue structure, the constructors of Delete make this clear:
	            // when nothing more specific is given, all versions are deleted
	            Delete delete = new Delete(Bytes.toBytes("row-1"));
	            table.delete(delete);
	        } catch (Exception e) {
	            e.printStackTrace();
	        }
	    }
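	    /**
	     * Some atomic operations. Readers familiar with the Java concurrency utilities will know that
	     * lightweight locking is built on CAS (compare-and-swap); the concept here is similar.
	     * It also resembles a SQL "select, then insert or update" sequence, except that HBase
	     * guarantees the check and the write happen as a single atomic operation.
	     * This is a good choice for updating values under high concurrency.
	     * table.checkAndPut(row, family, qualifier, value, put)
	     * table.checkAndDelete(row, family, qualifier, value, delete)
	     * @throws IOException
	     */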
	    @Test public void atomicOP() throws IOException{
	        byte[] row = Bytes.toBytes("row-12");
	        byte[] family = Bytes.toBytes("data");

	        HTableInterface table = connection.getTable("myTable");
	        // Returns true on success, false otherwise. For a qualifier that does not exist yet, passing null as the value makes the check succeed
	        Put put = new Put(row);
	        put.add(family, Bytes.toBytes("namex"), Bytes.toBytes("value12"));
	        // the check and the put target the same row
	        boolean result1 = table.checkAndPut(row, family, Bytes.toBytes("namex"), null, put);  //true
	        boolean result2 = table.checkAndPut(row, family, Bytes.toBytes("namex"), null, put);   //false

	        Put put2 = new Put(row);
	        put2.add(family, Bytes.toBytes("namex"), Bytes.toBytes("value12"));
	        boolean result3 = table.checkAndPut(row, family, Bytes.toBytes("namex"),
	                Bytes.toBytes("value12"), put2);  //true

	        Put put3 = new Put(Bytes.toBytes("row-13"));
	        put3.add(family, Bytes.toBytes("namex"), Bytes.toBytes("value13"));
	        boolean result4 = table.checkAndPut(row, family, Bytes.toBytes("namex2"),
	                Bytes.toBytes("value12"), put3);  //org.apache.hadoop.hbase.DoNotRetryIOException
	        // Note: the check and the put must target the same row, otherwise an exception is thrown

//	      table.checkAndDelete works in a similar way
	    }

	 
	  
	@Test
	public void testAddTable() throws MasterNotRunningException, ZooKeeperConnectionException, IOException{		
		
	      String tableName = "myTest";
	      String[] familys = {"grade", "course"};
	      HBaseAdmin admin = new HBaseAdmin(conf);
	      if (admin.tableExists("myTest")) {
	            
	    	  System.out.println("table already exists!");
	       
	      } else {
	          
	    	  HTableDescriptor tableDesc = new HTableDescriptor(tableName);
	           
	    	  for(int i=0; i<familys.length; i++){
	              
	            	tableDesc.addFamily(new HColumnDescriptor(familys[i]));	           
	    	  }
	            admin.createTable(tableDesc);
	            System.out.println("create table " + tableName + " ok.");	       
	      } 	      
	}
	
	@Test
	public void testAddRecord() throws MasterNotRunningException, ZooKeeperConnectionException, IOException{
		
		 
	      String rowKey = "20160201113312";
	      String tableName = "myTest";
	      String gradeFamily = "grade";
	      String qualifier = "middle";
	      String value = "2";
	      
	      try {
	           
	    	  HTable table = new HTable(conf, tableName);
	          Put put = new Put(Bytes.toBytes(rowKey));// convert the string to a byte array
	          put.add(Bytes.toBytes(gradeFamily),Bytes.toBytes(qualifier),Bytes.toBytes(value));
	          table.put(put);
	          System.out.println("insert record " + rowKey + " to table " + tableName +" ok.");
	            
	       } catch (IOException e) {
	         
	    	   e.printStackTrace();	       
	       }	      
	}
	
	@Test
	public void testGetOneRecord() throws IOException{
		
	    String rowKey = "20160201113312";
	    String tableName = "myTest";
	    String gradeFamily = "grade";
	    String qualifier = "middle";
	    String value = "2";
	      
		HTable table = new HTable(conf, tableName);
	   
		Get get = new Get(rowKey.getBytes());// a Get normally fetches a single row
	    Result rs = table.get(get);
	   
	    for(KeyValue kv : rs.raw()){
	        
	    	System.out.print(new String(kv.getRow()) + " " );
	        System.out.print(new String(kv.getFamily()) + ":" );
	        System.out.print(new String(kv.getQualifier()) + " " );
	        System.out.print(kv.getTimestamp() + " " );
	        System.out.println(new String(kv.getValue()));
	    }	    
	}
	
	
	
	@Test
	public void deleteOneRecord() throws IOException{
	    
	    String tableName = "myTest";
		
	    String rowKey = "20160201113312";
	    HTable table = new HTable(conf, tableName);
       
	    List<Delete> list = new ArrayList<Delete>();// typed list instead of a raw List
        Delete del = new Delete(rowKey.getBytes());
        list.add(del);
        
        table.delete(list);
        System.out.println("del record " + rowKey + " ok.");        
	}
	
	
	@Test
	public void deleteSomeRecord() throws IOException{
	    
	    String tableName = "myTest";
		
	    String rowKey = "20160201113312";
	    HTable table = new HTable(conf, tableName);
       
	    List<Delete> list = new ArrayList<Delete>();// typed list instead of a raw List
	    
        Delete del = new Delete(rowKey.getBytes());
        
        list.add(del);
        
        table.delete(list);
        System.out.println("del record " + rowKey + " ok.");        
	}
	
	
	@Test
	public void scanAdvance() throws IOException{
       
		HTableInterface  table = connection.getTable("myTest");
		
		Scan scan  = new Scan();
        List<Filter> rootList = new ArrayList<Filter>();
            
        List<Filter> selectList = new ArrayList<Filter>();
                
        List<Filter> select_1 = new ArrayList<Filter>();                   
        select_1.add(new FamilyFilter(CompareOp.EQUAL,new BinaryComparator(Bytes.toBytes("teacher"))));                    
        select_1.add(new QualifierFilter(CompareOp.EQUAL,new BinaryComparator(Bytes.toBytes("age"))));
              
        //List<Filter> select_2 = new ArrayList<Filter>();                 
        //select_2.add(new FamilyFilter(CompareOp.EQUAL,new BinaryComparator(Bytes.toBytes("cf2"))));                 
        //select_2.add(new QualifierFilter(CompareOp.EQUAL,new BinaryPrefixComparator(Bytes.toBytes("column"))));
            
        selectList.add(new FilterList(Operator.MUST_PASS_ALL, select_1));            
        //selectList.add(new FilterList(Operator.MUST_PASS_ALL, select_2));
        
        rootList.add(new FilterList(Operator.MUST_PASS_ONE,selectList));

        List<Filter> whereList = new ArrayList<Filter>();
               
        //whereList.add(new RowFilter(CompareOp.GREATER,new BinaryComparator(Bytes.toBytes("7"))));
        whereList.add(new ColumnRangeFilter(Bytes.toBytes("12"), true, Bytes.toBytes("13"), true));
                
        //whereList.add(new RowFilter(CompareOp.EQUAL,new BinaryPrefixComparator(Bytes.toBytes("xxx"))));      
        
        rootList.add(new FilterList(Operator.MUST_PASS_ALL,whereList));
        
        scan.setFilter(new FilterList(Operator.MUST_PASS_ALL, rootList));
        
        ResultScanner ss = table.getScanner(scan);
        
        for(Result r:ss){
            
        	for(KeyValue kv : r.raw()){
                
            	 System.out.print(new String(kv.getRow()) + " ");                   
            	 System.out.print(new String(kv.getFamily()) + ":");                    
            	 System.out.print(new String(kv.getQualifier()) + " ");                  
            	 //System.out.print(kv.getTimestamp() + " ");                   
            	 System.out.println(new String(kv.getValue()));
            }
        } 
    }
	
	
	
	@Test
	public void ColumnCountGetFilter() throws IOException{
       
		HTableInterface  table = connection.getTable("myTest");
  
        Get get=new Get(Bytes.toBytes("row-11"));  
        
        ColumnCountGetFilter filter=new ColumnCountGetFilter(2); 
        
        get.setFilter(filter);  
        Result result=table.get(get);  
        System.out.println(result.size()); 
		
    }
	
	
	@Test
	public void ColumnPaginationGetFilter() throws IOException{
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan=new Scan();  
        ColumnPaginationFilter filter=new ColumnPaginationFilter(1, 2);  
        scan.setFilter(filter);  
        ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
               
        		System.out.println(kv+"-----"+Bytes.toString(kv.getQualifier()));  
            }  
        }  
        resultScanner.close();  		
    }
	
	
	@Test
	public void RowFilter() throws IOException{// a row filter matches on the row key
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
		Filter filter1 = new RowFilter(CompareOp.LESS_OR_EQUAL, new BinaryComparator(Bytes.toBytes("row-23")));  
		scan.setFilter(filter1);  
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        resultScanner.close(); 
    }
	
	
	@Test
	public void FamilyFilter() throws IOException{// a family filter matches on the column family name
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	
		Filter filter1 = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("student")));  
		scan.setFilter(filter1);  
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        resultScanner.close(); 
    }
	
	@Test
	public void RowFamilyFilter() throws IOException{// combines a family filter with a row filter
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	 
		List<Filter> filterList =  new ArrayList<Filter>();
		Filter filter1 = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("student")));  
		
		Filter filter2 = new RowFilter(CompareOp.LESS_OR_EQUAL, new BinaryComparator(Bytes.toBytes("row-11")));  
		filterList.add(filter1);
		filterList.add(filter2);
		
		scan.setFilter(new FilterList(Operator.MUST_PASS_ALL, filterList));
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        resultScanner.close(); 
    }
	
	@Test
	public void QualifierFilter() throws IOException{// a qualifier filter matches on the column name
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	 
		List<Filter> filterList =  new ArrayList<Filter>();
		Filter filter1 = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("grade")));  
		
		filterList.add(filter1);
		
		scan.setFilter(new FilterList(Operator.MUST_PASS_ALL, filterList));
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        resultScanner.close(); 
    }
	
	@Test
	public void ValueFilter() throws IOException{// a value filter matches on cell values
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	 
		List<Filter> filterList =  new ArrayList<Filter>();
		Filter filter1 = new ValueFilter(CompareOp.LESS, new BinaryComparator(Bytes.toBytes("professor34")));  
		
		filterList.add(filter1);
		
		scan.setFilter(new FilterList(Operator.MUST_PASS_ALL, filterList));
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        resultScanner.close(); 
    }
	
	
	@Test
	public void PrefixFilter() throws IOException{// prefix match on the row key
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	  
		List<Filter> filterList =  new ArrayList<Filter>();
		
		PrefixFilter filter1 = new PrefixFilter(Bytes.toBytes("row-2"));  			
		
		filterList.add(filter1);
		
		scan.setFilter(new FilterList(Operator.MUST_PASS_ALL,filterList));
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        resultScanner.close(); 
    }
	
	
	@Test
	public void ColumnPrefixFilter() throws IOException{// narrows the scan to columns whose name starts with the prefix
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	  
		List<Filter> filterList =  new ArrayList<Filter>();
		
		ColumnPrefixFilter filter1 = new ColumnPrefixFilter(Bytes.toBytes("age"));  	
				
		filterList.add(filter1);
		
		scan.setFilter(new FilterList(Operator.MUST_PASS_ALL,filterList));
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        
        resultScanner.close(); 
    }
	
	
	@Test
	public void ColumnPrefixRangeFilter() throws IOException{// narrows the scan to a range of column names within one family
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	  
		List<Filter> filterList =  new ArrayList<Filter>();
		
		FamilyFilter filter1 = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("student")));
		
		ColumnRangeFilter filter2 = new ColumnRangeFilter(Bytes.toBytes("name"),true,Bytes.toBytes("name1"),true); 
			
		filterList.add(filter1);
		filterList.add(filter2);
		
		scan.setFilter(new FilterList(Operator.MUST_PASS_ALL,filterList));
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        
        resultScanner.close(); 
    }
	
	@Test
	public void PageFilter() throws IOException{// limits the number of rows returned per page
       
		HTableInterface  table = connection.getTable("myTest");
  
		Scan scan = new Scan();  
	
		List<Filter> filterList =  new ArrayList<Filter>();
		
		FamilyFilter filter1 = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("student")));
		
		PageFilter filter2 = new PageFilter(10); 
			
		filterList.add(filter1);
		filterList.add(filter2);
		
		scan.setFilter(new FilterList(Operator.MUST_PASS_ALL,filterList));
		
		ResultScanner resultScanner=table.getScanner(scan);  
        for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }  
        
		System.out.println("StartRow:"+new String(scan.getStartRow()));
		System.out.println("StopRow:"+new String(scan.getStopRow()));
		
        resultScanner.close(); 
    }
	
	
	@Test
	public void ScanStartStopRow() throws IOException{// example of using StartRow and StopRow
       
	   HTableInterface  table = connection.getTable("myTest");
  
	   Scan scan = new Scan();  
	
	   scan.setStartRow(Bytes.toBytes("row-23"));
	   scan.setStopRow(Bytes.toBytes("row-34"));
		
	   ResultScanner resultScanner=table.getScanner(scan);  
       for(Result result:resultScanner){  
           
        	for(KeyValue kv:result.raw()){  
                
        		System.out.println(kv+"-----"+Bytes.toString(kv.getValue()));  
            }
        }          
        resultScanner.close();         
    }
	
}

For how the test data was generated, please see the previous blog post.
