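Note: this article follows https://blog.csdn.net/m0_37739193/article/details/73615016 and re-tests its content. I mainly add supplementary material and comments. Every example has been tested, and the code section adds HBase shell operations that the original author did not include. It is written mostly for my own records.
If you have questions, leave them in the comments section.
JAR dependencies required to run the examples (Maven):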
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>2.0.5</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>2.0.5</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<version>2.7.7</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-common</artifactId>
<version>2.0.5</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.13.2</version>
<scope>test</scope>
</dependency>
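Enter the hbase shell and start inserting data (full table name test:scores, where test is the namespace and scores is the table name)
// Create the table (the test namespace must already exist, e.g. via create_namespace 'test')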
hbase(main):179:0> create 'test:scores','grade','course'
// Insert data
hbase(main):180:0> put 'test:scores','zhangsan01','course:art','90'
// Scan the result
hbase(main):181:0> scan 'test:scores'
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1616489599699, value=90
1 row(s) in 0.0150 seconds
hbase(main):182:0> put 'test:scores','zhangsan01','course:math','99',1616489599699
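(A manually set timestamp must not be later than your current system time; otherwise the cell cannot be deleted, since a delete marker stamped with the current time does not cover a cell whose timestamp lies in the future. I set the timestamp by hand here for the DependentColumnFilter experiment further down: look up the timestamp of the first inserted cell and reuse it when inserting the second one.)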
hbase(main):183:0> put 'test:scores','zhangsan01','grade:','101'
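Problem: after deleting this cell, running put 'test:scores','zhangsan01','grade:','101',1616489599699 succeeds, yet scan 'test:scores' does not show the cell. Running put 'test:scores','zhangsan01','grade:','101' afterwards makes it visible again. To insert this cell with a manually set timestamp, do so on the very first insert or after a truncate. (A cell is uniquely identified by namespace + table name + rowkey + family + qualifier + timestamp, and by default only one version is kept. The likely reason the cell stays invisible: the delete wrote a tombstone marker, and a tombstone masks every version of the cell whose timestamp is not newer than the marker's, until a major compaction removes the marker. A put with an older explicit timestamp is therefore stored but hidden, whereas a put stamped with the current time is newer than the marker and shows up.)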
hbase(main):184:0> put 'test:scores','zhangsan02','course:art','90'
hbase(main):186:0> put 'test:scores','zhangsan02','grade:','102',1616489886634
hbase(main):187:0> put 'test:scores','zhangsan02','course:math','66',1616489599699
hbase(main):188:0> put 'test:scores','lisi01','course:math','89',1616489599699
hbase(main):189:0> put 'test:scores','lisi01','course:art','89'
hbase(main):190:0> put 'test:scores','lisi01','grade:','201',1616489599699
All data:
hbase(main):102:0> scan 'test:scores'
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102
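The put/delete/put behaviour noted earlier (a put with an older explicit timestamp staying invisible after a delete) can be reasoned about with a tiny model. This is only a sketch in plain Java, not HBase code: the class TombstoneModel and its methods are hypothetical, the first two timestamps in main are assumed values, and it keeps just enough state to show a delete marker masking versions that are not newer than it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of one column's versions; not HBase code.
class TombstoneModel {
    private final List<Long> putTimestamps = new ArrayList<>();
    private long newestTombstone = -1L; // -1 means no delete marker yet

    // Record a put with the given timestamp.
    void put(long timestamp) {
        putTimestamps.add(timestamp);
    }

    // Record a delete marker; it masks every version with timestamp <= marker.
    void deleteUpTo(long timestamp) {
        newestTombstone = Math.max(newestTombstone, timestamp);
    }

    // Newest timestamp a scan would show, or -1 if every version is masked.
    long newestVisible() {
        long best = -1L;
        for (long ts : putTimestamps) {
            if (ts > newestTombstone && ts > best) {
                best = ts;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TombstoneModel cell = new TombstoneModel();
        cell.put(1616489700000L);        // first put (assumed server time)
        cell.deleteUpTo(1616489750000L); // delete issued later (assumed time)
        cell.put(1616489599699L);        // put with an explicit, older timestamp
        System.out.println(cell.newestVisible()); // prints -1
        cell.put(1616489773779L);        // put stamped with the then-current time
        System.out.println(cell.newestVisible()); // prints 1616489773779
    }
}
```

In the model, as in the shell session, the put with the older timestamp is accepted but hidden, and only a put newer than the marker becomes visible. The model ignores major compaction, which would eventually remove the marker.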
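The following query examples are copied directly from the original author; only the namespace was changed (so the timestamps differ from the data above).
Querying data:
Query by rowkey: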
hbase(main):187:0> get 'test:scores','zhangsan01'
COLUMN CELL
course:art timestamp=1498003561726, value=90
course:math timestamp=1498003561726, value=99
grade: timestamp=1498003593575, value=101
3 row(s) in 0.0160 seconds
Query by column name:
hbase(main):188:0> scan 'test:scores',{COLUMNS=>'course:art'}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1498003655021, value=89
zhangsan01 column=course:art, timestamp=1498003561726, value=90
zhangsan02 column=course:art, timestamp=1498003601365, value=90
3 row(s) in 0.0120 seconds
Query the rows between two rowkeys (the result includes the start row zhangsan01 and excludes the stop row zhangsan02):
hbase(main):205:0> scan 'test:scores',{STARTROW=>'zhangsan01',STOPROW=>'zhangsan02'}
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1498003561726, value=90
zhangsan01 column=course:math, timestamp=1498003561726, value=99
zhangsan01 column=grade:, timestamp=1498003593575, value=101
1 row(s) in 0.0140 seconds
Query between two rowkeys, restricted to one column (includes zhangsan01, excludes the stop row zhangsan02):
hbase(main):206:0> scan 'test:scores',{COLUMNS=>'course:art',STARTROW=>'zhangsan01',STOPROW=>'zhangsan02'}
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1498003561726, value=90
1 row(s) in 0.0110 seconds
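Query from a given rowkey to the end, restricted to one column (includes zhangsan01; the stop row zhangsan09 is exclusive and sorts after every existing row):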
hbase(main):207:0> scan 'test:scores',{COLUMNS=>'course:art',STARTROW=>'zhangsan01',STOPROW=>'zhangsan09'}
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1498003561726, value=90
zhangsan02 column=course:art, timestamp=1498003601365, value=90
2 row(s) in 0.0310 seconds
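The STARTROW/STOPROW behaviour in the scans above (start row included, stop row excluded) is that of a half-open range over sorted keys. The following plain-Java sketch mimics it with a TreeSet. ScanRangeModel is a hypothetical name, and String ordering stands in for HBase's byte ordering, which gives the same result for these ASCII rowkeys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of a start/stop-row scan over sorted rowkeys; not HBase code.
class ScanRangeModel {
    // Returns the rows in [startRow, stopRow): start inclusive, stop exclusive.
    static List<String> scan(TreeSet<String> sortedRows, String startRow, String stopRow) {
        return new ArrayList<>(sortedRows.subSet(startRow, true, stopRow, false));
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>();
        rows.add("lisi01");
        rows.add("zhangsan01");
        rows.add("zhangsan02");
        System.out.println(scan(rows, "zhangsan01", "zhangsan02")); // prints [zhangsan01]
        System.out.println(scan(rows, "zhangsan01", "zhangsan09")); // prints [zhangsan01, zhangsan02]
    }
}
```

The second call shows why a stop row such as zhangsan09 reaches the end of the zhangsan rows: the stop key does not have to exist, it only has to sort after the last row that should be returned.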
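Using filters:
Introduction -- parameter basics
Two parameter types appear in many filters, so they are introduced together here:
(1) Comparison operator: CompareOperator (the CompareFilter.CompareOp enum used by the original author is deprecated and not recommended)
The comparison operator defines the relation to test and takes one of these values:
EQUAL equal to
GREATER greater than
GREATER_OR_EQUAL greater than or equal to
LESS less than
LESS_OR_EQUAL less than or equal to
NOT_EQUAL not equal to
(2) Comparator: ByteArrayComparable
Comparators enable different kinds of matching; the following subclasses are available:
BinaryComparator compares against the complete byte array
BinaryPrefixComparator compares against a byte-array prefix
BitComparator Performs a bitwise comparison, providing a BitwiseOp class with AND, OR, and XOR operators.
NullComparator Does not compare against an actual value but whether a given one is null, or not null.
RegexStringComparator regular-expression match (fuzzy query)
SubstringComparator substring match (contains query)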
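To make the comparator semantics concrete, here is a small plain-Java sketch of what each kind of match means. It does not use the HBase classes: ComparatorModel and its methods are hypothetical stand-ins. Two details are my assumptions about the real comparators: that the substring check is case-insensitive, and that the regex only has to be found somewhere in the value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-ins for the comparator semantics; not the HBase classes.
class ComparatorModel {
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // BinaryComparator-style: lexicographic comparison of unsigned bytes.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // BinaryPrefixComparator-style equality: only the first prefix.length bytes count.
    static boolean hasPrefix(byte[] value, byte[] prefix) {
        if (value.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (value[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // SubstringComparator-style: case-insensitive containment (assumption).
    static boolean containsSubstring(String value, String part) {
        return value.toLowerCase(Locale.ROOT).contains(part.toLowerCase(Locale.ROOT));
    }

    // RegexStringComparator-style: pattern found anywhere in the value (assumption).
    static boolean regexFind(String value, String regex) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(compareBytes(bytes("grade"), bytes("grc")) <= 0); // prints true
        System.out.println(compareBytes(bytes("math"), bytes("b")) <= 0);    // prints false
        System.out.println(compareBytes(bytes(""), bytes("b")) <= 0);        // prints true
        System.out.println(hasPrefix(bytes("zhangsan01"), bytes("zhang")));  // prints true
        System.out.println(containsSubstring("89", "9"));                    // prints true
        System.out.println(regexFind("lisi01", ".*n01"));                    // prints false
    }
}
```

The first three lines correspond to a LESS_OR_EQUAL test against the table's family and qualifier names: both grade and the empty qualifier sort at or before the comparison value, while math sorts after b.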
All of the code below has passed my own tests. The last method, scanCondition, shows a multi-condition query (AND and OR combined). Remember to change the package name.
package com.lyn.demo.hbase;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;
import org.junit.Before;
import org.junit.Test;
import java.util.*;
/**
* @author lyn
* @date 2021-03-25
*/
public class HBaseFilterDemo {
private static Connection connection;
@Before
public void createConnection() throws Exception{
if(connection==null){
Configuration conf = new Configuration();
// ZooKeeper quorum address
conf.set("hbase.zookeeper.quorum","master");
// ZooKeeper client port
conf.set("hbase.zookeeper.property.clientPort", "2181");
connection = ConnectionFactory.createConnection(conf);
}
}
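/**
1, FamilyFilter
a, filters by column family and returns every cell whose family satisfies the condition
b, the first constructor argument is the CompareOperator
c, the second argument is a ByteArrayComparable (WritableByteArrayComparable in old releases), one of BinaryComparator,
BinaryPrefixComparator,
BitComparator, NullComparator, RegexStringComparator, SubstringComparator;
BinaryComparator is the most commonly used
*/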
@Test
public void FamilyFilter() throws Exception{
Filter filter = new FamilyFilter(CompareOperator.LESS_OR_EQUAL,
new BinaryComparator(Bytes.toBytes("grc")));
Scan scan =new Scan();
scan.setFilter(filter);
_printResult(scan);
/* Output printed by the program
ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:lisi01 column:grade: value=201
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:course:math value=66
ROW:zhangsan02 column:grade: value=102*/
/* Equivalent hbase shell scan
hbase(main):115:0> scan 'test:scores',{FILTER=>"FamilyFilter(<=,'binary:grc')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
}
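/**
2, QualifierFilter
Similar to FamilyFilter; returns every cell whose column qualifier satisfies the condition
(the empty qualifier of grade: sorts before "b", so it is returned as well)
The first constructor argument is the CompareOperator
The second argument is a ByteArrayComparable
*/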
@Test
public void QualifierFilter() throws Exception{
Filter filter = new QualifierFilter(CompareOperator.LESS_OR_EQUAL,
new BinaryComparator(Bytes.toBytes("b")));
Scan scan =new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:lisi01 column:grade: value=201
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:grade: value=101
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:grade: value=102*/
/* hbase(main):129:0> scan 'test:scores',{FILTER=>"QualifierFilter(<=,'binary:b')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=grade:, timestamp=1616489886634, value=102 */
}
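/**
3, RowFilter
Constructor arguments are like FamilyFilter's; every row whose rowkey satisfies the condition is returned.
When the start and end rowkeys are known, however, Scan's withStartRow and withStopRow are more direct and,
in my tests, more than 50% faster
*/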
@Test
public void RowFilter()throws Exception{
Filter filter = new RowFilter(CompareOperator.LESS_OR_EQUAL,
new BinaryComparator(Bytes.toBytes("zhangsan01")));
Scan scan =new Scan();
scan.setFilter(filter);
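// Adding a start/stop range narrows the search and speeds it up.
// A filter alone is still evaluated against every row of the table, whereas a scan with start and stop rows
// begins at the start row and ends at the stop row, which is far more efficient.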
/* scan.withStartRow(Bytes.toBytes("aaaaaa"),true);
scan.withStopRow(Bytes.toBytes("zzzzzzz"),true);*/
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:lisi01 column:grade: value=201
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101*/
/* hbase(main):141:0> scan 'test:scores',
{FILTER=>"RowFilter(<=,'binary:zhangsan01')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101*/
}
/**
4, PrefixFilter
Returns all rows whose rowkey starts with the given prefix
*/
@Test
public void PrefixFilter() throws Exception{
Filter filter = new PrefixFilter(Bytes.toBytes("zhang"));
Scan scan =new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:course:math value=66
ROW:zhangsan02 column:grade: value=102*/
/* hbase(main):151:0> scan 'test:scores',{FILTER=>"PrefixFilter('zhang')"}
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
}
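/*
HBase ships PrefixFilter for rowkey prefix queries, but I found no built-in counterpart for suffix queries.
The workaround is a RowFilter combined with a regular expression:
RegexStringComparator performs the regex match. The pattern below has no end anchor; writing ".*n01$" makes the suffix match explicit.
*/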
@Test
public void SuffixFilter() throws Exception{
Filter filter = new RowFilter(CompareOperator.EQUAL,new RegexStringComparator(".*n01"));
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101*/
/* hbase(main):162:0> scan 'test:scores',
{FILTER=>"RowFilter(=,'regexstring:.*n01')"}
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101*/
}
/**
* 5, ColumnPrefixFilter: filters by column-qualifier prefix
*/
@Test
public void ColumnPrefixFilter() throws Exception{
Filter filter = new ColumnPrefixFilter(Bytes.toBytes("ar"));
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan02 column:course:art value=90*/
/* hbase(main):001:0> scan 'test:scores',{FILTER=>"ColumnPrefixFilter('ar')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan02 column=course:art, timestamp=1616489886634,
value=90*/
}
/**
* 6, MultipleColumnPrefixFilter: filters by several column-qualifier prefixes (the prefixes are combined with OR)
*/
@Test
public void MultipleColumnPrefixFilter() throws Exception{
MultipleColumnPrefixFilter multipleColumnPrefixFilter =
new MultipleColumnPrefixFilter(
new byte[][]{Bytes.toBytes("ar"),Bytes.toBytes("ma")});
Scan scan = new Scan();
scan.setFilter(multipleColumnPrefixFilter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:course:math value=66*/
/* hbase(main):007:0> scan 'test:scores',{FILTER=>"MultipleColumnPrefixFilter('ar','ma')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66*/
}
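/* 7, ColumnCountGetFilter
a, intended for Get rather than Scan
b, with 0 no data is returned; with n, the first n columns are returned in stored (sorted) order
c, result.size() gives the number of cells returned, which makes the effect easy to observe */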
@Test
public void ColumnCountGetFilter() throws Exception{
// The argument is the maximum number of columns to return; columns come back in sorted byte order (A-Z before a-z)
ColumnCountGetFilter filter = new ColumnCountGetFilter(2);
Table table = connection.getTable(TableName.valueOf("test", "scores"));
Get get = new Get(Bytes.toBytes("zhangsan01"));
get.setFilter(filter);
Result result = table.get(get);
Cell[] cells = result.rawCells();
for(Cell cell:cells){
String family = Bytes.toString(CellUtil.cloneFamily(cell));
String column = Bytes.toString(CellUtil.cloneQualifier(cell));
String value = Bytes.toString(CellUtil.cloneValue(cell));
System.out.println(String.format("ROW:%s column:%s:%s value=%s",Bytes.toString(result.getRow()),family,column,value));
}
/* It also works with a Scan, but that is not recommended
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);*/
/* ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99*/
/* hbase(main):026:0> get 'test:scores','zhangsan01',{FILTER=>"ColumnCountGetFilter(2)"}
COLUMN CELL
course:art timestamp=1616489599699, value=90
course:math timestamp=1616489599699, value=99*/
/* hbase(main):024:0> scan 'test:scores',{FILTER=>"ColumnCountGetFilter(2) AND RowFilter(=,'binary:zhangsan01')"}
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99*/
}
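/*
8, ColumnPaginationFilter
a, limit is the maximum number of columns returned per row
b, offset is the number of leading columns to skip: 0 starts at the first column, 1 starts at the second column
*/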
@Test
public void ColumnPaginationFilter() throws Exception{
// Return at most 2 columns per row, starting from the second column, in sorted byte order (A-Z before a-z)
ColumnPaginationFilter filter = new ColumnPaginationFilter(2, 1);
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:math value=89
ROW:lisi01 column:grade: value=201
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101
ROW:zhangsan02 column:course:math value=66
ROW:zhangsan02 column:grade: value=102*/
/* hbase(main):034:0> scan 'test:scores',{FILTER=>"ColumnPaginationFilter(2,1)"}
ROW COLUMN+CELL
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, times*/
}
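/* 9, ColumnRangeFilter
Constructor:
ColumnRangeFilter(byte[] minColumn, boolean minColumnInclusive, byte[] maxColumn, boolean maxColumnInclusive)
* Selects a range of columns; for example, if a row has a million columns but you only want those named from bbbb to dddd
* this filter can seek efficiently within the row (efficient because column names are stored in sorted order). Introduced in HBase 0.92.
* A column name can occur in several column families; the filter returns the matching columns from every family */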
@Test
public void ColumnRangeFilter() throws Exception{
ColumnRangeFilter filter = new ColumnRangeFilter(Bytes.toBytes("a"), true,
Bytes.toBytes("n"), true);
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:course:math value=66*/
/* hbase(main):046:0> scan 'test:scores',{FILTER=>"ColumnRangeFilter('a',true,'n',true)"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66*/
}
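/* 10, DependentColumnFilter (takes two arguments, family and qualifier; it looks for that column in every row
and returns all key-values of the row that have the same timestamp as it. If a row does not contain the column, nothing from that row is returned.
An optional boolean argument, if true, drops the dependent column itself from the result.
Two further optional arguments, a comparison operator and a value comparator, add a check
on the dependent column's value: the column must be found, its value must pass the check, and then the timestamps are matched) */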
@Test
public void DependentColumnFilter() throws Exception{
Filter filter = new DependentColumnFilter(Bytes.toBytes("course")
,Bytes.toBytes("art"),false,CompareOperator.EQUAL,
new BinaryComparator(Bytes.toBytes("90")));
// Filter filter = new DependentColumnFilter(Bytes.toBytes("course"), Bytes.toBytes("art")); // dropDependentColumn defaults to false
// Filter filter = new DependentColumnFilter(Bytes.toBytes("course"), Bytes.toBytes("art"),false);
// Filter filter = new DependentColumnFilter(Bytes.toBytes("course"), Bytes.toBytes("art"),true);
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:grade: value=102*/
/* hbase(main):055:0> scan 'test:scores',{FILTER=>"DependentColumnFilter('course','art',false,=,'binary:90')"}
ROW COLUMN+CELL
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
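// dropDependentColumn = true omits the matched dependent column and shows only the other cells with the same timestamp
// useful for finding values that were written together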
/* Filter filter = new DependentColumnFilter(Bytes.toBytes("course")
,Bytes.toBytes("art"),true,CompareOperator.EQUAL,
new BinaryComparator(Bytes.toBytes("90")));*/
/* ROW:zhangsan01 column:course:math value=99
ROW:zhangsan02 column:grade: value=102*/
/* hbase(main):056:0> scan 'test:scores',{FILTER=>"DependentColumnFilter('course','art',true,=,'binary:90')"}
ROW COLUMN+CELL
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
}
/*
11, FirstKeyOnlyFilter
As the name says, returns only the first key-value of each row
*/
@Test
public void FirstKeyOnlyFilter() throws Exception{
FirstKeyOnlyFilter filter = new FirstKeyOnlyFilter();
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan02 column:course:art value=90*/
/* hbase(main):075:0> scan 'test:scores',{FILTER=>"FirstKeyOnlyFilter()"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan02 column=course:art, timestamp=1616489886634, value=90*/
}
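/*
12, FuzzyRowFilter
Fuzzy rowkey matching
The first element of each Pair is the rowkey pattern as bytes
The second is a mask byte[] of the same length holding 0 or 1: 0 means the byte at that position must equal the pattern, 1 means it may differ
*/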
@Test
public void FuzzyRowFilter() throws Exception{
System.out.println(Bytes.toBytes("zhangsan01").length);
Filter filter = new FuzzyRowFilter(Arrays.asList(new Pair<byte[], byte[]>(Bytes.toBytes("zhangsan01"),
new byte[]{0, 0, 0, 0, 0, 0, 0, 0, 0, 1})));
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:course:math value=66
ROW:zhangsan02 column:grade: value=102*/
}
/*
13, InclusiveStopFilter
Scans from the beginning and returns everything up to the given stopRow (the stopRow itself is returned too, then the scan stops)
*/
@Test
public void InclusiveStopFilter() throws Exception{
Filter filter = new InclusiveStopFilter(Bytes.toBytes("zhangsan01"));
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:lisi01 column:grade: value=201
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101*/
/* hbase(main):082:0> scan 'test:scores',{FILTER=>"InclusiveStopFilter('zhangsan01')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101*/
}
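/*
14, KeyOnlyFilter
Returns only the keys; the cell count is unchanged, so values are not skipped but rewritten as empty (printable, not null)
lenAsVal: with false the value prints as empty; with true the value is replaced by the original value's length as a 4-byte integer,
e.g. "\x00\x00\x00\x02" for the 2-byte value "90" and "\x00\x00\x00\x03" for the 3-byte value "101"
Mostly used today for counting the total number of records
*/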
@Test
public void KeyOnlyFilter() throws Exception{
Filter filter = new KeyOnlyFilter(false);
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=
ROW:lisi01 column:course:math value=
ROW:lisi01 column:grade: value=
ROW:zhangsan01 column:course:art value=
ROW:zhangsan01 column:course:math value=
ROW:zhangsan01 column:grade: value=
ROW:zhangsan02 column:course:art value=
ROW:zhangsan02 column:course:math value=
ROW:zhangsan02 column:grade: value=*/
/* hbase(main):096:0> scan 'test:scores',{FILTER=>"KeyOnlyFilter(false)"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=
lisi01 column=course:math, timestamp=1616489599699, value=
lisi01 column=grade:, timestamp=1616489599699, value=
zhangsan01 column=course:art, timestamp=1616489599699, value=
zhangsan01 column=course:math, timestamp=1616489599699, value=
zhangsan01 column=grade:, timestamp=1616489773779, value=
zhangsan02 column=course:art, timestamp=1616489886634, value=
zhangsan02 column=course:math, timestamp=1616489599699, value=
zhangsan02 column=grade:, timestamp=1616489886634, value=*/
/* Filter filter = new KeyOnlyFilter(true);
hbase(main):095:0> scan 'test:scores',{FILTER=>"KeyOnlyFilter(true)"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=\x00\x00\x00\x02
lisi01 column=course:math, timestamp=1616489599699, value=\x00\x00\x00\x02
lisi01 column=grade:, timestamp=1616489599699, value=\x00\x00\x00\x03
zhangsan01 column=course:art, timestamp=1616489599699, value=\x00\x00\x00\x02
zhangsan01 column=course:math, timestamp=1616489599699, value=\x00\x00\x00\x02
zhangsan01 column=grade:, timestamp=1616489773779, value=\x00\x00\x00\x03
zhangsan02 column=course:art, timestamp=1616489886634, value=\x00\x00\x00\x02
zhangsan02 column=course:math, timestamp=1616489599699, value=\x00\x00\x00\x02
zhangsan02 column=grade:, timestamp=1616489886634, value=\x00\x00\x00\x03*/
}
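/*
15, PageFilter
Returns a limited number of rows; useful for paging
{
Scan.setReversed(x): true scans in reverse order, false scans forward
A forward scan (ascending byte order, A-Z then a-z) fetches the next page,
a reverse scan (descending byte order, z-a then Z-A) fetches the previous page
The limit is enforced separately on each region server: if the matching rows live on 2 region servers, both
receive the request and each returns up to the limit, so with a page size of 2 as in the example below
the client can receive up to 2 (page size) * 2 (region servers) = 4 rows
}
*/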
@Test
public void PageFilter() throws Exception{
Filter filter = new PageFilter(2);
Scan scan = new Scan();
scan.setFilter(filter);
// Forward scan: ascending byte order (A-Z, a-z)
scan.setReversed(false);
/*// Reverse scan: descending byte order (z-a, Z-A)
scan.setReversed(true);
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:course:math value=66
ROW:zhangsan02 column:grade: value=102
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101*/
_printResult(scan);
// Forward scan output
/* ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:lisi01 column:grade: value=201
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101*/
/* hbase(main):139:0> scan 'test:scores',{FILTER=>"PageFilter(2)"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101*/
}
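/* Returns a random subset of rows; I have found no practical use for it
16, RandomRowFilter
A chance below 0 returns no rows and a chance above 1 returns all rows; for random sampling the useful range is 0 to 1, the value being the probability that a row is included
*/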
@Test
public void RandomRowFilter() throws Exception{
Filter filter = new RandomRowFilter(0.5F);
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
}
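/*
17, SingleColumnValueFilter and SingleColumnValueExcludeFilter
Select rows by testing the value of one column
a, if the row does not contain the column, both filters return the whole row (unless setFilterIfMissing(true) is set)
b, if the row contains the column but the value fails the condition, nothing from that row is returned
c, if the column is found and passes the condition, the former returns all columns, the latter returns all columns except the tested one
*/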
@Test
public void SingleColumnValueFilter() throws Exception{
SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("course")
, Bytes.toBytes("art"), CompareOperator.EQUAL
, new SubstringComparator("9"));
/* Same condition, but the tested column is dropped and only the row's other columns are shown
SingleColumnValueExcludeFilter filter = new SingleColumnValueExcludeFilter(Bytes.toBytes("course")
, Bytes.toBytes("art"), CompareOperator.EQUAL
, new SubstringComparator("9"));*/
// Match the complete byte array
//Filter filter = new SingleColumnValueFilter(Bytes.toBytes("course"), Bytes.toBytes("art"),CompareOperator.EQUAL,new BinaryComparator(Bytes.toBytes("90")));
/* // Equivalent to the SubstringComparator version here; performance not yet compared
SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("course")
, Bytes.toBytes("art"), CompareOperator.EQUAL
, new RegexStringComparator("9"));*/
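// Controls rows that lack the tested column (true: such rows are filtered out; false: such rows are returned as-is).
// If the column was introduced later, true hides older rows that never had it,
// so it is only safe when the column has always existed. It cuts both ways; use with care.
filter.setFilterIfMissing(false);
// Whether to test only the latest version of the column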
filter.setLatestVersionOnly(true);
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
// put 'test:scores','lisi02','course:math','89' (a row inserted for the test below)
/* With filter.setFilterIfMissing(false) as above; if it is set to true, the row [ROW:lisi02 column:course:math value=89] is filtered out
ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:lisi01 column:grade: value=201
ROW:lisi02 column:course:math value=89
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101
ROW:zhangsan02 column:course:art value=90
ROW:zhangsan02 column:course:math value=66
ROW:zhangsan02 column:grade: value=102*/
/* hbase(main):173:0> scan 'test:scores',{FILTER=>"SingleColumnValueFilter('course','art',=,'substring:9')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
lisi02 column=course:math, timestamp=1616576689519, value=89
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
/* The tested column is omitted and the other columns are shown; that is what SingleColumnValueExcludeFilter does
hbase(main):187:0> scan 'test:scores',{FILTER=>"SingleColumnValueExcludeFilter('course','art',=,'substring:9')"}
ROW COLUMN+CELL
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
lisi02 column=course:math, timestamp=1616576689519, value=89
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
}
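/* 18, ValueFilter
Filters by value across the whole table (the value of every column is tested)
EQUAL (operator [=]): only cells whose value matches are returned; other cells of the same rowkey are not
NOT_EQUAL (operator [!=]): cells whose value matches are dropped; the other cells of the same rowkey are returned
It filters on cell values only, never on the rowkey
*/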
@Test
public void ValueFilter() throws Exception{
Filter filter =
new ValueFilter(CompareOperator.EQUAL, new BinaryComparator(Bytes.toBytes("102")));
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:zhangsan02 column:grade: value=102*/
/* hbase(main):208:0> scan 'test:scores',{FILTER=>"ValueFilter(=,'binary:102')"}
ROW COLUMN+CELL
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
}
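/* 19, SkipFilter
Wraps another filter and drops the whole row as soon as one of its cells fails the wrapped filter.
For example, if every column of a row holds the weight of some item, real values must be greater than zero, and we want to drop every row that contains a 0 in any column.
Combining ValueFilter and SkipFilter achieves this:
scan.setFilter(new SkipFilter(new ValueFilter(CompareOperator.NOT_EQUAL,new BinaryComparator(Bytes.toBytes(0)))));*/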
@Test
public void SkipFilter() throws Exception{
SkipFilter filter = new SkipFilter(
new ValueFilter(CompareOperator.NOT_EQUAL,
new BinaryComparator(Bytes.toBytes("102"))));
//Filter filter = new SkipFilter(new DependentColumnFilter(Bytes.toBytes("course"),
// Bytes.toBytes("art"),false,
// CompareOperator.NOT_EQUAL,new BinaryComparator(Bytes.toBytes("90"))));
/* Behaves like a plain SingleColumnValueFilter: rows that satisfy the single condition are returned
SkipFilter filter = new SkipFilter(
new SingleColumnValueFilter(Bytes.toBytes("course")
, Bytes.toBytes("art"),CompareOperator.EQUAL,
new BinaryComparator(Bytes.toBytes("90"))));*/
Scan scan = new Scan();
// SkipFilter only works as a wrapper around another filter
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89
ROW:lisi01 column:course:math value=89
ROW:lisi01 column:grade: value=201
ROW:zhangsan01 column:course:art value=90
ROW:zhangsan01 column:course:math value=99
ROW:zhangsan01 column:grade: value=101*/
}
@Test
public void TimestampsFilter() throws Exception{
// 20, TimestampsFilter: stamps holds every timestamp to match
List<Long> stamps =new ArrayList<Long>();
stamps.add(1616489599699L);
stamps.add(1616489886634L);
Filter filter = new TimestampsFilter(stamps);
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:math value=89 timestamps=1616489599699
ROW:lisi01 column:grade: value=201 timestamps=1616489599699
ROW:zhangsan01 column:course:art value=90 timestamps=1616489599699
ROW:zhangsan01 column:course:math value=99 timestamps=1616489599699
ROW:zhangsan02 column:course:art value=90 timestamps=1616489886634
ROW:zhangsan02 column:course:math value=66 timestamps=1616489599699
ROW:zhangsan02 column:grade: value=102 timestamps=1616489886634*/
/* hbase(main):250:0> scan 'test:scores',{FILTER=>"TimestampsFilter(1616489599699,1616489886634)"}
ROW COLUMN+CELL
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102 */
}
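/*
21, WhileMatchFilter
Works like a while loop: as soon as a cell fails to match, the scan breaks off and returns.
The table is scanned until the first non-matching cell, nothing after it is scanned, and only the matching part of that last rowkey is shown
*/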
@Test
public void WhileMatchFilter() throws Exception{
Filter filter = new WhileMatchFilter(new ValueFilter(CompareOperator.NOT_EQUAL,
new BinaryComparator(Bytes.toBytes("101"))));
Scan scan = new Scan();
scan.setFilter(filter);
_printResult(scan);
/* ROW:lisi01 column:course:art value=89 timestamps=1616490059006
ROW:lisi01 column:course:math value=89 timestamps=1616489599699
ROW:lisi01 column:grade: value=201 timestamps=1616489599699
ROW:zhangsan01 column:course:art value=90 timestamps=1616489599699
ROW:zhangsan01 column:course:math value=99 timestamps=1616489599699*/
/* hbase(main):237:0> scan 'test:scores'
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan01 column=course:art, timestamp=1616489599699, value=90
zhangsan01 column=course:math, timestamp=1616489599699, value=99
zhangsan01 column=grade:, timestamp=1616489773779, value=101
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
}
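/* 22. FilterList
Represents a chain of filters that is applied to the target data set. The filters are combined with
AND (FilterList.Operator.MUST_PASS_ALL) or
OR (FilterList.Operator.MUST_PASS_ONE).
Example from the official documentation, combining two filters with OR: */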
@Test
public void FilterList() throws Exception{
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
Bytes.toBytes("course"),
Bytes.toBytes("math"),
CompareOperator.EQUAL,
new BinaryComparator(Bytes.toBytes("89")));
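// Pitfall: if the qualifier is mistyped (e.g. "match" instead of "math"), the column does not exist
// and by default every row passes the filter, so the whole table is returned.
// setFilterIfMissing(true) drops rows that lack the column and guards against such typos.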
filter1.setFilterIfMissing(true);
SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
Bytes.toBytes("course"),
Bytes.toBytes("math"),
CompareOperator.EQUAL,
new BinaryComparator(Bytes.toBytes("66")));
// Same pitfall as for filter1: without setFilterIfMissing(true), rows that lack
// the column (for example because of a mistyped qualifier) pass the filter.
filter2.setFilterIfMissing(true);
filterList.addFilter(filter1);
filterList.addFilter(filter2);
Scan scan = new Scan();
scan.setFilter(filterList);
_printResult(scan);
/*
ROW:lisi01 column:course:art value=89 timestamps=1616490059006
ROW:lisi01 column:course:math value=89 timestamps=1616489599699
ROW:lisi01 column:grade: value=201 timestamps=1616489599699
ROW:zhangsan02 column:course:art value=90 timestamps=1616489886634
ROW:zhangsan02 column:course:math value=66 timestamps=1616489599699
ROW:zhangsan02 column:grade: value=102 timestamps=1616489886634
*/
/* hbase(main):270:0> scan 'test:scores',{FILTER=>"SingleColumnValueFilter('course','math',=,'binary:89') OR SingleColumnValueFilter('course','math',=,'binary:66')"}
ROW COLUMN+CELL
lisi01 column=course:art, timestamp=1616490059006, value=89
lisi01 column=course:math, timestamp=1616489599699, value=89
lisi01 column=grade:, timestamp=1616489599699, value=201
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102*/
}
@Test
public void scanCondition() throws Exception{
// --- Return only the course column family ---
Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("course"));
_printResult(scan);
System.out.println("-------------------");
/* ROW:lisi01 column:course:art value=89 timestamps=1616490059006
ROW:lisi01 column:course:math value=89 timestamps=1616489599699
ROW:zhangsan01 column:course:art value=90 timestamps=1616489599699
ROW:zhangsan01 column:course:math value=99 timestamps=1616489599699
ROW:zhangsan02 column:course:art value=90 timestamps=1616489886634
ROW:zhangsan02 column:course:math value=66 timestamps=1616489599699*/
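// --- Select both column families through the family map (a null qualifier set selects the whole family) ---
scan = new Scan();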
Map<byte[], NavigableSet<byte[]>> familyMap = scan.getFamilyMap();
familyMap.put(Bytes.toBytes("course"),null);
familyMap.put(Bytes.toBytes("grade"),null);
scan.setFamilyMap(familyMap);
_printResult(scan);
System.out.println("-------------------");
/* ROW:lisi01 column:course:art value=89 timestamps=1616490059006
ROW:lisi01 column:course:math value=89 timestamps=1616489599699
ROW:lisi01 column:grade: value=201 timestamps=1616489599699
ROW:zhangsan01 column:course:art value=90 timestamps=1616489599699
ROW:zhangsan01 column:course:math value=99 timestamps=1616489599699
ROW:zhangsan01 column:grade: value=101 timestamps=1616489773779
ROW:zhangsan02 column:course:art value=90 timestamps=1616489886634
ROW:zhangsan02 column:course:math value=66 timestamps=1616489599699
ROW:zhangsan02 column:grade: value=102 timestamps=1616489886634 */
// --- Return only the course:art column ---
scan = new Scan();
scan.addColumn(Bytes.toBytes("course"),Bytes.toBytes("art"));
_printResult(scan);
System.out.println("-------------------");
/* ROW:lisi01 column:course:art value=89 timestamps=1616490059006
ROW:zhangsan01 column:course:art value=90 timestamps=1616489599699
ROW:zhangsan02 column:course:art value=90 timestamps=1616489886634*/
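// --- Restrict by timestamp: the range is [min, max), so only cells stamped 1616489599699 match ---
scan = new Scan();
scan.setTimeRange(1616489599699L,1616489599700L);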
_printResult(scan);
System.out.println("-------------------");
/* ROW:lisi01 column:course:math value=89 timestamps=1616489599699
ROW:lisi01 column:grade: value=201 timestamps=1616489599699
ROW:zhangsan01 column:course:art value=90 timestamps=1616489599699
ROW:zhangsan01 column:course:math value=99 timestamps=1616489599699
ROW:zhangsan02 column:course:math value=66 timestamps=1616489599699*/
// Count the rows with a forward scan so that the number of pages can be derived
Scan pageScan = new Scan();
pageScan.withStartRow(Bytes.toBytes("lisi01"),true);
pageScan.withStopRow(Bytes.toBytes("zhangsan02"),true);
FirstKeyOnlyFilter filter = new FirstKeyOnlyFilter();
pageScan.setFilter(filter);
Table table = connection.getTable(TableName.valueOf("test", "scores"));
ResultScanner scanner = table.getScanner(pageScan);
int count = 0;
for(Result result:scanner){
count++;
}
// Release the counting scanner before the variable is reused for the page scans
scanner.close();
System.out.println("--------------count:"+count);
String startRow = "";
// Whether to scan in reverse (Z->A, z->a); false means a forward scan (A->Z, a->z)
boolean reversed = false;
for(int i=1;i<=(count%2==0?count/2:count/2+1);i++){
PageFilter pageFilter = new PageFilter(2);
FilterList filterListTotal = new FilterList(FilterList.Operator.MUST_PASS_ALL);
filterListTotal.addFilter(pageFilter);
Scan pageScanResult = new Scan();
pageScanResult.setReversed(reversed);
// Forward scan (A->Z, a->z): rows are read front to back, so lisi01 is the start row
if(!reversed) {
pageScanResult.withStartRow(Bytes.toBytes("lisi01"), true);
pageScanResult.withStopRow(Bytes.toBytes("zhangsan02"), true);
// Reverse scan (Z->A, z->a): rows are read back to front, so zhangsan02 is the start row
}else {
pageScanResult.withStartRow(Bytes.toBytes("zhangsan02"), true);
pageScanResult.withStopRow(Bytes.toBytes("lisi01"), true);
}
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);
if(i!=1&&!"".equals(startRow)){
System.out.println("startRow:"+startRow);
// Forward scan: continue with the rows after the last row of the previous page
if(!reversed) {
RowFilter rowFilter = new RowFilter(CompareOperator.GREATER,
new BinaryComparator(Bytes.toBytes(startRow)));
filterListTotal.addFilter(rowFilter);
// Reverse scan: continue with the rows before the last row of the previous page
}else{
RowFilter rowFilter = new RowFilter(CompareOperator.LESS,
new BinaryComparator(Bytes.toBytes(startRow)));
filterListTotal.addFilter(rowFilter);
}
}
SingleColumnValueFilter columnValueFilter1 = new SingleColumnValueFilter(Bytes.toBytes("course"), Bytes.toBytes("art")
, CompareOperator.EQUAL, new SubstringComparator("90"));
SingleColumnValueFilter columnValueFilter2 = new SingleColumnValueFilter(Bytes.toBytes("course"), Bytes.toBytes("art")
, CompareOperator.EQUAL, new SubstringComparator("89"));
// Inner OR list: course:art contains "90" or "89"
filterList.addFilter(columnValueFilter1);
filterList.addFilter(columnValueFilter2);
filterListTotal.addFilter(filterList);
pageScanResult.setFilter(filterListTotal);
scanner = table.getScanner(pageScanResult);
for(Result result:scanner){
Cell[] cells = result.rawCells();
for(Cell cell:cells){
String family = Bytes.toString(CellUtil.cloneFamily(cell));
String column = Bytes.toString(CellUtil.cloneQualifier(cell));
String value = Bytes.toString(CellUtil.cloneValue(cell));
System.out.println(String.format("ROW:%s column:%s:%s value=%s timestamps=%s",Bytes.toString(result.getRow()),family,column,value,cell.getTimestamp()));
}
startRow = Bytes.toString(result.getRow());
}
// Release the scanner of this page
scanner.close();
System.out.println("----------page break---------------");
/* --------------count:3
ROW:lisi01 column:course:art value=89 timestamps=1616490059006
ROW:lisi01 column:course:math value=89 timestamps=1616489599699
ROW:lisi01 column:grade: value=201 timestamps=1616489599699
ROW:zhangsan01 column:course:art value=90 timestamps=1616489599699
ROW:zhangsan01 column:course:math value=99 timestamps=1616489599699
ROW:zhangsan01 column:grade: value=101 timestamps=1616489773779
----------page break---------------
startRow:zhangsan01
ROW:zhangsan02 column:course:art value=90 timestamps=1616489886634
ROW:zhangsan02 column:course:math value=66 timestamps=1616489599699
ROW:zhangsan02 column:grade: value=102 timestamps=1616489886634
----------page break---------------
hbase(main):307:0> scan 'test:scores', {FILTER => "(SingleColumnValueFilter('course','art',=,'binary:90') OR SingleColumnValueFilter('course','art',=,'binary:89')) AND PageFilter(2) AND RowFilter(>,'binary:zhangsan01')",STARTROW=>'lisi01',STOPROW=>'zhangsan03'}
ROW COLUMN+CELL
zhangsan02 column=course:art, timestamp=1616489886634, value=90
zhangsan02 column=course:math, timestamp=1616489599699, value=66
zhangsan02 column=grade:, timestamp=1616489886634, value=102
*/
}
// Release the table handle once all pages have been read
table.close();
}
// Print every cell of the result set
private void _printResult(Scan scan) throws Exception{
// try-with-resources closes the scanner and the table even if the scan throws
try (Table table = connection.getTable(TableName.valueOf("test", "scores"));
ResultScanner scanner = table.getScanner(scan)) {
for(Result result:scanner){
Cell[] cells = result.rawCells();
for(Cell cell:cells){
String family = Bytes.toString(CellUtil.cloneFamily(cell));
String column = Bytes.toString(CellUtil.cloneQualifier(cell));
String value = Bytes.toString(CellUtil.cloneValue(cell));
System.out.println(String.format("ROW:%s column:%s:%s value=%s timestamps=%s",Bytes.toString(result.getRow()),family,column,value,cell.getTimestamp()));
}
}
}
}
}
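
The paging loop in scanCondition first counts all rows and then fetches each page with "rows after the last row seen, at most 2 rows". The counting pass is not strictly required for that pattern. The sketch below is a plain-Java illustration of the idea and does not call HBase at all: a TreeMap stands in for the sorted row keys of test:scores, and KeysetPagingSketch, nextPage and allPages are hypothetical names introduced only for this example. It keeps requesting the next page after the last row key until an empty page comes back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class KeysetPagingSketch {

    // Returns up to pageSize row keys strictly after lastRow,
    // or from the first row when lastRow is null.
    // (In-memory stand-in for "RowFilter(GREATER, lastRow) AND PageFilter(pageSize)".)
    static List<String> nextPage(NavigableMap<String, String> rows, String lastRow, int pageSize) {
        NavigableMap<String, String> remaining =
                (lastRow == null) ? rows : rows.tailMap(lastRow, false);
        List<String> page = new ArrayList<>();
        for (String rowKey : remaining.keySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(rowKey);
        }
        return page;
    }

    // Walks all pages without counting the rows first:
    // the loop ends as soon as a page comes back empty.
    static List<List<String>> allPages(NavigableMap<String, String> rows, int pageSize) {
        List<List<String>> pages = new ArrayList<>();
        String lastRow = null;
        while (true) {
            List<String> page = nextPage(rows, lastRow, pageSize);
            if (page.isEmpty()) {
                break;
            }
            pages.add(page);
            // Remember the last row key of this page as the exclusive start of the next one
            lastRow = page.get(page.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        // Row keys and course:art values taken from the sample table
        NavigableMap<String, String> rows = new TreeMap<>();
        rows.put("lisi01", "89");
        rows.put("zhangsan01", "90");
        rows.put("zhangsan02", "90");
        System.out.println(allPages(rows, 2)); // prints [[lisi01, zhangsan01], [zhangsan02]]
    }
}
```

The trade-off of this variant is one extra, empty fetch when the row count is an exact multiple of the page size. In return it avoids the full counting scan, and the page count can no longer disagree with the filtered result.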