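What are HBase filters, and what are they for?
HBase has no SQL, so more complex queries have to be implemented with filters.
Filters
To make the results easier to inspect later on, first wrap result printing in a helper function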
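import java.util
import org.apache.hadoop.hbase.client.{Result, ResultScanner, Scan, Table}
import org.apache.hadoop.hbase.filter._
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp
import org.apache.hadoop.hbase.util.Bytes

// Print every Result returned by the scanner (HbaseUtil is a helper class not shown in this post)
def printScanner(resultScanner: ResultScanner): Unit = {
  val value: util.Iterator[Result] = resultScanner.iterator()
  while (value.hasNext) {
    val result: Result = value.next()
    HbaseUtil.printResult(result)
  }
}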
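Single-column value filter
For example, in SQL the query would be
select * from students where name = 'zhaoyun'
In HBase it is implemented like this
def testColumnFilter(): Unit = {
  // SingleColumnValueFilter does a binary comparison on the value of a single column.
  // If the column is found and the condition passes, all columns of that row are emitted; if the condition fails, the row is dropped.
  val filter1 = new SingleColumnValueFilter(Bytes.toBytes("f1"), Bytes.toBytes("name"), CompareOp.EQUAL, Bytes.toBytes("zhaoyun"))
  // Also drop rows that do not contain the column at all
  filter1.setFilterIfMissing(true)
  val scan = new Scan()
  scan.setFilter(filter1)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}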
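The effect of setFilterIfMissing is easy to miss, so here is a small plain-Scala model of it. This is only a sketch and does not use the HBase client: a row is modelled as a Map from qualifier to value, keepRow is a hypothetical stand-in for the filter with CompareOp.EQUAL, and the row rk9999 (which has no name column) is invented for illustration.

```scala
object SingleColumnValueSketch {
  // Toy model of a row: qualifier -> value (a single column family is assumed)
  type Row = Map[String, String]

  // Returns true when the row would be kept by an EQUAL comparison on one column.
  // A row that lacks the column is kept unless filterIfMissing is true.
  def keepRow(row: Row, qualifier: String, expected: String, filterIfMissing: Boolean): Boolean =
    row.get(qualifier) match {
      case Some(value) => value == expected
      case None        => !filterIfMissing
    }

  val rows: Map[String, Row] = Map(
    "rk00003" -> Map("name" -> "zhaoyun", "age" -> "25"),
    "rk00004" -> Map("name" -> "liubei", "age" -> "30"),
    "rk9999"  -> Map("age" -> "40") // hypothetical row without a name column
  )

  // Row keys that survive the filter, sorted for a stable result
  def scan(filterIfMissing: Boolean): List[String] =
    rows.filter { case (_, row) => keepRow(row, "name", "zhaoyun", filterIfMissing) }
      .keys.toList.sorted
}
```

With filterIfMissing = false the row without a name column slips through, which is usually not what a "where name = ..." query is meant to return. That is why the snippets in this post call setFilterIfMissing(true).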
RegexStringComparator
select * from students where name like 'zhao%'
def testRegexStringComparator(): Unit = {
  // Regular expression: the value must start with "zhao"
  val comparator = new RegexStringComparator("^zhao")
  // Specify the column family, the column qualifier and the pattern the value must match
  val filter = new SingleColumnValueFilter(Bytes.toBytes("f1"), Bytes.toBytes("name"), CompareFilter.CompareOp.EQUAL, comparator)
  filter.setFilterIfMissing(true)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scan = new Scan()
  // Every filter is attached to the Scan object
  scan.setFilter(filter)
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
SubstringComparator
Select the rows whose name value contains the substring "zh"
def testSubStringComparator(): Unit = {
  val comparator = new SubstringComparator("zh")
  val filter = new SingleColumnValueFilter(Bytes.toBytes("f1"), Bytes.toBytes("name"), CompareFilter.CompareOp.EQUAL, comparator)
  filter.setFilterIfMissing(true)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scan = new Scan()
  scan.setFilter(filter)
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
Output
rk00002 f1 age 23
rk00002 f1 gender f
rk00002 f1 name zhenji
rk00003 f1 age 25
rk00003 f1 gender m
rk00003 f1 name zhaoyun
BinaryPrefixComparator
The only difference is that the argument is a byte array, which is matched against the leading bytes of the value; everything else is the same as above
def testBinaryPrefixComparator(): Unit = {
  // Build the comparator
  val comparator = new BinaryPrefixComparator(Bytes.toBytes("zh"))
  // Pass it to the single-column value filter
  val filter = new SingleColumnValueFilter(Bytes.toBytes("f1"), Bytes.toBytes("name"), CompareFilter.CompareOp.EQUAL, comparator)
  filter.setFilterIfMissing(true)
  val scan = new Scan()
  scan.setFilter(filter)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
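To compare the three comparators used so far side by side, here is a plain-Scala sketch that applies equivalent checks to the four name values that appear in the scan output of this post. It does not call HBase. The function names are hypothetical, and the model rests on two assumptions about the comparators: the regex is searched for inside the value rather than matched against the whole value, and the substring check ignores case.

```scala
object ComparatorSketch {
  // The name values that appear in the ns1:students output in this post
  val names: List[String] = List("zhenji", "zhaoyun", "liubei", "lisi")

  // Model of RegexStringComparator: the pattern is searched for inside the value
  def regexMatches(value: String, pattern: String): Boolean =
    pattern.r.findFirstIn(value).isDefined

  // Model of SubstringComparator: containment test, assumed case-insensitive
  def substringMatches(value: String, sub: String): Boolean =
    value.toLowerCase.contains(sub.toLowerCase)

  // Model of BinaryPrefixComparator: the value's leading bytes must equal the prefix
  def prefixMatches(value: Array[Byte], prefix: Array[Byte]): Boolean =
    value.take(prefix.length).sameElements(prefix)

  val byRegex: List[String]     = names.filter(regexMatches(_, "^zhao"))
  val bySubstring: List[String] = names.filter(substringMatches(_, "zh"))
  val byPrefix: List[String] =
    names.filter(n => prefixMatches(n.getBytes("UTF-8"), "zh".getBytes("UTF-8")))
}
```

On this data the substring and prefix checks select the same two names, because "zh" only ever occurs at the start of a value. They would differ for a value such as "lizhao", which contains "zh" but does not begin with it.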
FamilyFilter
Used on its own, it returns every cell under the matching column family
def testFamilyFilter(): Unit = {
  // Keep only cells whose column family is exactly "f1"
  val comparator = new BinaryComparator(Bytes.toBytes("f1"))
  val filter = new FamilyFilter(CompareFilter.CompareOp.EQUAL, comparator)
  val scan = new Scan()
  scan.setFilter(filter)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
QualifierFilter
Select the columns whose qualifier contains "na"
def testQualifierFilter(): Unit = {
  val comparator = new SubstringComparator("na")
  val filter = new QualifierFilter(CompareFilter.CompareOp.EQUAL, comparator)
  val scan = new Scan()
  scan.setFilter(filter)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
Output: only the cells of the matching column are printed; unrelated key-value pairs are not returned
rk00002 f1 name zhenji
rk00003 f1 name zhaoyun
rk00004 f1 name liubei
rk0001 f1 name lisi
ColumnPrefixFilter
As the name suggests, this filters on a prefix of the column qualifier
def testColumnPrefixFilter(): Unit = {
  // Keep only cells whose qualifier starts with "ag"
  val filter = new ColumnPrefixFilter(Bytes.toBytes("ag"))
  val scan = new Scan()
  scan.setFilter(filter)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
Output
rk00002 f1 age 23
rk00003 f1 age 25
rk00004 f1 age 30
rk0001 f1 age 15
ColumnRangeFilter
def testColumnRangeFilter(): Unit = {
  // Qualifiers in the range from "age" to "name"; true makes a bound inclusive, false makes it exclusive
  val filter = new ColumnRangeFilter(Bytes.toBytes("age"), true, Bytes.toBytes("name"), false)
  val scan = new Scan()
  scan.setFilter(filter)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
Output
rk00002 f1 age 23
rk00002 f1 gender f
rk00003 f1 age 25
rk00003 f1 gender m
rk00004 f1 age 30
rk00004 f1 gender m
rk0001 f1 age 15
rk0001 f1 height 180
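The output above includes gender and height but not name, which follows from comparing qualifiers in lexicographic order with an inclusive lower bound and an exclusive upper bound. The following plain-Scala sketch reproduces that selection without HBase. The inRange helper is hypothetical, and it compares Strings instead of byte arrays, which gives the same order only under the assumption that the qualifiers are plain ASCII.

```scala
object ColumnRangeSketch {
  // Lexicographic range check on a qualifier, with separately configurable bounds
  def inRange(qualifier: String,
              min: String, minInclusive: Boolean,
              max: String, maxInclusive: Boolean): Boolean = {
    val aboveMin = if (minInclusive) qualifier >= min else qualifier > min
    val belowMax = if (maxInclusive) qualifier <= max else qualifier < max
    aboveMin && belowMax
  }

  // The qualifiers that occur in the ns1:students output in this post
  val qualifiers: List[String] = List("age", "gender", "height", "name")

  // Same bounds as in testColumnRangeFilter: ["age", "name")
  val kept: List[String] = qualifiers.filter(inRange(_, "age", true, "name", false))
}
```

Switching the last argument to true would let name through as well.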
RowFilter
def testRowFilter(): Unit = {
  // Keep only the row whose rowkey is exactly "rk0001"
  val comparator = new BinaryComparator(Bytes.toBytes("rk0001"))
  val filter = new RowFilter(CompareFilter.CompareOp.EQUAL, comparator)
  val scan = new Scan()
  scan.setFilter(filter)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
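Rowkeys are compared as raw bytes, which explains why rk0001 is printed after rk00004 in the outputs in this post, and it matters as soon as RowFilter is used with an operator other than EQUAL. A short plain-Scala sketch of that ordering, without HBase, assuming ASCII rowkeys so that String order and byte order agree:

```scala
object RowKeyOrderSketch {
  // The rowkeys that appear in the outputs in this post, in arbitrary order
  val keys: List[String] = List("rk0001", "rk00004", "rk00002", "rk00003")

  // Lexicographic order: "rk0001" sorts last because its sixth character '1' is greater than '0'
  val scanOrder: List[String] = keys.sorted

  // Model of a "less than rk0001" comparison on the rowkey
  val lessThanRk0001: List[String] = scanOrder.filter(_ < "rk0001")
}
```

A shorter key is therefore not automatically a smaller one. Padding all rowkeys to the same width avoids this kind of surprise.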
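Filter lists
For example, in SQL the query would be
select * from students where name = 'zhaoyun' and age = '25'
How can this be done in HBase? HBase has no SQL, and lookups normally go through the rowkey. To query on column values instead, the HBase API provides filters. There are two column/value conditions here, so two single-column value filters have to be combined in a filter list to get the result. For every row that satisfies the conditions, the whole row is returned.
def testFilterList(): Unit = {
  // MUST_PASS_ALL: a row is kept only if every filter in the list passes (logical AND)
  val list = new FilterList(FilterList.Operator.MUST_PASS_ALL)
  // SingleColumnValueFilter does a binary comparison on the value of a single column.
  // If the column is found and the condition passes, all columns of that row are emitted; if the condition fails, the row is dropped.
  val filter1 = new SingleColumnValueFilter(Bytes.toBytes("f1"), Bytes.toBytes("name"), CompareOp.EQUAL, Bytes.toBytes("zhaoyun"))
  val filter2 = new SingleColumnValueFilter(Bytes.toBytes("f1"), Bytes.toBytes("age"), CompareOp.EQUAL, Bytes.toBytes("25"))
  filter1.setFilterIfMissing(true)
  filter2.setFilterIfMissing(true)
  list.addFilter(filter1)
  list.addFilter(filter2)
  val scan = new Scan()
  scan.setFilter(list)
  val table: Table = HbaseUtil.getTable("ns1:students")
  val scanner: ResultScanner = table.getScanner(scan)
  printScanner(scanner)
}
Output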
rk00003 f1 age 25
rk00003 f1 gender m
rk00003 f1 name zhaoyun
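FilterList also has a MUST_PASS_ONE operator, which corresponds to OR instead of AND. The difference between the two can be sketched in plain Scala with row predicates. This is a model only, not HBase code: the helper names are hypothetical, rows are Maps built from the values shown in the outputs of this post, and a missing column counts as a failed condition, as with setFilterIfMissing(true).

```scala
object FilterListSketch {
  type Row = Map[String, String]
  type RowPredicate = Row => Boolean

  // Model of an EQUAL single-column value filter; a missing column fails the check
  def columnEquals(qualifier: String, value: String): RowPredicate =
    row => row.get(qualifier).contains(value)

  // Model of FilterList.Operator.MUST_PASS_ALL (AND)
  def mustPassAll(filters: RowPredicate*): RowPredicate =
    row => filters.forall(f => f(row))

  // Model of FilterList.Operator.MUST_PASS_ONE (OR)
  def mustPassOne(filters: RowPredicate*): RowPredicate =
    row => filters.exists(f => f(row))

  // Rows as they appear in the outputs in this post, in scan order
  val rows: List[(String, Row)] = List(
    "rk00002" -> Map("age" -> "23", "gender" -> "f", "name" -> "zhenji"),
    "rk00003" -> Map("age" -> "25", "gender" -> "m", "name" -> "zhaoyun"),
    "rk00004" -> Map("age" -> "30", "gender" -> "m", "name" -> "liubei"),
    "rk0001"  -> Map("age" -> "15", "height" -> "180", "name" -> "lisi")
  )

  // Rowkeys of the rows accepted by the predicate
  def scan(filter: RowPredicate): List[String] =
    rows.collect { case (key, row) if filter(row) => key }

  val bothMatch: List[String] =
    scan(mustPassAll(columnEquals("name", "zhaoyun"), columnEquals("age", "25")))
  val eitherMatches: List[String] =
    scan(mustPassOne(columnEquals("name", "zhenji"), columnEquals("age", "25")))
}
```

Here bothMatch models the query from this section, while eitherMatches models "name = 'zhenji' or age = '25'".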
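Summary
- Filters make up for the limitation that HBase data is normally looked up by rowkey only, which on its own is very restrictive
- For practically every part of an HBase cell (rowkey, column family, qualifier, value) there is a filter that can match on it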