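Common Filters
Filter Operations
A scan reads the whole table by default; attaching a filter turns it into a conditional query.
Filter on the value of every cell in a given column family:column: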
SingleColumnValueFilter('StuInfo','Age',=,'binary:23')
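This returns the whole row that contains a matching cell, whereas ValueFilter returns only the matching cells.
SingleColumnValueExcludeFilter() applies the same test, but the returned row data leaves out the cell of the column being tested.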
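The difference between the three filters is easier to see on a toy table. The sketch below is a pure-Python model, not the HBase client API: the function names, the dict layout, and the sample rows are all assumptions made for illustration, and only the equality comparison is modeled. It also assumes the documented default that a row lacking the tested column is still emitted unless filterIfMissing is set.

```python
# Pure-Python model of three HBase value filters (NOT the HBase API).
# A table is modeled as {row_key: {"family:qualifier": value}}.
# The sample rows below are hypothetical.
rows = {
    "001": {"StuInfo:Age": "19", "StuInfo:Name": "Tom Green"},
    "002": {"StuInfo:Age": "23", "StuInfo:Name": "Lucy"},
}


def single_column_value_filter(table, column, value, filter_if_missing=False):
    """Keep whole rows whose `column` cell equals `value`."""
    kept = {}
    for key, cells in table.items():
        if column not in cells:
            # Assumed default: rows without the column pass through.
            if not filter_if_missing:
                kept[key] = dict(cells)
            continue
        if cells[column] == value:
            kept[key] = dict(cells)
    return kept


def single_column_value_exclude_filter(table, column, value, filter_if_missing=False):
    """Same row test, but drop the tested column from each returned row."""
    matched = single_column_value_filter(table, column, value, filter_if_missing)
    return {
        key: {c: v for c, v in cells.items() if c != column}
        for key, cells in matched.items()
    }


def value_filter(table, value):
    """Keep only the individual cells whose value equals `value`."""
    result = {}
    for key, cells in table.items():
        hits = {c: v for c, v in cells.items() if v == value}
        if hits:
            result[key] = hits
    return result


whole_rows = single_column_value_filter(rows, "StuInfo:Age", "23")
without_tested = single_column_value_exclude_filter(rows, "StuInfo:Age", "23")
cells_only = value_filter(rows, "23")
```

In this model, whole_rows holds both cells of row 002, without_tested holds only StuInfo:Name, and cells_only holds only StuInfo:Age.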
Limit the number of columns returned per row: ColumnCountGetFilter (meant for get rather than scan)
hbase(main):053:0> get 'Student','001',FILTER=>"ColumnCountGetFilter(3)"
COLUMN CELL
Grades:Bag timestamp=2, value=100
Grades:BigData timestamp=2, value=80
Grades:Computer timestamp=2, value=90
1 row(s) in 0.0150 seconds
Limit the number of rows returned: PageFilter (row paging)
Combine it with STARTROW to page through the rows:
hbase(main):056:0> scan 'Student',FILTER=>"PageFilter(1)"
ROW COLUMN+CELL
001 column=Grades:BigData, timestamp=2, value=80
001 column=Grades:Computer, timestamp=2, value=90
001 column=Grades:Math, timestamp=26, value=85
001 column=StuInfo:Age, timestamp=3, value=19
001 column=StuInfo:Class, timestamp=2, value=02
001 column=StuInfo:Name, timestamp=1, value=Tom Green
001 column=StuInfo:Sex, timestamp=1, value=Male
1 row(s) in 0.0160 seconds
hbase(main):066:0> scan 'Student',{STARTROW=>'002',FILTER=>'PageFilter(1)'}
ROW COLUMN+CELL
002 column=StuInfo:Age, timestamp=2, value=23
002 column=StuInfo:Name, timestamp=2, value=Lucy
1 row(s) in 0.0090 seconds
# Same result as using LIMIT:
hbase(main):066:0> scan 'Student',{STARTROW=>'002',LIMIT=>1}
ROW COLUMN+CELL
002 column=StuInfo:Age, timestamp=2, value=23
002 column=StuInfo:Name, timestamp=2, value=Lucy
1 row(s) in 0.0090 seconds
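The STARTROW plus PageFilter pattern can be sketched as a paging loop. This is a pure-Python model, not HBase code: scan_page and iter_pages are hypothetical helpers, row 003 is made up, and the model assumes a single region (on a real cluster PageFilter is applied per region server, so a page may come back larger than requested). Since STARTROW is inclusive, the loop resumes from the last key with a zero byte appended, a common way to make the start exclusive.

```python
# Pure-Python model of row paging with STARTROW + PageFilter (NOT the HBase API).
# Row keys are kept sorted, as HBase stores them; row "003" is hypothetical.
table = {"001": {}, "002": {}, "003": {}}


def scan_page(tbl, start_row, page_size):
    """Model of: scan tbl, {STARTROW => start_row, FILTER => "PageFilter(page_size)"}."""
    keys = sorted(k for k in tbl if k >= start_row)  # STARTROW is inclusive
    return keys[:page_size]


def iter_pages(tbl, page_size):
    """Yield successive pages of row keys until the table is exhausted."""
    start = ""
    while True:
        page = scan_page(tbl, start, page_size)
        if not page:
            return
        yield page
        # Resume just past the last key seen so it is not returned twice.
        start = page[-1] + "\x00"


pages = list(iter_pages(table, 2))
```

With a page size of 2, the three rows come back as two pages: the first with 001 and 002, the second with 003.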
Page through the columns of each row: ColumnPaginationFilter
ColumnPaginationFilter(num,offset)
Skips the first offset columns of each row, then returns at most num columns.
hbase(main):067:0> scan 'Student',FILTER=>"ColumnPaginationFilter(2,1)"
ROW COLUMN+CELL
001 column=Grades:Computer, timestamp=2, value=90
001 column=Grades:Math, timestamp=26, value=85
002 column=StuInfo:Name, timestamp=2, value=Lucy
2 row(s) in 0.0210 seconds
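The num/offset behavior amounts to slicing each row's sorted column list. The sketch below is a pure-Python model, not the HBase API; column_pagination_filter is a hypothetical helper, and sorting the "family:qualifier" strings is only an approximation of HBase's byte-wise ordering. The cell values are taken from the scan output shown earlier.

```python
# Pure-Python model of ColumnPaginationFilter(num, offset) (NOT the HBase API).
# Cell values are copied from the scan output shown earlier in these notes.
student = {
    "001": {
        "Grades:BigData": "80", "Grades:Computer": "90", "Grades:Math": "85",
        "StuInfo:Age": "19", "StuInfo:Class": "02",
        "StuInfo:Name": "Tom Green", "StuInfo:Sex": "Male",
    },
    "002": {"StuInfo:Age": "23", "StuInfo:Name": "Lucy"},
}


def column_pagination_filter(tbl, num, offset):
    """For each row, skip `offset` columns and keep at most `num` columns."""
    result = {}
    for key, cells in tbl.items():
        columns = sorted(cells)  # approximates HBase's column ordering
        page = columns[offset:offset + num]
        if page:  # rows with no columns left are dropped
            result[key] = {c: cells[c] for c in page}
    return result


paged = column_pagination_filter(student, 2, 1)
```

Here paged contains Grades:Computer and Grades:Math for row 001 and only StuInfo:Name for row 002, which matches the shell output above.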
Combining Filters
Join filters with AND or OR:
hbase(main):009:0> scan 'Student',FILTER=>"ColumnPaginationFilter(1,1) OR ValueFilter(=,'substring:Ma')"
ROW COLUMN+CELL
001 column=Grades:Computer, timestamp=2, value=90
001 column=StuInfo:Sex, timestamp=1, value=Male
002 column=Grades:Math, timestamp=1, value=89
2 row(s) in 0.0120 seconds