http://blog.csdn.net/liuxiaochen123/article/details/7737718
FirstKeyOnlyFilter: the API documentation describes it as follows:
A filter that will only return the first KV from each row.
This filter can be used to more efficiently perform row count operations.
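That is clear enough: the filter returns only the first KV of each row, so it can be used for a count (computing the total number of rows), and it is fast.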
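To make the saving concrete, here is a toy in-memory sketch. It is not HBase code: the class FirstKvModel, its methods, and the sample rows are all hypothetical, and a table is modeled simply as a map from row key to a list of cells. It only illustrates that keeping the first cell of each row leaves the row count unchanged while reducing the number of cells that have to be returned.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model (not HBase code) of what a first-KV-only filter keeps per row. */
final class FirstKvModel {

    /** Keeps only the first cell of every non-empty row. */
    static Map<String, List<String>> firstKeyOnly(Map<String, List<String>> table) {
        Map<String, List<String>> filtered = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> row : table.entrySet()) {
            if (!row.getValue().isEmpty()) {
                filtered.put(row.getKey(), Collections.singletonList(row.getValue().get(0)));
            }
        }
        return filtered;
    }

    /** Total number of cells across all rows, i.e. what would be sent to the client. */
    static int cellCount(Map<String, List<String>> table) {
        int cells = 0;
        for (List<String> row : table.values()) {
            cells += row.size();
        }
        return cells;
    }

    /** Hypothetical sample data: three rows holding 3, 1 and 2 cells. */
    static Map<String, List<String>> sampleTable() {
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("row1", Arrays.asList("cf:a=1", "cf:b=2", "cf:c=3"));
        table.put("row2", Arrays.asList("cf:a=4"));
        table.put("row3", Arrays.asList("cf:a=5", "cf:b=6"));
        return table;
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = sampleTable();
        Map<String, List<String>> filtered = firstKeyOnly(table);
        // Prints: rows=3 cells before=6 cells after=3
        System.out.println("rows=" + filtered.size()
                + " cells before=" + cellCount(table)
                + " cells after=" + cellCount(filtered));
    }
}
```

In this model the row count stays at 3 while the cells returned drop from 6 to 3; the wider the rows, the larger the difference.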
The code I use is below; corrections are welcome.
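import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;

// conf (the HBase Configuration) and log are fields of the enclosing class.
public int getCount() {
    long bef = System.currentTimeMillis();
    int i = 0;
    HTable tableKeyword = null;
    ResultScanner rs = null;
    try {
        tableKeyword = new HTable(conf, "tableName");
        tableKeyword.setScannerCaching(500);
        Scan s = new Scan();
        s.setCaching(500);                      // fetch up to 500 rows per round trip
        s.setCacheBlocks(false);                // do not fill the block cache during a full scan
        s.setFilter(new FirstKeyOnlyFilter());  // return only the first KV of each row
        rs = tableKeyword.getScanner(s);
        // Iterate inside the try block so a failed getScanner() cannot leave rs null here.
        for (Result r : rs) {
            i++;
        }
    } catch (IOException e) {
        log.warn("Row count scan failed", e);
    } finally {
        if (rs != null) {
            rs.close();
        }
        if (tableKeyword != null) {
            try { tableKeyword.close(); } catch (IOException e) { log.warn("Failed to close table", e); }
        }
    }
    long now = System.currentTimeMillis();
    log.warn("Total rows in the keyword table: " + i + ", elapsed time (s): " + (now - bef) / 1000.0);
    return i;
}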
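It is best to set these three parameters: tableKeyword.setScannerCaching(500), s.setCaching(500), and s.setCacheBlocks(false). Without them the scan slows down considerably.
Overall, this approach saves a lot of time.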
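One way to see why the caching value matters is a back-of-the-envelope estimate of client/server round trips. The sketch below is a simplified model, not a measurement: it assumes each round trip returns at most `caching` rows and ignores size limits and any extra call needed to detect the end of the scan. The class ScanRoundTrips and the figure of 1,000,000 rows are hypothetical.

```java
/** Simplified estimate of scanner round trips; an illustration, not HBase code. */
final class ScanRoundTrips {

    /** Number of fetches needed if each fetch returns at most `caching` rows. */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching must be > 0");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical table size
        // Prints: caching=1 -> 1000000 round trips
        System.out.println("caching=1 -> " + roundTrips(rows, 1) + " round trips");
        // Prints: caching=500 -> 2000 round trips
        System.out.println("caching=500 -> " + roundTrips(rows, 500) + " round trips");
    }
}
```

Under these assumptions, raising the caching value from 1 to 500 cuts the number of round trips by a factor of 500, which is consistent with the slowdown described above when the parameters are left unset.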