HBase reads data with Scan. The settings that speed up reads are:
Scan scan = new Scan();
scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false); // don't set to true for MR jobs
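To see why a larger setCaching value helps, here is a minimal sketch that estimates how many scanner round trips a full scan needs. It assumes each fetch returns at most `caching` rows and ignores other limits (such as a maximum result size) that a real client may also apply. The class name, method name, and the table size are hypothetical and not part of the HBase API.

```java
public class ScanCachingEstimate {

    // Rough number of scanner round trips needed to fetch `rows` rows
    // when each fetch returns at most `caching` rows (ceiling division).
    // This is an illustrative model, not an HBase API call.
    static long roundTrips(long rows, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical table size
        System.out.println("caching=1   -> " + roundTrips(rows, 1) + " round trips");
        System.out.println("caching=500 -> " + roundTrips(rows, 500) + " round trips");
    }
}
```

Under this model, the per-row fetch cost of a small caching value dominates a MapReduce job that scans the whole table, which is why the snippet above raises it to 500. A larger value costs more client memory per fetch, so it is a trade-off rather than a case of "bigger is always better".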
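Here,
public Scan setCacheBlocks(boolean cacheBlocks) // Set whether blocks should be cached for this Scan
The default is true. Data is held in three places: memory (MemStore), cache (BlockCache), and disk (HFiles), and a read normally goes memory -> cache -> disk. Leaving setCacheBlocks at true is a poor fit for MapReduce jobs: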
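An MR job reads non-hot data, which does not need to be cached. The BlockCache is LRU (least recently used: when it is full, it evicts the blocks that have gone unaccessed the longest), so a previous request (for example, a map task reading&#x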