[Original] Notes on The Google File System
Assumptions: the system is built from many commodity machines and must be able to detect and recover from failures. It stores a modest number of large files and is optimized for them. Reads are primarily of two kinds: large streaming
2012-02-26 17:13:08 637
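[Original] Exploiting Locality in Programs
Focus on the inner loops, where most of the computation and memory accesses occur. Read data in the order in which it is stored in memory, to maximize spatial locality. Once a data item has been read in, use it as many times as possible, to maximize temporal locality. Cache hit rate is only one important factor in performance; the number of memory accesses also matters, and the two must be traded off against each other. Excerpted from Chapter 6 of Computer Systems: A Programmer's Perspective (《深入理解计算机系统》).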
2012-02-25 23:40:59 613
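The locality guidelines in the excerpt above can be sketched as two traversals of the same matrix. This is only a minimal illustration: the function names and the sample matrix are hypothetical and do not come from the original post. In CPython a list of lists stores references rather than contiguous values, so the cache effect is much weaker than in a C array; the sketch shows the access pattern, not a benchmark.

```python
def sum_row_major(matrix):
    """Visit elements in the order a row-major layout stores them."""
    total = 0
    for row in matrix:        # outer loop: one row at a time
        for value in row:     # inner loop: neighbors within the row (spatial locality)
            total += value    # 'total' is reused on every iteration (temporal locality)
    return total


def sum_column_major(matrix):
    """Same result, but every inner-loop step jumps to a different row."""
    total = 0
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    for j in range(n_cols):
        for i in range(n_rows):
            total += matrix[i][j]
    return total


# Hypothetical 3 x 4 matrix holding the values 0..11.
matrix = [[r * 4 + c for c in range(4)] for r in range(3)]
```

Both functions return the same sum; only the order of memory accesses in the inner loop differs, and that order is what the advice in the excerpt targets.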
[Original] Distributed Sort via MapReduce vs. K-Way Merge + Quicksort
Distributed Sort via MapReduce: the map function just outputs key + record. Partition the intermediate keys into R pieces; these R pieces are ordered partitions of the key value domain. This functions as a bucket sort
2012-02-23 09:54:09 2284
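The scheme in the excerpt above can be sketched in a single process as follows. This is an illustration under stated assumptions, not the post's or any framework's implementation: `map_fn`, `partition`, `distributed_sort`, and the `boundaries` list (split points chosen in advance, giving R = len(boundaries) + 1 pieces) are all hypothetical names, and the per-bucket sort stands in for the sorting a MapReduce framework would do within each partition.

```python
from bisect import bisect_right


def map_fn(record, key_of):
    """Map step: emit the pair (key, record) unchanged."""
    return key_of(record), record


def partition(key, boundaries):
    """Range partitioner: index of the key-range bucket that 'key' falls into.

    'boundaries' must be sorted; bucket i only holds keys that are
    smaller than every key in bucket i + 1.
    """
    return bisect_right(boundaries, key)


def distributed_sort(records, boundaries, key_of=lambda r: r):
    # R = len(boundaries) + 1 buckets, one per key range.
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for record in records:
        key, value = map_fn(record, key_of)
        buckets[partition(key, boundaries)].append((key, value))

    result = []
    for bucket in buckets:
        # Each bucket is sorted independently (in a cluster, in parallel).
        bucket.sort(key=lambda kv: kv[0])
        result.extend(value for _, value in bucket)
    # Buckets cover ordered key ranges, so concatenation is globally sorted.
    return result
```

For example, `distributed_sort([5, 1, 9, 3, 7, 2], boundaries=[3, 6])` places the keys into three range buckets, sorts each one, and concatenates them into `[1, 2, 3, 5, 7, 9]`. No final merge is needed because the partitioner preserves key order across buckets, which is the bucket-sort property the excerpt points to.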
[Original] Google: MapReduce in a Week Notes
1. Failure is the number one concern in distributed system design: hardware failure and software failure. Heisenbug: a bug that seems to disappear or alter its characteristics when it is observe
2012-02-21 22:37:59 409
[Original] Hadoop Study Notes
1. Quick Start on MapReduce: Google: MapReduce in a Week; MapReduce paper notes; The Google File System notes. 2. Hadoop: relationships among Hadoop releases; Hadoop configuration. 3. MapReduce use cases: MapReduce Patterns, Algorith
2012-02-21 22:16:37 304
Learning Spark
2019-03-06