1. The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://infolab.stanford.edu/~backrub/google.html
http://cs.ucsb.edu/~chong/250C/google.pdf
2. The Google File System
http://labs.google.com/papers/gfs.html
http://os.inf.tu-dresden.de/Studium/DOS/SS2010/04-GFS-2.pdf
3. MapReduce
http://labs.google.com/papers/mapreduce.html
http://labs.google.com/papers/mapreduce-osdi04.pdf
4. BigTable
http://labs.google.com/papers/bigtable.html
5. Hadoop
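Hadoop is a distributed system for storing and processing massive data sets, built on a shared-nothing architecture. It is made up of several components, chiefly HDFS, MapReduce, Hive, HBase, Pig, and ZooKeeper. HDFS is an open-source implementation of Google's GFS, ZooKeeper is the open-source counterpart of Google's Chubby, and HBase is the open-source counterpart of Google's BigTable. HDFS offers high fault tolerance and strong linear scalability.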
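
To make the MapReduce component listed above more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word count as the example. It is not the Hadoop API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration, and everything runs in memory rather than across nodes reading from HDFS.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: List[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input records; in a real cluster these would be file splits.
    print(word_count(["hdfs stores blocks", "mapreduce reads blocks from hdfs"]))
```

In a real cluster, the map and reduce functions would run on many machines in parallel, with the framework handling the grouping step and the recovery from node failures; the sketch only shows the shape of the programming model.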