I. Google Paper Series
1. Translator's Preface to the Google Paper Series
2. The Anatomy of a Large-Scale Hypertextual Web Search Engine (translated, reposted)
3. Web Search for a Planet: The Google Cluster Architecture (translated)
5. MapReduce: Simplified Data Processing on Large Clusters (translated)
6. Bigtable: A Distributed Storage System for Structured Data (translated)
7. Chubby: The Chubby Lock Service for Loosely-Coupled Distributed Systems (translated)
8. Sawzall: Interpreting the Data: Parallel Analysis with Sawzall (translated, reposted)
9. Pregel: A System for Large-Scale Graph Processing (translated)
10. Dremel: Interactive Analysis of Web-Scale Datasets
11. Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications
12. Megastore: Providing Scalable, Highly Available Storage for Interactive Services
13. Case Study GFS: Evolution on Fast-forward (translated)
14. Google File System II: Dawn of the Multiplying Master Nodes
15. Tenzing - A SQL Implementation on the MapReduce Framework (translated)
16. F1: The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business
17. Elmo: Building a Globally Distributed, Highly Available Database
18. PowerDrill: Processing a Trillion Cells per Mouse Click
19. Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers
20. Spanner: Google’s Globally-Distributed Database
21. Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
22. Omega: flexible, scalable schedulers for large compute clusters
23. CPI2: CPU performance isolation for shared compute clusters
24. F1: A Distributed SQL Database That Scales
25. MillWheel: Fault-Tolerant Stream Processing at Internet Scale
26. B4: Experience with a Globally-Deployed Software Defined WAN
27. The Datacenter as a Computer
II. Distributed Systems Theory Series
00. Appraising Two Decades of Distributed Computing Theory Research
0. Translator's Preface to the Distributed Systems Theory Series
1. A Brief History of Consensus, 2PC and Transaction Commit (translated)
2. The Byzantine Generals Problem (translated) --Leslie Lamport
3. Impossibility of Distributed Consensus with One Faulty Process (translated)
5. Time, Clocks, and the Ordering of Events in a Distributed System (translated) --Leslie Lamport
6. On the History of Paxos
7. The Part-Time Parliament (translated, reposted) --Leslie Lamport
8. How to Build a Highly Available System Using Consensus (translated)
9. Paxos Made Simple (translated) --Leslie Lamport
10. Paxos Made Live - An Engineering Perspective (translated)
11. 2 Phase Commit (translated)
16. Single-Message Communication
17. Wait-Free Synchronization
18. Uniform consensus is harder than consensus
21. Paxos made code - Implementing a high throughput Atomic Broadcast
22. Distributed Snapshots: Determining Global States of Distributed Systems --K. Mani Chandy & Leslie Lamport
23. Virtual Time and Global States of Distributed Systems
24. Timestamps in Message-Passing Systems That Preserve the Partial Ordering
25. Fundamentals of Distributed Computing: A Practical Tour of Vector Clock Systems
26. Knowledge and Common Knowledge in a Distributed Environment
27. Understanding Failures in Petascale Computers
30. Life beyond Distributed Transactions: an Apostate’s Opinion
III. Database Theory Series
0. A Relational Model of Data for Large Shared Data Banks --E.F.Codd 1970
1. SEQUEL: A Structured English Query Language 1974
2. Implementation of a Structured English Query Language 1975
3. System R: Relational Approach to Database Management 1976
4. Granularity of Locks and Degrees of Consistency in a Shared Data Base --Jim Gray 1976
5. Access Path Selection in a RDBMS 1979
6. The Transaction Concept: Virtues and Limitations --Jim Gray
7. 2PC (two-phase commit): Notes on Data Base Operating Systems --Jim Gray
8. 3PC (three-phase commit): Nonblocking Commit Protocols
9. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging 1992
10. A Comparison of the Byzantine Agreement Problem and the Transaction Commit Problem --Jim Gray
11. A Formal Model of Crash Recovery in a Distributed System --D. Skeen, M. Stonebraker
12. What Goes Around Comes Around --Michael Stonebraker, Joseph M. Hellerstein
IV. Large-Scale Storage and Computation (NoSQL Theory Series)
0. Towards Robust Distributed Systems: Brewer's 2000 PODC keynote
1. The CAP Theorem
2. Harvest, Yield, and Scalable Tolerant Systems
3. On CAP
4. The BASE Model: BASE: An ACID Alternative
5. Eventual Consistency
6. Scalability Design Patterns
7. Scalability Principles
8. The NoSQL Ecosystem
9. scalability-availability-stability-patterns
10. The 5 Minute Rule and the 5 Byte Rule (translated)
11. The Five-Minute Rule 20 Years Later (and How Flash Memory Changes the Rules)
12. On the MapReduce Debate
15. MapReduce and Parallel Databases: Friends or Foes? (reposted)
16. MapReduce and Parallel DBMSs: Friends or Foes? (translated)
17. MapReduce: A Flexible Data Processing Tool (translated)
18. A Comparison of Approaches to Large-Scale Data Analysis (translated)
21. Map-Reduce-Merge: simplified relational data processing on large clusters
22. MapReduce Online
23. Graph Twiddling in a MapReduce World
24. Spark: Cluster Computing with Working Sets
25. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
26. Big Data Lambda Architecture
V. Fundamental Algorithms and Data Structures
2. A Summary of Methods for Processing Massive Data Sets (continued)
3. Consistent Hashing and Random Trees
4. Merkle Trees
5. Scalable Bloom Filters
6. Introduction to Distributed Hash Tables
7. B-Trees and Relational Database Systems
8. The Log-Structured Merge-Tree (translated)
10. Data Structures for Spatial Database
11. Gossip
13. The Graph Traversal Pattern
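VI. Fundamental Systems and Practical Experience
2. Dynamo: Amazon’s Highly Available Key-value Store (translated)
3. Cassandra - A Decentralized Structured Storage System (translated)
4. PNUTS: Yahoo!’s Hosted Data Serving Platform (translated)
5. An Introduction to and Reflections on PNUTS, Yahoo!'s Distributed Data Platform (reposted)
6. LevelDB: A Fast and Lightweight Key-Value Storage Library (translated)
7. Theoretical Foundations of LevelDB
11. Sawzall: Principles and Applications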
12. Designs, Lessons and Advice from Building Large Distributed Systems --Jeff Dean
13. Challenges in Building Large-Scale Information Retrieval Systems --Jeff Dean
14. Experiences with MapReduce, an Abstraction for Large-Scale Computation --Jeff Dean
15. Taming Service Variability, Building Worldwide Systems, and Scaling Deep Learning --Jeff Dean
16. Large-Scale Data and Computation: Challenges and Opportunities --Jeff Dean
17. Achieving Rapid Response Times in Large Online Services --Jeff Dean
18. The Tail at Scale (translated) --Jeff Dean & Luiz André Barroso
19. How To Design A Good API and Why it Matters
20. Event-Based Systems: Architect's Dream or Developer's Nightmare?
VII. Other Supporting Systems
1. The Ganglia Distributed Monitoring System: Design, Implementation, and Experience
2. Chukwa: A large-scale monitoring system
3. Scribe: a way to aggregate data and why not, to directly fill the HDFS?
4. Benchmarking Cloud Serving Systems with YCSB
5. A Brief Overview of Dynamo, Dremel, ZooKeeper, and Hive
VIII. Hadoop-Related
1. The Hadoop Distributed File System (translated)
2. HDFS Scalability: The Limits to Growth (translated)
3. Name-node memory size estimates and optimization proposal.
5. HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs
6. HFile V2
7. Hive - A Warehousing Solution Over a Map-Reduce Framework
8. Hive – A Petabyte Scale Data Warehouse Using Hadoop
10. ZooKeeper: Wait-free coordination for Internet-scale systems
11. The life and times of a zookeeper
13. Apache Hadoop Goes Realtime at Facebook (translated)
14. A Survey of Hadoop Platform Optimization
15. The Anatomy of Hadoop I/O Pipeline (translated)
16. Hadoop Fair Scheduler Guide
17. The Next Generation of Apache Hadoop MapReduce
IX. Miscellaneous
On Computable Numbers, with an Application to the Entscheidungsproblem --A. M. Turing 1936.5.28
First Draft of a Report on the EDVAC --John von Neumann 1945.6.30
Reflections on Trusting Trust --Ken Thompson
Who Needs an Architect?
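Go To Statement Considered Harmful --Edsger W. Dijkstra
No Silver Bullet: Essence and Accidents of Software Engineering --Frederick P. Brooks
When reposting, please credit the author: phylips@bmy 2011-4-30
Source: http://duanple.blog.163.com/blog/static/709717672011330101333271/
One more related article worth recommending: http://blog.nosqlfan.com/html/1647.html
Most of the papers listed there are the same as the ones here, though each list also has some entries of its own.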