Distributed Systems Reading List

或许对了

Tags: distributed systems

Original link: http://blog.jobbole.com/84575/

Introduction

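I often argue that the hardest part of studying distributed systems is changing the way you think. Below are some readings I have found genuinely useful for provoking that change.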

Thought Provokers

Essays that make you reconsider how you design. Not everything can be solved with big servers, databases, and transactions.

Harvest, Yield and Scalable Tolerant Systems – Brewer et al. on applying the CAP principle in the real world
On Designing and Deploying Internet Scale Services – James Hamilton
Latency Exists, Cope! – notes on dealing with latency and its architectural implications
Latency – the new web performance bottleneck – a bit dated, but still worth a look
The Perils of Good Abstractions – building the perfect API/interface is hard
Chaotic Perspectives – large-scale systems are everything developers dislike: unpredictable, unordered, and parallel
Website Architecture – a collection of scalable-architecture articles from various large websites
Data on the Outside versus Data on the Inside – Pat Helland
Memories, Guesses and Apologies – Pat Helland
SOA and Newton’s Universe – Pat Helland
Building on Quicksand – Pat Helland
Why Distributed Computing – Jim Waldo
A Note on Distributed Computing – Waldo, Wollrath et al.
Stevey’s Google Platforms Rant – Yegge's experience with SOA platforms

Amazon

Some of this is about the technology, but more interesting is the culture and organization they built to go with it.

A Conversation with Werner Vogels – an interview about Amazon's move to a service-based architecture
Discipline and Focus – another interview about Amazon's move to a service-based architecture
Vogels on Scalability
SOA creates order out of chaos @ Amazon

Google

The current "rocket science" of distributed systems.

MapReduce
Chubby Lock Manager
Google File System
BigTable
Data Management for Internet-Scale Single-Sign-On
Dremel: Interactive Analysis of Web-Scale Datasets
Large-scale Incremental Processing Using Distributed Transactions and Notifications
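Megastore: Providing Scalable, Highly Available Storage for Interactive Services – a clever design for cross-datacenter, low-latency Paxos.
Spanner – Google's scalable, multi-version, globally distributed, and synchronously replicated database.
Photon – fault-tolerant and scalable joining of continuous data streams. Joins are hard, especially with clock skew, high availability requirements, and distribution.
Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing – a data warehousing system that stores critical measurement data for Google's Internet advertising business.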

eBay

Interestingly, they abandoned most of J2EE and rely heavily on database partitioning. Also take a look at their site upgrade tooling.

Consistency Models

The key to building systems that suit their environment is finding the right trade-off between consistency and availability.

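CAP Conjecture – consistency, availability, and partition tolerance cannot all be satisfied at the same time
Consistency, Availability, and Convergence – proves an upper bound on the consistency achievable in a typical system.
CAP Twelve Years Later: How the “Rules” Have Changed – Eric Brewer expands on the original trade-off description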
Consistency and Availability – Vogels
Eventual Consistency – Vogels
Avoiding Two-Phase Commit – approaches to avoiding two-phase commit
2PC or not 2PC, Wherefore Art Thou XA – two-phase commit is not a silver bullet
Life Beyond Distributed Transactions – Helland
If you have too much data, then ‘good enough’ is good enough – NoSQL, the future of data theory – Pat Helland
Starbucks doesn’t do two phase commit – asynchronous mechanisms at work
You Can’t Sacrifice Partition Tolerance – additional commentary on CAP
Optimistic Replication – weak-consistency approaches to data replication

Theory

Papers that describe various important elements of distributed systems design.

Distributed Computing Economics – Jim Gray
Rules of Thumb in Data Engineering – Jim Gray and Prashant Shenoy
Fallacies of Distributed Computing – Peter Deutsch
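Impossibility of distributed consensus with one faulty process – also known as FLP [access requires an account or payment; a free version is available here]
Unreliable Failure Detectors for Reliable Distributed Systems – one approach to dealing with the FLP problem
Lamport Clocks – how do you establish a global view of time when every computer's clock is independent?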
The Byzantine Generals Problem
Lazy Replication: Exploiting the Semantics of Distributed Services
Scalable Agreement – Towards Ordering as a Service
Scalable Eventually Consistent Counters over Unreliable Networks – scalable counting is hard in an unreliable world.
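
For readers who want a feel for the Lamport Clocks entry in the list above before reading the paper, here is a minimal sketch of a logical clock. It assumes each process keeps a single integer counter and that every message carries the sender's counter value. The class and method names are illustrative only and do not come from the paper.

```python
class LamportClock:
    """Logical clock: a counter that orders events without wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned value travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Move past both the local counter and the sender's timestamp,
        # so the receive event is ordered after the send event.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()            # local event on a
stamp = a.send()    # a sends a message stamped with its counter
b.receive(stamp)    # b's counter jumps ahead of the stamp
print(a.time, b.time)
```

The point of the design is that a receive is always stamped later than the matching send, even though neither process ever consults a physical clock.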

Languages and Tools

Issues that arise when building distributed systems with specific technologies.

Programming Distributed Erlang Applications: Pitfalls and Recipes – building reliable distributed applications is not as simple as merely choosing Erlang and OTP.

Infrastructure

Principles of Robust Timing over the Internet – managing clocks matters even for something as basic as debugging.

Storage

Consistent Hashing and Random Trees
Amazon’s Dynamo Storage Service
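
As a rough orientation for the consistent hashing entry above, the sketch below shows the general idea of a hash ring with virtual nodes. It is an illustration under stated assumptions, not the construction from either paper: the `HashRing` class, the use of MD5, and the default of 100 virtual points per node are all choices made here for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used only to spread keys evenly; it is not a security choice.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node (assumed value)
        self._ring = []           # sorted list of (point, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # Find the first point at or after the key's hash, wrapping around.
        i = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("node-c")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

Removing a node only reassigns the keys that node owned; every other key keeps its placement, which is the property that makes the technique attractive for storage systems.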

Paxos Consensus

Understanding this algorithm is a challenge. I suggest reading “Paxos Made Simple” before the other papers, and then reading it again after you have finished them.

The Part-Time Parliament – Leslie Lamport
Paxos Made Simple – Leslie Lamport
Paxos Made Live – An Engineering Perspective – Chandra et al.
Revisiting the Paxos Algorithm – Lynch et al.
How to build a highly available system with consensus – Butler Lampson
Reconfiguring a State Machine – Lamport et al. – changing cluster membership

Implementing Fault-Tolerant Services Using the State Machine Approach: a Tutorial – Fred Schneider

Other Consensus Papers

Mencius: Building Efficient Replicated State Machines for WANs – a consensus algorithm for wide-area networks

Gossip Protocols (Epidemic Behaviours)

Epidemic Routing Bibliography
How robust are gossip-based communication protocols
Astrolabe: A Robust and Scalable Technology For Distributed Systems Monitoring, Management, and Data Mining
Epidemic Computing at Cornell
Fighting Fire With Fire: Using Randomized Gossip To Combat Stochastic Scalability Limits
Bi-Modal Multicast
ACM SIGOPS Operating Systems Review – Gossip-based computer networking
SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol

P2P

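Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Kademlia: A Peer-to-peer Information System Based on the XOR Metric
Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
PAST: A large-scale, persistent peer-to-peer storage utility – a storage system on top of Pastry
SCRIBE: A large-scale and decentralized application-level multicast infrastructure – wide-area messaging on top of Pastry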
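
The XOR metric mentioned in the Kademlia entry above can be illustrated in a few lines. This is a minimal sketch assuming small integer node IDs (4 bits here, purely so the arithmetic is easy to follow); the function names are hypothetical, and routing tables, lookups, and network messages are left out entirely.

```python
def xor_distance(a: int, b: int) -> int:
    # Distance between two IDs is their bitwise XOR, read as an integer.
    return a ^ b


def bucket_index(self_id: int, other_id: int) -> int:
    # Position of the highest bit in which the two IDs differ.
    # Assumes the IDs are different, so the distance is non-zero.
    return xor_distance(self_id, other_id).bit_length() - 1


def closest(target: int, node_ids, k: int = 3):
    # The k known nodes nearest to the target under the XOR metric.
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


nodes = [0b0001, 0b0100, 0b0110, 0b1011, 0b1110]
nearest = closest(0b0111, nodes, k=2)
print([format(n, "04b") for n in nearest])
```

Because XOR is symmetric, two nodes always agree on how far apart they are, and IDs that share a long common prefix end up close together.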
