1. 分布式数据库领域CAP理论
Consistency(一致性), 数据一致更新,所有数据变动都是同步的
Availability(可用性), 好的响应性能
Partition tolerance(分区容错性) 可靠性,A single piece of data is stored in 3 nodes, 1 node failed, the other 2 nodes can still work. This is implemented via Replication or Duplication. 也就是没有单点失败
定理:任何分布式系统只可同时满足二点,没法三者兼顾。
忠告:架构师不要将精力浪费在如何设计能满足三者的完美分布式系统,而是应该进行取舍。
- 对于应用服务器,并行意味着多台相同的应用服务器cluster,通常在cluster前端配置有load balance,这个cluster在eBay中叫pool
- 对于数据库服务器,并行意味着热备份的多台数据库服务器(Replication),一般至少有两台(master,failover server)
You cannot, however, choose both consistency and availability in a distributed system.
As a thought experiment, imagine a distributed system which keeps track of a single piece of data using three nodes—A
, B
, and C
—and which claims to be both consistent and available in the face of network partitions. Misfortune strikes, and that system is partitioned into two components: {A,B}
and {C}
. In this state, a write request arrives at node C
to update the single piece of data.
That node only has two options:
- Accept the write, knowing that neither
A
norB
will know about this new data until the partition heals. - Refuse the write, knowing that the client might not be able to contact
A
orB
until the partition heals.
You either choose availability (Door #1) or you choose consistency (Door #2). You cannot choose both.
Refrence: http://codahale.com/you-cant-sacrifice-partition-tolerance/4.分布式数据库的优缺点
优点:
- 提高系统的可靠性、可用性当某一场地出现故障时,系统可以对另一场地上的相同副本进行操作,不会因一处故障而造成整个系统的瘫痪。
- 提高系统性能系统可以根据距离选择离用户最近的数据副本进行操作,减少通信代价,改善整个系统的性能。
- 易于扩展,如果服务器软件支持透明的水平扩展,那么就可以增加多个服务器来进一步分布数据和分担处理任务。(关于水平扩展可以参考http://xuezhongfeicn.blog.163.com/blog/static/22460141201201153456711/, eBay的数据库存储就是水平扩展的)