Distributed Systems Principles: Notes, Part 1

CAP Theorem


  • Consistency: all nodes see the same data at the same time

A consistent service operates fully or not at all.

  • Availability: a guarantee that every request receives a response about whether it succeeded or failed
  • Partition Tolerance: the system continues to operate despite arbitrary partitioning due to network failures
    No set of failures less than total network failure is allowed to cause the system to respond incorrectly.

  • C, A, and P cannot all be achieved at the same time

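Problems faced in Dynamo's design and their solutions

Excerpted from Yang Chuanhui (杨传辉), 《大规模分布式存储系统》 (Large-Scale Distributed Storage Systems)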

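| Problem | Technique adopted |
| --- | --- |
| Data distribution | Improved consistent hashing (virtual nodes) |
| Replication protocol | Replicated-write protocol (tunable NWR parameters) |
| Data conflict handling | Vector clocks |
| Temporary failure handling | Hinted handoff |
| Recovery after permanent failure | Merkle hash tree |
| Membership and failure detection | Gossip-based membership and failure detection protocol |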

DHT

(To be added once these notes are organized.)

NWR strategy (Quorum protocol)

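NWR is a strategy used in distributed storage systems to control the consistency level.
* N: the number of Replicas of the same piece of data;
* W: the number of Replicas that must be updated successfully when a data object is updated;
* R: the number of Replicas that must be read when a piece of data is read
* W+R>N : guarantees that a piece of data cannot be read and written by two different transactions at the same time, because every read set overlaps every write set
* W>N/2 : guarantees that two transactions cannot write the same piece of data concurrently, because any two write sets overlap

In a distributed system, a single point of data must not exist. If that one Replica fails, the data may be damaged permanently. If N is set to 2, a single storage node failure already leaves a single point, so N>2 (that is, N is at least 3).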
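The conditions above can be checked mechanically. The following is a minimal Python sketch, not taken from the source: the function name `check_nwr` and the keys of the returned dictionary are hypothetical and chosen only for illustration.

```python
def check_nwr(n: int, w: int, r: int) -> dict:
    """Evaluate the NWR conditions listed above for one (N, W, R) choice.

    Hypothetical helper: names and return shape are assumptions.
    """
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must each be between 1 and N")
    return {
        # W + R > N: every read set shares at least one Replica with every write set.
        "read_write_overlap": w + r > n,
        # W > N/2: any two write sets share at least one Replica.
        # Written as 2*W > N to stay in integer arithmetic.
        "write_write_overlap": 2 * w > n,
        # N > 2: one failed node does not leave a single remaining copy.
        "no_single_point": n > 2,
    }


strict = check_nwr(3, 2, 2)   # a common choice: N=3, W=2, R=2
relaxed = check_nwr(3, 1, 1)  # fast, but reads may miss the latest write
print(strict)
print(relaxed)
```

With N=3, W=2, R=2 all three conditions hold. With N=3, W=1, R=1 the two overlap conditions fail, which trades consistency for lower latency.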

The following is compiled from Carnegie Mellon University (CMU) course slides.

Vector Clock

Lamport’s Logical Clock

  • happened-before relation

    • if a and b are events in the same process, and a occurs before b, then a->b is true
    • if a is an event of message m being sent by a process, and b is the event of m being received by another process, then a->b
  • happened-before relation is transitive

    if a->b and b->c, then a->c

  • property of logical clock

    • if two events a and b occur within the same process and a->b, then the logical time values C(a) and C(b) assigned to them satisfy C(a) < C(b)
    • the clock time C must always go forward, and never backward
  • Lamport’s clock algorithm

    • when a message is being sent: each message carries a timestamp according to the sender’s logical clock
    • when a message is received: if the receiver’s logical clock is less than the sending time carried in the message, then adjust the receiver’s clock such that currentTime = timestamp + 1
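The send and receive rules above can be sketched in a few lines of Python. This is an illustration under assumptions, not the slides' code: the class `LamportClock` and its method names are hypothetical, and the receive step uses the common `max(local, timestamp) + 1` form, which also counts the receive itself as an event even when the local clock is already ahead.

```python
class LamportClock:
    """Logical clock of a single process (hypothetical helper class)."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """A local event: the clock only ever moves forward."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value is the message timestamp."""
        return self.tick()

    def receive(self, timestamp: int) -> int:
        """Move past the message timestamp if the local clock is behind it."""
        self.time = max(self.time, timestamp) + 1
        return self.time


sender, receiver = LamportClock(), LamportClock()
for _ in range(3):
    sender.tick()                 # three local events at the sender
ts = sender.send()                # the message carries the sender's clock
received_at = receiver.receive(ts)
print(ts, received_at)
```

The receive event always gets a larger value than the send event, which is exactly the C(a) < C(b) property for a message's send a and receive b.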

Vector clock

Lamport’s clock cannot determine the ordering of two arbitrary events from their time values alone

definition


  • vector clocks were proposed to overcome the limitation of Lamport’s clock (i.e., C(a) < C(b) doesn’t mean that a->b)
  • a vector clock for a system of N processes is an array of N integers

  • every process Pi stores its own vector clock VCi
  • Lamport’s time values for events are stored in VCi; VCi(a) is the vector assigned to an event a
  • VCi(a) < VCi(b) ==> a->b, where < means that every entry of VCi(a) is less than or equal to the matching entry of VCi(b) and at least one entry is strictly less
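The comparison of two vector timestamps can be sketched as follows. This is a minimal Python illustration; the function names `vc_less` and `relation` and the returned strings are hypothetical, and `<` is taken to be the usual entry-wise partial order on vectors.

```python
from typing import Sequence


def vc_less(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if a < b: every entry of a is <= the entry of b, and a != b."""
    if len(a) != len(b):
        raise ValueError("vector clocks must have the same length")
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def relation(a: Sequence[int], b: Sequence[int]) -> str:
    """Classify two event timestamps (hypothetical helper)."""
    if vc_less(a, b):
        return "a->b"
    if vc_less(b, a):
        return "b->a"
    if list(a) == list(b):
        return "equal"
    return "concurrent"  # neither vector dominates the other


print(relation([1, 0, 0], [2, 1, 0]))
print(relation([1, 0, 0], [0, 1, 0]))
```

Two timestamps where neither dominates the other describe concurrent events, a case that a single Lamport value cannot distinguish from an ordered pair.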

update algorithm


  • whenever there is a new event at Pi, increment VCi[i]
  • when a process Pi sends a message m to Pj:

  1. increment VCi[i]
  2. set m’s timestamp ts(m) to the vector VCi

  • when message m is received by process Pj:

  1. for each k in ts(m): VCj[k] = max(VCj[k], ts(m)[k])
  2. increment VCj[j]
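The update rules above can be sketched as a small Python class. The class name `VectorClockProcess` and its method names are hypothetical, introduced only for this illustration.

```python
from typing import List


class VectorClockProcess:
    """Process Pi holding its own vector clock VCi (hypothetical helper)."""

    def __init__(self, index: int, n: int) -> None:
        self.i = index        # this process's position in the vector
        self.vc = [0] * n     # one entry per process in the system

    def local_event(self) -> None:
        """New event at Pi: increment VCi[i]."""
        self.vc[self.i] += 1

    def send(self) -> List[int]:
        """Increment VCi[i], then return a copy of VCi as ts(m)."""
        self.vc[self.i] += 1
        return list(self.vc)

    def receive(self, ts: List[int]) -> None:
        """Merge ts(m) entry by entry with max, then increment VCj[j]."""
        self.vc = [max(mine, theirs) for mine, theirs in zip(self.vc, ts)]
        self.vc[self.i] += 1


p0 = VectorClockProcess(0, 3)
p1 = VectorClockProcess(1, 3)
p0.local_event()      # p0.vc is now [1, 0, 0]
ts = p0.send()        # ts is [2, 0, 0]
p1.receive(ts)        # p1.vc becomes [2, 1, 0]
print(ts, p1.vc)
```

Returning a copy from `send` matters: the timestamp must not change when the sender's clock advances later.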

causal communication

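to enforce causally-ordered multicasting, the delivery of a message m sent from Pi to Pj can be delayed until the following two conditions are met:
* ts(m)[i] = VCj[i] + 1 (m is the next message Pj expects from Pi)
* ts(m)[k] <= VCj[k] for every k in ts(m) with k != i (Pj has already seen everything Pi had seen when it sent m)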
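The two delivery conditions can be written as a single predicate. This is a minimal Python sketch; the function name `can_deliver` and its parameter names are hypothetical.

```python
from typing import Sequence


def can_deliver(ts: Sequence[int], vc_j: Sequence[int], i: int) -> bool:
    """Check whether message m (timestamp ts, sent by Pi) may be delivered
    at a process whose current vector clock is vc_j."""
    # Condition 1: m is the next message expected from the sender Pi.
    next_from_sender = ts[i] == vc_j[i] + 1
    # Condition 2: every message the sender had seen from other
    # processes has already been seen here as well.
    dependencies_seen = all(
        ts[k] <= vc_j[k] for k in range(len(ts)) if k != i
    )
    return next_from_sender and dependencies_seen


vc_j = [0, 0, 0]
first = [1, 0, 0]    # sent by P0
reply = [1, 1, 0]    # sent by P1 after it had seen `first`
print(can_deliver(reply, vc_j, i=1))       # reply depends on `first`
print(can_deliver(first, vc_j, i=0))
print(can_deliver(reply, [1, 0, 0], i=1))  # after `first` was delivered
```

A message that fails the check is held in a buffer and tested again each time another message is delivered and the local vector changes.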

Merkle tree

A Merkle tree is a tree in which every non-leaf node is labelled with the hash of the labels of its child nodes, and every leaf is labelled with the hash of its value.
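The definition can be illustrated by computing the root label of a binary Merkle tree. This is a sketch under assumptions that the notes do not state: SHA-256 as the hash, a binary tree, and duplicating the last label when a level has an odd number of nodes. The names `sha256` and `merkle_root` are hypothetical helpers.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(values: List[bytes]) -> bytes:
    """Return the root label of a binary Merkle tree built over `values`."""
    if not values:
        raise ValueError("need at least one value")
    level = [sha256(v) for v in values]      # leaf labels: hash of each value
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])          # assumed convention for odd levels
        # Each parent is labelled with the hash of its two children's labels.
        level = [
            sha256(level[k] + level[k + 1])
            for k in range(0, len(level), 2)
        ]
    return level[0]


replica_a = [b"k1=v1", b"k2=v2", b"k3=v3"]
replica_b = [b"k1=v1", b"k2=XX", b"k3=v3"]
print(merkle_root(replica_a) == merkle_root(replica_b))
```

Because any changed value changes the root, two replicas can compare a single hash first and descend into subtrees only when the roots differ, which is how such a tree can help with recovery after a permanent failure.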
(More to be added once these notes are organized.)
