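Paper 1
Time, Clocks, and the Ordering of Events in a Distributed System
In short, each computer in a distributed environment keeps its own time. As mentioned earlier, independently kept clocks drift apart, so physical time cannot be used
to order the activities of processes running on different computers. For example, when processes on two computers access and modify two databases at the same time,
their own timestamps cannot decide which process accessed first.
Hence the concept of a logical clock. See the earlier Task 3.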
1> happened before
Lamport defined a “happened before” relation (->) to capture the causal dependencies between events.
A -> B, if A and B are events in the same process and A occurred before B.
A -> B, if A is the event of sending a message m in a process and B is the event of the receipt of the same message m by another process.
If A->B, and B -> C, then A -> C (happened-before relation is transitive).
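The three rules above can be sketched as a small transitive-closure computation. This is only an illustration: the function name `happened_before`, the event labels, and the input shapes are assumptions made for the example, not something from the paper.

```python
from itertools import product


def happened_before(process_events, messages):
    """Return the set of (a, b) pairs such that a -> b.

    process_events: dict mapping a process name to its events in local order.
    messages: iterable of (send_event, receive_event) pairs.
    Both shapes are hypothetical, chosen only for this sketch.
    """
    edges = set()
    # Rule 1: events in the same process are ordered by local occurrence.
    for events in process_events.values():
        for earlier, later in zip(events, events[1:]):
            edges.add((earlier, later))
    # Rule 2: sending a message happens before receiving it.
    edges.update(messages)
    # Rule 3: transitivity, computed Warshall-style (k is the outermost loop).
    nodes = {e for pair in edges for e in pair}
    closure = set(edges)
    for k, i, j in product(nodes, repeat=3):
        if (i, k) in closure and (k, j) in closure:
            closure.add((i, j))
    return closure


hb = happened_before(
    {"P1": ["a1", "a2"], "P2": ["b1", "b2"]},
    [("a2", "b2")],
)
print(("a1", "b2") in hb)  # a1 -> a2 -> b2, so a1 -> b2
print(("a1", "b1") in hb or ("b1", "a1") in hb)  # neither holds: concurrent
```

Two events that appear in neither direction of the returned set are concurrent, which is exactly why the relation is only a partial order.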
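2> total order
Building on the partial order defined above, for an event a in process Pi and an event b in process Pj:
a => b if Ci(a) < Cj(b), or Ci(a) == Cj(b) and Pi < Pj
This total order makes any two events comparable, even when they occur in different processes, which makes a distributed algorithm for consistent access possible.
Example: accessing a critical section (a shared resource) in a distributed environment.
Approach 1: pick one machine or process as a master that answers all requests, which serializes access.
Approach 2: use a token. Arrange the processes that need access into a logical ring; the process holding the token accesses the resource and passes the token to the next process when done.
The paper instead builds a distributed algorithm on the total order of logical time defined above to guarantee serialized access.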
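A minimal sketch of how the total order picks the next process to enter the critical section. It covers only the ordering step; the request, acknowledgement, and release messages of the full algorithm are left out, and the names `Request`, `total_order_key`, and `next_to_enter` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    clock: int  # Lamport timestamp of the request event
    pid: int    # process id, used only to break timestamp ties


def total_order_key(req):
    # (clock, pid) tuples compare exactly like the => relation:
    # smaller clock first, and on equal clocks the smaller pid first.
    return (req.clock, req.pid)


def next_to_enter(pending):
    """Return the request that is first in the total order."""
    return min(pending, key=total_order_key)


pending = [Request(clock=5, pid=2), Request(clock=3, pid=7), Request(clock=3, pid=1)]
print(next_to_enter(pending))
```

Because the key is deterministic, every process that sees the same set of pending requests picks the same winner, without a master or a token.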
========
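Paper 2
Virtual Time and Global States of Distributed Systems
Lamport's approach uses logical time + process id to turn the partial order of time into a total order, so that events can be compared and ordered globally.
The problem is that the linear sequence produced by a total order sometimes does not reflect the nature of distributed events well; in particular, concurrent events are "masked" by the process-id tie-break.
The paper proposes vector time, built on logical time, to describe the relative order of events in a distributed system.
It first describes the limitation of Lamport's virtual time, namely
a => b, if C(a) < C(b) or
C(a) = C(b) && Pa < Pb (a's process id is smaller than b's; an arbitrary convention)
This loses the information about which events are concurrent: there is no way to tell that a || b.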
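Event model: the paper mentions the actor model. The actor model has been popular recently (I have only read a few introductions) and is often mentioned together with functional programming; representative languages are Erlang and Scala. An open-source example of the actor model is Akka for Scala, aimed at near-real-time, highly concurrent workloads. (TBD)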
===========================
The heart of virtual time: (1) events occurring at a particular process are totally ordered by their local sequence of occurrence; (2) each receive event has a corresponding send event. This gives the causality relation e < e', which holds if
(a) e and e' are events in the same process and e precedes e', or
(b) e is the sending event of a message and e' the corresponding receive event, or
(c) an event e'' exists such that e < e'' and e'' < e'.
===========================
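Consistent cut: (1) a cut C of an event set E is a finite subset C of E; (2) C is consistent if, whenever e belongs to C and e' < e, e' also belongs to C. The snapshot algorithm of Chandy and Lamport records such a cut, which serves as an approximation of the global state.
Properties of real time:
(1) transitivity: e < e' if e < e'' and e'' < e'
(2) irreflexivity: e < e never holds (with transitivity, e < e' and e' < e cannot both hold)
(3) linearity: for all x, y with x != y, either x < y or y < x
(4) eternity: for all x there exist y and z with y < x and x < z
(5) density: for all x, y with x < y there exists z with x < z < y
The fifth property, density, is only approximated in practice: digital watches and computer clocks advance in discrete jumps of their smallest unit.
===========================
Representing virtual time:
(1) When an internal event occurs in process Pi, or Pi sends a message, Pi advances its clock: Ci := Ci + d (d > 0, normally d = 1)
(2) Every message carries the sender's current timestamp, e.g. Ci
(3) When Pj receives a message with timestamp t, it adjusts its local clock: Cj := max(Cj, t) + d (d > 0, normally d = 1)
If events e and e' are concurrent, written e || e', then neither e < e' nor e' < e holds.
The problem is that Ci cannot express e || e': e < e' implies C(e) < C(e'), but C(e) < C(e') does not imply e < e'. Lamport sidesteps this by assuming an ordering on the processes.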
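The clock rules can be sketched as a small class, assuming d = 1 by default and that the receive rule updates the receiver's own clock. The class and method names are chosen for this example only.

```python
class LamportClock:
    """Scalar logical clock for one process (illustrative sketch)."""

    def __init__(self, d=1):
        self.time = 0
        self.d = d  # increment, d > 0

    def tick(self):
        # Rule 1: advance the clock on an internal event.
        self.time += self.d
        return self.time

    def send(self):
        # Rules 1 and 2: a send is an event, and its timestamp travels
        # with the message.
        return self.tick()

    def receive(self, t):
        # Rule 3: jump past both the local clock and the message timestamp.
        self.time = max(self.time, t) + self.d
        return self.time


p1, p2 = LamportClock(), LamportClock()
a = p1.tick()      # event a in P1, C(a) = 1
p2.tick()          # event in P2, timestamp 1
c = p2.tick()      # event c in P2, C(c) = 2
t = p1.send()      # P1 sends with timestamp 2
r = p2.receive(t)  # P2 receives: max(2, 2) + 1 = 3
print(a, c, t, r)  # 1 2 2 3
```

Here C(a) < C(c) even though a and c are concurrent (no message links them), which is the limitation described above: the timestamps alone cannot reveal e || e'.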
===========================
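The author proposes vector time to solve the problem of describing e || e'. The time of N processes is represented by an N-dimensional vector, and process Pi is responsible for incrementing its own component:
Ci[i] := Ci[i] + 1
As before, every message carries the sender's timestamp t. When process Pi receives a message with timestamp t, it ticks its own component and adjusts the rest of its local time:
Ci[i] := Ci[i] + 1
Ci[j] := max(Ci[j], t[j]) for all j not equal to i
From these rules, vector time has the following properties, for time vectors u and v:
u <= v if u[k] <= v[k] for all k in 1..N
u < v if u <= v and u is not equal to v
u || v if neither u < v nor v < u holds
So vector time expresses both the original partial order of time and the concurrency of events in different processes, i.e. for events e and e' with time vectors C(e) and C(e'):
e < e' if and only if C(e) < C(e')
e || e' if and only if C(e) || C(e')
By analogy with Minkowski's space-time, the world is described by N space dimensions + 1 time dimension; physical space-time, for example, is 3 space dimensions + 1 time dimension. Vector time can therefore be viewed as 1 space dimension (the processes) + 1 time dimension.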
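A minimal vector-clock sketch, assuming the receiver ticks its own component and takes the component-wise maximum for the others. The class and the helper names `leq`, `less`, and `concurrent` are assumptions made for this example.

```python
class VectorClock:
    """Vector time for process i out of n processes (illustrative sketch)."""

    def __init__(self, n, i):
        self.v = [0] * n
        self.i = i

    def tick(self):
        # Only the owning process increments component i.
        self.v[self.i] += 1
        return tuple(self.v)

    def send(self):
        # The returned vector is the timestamp carried by the message.
        return self.tick()

    def receive(self, t):
        # Component-wise maximum with the message timestamp, then tick.
        self.v = [max(a, b) for a, b in zip(self.v, t)]
        self.v[self.i] += 1
        return tuple(self.v)


def leq(u, v):
    return all(a <= b for a, b in zip(u, v))


def less(u, v):
    return leq(u, v) and u != v


def concurrent(u, v):
    # Meant for timestamps of two distinct events.
    return not less(u, v) and not less(v, u)


p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
a = p0.tick()      # (1, 0)
b = p1.tick()      # (0, 1)
m = p0.send()      # (2, 0)
r = p1.receive(m)  # (2, 2)
print(concurrent(a, b), less(a, r), less(b, r))  # True True True
```

Unlike the scalar clock, comparing a and b here shows that they are concurrent, while both are correctly ordered before the receive event r.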
===========================
Applications of vector time
(1) Debugging distributed systems: comparing vectors determines whether events e and e' are causally related
(2) Performance analysis of distributed systems: checking whether e || e' holds for two events e and e' shows how much concurrency the system has
(3) D.S. Parker, Detection of Mutual Inconsistency in Distributed Systems: uses vectors to detect whether independently modified copies of a file have a version conflict.
https://www.cs.purdue.edu/homes/bb/cs542-06Spr/week11_lecture2.ppt (TBD)
(4) Designing and verifying distributed algorithms and protocols
(5) non-FIFO snapshot algorithm
a> Pi "ticks" and then fixes its next time s = Ci + (0, ..., 0, 1, 0, ..., 0), where the 1 is at position i
b> Pi broadcasts s to all other processes
c> Pi waits until it knows that every process knows s
d> Pi "ticks" again, takes a local snapshot, and broadcasts a dummy message to all processes. This forces all processes to advance their clocks to a value >= s
e> Each process takes a local snapshot and sends it to Pi when its local clock becomes equal to s or jumps from some value smaller than s to a value larger than s
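Application (3) can be illustrated with a short comparison of per-replica version vectors. This is a simplified sketch of the idea rather than the algorithm from Parker's paper; the function names and the returned labels are made up for the example.

```python
def dominates(u, v):
    """True if version vector u has seen every update that v has seen."""
    return all(a >= b for a, b in zip(u, v))


def compare_replicas(u, v):
    """Classify two copies of a file by their version vectors."""
    if u == v:
        return "identical"
    if dominates(u, v):
        return "first is newer"
    if dominates(v, u):
        return "second is newer"
    # Neither vector dominates: the copies were modified independently.
    return "conflict"


print(compare_replicas((2, 1, 0), (1, 1, 0)))  # first is newer
print(compare_replicas((2, 0, 0), (1, 1, 0)))  # conflict
```

The "conflict" case is the same situation as u || v for vector time: neither copy's history contains the other's.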
Other references
Reposted from: https://blog.51cto.com/usdaydayup/1392060