Vector Clocks, 时间向量

Why Vector Clock are Easy or Hard?

Amazon's Dynamo, 4.4版本化的数据

 

常用的版本化技术

Timestamps seem to be an obvious solution for developing a chronological order. However, timestamps “rely on synchronized clocks and don’t capture causality” as Lipcon points out

Optimistic Locking implies that a unique counter or clock value is saved for each piece of data. When a client tries to update a dataset it has to provide the counter/clock-value of the revision it likes to update (cf. [K+10b]). As a downside of this procedure, the Project Voldemort development team notes that it does not work well in a distributed and dynamic scenario where servers show up and go away often and without prior notice.

悲观锁和乐观锁
http://www.cnblogs.com/guyufei/archive/2011/01/10/1931632.html
悲观锁:假定会发生并发冲突(悲观), 屏蔽一切可能违反数据完整性的操作
虽然简单安全, 但是当一个用户锁定记录的时候,会block所有其它用户

乐观锁: 假设不会发生并发冲突(乐观), 只在提交操作时检查是否违反数据完整性
          可以随意的并发读取和修改, 只在数据提交的时候通过数据版本判断是否存在冲突

两种锁并没有好坏之说, 在于使用的场景更符合哪种假设, 如果符合悲观假设, 那么使用乐观锁可能效率更低
当然两者都必须依赖master来保证锁机制, 本质上都是主从结构, 所以都会有单点问题
当然好处是两者都能保证全序

Vector Clocks are an alternative approach to capture order and allow reasoning between updates in a distributed system.
向量时钟也是一种乐观锁, 只是更通用, 因为不依赖于master, 基于因果关系的狭义相对论
所以他只能保证偏序
发生冲突时, 必须依靠client去做resolve以达成全序

Multiversion Storage means to store a timestamp for each table cell.
These timestamps “don’t necessarily need to correspond to real life”, but can also be some artificial values that can be brought into a definite order. For a given row multiple versions can exist concurrently. Besides the most recent version a reader may also request the “most recent before T” version. This provides “optimistic concurrency control with compare-and-swap on timestamps” and also allows to take snapshots of datasets (cf. [Lip09, slide 20])
其实多版本存储和乐观锁的区别在于, 提交的时候它允许冲突, 并不会试图解决冲突.
而是采用另外一种方法, 把所有版本都保存下来, 即保存版本之间的偏序关系
但实际上这并不真正解决问题, 只是把冲突解决的时机延迟, 当用户使用时, 自己选择使用什么版本, 即做冲突解决

 

Vector Clocks

Vector clocks is an algorithm for generating a partial ordering of events in a distributed system and detecting causality violations.

其实vector clocks就是Lamport的偏序理论的一个实际应用
具体的使用方法, 参考,
Why Vector Clock are Easy or Hard?

A vector clock is defined as a tuple V [0], V [1], ..., V [n] of clock values from each node (cf. [Lip09, slide18]).

In a distributed scenario node i maintains such a tuple of clock values, which represent the state of itself and the other (replica) nodes’ state as it is aware about at a given time (Vi[0] for the clock value of the first node, Vi[1] for the clock value of the second node, . . . Vi[i] for itself, . . . Vi[n] for the clock value of the last node).

时钟向量, 表示的含义由下面这个图很好的表示出来.

通过一个向量, 不但记录了当前节点上该数据的version, 还记录了其他节点该数据的版本情况(不一定是最新的)

 image

posted on 2013-04-13 11:39  fxjwind  阅读(1129)  评论(0编辑  收藏  举报