CAP – Consistency, Availability, Partition Tolerance - fxjwind


----------------------------------------------------------------------------------------------------------------------

http://blog.nahurst.com/visual-guide-to-nosql-systems

A very good slide deck: http://www.slideshare.net/jboner/scalability-availability-stability-patterns

Also bookmarking the main NoSQL hub: http://nosql-database.org/

For an authoritative explanation of CAP, see the following article:

http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

Chinese translation:

http://pt.alibaba-inc.com/wp/dev_related_728/brewers-cap-theorem.html

-----------------------------------------------------------------------------------------------------------------------------

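Relational Databases and ACID

Before discussing big data and CAP, let's first look at ACID in relational databases.

The defining feature of a relational database is transaction processing, that is, satisfying ACID as described below. The ACID acronym itself is actually somewhat confusing.

To quote: Jim Gray and I discussed these acronyms, and he readily admitted that ACID is also a stretch: A and D overlap considerably, and C is ill-defined at best.

So we can take the most important meaning of ACID to be Atomicity and Isolation, i.e., strong consistency: an operation is done completely or not at all, and all users see the same data.

This is also why relational databases have thrived for so many years: they emphasize the reliability, consistency, and availability of data.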

ACID, which stands for Atomicity, Consistency, Isolation, and Durability, has become the gold standard to define the highest level of transactional integrity in database systems. As the acronym suggests, it implies the following:

Atomicity: Either a transactional operation fully succeeds or completely fails. This is easy to understand. The simplest example is a bank: updating the account record and dispensing the cash must be atomic.
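The bank example can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the `accounts` table, the account names, and the `transfer` helper are assumptions made up for this sketch. The point is that the debit and the credit are either both committed or both rolled back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: the debit above is undone by the rollback.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

A transfer that would overdraw the source account returns False and leaves both balances exactly as they were, which is the "all or nothing" behaviour described above.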

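Consistency: Consistency implies that data is never persisted if it violates a predefined constraint or rule. For example, if a particular field states that it should hold only integer values, then a float value is not accepted or is rounded to the nearest integer and then saved.

This is not the same meaning the word has in NoSQL. Here it simply means that predefined rules and constraints must not be violated.

Isolation: Isolation becomes relevant where data is accessed concurrently.

Isolation is the transaction property that controls the degree to which data is isolated for use by one process and protected from interference by other processes. When multiple transactions run concurrently, the isolation level determines how dirty reads, non-repeatable reads, and phantom reads are handled.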

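read uncommitted | 0

Sets the isolation level of the query to 0.

Dirty reads are possible.

Dirty read: one transaction inserts, updates, or deletes data but has not yet committed (and may still roll back), while another transaction reads the uncommitted data.

read committed | 1

Sets the isolation level of the query to 1.

Prevents dirty reads, but non-repeatable reads and phantom reads can still occur.

Non-repeatable read: one transaction updates or deletes data, so two identical queries in another transaction return different results.

Phantom read: one transaction inserts data, so two identical queries in another transaction return different results.

repeatable read | 2

Sets the isolation level of the query to 2.

Prevents dirty reads and non-repeatable reads; phantom reads are still allowed.

serializable | 3

Sets the isolation level of the query to 3.

Serialized execution: transactions behave as if they ran strictly one after another, which prevents dirty reads, non-repeatable reads, and phantom reads.

It is slow (I once hit a case where it took 30 times as long as isolation level 1), so use it with caution.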
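The four levels above boil down to a small lookup table. The sketch below only encodes which anomalies each level still permits, as listed above; the names `Isolation`, `PREVENTED_FROM`, and `allowed_anomalies` are made up for illustration and do not correspond to any database API.

```python
from enum import IntEnum

class Isolation(IntEnum):
    READ_UNCOMMITTED = 0
    READ_COMMITTED = 1
    REPEATABLE_READ = 2
    SERIALIZABLE = 3

# For each anomaly, the lowest isolation level that prevents it.
PREVENTED_FROM = {
    "dirty read": Isolation.READ_COMMITTED,
    "non-repeatable read": Isolation.REPEATABLE_READ,
    "phantom read": Isolation.SERIALIZABLE,
}

def allowed_anomalies(level):
    """Return the anomalies that can still occur at the given isolation level."""
    return [name for name, first_safe in PREVENTED_FROM.items() if level < first_safe]

for level in Isolation:
    print(level.value, level.name, allowed_anomalies(level))
```

Each step up the scale removes exactly one anomaly, and each step generally costs more locking or validation work, which is consistent with the slowdown noted for level 3.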

Durability: Durability implies that once a transactional operation is confirmed, it is guaranteed to survive later failures.
An RDBMS often maintains a transaction log. A transaction is confirmed only after it's written to the transaction log. If a system fails between the confirmation and the data persistence, the transaction log is synchronized with the persistent store to bring it to a consistent state.
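The log-then-apply idea can be sketched in a few lines. This is a toy model, not how any particular RDBMS implements its log: `TinyStore` and its methods are hypothetical, and both the "log" and the "data files" are plain in-memory structures standing in for durable storage.

```python
class TinyStore:
    """Toy key-value store that confirms a write once it is in the log."""

    def __init__(self):
        self.log = []      # stands in for the durable transaction log
        self.data = {}     # stands in for the persistent data files
        self.applied = 0   # number of log records already applied to data

    def commit(self, key, value, crash_before_apply=False):
        # 1. The change is confirmed as soon as it is appended to the log.
        self.log.append((key, value))
        # 2. Simulate a failure between confirmation and data persistence.
        if crash_before_apply:
            return
        self.apply_pending()

    def apply_pending(self):
        """Apply every log record that has not reached the data files yet."""
        for key, value in self.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.log)

    def recover(self):
        """After a restart, replay the log to bring the data up to date."""
        self.apply_pending()
```

After a simulated crash the data lags behind the log, and calling `recover()` replays the missing records so that every confirmed write is visible again.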

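Big Data and the CAP Theorem

As data volumes keep growing, distributed data storage has become the inevitable choice. In a distributed environment, can we still build a data store that offers both strong consistency and high availability?

This brings us to Brewer's (CAP) theorem: http://code.alibabatech.com/blog/dev_related_728/brewers-cap-theorem.html

The theorem is regarded as a landmark result. It shows that in a distributed environment, where network partitions must be tolerated, there has to be a trade-off between strong consistency and high availability.

To quote the definitions of the three CAP properties: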

Consistency

A service that is consistent operates fully or not at all. Gilbert and Lynch use the word "atomic" instead of consistent in their proof, which makes more sense technically.

Availability

Availability means just that - the service is available (to operate fully or not as above).

Partition Tolerance

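Put plainly: in a distributed environment, can the system keep working correctly when some nodes crash or can no longer be reached over the network?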

Gilbert & Lynch defined partition tolerance as:

No set of failures less than total network failure is allowed to cause the system to respond incorrectly

Parallel processing and scaling out are proven methods and are being adopted as the model for scalability and higher performance as opposed to scaling up and building massive super computers.

Partition tolerance measures the ability of a system to continue to service in the event a few of its cluster members become unavailable.
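The trade-off described in this section can be made concrete with a toy two-node register. Everything here is an assumption made for illustration (the `Node` class, the `"CP"`/`"AP"` mode strings, and the `set_partition` helper); real systems use quorums and failure detectors rather than a boolean flag.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to stay consistent."""

class Node:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = None
        self.peer = None
        self.partitioned = False  # True while the link to the peer is down

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability: refuse rather than let replicas diverge.
                raise Unavailable("cannot reach peer; refusing write")
            # Give up consistency: accept locally, the peer is now stale.
            self.value = value
            return
        # Link is up: replicate synchronously so both copies agree.
        self.value = value
        self.peer.value = value

    def read(self):
        # In CP mode no write is accepted during a partition, so the
        # last replicated value is still the same on both nodes.
        return self.value

def make_pair(mode):
    """Create two nodes that replicate to each other."""
    a, b = Node(mode), Node(mode)
    a.peer, b.peer = b, a
    return a, b

def set_partition(a, b, down):
    """Cut (down=True) or heal (down=False) the link between the two nodes."""
    a.partitioned = b.partitioned = down
```

With the link cut, the "CP" pair rejects writes but both nodes keep returning the same value, while the "AP" pair keeps accepting writes and the two nodes return different values until they are reconciled.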

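CAP Choices

When dealing with CAP, you have a few options; see the figure below.

Giving Up Partition Tolerance

A traditional relational database keeps all of its data on a single node, so it certainly has no partition tolerance: if the node crashes, the service is unavailable.

In general, NoSQL solutions do not give up this property. A distributed system that becomes unavailable as soon as one partition crashes would be pointless; you might as well use a relational database.

So the only real choice left is between consistency and availability.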

Consistent, Available (CA) Systems have trouble with partitions and typically deal with it with replication. Examples of CA systems include:

Traditional RDBMSs like Postgres, MySQL, etc (relational)
Vertica (column-oriented)
Aster Data (relational)
Greenplum (relational)

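Giving Up Availability

Here we need to guarantee consistency and atomicity. While a piece of data is being operated on by someone else, you have to wait, and while you are waiting availability cannot be guaranteed.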

Consistent, Partition-Tolerant (CP) Systems have trouble with availability while keeping data consistent across partitioned nodes. Examples of CP systems include:

BigTable (column-oriented/tabular)
Hypertable (column-oriented/tabular)
HBase (column-oriented/tabular)
MongoDB (document-oriented)
Terrastore (document-oriented)
Redis (key-value)
Scalaris (key-value)
MemcacheDB (key-value)
Berkeley DB (key-value)

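Giving Up Consistency

For most Internet applications, strong consistency is not critically important; availability matters more. Eventual consistency simply means that data may be inconsistent for a short period, but given enough time without new updates, the replicas on all nodes will eventually converge to exactly the same state.

For the problem above, giving up strong consistency is enough to guarantee availability: for now, just make do with reading data that may be slightly stale...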

Available, Partition-Tolerant (AP) Systems achieve "eventual consistency" through replication and verification. Examples of AP systems include:

Dynamo (key-value)
Voldemort (key-value)
Tokyo Cabinet (key-value)
KAI (key-value)
Cassandra (column-oriented/tabular)
CouchDB (document-oriented)
SimpleDB (document-oriented)
Riak (document-oriented)
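Eventual consistency as described above can be sketched with two replicas that accept writes independently and later reconcile. This is a minimal last-write-wins model under stated assumptions: the `Replica` class and `anti_entropy` function are hypothetical, timestamps are logical numbers supplied by the caller, and they are assumed to be unique per key. Real AP systems often use richer mechanisms such as vector clocks.

```python
class Replica:
    """One copy of the data; each key maps to a (timestamp, value) pair."""

    def __init__(self):
        self.store = {}

    def put(self, key, value, ts):
        # Keep the incoming version only if it is newer than what we hold.
        current = self.store.get(key)
        if current is None or ts > current[0]:
            self.store[key] = (ts, value)

    def get(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]

def anti_entropy(x, y):
    """Exchange entries in both directions so each replica ends up
    holding the newest version of every key."""
    for src, dst in ((x, y), (y, x)):
        for key, (ts, value) in list(src.store.items()):
            dst.put(key, value, ts)

r1, r2 = Replica(), Replica()
r1.put("cart", "book", ts=1)   # written on replica 1
r2.put("cart", "pen", ts=2)    # a later write lands on replica 2
r1.put("user", "ann", ts=3)    # replica 2 has not seen this key yet
```

Before the exchange the two replicas disagree on "cart", and a reader of replica 1 sees the older value. After `anti_entropy(r1, r2)` both replicas hold identical data, with the newer write winning.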

BASE vs. ACID

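There is an architectural approach called BASE (Basically Available, Soft-state, Eventually consistent) that embraces eventual consistency. BASE (a base, in the chemical sense) is, as its name suggests, the opposite of ACID (an acid, in the chemical sense). For more on BASE, see BASE: An Acid Alternative, http://www.dbthink.com/?p=483

Once a database has been partitioned, sacrificing some consistency in exchange for availability can significantly improve the system's scalability.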

Horizontal data scaling can be performed along two vectors. Functional scaling involves grouping data by function and spreading functional groups across databases. Splitting data within functional areas across multiple databases, or sharding, adds the second dimension to horizontal scaling.
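The two dimensions can be combined in a single routing function: first pick the functional group, then pick a shard inside it. The sketch below is only an illustration. The group names, shard counts, and the `group_db_N` naming scheme are all assumptions, and simple modulo hashing is used for brevity even though it reshuffles most keys whenever the shard count changes.

```python
import hashlib

# Hypothetical layout: functional group -> number of shards in that group.
FUNCTIONAL_GROUPS = {"users": 4, "orders": 8}

def shard_for(group, key):
    """Return the name of the database that holds `key` within `group`."""
    shards = FUNCTIONAL_GROUPS[group]  # functional scaling: choose the group
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % shards  # sharding: choose the shard
    return f"{group}_db_{index}"

print(shard_for("users", "alice"))
print(shard_for("orders", "order-1001"))
```

A stable hash is used instead of Python's built-in `hash()` so that the same key maps to the same shard across processes and restarts.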