A top-down intro to NoSQL

Data storage

1. Unstructured storage

2. Structured storage

Unstructured storage

1. file

2. chunk

Structured storage

http://en.wikipedia.org/wiki/Structured_storage

1. SQL

2. NoSQL

SQL

http://en.wikipedia.org/wiki/SQL

Relational Database

1. SQL Server

2. Oracle

3. MySQL

NoSQL

http://en.wikipedia.org/wiki/Nosql

A movement promoting a loosely defined class of non-relational data stores.

Architecture

Properties

1. No fixed table schemas

2. Avoid join operations

3. Scale horizontally

Motivation

Data-intensive applications, such as:

- indexing a large number of documents

- serving pages on high-traffic websites

- delivering streaming media

Taxonomy

Document store

1. CouchDB

2. XML database

Graph

1. Neo4j

Key/value store on disk

1. BigTable

2. Memcachedb

Key/value cache in RAM

1. memcached

Eventually-consistent key-value store

1. Dynamo

2. Cassandra

Ordered key-value store

1. Memcachedb

Tabular

1. BigTable

2. HBase

Graph Database

http://en.wikipedia.org/wiki/Graph_database

A database that uses graph structures with nodes, edges and properties to represent and store information.

Properties

Faster for associative data sets

Maps more directly to the structure of object-oriented applications

Does not require expensive join operations
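The "no join" property can be illustrated with a minimal sketch. The node names, the edge labels and the `reachable` helper below are all made up for this example and do not come from Neo4j or any other product. Relationships are stored next to each node, so following them is a lookup per hop rather than a join across tables.

```python
from collections import deque

# Hypothetical toy graph: properties per node, plus (label, target) edges per node.
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
    "neo4j": {"kind": "database"},
}
edges = {
    "alice": [("knows", "bob")],
    "bob": [("knows", "carol"), ("uses", "neo4j")],
    "carol": [],
    "neo4j": [],
}

def reachable(start, label, max_hops):
    """Breadth-first walk: nodes reachable from `start` over `label` edges
    in at most `max_hops` hops, in the order they are discovered."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for edge_label, target in edges[node]:
            if edge_label == label and target not in seen:
                seen.add(target)
                found.append(target)
                queue.append((target, hops + 1))
    return found

print(reachable("alice", "knows", 2))
```

In a relational schema the same "friends of friends" question would typically need a self-join on a relationship table for every hop.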

Document-oriented database

http://en.wikipedia.org/wiki/Document-oriented_database

- Dynamic fields

- For example, here is a document:

  - FirstName="Bob", Address="5 Oak St.", Hobby="sailing".

- Another document could be:

  - FirstName="Jonathan", Address="15 Wanamassa Point Road", Children=("Michael,10", "Jennifer,8", "Samantha,5", "Elena,2").
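As a rough sketch, the two documents above can be written as plain Python dictionaries. The `find` helper is hypothetical and only shows that a query has to tolerate fields that some documents do not have.

```python
# Two documents in the same collection; each carries only the fields it needs.
documents = [
    {"FirstName": "Bob", "Address": "5 Oak St.", "Hobby": "sailing"},
    {
        "FirstName": "Jonathan",
        "Address": "15 Wanamassa Point Road",
        "Children": ["Michael,10", "Jennifer,8", "Samantha,5", "Elena,2"],
    },
]

def find(docs, field):
    """Return the documents that define `field`; the others are skipped."""
    return [doc for doc in docs if field in doc]

# Only Bob has a Hobby, only Jonathan has Children.
print([doc["FirstName"] for doc in find(documents, "Hobby")])
print([doc["FirstName"] for doc in find(documents, "Children")])
```

No schema change is needed to add a third document with yet another set of fields.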

CouchDB

http://en.wikipedia.org/wiki/Couchdb

- Features

  - Document storage

  - ACID semantics

  - Map/Reduce views and indexes

  - Distributed architecture with replication

  - REST API:

    - REST uses the HTTP methods POST, GET, PUT and DELETE for the four basic CRUD (Create, Read, Update, Delete) operations on all resources.

  - MVCC (Multi-Version Concurrency Control)

    - Neither reads nor writes lock the database

  - Server-side scripting: a pure JavaScript development environment
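The MVCC idea in the list above can be sketched with a small in-memory store. `TinyMVCCStore` and `ConflictError` are hypothetical names, and the integer revisions are a simplification; this is not CouchDB's API. Each write appends a new version instead of locking, and a writer that holds a stale revision is rejected.

```python
class ConflictError(Exception):
    """Raised when an update is based on a stale revision."""

class TinyMVCCStore:
    """Hypothetical in-memory store that keeps every version of a document."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of (rev, body), oldest first

    def get(self, doc_id):
        """Return (rev, copy of body) for the latest version; no lock is taken."""
        rev, body = self._versions[doc_id][-1]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        """Append a new version if `rev` matches the latest revision."""
        history = self._versions.get(doc_id, [])
        current_rev = history[-1][0] if history else None
        if rev != current_rev:
            raise ConflictError(f"expected rev {current_rev}, got {rev}")
        new_rev = 1 if current_rev is None else current_rev + 1
        self._versions[doc_id] = history + [(new_rev, dict(body))]
        return new_rev

store = TinyMVCCStore()
first = store.put("bob", {"Hobby": "sailing"})            # creates revision 1
second = store.put("bob", {"Hobby": "rowing"}, rev=first)  # creates revision 2
try:
    store.put("bob", {"Hobby": "chess"}, rev=first)       # stale revision
except ConflictError as err:
    print("conflict:", err)
print(store.get("bob"))
```

The rejected writer is expected to read the latest version again and retry, which is the usual trade made for not holding locks.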

Key-Value Database (currently the most widely used)

HBase vs Cassandra

http://wangxu.me/blog/?p=371

CAP

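- HBase: usually classified as CP (consistent and partition-tolerant); modeled on BigTable and GFS, with better MapReduce support

- Cassandra: AP (available and partition-tolerant); a later arrival, more flexible, based on Dynamo

HBase

- More modular; built from several separate components

- Harder to deploy, because several components have to be deployed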

Cassandra

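Cassandra was originally designed and developed at Facebook by Avinash Lakshman (one of the authors of Amazon's Dynamo) and Prashant Malik (a Facebook engineer). Facebook released it to the open-source community in 2008. In many respects Cassandra can be seen as Dynamo 2.0, or as a combination of Dynamo and BigTable. Cassandra runs in production at Facebook but is still under heavy development.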

http://www.ruohai.org/?p=17

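- Background

  - In September 2009 Digg announced its plan to move to Cassandra. After carefully comparing other projects (HBase, Hypertable, Tokyo Cabinet/Tyrant, Voldemort, and Dynomite), it finally chose Cassandra.

- Architecture

  - Cluster model: Dynamo

    - (decentralized, plain key-value access)

  - Data model: BigTable

- Key

  - Determines which nodes the data is placed on

  - Keyspace: separates the scopes of different applications; comparable to separate schemas.

  - One key corresponds to one row

- Value

  - Column: (name, value, timestamp)

  - SuperColumn: (name, sortedlist<Column>)

  - ColumnFamily: comparable to a table in an RDBMS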
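How a row key can decide which nodes hold the data is easiest to see with a Dynamo-style consistent-hashing ring. The sketch below is an assumption-laden illustration: the `Ring` class, the node names, the use of MD5 and the replica count are all chosen for the example and are not taken from Cassandra's code.

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    """Map a string to a position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    """Hypothetical consistent-hashing ring with one token per node."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self._ring = sorted((_hash(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    def nodes_for(self, key):
        """Walk clockwise from the key's position and collect `replicas` nodes."""
        start = bisect.bisect_right(self._tokens, _hash(key)) % len(self._ring)
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.nodes_for("ruohai"))  # the nodes that would store the row "ruohai"
```

Because only the key's position on the ring matters, adding or removing a node moves the rows next to that node and leaves the rest where they are, which is what makes horizontal scaling practical.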

- Example 01

  - Users ColumnFamily

  - Made up of Columns

Users: { // ColumnFamily
  ruohai: { // the user's nick serves as the row key
    {name: "nick", value: "ruohai", timestamp: 123456},
    {name: "email", value: "sucode@gmail.com", timestamp: 234567},
    {name: "website", value: "http://www.ruohai.org", timestamp: 345678},
    {name: "twitter", value: "sucode", timestamp: 456789},
    // other properties
  },
  user2: {
    // ...
  }
}

- Example 02

  - Favorites ColumnFamily

  - Made up of SuperColumns

Favorites: { // ColumnFamily
  ruohai: { // ruohai's favorites; row key
    lining: { // SuperColumn name; the tag the favorites are filed under
      {name: "123", value: "1", timestamp: 123},
      {name: "125", value: "7", timestamp: 125},
      {name: "139", value: "13", timestamp: 139}
    },
    nike: { // another tag
      {name: "223", value: "11", timestamp: 223},
      {name: "225", value: "9", timestamp: 225},
      {name: "239", value: "23", timestamp: 239}
    }
    // ... other tags
  },
  user2: {
    // user2's tagged favorites
  }
}
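The two examples can be mirrored in a few lines of Python to make the nesting concrete. The `Column` and `ColumnFamily` classes below are hypothetical helpers, not a Cassandra client API. Keeping the column with the newest timestamp is a simplified version of how the timestamp field is used to settle competing writes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int

class ColumnFamily:
    """Hypothetical model: row key -> column name -> Column."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def insert(self, row_key, column):
        """Store the column unless a newer one with the same name already exists."""
        row = self.rows.setdefault(row_key, {})
        existing = row.get(column.name)
        if existing is None or column.timestamp > existing.timestamp:
            row[column.name] = column

    def get(self, row_key, column_name):
        return self.rows[row_key][column_name].value

users = ColumnFamily("Users")
users.insert("ruohai", Column("email", "sucode@gmail.com", 234567))
users.insert("ruohai", Column("website", "http://www.ruohai.org", 345678))
users.insert("ruohai", Column("email", "old@example.com", 100))  # older write, ignored
print(users.get("ruohai", "email"))

# A SuperColumn adds one more level: row key -> super column name -> columns.
favorites = {
    "ruohai": {
        "lining": [Column("123", "1", 123), Column("125", "7", 125)],
        "nike": [Column("223", "11", 223)],
    }
}
print([c.name for c in favorites["ruohai"]["lining"]])
```

Rows in the same ColumnFamily are free to hold different columns, which is the "no fixed table schema" property seen earlier.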

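- Evaluation

  - Read performance is relatively poor

  - Write performance is relatively good

- Approach

  - "Distributed key-value storage system: getting started with Cassandra"

    - http://www.ibm.com/developerworks/cn/opensource/os-cn-cassandra/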

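Summary

NoSQL arose because today's web systems are increasingly data-driven and less and less model-driven; for now, the data does not need a complex structure.

Most applications only need to store and retrieve loosely structured data, as Twitter and Facebook do. They need neither a complex computation model nor a complex data model.

At the same time, the MapReduce computation model widely used at Google

is flexible enough to apply broadly and can scale horizontally.

Its key-value data model is also a loose data structure.

Cassandra, Dynamo and HBase share similar characteristics.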
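To make the key-value shape of MapReduce concrete, here is a single-process word-count sketch. The function names and the sample documents are invented for the example; a real framework would run the map and reduce steps on many machines.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Emit one (word, 1) pair per word in the document."""
    for word in text.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)

def map_reduce(documents):
    pairs = (
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

docs = {"d1": "NoSQL stores scale", "d2": "key value stores scale out"}
print(map_reduce(docs))
```

Each map call looks at one document and each reduce call at one key, so both phases can be spread across nodes without shared state.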

Most of this text is summarized from Wikipedia; I have given the corresponding links.

Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply.

Posted 2010-08-15 17:09 by 贺韬
