System design study notes

DDIA

1.0 Introduction

  • Reliability
  • Scalability
  • Maintainability

 

1.1 Reliability

The system keeps working correctly even when things fail.

Errors

1) hardware errors
  • disk / network
2) software errors
  • bugs
3) human errors

 

Solutions

  • access control
  • sandbox -> experiment (test)
  • CI/CD (continuous integration, continuous deployment) + automated tests
  • monitoring
  • roll out features gradually (release features to users step by step)

 

1.3 Scalability

scale up (a more powerful machine, e.g. a better CPU)

scale out (more servers)

 

Example: Weibo

Post

select p.post from posts p

join users u on p.send_id = u.id

join follows f on f.followee_id = u.id

where f.follower_id = cur_id

 

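If a user is a celebrity, the number of followers is very large, so sending

every post directly to all followers (fan-out on write) costs a huge number of writes.

 

1) normal user:

method: send the post directly to each follower's timeline when it is posted

 

2) celebrity post:

method: use the query when a follower reads the timeline, then merge the results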
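
A minimal sketch of the hybrid approach DDIA describes for Twitter: posts from ordinary users are pushed into follower timelines at write time, while posts from celebrities are pulled and merged at read time. The names `publish`, `read_timeline`, and the tiny `CELEBRITY_THRESHOLD` are hypothetical and only for illustration.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; a real system would use a far larger number

followers = defaultdict(set)         # author -> users who follow the author
following = defaultdict(set)         # user -> authors the user follows
timelines = defaultdict(list)        # user -> cached (seq, author, text) entries
posts_by_author = defaultdict(list)  # author -> all posts by that author
_seq = 0                             # global sequence number, stands in for a timestamp


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, text):
    global _seq
    _seq += 1
    entry = (_seq, author, text)
    posts_by_author[author].append(entry)
    if not is_celebrity(author):
        # Fan-out on write: copy the post into every follower's cached timeline.
        for user in followers[author]:
            timelines[user].append(entry)


def read_timeline(user):
    # Cached entries plus celebrity posts pulled at read time, newest first.
    entries = list(timelines[user])
    for author in following[user]:
        if is_celebrity(author):
            entries.extend(posts_by_author[author])
    # set() drops duplicates if an author became a celebrity after a fan-out.
    return sorted(set(entries), reverse=True)
```

The trade-off is where the work happens: fan-out makes reads cheap but writes expensive, which is why it is skipped for authors with very many followers.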

 

1) Metrics

throughput (requests per second)

response time

latency (pay attention to tail latency: the slowest requests often come from the most valuable users, who have the most data)
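
A small sketch of why tail latency is reported as a percentile rather than an average. It uses the nearest-rank method; the sample values in `response_ms` are made up for illustration.

```python
import math


def percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Hypothetical response times in milliseconds for ten requests.
response_ms = [12, 15, 11, 14, 13, 16, 12, 15, 14, 900]

median = percentile(response_ms, 50)   # the typical request
p99 = percentile(response_ms, 99)      # the tail
mean = sum(response_ms) / len(response_ms)
```

With these samples the median stays small while the 99th percentile is the single slow request, and the mean describes neither well.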

 

1.5 Maintainability

  • software: bugs
  • hardware: CPU, disk, battery (use redundancy, spare hardware)

 

2.1 Data model

1) SQL
  • Relational model (SQL)
  • IMS (hierarchical model, older than SQL)
2) NoSQL
  • Document model (JSON, XML)
  • Graph model

 

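3.1 Data structures that power databases

1) hash table index

key: the record's key

value: location of the record in the log file (the hash table itself is kept in memory)

 

pros:

  • append-only log (an update appends a new record, then changes the location stored under the key)
  • good performance
  • concurrency is simple

 

cons:

  • the hash table lives in memory (it must fit in RAM and is lost on a crash)
  • cannot serve range queries efficiently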
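
A minimal sketch of a hash-indexed append-only log, in the spirit of the Bitcask design that DDIA describes. The class name `HashIndexedLog` and the `key,value` line format are assumptions for illustration; keys must not contain commas or newlines.

```python
import os


class HashIndexedLog:
    """Append-only log file plus an in-memory hash map: key -> byte offset."""

    def __init__(self, path):
        self.path = path
        self.index = {}
        open(path, "ab").close()  # create the file if it does not exist

    def put(self, key, value):
        record = f"{key},{value}\n".encode("utf-8")
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # position where the record will start
            f.write(record)
        self.index[key] = offset  # repoint the key at the newest record

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        with open(self.path, "rb") as f:
            f.seek(offset)
            line = f.readline().decode("utf-8").rstrip("\n")
        return line.split(",", 1)[1]
```

Old records are never overwritten, so the file keeps growing until segments are compacted, and the index has to be rebuilt from the log after a restart.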
 
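2) SSTable (Sorted String Table) + LSM-tree
SSTable: each segment is sorted by key (kept in memory as a red-black tree before it is written out). Segments can be merged and compacted with a mergesort-style pass, lookups can use binary search, and each key appears once per segment.

Pros:

  • Range queries
  • Concurrency

Cons:

  • Recent writes are held in memory (updates are lost on a crash unless they are also logged)
LSM-tree (Log-Structured Merge-tree)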
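
A sketch of the mergesort-style compaction step for SSTable segments, assuming each segment is a list of `(key, value)` pairs sorted by key and that segments are passed oldest first. The function name `merge_segments` is hypothetical.

```python
import heapq


def merge_segments(segments):
    """Merge sorted segments (oldest first) into one sorted segment.

    Each segment is a list of (key, value) pairs sorted by key, with unique
    keys. When a key appears in several segments, the newest value wins.
    """
    # Tag each entry with the negated segment age so that, for equal keys,
    # the entry from the newest segment sorts first.
    tagged = (
        [(key, -age, value) for key, value in segment]
        for age, segment in enumerate(segments)
    )
    merged = []
    for key, _, value in heapq.merge(*tagged):
        if not merged or merged[-1][0] != key:
            merged.append((key, value))
    return merged
```

Because every input is already sorted, the merge reads each segment sequentially, which is what makes compaction cheap on disk.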

B-tree and B+ tree: usually about 4 levels, which is enough for 256 TB of data (4 KB pages, branching factor 500).

B-tree: point queries

con: data duplication

B+ tree: range queries

 

3) LSMT (SSTable) vs. B-tree / B+ tree

          PROS    CONS
LSMT      Write   Read (slower during compaction); a read has two parts: new tree -> old tree
B+ Tree   Read    Write (page splits)

 

3.8 In-memory database

pros:

Faster reads

Can offer complicated data structures that are hard to implement on disk

 

3.9 data warehouse

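Most of the data is written to the main table, the fact table: (order_id, user_age)...

Snapshots can then be derived from the main table for data analysts (DA) to mine for value. Bandwidth is limited, so the snapshots do not need to be updated in real time.

 

A query such as select * from table loads a lot of data into memory.

Traditional row storage can be replaced by column storage, so columns that are not needed are not loaded. For example, order_id is stored in one file, user_id in another, and so on.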
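
A toy comparison of row storage and column storage, using in-memory lists to stand in for files. The rows and the `amount` column are made-up sample data.

```python
# Hypothetical fact-table rows: (order_id, user_id, user_age, amount).
rows = [
    (1, 101, 25, 30.0),
    (2, 102, 31, 12.5),
    (3, 101, 25, 7.5),
]
column_names = ["order_id", "user_id", "user_age", "amount"]

# Column storage: one list (in a real system, one file) per column.
columns = {name: [row[i] for row in rows] for i, name in enumerate(column_names)}


def total_amount_row_store(rows):
    # Row storage has to load every field of every row to reach one column.
    return sum(row[3] for row in rows)


def total_amount_column_store(columns):
    # Column storage reads only the one column the query needs.
    return sum(columns["amount"])
```

Both functions return the same total; the difference is how much data has to be loaded to compute it.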

 

4.1 Encoding & Evolution

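backward compatibility: newer code can read data written by older code (when you release v_2, it must still read what v_1.5 wrote)
forward compatibility: older code can read data written by newer code (the current version must cope with what a future v_2 writes)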

 

code

data

  • memory
  • disk (serialize on write, decode on read)
method            pros                        cons
Java built-in     easy to write               decoding is not safe; Java only; serialVersionUID differs between class versions
JSON / CSV / XML  schema is easy to change    more time and space
protobuf          less space: key names map to numbers, so key names need not be spelled out
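
A toy illustration of why mapping key names to numbers saves space. This is not the real protobuf wire format; `SCHEMA`, `encode`, and `decode` are a simplified, hypothetical encoding with one tag byte followed by a fixed-size value.

```python
import json
import struct

# Hypothetical schema: field name -> (tag number, struct format).
SCHEMA = {"user_id": (1, "<q"), "age": (2, "<B")}
TAGS = {tag: (name, fmt) for name, (tag, fmt) in SCHEMA.items()}


def encode(record):
    out = b""
    for name, value in record.items():
        tag, fmt = SCHEMA[name]
        # One byte for the tag instead of the full key name.
        out += struct.pack("<B", tag) + struct.pack(fmt, value)
    return out


def decode(data):
    record, pos = {}, 0
    while pos < len(data):
        name, fmt = TAGS[data[pos]]
        (record[name],) = struct.unpack_from(fmt, data, pos + 1)
        pos += 1 + struct.calcsize(fmt)
    return record


record = {"user_id": 1337, "age": 30}
as_json = json.dumps(record).encode("utf-8")
as_tagged = encode(record)
```

Here the tagged form is shorter than the JSON text because the key names live in the schema rather than in every record; the cost is that both sides need the schema to decode.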

   

 

 

4.5 data flow

1) dataflow through databases
2) dataflow through service calls

 

5.1 Why distributed systems

When the scale is very large, use a distributed system.

1) avoid a single point of failure
2) keep working under failure
3) handle more requests

 

5.2 CAP

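Only two of the three can be chosen. Because P is required, the practical choice during a partition is between C and A.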

1) C: consistency

Pros: every replica returns the same data

Cons: higher latency

2) A: availability

Replication is asynchronous.

pros: faster, available

cons: replicas may return different data

3) P: partition tolerance (required)

Network partitions will happen; the service must stay usable when they do.

 

5.3 single leader

1) problems
  • data cannot be read from followers immediately after a write (replication lag).

solve: after writing, read from the leader, or wait until the follower has caught up.

 

  • consistency

Two messages A and B, where A happened before B. How do we make sure every reader sees them in that order?

 

5.4 Multiple Leader

It removes the single point of failure, but it is complex, so avoid it when possible.

 

One leader per data center can be used instead: each user is sent to the nearest data center, and there are still multiple data centers. There are some problems, though. For example, how do we make sure the data in all data centers is the same?

 

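5.6 Applying multi-leader replication

1) case

Each device acts as a leader (e.g. an app that also works offline) and syncs later, so that these devices end up with the same data.

 

2) problem

  • sync (latency)
  • let the user decide what to write, like resolving a git conflict
  • gossip

Write conflicts still occur. One resolution is last write wins: attach a timestamp to each write and keep the one with the latest timestamp.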
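
A minimal sketch of last-write-wins conflict resolution, assuming every replica stores `key -> (timestamp, value)`. The function name `lww_merge` is hypothetical.

```python
def lww_merge(replica_a, replica_b):
    """Merge two replicas; each maps key -> (timestamp, value)."""
    merged = dict(replica_a)
    for key, (ts, value) in replica_b.items():
        # Keep the write with the larger timestamp; on a tie, replica_a wins.
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged
```

Note that the losing write is silently discarded, and the result depends on clocks being comparable across devices, so this is only safe when losing a concurrent write is acceptable.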

 

5.8 Leaderless replication

Use version numbers to decide which value to read (the newest version wins).

Eventual consistency handles the fact that different machines hold different data: the replicas converge over time.

 

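5.9 Quorum

w -> number of replicas that must confirm a write

r -> number of replicas queried for a read

n -> total number of replicas

 

w + r > n, so the read set and the write set share at least one replica and a read can get the latest data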
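
A brute-force sketch of the quorum condition: when w + r > n, every possible write set shares at least one replica with every possible read set. The helper names are hypothetical, and `quorum_read` assumes each reply is a `(version, value)` pair.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """Check by brute force that every write set meets every read set."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


def quorum_read(responses):
    """Pick the newest value from the (version, value) replies of r replicas."""
    return max(responses, key=lambda reply: reply[0])[1]
```

For example, n = 3, w = 2, r = 2 always overlaps, while n = 3, w = 1, r = 2 does not, because a write to one replica can be missed by a read of the other two.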

 

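problem: a read can still return old data.

solve: sloppy quorum (with hinted handoff). Note that it trades consistency for availability: reads may stay stale until the handoff completes.

 

happens-before (e.g. close the friends circle first, then send a post: the two operations must be applied in that order)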

 

6.1 Partition

1) partition by key range

keys 0 - 100 -> us

keys 101 - 200 -> eu

pro: range queries are supported.

con: hot spots are possible.

 
2) partition by hash of key

pro: spreads the load and reduces hot spots

con: range queries are not efficient.
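
A small sketch of both partitioning schemes. The range boundaries follow the example above (0 - 100 -> us, 101 - 200 -> eu); the function names and the choice of MD5 as a stable hash are assumptions.

```python
import bisect
import hashlib

# Range boundaries from the example: keys <= 100 -> "us", keys <= 200 -> "eu".
UPPER_BOUNDS = [100, 200]
RANGE_PARTITIONS = ["us", "eu"]


def partition_by_range(key):
    if not 0 <= key <= UPPER_BOUNDS[-1]:
        raise ValueError(f"key {key} is outside the partitioned range")
    # Adjacent keys land in the same partition, so range scans stay local.
    return RANGE_PARTITIONS[bisect.bisect_left(UPPER_BOUNDS, key)]


def partition_by_hash(key, num_partitions):
    # Use a stable hash; Python's built-in hash() of strings varies per process.
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

With hashing, neighbouring keys scatter across partitions, which evens out the load but means a range query has to ask every partition.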

 

6.2 partition and secondary index

6.3 rebalance

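  • reads and writes stay available during rebalancing

This comes down to the hash algorithm covered in an earlier blog post: choose consistent hashing or the virtual node method.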
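
A minimal sketch of consistent hashing with virtual nodes. The class name `ConsistentHashRing`, the `vnodes=100` default, and MD5 as the ring hash are all assumptions for illustration.

```python
import bisect
import hashlib


def _hash(text):
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node is placed on the ring at `vnodes` positions.
        self.ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self.positions = [pos for pos, _ in self.ring]

    def node_for(self, key):
        # First virtual node clockwise from the key's position, wrapping around.
        index = bisect.bisect_right(self.positions, _hash(key)) % len(self.ring)
        return self.ring[index][1]
```

When a node is added, the only keys that change owner are the ones the new node takes over; all other keys stay where they were, which keeps rebalancing traffic small.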

 

6.5 Client and server

How does the client know which node to talk to?

1) heartbeat

Saving the node information in the session is not a good approach.

2) middleware

message broker

 

7.1 Transactions

ACID (Atomicity, Consistency, Isolation, Durability)

 

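7.3 Transaction isolation levels

dirty read

U2 reads data that U1 has written but not yet committed. If U1's write is then rolled back, U2 has still seen x = 3, which is wrong.

Read Committed prevents dirty reads.

 

dirty write

a write conflict: one transaction overwrites another transaction's uncommitted write (last write wins).

 

non-repeatable read

the same transaction reads the same row twice and gets different values, because another transaction committed in between.

 

1) Read Committed

no dirty reads.

no dirty writes (prevented with row locks).

other anomalies can still happen.

2) Repeatable Read

uses a snapshot to prevent non-repeatable reads

3) Serializable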

 

8.1 The trouble with distributed systems

Either use a very reliable super machine, or use algorithms (fault tolerance in software) to handle the common problems.

 

8.3 how to detect a broken node

detect:

heartbeat
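
A minimal sketch of heartbeat-based failure detection with a timeout. The class name `HeartbeatMonitor` is hypothetical, and the current time is passed in explicitly so the example is deterministic; a real monitor would likely read a monotonic clock.

```python
class HeartbeatMonitor:
    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # node -> time of the last heartbeat

    def record_heartbeat(self, node, now):
        self.last_seen[node] = now

    def suspected_nodes(self, now):
        # A node is only *suspected*: a slow network looks the same as a crash.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

Choosing the timeout is the hard part: too short and healthy nodes are declared dead during a load spike, too long and a real failure goes unnoticed.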

 

solve

1) load balancer

drain traffic away from the broken node

2) promote

if the broken node was the leader: follower -> leader

 

8.6 Byzantine faults

9.1 Linearizability

1) eventual consistency (convergence)

CAP. 

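9.2 2PC (two-phase commit)

1) prepare: the coordinator asks every participant to prepare, and all of them must vote yes

2) commit: if all voted yes, the coordinator tells every participant to commit; otherwise all abort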
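
A simplified, in-process sketch of the two phases. The `Participant` class and `two_phase_commit` function are hypothetical, and the sketch leaves out what makes 2PC hard in practice: durable logging, timeouts, and coordinator crashes.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Vote yes only if the local work can definitely be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote was yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

A single "no" vote is enough to abort the whole transaction, which is how 2PC keeps the outcome atomic across nodes.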

 

10.1 MapReduce

It forms a tree.
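
A single-process sketch of the map, shuffle, and reduce steps, using word count as the usual example. The function names are hypothetical; a real MapReduce job would run the mappers and reducers on different machines.

```python
from collections import defaultdict


def map_phase(document):
    # Emit one (word, 1) pair per word.
    return [(word, 1) for word in document.lower().split()]


def shuffle(pairs):
    # Group all values that share a key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Many mapper outputs flow into fewer reducers, which is one way to read the tree shape noted above.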

10.2 Batch processing from the Unix perspective

Everything is a file.

file -> an ordered sequence of bytes.

posted @ 2024-04-08 12:03  ylxn