consistent hashing
1.why do we need consistent hashing?
problem:
if we just use normal (modulo) hashing, for example with 3 nodes in our db system at first:
all keys with hashcode%3==0 go to node0
all keys with hashcode%3==1 go to node1
all keys with hashcode%3==2 go to node2

but if we add one more node (node3), the modulus changes from 3 to 4, so most of the data on node0~2 has to be rehashed and moved to a different node.
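
To make the cost concrete, here is a minimal sketch that counts how many keys change nodes when a modulo-hashed system grows from 3 to 4 nodes. The helper name node_for and the use of the integers 0..11999 as stand-in hashcodes are assumptions for illustration only.

```python
def node_for(hashcode: int, node_count: int) -> int:
    """Plain modulo placement: the node index is hashcode % node_count."""
    return hashcode % node_count


# Hypothetical hashcodes: 12000 consecutive integers (1000 full cycles of 12).
hashcodes = range(12000)

# A key "moves" if its node index differs between the 3-node and 4-node layouts.
moved = sum(1 for h in hashcodes if node_for(h, 3) != node_for(h, 4))
moved_fraction = moved / len(hashcodes)

print(moved_fraction)  # 0.75
```

Only the keys with hashcode%3 == hashcode%4 stay where they are, which is 3 out of every 12 consecutive hashcodes, so three quarters of the data moves.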

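2.what is consistent hashing?
Consistent hashing is a kind of hashing where, when the hash table is resized, only n/m keys need to be remapped on average (n = number of keys, m = number of slots).
We can think of consistent hashing as a hash ring. For illustration the ring can be divided into 360 points (in real systems it is usually 2^32).
Each key is hashed to a point on the ring and is stored on the first node point found going clockwise from there.
When we add a node, we randomly select K points on the ring for it.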

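what are virtual nodes?
First consider a problem: if node3 goes down, all the data between node1 and node3 on the ring, which node3 was responsible for, falls onto node2. This can easily overwhelm node2 and trigger a chain reaction that finally brings down the whole cluster.
So when adding each node, instead of randomly picking 1 point on the ring, we randomly pick multiple points as that node's load. Then when the node leaves or fails, its load is spread across several other nodes.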
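
The ring with virtual nodes can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name HashRing, the default of 100 virtual nodes per node, and the use of MD5 over "name#i" to place points (instead of truly random points) are all choices made here so that the example is repeatable, not something the notes above prescribe.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Map a string to a point on a ring of size 2**32 (first 4 bytes of MD5)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, virtual_nodes: int = 100):
        self.virtual_nodes = virtual_nodes  # points per real node (assumed value)
        self._points = []                   # sorted ring positions
        self._owner = {}                    # ring position -> real node name

    def add_node(self, node: str) -> None:
        # Place several points for the node instead of a single one.
        for i in range(self.virtual_nodes):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue  # rare collision: the existing point keeps its owner
            bisect.insort(self._points, pos)
            self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the key; wrap around at the end of the ring.
        idx = bisect.bisect_left(self._points, ring_position(key))
        if idx == len(self._points):
            idx = 0
        return self._owner[self._points[idx]]


ring = HashRing()
for name in ("node0", "node1", "node2"):
    ring.add_node(name)

keys = [f"key{i}" for i in range(10000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node3")
after = {k: ring.get_node(k) for k in keys}

# Keys that changed owner; each of them now belongs to the new node.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved, all to node3")
```

Adding node3 only takes over the arcs just before its own points, so keys never move between the old nodes. Removing node3 again hands those arcs back to several different nodes rather than to a single neighbour.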

3.what are the advantages of consistent hashing?
- when we add/remove a node, only a small part of the data is affected.
- because we use the virtual node mechanism (one real node handles the data of several virtual nodes), removing a node does not push all of its load onto a single node; the load is spread over several nodes.
- data can be assigned to the nodes in a more balanced way

4.how to implement consistent hashing?
a related algorithm problem that helps to understand consistent hashing:
https://www.lintcode.com/problem/520/
