6.824 Lab 3

Assignment: http://nil.csail.mit.edu/6.824/2022/labs/lab-kvraft.html

Part A

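Part A covers two scenarios:

  • Task 1: the normal case, with no lost messages and no failed nodes
  • Task 2: the failure case, for example a server failure that causes messages to be re-sent

The server layer has to handle the case where the leader changes after Raft agreement has started but before the entry is committed. The client then retries against the other servers.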

Your solution needs to handle a leader that has called Start() for a Clerk's RPC, but loses its leadership before the request is committed to the log. In this case you should arrange for the Clerk to re-send the request to other servers until it finds the new leader. One way to do this is for the server to detect that it has lost leadership, by noticing that a different request has appeared at the index returned by Start(), or that Raft's term has changed. If the ex-leader is partitioned by itself, it won't know about new leaders; but any client in the same partition won't be able to talk to a new leader either, so it's OK in this case for the server and client to wait indefinitely until the partition heals.

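Key point 1

To guarantee linearizability, the simple, brute-force approach is to execute all RPC requests on a single node serially, by taking a lock at the start of each RPC handler.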

Hint: It's best to add locking from the start because the need to avoid deadlocks sometimes affects overall code design. Check that your code is race-free using go test -race.

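Key point 2

The server layer receives a client request, asks the underlying Raft to replicate it, and then waits for it to be applied. When the server layer receives an apply message, it has to notify the waiting thread. But if many threads are waiting, how does the apply thread find the one waiting on this entry? The answer is to correlate them by log index. Save the index returned by Raft's Start; at apply time, take the index from the message, and you can tell which thread is waiting on that entry. The apply thread should notify the request thread rather than having the request thread poll for the apply result, which saves CPU.

In my implementation I use a notification map whose key is the index and whose value is the term. When the apply thread receives an apply message, it writes the entry's index and term into the map and then wakes the request threads through a condition variable. A woken request thread checks whether the index and term in the map match the index and term returned by Start. If they match, its own entry was committed; otherwise its entry failed to propagate through Raft and the client must retry.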
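The notification table described above could look roughly like the sketch below. It is not the code from my lab solution: the names applyNotifier, onApply, and wait are made up for illustration, and the sketch assumes the apply goroutine calls onApply once for every applied entry.

```go
package main

import (
	"fmt"
	"sync"
)

// applyNotifier is a hypothetical helper: for each log index it records the
// term of the entry that was actually applied at that index.
type applyNotifier struct {
	mu      sync.Mutex
	cond    *sync.Cond
	applied map[int]int // log index -> term of the applied entry
}

func newApplyNotifier() *applyNotifier {
	n := &applyNotifier{applied: make(map[int]int)}
	n.cond = sync.NewCond(&n.mu)
	return n
}

// onApply is called by the apply goroutine for every applied entry.
func (n *applyNotifier) onApply(index, term int) {
	n.mu.Lock()
	n.applied[index] = term
	n.mu.Unlock()
	n.cond.Broadcast() // wake every waiter; each one re-checks its own index
}

// wait blocks until some entry has been applied at index, then reports
// whether that entry carries the term that Start returned to the caller.
func (n *applyNotifier) wait(index, term int) bool {
	n.mu.Lock()
	defer n.mu.Unlock()
	for {
		if t, ok := n.applied[index]; ok {
			return t == term
		}
		n.cond.Wait()
	}
}

func main() {
	n := newApplyNotifier()
	done := make(chan bool)
	go func() { done <- n.wait(5, 2) }()
	n.onApply(5, 2)
	fmt.Println(<-done) // true: our own entry was applied at index 5

	n.onApply(6, 3)
	fmt.Println(n.wait(6, 2)) // false: a different entry took index 6
}
```

Note that wait in this sketch blocks forever if nothing is ever applied at the index, so a real server also needs some way to give up when leadership is lost.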

Reference: https://thesquareplanet.com/blog/students-guide-to-raft/#applying-client-operations

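Key point 3

Put and Append requests must be deduplicated, that is, they must be idempotent. Deduplication is needed in two places: when a request enters the server layer, to keep a duplicate request out of the Raft log, and at apply time, to keep the server's kv map from being corrupted.

The method is to assign each client a ClientId and each request an increasing RequestId. The server keeps a table of which RequestIds it has applied for which ClientIds (in production this table should be persisted). The limitation is that idempotence only holds within a single ClientId.

Reference: https://thesquareplanet.com/blog/students-guide-to-raft/#duplicate-detection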
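Here is a minimal sketch of the apply-time check. It is a simplified variant rather than my exact table: it keeps only the highest RequestId applied per client, which is enough only under the assumption that each client has at most one request outstanding and that its RequestIds increase. The type and field names (op, kvStore, lastReq) are hypothetical.

```go
package main

import "fmt"

// op is a hypothetical log entry payload.
type op struct {
	ClientId  int64
	RequestId int64  // assumed to increase with every new request of a client
	Kind      string // "Put" or "Append"
	Key       string
	Value     string
}

type kvStore struct {
	data    map[string]string
	lastReq map[int64]int64 // ClientId -> highest RequestId already applied
}

func newKVStore() *kvStore {
	return &kvStore{
		data:    make(map[string]string),
		lastReq: make(map[int64]int64),
	}
}

// isDuplicate reports whether this request was already applied.
func (s *kvStore) isDuplicate(clientId, requestId int64) bool {
	last, ok := s.lastReq[clientId]
	return ok && requestId <= last
}

// apply executes a committed op at most once per (ClientId, RequestId).
func (s *kvStore) apply(o op) {
	if s.isDuplicate(o.ClientId, o.RequestId) {
		return // a retried request that already took effect
	}
	switch o.Kind {
	case "Put":
		s.data[o.Key] = o.Value
	case "Append":
		s.data[o.Key] += o.Value
	}
	s.lastReq[o.ClientId] = o.RequestId
}

func main() {
	s := newKVStore()
	o := op{ClientId: 1, RequestId: 1, Kind: "Append", Key: "0", Value: "x 0 2 y"}
	s.apply(o)
	s.apply(o) // the same request committed a second time after a retry
	fmt.Println(s.data["0"]) // x 0 2 y
}
```

The same isDuplicate check can be run in the RPC handler before calling Start, to keep obvious duplicates out of the Raft log.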

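Possible improvements

  • When the client retries to find the new leader, it currently cycles through the servers round-robin, which may be inefficient.
  • Entries are never deleted from the map used for request deduplication, so it keeps growing and memory usage is unbounded.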
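For the first item, one common refinement is for the client to remember which server last answered as leader and try that server first. The sketch below only illustrates the idea: clerk, leader, and the try callback are hypothetical stand-ins for the real Clerk and its RPC call, and a real client would sleep between full rounds instead of spinning.

```go
package main

import "fmt"

// clerk is a hypothetical client that caches the last known leader.
type clerk struct {
	numServers int
	leader     int // index of the server to try first
}

// call tries servers starting from the cached leader and rotates on failure.
// try stands in for the RPC: it reports whether the server accepted the
// request as leader. The return value is the number of attempts made.
func (c *clerk) call(try func(server int) bool) int {
	attempts := 0
	for {
		attempts++
		if try(c.leader) {
			return attempts // c.leader stays cached for the next call
		}
		c.leader = (c.leader + 1) % c.numServers
	}
}

func main() {
	c := &clerk{numServers: 5}
	isLeader := func(server int) bool { return server == 3 }
	fmt.Println(c.call(isLeader)) // 4: servers 0, 1, 2 refuse, server 3 accepts
	fmt.Println(c.call(isLeader)) // 1: the cached leader is tried first
}
```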

A problem case

2022/11/27 11:01:46.122593 client.go:78: [Client] Append key:0, value:x 0 2 y
2022/11/27 11:01:46.122808 server.go:80: [server0] Put, begin, {Key:0 Value:x 0 2 y Op:Append}

2022/11/27 11:01:46.486206 raft.go:797: [S2|T2] Begin leader election
S1, S3, and S4 vote for S2

2022/11/27 11:01:46.537044 server.go:83: [server0] Put, lock success, {Key:0 Value:x 0 2 y Op:Append}


2022/11/27 11:01:46.540999 raft.go:571: [S2|T3] I am leader, my log: [{<nil> 0} {{1  Put} 2} {{1 x 1 0 y Append} 2} {{4  Put} 2} {{3  Put} 2} {{0  Put} 2} {{2  Put} 2} {{1 x 1 1 y Append} 2} {{4 x 4 0 y Append} 2} {{3  Get} 2} {{0  Get} 2} {{2  Get} 2} {{1  Get} 2} {{4  Get} 2} {{3  Get} 2} {{0  Get} 2} {{2 x 2 0 y Append} 2} {{1  Get} 2} {{4 x 4 1 y Append} 2} {{3 x 3 0 y Append} 2} {{0 x 0 0 y Append} 2} {{2 x 2 1 y Append} 2} {{1  Get} 2} {{4  Get} 2} {{3 x 3 1 y Append} 2} {{0  Get} 2} {{2  Get} 2} {{1 x 1 2 y Append} 2} {{4  Get} 2} {{3  Get} 2} {{0 x 0 1 y Append} 2} {{2  Get} 2} {{1 x 1 3 y Append} 2} {{4 x 4 2 y Append} 2} {{3 x 3 2 y Append} 2} {{0  Get} 2} {{2 x 2 2 y Append} 2} {{1  Get} 2} {{4 x 4 3 y Append} 2} {{3 x 3 3 y Append} 2} {{0  Get} 2} {{2 x 2 3 y Append} 2} {{1  Get} 2} {{4  Get} 2} {{3  Get} 2}], commitIndex: 43, lastIncludedIndex: 0

2022/11/27 11:01:46.541005 raft.go:702: [S0|T2] Leader after append log [{<nil> 0} {{1  Put} 2} {{1 x 1 0 y Append} 2} {{4  Put} 2} {{3  Put} 2} {{0  Put} 2} {{2  Put} 2} {{1 x 1 1 y Append} 2} {{4 x 4 0 y Append} 2} {{3  Get} 2} {{0  Get} 2} {{2  Get} 2} {{1  Get} 2} {{4  Get} 2} {{3  Get} 2} {{0  Get} 2} {{2 x 2 0 y Append} 2} {{1  Get} 2} {{4 x 4 1 y Append} 2} {{3 x 3 0 y Append} 2} {{0 x 0 0 y Append} 2} {{2 x 2 1 y Append} 2} {{1  Get} 2} {{4  Get} 2} {{3 x 3 1 y Append} 2} {{0  Get} 2} {{2  Get} 2} {{1 x 1 2 y Append} 2} {{4  Get} 2} {{3  Get} 2} {{0 x 0 1 y Append} 2} {{2  Get} 2} {{1 x 1 3 y Append} 2} {{4 x 4 2 y Append} 2} {{3 x 3 2 y Append} 2} {{0  Get} 2} {{2 x 2 2 y Append} 2} {{1  Get} 2} {{4 x 4 3 y Append} 2} {{3 x 3 3 y Append} 2} {{0  Get} 2} {{2 x 2 3 y Append} 2} {{1  Get} 2} {{4  Get} 2} {{3  Get} 2} {{0 x 0 2 y Append} 2}]

2022/11/27 11:01:46.541772 server.go:95: [server0] Put, start return, {Key:0 Value:x 0 2 y Op:Append}

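S0 is the leader. It receives the Put/Append request, calls into Raft, and appends the op to its log. For some reason S2 starts a leader election, wins the votes of S1, S3, and S4, and becomes leader. If S2's log does not contain this entry, the entry in S0's log gets overwritten and the server layer never receives an apply notification for it. So after starting Raft agreement, the server layer must not only wait for the op to be applied but also keep checking whether it is still the leader; once it is not, it must return an error to the client.

But returning an error based solely on "am I still the leader" is also problematic. If S2 does contain the entry, the client's retried Append produces duplicated content in the value. A deduplication mechanism is therefore needed to make client requests idempotent.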
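One way to combine "wait for apply" with "notice that leadership may be gone" is sketched below. This is an illustration under assumptions, not my lab code: waitCommitted and defaultPoll are hypothetical names, the applied channel is assumed to deliver the term of whatever entry was applied at the index returned by Start, and currentTerm stands in for asking Raft for its term (in the lab that would go through rf.GetState()).

```go
package main

import (
	"fmt"
	"time"
)

// defaultPoll is an arbitrary polling interval chosen for this sketch.
const defaultPoll = 10 * time.Millisecond

// waitCommitted waits until an entry is applied at the caller's index or
// until Raft's term moves past startTerm. It returns true only if the
// applied entry carries startTerm, i.e. it is the entry the caller proposed.
func waitCommitted(applied <-chan int, startTerm int, currentTerm func() int, poll time.Duration) bool {
	ticker := time.NewTicker(poll)
	defer ticker.Stop()
	for {
		select {
		case term := <-applied:
			return term == startTerm
		case <-ticker.C:
			if currentTerm() != startTerm {
				// The term changed, so leadership may be lost. Give up and
				// let the client retry; deduplication makes the retry safe.
				return false
			}
		}
	}
}

func main() {
	applied := make(chan int, 1)
	applied <- 2
	fmt.Println(waitCommitted(applied, 2, func() int { return 2 }, defaultPoll)) // true

	never := make(chan int) // nothing is ever applied at our index
	fmt.Println(waitCommitted(never, 2, func() int { return 3 }, defaultPoll)) // false
}
```

Giving up on a term change can return an error for an entry that the new leader later commits anyway, which is exactly why the retry has to be idempotent.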

The assignment includes this hint:

Hint: It's best to add locking from the start because the need to avoid deadlocks sometimes affects overall code design. Check that your code is race-free using go test -race.

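The skeleton code also provides a sync.Mutex, but in my implementation I removed this restriction, so concurrent client requests are handled concurrently. Raft can then also propagate more entries in a single round of log replication. Measured by elapsed time, this optimization improved performance by about an order of magnitude.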

Part B

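Part B adds snapshots on top of Part A. The points to implement include:

  • After each apply, check whether the log has grown too long; if so, take a snapshot
  • On server startup, run the restore procedure to recover the relevant data from the snapshot
  • If the apply message is a snapshot message, run the restore procedure as well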
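The "is the log too long" check in the first item can be as small as the sketch below. shouldSnapshot is a hypothetical helper; its maxRaftState parameter mirrors the lab's maxraftstate setting (where -1 means snapshots are not required), and raftStateSize is assumed to come from the persister's reported Raft state size. Triggering exactly at the limit is my choice for the sketch; triggering a little earlier works too.

```go
package main

import "fmt"

// shouldSnapshot reports whether the persisted Raft state has reached the
// configured limit. A negative limit disables snapshotting.
func shouldSnapshot(maxRaftState, raftStateSize int) bool {
	if maxRaftState < 0 {
		return false
	}
	return raftStateSize >= maxRaftState
}

func main() {
	fmt.Println(shouldSnapshot(1000, 400))  // false
	fmt.Println(shouldSnapshot(1000, 1200)) // true
	fmt.Println(shouldSnapshot(-1, 1200))   // false
}
```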

The data structure used for deduplication also has to be persisted in the snapshot:

Hint: Your kvserver must be able to detect duplicated operations in the log across checkpoints, so any state you are using to detect them must be included in the snapshots.
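A minimal sketch of bundling both pieces of state into one snapshot follows. It uses encoding/gob directly instead of the lab's labgob wrapper, and snapshotState with its fields Data and LastReq are hypothetical names; the fields are exported because gob only encodes exported fields.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// snapshotState is a hypothetical container for everything that must
// survive a snapshot: the key/value data and the deduplication table.
type snapshotState struct {
	Data    map[string]string
	LastReq map[int64]int64 // ClientId -> highest RequestId applied
}

// encodeSnapshot serializes the state into a byte slice.
func encodeSnapshot(s snapshotState) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeSnapshot restores the state into a fresh value.
func decodeSnapshot(b []byte) (snapshotState, error) {
	var s snapshotState
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&s)
	return s, err
}

func main() {
	before := snapshotState{
		Data:    map[string]string{"0": "x 0 2 y"},
		LastReq: map[int64]int64{7: 3},
	}
	b, err := encodeSnapshot(before)
	if err != nil {
		panic(err)
	}
	after, err := decodeSnapshot(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(after.Data["0"], after.LastReq[7]) // x 0 2 y 3
}
```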

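Pitfalls

One thing to watch out for: if the snapshot contains a map, the target map must be cleared before decoding. Otherwise the result is the union of the existing map and the decoded map, with decoded values overwriting existing ones only for keys present in both. Below is a small gob encode/decode program I wrote to verify this: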

package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"io"
)

// LabEncoder and LabDecoder are thin wrappers around gob's encoder and decoder.
type LabEncoder struct {
	gob *gob.Encoder
}

type LabDecoder struct {
	gob *gob.Decoder
}

func NewEncoder(w io.Writer) *LabEncoder {
	enc := &LabEncoder{}
	enc.gob = gob.NewEncoder(w)
	return enc
}

func NewDecoder(r io.Reader) *LabDecoder {
	dec := &LabDecoder{}
	dec.gob = gob.NewDecoder(r)
	return dec
}

func (enc *LabEncoder) Encode(e interface{}) error {
	return enc.gob.Encode(e)
}

func (dec *LabDecoder) Decode(e interface{}) error {
	return dec.gob.Decode(e)
}

func main() {
	m := make(map[string]string)
	m["aaa"] = "bbb"
	w := new(bytes.Buffer)
	e := NewEncoder(w)
	if err := e.Encode(m); err != nil {
		panic(err)
	}
	// Modify the map after it has been encoded.
	m["aaa"] = "www"
	m["111"] = "222"
	fmt.Println(m)
	snapshot := w.Bytes()
	r := bytes.NewBuffer(snapshot)
	d := NewDecoder(r)
	// Decode into the existing, non-empty map.
	if err := d.Decode(&m); err != nil {
		panic(err)
	}
	fmt.Println(m)
}

Output:

map[111:222 aaa:www]
map[111:222 aaa:bbb]

As the output shows, decoding only overwrote "aaa", while "111" remained in the map.
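A straightforward way around this is to decode into a brand-new map and then replace the old one. The sketch below uses encoding/gob directly, and encodeMap and restoreMap are hypothetical helper names.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// encodeMap serializes a map with gob.
func encodeMap(m map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(m); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// restoreMap decodes into a fresh map, so no stale keys can survive.
func restoreMap(snapshot []byte) (map[string]string, error) {
	restored := make(map[string]string)
	if err := gob.NewDecoder(bytes.NewReader(snapshot)).Decode(&restored); err != nil {
		return nil, err
	}
	return restored, nil
}

func main() {
	m := map[string]string{"aaa": "bbb"}
	snapshot, err := encodeMap(m)
	if err != nil {
		panic(err)
	}
	m["aaa"] = "www"
	m["111"] = "222"

	m, err = restoreMap(snapshot) // replace the old map instead of merging
	if err != nil {
		panic(err)
	}
	fmt.Println(m) // map[aaa:bbb]
}
```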

References

https://pdos.csail.mit.edu/6.824/labs/lab-kvraft.html

posted @ 2023-02-15 22:11  leo987