etcd frequently re-elects its leader

etcd in the cluster raised the following alert:

Alert Name: A high number of leader changes within the etcd cluster are happening
Severity: warning
Cluster Name: shdmz-prod-diamond (ID: c-n6wc4)
Namespace: cattle-prometheus
Expression: increase(etcd_server_leader_changes_seen_total[1h])>3
Description: Threshold Crossed: datapoint value 4.067796610169491 was greater than to the threshold (3) for (3m)

The etcd logs show the problem below; there were also entries that look like heartbeat timeouts.

2020-07-08 11:32:11.730958 W | rafthttp: the clock difference against peer db40725e6f94d8e3 is too high [13.717094955s > 1s] (prober "ROUND_TRIPPER_RAFT_MESSAGE")
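To see which peers are affected and by how much, warnings like the one above can be pulled out of the log with a small script. This is a minimal sketch, not an etcd tool: the function name `find_clock_drift` and the regular expression are my own, and it assumes the warning has exactly the format shown above, with both values printed in plain seconds.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Assumption: the warning looks exactly like the log line quoted above,
# with both durations printed in plain seconds (e.g. "13.717094955s > 1s").
# Durations printed in other forms (such as "1m3s") are not matched.
CLOCK_DRIFT_RE = re.compile(
    r"the clock difference against peer (?P<peer>[0-9a-f]+) is too high "
    r"\[(?P<diff>[0-9.]+)s > (?P<limit>[0-9.]+)s\]"
)


class ClockDrift(NamedTuple):
    peer: str             # peer member ID as printed in the log
    diff_seconds: float   # measured clock difference
    limit_seconds: float  # limit reported in the same log line


def find_clock_drift(lines: Iterable[str]) -> Iterator[ClockDrift]:
    """Yield one ClockDrift per matching rafthttp warning line."""
    for line in lines:
        match = CLOCK_DRIFT_RE.search(line)
        if match:
            yield ClockDrift(
                peer=match["peer"],
                diff_seconds=float(match["diff"]),
                limit_seconds=float(match["limit"]),
            )
```

Feeding it the lines of the etcd log (for example `find_clock_drift(open(path))`, where `path` is wherever your log is stored) gives the peer IDs whose clocks need to be checked first.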

Solution

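1. Some machines in the cluster have unsynchronized clocks (the log above shows a 13.7 s difference against a 1 s limit). Synchronize the time on all etcd nodes, for example with NTP or chrony.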

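2. Increase the heartbeat interval and the election timeout. Both values are in milliseconds; the etcd defaults are 100 and 1000.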

- --election-timeout=5000
- --heartbeat-interval=500
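Before rolling out new values it helps to sanity-check the pair. The sketch below is only an illustration: `check_raft_timing` is a hypothetical helper, and the two rules it encodes (election timeout at least 5 times the heartbeat interval, and at most 50000 ms) are assumptions taken from etcd's startup validation and tuning guide, so verify them against the etcd version you run.

```python
from typing import List

# Assumed limits, based on etcd's startup checks and tuning guide.
# Verify against the documentation of your etcd version.
MIN_ELECTION_TO_HEARTBEAT_RATIO = 5
MAX_ELECTION_TIMEOUT_MS = 50000


def check_raft_timing(heartbeat_ms: int, election_ms: int) -> List[str]:
    """Return a list of problems found; an empty list means the pair looks sane."""
    problems = []
    if heartbeat_ms <= 0 or election_ms <= 0:
        problems.append("both values must be positive")
        return problems
    if election_ms < MIN_ELECTION_TO_HEARTBEAT_RATIO * heartbeat_ms:
        problems.append(
            f"election timeout {election_ms}ms is less than "
            f"{MIN_ELECTION_TO_HEARTBEAT_RATIO}x heartbeat interval {heartbeat_ms}ms"
        )
    if election_ms > MAX_ELECTION_TIMEOUT_MS:
        problems.append(
            f"election timeout {election_ms}ms exceeds {MAX_ELECTION_TIMEOUT_MS}ms"
        )
    return problems


if __name__ == "__main__":
    # The values used in this post: heartbeat 500 ms, election timeout 5000 ms.
    print(check_raft_timing(500, 5000) or "ok")
```

The values used here (500 and 5000) keep the same 10x ratio as the defaults, only stretched, so every member must use the same pair. Larger timeouts also mean the cluster takes longer to notice a leader that has really failed.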

 

posted @ 2020-07-12 20:10  Wshile