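Etcd Basic Maintenance

All commands in this article were run in a TLS environment. To reuse them, adapt them to your own environment (node IPs, certificate paths); in an environment without certificates, remove the certificate-related flags.

Commands in this article that are not prefixed with ETCDCTL_API=3 run against etcdctl's default API, etcd API v2 (the default in etcdctl releases before 3.4); commands prefixed with ETCDCTL_API=3 use the v3 API. The v3 commands differ slightly and may not match their v2 counterparts; see the official documentation for details: https://etcd.io/docs/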

Using Etcd

  • Example: create, query, and delete a key (/test/ok with value 11)

# Example: write a key to etcd
ETCDCTL_API=3 etcdctl \
--endpoints=https://172.16.10.70:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
put /test/ok 11

# Example: read the key back from etcd
ETCDCTL_API=3 etcdctl \
--endpoints=https://172.16.10.70:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
get /test/ok

# Example: delete the key from etcd
ETCDCTL_API=3 etcdctl \
--endpoints=https://172.16.10.70:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
del /test/ok
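The put, get, and del examples repeat the same TLS flags. A minimal sketch of a wrapper that factors them out is shown here; the function name etcd3 and the ETCD_ENDPOINT, ETCD_CACERT, ETCD_CERT, ETCD_KEY, and ETCD_CLI_BIN variables are assumptions introduced for illustration and are not part of etcdctl.

```shell
# Hypothetical defaults: they mirror the endpoint and certificate paths used in this article.
ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://172.16.10.70:2379}"
ETCD_CACERT="${ETCD_CACERT:-/etc/kubernetes/ssl/ca.pem}"
ETCD_CERT="${ETCD_CERT:-/etc/etcd/ssl/etcd.pem}"
ETCD_KEY="${ETCD_KEY:-/etc/etcd/ssl/etcd-key.pem}"

# Run any v3 subcommand with the shared TLS flags.
# ETCD_CLI_BIN is an assumed override for dry runs (for example ETCD_CLI_BIN=echo).
etcd3() {
  ETCDCTL_API=3 "${ETCD_CLI_BIN:-etcdctl}" \
    --endpoints="$ETCD_ENDPOINT" \
    --cacert="$ETCD_CACERT" \
    --cert="$ETCD_CERT" \
    --key="$ETCD_KEY" \
    "$@"
}

# Usage:
#   etcd3 put /test/ok 11
#   etcd3 get /test/ok
#   etcd3 del /test/ok
```

Setting ETCD_CLI_BIN=echo prints the assembled argument list instead of contacting the cluster, which is a cheap way to check the flags before running them for real.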

 

Maintaining Etcd with curl

  • Check the version

curl -k --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem https://127.0.0.1:2379/version

 

  • View the Prometheus metrics exposed by Etcd; Prometheus can scrape this endpoint to monitor the member

curl -k --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem https://127.0.0.1:2379/metrics
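The /metrics endpoint returns a long Prometheus text dump, so in practice one usually picks out a single value. A small sketch, assuming the metric of interest has no labels; metric_value is a hypothetical helper name, and etcd_server_has_leader is used only as an example metric.

```shell
# Print the value of one un-labelled metric from Prometheus text read on stdin.
# Comment lines start with "#", so they never match the metric name in column 1.
metric_value() {
  awk -v m="$1" '$1 == m { print $2; exit }'
}

# Usage against a live member (same curl call as in this article):
#   curl -k --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem \
#     https://127.0.0.1:2379/metrics | metric_value etcd_server_has_leader
```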

 

Checking versions with etcdctl

  • Show the etcdctl version and the v2 API version

etcdctl -v

 

  • Show the etcdctl version and the v3 API version

ETCDCTL_API=3 etcdctl version

 

Checking cluster health

etcdctl \
--endpoints=https://172.16.10.70:2379 \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
cluster-health

 

Listing cluster members and identifying the leader

etcdctl \
--endpoints=https://172.16.10.70:2379 \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
member list
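cluster-health is a v2-only command. Under the v3 API the closest counterparts are endpoint health and endpoint status. The sketch below assumes all three client URLs from this article are reachable; etcd3_all, health_v3, status_v3, and ETCD_CLI_BIN are hypothetical names introduced here.

```shell
# Run a v3 subcommand against every client endpoint used in this article.
# ETCD_CLI_BIN is an assumed override for dry runs (for example ETCD_CLI_BIN=echo).
etcd3_all() {
  ETCDCTL_API=3 "${ETCD_CLI_BIN:-etcdctl}" \
    --endpoints=https://172.16.10.70:2379,https://172.16.10.71:2379,https://172.16.10.72:2379 \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/etcd/ssl/etcd.pem \
    --key=/etc/etcd/ssl/etcd-key.pem \
    "$@"
}

# Health of every listed endpoint (v3 counterpart of cluster-health).
health_v3() { etcd3_all endpoint health; }

# Per-endpoint status as a table; the IS LEADER column marks the leader.
status_v3() { etcd3_all endpoint status --write-out=table; }
```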

 

Removing an Etcd member

  • Look up the member ID

etcdctl \
--endpoints=https://172.16.10.70:2379 \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
member list

340acbd004e6bcdb: name=etcd3 peerURLs=https://172.16.10.72:2380 clientURLs=https://172.16.10.72:2379 isLeader=false
9784cb04cceb3a48: name=etcd1 peerURLs=https://172.16.10.70:2380 clientURLs=https://172.16.10.70:2379 isLeader=true
ba343177666dd96e: name=etcd2 peerURLs=https://172.16.10.71:2380 clientURLs=https://172.16.10.71:2379 isLeader=false

 

  • Remove the member, for example etcd3

etcdctl \
--endpoints=https://172.16.10.70:2379 \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
member remove 340acbd004e6bcdb

 

  • Edit the configuration file etcd.conf: remove the node's entry from the ETCD_INITIAL_CLUSTER parameter, then restart the etcd service
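Copying the member ID by hand is error-prone. A small sketch that extracts it from the v2 member list output, assuming one member per line with the ID in the first column and name=... in the second; member_id is a hypothetical helper name.

```shell
# Print the ID of the member named $1, reading v2 "member list" output on stdin.
# Expected line shape: "<id>: name=<name> peerURLs=... clientURLs=... isLeader=..."
member_id() {
  awk -v want="name=$1" '$2 == want { sub(/:$/, "", $1); print $1 }'
}

# Usage (same flags as the member list command in this article):
#   id=$(etcdctl --endpoints=https://172.16.10.70:2379 \
#          --ca-file=/etc/kubernetes/ssl/ca.pem \
#          --cert-file=/etc/etcd/ssl/etcd.pem \
#          --key-file=/etc/etcd/ssl/etcd-key.pem \
#          member list | member_id etcd3)
#   [ -n "$id" ] && echo "would remove member $id"
```

Checking that the result is non-empty before calling member remove avoids passing an empty ID when the name is misspelled.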

 

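Adding an Etcd member

Re-adding an existing Etcd member after a failure (example: re-adding etcd3)

1) Remove the failed member from the cluster

  • On any etcd node, look up the failed member's ID and remove the member by that ID, following the steps above

  • Delete the data on the target node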

# Stop the etcd service on the target node
systemctl stop etcd

# Back up the data before deleting it
cd /var/lib/ && mkdir -p etcd_bak && tar -czvf etcd_bak/etcd_$(date +%Y%m%d%H%M%S).tar.gz etcd

# Delete the node's data
rm -rf /var/lib/etcd/*

 

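2) Edit the target node's configuration file and set --initial-cluster-state (ETCD_INITIAL_CLUSTER_STATE) to existing (otherwise the node generates a new ID that does not match the original, and it cannot join the cluster)

vim /etc/etcd/etcd.conf

# [member]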
ETCD_NAME=etcd3
ETCD_DATA_DIR="/var/lib/etcd/"
ETCD_SNAPSHOT_COUNT="100"
ETCD_HEARTBEAT_INTERVAL="100"
ETCD_ELECTION_TIMEOUT="1000"
ETCD_LISTEN_PEER_URLS="https://172.16.10.72:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.10.72:2379,https://127.0.0.1:2379"
ETCD_MAX_SNAPSHOTS="5"
ETCD_MAX_WALS="5"
# [cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.10.72:2380"
ETCD_INITIAL_CLUSTER="etcd1=https://172.16.10.70:2380,etcd2=https://172.16.10.71:2380,etcd3=https://172.16.10.72:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_INITIAL_CLUSTER_TOKEN="k8s-etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.16.10.72:2379"
# [security]
ETCD_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/etc/kubernetes/ssl/ca.pem"
ETCD_AUTO_TLS="true"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/etc/kubernetes/ssl/ca.pem"
ETCD_PEER_AUTO_TLS="true"

 

3) Add the node to the cluster, specifying the target node's etcd name and peer URL

etcdctl \
--endpoints=https://172.16.10.70:2379 \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
member add etcd3 https://172.16.10.72:2380

 

4) Start the etcd service on the target node

systemctl start etcd && systemctl status etcd

 

5) Check cluster health

etcdctl \
--endpoints=https://172.16.10.70:2379 \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
cluster-health
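A freshly started member can take a few seconds to catch up, so the first health check may fail. A minimal retry sketch follows; retry is a hypothetical helper, and it assumes the health command exits non-zero until the cluster is healthy.

```shell
# Retry a command until it succeeds.
# Usage: retry <max-attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while ! "$@"; do
    # Give up once the allowed number of attempts has been used.
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage (same health check as in this article, up to 10 tries, 3 seconds apart):
#   retry 10 3 etcdctl --endpoints=https://172.16.10.70:2379 \
#     --ca-file=/etc/kubernetes/ssl/ca.pem \
#     --cert-file=/etc/etcd/ssl/etcd.pem \
#     --key-file=/etc/etcd/ssl/etcd-key.pem \
#     cluster-health
```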

 

Taking a snapshot backup of Etcd

ETCDCTL_API=3 etcdctl \
--endpoints=https://172.16.10.70:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
snapshot save /tmp/snapshot_$(date +%Y%m%d%H%M%S).db

ETCDCTL_API=3 selects the etcd v3 API.
Note: snapshot save exists only in the v3 API. With an etcdctl that defaults to v2 (releases before 3.4), the backup fails unless ETCDCTL_API=3 is set.
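Timestamped snapshots accumulate quickly if the backup runs on a schedule. A sketch of a cleanup step that keeps only the newest few files; prune_snapshots is a hypothetical helper, and it assumes the snapshot_<timestamp>.db naming used above, so that sorting by name is also sorting by time.

```shell
# Keep only the newest <keep> snapshot files in <dir>; delete the rest.
# Usage: prune_snapshots <dir> <keep>
prune_snapshots() {
  local dir=$1 keep=$2
  # Newest first, then skip the first <keep> names and remove what remains.
  ls -1 "$dir"/snapshot_*.db 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r f; do
      rm -f -- "$f"
    done
}

# Usage: keep the 7 most recent snapshots under /tmp
#   prune_snapshots /tmp 7
```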

 

Restoring an Etcd cluster from a snapshot backup (run on every node)

  • Stop the Etcd service

systemctl stop etcd

 

  • Back up and delete the current Etcd data

cd /var/lib/ && mkdir -p etcd_bak && tar -czvf etcd_bak/etcd_$(date +%Y%m%d%H%M%S).tar.gz etcd --remove-files

 

  • Restore the snapshot

ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
--name etcd1 \
--data-dir=/var/lib/etcd \
--initial-cluster etcd1=https://172.16.10.70:2380,etcd2=https://172.16.10.71:2380,etcd3=https://172.16.10.72:2380 \
--initial-cluster-token k8s-etcd-cluster \
--initial-advertise-peer-urls https://172.16.10.70:2380 \
snapshot restore /tmp/201912-18_snapshot.db

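--name: the name of the current etcd node (not the hostname)
--data-dir: the data directory of the current etcd node
--initial-cluster: the peer URLs of all members in the cluster, e.g. etcd1=https://172.16.10.70:2380,etcd2=https://172.16.10.71:2380,etcd3=https://172.16.10.72:2380
--initial-cluster-token: the token that identifies the cluster to its members
--initial-advertise-peer-urls: the peer URL the current node advertises to the other members
Because the restore runs on every node, set --name and --initial-advertise-peer-urls to the values of the node on which the command is run.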

 

  • Start the etcd service on all Etcd nodes

systemctl start etcd

 

  • Check cluster health

etcdctl \
--endpoints=https://172.16.10.70:2379 \
--ca-file=/etc/kubernetes/ssl/ca.pem \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
cluster-health
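Since the restore has to run on every node with that node's own name and peer URL, it can help to generate the three commands from one template and review them before running anything. The sketch below only prints the commands; restore_cmd is a hypothetical helper, and the cluster string, token, and certificate paths are copied from this article.

```shell
# Print (do not execute) the snapshot restore command for one node.
# Usage: restore_cmd <node-name> <node-ip> <snapshot-file>
restore_cmd() {
  local name=$1 ip=$2 snap=$3
  local cluster="etcd1=https://172.16.10.70:2380,etcd2=https://172.16.10.71:2380,etcd3=https://172.16.10.72:2380"
  printf 'ETCDCTL_API=3 etcdctl'
  printf ' --cacert=%s --cert=%s --key=%s' \
    /etc/kubernetes/ssl/ca.pem /etc/etcd/ssl/etcd.pem /etc/etcd/ssl/etcd-key.pem
  printf ' --name %s --data-dir=/var/lib/etcd' "$name"
  printf ' --initial-cluster %s' "$cluster"
  printf ' --initial-cluster-token k8s-etcd-cluster'
  printf ' --initial-advertise-peer-urls https://%s:2380' "$ip"
  printf ' snapshot restore %s\n' "$snap"
}

# Usage: print the command for each of the three nodes
#   restore_cmd etcd1 172.16.10.70 /tmp/201912-18_snapshot.db
#   restore_cmd etcd2 172.16.10.71 /tmp/201912-18_snapshot.db
#   restore_cmd etcd3 172.16.10.72 /tmp/201912-18_snapshot.db
```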

 

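Restoring from the data directory's db file when no snapshot backup exists

Note: data restored this way may be incomplete. Use this method only in extreme situations; for routine recovery, restore from a snapshot.

  • If the current Etcd cluster has failed and no snapshot backup file exists, data can be restored from the db file in the data directory.

  • A db file copied out of the data directory carries no integrity hash, so the --skip-hash-check=true flag is needed to skip the integrity check.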

ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
--name etcd3 \
--data-dir=/var/lib/etcd \
--initial-cluster etcd1=https://172.16.10.70:2380,etcd2=https://172.16.10.71:2380,etcd3=https://172.16.10.72:2380 \
--initial-cluster-token k8s-etcd-cluster \
--initial-advertise-peer-urls https://172.16.10.72:2380 \
--skip-hash-check=true \
snapshot restore /var/lib/etcd_bak/etcd/member/snap/db

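--name: the name of the current etcd node (not the hostname)
--data-dir: the data directory of the current etcd node
--initial-cluster: the peer URLs of all members in the cluster, e.g. etcd1=https://172.16.10.70:2380,etcd2=https://172.16.10.71:2380,etcd3=https://172.16.10.72:2380
--initial-cluster-token: the token that identifies the cluster to its members
--initial-advertise-peer-urls: the peer URL the current node advertises to the other members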

 

posted @ 2019-12-19 15:48  Whitedba