Changing a Kubernetes master's hostname and node name

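This post records the steps for changing the hostname and node name of a k8s cluster master (control plane). It is a follow-up to the post "Restoring a new cluster from a master server image". The goal is to change the master's hostname and node name from k8s-master0 to kube-master0.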

The server operating system is Ubuntu 18.04 and the Kubernetes version is 1.20.2.

Attempt 1

Change the hostname of the master server

hostnamectl set-hostname kube-master0

Replace the hostname-related settings in /etc/kubernetes/manifests

oldhost=k8s-master0
newhost=kube-master0
cd /etc/kubernetes/manifests
find . -type f | xargs grep "$oldhost"
find . -type f | xargs sed -i "s/$oldhost/$newhost/g"
find . -type f | xargs grep "$newhost"
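The replacement above edits the manifests in place. A slightly more defensive variant is sketched below: it copies each matching file to a backup directory before editing it. The function name `rename_host_in_dir` and the backup path are illustrative assumptions, not part of the original procedure. The backup is kept outside the manifests directory because the kubelet treats the files in that directory as static Pod manifests.

```shell
#!/bin/sh
# Hypothetical helper: replace OLD with NEW in every file under DIR that
# contains OLD, saving an untouched copy of each such file in BACKUP_DIR first.
# OLD is used as a sed pattern, so this assumes a plain hostname
# (letters, digits, hyphens) with no regex metacharacters.
rename_host_in_dir() {
    dir=$1; backup_dir=$2; old=$3; new=$4
    mkdir -p "$backup_dir" || return 1
    # grep -rlF lists the files that contain the fixed string OLD
    grep -rlF -- "$old" "$dir" | while IFS= read -r f; do
        cp -p "$f" "$backup_dir/$(basename "$f")" || exit 1
        sed -i "s/$old/$new/g" "$f"
    done
}

# Example (as root, on the master):
# rename_host_in_dir /etc/kubernetes/manifests /root/manifests.bak k8s-master0 kube-master0
```

If something goes wrong, the original manifests can be copied back from the backup directory.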

Replace the hostname in kubeadm-config

kubectl edit cm kubeadm-config -n kube-system
:%s/k8s-master0/kube-master0

Restart the related services so that the configuration changes take effect

systemctl daemon-reload && systemctl restart kubelet && systemctl restart docker

Enter the etcd container and check whether the member name has been updated

docker exec -it $(docker ps -f name=etcd_etcd -q) /bin/sh
 etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
896d19d1d0a08f49, started, kube-master0, https://10.0.9.171:2380, https://10.0.9.171:2379, false
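To check the name in a script rather than by eye, the third comma-separated field of the `member list` output can be extracted. This is a minimal sketch that assumes the simple output format shown above; `member_names` is a hypothetical helper.

```shell
#!/bin/sh
# Print the member name (third field) of each `etcdctl member list` line on stdin.
member_names() {
    awk -F', ' 'NF >= 3 { print $3 }'
}

# Sample line copied from the output above
printf '%s\n' '896d19d1d0a08f49, started, kube-master0, https://10.0.9.171:2380, https://10.0.9.171:2379, false' | member_names
```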

Check whether the node name has changed

$ kubectl get nodes
NAME          STATUS     ROLES                  AGE     VERSION
k8s-master0   NotReady   control-plane,master   372d    v1.20.2

Unfortunately, it has not.

Attempt 2

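Viewing the node configuration with kubectl edit node k8s-master0 shows three places that still use k8s-master0

  1. metadata -> labels: kubernetes.io/hostname: kube-master0 (can be edited directly)
  2. metadata: name: k8s-master0 (cannot be changed; the error is "error: At least one of apiVersion, kind and name was changed")
  3. status -> addresses: (after editing, the original value is back the next time it is opened)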
  - address: k8s-master0
    type: Hostname

Editing the node configuration did not work.

Attempt 3

Try to use etcdctl to directly modify the configuration data in the etcd database that contains k8s-master0

Set the environment variables for etcdctl

export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
export ETCDCTL_ENDPOINTS=10.0.9.171:2379

Export all configuration

etcdctl get "" --prefix  -w json > etcd-kv.json

From etcd-kv.json, export all keys that contain k8s-master0

jq -r '.kvs[].key' etcd-kv.json | while read -r k; do echo "$k" | base64 --decode; echo; done | grep k8s-master0 > kv_k8s-master0.txt

The exported result is as follows

/registry/crd.projectcalico.org/blockaffinities/k8s-master0-192-168-70-128-26
/registry/crd.projectcalico.org/ipamhandles/ipip-tunnel-addr-k8s-master0
/registry/csinodes/k8s-master0
/registry/events/default/k8s-master0.165a969b97e7c4ea
...
/registry/events/kube-system/etcd-k8s-master0.165a984e78509ebd
...
/registry/events/kube-system/kube-apiserver-k8s-master0.165a96905a9bf40c
...
/registry/events/kube-system/kube-controller-manager-k8s-master0.165a7016cd8a6ca9
...
/registry/events/kube-system/kube-scheduler-k8s-master0.165a7016cead2a32
...
/registry/leases/kube-node-lease/k8s-master0
/registry/minions/k8s-master0
/registry/pods/kube-system/etcd-k8s-master0
/registry/pods/kube-system/kube-apiserver-k8s-master0
/registry/pods/kube-system/kube-controller-manager-k8s-master0
/registry/pods/kube-system/kube-scheduler-k8s-master0
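The same list can probably be produced without the JSON and base64 round trip, because etcdctl (v3 API) has a `--keys-only` option that prints keys as plain text. The sketch below relies on that option and assumes the `ETCDCTL_*` variables set earlier are exported. The function name and the `ETCDCTL` override variable are illustrative, not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: list the etcd keys that contain a given string.
# ETCDCTL may name another command; it defaults to etcdctl on PATH.
list_keys_containing() {
    "${ETCDCTL:-etcdctl}" get "" --prefix --keys-only | grep -F -- "$1"
}

# Example:
# list_keys_containing k8s-master0 > kv_k8s-master0.txt
```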

Use the commands below to copy the value of /registry/minions/k8s-master0 to a new key that uses the new name

key=/registry/minions/k8s-master0
etcdctl get $key --print-value-only > kv-temp.txt
sed -i "s/k8s-master0/kube-master0/" kv-temp.txt
cat kv-temp.txt | etcdctl put `echo $key | sed "s/k8s-master0/kube-master0/"`

After adding it, running kubectl get nodes fails with

Error from server: proto: Unknown: illegal tag 0 (wire type 0)

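Adding the -w fields option to etcdctl got rid of the error above, but the attempt to make the change through etcdctl still failed. For details see the question at https://q.cnblogs.com/q/133164/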
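One plausible contributing factor, stated here as an assumption rather than a confirmed diagnosis: Kubernetes stores built-in resources in etcd in a binary protobuf encoding, in which strings carry a length prefix, and the old and new names differ in length (11 versus 12 characters). A text-level sed on such a value changes the payload without updating the length. The toy below is not the protobuf wire format; the record layout and function names are invented purely to illustrate the mismatch.

```shell
#!/bin/sh
# Toy length-prefixed record "<length>:<payload>".
# This is NOT the protobuf wire format, only an illustration of the idea.
make_record() {
    printf '%s:%s' "${#1}" "$1"
}

# Succeeds only if the declared length matches the actual payload length.
check_record() {
    declared=${1%%:*}
    payload=${1#*:}
    [ "$declared" -eq "${#payload}" ]
}

rec=$(make_record k8s-master0)                                    # 11:k8s-master0
edited=$(printf '%s' "$rec" | sed 's/k8s-master0/kube-master0/')  # 11:kube-master0

check_record "$rec"    && echo "original record: consistent"
check_record "$edited" || echo "edited record: declared length no longer matches"
```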

Attempt 4

Export the node configuration of k8s-master0

kubectl get node k8s-master0 -o yaml > kube-master0.yml

Replace k8s-master0 with kube-master0 in the configuration file

sed -i "s/k8s-master0/kube-master0/" kube-master0.yml

Change the hostname of the host to kube-master0

hostnamectl set-hostname kube-master0

Replace the hostname-related settings in /etc/kubernetes/manifests

oldhost=k8s-master0
newhost=kube-master0
cd /etc/kubernetes/manifests
find . -type f | xargs sed -i "s/$oldhost/$newhost/g"

Delete /registry/minions/k8s-master0 from etcd with etcdctl

etcdctl del /registry/minions/k8s-master0

Deploy the kube-master0 node from the configuration file exported and modified earlier

kubectl apply -f kube-master0.yml

After all this, kube-master0 shows up in the kubectl get nodes list, but in the NotReady state

NAME           STATUS     ROLES                  AGE     VERSION
kube-master0   NotReady   control-plane,master   97m     v1.20.2

One of the errors in syslog

Jan 20 18:20:27 kube-master0 kubelet[23220]: E0120 18:20:27.460470 23220 controller.go:144] failed to ensure lease exists, will retry in 7s, error: leases.coordination.k8s.io "kube-master0" is forbidden: User "system:node:k8s-master0" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease": can only access node lease with the same name as the requesting node

The User "system:node:k8s-master0" in the log shows that the node's user name has not changed yet. Look at /etc/kubernetes/kubelet.conf

users:
- name: default-auth
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem

The user information comes from the certificate file kubelet-client-current.pem in /var/lib/kubelet/pki/. Use the openssl command to view the common name (CN) bound to the certificate

$ openssl x509 -noout -subject -in kubelet-client-current.pem                                                                                                  
subject=O = system:nodes, CN = system:node:k8s-master0

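So the certificate is still the one from before the rename, and a new certificate has to be generated for the node's kubelet under the new hostname.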
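This check can be scripted by pulling the node name out of the subject line and comparing it with the expected name. The sketch below works on the subject text only; `node_name_from_subject` is a hypothetical helper, and the expected name is hard-coded here for illustration.

```shell
#!/bin/sh
# Extract the node name from an `openssl x509 -noout -subject` line on stdin.
# Accepts both "CN = system:node:NAME" and the older "/CN=system:node:NAME" form.
node_name_from_subject() {
    sed -n 's/.*CN *= *system:node:\([^,/]*\).*/\1/p'
}

# Subject line copied from the output above
subject='subject=O = system:nodes, CN = system:node:k8s-master0'
cert_node=$(printf '%s\n' "$subject" | node_name_from_subject)
expected=kube-master0   # the new hostname; on the host this could be $(hostname)

if [ "$cert_node" != "$expected" ]; then
    echo "certificate is for $cert_node, expected $expected"
fi
```

On the master itself, the subject line would come from `openssl x509 -noout -subject -in /var/lib/kubelet/pki/kubelet-client-current.pem` instead of the hard-coded string.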

After a good deal of trial and error, the following kubeadm command took care of it easily:

kubeadm init phase kubeconfig kubelet

After running the command above to regenerate the certificate, the users section of /etc/kubernetes/kubelet.conf becomes:

users:
- name: system:node:kube-master0
  user:
    client-certificate-data: 
    ***...
    client-key-data: 
    ***...

Restart kubelet

systemctl restart kubelet

Done at last!

$ kubectl get nodes                                               
NAME           STATUS   ROLES                  AGE     VERSION
kube-master0   Ready    control-plane,master   18h     v1.20.2

Addendum, May 21, 2022: the certificate files in /etc/kubernetes/pki/etcd, except for ca.crt and ca.key, also need to be deleted and then regenerated with the commands below

kubeadm init phase certs etcd-server
kubeadm init phase certs etcd-peer
kubeadm init phase certs etcd-healthcheck-client
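Since deleting the old certificates cannot be undone, a cautious option is to move them aside before running the three commands above. The sketch below does that; `stash_etcd_certs` and the backup path are hypothetical names, not part of the original procedure. It moves every regular file except ca.crt and ca.key out of the directory.

```shell
#!/bin/sh
# Hypothetical helper: move all files except ca.crt and ca.key
# from PKI_DIR to BACKUP_DIR, so the etcd CA stays in place.
stash_etcd_certs() {
    pki_dir=$1; backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$pki_dir"/*; do
        [ -f "$f" ] || continue
        case $(basename "$f") in
            ca.crt|ca.key) ;;                        # keep the etcd CA
            *) mv "$f" "$backup_dir"/ || return 1 ;;
        esac
    done
}

# Example (as root, before regenerating the certificates):
# stash_etcd_certs /etc/kubernetes/pki/etcd /root/etcd-pki.bak
```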
posted @ 2021-01-21 11:03  dudu