k8s troubleshooting summary

Fault 1:

Check the etcd and kube-apiserver containers:

[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep api
7b37e91782436       9dc6939e7c573       2 minutes ago       Exited              kube-apiserver            7                   65ca6cbe79865       kube-apiserver-master

[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep et
cfb7a31a20dcb       73deb9a3f7025       2 minutes ago       Running             etcd                      7                   8b2460aa74ca9       etcd-master
9222b51368b07       73deb9a3f7025       6 minutes ago       Exited              etcd                      6                   8b2460aa74ca9       etcd-master
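The listings above can be filtered down to just the containers that have exited. This is a minimal sketch, not part of the original session. The function name exited_containers is made up for illustration, and it assumes the column layout shown above, where the last five fields are STATE, NAME, ATTEMPT, POD ID and POD.

```shell
# Print "<name> <container-id>" for every container whose STATE is Exited.
# Reads `crictl ps -a` output on stdin.
# Assumption: the last five whitespace-separated fields are
# STATE NAME ATTEMPT POD-ID POD, as in the listings above. Counting from
# the end keeps this working when CREATED is "2 minutes ago" or longer.
exited_containers() {
  awk 'NF >= 7 && $(NF-4) == "Exited" { print $(NF-3), $1 }'
}
```

It would be used as: crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a | exited_containers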

Check the corresponding logs, starting with the exited etcd container:

[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock logs -f 9222b51368b07

The log pointed to a "space" problem, i.e. insufficient disk space. After cleaning up the disk, restart the service:

[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock stop 7b37e91782436
7b37e91782436
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock start 7b37e91782436
7b37e91782436
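Since the root cause here was a full disk, it may be worth confirming that the cleanup actually freed space before relying on the restart. The sketch below is an illustration only. The function name full_filesystems and the 90% default threshold are assumptions, not values from the original notes.

```shell
# Print "<mount point> <usage>" for filesystems at or above a usage threshold.
# Reads `df -P` output on stdin; -P selects the POSIX format, one line per
# filesystem: Filesystem 1024-blocks Used Available Capacity Mounted-on.
# The threshold is a percentage and defaults to 90 (an arbitrary choice).
# Note: mount points containing spaces are not handled by this sketch.
full_filesystems() {
  awk -v t="${1:-90}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)              # strip the trailing percent sign
    if (use + 0 >= t + 0) print $6, $5
  }'
}
```

It would be used as: df -P | full_filesystems 90. No output means every filesystem is below the threshold.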

Check the status of both services:

[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep api
286d64ddde3f7       9dc6939e7c573       4 minutes ago       Running             kube-apiserver            8                   65ca6cbe79865       kube-apiserver-master
7b37e91782436       9dc6939e7c573       8 minutes ago       Exited              kube-apiserver            7                   65ca6cbe79865       kube-apiserver-master
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep etc
cfb7a31a20dcb       73deb9a3f7025       8 minutes ago       Running             etcd                      7                   8b2460aa74ca9       etcd-master
9222b51368b07       73deb9a3f7025       12 minutes ago      Exited              etcd                      6                   8b2460aa74ca9       etcd-master

Once both services were running again, the k8s cluster was back to normal:

[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   11d   v1.28.2
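The same check can be scripted so that it flags any node that is not Ready. This is a small sketch under the assumption that the default kubectl get nodes table format shown above is used. The function name not_ready_nodes is hypothetical.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes` output on stdin and skips the header line.
# Note: a cordoned node shows "Ready,SchedulingDisabled" and is reported too.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

It would be used as: kubectl get nodes | not_ready_nodes. Empty output means every node reports Ready.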

 

The fault was as follows:

I suspected that the cri-docker service had gone down, so I started investigating:

[root@master ~]# systemctl status cri-docker
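When several units are under suspicion, the status check can be condensed into a one-line-per-unit report. This is a sketch only. The function name report_units is made up, and any unit names passed to it other than cri-docker are assumptions about the host.

```shell
# Report whether each given systemd unit is active.
# `systemctl is-active --quiet UNIT` prints nothing and exits 0 only when
# the unit is active, so the exit status alone is enough here.
report_units() {
  for unit in "$@"; do
    if systemctl is-active --quiet "$unit"; then
      echo "$unit: active"
    else
      echo "$unit: NOT active"
    fi
  done
}
```

It would be used as: report_units cri-docker kubelet (kubelet is included as an example and may not match the unit names on a given host).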

Check the status of each k8s cluster component:

[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a
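Every crictl call in these notes repeats the --runtime-endpoint flag. crictl also reads the endpoint from the CONTAINER_RUNTIME_ENDPOINT environment variable, so a sketch like the following, reusing the socket path from the commands above, should shorten the later commands in the same shell session.

```shell
# crictl falls back to this environment variable when --runtime-endpoint
# is not given on the command line. The socket path is the cri-dockerd
# socket used throughout these notes.
export CONTAINER_RUNTIME_ENDPOINT=unix:///var/run/cri-dockerd.sock
```

After this, crictl ps -a can be run without the flag. The setting lasts only for the current shell unless it is added to a profile file.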

The key services, etcd and kube-apiserver, were running normally, so the k8s cluster itself was healthy.

 

Posted on 2025-06-27 22:30 by wadeson