k8s Troubleshooting Summary
Fault 1:

Check the etcd and kube-apiserver containers:
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep api
7b37e91782436   9dc6939e7c573   2 minutes ago   Exited    kube-apiserver   7   65ca6cbe79865   kube-apiserver-master
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep et
cfb7a31a20dcb   73deb9a3f7025   2 minutes ago   Running   etcd             7   8b2460aa74ca9   etcd-master
9222b51368b07   73deb9a3f7025   6 minutes ago   Exited    etcd             6   8b2460aa74ca9   etcd-master
Check the corresponding logs, starting with the exited etcd container:
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock logs -f 9222b51368b07

The log shows that the disk has run out of space. After cleaning up the disk, restart the service:
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock stop 7b37e91782436
7b37e91782436
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock start 7b37e91782436
7b37e91782436
Check the status of both services:
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep api
286d64ddde3f7   9dc6939e7c573   4 minutes ago    Running   kube-apiserver   8   65ca6cbe79865   kube-apiserver-master
7b37e91782436   9dc6939e7c573   8 minutes ago    Exited    kube-apiserver   7   65ca6cbe79865   kube-apiserver-master
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a|grep etc
cfb7a31a20dcb   73deb9a3f7025   8 minutes ago    Running   etcd             7   8b2460aa74ca9   etcd-master
9222b51368b07   73deb9a3f7025   12 minutes ago   Exited    etcd             6   8b2460aa74ca9   etcd-master
Once both services are running again, the k8s cluster is back to normal:
[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   11d   v1.28.2
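Since Fault 1 came down to a full disk, it can help to check disk usage before etcd stops. The following is only a minimal sketch under stated assumptions, not the procedure used above: the function names disk_usage_pct and check_disk are hypothetical, and the path and the 80% threshold in the usage line are illustrative choices (/var/lib/etcd is the default etcd data directory on kubeadm clusters; adjust it if yours differs).

```shell
#!/bin/sh
# Hypothetical helper: print the usage percentage (without the % sign)
# of the filesystem that holds the given path, using POSIX "df -P".
disk_usage_pct() {
    df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Hypothetical helper: warn when usage reaches the threshold.
# Returns 0 when below the threshold, 1 when at or above it,
# and 2 when the usage could not be determined.
check_disk() {
    path=$1
    threshold=$2
    pct=$(disk_usage_pct "$path")
    [ -n "$pct" ] || return 2
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARN: $path is ${pct}% full (threshold ${threshold}%)"
        return 1
    fi
    echo "OK: $path is ${pct}% full"
}

# Example usage (assumed path and threshold):
#   check_disk /var/lib/etcd 80
```

Run from cron or a monitoring agent, a non-zero exit status from check_disk could trigger a cleanup or an alert before the control-plane containers start exiting.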
The fault is as follows:

I suspected that the cri-docker service had gone down, so I started investigating:
[root@master ~]# systemctl status cri-docker

Check the status of each k8s cluster component:
[root@master ~]# crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a

The key services, etcd and kube-apiserver, are running normally, so the k8s cluster itself is healthy.
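Reading the state of etcd and kube-apiserver out of a long "crictl ps -a" listing by eye is error-prone, because exited earlier attempts are listed alongside the current container. The sketch below filters for the most recent attempt. It is an illustration only: latest_state is a hypothetical helper, and it assumes the column layout shown in the listings above (STATE, NAME, ATTEMPT, POD ID, POD as the last five columns); other crictl versions may print additional columns.

```shell
#!/bin/sh
# Hypothetical helper: read "crictl ps -a" output on stdin and print the
# STATE of the container called $1 that has the highest ATTEMPT number.
# Fields are counted from the right because the CREATED column
# ("2 minutes ago", "About a minute ago") has a variable word count.
# Exits with status 1 when no container with that name is listed.
latest_state() {
    awk -v name="$1" '
        NF >= 8 && $(NF-3) == name && ($(NF-2) + 0) >= max {
            max = $(NF-2) + 0
            state = $(NF-4)
        }
        END { if (state == "") exit 1; print state }
    '
}

# Example usage (endpoint taken from the commands above):
#   crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a | latest_state etcd
#   crictl --runtime-endpoint=unix:///var/run/cri-dockerd.sock ps -a | latest_state kube-apiserver
```

If both calls report Running, the control plane containers are up and the problem is more likely elsewhere.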
