9. Common K8s Cluster Errors, Part 3
51. Metrics-server component not working properly
(1). Error message
[root@master231 02-metrics-server]# kubectl get hpa
NAME            REFERENCE                  TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
deploy-stress   Deployment/deploy-stress   <unknown>/95%   2         5         0          3s
(2). Cause
If the target stays in the "<unknown>" state, the target is not being monitored, or metrics-server itself is not working properly.
(3). Solution
- 1. Check that the metrics-server version matches the K8s version and verify the component works (kubectl top);
- 2. Check whether the monitored target (Deployment/deploy-stress) has been removed.
A verification sketch follows this list.
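A hedged verification sketch (assuming metrics-server is deployed in kube-system with the standard k8s-app=metrics-server label; adjust names to your environment):
[root@master231 ~]# kubectl -n kube-system get pods -l k8s-app=metrics-server   # is the component Running?
[root@master231 ~]# kubectl top nodes                                           # only succeeds if metrics-server serves metrics
[root@master231 ~]# kubectl top pods
[root@master231 ~]# kubectl get deployment deploy-stress                        # does the HPA target still exist?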
52. Kubernetes dashboard service account lacks RBAC permissions
(1). Error message
replicasets.apps is forbidden: User "system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard" cannot list resource "replicasets" in API group "apps" in the namespace "default"
deployments.apps is forbidden: User "system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard" cannot list resource "deployments" in API group "apps" in the namespace "default"
statefulsets.apps is forbidden: User "system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard" cannot list resource "statefulsets" in API group "apps" in the namespace "default"
(2). Cause
The service account has no permission to access the K8s cluster resources.
(3). Solution
Configure RBAC authorization for the service account, e.g. as sketched below.
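A minimal RBAC sketch, assuming the built-in read-only "view" ClusterRole is sufficient (bind "cluster-admin" instead only if full access is really intended):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view                      # built-in aggregate read-only role
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard      # the service account from the error message
  namespace: kubernetes-dashboard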
53. PVC cannot bind to a PV or SC
(1). Error message
[root@master231 persistentvolumeclaims]# kubectl get pvc oldboyedu-linux-pvc-2
NAME                    STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
oldboyedu-linux-pvc-2   Pending                                                     30s
[root@master231 persistentvolumeclaims]# kubectl describe pvc oldboyedu-linux-pvc-2
Name:          oldboyedu-linux-pvc-2
Namespace:     default
StorageClass:
Status:        Pending
...
Events:
  Type    Reason         Age                From                         Message
  ----    ------         ----               ----                         -------
  Normal  FailedBinding  13s (x3 over 32s)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set
(2). Cause
The PVC cannot bind to a PV or SC: no resource currently in the cluster satisfies the PVC's request.
(3). Solution
- Lower the PVC's requested storage (not recommended);
- Create a matching PV or SC, e.g. as sketched after this list.
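A hedged PV sketch (the NFS server address and path are illustrative; capacity and access modes must satisfy the PVC's request):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: oldboyedu-linux-pv-2
spec:
  capacity:
    storage: 5Gi            # must be >= the PVC's requested size
  accessModes:
  - ReadWriteMany           # must cover the PVC's access mode
  nfs:
    server: 10.0.0.231      # hypothetical NFS server
    path: /oldboyedu/data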
54. PVC cannot be deleted while still referenced
(1). Error message
[root@master231 ~]# kubectl get pvc oldboyedu-linux-pvc
NAME                  STATUS        VOLUME                 CAPACITY   ACCESS MODES   STORAGECLASS   AGE
oldboyedu-linux-pvc   Terminating   oldboyedu-linux-pv02   5Gi        RWX                           55m
[root@master231 ~]# kubectl describe pvc oldboyedu-linux-pvc
Name:          oldboyedu-linux-pvc
Namespace:     default
StorageClass:
Status:        Terminating (lasts 5m2s)
Volume:        oldboyedu-linux-pv02
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWX
VolumeMode:    Filesystem
Used By:       deploy-xiuxian-pvc-6564dd4856-t52rq
Events:        <none>
(2). Cause
The "Used By" field shows that the PVC is still in use by the pod "deploy-xiuxian-pvc-6564dd4856-t52rq".
(3). Solution
Delete the pod, or remove the pod's reference to the PVC, as sketched below.
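A hedged example, assuming the pod is managed by a Deployment named "deploy-xiuxian-pvc" (deleting only the bare pod would just get it recreated by the controller):
[root@master231 ~]# kubectl delete deployment deploy-xiuxian-pvc    # or edit it to drop the PVC volume
[root@master231 ~]# kubectl get pvc oldboyedu-linux-pvc             # the Terminating PVC now disappears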
55. Pod cannot find its PVC
(1). Error message
[root@master231 deployments]# kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
deploy-xiuxian-pvc-6564dd4856-86kds   0/1     Pending   0          41s   <none>   <none>   <none>           <none>
[root@master231 deployments]# kubectl describe pod deploy-xiuxian-pvc-6564dd4856-86kds
Name:         deploy-xiuxian-pvc-6564dd4856-86kds
Namespace:    default
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  42s   default-scheduler  0/3 nodes are available: 3 persistentvolumeclaim "oldboyedu-linux-pvc" not found.
(2). Cause
The pod references a PVC that cannot be found.
(3). Solution
- 1. Create the PVC manually (see the sketch after this list);
- 2. Or point the pod at an existing PVC.
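A minimal PVC sketch matching the name the pod expects (size and access mode are illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: oldboyedu-linux-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi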
56. SC reclaim policy does not support in-place updates
(1). Error message
[root@master231 storageclasses]# kubectl apply -f sc.yaml
storageclass.storage.k8s.io/oldboyedu-sc-xixi unchanged
The StorageClass "oldboyedu-sc-haha" is invalid: reclaimPolicy: Forbidden: updates to reclaimPolicy are forbidden.
(2). Cause
The reclaimPolicy of a StorageClass is immutable and cannot be updated in place.
(3). Solution
Delete the SC and recreate it, as shown below.
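For example, assuming the SC definition lives in sc.yaml:
[root@master231 storageclasses]# kubectl delete sc oldboyedu-sc-haha
[root@master231 storageclasses]# kubectl apply -f sc.yaml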
57. SC does not support the archiveOnDelete parameter
(1). Error message
[root@master231 persistentvolumeclaims]# kubectl get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
oldboyedu-linux-pvc   Bound     pvc-f47d1e5b-a2f1-463f-b06c-7940add76104   3Gi        RWX            nfs-csi             3h12m
pvc-linux94           Pending                                                                        oldboyedu-sc-haha   7s
[root@master231 persistentvolumeclaims]# kubectl describe pvc pvc-linux94
Name:          pvc-linux94
Namespace:     default
StorageClass:  oldboyedu-sc-haha
Status:        Pending
...
Events:
  Type     Reason                Age               From                                                           Message
  ----     ------                ----              ----                                                           -------
  Normal   ExternalProvisioning  8s (x2 over 12s)  persistentvolume-controller                                    waiting for a volume to be created, either by external provisioner "nfs.csi.k8s.io" or manually created by system administrator
  Normal   Provisioning          5s (x4 over 12s)  nfs.csi.k8s.io_worker233_d8fe1cb1-595b-4208-95c1-8599785034ad  External provisioner is provisioning volume for claim "default/pvc-linux94"
  Warning  ProvisioningFailed    5s (x4 over 12s)  nfs.csi.k8s.io_worker233_d8fe1cb1-595b-4208-95c1-8599785034ad  failed to provision volume with StorageClass "oldboyedu-sc-haha": rpc error: code = InvalidArgument desc = invalid parameter "archiveOnDelete" in storage class
(2). Cause
The newer csi-driver-nfs release (v4.9.0) no longer supports the "archiveOnDelete" parameter; see the official docs.
(3). Solution
Per the documentation linked below, the "archiveOnDelete" parameter has been renamed to "onDelete"; a sketch follows.
Reference: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/docs/driver-parameters.md#storage-class-usage-dynamic-provisioning
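A hedged StorageClass sketch using the renamed parameter (server and share values are illustrative; see the linked driver docs for the accepted onDelete values):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: oldboyedu-sc-haha
provisioner: nfs.csi.k8s.io
parameters:
  server: 10.0.0.231        # hypothetical NFS server
  share: /oldboyedu/data
  onDelete: archive         # replaces the removed archiveOnDelete parameter
reclaimPolicy: Delete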
58. Unsupported SC reclaim policy value
(1). Error message
[root@master231 storageclasses]# kubectl apply -f sc.yaml
storageclass.storage.k8s.io/oldboyedu-sc-xixi created
The StorageClass "oldboyedu-sc-haha" is invalid: reclaimPolicy: Unsupported value: "archive": supported values: "Delete", "Retain"
(2). Cause
The specified reclaim policy "archive" is not valid; the only supported reclaimPolicy values are "Delete" and "Retain". ("archive" belongs to the driver-level onDelete parameter from item 57, not to reclaimPolicy.)
(3). Solution
Change the reclaim policy to one of the supported values, as the error message suggests.
59. YAML parsing error
(1). Error message
[root@master231 04-helm]# helm -n oldboyedu-helm install xiuxian oldboyedu-xiuxian-quote
Error: INSTALLATION FAILED: 1 error occurred:
	* Deployment in version "v1" cannot be handled as a Deployment: json: cannot unmarshal number into Go struct field LabelSelector.spec.selector.matchLabels of type string
(2). Cause
The Deployment field "spec.selector.matchLabels" requires string values, but a non-string value (here a number) was supplied.
(3). Solution
In a Helm chart, wrap the value with the quote or squote template function, as sketched below.
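A hedged template fragment (assuming a numeric value such as .Values.version ends up in matchLabels; quote renders it as a YAML string):
# templates/deploy.yaml (fragment)
spec:
  selector:
    matchLabels:
      version: {{ .Values.version | quote }}   # renders 3 as "3"
squote works the same way but produces single quotes.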
60. No local Helm repository
(1). Error message
[root@master231 04-helm]# helm repo list
Error: no repositories to show
(2). Cause
No Helm repository has been added locally; a third-party repository must be added manually.
(3). Solution
Add a well-known, trusted repository, for example as shown below.
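For example, the widely used Bitnami repository:
[root@master231 04-helm]# helm repo add bitnami https://charts.bitnami.com/bitnami
[root@master231 04-helm]# helm repo update
[root@master231 04-helm]# helm repo list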
61. Wrong API version in resource manifest
(1). Error message
[root@master231 04-helm]# helm install es-exporter elasticsearch-exporter
Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: resource mapping not found for name: "es-exporter-elasticsearch-exporter" namespace: "" from "": no matches for kind "Deployment" in version "apps/v1beta2"
ensure CRDs are installed first
(2). Cause
The manifest declares an apiVersion ("apps/v1beta2") that the cluster no longer serves for kind Deployment.
(3). Solution
Update the resource manifest to a supported API version, as sketched below.
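A hedged before/after sketch (apps/v1 is the stable Deployment API that replaced the removed apps/v1beta2):
# before (API no longer served)
apiVersion: apps/v1beta2
kind: Deployment
# after
apiVersion: apps/v1
kind: Deployment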
62. Chart version not defined
(1). Error message
[root@master231 13-oldboyedu-xiuxian-package]# helm package .
Error: validation: chart.metadata.version is required
(2). Cause
The chart's version is not defined.
(3). Solution
Check the chart's Chart.yaml file and make sure the version field is defined (see the minimal Chart.yaml sketch under item 63).
63. Chart name not defined
(1). Error message
[root@master231 13-oldboyedu-xiuxian-package]# helm package .
Error: validation: chart.metadata.name is required
(2). Cause
The chart's name is not defined.
(3). Solution
Check the chart's Chart.yaml file and make sure the name field is defined; a minimal sketch follows.
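A minimal Chart.yaml sketch covering the required fields from items 62 and 63 (values are illustrative):
apiVersion: v2
name: oldboyedu-apps    # fixes item 63
version: v1             # fixes item 62
description: A demo chart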
64. Helm does not trust the Harbor certificate
(1). Error message
[root@master231 13-oldboyedu-xiuxian-package]# helm push oldboyedu-apps-v1.tgz oci://harbor.oldboyedu.com/oldboyedu-helm
Error: failed to do request: Head "https://harbor.oldboyedu.com/v2/oldboyedu-helm/oldboyedu-apps/blobs/sha256:846795cdbf1cb14ec33d7c87b50795e59b17347e218c077d3fad66e8321f46c1": tls: failed to verify certificate: x509: certificate signed by unknown authority
(2). Cause
Helm does not recognize (trust) the Harbor certificate.
(3). Solution
Check the helm help output for the certificate-related flags. Example:
[root@master231 13-oldboyedu-xiuxian-package]# helm push oldboyedu-apps-v1.tgz oci://harbor.oldboyedu.com/oldboyedu-helm --ca-file /etc/docker/certs.d/harbor.oldboyedu.com/ca.crt --cert-file /etc/docker/certs.d/harbor.oldboyedu.com/harbor.oldboyedu.com.cert
Pushed: harbor.oldboyedu.com/oldboyedu-helm/oldboyedu-apps:v1
Digest: sha256:34a8cd21c9bc7a3c6361aa13768e3a1d5780ef7d1e64617c8b7fda4fb3d040dc
65. Helm release name mismatch (invalid ownership metadata)
(1). Error message
[root@master231 kubeapps-12.2.10]# helm -n oldboyedu-helm install oldboyedu-kubeapps kubeapps
Error: INSTALLATION FAILED: Unable to continue with install: AppRepository "bitnami" in namespace "oldboyedu-helm" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "oldboyedu-kubeapps": current value is "myapps"
(2). Cause
Most likely a previous kubeapps deployment left objects behind: the AppRepository "bitnami" still carries the ownership annotations of the old release "myapps".
(3). Solution
- 1. Wait a while and retry;
- 2. Or reinstall under the release name reported in the error ("myapps");
- 3. Or try a fresh namespace.
A cleanup sketch follows this list.
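A hedged cleanup sketch (assuming the kubeapps AppRepository custom resource; confirm the exact resource name with "kubectl api-resources | grep -i apprepo" first):
[root@master231 kubeapps-12.2.10]# kubectl -n oldboyedu-helm get apprepositories
[root@master231 kubeapps-12.2.10]# kubectl -n oldboyedu-helm delete apprepository bitnami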
66. Error caused by admissionWebhooks enabled by default
(1). Error message
[root@master231 ingresses]# kubectl apply -f 02-ingress-xiuxian.yaml
Error from server (InternalError): error when creating "02-ingress-xiuxian.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": failed to call webhook: Post "https://myingress-ingress-nginx-controller-admission.yinzhengjie-ingress.svc:443/networking/v1/ingresses?timeout=10s": x509: certificate is not valid for any names, but wanted to match myingress-ingress-nginx-controller-admission.yinzhengjie-ingress.svc
(2). Cause
The ingress-nginx controller enables its validating admissionWebhooks by default, and here the webhook's certificate does not match its service name.
(3). Solution
Disable the admissionWebhooks, for example as sketched below.
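A hedged example for a Helm-based ingress-nginx install (controller.admissionWebhooks.enabled is a documented chart value; the release and namespace names follow the error message above):
[root@master231 ingresses]# helm -n yinzhengjie-ingress upgrade myingress ingress-nginx/ingress-nginx --set controller.admissionWebhooks.enabled=false
Deleting the stale webhook registration also unblocks the apply (the object name is an assumption based on the release name):
[root@master231 ingresses]# kubectl delete validatingwebhookconfigurations myingress-ingress-nginx-admission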
67. Endpoints not found for the svc
(1). Error message
[root@master231 ingresses]# kubectl describe -f 03-ingress-redirect.yaml
Name:             apps-redirect
Labels:           <none>
Namespace:        default
Address:          10.0.0.150
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host                Path  Backends
  ----                ----  --------
  blog.oldboyedu.com  /     svc-apps:80 (<error: endpoints "svc-apps" not found>)
Annotations:          nginx.ingress.kubernetes.io/permanent-redirect: https://www.cnblogs.com/yinzhengjie
                      nginx.ingress.kubernetes.io/permanent-redirect-code: 308
Events:
  Type    Reason  Age              From                      Message
  ----    ------  ----             ----                      -------
  Normal  Sync    3s (x2 over 7s)  nginx-ingress-controller  Scheduled for sync
  Normal  Sync    3s (x2 over 7s)  nginx-ingress-controller  Scheduled for sync
(2). Cause
The message "error: endpoints "svc-apps" not found" shows that the backend svc behind the endpoints does not exist.
(3). Solution
Check the environment, verify whether the svc exists, and correct the configuration.
68. Tab characters in YAML
(1). Error message
[root@master231 04-kuboard]# docker-compose up -d
parsing /root/cloud-computing-stack/linux94/kubernetes/projects/04-kuboard/docker-compose.yaml: yaml: line 7: found a tab character that violates indentation
(2). Cause
YAML forbids tab characters in indentation; "cat -A" can be used to reveal them.
(3). Solution
Verify with "cat -A" and replace the offending tabs, as shown below.
69. Ingress cannot find the svc in the same namespace
(1). Error message
[root@master231 05-prometheus]# kubectl describe -f 01-ingress-prometheus.yaml
Name:             ing-prometheus-grafana
Labels:           <none>
Namespace:        default
Address:
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host                   Path  Backends
  ----                   ----  --------
  grafana.oldboyedu.com  /     grafana:3000 (<error: endpoints "grafana" not found>)
  prom.oldboyedu.com     /     prometheus-k8s:9090 (<error: endpoints "prometheus-k8s" not found>)
Annotations:             <none>
Events:
  Type    Reason  Age   From                      Message
  ----    ------  ----  ----                      -------
  Normal  Sync    4s    nginx-ingress-controller  Scheduled for sync
  Normal  Sync    4s    nginx-ingress-controller  Scheduled for sync
(2). Cause
An Ingress can only reference services in its own namespace; this Ingress sits in "default" while the grafana and prometheus-k8s services live elsewhere.
(3). Solution
Put the Ingress and its svc in the same namespace.
70.Could not get lock /var/lib/dpkg/lock-frontend
(1). Error message
[root@node-exporter41 ~]# apt -y install ipvsadm ipset sysstat conntrack
Waiting for cache lock: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 2829 (unattended-upgr)
(2). Cause
dpkg is locked by another package operation (here the unattended-upgrades process, PID 2829).
(3). Solution
Kill the process holding the lock (or simply wait for it to finish), as shown below.
71. Missing kubeconfig authentication file
(1). Error message
[root@node-exporter42 ~]# kubectl get nodes
E1202 16:02:28.074900   13882 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp 127.0.0.1:8080: connect: connection refused"
... (the same error repeats several times) ...
The connection to the server localhost:8080 was refused - did you specify the right host or port?
(2). Cause
No kubeconfig authentication file is present, so kubectl falls back to the default localhost:8080 endpoint.
(3). Solution
Configure kubeconfig authentication as covered in the K8s kubeconfig course notes; a common sketch follows.
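A hedged sketch (assuming an admin kubeconfig exists on a master node; the source path /etc/kubernetes/admin.conf is the kubeadm convention and may differ in a binary deployment):
[root@node-exporter42 ~]# mkdir -p $HOME/.kube
[root@node-exporter42 ~]# scp root@master-node:/etc/kubernetes/admin.conf $HOME/.kube/config   # hypothetical master hostname
[root@node-exporter42 ~]# kubectl get nodes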
72. CoreDNS local loop problem
(1). Error message
[FATAL] plugin/loop: Loop (127.0.0.1:36030 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 8244365230594049349.2552766472385065880."
(2). Cause
A DNS resolution loop between the node's local resolver and the CoreDNS pods: the node's /etc/resolv.conf points back at localhost, so CoreDNS ends up forwarding queries to itself.
Reference: https://coredns.io/plugins/loop#troubleshooting
(3). Solution
Editing the local "/etc/resolv.conf" does not help: the change gets overwritten. Instead, define a separate resolver file for kubelet.
1. On all nodes, add an upstream resolver record:
echo "nameserver 223.5.5.5" > /etc/kubernetes/resolv.conf
2. On all nodes, point the kubelet configuration at it:
# vim /etc/kubernetes/kubelet-conf.yml
...
resolvConf: /etc/kubernetes/resolv.conf
3. On all nodes, restart the kubelet component:
systemctl daemon-reload
systemctl restart kubelet
4. Verify the DNS component works:
[root@node-exporter41 ~]# kubectl get svc,pods -n kube-system
NAME               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE
service/coredns    ClusterIP   10.200.0.254   <none>        53/UDP,53/TCP,9153/TCP   14h

NAME                           READY   STATUS    RESTARTS   AGE
pod/coredns-859664f9d8-2fl7l   1/1     Running   0          89s
pod/coredns-859664f9d8-stdbs   1/1     Running   0          89s

[root@node-exporter41 ~]# kubectl get svc -A
NAMESPACE          NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
calico-apiserver   calico-api                        ClusterIP   10.200.93.100    <none>        443/TCP                  16h
calico-system      calico-kube-controllers-metrics   ClusterIP   None             <none>        9094/TCP                 15h
calico-system      calico-typha                      ClusterIP   10.200.250.163   <none>        5473/TCP                 16h
default            kubernetes                        ClusterIP   10.200.0.1       <none>        443/TCP                  17h
kube-system        coredns                           ClusterIP   10.200.0.254     <none>        53/UDP,53/TCP,9153/TCP   14h

[root@node-exporter41 ~]# dig @10.200.0.254 calico-api.calico-apiserver.svc.oldboyedu.com +short
10.200.93.100
[root@node-exporter41 ~]# dig @10.200.0.254 calico-typha.calico-system.svc.oldboyedu.com +short
10.200.250.163
73.fail to check rbd image status with: (executable file not found in $PATH)
(1). Error message
[root@master231 06-ceph]# kubectl get pods -o wide
NAME                                  READY   STATUS              RESTARTS   AGE   IP       NODE        NOMINATED NODE   READINESS GATES
deploy-xiuxian-rbd-698bbc5555-44dz6   0/1     ContainerCreating   0          34s   <none>   worker233   <none>           <none>
[root@master231 06-ceph]# kubectl describe pod deploy-xiuxian-rbd-698bbc5555-44dz6
Name:         deploy-xiuxian-rbd-698bbc5555-44dz6
Namespace:    default
...
  Type     Reason                  Age               From                     Message
  ----     ------                  ----              ----                     -------
  Normal   Scheduled               36s               default-scheduler        Successfully assigned default/deploy-xiuxian-rbd-698bbc5555-44dz6 to worker233
  Normal   SuccessfulAttachVolume  36s               attachdetach-controller  AttachVolume.Attach succeeded for volume "data"
  Warning  FailedMount             4s (x7 over 35s)  kubelet                  MountVolume.WaitForAttach failed for volume "data" : fail to check rbd image status with: (executable file not found in $PATH), rbd output: ()
(2). Cause
The rbd command used to check the image status is missing: the K8s worker node does not have the rbd client tools installed.
(3). Solution
Install the ceph-common package, as shown below.
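For example, on the affected worker node (ceph-common ships the rbd client binary):
[root@worker233 ~]# apt -y install ceph-common
[root@worker233 ~]# rbd --version    # confirm the binary is now in $PATH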
74.MountVolume.WaitForAttach failed for volume "data" : fail to check rbd image status with: (exit status 95), rbd output: (did not load config file, using default settings
(1). Error message
[root@master231 06-ceph]# kubectl get pods -o wide
NAME                                  READY   STATUS              RESTARTS   AGE   IP       NODE        NOMINATED NODE   READINESS GATES
deploy-xiuxian-rbd-698bbc5555-9wtf6   0/1     ContainerCreating   0          22s   <none>   worker232   <none>           <none>
[root@master231 06-ceph]# kubectl describe po deploy-xiuxian-rbd-698bbc5555-9wtf6
Name:         deploy-xiuxian-rbd-698bbc5555-9wtf6
Namespace:    default
Priority:     0
...
Events:
  Type     Reason                  Age   From                     Message
  ----     ------                  ----  ----                     -------
  Normal   Scheduled               24s   default-scheduler        Successfully assigned default/deploy-xiuxian-rbd-698bbc5555-9wtf6 to worker232
  Normal   SuccessfulAttachVolume  24s   attachdetach-controller  AttachVolume.Attach succeeded for volume "data"
  ...
  Warning  FailedMount             5s    kubelet                  MountVolume.WaitForAttach failed for volume "data" : fail to check rbd image status with: (exit status 95), rbd output: (did not load config file, using default settings.
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 Errors while parsing config file!
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 can't open ceph.conf: (2) No such file or directory
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 Errors while parsing config file!
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 can't open ceph.conf: (2) No such file or directory
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 AuthRegistry(0x558e897092e8) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin, disabling cephx
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2024-12-09T11:00:25.991+0800 7fd4c748f4c0 -1 AuthRegistry(0x7fff3c530800) no keyring found at /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin, disabling cephx
2024-12-09T11:00:25.995+0800 7fd4c748f4c0 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
rbd: couldn't connect to the cluster!
)
(2). Cause
The ceph keyring (authentication) file cannot be loaded on the worker node.
(3). Solution
- 1. Copy the authentication files to the K8s worker nodes (see the sketch after this list);
- 2. Make sure the file name matches the default path "/etc/ceph/keyring"; if it does not, specify the keyring file path explicitly in the resource manifest.
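A hedged copy sketch (assuming the ceph admin node is reachable as ceph141; hostnames and the admin keyring name are illustrative):
[root@ceph141 ~]# scp /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring root@worker232:/etc/ceph/
[root@ceph141 ~]# scp /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring root@worker233:/etc/ceph/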
75.pod has unbound immediate PersistentVolumeClaims
(1). Error message
[root@master231 06-ceph]# kubectl get pods -o wide
NAME                                        READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
deploy-xiuxian-secretref-5d97b68548-rf9f9   0/1     Pending   0          5s    <none>   <none>   <none>           <none>
[root@master231 06-ceph]# kubectl describe pod deploy-xiuxian-secretref-5d97b68548-rf9f9
Name:         deploy-xiuxian-secretref-5d97b68548-rf9f9
Namespace:    default
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  9s    default-scheduler  0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.
(2). Cause
The pod has an unbound PVC, so it cannot mount its persistent volume.
(3). Solution
Check whether an unbound PVC is causing the error (kubectl get pvc) and fix its binding.
76.wrong fs type, bad option, bad superblock on /dev/rbd0, missing codepage
(1). Error message
[root@master231 06-ceph]# kubectl get pods -o wide
NAME                                        READY   STATUS              RESTARTS   AGE   IP       NODE        NOMINATED NODE   READINESS GATES
deploy-xiuxian-secretref-5d97b68548-pxtkq   0/1     ContainerCreating   0          13s   <none>   worker233   <none>           <none>
[root@master231 06-ceph]# kubectl describe pods deploy-xiuxian-secretref-5d97b68548-pxtkq
Name:         deploy-xiuxian-secretref-5d97b68548-pxtkq
...
Events:
  Type     Reason                  Age   From                     Message
  ----     ------                  ----  ----                     -------
  Normal   Scheduled               18s   default-scheduler        Successfully assigned default/deploy-xiuxian-secretref-5d97b68548-pxtkq to worker233
  Normal   SuccessfulAttachVolume  18s   attachdetach-controller  AttachVolume.Attach succeeded for volume "pv-rbd"
  Warning  FailedMount             16s   kubelet                  MountVolume.MountDevice failed for volume "pv-rbd" : rbd: failed to mount device /dev/rbd0 at /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/yinzhengjie-k8s-image-xiuxian (fstype: ), error mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/yinzhengjie-k8s-image-xiuxian --scope -- mount -t ext4 -o defaults /dev/rbd0 /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/yinzhengjie-k8s-image-xiuxian
Output: Running scope as unit: run-refe6402e5fb941fb8114176a595fdaed.scope
mount: /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/yinzhengjie-k8s-image-xiuxian: wrong fs type, bad option, bad superblock on /dev/rbd0, missing codepage or helper program, or other error.
...
(2). Cause
When ceph is the backend storage and the fsType field is not set, it defaults to ext4; check whether the block device's actual filesystem type matches.
(3). Solution
For a filesystem that already exists on the device, set the fsType field to the correct filesystem name, as sketched below.
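A hedged in-tree rbd volume fragment (pool, image, monitor, and secret names are illustrative; only fsType is the point here):
volumes:
- name: data
  rbd:
    monitors:
    - 10.0.0.141:6789        # hypothetical ceph monitor
    pool: yinzhengjie-k8s
    image: xiuxian
    fsType: xfs              # must match the filesystem already on the image (default is ext4)
    user: admin
    secretRef:
      name: ceph-admin-secret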
77.missing configuration for cluster ID "0f06b0e2-b128-11ef-9a37-4971ded8a98b"
(1). Error message
[root@master231 rbd]# kubectl get pvc
NAME        STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
...
rbd-pvc01   Pending                                      csi-rbd-sc     11s
rbd-pvc02   Pending                                      csi-rbd-sc     11s
[root@master231 rbd]# kubectl describe pvc rbd-pvc01
Name:          rbd-pvc01
Namespace:     default
StorageClass:  csi-rbd-sc
Status:        Pending
...
Events:
  Type     Reason                Age              From                                                                                              Message
  ----     ------                ----             ----                                                                                              -------
  Warning  ProvisioningFailed    17s              persistentvolume-controller                                                                       storageclass.storage.k8s.io "csi-rbd-sc" not found
  Normal   ExternalProvisioning  3s (x2 over 3s)  persistentvolume-controller                                                                       waiting for a volume to be created, either by external provisioner "rbd.csi.ceph.com" or manually created by system administrator
  Normal   Provisioning          1s (x3 over 3s)  rbd.csi.ceph.com_csi-rbdplugin-provisioner-5dfcf67885-jmplh_d864ea20-046e-430e-9867-7a34250c7d9d  External provisioner is provisioning volume for claim "default/rbd-pvc01"
  Warning  ProvisioningFailed    1s (x3 over 3s)  rbd.csi.ceph.com_csi-rbdplugin-provisioner-5dfcf67885-jmplh_d864ea20-046e-430e-9867-7a34250c7d9d  failed to provision volume with StorageClass "csi-rbd-sc": rpc error: code = InvalidArgument desc = failed to fetch monitor list using clusterID (0f06b0e2-b128-11ef-9a37-4971ded8a98b): missing configuration for cluster ID "0f06b0e2-b128-11ef-9a37-4971ded8a98b"
(2). Cause
Check the dynamic StorageClass configuration: the "clusterID" field's value is missing its double quotes.
(3). Solution
The "clusterID" value must be wrapped in double quotes, otherwise this error occurs; a sketch follows.
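A hedged StorageClass fragment (pool name is illustrative; the quoted clusterID must also exist in the ceph-csi ConfigMap):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: "0f06b0e2-b128-11ef-9a37-4971ded8a98b"   # quoted, per the cause above
  pool: k8s-rbd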