ingress-nginx: static file requests stuck in pending - 少年老余

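I. System environment

Operating system: Ubuntu 22.04

Kubernetes version: 1.25.9

ingress-nginx version: 1.8.1

II. Symptoms

Requests for static files regularly hang in the pending state, and it takes several refreshes before they load.

While a request is pending, it does not appear in the ingress log at all.

III. Abnormal log output at ingress startup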

2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 39#39: sendmsg() failed (9: Bad file descriptor)
2025/03/11 05:10:05 [alert] 1225#1225: pthread_create() failed (11: Resource temporarily unavailable)
2025/03/11 05:10:05 [alert] 1231#1231: pthread_create() failed (11: Resource temporarily unavailable)
2025/03/11 05:10:05 [alert] 39#39: worker process 691 exited with fatal code 2 and cannot be respawned
2025/03/11 05:10:05 [alert] 39#39: worker process 723 exited with fatal code 2 and cannot be respawned
2025/03/11 05:10:05 [alert] 39#39: worker process 756 exited with fatal code 2 and cannot be respawned
2025/03/11 05:10:05 [alert] 39#39: worker process 809 exited with fatal code 2 and cannot be respawned
2025/03/11 05:10:05 [alert] 39#39: worker process 860 exited with fatal code 2 and cannot be respawned
2025/03/11 05:10:05 [alert] 39#39: worker process 885 exited with fatal code 2 and cannot be respawned
2025/03/11 05:10:05 [alert] 39#39: worker process 929 exited with fatal code 2 and cannot be respawned
2025/03/11 05:10:05 [alert] 39#39: worker process 974 exited with fatal code 2 and cannot be respawned

The worker_processes value turned out to equal the number of CPUs, so it is presumably running in the default auto mode.
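A minimal sketch for confirming this observation: it extracts the worker_processes value from a rendered nginx.conf so it can be compared with the CPU count. The helper name `get_worker_processes` is hypothetical, and the config path `/etc/nginx/nginx.conf` inside the controller container is an assumption, as is reusing the namespace and deployment name from the restart command in this post.

```shell
# Print the worker_processes value from an nginx.conf read on stdin.
# (Helper name is illustrative, not part of ingress-nginx.)
get_worker_processes() {
  awk '$1 == "worker_processes" { sub(/;$/, "", $2); print $2; exit }'
}

# Usage against a live cluster (not run here; config path is an assumption):
#   kubectl exec -n ingress-nginx deploy/ingress-nginx-controller -- \
#     cat /etc/nginx/nginx.conf | get_worker_processes
#   nproc   # run on the node to get the CPU count for comparison
```

If the two numbers match, the controller is starting one worker per CPU, which is the situation described above.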

IV. Temporary workaround

Edit the ConfigMap ingress-nginx-controller:

Set worker_processes to a value lower than the number of CPUs:

data:
  worker-processes: "4"
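The same change can also be applied without opening an editor. The sketch below builds a JSON merge patch for the `worker-processes` key. The helper name `build_worker_patch` is hypothetical, and the `ingress-nginx` namespace is assumed from the restart command used later in this post.

```shell
# Build a JSON merge patch that sets worker-processes in the ConfigMap data.
# $1 = desired number of worker processes
build_worker_patch() {
  printf '{"data":{"worker-processes":"%s"}}' "$1"
}

# Usage against a live cluster (not run here; namespace is an assumption):
#   kubectl patch configmap ingress-nginx-controller -n ingress-nginx \
#     --type merge -p "$(build_worker_patch 4)"
```

A merge patch only touches the one key, so other settings already present in the ConfigMap are left alone.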

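Restart ingress-nginx, then enter the container and check that the change took effect.

Check whether the logs still show the errors above; if they do not, the controller is healthy.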

kubectl rollout restart deploy -n ingress-nginx ingress-nginx-controller

Wait about half a minute and verify: access is now smooth.
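The log check after the restart can be made a little less manual. This is a rough sketch that counts lines matching the three alert messages shown in section III. The helper name `count_worker_alerts` is hypothetical, and the patterns are copied from the log excerpt in this post rather than from any official list.

```shell
# Count nginx alert lines (read from stdin) that match the startup errors
# seen in this post. grep -c exits non-zero on zero matches, hence "|| true".
count_worker_alerts() {
  grep -cE 'pthread_create\(\) failed|cannot be respawned|sendmsg\(\) failed' || true
}

# Usage against a live cluster (not run here):
#   kubectl logs -n ingress-nginx deploy/ingress-nginx-controller | count_worker_alerts
```

A result of 0 after the restart suggests the workers started cleanly.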

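V. Looking for the root cause

1. Memory is plentiful, so that is ruled out.

2. The system file descriptor limit is set high enough.

3. Some services log the following error: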

failed to create fsnotify watcher: too many open files

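Fix:

fs.inotify.max_user_watches: the number of files/directories a single user can watch.
fs.inotify.max_user_instances: the number of inotify instances a single user can create. The default is only 128.

Change these on the host:

# Temporary (lost on reboot)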
sudo sysctl -w fs.inotify.max_user_watches=524288
sudo sysctl -w fs.inotify.max_user_instances=1024

# Permanent
echo "fs.inotify.max_user_watches = 524288" | sudo tee -a /etc/sysctl.conf
echo "fs.inotify.max_user_instances = 1024" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
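To confirm the new limits are active, the values can be read back from procfs. The sketch below assumes the standard Linux location `/proc/sys/fs/inotify`. The helper name `show_inotify_limits` is hypothetical, and the optional directory argument exists only so the function can be tried against sample files.

```shell
# Print the inotify limits found under a directory.
# $1 = optional directory (defaults to the live procfs location)
show_inotify_limits() {
  local dir="${1:-/proc/sys/fs/inotify}"
  local name
  for name in max_user_watches max_user_instances; do
    if [ -r "$dir/$name" ]; then
      printf 'fs.inotify.%s = %s\n' "$name" "$(cat "$dir/$name")"
    fi
  done
}

# Usage on the host:
#   show_inotify_limits
```

Run it before and after the sysctl change to see whether the values actually moved.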

4. dmesg -T shows a cgroup message produced when ingress restarts:

cgroup: fork rejected by pids controller in /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod9ed3a9b7_46f9_4e31_83b5_048fa4bbfe44.slice/cri-containerd-a11fdc33537da800a422443fcb261e8167909df0138fa15faa784923e48d80a2.scope

todo
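One possible way to continue from here, offered only as a sketch: the pids controller counts threads as well as processes, so the "fork rejected" message may be related to the pthread_create() failures above. Comparing `pids.current` with `pids.max` for the cgroup named in the dmesg line would show how close the container is to its limit. The helper name `show_pids_usage` is hypothetical, and the usage path assumes cgroup v2 mounted at `/sys/fs/cgroup`. Where the limit comes from (for example the kubelet `podPidsLimit` setting) is an assumption this post has not confirmed.

```shell
# Report PID usage for one cgroup directory.
# $1 = cgroup directory containing pids.current and pids.max
show_pids_usage() {
  local dir="$1"
  if [ -r "$dir/pids.current" ] && [ -r "$dir/pids.max" ]; then
    printf 'pids.current=%s pids.max=%s\n' \
      "$(cat "$dir/pids.current")" "$(cat "$dir/pids.max")"
  else
    echo "pids controller files not found in $dir" >&2
    return 1
  fi
}

# Usage on the node (not run here; assumes cgroup v2 at /sys/fs/cgroup and
# the cgroup path exactly as printed in the dmesg line):
#   show_pids_usage "/sys/fs/cgroup/<cgroup path from dmesg>"
```

A `pids.max` of `max` means no limit is set at that level, in which case the parent slices are worth checking the same way.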

Conclusion:

So far the problem has only been seen on Ubuntu; CentOS is unaffected. I suspect this OS version has a related bug or compatibility issue.

Posted on 2025-03-11 13:54 by 少年老余