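Uploading the Heap Dump to Alibaba Cloud OSS After a Program OOM in K8S (Part 2)
Using a Sidecar in K8S
Create a secret to store the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET of the Alibaba Cloud RAM user; the traeAliOSS command needs them.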
kubectl create secret generic aliyun-credentials \
--from-literal=OSS_ACCESS_KEY_ID=aaaaaaaaaaaa \
--from-literal=OSS_ACCESS_KEY_SECRET=bbbbbbbbbbbb
Create the deployment. The memory-leak-test image is what I use to test OOM; it is a Java program that was also written with Trae.
# Below is the configuration using a Sidecar; the added parts are marked with comments
cat deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: memory-leak-test
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: memory-leak-test
  template:
    metadata:
      labels:
        app: memory-leak-test
    spec:
      terminationGracePeriodSeconds: 10
      imagePullSecrets:
      - name: harbor
      # Added: a volume used to share the directory that holds the dump
      volumes:
      - name: shared-tmp
        emptyDir: {}
      containers:
      - name: memory-leak-test
        image: harbor.klvchen.com/tmp/memory-leak-test:12
        resources:
          limits:
            cpu: 1
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 128Mi
        livenessProbe:
          tcpSocket:
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10
        # Added: mount the shared volume at the directory where the dump is written
        volumeMounts:
        - name: shared-tmp
          mountPath: /tmp
      # Added: configuration of the Sidecar container
      - name: tmp-watcher-sidecar
        image: harbor.klvchen.com/library/alpine-ossutil:0.1
        env:
        - name: OSS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aliyun-credentials
              key: OSS_ACCESS_KEY_ID
        - name: OSS_ACCESS_KEY_SECRET
          valueFrom:
            secretKeyRef:
              name: aliyun-credentials
              key: OSS_ACCESS_KEY_SECRET
        - name: ALI_OSS_BUCKET
          value: klvchen-test
        volumeMounts:
        - name: shared-tmp
          mountPath: /tmp
# Start the service
kubectl apply -f deployment.yaml
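After applying, it can be worth confirming that the pod really contains both containers. The following is only a minimal sketch: it assumes kubectl is pointed at the namespace where the deployment was applied, and the function names list_containers and has_sidecar are made up for illustration.

```shell
#!/bin/sh
# Print the container names of all memory-leak-test pods, space separated.
# Both memory-leak-test and tmp-watcher-sidecar are expected to appear.
list_containers() {
    kubectl get pods -l app=memory-leak-test \
        -o jsonpath='{.items[*].spec.containers[*].name}'
}

# Succeed only if tmp-watcher-sidecar is one of the listed containers.
has_sidecar() {
    list_containers | tr ' ' '\n' | grep -qx 'tmp-watcher-sidecar'
}
```

For example, `has_sidecar && echo "sidecar present"` can be run once the rollout has finished.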
Notes:
1. The business container (memory-leak-test in my case) must be started with "-XX:+HeapDumpOnOutOfMemoryError" and "-XX:HeapDumpPath=/tmp/memory-leak-test.dump", so that a dump is generated on OOM and written to /tmp, matching the mount path in the Deployment.
Below is my memory-leak-test Dockerfile, for reference:
cat Dockerfile
FROM harbor.junengcloud.com/openfaas/openjdk:11
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" >/etc/timezone
ADD memory-leak-test-1.0-SNAPSHOT.jar memory-leak-test.jar
ENTRYPOINT ["java", "-Xmx128m", "-Xms128m", "-XX:+HeapDumpOnOutOfMemoryError", "-XX:HeapDumpPath=/tmp/memory-leak-test.dump", "-Djava.security.egd=file:/dev/./urandom", "-Dentry.timezone=GMT+08", "-jar", "/memory-leak-test.jar"]
EXPOSE 8000
2. The directory mounted in the business container's deployment must match "-XX:HeapDumpPath=/tmp/memory-leak-test.dump", and the dump file name must end in .dump.
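The reason both notes matter is that the Sidecar only sees files that land in the shared directory with the .dump suffix. The script actually shipped in the alpine-ossutil image is not shown in this post, so the following is only a hypothetical sketch of what such a watcher could look like. WATCH_DIR, UPLOAD_CMD, POLL_SECONDS and the function names are assumptions, and the upload command is assumed to take the file path as its only argument.

```shell
#!/bin/sh
# Hypothetical watcher sketch; not the real script from the sidecar image.

# Directory shared with the business container (the emptyDir mount).
WATCH_DIR="${WATCH_DIR:-/tmp}"
# Placeholder upload command. Replace it with the real uploader; its
# arguments are an assumption here (a single file path).
UPLOAD_CMD="${UPLOAD_CMD:-echo}"

# Upload every *.dump file in WATCH_DIR. A file is renamed after a
# successful upload so that it is not uploaded a second time.
upload_dumps() {
    for f in "$WATCH_DIR"/*.dump; do
        [ -f "$f" ] || continue      # the glob matched nothing
        if "$UPLOAD_CMD" "$f"; then
            mv "$f" "$f.uploaded"
        fi
    done
}

# Poll the directory forever; this would be the container's main process.
watch_loop() {
    while true; do
        upload_dumps
        sleep "${POLL_SECONDS:-10}"
    done
}
```

Calling watch_loop at the end of the script would start the polling. A real watcher would also need to wait until the JVM has finished writing the dump before uploading it, which this sketch does not handle.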
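Testing
memory-leak-test is set up so that requesting /leak starts the memory leak.

After a while, the program hits an OOM.

Log in to Alibaba Cloud OSS and you can see that the dump has been uploaded successfully.

As a follow-up, a DingTalk alert can be configured to notify the relevant people when an OOM occurs.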
