zno2

OOMKilled

Problem: an application instance (pod) restarts frequently.

Inspecting the pod with describe shows:

 

kubectl -n <yournamespace> describe pod <yourapplicationpodid>

 

    Command:
      java
    Args:
      -Denv=PRO
      -XX:+UnlockExperimentalVMOptions
      -XX:+UseCGroupMemoryLimitForHeap
      -XX:+UseConcMarkSweepGC
      -XX:+HeapDumpOnOutOfMemoryError
      -XX:HeapDumpPath=/data/applog/MICRO-MTK
      -Xms3816M
      -Xmx3816M
      -jar
      /usr/local/app.jar
      --server.port=8080
    State:          Running
      Started:      Wed, 01 Feb 2023 17:10:14 +0800
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 01 Feb 2023 15:38:35 +0800
      Finished:     Wed, 01 Feb 2023 17:10:13 +0800
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     2
      memory:  4Gi
    Requests:
      cpu:      1
      memory:   4Gi

 

Tracing the instance's logs turned up no OOM-related entries, and a code review found no logic likely to exhaust memory.

-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/data/applog/MICRO-MTK
Neither was triggered.
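Exit Code 137 in the describe output is consistent with the missing heap dump. By convention, a process killed by signal N is reported with exit code 128 + N, and signal 9 (SIGKILL) cannot be caught, so the JVM has no chance to throw OutOfMemoryError or write a dump. A minimal sketch of the decoding, with illustrative variable names:

```shell
# Exit code from the "Last State" section of the describe output.
exit_code=137

# A process killed by signal N is reported as 128 + N, so subtract 128.
signal_number=$((exit_code - 128))

# kill -l <number> prints the name of that signal.
signal_name=$(kill -l "$signal_number")

echo "exit code ${exit_code} = 128 + ${signal_number} (SIG${signal_name})"
```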

 

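Suspected cause:

The container is limited to 4 GiB and the JVM heap to about 3.7 GiB (-Xmx3816M). The heap is only part of the JVM's footprint (metaspace, thread stacks, and direct buffers come on top), so if garbage collection does not keep up, the process can hit the container limit before the JVM itself reports an OOM, and the container is killed.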
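How thin the margin is can be worked out from the numbers in the describe output (memory limit 4Gi, -Xmx3816M). The following is only a back-of-the-envelope sketch; the variable names are illustrative, and it does not measure the actual non-heap usage of the application.

```shell
# Values taken from the describe output: memory limit 4Gi, -Xmx3816M.
limit_mib=$((4 * 1024))
heap_mib=3816

# Memory left for everything that is not Java heap.
headroom_mib=$((limit_mib - heap_mib))

# Heap as a share of the limit, in whole percent (integer division).
heap_percent=$((heap_mib * 100 / limit_mib))

echo "headroom: ${headroom_mib} MiB, heap uses ${heap_percent}% of the limit"
```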

 

Adjustment:

Set both -Xmx and -Xms to 2g.
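As a rough illustration of what this change buys, the sketch below builds the adjusted heap flags and computes how much of the unchanged 4Gi limit is left outside the heap. It assumes the limit stays as shown in the describe output; the variable names are illustrative.

```shell
# Container memory limit from the describe output (4Gi), in MiB.
limit_mib=$((4 * 1024))

# New heap size: 2g for both the initial and the maximum heap.
new_heap_mib=$((2 * 1024))
jvm_heap_args="-Xms2g -Xmx2g"

# Memory now available to non-heap areas and other processes in the container.
new_headroom_mib=$((limit_mib - new_heap_mib))

echo "${jvm_heap_args} leaves ${new_headroom_mib} MiB outside the heap"
```

These two flags replace -Xms3816M and -Xmx3816M in the Args list; the other arguments stay as they were.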

 

Under continued observation, the instance has been running normally.

 

Reference:

https://kubernetes.io/zh-cn/docs/concepts/configuration/manage-resources-containers/

 

posted on 2023-08-03 19:50  zno2
