Alertmanager

In prometheus-operator, Prometheus finds its Alertmanager instances through service discovery.
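As a rough sketch of what that looks like in practice: the Prometheus custom resource points at the Alertmanager Service by name, and the operator turns that reference into a Kubernetes service-discovery scrape of the Service's endpoints. The object name `k8s`, the Service name `alertmanager-main`, and the port name `web` below are assumptions taken from common kube-prometheus defaults, not from this post; adjust them to your cluster.

```yaml
# Fragment of a Prometheus custom resource (names are assumed, see above).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                      # assumed name of the Prometheus object
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
    - namespace: monitoring      # namespace of the Alertmanager Service
      name: alertmanager-main    # assumed Service name
      port: web                  # assumed named port on that Service
```

Because the reference is to a Service rather than to fixed addresses, Alertmanager pods can be added or replaced without editing the Prometheus configuration.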


Custom rules

---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: etcd-rules
  namespace: monitoring
spec:
  groups:
  - name: etcd
    rules:
    - alert: EtcdClusterUnavailable
      annotations:
        summary: etcd cluster small
        description: If one more etcd peer goes down the cluster will be unavailable.
      # A simpler per-instance check would be: up{job="etcd"} == 0
      # Only one expression is allowed per rule, so the quorum check is used here.
      expr: |
        count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
      for: 1s
      labels:
        severity: critical
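The `prometheus: k8s` and `role: alert-rules` labels on the rule object matter because the Prometheus custom resource selects PrometheusRule objects by label. A minimal sketch of the matching selector follows, assuming the Prometheus object is named `k8s` and lives in the `monitoring` namespace (neither is stated in this post):

```yaml
# Fragment of a Prometheus custom resource (object name is assumed).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  # Load every PrometheusRule that carries both labels.
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
```

If the labels on a custom rule do not match this selector, the operator does not load it. When `ruleNamespaceSelector` is left unset, only rules in the same namespace as the Prometheus object are considered.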
posted @ 2021-09-24 16:25  lavida2000