Grafana SQL Summary

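  Navigation: this series lays out a reasonably systematic path for learning Prometheus, and it is best read in chapter order. Because the document was written over several periods, the cloud environments that appear in it are not always consistent; focus on the usage itself. The final chapters contain a complete hands-on case on Tencent Cloud.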

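  1.What is Prometheus?

  2.Installing Prometheus

  3.Prometheus Exporters in detail

  4.Prometheus PromQL

  5.Prometheus alert handling

  6.Prometheus clustering and high availability

  7.Prometheus service discovery

  8.kube-state-metrics and metrics-server

  9.Ways to monitor a Kubernetes cluster

  10.prometheus operator

  11.Prometheus in practice: federation + high availability + persistence

  12.Prometheus in practice: configuration summary

  13.Basic Grafana usage

  14.Grafana SQL summary

  15.prometheus SQL summary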

  References:

  https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config

  https://yunlzheng.gitbook.io/prometheus-book/part-iii-prometheus-shi-zhan/readmd/use-prometheus-monitor-kubernetes

  https://www.bookstack.cn/read/prometheus_practice/introduction-README.md

  https://www.kancloud.cn/huyipow/prometheus/521184

  https://www.qikqiak.com/k8s-book/docs/

  

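  Writing Prometheus SQL (PromQL queries) by hand is time-consuming, so the queries here are copied from Tencent Cloud's cloud-native monitoring and from prometheus operator and recorded for reference.

  (The queries in prometheus operator are basically the same as those in the cloud-native monitoring.)

  Most of them come from Tencent Cloud's cloud-native monitoring. Because of label and variable differences, the queries need to be adjusted before they can be used in a federated cluster environment.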
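
  One adjustment of this kind is renaming a label matcher, for example when the federated Prometheus exposes the cluster under a label other than `cluster`. The Python sketch below is only an illustration: the helper `rename_label` and the target label name `cluster_id` are hypothetical, and the regex is a simplification that does not parse PromQL (a label value that itself contains text like ` cluster=` would be rewritten too).

```python
import re


def rename_label(query: str, old: str, new: str) -> str:
    """Rename a label used in PromQL matchers, leaving label values untouched.

    Hypothetical helper for illustration; it is a text rewrite, not a PromQL parser.
    """
    # Match the label name only when it directly follows '{', ',' or whitespace
    # and is followed by a matcher operator (=, !=, =~, !~).
    pattern = re.compile(
        r'(?<=[{,\s])' + re.escape(old) + r'(?=\s*(?:=~|!~|!=|=))'
    )
    return pattern.sub(new, query)


if __name__ == "__main__":
    original = 'sum(kube_pod_owner{cluster="$cluster"}) by (namespace)'
    # "cluster_id" is an assumed label name; use the one your federation setup adds.
    print(rename_label(original, "cluster", "cluster_id"))
```

  The Grafana variable `$cluster` inside the quotes is left alone, so the dashboard variable keeps working after the rename.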

 

1.Compute Resources / Cluster

  • Dashboard

  • Variables

 

  • Sql

  CPU Utilisation

1 - avg(rate(node_cpu_seconds_total{mode="idle", cluster="$cluster"}[1m]))

 

  CPU Requests Commitment

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster"}) / sum(kube_node_status_allocatable_cpu_cores{cluster="$cluster"})

 

  CPU Limits Commitment

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster"}) / sum(kube_node_status_allocatable_cpu_cores{cluster="$cluster"})

 

  Memory Utilisation

1 - sum(:node_memory_MemAvailable_bytes:sum{cluster="$cluster"}) / sum(kube_node_status_allocatable_memory_bytes{cluster="$cluster"})

 

  Memory Requests Commitment

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster"}) / sum(kube_node_status_allocatable_memory_bytes{cluster="$cluster"})

 

  Memory Limits Commitment

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster"}) / sum(kube_node_status_allocatable_memory_bytes{cluster="$cluster"})
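
  The four "Commitment" queries above all have the same shape: the sum of container requests (or limits) divided by the sum of what the nodes can allocate. A value above 1 means more has been promised than the cluster can provide. The Python sketch below works through that arithmetic; the function name and the numbers are made up purely for illustration.

```python
def commitment(per_container, allocatable_per_node):
    """Ratio of requested (or limited) resources to allocatable node capacity."""
    return sum(per_container) / sum(allocatable_per_node)


if __name__ == "__main__":
    # Illustrative values only: three containers requesting CPU cores,
    # two nodes with 4 allocatable cores each.
    cpu_requests = [0.5, 1.0, 1.5]
    node_allocatable = [4, 4]
    print(commitment(cpu_requests, node_allocatable))
```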

 

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster"}) by (namespace)

 

  Memory Usage (working_set)

sum(container_memory_working_set_bytes{cluster="$cluster", container!=""}) by (namespace)

 

  Requests by Namespace

sum(kube_pod_owner{cluster="$cluster"}) by (namespace)

count(avg(namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster"}) by (workload, namespace)) by (namespace)

sum(container_memory_rss{cluster="$cluster", container!=""}) by (namespace)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster"}) by (namespace)

 

  Current Network Usage

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)
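
  The network queries use irate, which computes a per-second rate from only the last two samples inside the range window (here `[1m]`), so it reacts quickly but is spikier than rate. The Python sketch below imitates that calculation to make the behaviour concrete; the function and the sample data are hypothetical, and Prometheus itself evaluates this server-side.

```python
def irate(samples):
    """Per-second rate from the last two samples of a counter.

    samples: list of (timestamp_seconds, counter_value), oldest first,
    already restricted to the range window.
    """
    if len(samples) < 2:
        return None  # not enough samples in the window: no result
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # A counter that went down is treated as a reset: count from zero.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)


if __name__ == "__main__":
    # Illustrative samples scraped every 15 seconds.
    window = [(0, 100), (15, 250), (30, 400)]
    print(irate(window))
```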

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Average Container Bandwidth by Namespace: Received

avg(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Average Container Bandwidth by Namespace: Transmitted

avg(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~".+"}[1m])) by (namespace)

 

2.Compute Resources / Namespace (Pods)

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Utilisation (from requests)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"})

 

  CPU Utilisation (from limits)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"})

 

  Memory Utilisation (from requests)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"})

 

  Memory Utilisation (from limits)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster",namespace="$namespace"})

 

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.cpu"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.cpu"})

 

  CPU Quota

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}) by (pod)

 

  Memory Usage (w/o cache)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}) by (pod)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.memory"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.memory"})

 

 

  Memory Quota

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster",namespace="$namespace"}) by (pod)

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace",container!=""}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}) by (pod)

sum(container_memory_rss{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

sum(container_memory_cache{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

sum(container_memory_swap{cluster="$cluster", namespace="$namespace",container!=""}) by (pod)

 

  Current Network Usage

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])) by (pod)

 

3.Compute Resources / Namespace (Workloads)

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.cpu"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.cpu"})
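
  The workload queries in this section rely on `* on(namespace,pod) group_left(workload, workload_type)`: each per-container series on the left is matched to the single owner series of its pod on the right, the `workload` and `workload_type` labels are copied over, and because the owner series has the value 1 the multiplication leaves the usage value unchanged. The Python sketch below imitates that join followed by `sum(...) by (workload, workload_type)`. The function name and the sample data are hypothetical and only meant to show the matching behaviour.

```python
from collections import defaultdict


def sum_by_workload(container_samples, pod_owner):
    """Emulate: sum(metric * on(namespace,pod) group_left(workload, workload_type) owner)
    by (workload, workload_type).

    container_samples: iterable of (namespace, pod, container, value).
    pod_owner: {(namespace, pod): (workload, workload_type)}, one entry per pod.
    """
    totals = defaultdict(float)
    for namespace, pod, _container, value in container_samples:
        owner = pod_owner.get((namespace, pod))
        if owner is None:
            # No owner series matches on (namespace, pod): the sample is dropped.
            continue
        # The owner series has the value 1, so value * 1 keeps the usage as is.
        totals[owner] += value
    return dict(totals)


if __name__ == "__main__":
    # Illustrative data: two pods of one deployment plus a pod with no owner series.
    samples = [
        ("default", "web-1", "app", 0.25),
        ("default", "web-1", "sidecar", 0.25),
        ("default", "web-2", "app", 0.5),
        ("default", "orphan", "app", 0.5),
    ]
    owners = {
        ("default", "web-1"): ("web", "deployment"),
        ("default", "web-2"): ("web", "deployment"),
    }
    print(sum_by_workload(samples, owners))
```

  Pods without a matching owner series simply disappear from the result, which is why a workload panel can show less total usage than the namespace panel.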

 

  CPU Quota

count(namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}) by (workload, workload_type)

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

 

  Memory Usage

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="requests.memory"})

scalar(kube_resourcequota{cluster="$cluster", namespace="$namespace", type="hard",resource="limits.memory"})

 

  Memory Quota

count(namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}) by (workload, workload_type)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
  kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)
/sum(
  kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload_type="$type"}
) by (workload, workload_type)

 

  Current Network Usage

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload_type="$type"}) by (workload))

 

  Receive Bandwidth

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Transmit Bandwidth

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Container Bandwidth by Workload: Received

(avg(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Container Bandwidth by Workload: Transmitted

(avg(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets Dropped

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets Dropped

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

4.Compute Resources / Node (Pods)

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod)

 

  CPU Quota

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod)

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", node=~"$node"}) by (pod) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", node=~"$node"}) by (pod)

 

  Memory Usage (w/o cache)

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node", container!=""}) by (pod)

 

  Memory Quota

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node",container!=""}) by (pod)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node",container!=""}) by (pod) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_memory_working_set_bytes{cluster="$cluster", node=~"$node",container!=""}) by (pod) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", node=~"$node"}) by (pod)

sum(node_namespace_pod_container:container_memory_rss{cluster="$cluster", node=~"$node",container!=""}) by (pod)

sum(node_namespace_pod_container:container_memory_cache{cluster="$cluster", node=~"$node",container!=""}) by (pod)

sum(node_namespace_pod_container:container_memory_swap{cluster="$cluster", node=~"$node",container!=""}) by (pod)

 

5.Compute Resources / Pod

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{namespace="$namespace", pod="$pod", container!="POD", cluster="$cluster"}) by (container)

sum(
kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"})

sum(
    kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"})

 

  CPU Throttling

sum(increase(container_cpu_cfs_throttled_periods_total{namespace="$namespace", pod="$pod", container!="POD", container!="", cluster="$cluster"}[5m])) by (container) /sum(increase(container_cpu_cfs_periods_total{namespace="$namespace", pod="$pod", container!="POD", container!="", cluster="$cluster"}[5m])) by (container)
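
  The CPU Throttling query divides the growth of throttled CFS periods by the growth of all CFS periods over the same 5 minutes, giving the fraction of scheduling periods in which the container was throttled. The Python sketch below shows that ratio on raw counter readings. It is a simplification: it ignores the extrapolation and counter-reset handling that `increase()` performs, and the function name and numbers are hypothetical.

```python
def throttled_ratio(throttled_start, throttled_end, periods_start, periods_end):
    """Fraction of CFS periods that were throttled between two counter readings."""
    periods = periods_end - periods_start
    if periods == 0:
        # PromQL would yield NaN for 0/0; this sketch returns None instead.
        return None
    return (throttled_end - throttled_start) / periods


if __name__ == "__main__":
    # Illustrative counter readings taken 5 minutes apart.
    print(throttled_ratio(100, 130, 1000, 1300))
```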

 

  CPU Quota

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace", pod="$pod", container!="POD"}) by (container)

sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container) / sum(kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container) / sum(kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

 

  Memory Usage

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!="POD", container!=""}) by (container)

sum(
kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"})

sum(
    kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"})

 

  Memory Quota

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!="POD", container!=""}) by (container)

sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container) / sum(kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!=""}) by (container)

sum(container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", pod="$pod", container!=""}) by (container) / sum(kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace", pod="$pod"}) by (container)

sum(container_memory_rss{cluster="$cluster", namespace="$namespace", pod="$pod", container != "", container != "POD"}) by (container)

sum(container_memory_cache{cluster="$cluster", namespace="$namespace", pod="$pod", container != "", container != "POD"}) by (container)

sum(container_memory_swap{cluster="$cluster", namespace="$namespace", pod="$pod", container != "", container != "POD"}) by (container)

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{namespace=~"$namespace", pod=~"$pod"}[1m])) by (pod)

 

6.Compute Resources / Workload

  • Dashboard

 

  • Variables

 

  • Sql

  CPU Usage

sum(
  node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  CPU Quota

sum(
    node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_requests_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_limits_cpu_cores{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  Memory Usage

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  Memory Quota

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_requests_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

sum(
    container_memory_working_set_bytes{cluster="$cluster", namespace="$namespace", container!=""}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)
/sum(
    kube_pod_container_resource_limits_memory_bytes{cluster="$cluster", namespace="$namespace"}
  * on(namespace,pod)
    group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace="$namespace", workload="$workload", workload_type="$type"}
) by (pod)

 

  Current Network Usage

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Receive Bandwidth

(sum(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Transmit Bandwidth

(sum(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Container Bandwidth by Pod: Received

(avg(irate(container_network_receive_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Container Bandwidth by Pod: Transmitted

(avg(irate(container_network_transmit_bytes_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets

(sum(irate(container_network_receive_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets

(sum(irate(container_network_transmit_packets_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets Dropped

(sum(irate(container_network_receive_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets Dropped

(sum(irate(container_network_transmit_packets_dropped_total{cluster="$cluster", namespace=~"$namespace"}[1m])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster="$cluster", namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))
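The workload queries in this section all share one shape: a per-pod rate is multiplied by the `namespace_workload_pod:kube_pod_owner:relabel` series, matched `on (namespace,pod)`, while `group_left(workload,workload_type)` copies the workload labels onto the result, and the product is then aggregated `by (pod)`. The Python sketch below imitates that matching on plain dictionaries. It assumes the owner series carries the value 1 for each pod it lists (which is what multiplying by it implies); the function name and the sample data are hypothetical, and this is only a simplified model of PromQL vector matching, not how Prometheus implements it.

```python
from collections import defaultdict


def join_and_sum_by_pod(rates, owners):
    """Imitate: sum(rates * on(namespace,pod) group_left(...) owners) by (pod).

    rates:  list of (labels, value) for the "many" side (several series per pod)
    owners: list of (labels, value) for the "one" side (one series per pod)
    """
    # Index the "one" side by the matching labels (namespace, pod).
    owner_index = {(lbl["namespace"], lbl["pod"]): val for lbl, val in owners}

    totals = defaultdict(float)
    for labels, value in rates:
        key = (labels["namespace"], labels["pod"])
        if key not in owner_index:
            continue  # series without a matching partner drop out of the result
        totals[labels["pod"]] += value * owner_index[key]  # then sum ... by (pod)
    return dict(totals)


# Hypothetical sample data: two interfaces on one pod of the selected workload,
# plus a pod that does not belong to the workload.
rates = [
    ({"namespace": "default", "pod": "web-1", "interface": "eth0"}, 100.0),
    ({"namespace": "default", "pod": "web-1", "interface": "eth1"}, 50.0),
    ({"namespace": "default", "pod": "other-1", "interface": "eth0"}, 70.0),
]
owners = [
    ({"namespace": "default", "pod": "web-1",
      "workload": "web", "workload_type": "deployment"}, 1.0),
]
result = join_and_sum_by_pod(rates, owners)
print(result)
```

Because the owner selector is already filtered by `workload` and `workload_type`, the join also acts as a filter: only pods of the selected workload remain in the result.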

 

7.Networking/Cluster

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Current Rate of Bytes Transmitted

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Current Status

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Average Rate of Bytes Received

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Average Rate of Bytes Transmitted

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Receive Bandwidth

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))

 

  Transmit Bandwidth

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~".+"}[$interval:$resolution])) by (namespace))
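The cluster-level queries above chain three steps: `irate` turns each counter into a per-second rate using its last two samples, `sum ... by (namespace)` adds the rates per namespace, and `sort_desc` orders the result. A minimal Python sketch of that pipeline follows; the helper names and the sample values are hypothetical, and the counter-reset handling is a simplification of what Prometheus does.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, value) samples of a counter."""
    if len(samples) < 2:
        return None  # not enough data points in the range
    (t1, v1), (t2, v2) = samples[-2], samples[-1]
    # If the counter went down, assume it was reset and count from zero.
    delta = v2 - v1 if v2 >= v1 else v2
    return delta / (t2 - t1)


def rate_by_namespace(series):
    """series: list of (namespace, samples). Returns [(namespace, rate)] sorted descending."""
    totals = {}
    for namespace, samples in series:
        rate = irate(samples)
        if rate is None:
            continue
        totals[namespace] = totals.get(namespace, 0.0) + rate
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counters sampled 15 seconds apart.
series = [
    ("kube-system", [(0, 1000), (15, 1300)]),  # 20 bytes/s
    ("default", [(0, 500), (15, 2000)]),       # 100 bytes/s
    ("default", [(0, 0), (15, 750)]),          # 50 bytes/s
]
ranking = rate_by_namespace(series)
print(ranking)
```

Swapping the per-namespace sum for a mean gives the behaviour of the `avg(...)` variants used in the "Average Rate" panels.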

 

8.Networking / Namespace (Pods)

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution]))

 

  Current Rate of Bytes Transmitted

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution]))

 

  Current Status

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster", namespace=~"$namespace"}[$interval:$resolution])) by (pod)

 

9.Networking / Namespace (Workload)

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Current Rate of Bytes Transmitted

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Current Status

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Rate of Bytes Received

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Average Rate of Bytes Transmitted

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Receive Bandwidth

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Transmit Bandwidth

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Received Packets Dropped

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

  Rate of Transmitted Packets Dropped

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~".+", workload_type="$type"}) by (workload))

 

10.Networking/Pod

  • Dashboard

 

  • Variables

 

  • Sql

  Current Rate of Bytes Received

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution]))

 

  Current Rate of Bytes Transmitted

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution]))

 

  Receive Bandwidth

sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Transmit Bandwidth

sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets

sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets

sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Received Packets Dropped

sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

  Rate of Transmitted Packets Dropped

sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace",pod=~"$pod"}[$interval:$resolution])) by (pod)

 

11.Networking / Workload

  • Dashboard

 

  • Variables

 

 

  • Sql

  Current Rate of Bytes Received

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Current Rate of Bytes Transmitted

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Rate of Bytes Received

sort_desc(avg(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Average Rate of Bytes Transmitted

sort_desc(avg(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Receive Bandwidth

sort_desc(sum(irate(container_network_receive_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Transmit Bandwidth

sort_desc(sum(irate(container_network_transmit_bytes_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets

sort_desc(sum(irate(container_network_receive_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets

sort_desc(sum(irate(container_network_transmit_packets_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Received Packets Dropped

sort_desc(sum(irate(container_network_receive_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

  Rate of Transmitted Packets Dropped

sort_desc(sum(irate(container_network_transmit_packets_dropped_total{cluster=~"$cluster",namespace=~"$namespace"}[$interval:$resolution])
* on (namespace,pod)
group_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{cluster=~"$cluster",namespace=~"$namespace", workload=~"$workload", workload_type="$type"}) by (pod))

 

12.Node

  • Dashboard

 

  • Variables

             

  • Sql

  Server resource overview table (10 rows per page)

node_uname_info{job=~"$job", cluster=~"$cluster"} - 0

sum(time() - node_boot_time_seconds{job=~"$job",cluster=~"$cluster"})by(instance)

node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"} - 0

count(node_cpu_seconds_total{job=~"$job",mode='system',cluster=~"$cluster"}) by (instance)

node_load5{job=~"$job",cluster=~"$cluster"}

(1 - avg(irate(node_cpu_seconds_total{job=~"$job",mode="idle",cluster=~"$cluster"}[5m])) by (instance)) * 100

(1 - (node_memory_MemAvailable_bytes{job=~"$job",cluster=~"$cluster"} / (node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"})))* 100

max((node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}-node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}) *100/(node_filesystem_avail_bytes {job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}+(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"}-node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"ext.?|xfs"})))by(instance)

max(irate(node_disk_read_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])) by (instance)

max(irate(node_disk_written_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])) by (instance)

max(irate(node_network_receive_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])*8) by (instance)

max(irate(node_network_transmit_bytes_total{job=~"$job",cluster=~"$cluster"}[5m])*8) by (instance)

 

  $job: overall total load and overall average CPU utilization

count(node_cpu_seconds_total{job=~"$job",cluster=~"$cluster", mode='system'})

sum(node_load5{job=~"$job",cluster=~"$cluster"})

avg(1 - avg(irate(node_cpu_seconds_total{job=~"$job",mode="idle",cluster=~"$cluster"}[5m])) by (instance)) * 100

 

  $job: overall total memory and overall average memory utilization

sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"})

sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"} - node_memory_MemAvailable_bytes{job=~"$job",cluster=~"$cluster"})

(sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"} - node_memory_MemAvailable_bytes{job=~"$job",cluster=~"$cluster"}) / sum(node_memory_MemTotal_bytes{job=~"$job",cluster=~"$cluster"}))*100

 

  $job: overall total disk space and overall average disk utilization

sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))

sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))

(sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))) *100/(sum(avg(node_filesystem_avail_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))+(sum(avg(node_filesystem_size_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance)) - sum(avg(node_filesystem_free_bytes{job=~"$job",cluster=~"$cluster",fstype=~"xfs|ext.*"})by(device,instance))))

 

  Uptime

avg(time() - node_boot_time_seconds{instance=~"$node",cluster=~"$cluster"})

 

  CPU cores

count(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node", mode='system'})

 

  Total memory

sum(node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"})

 

  (no title)

sum(node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"})

avg(irate(node_cpu_seconds_total{instance=~"$node",mode="iowait",cluster=~"$cluster"}[5m])) * 100

(1 - (node_memory_MemAvailable_bytes{instance=~"$node",cluster=~"$cluster"} / (node_memory_MemTotal_bytes{instance=~"$node",cluster=~"$cluster"})))* 100

(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"})*100 /(node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}+(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint="$maxmount"}))

(1 - ((node_memory_SwapFree_bytes{cluster=~"$cluster",instance=~"$node"} + 1)/ (node_memory_SwapTotal_bytes{cluster=~"$cluster",instance=~"$node"} + 1))) * 100
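The last query in this panel adds 1 to both `node_memory_SwapFree_bytes` and `node_memory_SwapTotal_bytes`. A plausible reason (an assumption, the source does not say) is to keep hosts without swap from producing 0/0, which PromQL evaluates to NaN; with the offset they report 0% instead. The sketch below shows the arithmetic with a hypothetical helper name.

```python
def swap_used_percent(swap_free_bytes, swap_total_bytes):
    """Swap usage in percent, with the same +1 offset the query uses."""
    return (1 - (swap_free_bytes + 1) / (swap_total_bytes + 1)) * 100


# A host with swap disabled reports 0% rather than an undefined value.
no_swap = swap_used_percent(0, 0)

# With real swap sizes the extra byte is negligible; small numbers are used
# here only so the arithmetic is easy to follow.
half_used = swap_used_percent(511, 1023)
print(no_swap, half_used)
```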

 

  [$show_hostname]: available space per partition (EXT.*/XFS)

node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-0

node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-0

(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}) *100/(node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}+(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}))

 

  CPU iowait

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="iowait"}[5m])) * 100

 

  Free inodes: $maxmount

avg(node_filesystem_files_free{cluster=~"$cluster",instance=~"$node",mountpoint="$maxmount",fstype=~"ext.?|xfs"})

 

  Total file descriptors

avg(node_filefd_maximum{cluster=~"$cluster",instance=~"$node"})

 

  Hourly traffic $device

increase(node_network_receive_bytes_total{cluster=~"$cluster",instance=~"$node",device=~"$device"}[60m])

increase(node_network_transmit_bytes_total{cluster=~"$cluster",instance=~"$node",device=~"$device"}[60m])

 

  CPU utilization

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="system"}[5m])) by (instance) *100

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="user"}[5m])) by (instance) *100

avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="iowait"}[5m])) by (instance) *100

(1 - avg(irate(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node",mode="idle"}[5m])) by (instance))*100
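The first three queries of this panel report the share of time spent in one mode (system, user, iowait), while the fourth derives total utilization indirectly: it averages the idle rate over all cores and subtracts it from 1. A minimal sketch of that last step, assuming the per-core idle rates (seconds of idle time per second) are already computed; the function name and the values are hypothetical.

```python
def cpu_utilization_percent(idle_rates):
    """Total CPU utilization from per-core idle rates (each between 0 and 1)."""
    avg_idle = sum(idle_rates) / len(idle_rates)  # avg(...) by (instance)
    return (1 - avg_idle) * 100


# Four cores: one half idle, two three-quarters idle, one fully idle.
utilization = cpu_utilization_percent([0.5, 0.75, 1.0, 0.75])
print(utilization)
```

Deriving utilization from the idle mode covers every non-idle mode at once, so it does not need a separate term for each of them.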

  

  Memory information

node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"} - node_memory_MemAvailable_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemAvailable_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_Buffers_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemFree_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_Cached_bytes{cluster=~"$cluster",instance=~"$node"}

node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"} - (node_memory_Cached_bytes{cluster=~"$cluster",instance=~"$node"} + node_memory_Buffers_bytes{cluster=~"$cluster",instance=~"$node"} + node_memory_MemFree_bytes{cluster=~"$cluster",instance=~"$node"})

(1 - (node_memory_MemAvailable_bytes{cluster=~"$cluster",instance=~"$node"} / (node_memory_MemTotal_bytes{cluster=~"$cluster",instance=~"$node"})))* 100

 

  Network bandwidth usage per second $device

irate(node_network_receive_bytes_total{cluster=~"$cluster",instance=~'$node',device=~"$device"}[5m])*8

irate(node_network_transmit_bytes_total{cluster=~"$cluster",instance=~'$node',device=~"$device"}[5m])*8

 

  System load average

node_load1{cluster=~"$cluster",instance=~"$node"}

node_load5{cluster=~"$cluster",instance=~"$node"}

node_load15{cluster=~"$cluster",instance=~"$node"}

sum(count(node_cpu_seconds_total{cluster=~"$cluster",instance=~"$node", mode='system'}) by (cpu,instance)) by(instance)

 

  Disk bytes read/written per second

irate(node_disk_read_bytes_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_written_bytes_total{cluster=~"$cluster",instance=~"$node"}[5m])

 

  Disk utilization

(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}) *100/(node_filesystem_avail_bytes {cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}+(node_filesystem_size_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}-node_filesystem_free_bytes{cluster=~"$cluster",instance=~'$node',fstype=~"ext.*|xfs",mountpoint !~".*pod.*"}))

node_filesystem_files_free{cluster=~"$cluster",instance=~'$node',fstype=~"ext.?|xfs"} / node_filesystem_files{cluster=~"$cluster",instance=~'$node',fstype=~"ext.?|xfs"}
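The disk utilization query divides used space (`size - free`) by `avail + used` rather than by `size`. Since `node_filesystem_avail_bytes` is the space available to non-root users while `node_filesystem_free_bytes` also counts reserved blocks, the denominator is the capacity an ordinary user can actually fill, which appears to match how `df` reports its Use% column. The sketch below compares both denominators; the function names and numbers are hypothetical.

```python
def disk_used_percent(size_bytes, free_bytes, avail_bytes):
    """Utilization as in the query: used / (avail + used)."""
    used = size_bytes - free_bytes
    return used * 100 / (avail_bytes + used)


def naive_used_percent(size_bytes, free_bytes):
    """Utilization against the raw size, ignoring reserved blocks."""
    return (size_bytes - free_bytes) * 100 / size_bytes


# Hypothetical filesystem: 100 units in total, 40 free, of which only 20
# are available to non-root users (the rest is reserved).
query_style = disk_used_percent(100, 40, 20)
naive = naive_used_percent(100, 40)
print(query_style, naive)
```

With reserved blocks present, the query-style figure is higher than the naive one, so it reaches 100% at the point where ordinary users can no longer write.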

 

  Disk read/write rate (IOPS)

irate(node_disk_reads_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_writes_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

node_disk_io_now{cluster=~"$cluster",instance=~"$node"}

 

  Fraction of each second spent on I/O operations

irate(node_disk_io_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m])

 

  Time per I/O read/write (reference: under 100 ms) (beta)

irate(node_disk_read_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m]) / irate(node_disk_reads_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_write_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m]) / irate(node_disk_writes_completed_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_io_time_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m])

irate(node_disk_io_time_weighted_seconds_total{cluster=~"$cluster",instance=~"$node"}[5m])
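The first two queries of this panel estimate the time per operation by dividing the rate of accumulated I/O time by the rate of completed operations. Both rates are taken over the same interval, so the interval cancels out and the result is simply time spent divided by operations finished. A small sketch of that division, with a hypothetical helper name and made-up counter values:

```python
def avg_seconds_per_io(time_before, time_after, ops_before, ops_after):
    """Average seconds per I/O operation between two counter readings."""
    ops = ops_after - ops_before
    if ops == 0:
        return None  # no operations completed; PromQL would yield NaN here
    return (time_after - time_before) / ops


# Hypothetical readings: 0.5 s of read time spent on 100 completed reads.
latency = avg_seconds_per_io(10.0, 10.5, 1000, 1100)
print(latency)  # seconds per read
```

Multiplying the result by 1000 gives milliseconds, which is the unit the panel title uses for its reference value.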

 

  Network socket connection information

node_netstat_Tcp_CurrEstab{cluster=~"$cluster",instance=~'$node'}

node_sockstat_TCP_tw{cluster=~"$cluster",instance=~'$node'}

node_sockstat_sockets_used{cluster=~"$cluster",instance=~'$node'}

node_sockstat_UDP_inuse{cluster=~"$cluster",instance=~'$node'}

node_sockstat_TCP_alloc{cluster=~"$cluster",instance=~'$node'}

irate(node_netstat_Tcp_PassiveOpens{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_ActiveOpens{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_InSegs{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_OutSegs{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_Tcp_RetransSegs{cluster=~"$cluster",instance=~'$node'}[5m])

irate(node_netstat_TcpExt_ListenDrops{cluster=~"$cluster",instance=~'$node'}[5m])

 

  Open file descriptors (left) / context switches per second (right)

node_filefd_allocated{cluster=~"$cluster",instance=~"$node"}

irate(node_context_switches_total{cluster=~"$cluster",instance=~"$node"}[5m])

(node_filefd_allocated{cluster=~"$cluster",instance=~"$node"}/node_filefd_maximum{cluster=~"$cluster",instance=~"$node"}) *100

 

13.Pods

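  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Cannot be viewed on Tencent Cloud.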

 

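14.Proxy

  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Cannot be viewed on Tencent Cloud.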

 

15.Scheduler

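  • Dashboard

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

  • Sql

  Cannot be viewed on Tencent Cloud.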

 

16.StatefulSets

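  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Cannot be viewed on Tencent Cloud.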

 

17.Persistent Volumes

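  • Dashboard

  No PVs are in use, so the queries may return no data.

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Cannot be viewed on Tencent Cloud.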

 

18.Kubelet

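  • Dashboard

 

  • Variables

  No variables are shown; you can write your own based on the earlier templates.

 

  • Sql

  Cannot be viewed on Tencent Cloud.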

 

 

posted @ 2021-11-21 16:49  小家电维修