2025: 10 Minutes a Day, Learn K8S with Me (45) - Prometheus (3): Adding System Monitoring Targets
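In the previous chapter we finished installing Prometheus, and the Targets page in Prometheus already shows some system components being scraped, for example kube-apiserver and kubelet. Some system components are still missing, however: kube-scheduler, kube-controller-manager, and kube-proxy are not monitored yet.
As shown in the figure below:
In this section we look at how to add these system monitoring targets to Prometheus.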
Configuring kube-scheduler monitoring
1. First, change the bind address in /etc/kubernetes/manifests/kube-scheduler.yaml. While the scheduler listens only on 127.0.0.1, Prometheus cannot reach its metrics port from another Pod.
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
==== change to ====
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=0.0.0.0 # change this line
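The same change can be scripted. The following is a minimal sketch, assuming GNU sed and a kubeadm-style static Pod manifest; the function name set_bind_address is hypothetical and not part of any Kubernetes tooling. The kubelet watches /etc/kubernetes/manifests and recreates the Pod when the file changes, so no separate restart command is needed.

```shell
# set_bind_address MANIFEST ADDR
# Rewrites the --bind-address flag of a static Pod manifest in place.
# Hypothetical helper; assumes GNU sed (-i without a suffix argument).
set_bind_address() {
  manifest="$1"
  addr="$2"
  # Keep the backup outside the manifests directory so the kubelet
  # does not try to load it as a second static Pod.
  cp "$manifest" "/tmp/$(basename "$manifest").bak"
  sed -i -e "s/--bind-address=[0-9.]*/--bind-address=${addr}/" "$manifest"
}

# Usage on a control-plane node (run as root):
#   set_bind_address /etc/kubernetes/manifests/kube-scheduler.yaml 0.0.0.0
```

The same helper applies to kube-controller-manager.yaml, since both manifests use the same flag.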
2. Next, look at manifests/kubernetesControlPlane-serviceMonitorKubeScheduler.yaml. Comments have been added here to explain each field.
apiVersion: monitoring.coreos.com/v1 # API group of the ServiceMonitor CRD
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: kube-scheduler # identifies kube-scheduler as the monitoring target
    app.kubernetes.io/part-of: kube-prometheus # marks the resource as part of kube-prometheus
  name: kube-scheduler # name of the ServiceMonitor
  namespace: monitoring # deployed in the monitoring namespace
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token # authenticate with the ServiceAccount token
    interval: 30s # scrape every 30 seconds
    port: https-metrics # name of the Service port to scrape
    scheme: https # scrape over HTTPS (uses the tlsConfig below)
    tlsConfig:
      insecureSkipVerify: true # skip certificate verification (use a valid certificate in production)
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 5s # higher scrape frequency, for metrics that need to be near real time
    metricRelabelings:
    - action: drop # metric relabeling: drop the matching metrics
      regex: process_start_time_seconds # regex matching the metric names to drop
      sourceLabels:
      - __name__ # match against the metric name
    path: /metrics/slis # SLI (service level indicator) metrics path
    port: https-metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: app.kubernetes.io/name # use the value of this Service label as the Prometheus job name
  namespaceSelector:
    matchNames:
    - kube-system # discover Services only in the kube-system namespace
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-scheduler # label the target Service must carry
3. The matchLabels block at the end shows that the ServiceMonitor selects a Service labeled app.kubernetes.io/name: kube-scheduler, but kubectl get svc -n kube-system shows that no such Service exists.
root@k8s-master:~/prometheus/kube-prometheus-0.14.0/manifests# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 21d
kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 24h
metrics-server ClusterIP 10.105.177.80 <none> 443/TCP 2d
So we create a Service and give it the matching label.
# kube-scheduler-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
    app.kubernetes.io/name: kube-scheduler # must match the ServiceMonitor's matchLabels
spec:
  clusterIP: None # headless Service
  ports:
  - name: https-metrics
    port: 10259
    targetPort: 10259
    protocol: TCP
  selector:
    component: kube-scheduler # must match the Pod's label
# Apply the YAML file
root@k8s-master:~/prometheus/kube-prometheus-0.14.0/manifests# kubectl apply -f kube-scheduler-service.yaml
service/kube-scheduler created
# List the Services again
root@k8s-master:~/prometheus/kube-prometheus-0.14.0/manifests# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-scheduler ClusterIP None <none> 10259/TCP 67s
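A Service that exists can still select nothing if its selector does not match any Pod. The sketch below wraps two standard kubectl queries to check that; the function names are hypothetical, and it assumes the Pods carry a component=<name> label, as the selector above does.

```shell
# pods_for_component COMPONENT
# Lists kube-system Pods carrying the label component=COMPONENT,
# which is the label the Service selector relies on.
pods_for_component() {
  kubectl get pods -n kube-system -l "component=$1" -o name
}

# endpoints_for_service NAME
# Prints the IPs Kubernetes resolved for the Service's selector.
endpoints_for_service() {
  kubectl get endpoints "$1" -n kube-system \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Usage:
#   pods_for_component kube-scheduler
#   endpoints_for_service kube-scheduler
```

If the second command prints nothing, Prometheus has no address to scrape, and the selector or the Pod labels are the first thing to compare.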
4. After a short while, go back to the Targets page; the kube-scheduler target is now listed.
Configuring kube-controller-manager monitoring
1. First, change the bind address in /etc/kubernetes/manifests/kube-controller-manager.yaml
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
==== change to ====
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=0.0.0.0 # change this line
2. Look at the selector at the end of kubernetesControlPlane-serviceMonitorKubeControllerManager.yaml. Only the final matchLabels matters here: app.kubernetes.io/name: kube-controller-manager
# vim manifests/kubernetesControlPlane-serviceMonitorKubeControllerManager.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: kube-controller-manager
    app.kubernetes.io/part-of: kube-prometheus
  name: kube-controller-manager
  namespace: monitoring
spec:
  ....
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-controller-manager
3. The matchLabels block shows that the ServiceMonitor selects a Service labeled app.kubernetes.io/name: kube-controller-manager, but kubectl get svc -n kube-system shows that no such Service exists. Create the Service YAML file directly; it must carry the label app.kubernetes.io/name: kube-controller-manager. The file also declares an Endpoints object.
# kube-controller-manager-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kube-controller-manager
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kube-controller-manager # must match the ServiceMonitor's matchLabels
spec:
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10257
    targetPort: 10257
    protocol: TCP
  selector:
    component: kube-controller-manager
---
# Note: because the Service above has a selector, Kubernetes maintains its
# Endpoints automatically. A manual Endpoints object like this one is only
# needed when the Service is defined without a selector.
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.21.176.3 # replace with the actual IP of the controller manager
  ports:
  - name: https-metrics
    port: 10257
4. After a short while, go back to the Targets page; the kube-controller-manager target is now listed.
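If the target does not appear or shows as down, probing the metrics endpoint by hand can narrow the cause. This is a rough sketch, not part of kube-prometheus: metrics_status is a hypothetical helper, and the usage assumes kubectl 1.24 or later (for create token) and that the prometheus-k8s ServiceAccount exists in the monitoring namespace.

```shell
# metrics_status HOST PORT TOKEN
# Prints the HTTP status code returned by https://HOST:PORT/metrics.
# -k skips certificate verification, mirroring insecureSkipVerify: true.
metrics_status() {
  curl -sk -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $3" "https://$1:$2/metrics"
}

# Usage (assumptions: kubectl >= 1.24, ServiceAccount prometheus-k8s):
#   TOKEN="$(kubectl -n monitoring create token prometheus-k8s)"
#   metrics_status 172.21.176.3 10257 "$TOKEN"
```

A failed connection usually points at the bind address still being 127.0.0.1, while a 401 or 403 points at the token or its RBAC permissions.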
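Configuring kube-proxy monitoring
With the two examples above, the pattern is clear. Monitoring a system component takes three steps:
- Change the address the component's metrics endpoint listens on
- Check for, or create, the ServiceMonitor that tells Prometheus about the target
- Check for, or create, the Service and Endpoints so that Prometheus can actually reach the metrics data
With that in mind, let's move on to kube-proxy.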
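Since the Service in the third step has the same shape for every component, it can be generated rather than written by hand. The helper below is a hypothetical sketch, not part of kube-prometheus; it assumes the ServiceMonitor selects on app.kubernetes.io/name, as the two bundled ones above do, and takes the Pod selector as a single "key: value" string.

```shell
# render_metrics_service NAME PORT_NAME PORT SELECTOR
# Prints a headless Service manifest for a kube-system component.
# SELECTOR is a single "key: value" Pod label.
render_metrics_service() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
  namespace: kube-system
  labels:
    app.kubernetes.io/name: $1
spec:
  clusterIP: None
  ports:
  - name: $2
    port: $3
    targetPort: $3
    protocol: TCP
  selector:
    $4
EOF
}

# Usage: print the manifest, or pipe it straight into kubectl.
#   render_metrics_service kube-scheduler https-metrics 10259 "component: kube-scheduler"
#   render_metrics_service kube-scheduler https-metrics 10259 "component: kube-scheduler" | kubectl apply -f -
```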
1. Change the metrics bind address
kube-proxy runs as Pods and has no standalone configuration file on the node; its settings live in the kube-proxy ConfigMap.
Run kubectl -n kube-system get configmap kube-proxy -o yaml | grep metricsBindAddress and make sure the output is metricsBindAddress: "0.0.0.0:10249". If the value is 127.0.0.1 or empty (an empty value falls back to the default, 127.0.0.1:10249), edit the ConfigMap.
kubectl -n kube-system get configmap kube-proxy -o yaml | grep metricsBindAddress
# Edit the ConfigMap
kubectl -n kube-system edit configmap kube-proxy
    metricsBindAddress: ""
==== change to ====
    metricsBindAddress: "0.0.0.0:10249"
# Restart kube-proxy
kubectl -n kube-system rollout restart daemonset/kube-proxy
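The grep check above can be turned into a pass/fail test, which is handy in a script. This is a small sketch with a hypothetical function name; it reads the ConfigMap YAML on stdin and accepts the value with or without quotes.

```shell
# metrics_bind_is_open
# Reads kube-proxy configuration YAML on stdin and succeeds only if
# metricsBindAddress is set to 0.0.0.0:10249 (quoted or unquoted).
metrics_bind_is_open() {
  grep -Eq '^[[:space:]]*metricsBindAddress:[[:space:]]*"?0\.0\.0\.0:10249"?[[:space:]]*$'
}

# Usage:
#   kubectl -n kube-system get configmap kube-proxy -o yaml | metrics_bind_is_open \
#     && echo "metrics port is exposed" \
#     || echo "edit the ConfigMap and restart kube-proxy"
```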
2. Create the ServiceMonitor file
Note: no TLS configuration is needed, because kube-proxy serves its metrics over plain HTTP.
# vim manifests/kubernetesControlPlane-serviceMonitorKube-Proxy.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-proxy
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: http-metrics # must match the Service port name
    scheme: http # plain HTTP
  selector:
    matchLabels:
      k8s-app: kube-proxy
  namespaceSelector:
    matchNames: [kube-system]
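3. Create the Service file
# vim manifests/kube-proxy-service.yaml
# Headless Service matching the ServiceMonitor above:
# label k8s-app: kube-proxy, port name http-metrics.
apiVersion: v1
kind: Service
metadata:
  name: kube-proxy
  namespace: kube-system
  labels:
    k8s-app: kube-proxy # must match the ServiceMonitor's matchLabels
spec:
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10249
    targetPort: 10249
    protocol: TCP
  selector:
    k8s-app: kube-proxy # assumes the kube-proxy Pods carry this label, as in a kubeadm cluster
---
# Note: because the Service above has a selector, Kubernetes maintains its
# Endpoints automatically. A manual Endpoints object like this one is only
# needed when the Service is defined without a selector.
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-proxy
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.21.176.3 # replace with the actual node IP
  ports:
  - name: http-metrics
    port: 10249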
4. Apply the files and check the result
root@k8s-master:~/prometheus/kube-prometheus-0.14.0# kubectl apply -f manifests/kubernetesControlPlane-serviceMonitorKube-Proxy.yaml
servicemonitor.monitoring.coreos.com/kube-proxy created
root@k8s-master:~/prometheus/kube-prometheus-0.14.0# kubectl apply -f manifests/kube-proxy-service.yaml
service/kube-proxy created