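The Flink cluster is already deployed, but monitoring was never set up. This note collects its metrics through Pushgateway.
1. Prometheus is deployed with the operator, so start by deploying Pushgateway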
# cat pushgateway.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pushgateway
  name: pushgateway
spec:
  replicas: 1
  revisionHistoryLimit: 5
  selector:
    matchLabels:
      app: pushgateway
  template:
    metadata:
      labels:
        app: pushgateway
    spec:
      containers:
      - image: prom/pushgateway
        imagePullPolicy: Always
        name: pushgateway
---
apiVersion: v1
kind: Service
metadata:
  name: pushgateway
  labels:
    app: pushgateway
spec:
  ports:
  - port: 9091
    name: server
    targetPort: 9091
  selector:
    app: pushgateway
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    release: monitor
  name: monitor-flink
spec:
  endpoints:
  - path: /metrics
    port: server
    relabelings:
    - action: replace
      regex: ^(.*)$
      replacement: $1
      separator: ;
      sourceLabels:
      - __meta_kubernetes_service_label_port
      targetLabel: service_port
    - action: replace
      regex: ^(.*)$
      replacement: $1
      separator: ;
      sourceLabels:
      - __meta_kubernetes_service_label_ip
      targetLabel: hostip
    - action: replace
      regex: ^(.*)$
      replacement: $1
      separator: ;
      sourceLabels:
      - __meta_kubernetes_service_label_name
      targetLabel: app_name
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:
      app: pushgateway
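The manifest can be applied and checked roughly as follows. This is a minimal sketch, not the author's exact procedure: the `monitoring` namespace is an assumption inferred from the ServiceMonitor's namespaceSelector and from the `pushgateway.monitoring` host used in the Flink config, and `deploy_pushgateway` is a hypothetical helper name.

```shell
# Assumption: the target namespace is "monitoring" (the manifest itself sets none).
NAMESPACE="${NAMESPACE:-monitoring}"

# Hypothetical helper: apply the manifest, wait for the rollout, list the results.
deploy_pushgateway() {
  kubectl apply -n "$NAMESPACE" -f pushgateway.yaml &&
    kubectl -n "$NAMESPACE" rollout status deployment/pushgateway &&
    kubectl -n "$NAMESPACE" get service/pushgateway servicemonitor/monitor-flink
}
```

Run `deploy_pushgateway` from the directory that contains pushgateway.yaml. The chain stops at the first failing command, so a broken rollout is not hidden by the final listing.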
2. Edit flink-conf.yaml and add the metrics reporter settings
# cat flink-conf.yaml
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
# host is the in-cluster DNS name of Pushgateway
metrics.reporter.promgateway.host: pushgateway.monitoring
metrics.reporter.promgateway.port: 9091
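If the file is edited by a script rather than by hand, the three settings above could be added idempotently along these lines. This is only a sketch: `set_conf` and `add_pushgateway_reporter` are hypothetical helper names, and the path to flink-conf.yaml is whatever your installation uses.

```shell
# set_conf FILE KEY VALUE: append "KEY: VALUE" to FILE unless a line
# already starts with "KEY:" (exact prefix match, no regex surprises from dots).
set_conf() {
  file=$1 key=$2 value=$3
  if awk -v k="$key" 'index($0, k ":") == 1 { found = 1 } END { exit !found }' "$file"; then
    echo "skip ${key} (already set)"
  else
    printf '%s: %s\n' "$key" "$value" >> "$file"
  fi
}

# Hypothetical wrapper that adds the three reporter settings from the listing above.
add_pushgateway_reporter() {
  conf=$1
  prefix=metrics.reporter.promgateway
  set_conf "$conf" "$prefix.class" org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
  set_conf "$conf" "$prefix.host" pushgateway.monitoring
  set_conf "$conf" "$prefix.port" 9091
}
```

For example, `add_pushgateway_reporter conf/flink-conf.yaml` leaves any value that is already present untouched, so running it twice does not duplicate keys.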
3. Restart the Flink service and check that Prometheus is collecting the metrics, then configure the monitoring items and Grafana dashboards
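Before looking in Prometheus, it can help to confirm that Flink is pushing at all by reading Pushgateway's own /metrics page. The sketch below counts distinct metric names that start with `flink_`, on the assumption that the reporter uses that prefix. `count_flink_metrics` is a hypothetical helper, and the port-forward in the usage comment assumes the `monitoring` namespace.

```shell
# Read Prometheus text-format metrics on stdin and print the number of
# distinct metric names starting with "flink_" (comment lines are ignored).
count_flink_metrics() {
  grep -v '^#' | grep -o '^flink_[A-Za-z0-9_]*' | sort -u | wc -l | tr -d ' '
}

# Usage (in a second terminal, forward the Service port first):
#   kubectl -n monitoring port-forward svc/pushgateway 9091:9091
#   curl -s http://localhost:9091/metrics | count_flink_metrics
```

A result of 0 after the restart suggests the reporter is not loaded or cannot reach Pushgateway. A non-zero count means the remaining checks belong on the Prometheus side, such as the target list and the ServiceMonitor labels.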
