Monitoring the status of services registered in Eureka with Prometheus
1. Write eureka_exporter.py in Python
pip install flask requests prometheus_client
from flask import Flask, Response
import requests
from prometheus_client import CONTENT_TYPE_LATEST, CollectorRegistry, Gauge, generate_latest

app = Flask(__name__)

# Set up the metric on a dedicated registry so that only the Eureka gauge is exposed
registry = CollectorRegistry(auto_describe=False)
downgrade = Gauge('eureka', 'eureka',
                  ['instanceId', 'hostName', 'app', 'status', 'port'],
                  registry=registry)


@app.route("/metrics")
def metrics():
    # Drop the series from the previous scrape so deregistered instances disappear
    downgrade.clear()
    # Eureka address; replace it with your own (e.g. http://localhost:8761/eureka/apps/ for a local server)
    eureka_url = "http://10.20.11.138:9500/eureka/apps/"
    try:
        # Ask for JSON; the timeout keeps a hung Eureka from blocking the scrape
        response = requests.get(eureka_url, headers={"Accept": "application/json"}, timeout=5)
        if response.status_code == 200:
            applications = response.json()['applications']['application']
            # Walk every application and record the status of each instance
            for application in applications:
                app_name = application['name']
                for instance in application['instance']:
                    status = instance['status']
                    # 1 when the instance is UP, 0 for any other status
                    downgrade.labels(instanceId=instance['instanceId'],
                                     hostName=instance['hostName'],
                                     app=app_name,
                                     status=status,
                                     port=instance['port']['$']).set(1 if status == 'UP' else 0)
        else:
            print(f"Failed to fetch data from Eureka. Status code: {response.status_code}")
    except Exception as e:
        print(f"Error fetching data from Eureka: {e}")
    return Response(generate_latest(registry), mimetype=CONTENT_TYPE_LATEST)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000)
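The parsing step inside metrics() can also be pulled out into a pure function, which makes it easy to check without a running Eureka. The following is only a sketch: the function name flatten_instances and the sample payload are hypothetical, and the handling of a single entry serialized as an object instead of a list is an assumption about some Eureka JSON output, not something the exporter above does.

```python
def flatten_instances(payload):
    """Turn an Eureka /eureka/apps JSON payload into (labels, value) pairs."""
    samples = []
    applications = payload.get("applications", {}).get("application", [])
    # Assumption: a lone application may arrive as an object rather than a list.
    if isinstance(applications, dict):
        applications = [applications]
    for application in applications:
        instances = application.get("instance", [])
        # Assumption: the same may happen for a lone instance.
        if isinstance(instances, dict):
            instances = [instances]
        for instance in instances:
            labels = {
                "instanceId": instance["instanceId"],
                "hostName": instance["hostName"],
                "app": application["name"],
                "status": instance["status"],
                "port": str(instance["port"]["$"]),
            }
            # Same rule as the exporter: 1 for UP, 0 for anything else.
            samples.append((labels, 1 if instance["status"] == "UP" else 0))
    return samples


# Hypothetical payload containing only the fields the exporter reads.
sample_payload = {
    "applications": {
        "application": [
            {
                "name": "ORDER-SERVICE",
                "instance": [
                    {"instanceId": "host-a:order-service:8080", "hostName": "host-a",
                     "status": "UP", "port": {"$": 8080, "@enabled": "true"}},
                    {"instanceId": "host-b:order-service:8080", "hostName": "host-b",
                     "status": "DOWN", "port": {"$": 8080, "@enabled": "true"}},
                ],
            },
            {
                "name": "USER-SERVICE",
                "instance": {"instanceId": "host-c:user-service:8081", "hostName": "host-c",
                             "status": "UP", "port": {"$": 8081, "@enabled": "true"}},
            },
        ]
    }
}

samples = flatten_instances(sample_payload)
for labels, value in samples:
    print(labels["app"], labels["hostName"], labels["status"], value)
```

Each returned pair maps directly onto one downgrade.labels(**labels).set(value) call in the exporter.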
2. Configure Prometheus:
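Edit prometheus.yml and add an entry under scrape_configs for the exporter. You need to specify job_name, scrape_interval, metrics_path, and targets. Note that the target is the exporter from step 1, not the Eureka Server itself. For example, if the exporter runs at 10.124.129.42:8000 and exposes the /metrics endpoint, the configuration might look like this: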
  - job_name: 'eureka_exporter'
    scrape_interval: 10s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.124.129.42:8000']
3. Configure alerting
# This group goes under the top-level "groups:" key of a Prometheus rules file
- name: Eureka_Rules
  rules:
    # Alert when a service registered in Eureka is down
    - alert: EurekaDown
      expr: eureka == 0
      for: 5s
      labels:
        severity: critical
      annotations:
        summary: "{{ $labels.app }} instance down on {{ $labels.hostName }}"
        description: "{{ $labels.app }} instance {{ $labels.hostName }} has been down for more than 5 seconds."
    # Alert when an application has fewer than 2 registered instances
    - alert: EurekaInstanceCountBelowTwo
      expr: count by (app) (eureka{job="eureka_exporter"}) < 2
      for: 5s
      labels:
        severity: critical
      annotations:
        summary: "{{ $labels.app }} has fewer than 2 instances"
        description: "{{ $labels.app }} has fewer than 2 instances"
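To make it clearer what the two expressions select, here is a rough Python imitation of them over a handful of samples. It is a sketch only: the helper names down_instances and apps_below and the sample data are hypothetical, and the `for:` duration and the `job` matcher are not modeled.

```python
from collections import Counter


def down_instances(samples):
    """Imitate `eureka == 0`: keep the label sets whose value is 0."""
    return [labels for labels, value in samples if value == 0]


def apps_below(samples, minimum=2):
    """Imitate `count by (app) (eureka) < 2`: apps with fewer than `minimum` series."""
    counts = Counter(labels["app"] for labels, _ in samples)
    return {app: n for app, n in counts.items() if n < minimum}


# Hypothetical samples, as the exporter would expose them (labels trimmed for brevity).
samples = [
    ({"app": "ORDER-SERVICE", "hostName": "host-a", "status": "UP"}, 1),
    ({"app": "ORDER-SERVICE", "hostName": "host-b", "status": "DOWN"}, 0),
    ({"app": "USER-SERVICE", "hostName": "host-c", "status": "UP"}, 1),
]

print("EurekaDown would match:", down_instances(samples))
print("Count rule would match:", apps_below(samples))
```

Two things stand out. The count covers every series regardless of its value, so ORDER-SERVICE with one UP and one DOWN instance still counts as 2 and does not trigger the second rule. An application with no registered instances produces no series at all, so neither rule can match it.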
