Building a Prometheus Cluster Architecture from Scratch (Prometheus Federation + Grafana + node_exporter + VictoriaMetrics Remote Storage)

I. Environment

Hostname              IP address   Operating system
prometheus-server31   10.0.0.31    Ubuntu 22.04 LTS
prometheus-server32   10.0.0.32    Ubuntu 22.04 LTS
prometheus-server33   10.0.0.33    Ubuntu 22.04 LTS
node-exporter41       10.0.0.41    Ubuntu 22.04 LTS
node-exporter42       10.0.0.42    Ubuntu 22.04 LTS
node-exporter43       10.0.0.43    Ubuntu 22.04 LTS

II. Service Versions

Service          Version  Download link
Prometheus       2.53.4   https://github.com/prometheus/prometheus/releases/download/v2.53.4/prometheus-2.53.4.linux-amd64.tar.gz
Grafana          9.5.21   https://dl.grafana.com/enterprise/release/grafana-enterprise_9.5.21_amd64.deb
node_exporter    1.9.0    https://github.com/prometheus/node_exporter/releases/download/v1.9.0/node_exporter-1.9.0.linux-amd64.tar.gz
VictoriaMetrics  1.93.16  https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.16/victoria-metrics-linux-amd64-v1.93.16.tar.gz
Consul           1.20.5   https://releases.hashicorp.com/consul/1.20.5/consul_1.20.5_linux_amd64.zip

III. Architecture

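  • The cluster is deployed in federation mode. Consul provides service discovery, the Prometheus servers collect basic metrics from the Linux hosts, the data is stored in a VictoriaMetrics remote store, and Grafana renders the dashboards.

  • About federation mode

    • It is similar to the enfeoffment system of ancient emperors: each lower-level server looks after its own targets and reports upward.

    • prometheus-server32 scrapes node-exporter41 directly.

    • prometheus-server33 scrapes the targets it discovers through the Consul cluster; node-exporter42 and node-exporter43 are registered with Consul and picked up automatically.

    • prometheus-server31 scrapes the data already collected by prometheus-server32 and prometheus-server33, that is, it aggregates them, and writes the result to the remote store.

    • Grafana reads from the remote store and renders the graphs.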

IV. Cluster Deployment

1. Install Prometheus on all prometheus-server nodes

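Installation is straightforward; see https://www.cnblogs.com/dezyan/p/18794577
I wrote a script for one-step deployment; the installation path is /dezyan/softwares/prometheus-2.53.4.linux-amd64/
The service is managed through systemd (systemctl start prometheus)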

2. Install node-exporter on all node-exporter nodes

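Installation is straightforward; see https://www.cnblogs.com/dezyan/p/18794577
I wrote a script for one-step deployment; the installation path is /dezyan/softwares/node_exporter-1.9.0.linux-amd64
The service is managed through systemd (systemctl start node-exporter.service)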

3. Deploy single-node VictoriaMetrics

  • Resources are limited, so I deployed it directly on the 10.0.0.43 node

3.1 Download and extract VictoriaMetrics

[root@node-exporter43 ~]# wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.93.16/victoria-metrics-linux-amd64-v1.93.16.tar.gz
[root@node-exporter43 ~]# tar xf victoria-metrics-linux-amd64-v1.93.16.tar.gz  -C /usr/local/bin/

3.2 Write the systemd unit file

[root@node-exporter43 ~]# cat > /etc/systemd/system/victoria-metrics.service <<EOF
[Unit]
Description=Linux VictoriaMetrics Server
Documentation=https://docs.victoriametrics.com/
After=network.target

[Service]
# -retentionPeriod without a unit suffix is counted in months (3 = three months)
ExecStart=/usr/local/bin/victoria-metrics-prod  \
   -httpListenAddr=0.0.0.0:8428 \
   -storageDataPath=/dezyan/data/victoria-metrics \
   -retentionPeriod=3

[Install]
WantedBy=multi-user.target
EOF

[root@node-exporter43 ~]# systemctl daemon-reload
[root@node-exporter43 ~]# systemctl enable --now victoria-metrics.service
[root@node-exporter43 ~]# systemctl status victoria-metrics

3.3 Check that the port is listening

[root@node-exporter43 ~]# ss -ntl | grep 8428
LISTEN 0      4096         0.0.0.0:8428      0.0.0.0:*

3.4 Open the web UI

http://10.0.0.43:8428/

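4. Deploy the Consul cluster

Resources are limited, so Consul is deployed directly on the three node-exporter nodes

4.1 Download and extract Consul, then copy it to the other nodes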

[root@node-exporter41 ~]# wget https://releases.hashicorp.com/consul/1.20.5/consul_1.20.5_linux_amd64.zip
[root@node-exporter41 ~]# unzip consul_1.20.5_linux_amd64.zip -d /usr/local/bin/

[root@node-exporter41 ~]# scp /usr/local/bin/consul 10.0.0.42:/usr/local/bin
[root@node-exporter41 ~]# scp /usr/local/bin/consul 10.0.0.43:/usr/local/bin

4.2 Start the Consul cluster

10.0.0.43 is designated as the cluster's bootstrap server

Server (node 43):
[root@node-exporter43 ~]# consul agent -server -bootstrap -bind=10.0.0.43 -data-dir=/dezyan/softwares/consul -client=10.0.0.43 -ui
Client (node 42):
[root@node-exporter42 ~]# consul agent  -bind=10.0.0.42 -data-dir=/dezyan/softwares/consul -client=10.0.0.42 -ui -retry-join=10.0.0.43
Node 41 (the -server flag makes it join as a second server rather than a plain client):
[root@node-exporter41 ~]# consul agent -server -bind=10.0.0.41 -data-dir=/dezyan/softwares/consul -client=10.0.0.41 -ui -retry-join=10.0.0.43

4.3 Check the listening port on each node and test the web UI

[root@node-exporter41 ~]# ss -ntl | grep 8500
[root@node-exporter42 ~]# ss -ntl | grep 8500
[root@node-exporter43 ~]# ss -ntl | grep 8500

http://10.0.0.41:8500/ui/dc1/nodes
http://10.0.0.42:8500/ui/dc1/nodes
http://10.0.0.43:8500/ui/dc1/nodes

5. Deploy Grafana

See https://www.cnblogs.com/dezyan/p/18794577

V. Cluster Configuration

1. prometheus-server32 node

1.1 Edit the configuration file

[root@prometheus-server32 ~]# cd /dezyan/softwares/prometheus-2.53.4.linux-amd64/
[root@prometheus-server32 ~]# vim prometheus.yml

  - job_name: linux96-node-exporter
    static_configs:
      - targets: 
        - 10.0.0.41:9100

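1.2 Reload the configuration

The reload endpoint is only available when Prometheus is started with --web.enable-lifecycle.
[root@prometheus-server32 ~]# curl -X POST http://10.0.0.32:9090/-/reload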
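
Before reloading, it can help to validate the edited file with the promtool binary that ships in the Prometheus tarball. The following is only a sketch under assumptions: the PROM_DIR default mirrors the installation path used in this post, and the DRY_RUN switch and function names are hypothetical helpers added for illustration.

```shell
#!/bin/sh
# Sketch: validate prometheus.yml with promtool, then request a reload.
# PROM_DIR and DRY_RUN are illustrative assumptions, not part of Prometheus.
PROM_DIR="${PROM_DIR:-/dezyan/softwares/prometheus-2.53.4.linux-amd64}"

# Print the two commands for a given server address, one per line.
reload_commands() {
  printf '%s/promtool check config %s/prometheus.yml\n' "$PROM_DIR" "$PROM_DIR"
  printf 'curl -fsS -X POST http://%s:9090/-/reload\n' "$1"
}

# Run the check first and only send the reload request if it passes.
# With DRY_RUN=1 the commands are printed instead of executed.
safe_reload() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    reload_commands "$1"
    return 0
  fi
  "$PROM_DIR/promtool" check config "$PROM_DIR/prometheus.yml" || return 1
  curl -fsS -X POST "http://$1:9090/-/reload"
}
```

On prometheus-server32 this would be called as `safe_reload 10.0.0.32`; the same function applies to the other two servers with their own addresses.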

2. prometheus-server33 node

2.1 Edit the configuration file

[root@prometheus-server33 ~]# cd /dezyan/softwares/prometheus-2.53.4.linux-amd64/
[root@prometheus-server33 ~]# vim prometheus.yml

  - job_name: "consul-service-discovery"
    # Service discovery backed by Consul
    consul_sd_configs:
        # Address of the Consul server; defaults to "localhost:8500" if omitted.
      - server: 10.0.0.43:8500
      - server: 10.0.0.42:8500
      - server: 10.0.0.41:8500
    relabel_configs:
        # Source label to match: the Consul service name
      - source_labels: [__meta_consul_service]
        # Regular expression applied to the source label; defaults to "(.*)" if omitted
        regex: consul
        # Drop matching targets (here, the Consul service itself). The default action
        # is "replace"; the other valid actions are listed at:
        #   https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_action
        action: drop

2.2 Reload the configuration

[root@prometheus-server33 ~]# curl -X POST http://10.0.0.33:9090/-/reload

2.3 Register the monitored nodes with the Consul cluster

[root@node-exporter42 ~]# curl -X PUT -d '{"id":"prometheus-node42","name":"dezyan-prometheus-node42","address":"10.0.0.42","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.0.0.42:9100","interval":"5m"}]}' http://10.0.0.42:8500/v1/agent/service/register

[root@node-exporter43 ~]# curl -X PUT -d '{"id":"prometheus-node43","name":"dezyan-prometheus-node43","address":"10.0.0.43","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.0.0.43:9100","interval":"5m"}]}' http://10.0.0.43:8500/v1/agent/service/register

Alternatively, use a tool such as Postman

PUT http://10.0.0.42:8500/v1/agent/service/register
{"id":"prometheus-node42","name":"dezyan-prometheus-node42","address":"10.0.0.42","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.0.0.42:9100","interval":"5m"}]}

PUT http://10.0.0.43:8500/v1/agent/service/register
{"id":"prometheus-node43","name":"dezyan-prometheus-node43","address":"10.0.0.43","port":9100,"tags":["node-exporter"],"checks": [{"http":"http://10.0.0.43:9100","interval":"5m"}]}
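
Because the two registration bodies differ only in the node number and IP address, they are easy to mistype when copied by hand. One possible way to avoid that is to generate the body from two arguments. This is a sketch only: the function names are hypothetical, and it assumes each node registers with the Consul agent running on its own address, as in the commands above.

```shell
#!/bin/sh
# Build the JSON body for registering one node-exporter instance.
# $1 = node number used in the id and name (for example 42), $2 = node IP.
registration_payload() {
  printf '{"id":"prometheus-node%s","name":"dezyan-prometheus-node%s","address":"%s","port":9100,"tags":["node-exporter"],"checks":[{"http":"http://%s:9100","interval":"5m"}]}\n' "$1" "$1" "$2" "$2"
}

# Send the body to the Consul agent listening on the same node.
# curl reads the request body from stdin because of "-d @-".
register_node() {
  registration_payload "$1" "$2" |
    curl -fsS -X PUT -d @- "http://$2:8500/v1/agent/service/register"
}
```

With these helpers, the two nodes would be registered with `register_node 42 10.0.0.42` and `register_node 43 10.0.0.43`.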

3. prometheus-server31 node

3.1 Edit the configuration file

[root@prometheus-server31 ~]# cd /dezyan/softwares/prometheus-2.53.4.linux-amd64/
[root@prometheus-server31 ~]# vim prometheus.yml
remote_write:
  - url: http://10.0.0.43:8428/api/v1/write

# Under the existing scrape_configs: section, append the following jobs
  - job_name: "prometheus-federate-32"
    metrics_path: "/federate"
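    # Controls how label conflicts are resolved; valid values are true and false (default: false).
    # When true, the labels from the scraped data are kept and the conflicting server-side labels are ignored.
    # When false, the server-side labels win and the conflicting scraped labels are renamed with an "exported_" prefix.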
    honor_labels: true
    params:
       "match[]":
       - '{job="prometheus"}'
       - '{__name__=~"job:.*"}'
       - '{__name__=~"node.*"}'
    static_configs:
    - targets:
        - "10.0.0.32:9090"

  - job_name: "prometheus-federate-33"
    metrics_path: "/federate"
    honor_labels: true
    params:
       "match[]":
       - '{job="prometheus"}'
       - '{__name__=~"job:.*"}'
       - '{__name__=~"node.*"}'
    static_configs:
    - targets:
        - "10.0.0.33:9090"

3.2 Reload the configuration

[root@prometheus-server31 ~]# curl -X POST http://10.0.0.31:9090/-/reload
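
To see what prometheus-server31 will receive from a downstream server, the /federate endpoint can be queried by hand with one of the match[] selectors from the job definition. The helper names below are hypothetical and the snippet is a sketch, assuming the downstream servers listen on port 9090 as configured above.

```shell
#!/bin/sh
# Print the /federate URL of one downstream Prometheus server ($1 = its IP).
federate_url() {
  printf 'http://%s:9090/federate\n' "$1"
}

# Show the first few federated node_* samples from a downstream server.
# curl -G moves the --data-urlencode value into a URL-encoded query string.
peek_federate() {
  curl -fsS -G "$(federate_url "$1")" \
    --data-urlencode 'match[]={__name__=~"node.*"}' | head -n 5
}
```

Running `peek_federate 10.0.0.32` and `peek_federate 10.0.0.33` should print samples in the text exposition format if the downstream servers have scraped their targets; an empty response suggests the selector matches nothing there.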

VI. View the data in the VictoriaMetrics web UI

http://10.0.0.43:8428/
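
Besides the web UI, VictoriaMetrics answers the Prometheus-compatible query API, so the stored data can also be checked from a terminal. The sketch below assumes the address used in this post; VM_ADDR and the function names are illustrative.

```shell
#!/bin/sh
# Address of the single-node VictoriaMetrics instance (assumed from this post).
VM_ADDR="${VM_ADDR:-10.0.0.43:8428}"

# Print the instant-query endpoint.
query_url() {
  printf 'http://%s/api/v1/query\n' "$VM_ADDR"
}

# Run one PromQL expression ($1) and print the JSON response.
vm_query() {
  curl -fsS -G "$(query_url)" --data-urlencode "query=$1"
}
```

For example, `vm_query 'count by (instance) (node_uname_info)'` lists the instances whose node-exporter metrics have reached the remote store.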

VII. Configure the Grafana data source and URL

The data source type is still Prometheus
The data source URL is http://10.0.0.43:8428
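
The same data source can also be declared through Grafana's file-based provisioning instead of the UI. The following is a sketch rather than the setup used in this post: the data source name "VictoriaMetrics", the function name, and the target file name are assumptions.

```shell
#!/bin/sh
# Print a Grafana provisioning file for a Prometheus-type data source
# that points at the VictoriaMetrics instance from this post.
datasource_yaml() {
  cat <<'EOF'
apiVersion: 1
datasources:
  - name: VictoriaMetrics
    type: prometheus
    access: proxy
    url: http://10.0.0.43:8428
    isDefault: true
EOF
}
```

On a deb-based installation the output could be written with `datasource_yaml > /etc/grafana/provisioning/datasources/victoria-metrics.yaml`, followed by `systemctl restart grafana-server` so that Grafana picks it up.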

VIII. Import a Grafana dashboard by template ID and select the data source to render the graphs

posted @ 2025-03-29 21:37  丁志岩