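10. Deploying a Prometheus + Grafana monitoring system: a step-by-step walkthrough

1 Preparation

1.1 Host environment

Prepare one virtual machine:

IP address    Operating system                        Specs
10.0.0.101    CentOS Linux release 7.9.2009 (Core)    4 cores, 4 GB RAM, 100 GB disk

1.2 Plan the installation directories

Install all Prometheus-related services under /data/. Ideally /data is a separate disk, which makes it easy to expand later.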

[root@docker01 ~]# mkdir -p /data/{prometheus,grafana,alertmanager,node_exporter}
1.3 Download the installation packages

Official download pages:

https://prometheus.io/download/

https://grafana.com/grafana/download/

Versions used:

Prometheus: 2.53.4

Grafana: 11.5.3

Alertmanager: 0.28.1

node_exporter: 1.9.0

[root@docker01 data]# cd /data
[root@docker01 data]# ls -ltr
total 297164
drwxr-xr-x 2 root root           6 Apr 11 09:20 node_exporter
drwxr-xr-x 2 root root           6 Apr 11 09:20 grafana
drwxr-xr-x 2 root root           6 Apr 11 09:20 alertmanager
drwxr-xr-x 4 root root         132 Apr 11 09:20 prometheus
# Download Prometheus
[root@docker01 data]# wget https://github.com/prometheus/prometheus/releases/download/v2.53.4/prometheus-2.53.4.linux-amd64.tar.gz
--2025-04-11 09:21:38--  https://github.com/prometheus/prometheus/releases/download/v2.53.4/prometheus-2.53.4.linux-amd64.tar.gz
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
Unable to establish SSL connection.
[root@docker01 data]# wget https://github.com/prometheus/prometheus/releases/download/v2.53.4/prometheus-2.53.4.linux-amd64.tar.gz
--2025-04-11 09:21:40--  https://github.com/prometheus/prometheus/releases/download/v2.53.4/prometheus-2.53.4.linux-amd64.tar.gz
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/6838921/cebdf666-8802-4ed5-a1ae-e4fd5cad8363?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20250411%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250411T012142Z&X-Amz-Expires=300&X-Amz-Signature=e49abcef3f6900812a883195504764c53f01b04318d2b10ee42d2ad55ae667ae&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dprometheus-2.53.4.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream [following]
--2025-04-11 09:21:41--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/6838921/cebdf666-8802-4ed5-a1ae-e4fd5cad8363?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20250411%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250411T012142Z&X-Amz-Expires=300&X-Amz-Signature=e49abcef3f6900812a883195504764c53f01b04318d2b10ee42d2ad55ae667ae&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dprometheus-2.53.4.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104200835 (99M) [application/octet-stream]
Saving to: ‘prometheus-2.53.4.linux-amd64.tar.gz’

100%[===========================================================================>] 104,200,835 7.44MB/s   in 15s    

2025-04-11 09:21:58 (6.48 MB/s) - ‘prometheus-2.53.4.linux-amd64.tar.gz’ saved [104200835/104200835]
# Download Grafana
[root@docker01 data]# wget https://dl.grafana.com/enterprise/release/grafana-enterprise-11.5.3.linux-amd64.tar.gz
--2025-04-11 09:22:40--  https://dl.grafana.com/enterprise/release/grafana-enterprise-11.5.3.linux-amd64.tar.gz
Resolving dl.grafana.com (dl.grafana.com)... 146.75.42.217, 2a04:4e42:7a::729
Connecting to dl.grafana.com (dl.grafana.com)|146.75.42.217|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 155080092 (148M) [application/x-tar]
Saving to: ‘grafana-enterprise-11.5.3.linux-amd64.tar.gz’

100%[===========================================================================>] 155,080,092 8.71MB/s   in 28s    

2025-04-11 09:23:12 (5.21 MB/s) - ‘grafana-enterprise-11.5.3.linux-amd64.tar.gz’ saved [155080092/155080092]
# Download Alertmanager
[root@docker01 data]# wget https://github.com/prometheus/alertmanager/releases/download/v0.28.1/alertmanager-0.28.1.linux-amd64.tar.gz
--2025-04-11 09:29:23--  https://github.com/prometheus/alertmanager/releases/download/v0.28.1/alertmanager-0.28.1.linux-amd64.tar.gz
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/11452538/72c63ac7-cd2a-4224-84be-28f404ca793f?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20250411%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250411T012924Z&X-Amz-Expires=300&X-Amz-Signature=f72ddf439253c3c4f2a20d7021bc6aa6893c6b8ac203d8ff547bfb0c8be20449&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dalertmanager-0.28.1.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream [following]
--2025-04-11 09:29:24--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/11452538/72c63ac7-cd2a-4224-84be-28f404ca793f?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20250411%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250411T012924Z&X-Amz-Expires=300&X-Amz-Signature=f72ddf439253c3c4f2a20d7021bc6aa6893c6b8ac203d8ff547bfb0c8be20449&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dalertmanager-0.28.1.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 33436897 (32M) [application/octet-stream]
Saving to: ‘alertmanager-0.28.1.linux-amd64.tar.gz’

100%[===========================================================================>] 33,436,897  6.27MB/s   in 11s    

2025-04-11 09:29:36 (2.95 MB/s) - ‘alertmanager-0.28.1.linux-amd64.tar.gz’ saved [33436897/33436897]
# Download node_exporter
[root@docker01 data]# wget https://github.com/prometheus/node_exporter/releases/download/v1.9.0/node_exporter-1.9.0.linux-amd64.tar.gz
--2025-04-11 09:30:05--  https://github.com/prometheus/node_exporter/releases/download/v1.9.0/node_exporter-1.9.0.linux-amd64.tar.gz
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/9524057/c181ae2d-a1b3-4bac-883f-2a071c7ba341?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20250411%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250411T013007Z&X-Amz-Expires=300&X-Amz-Signature=c99e44d456bf3db92ffdafa36bd09a66d233ea9c92b86cab05f035054de96700&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dnode_exporter-1.9.0.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream [following]
--2025-04-11 09:30:06--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/9524057/c181ae2d-a1b3-4bac-883f-2a071c7ba341?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20250411%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250411T013007Z&X-Amz-Expires=300&X-Amz-Signature=c99e44d456bf3db92ffdafa36bd09a66d233ea9c92b86cab05f035054de96700&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dnode_exporter-1.9.0.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.108.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11569068 (11M) [application/octet-stream]
Saving to: ‘node_exporter-1.9.0.linux-amd64.tar.gz’

100%[===========================================================================>] 11,569,068   362KB/s   in 19s    

2025-04-11 09:30:27 (603 KB/s) - ‘node_exporter-1.9.0.linux-amd64.tar.gz’ saved [11569068/11569068]
1.4 Verify that all packages were downloaded
[root@docker01 ~]# cd /data/
[root@docker01 data]# ls -ltr
total 297164
-rw-r--r-- 1 root root    11569068 Feb 17 15:27 node_exporter-1.9.0.linux-amd64.tar.gz
-rw-r--r-- 1 root root    33436897 Mar  7 23:09 alertmanager-0.28.1.linux-amd64.tar.gz
-rw-r--r-- 1 root root   104200835 Mar 18 23:12 prometheus-2.53.4.linux-amd64.tar.gz
-rw-r--r-- 1 root root   155080092 Mar 26 03:22 grafana-enterprise-11.5.3.linux-amd64.tar.gz
drwxr-xr-x 2 root root           6 Apr 11 09:20 node_exporter
drwxr-xr-x 2 root root           6 Apr 11 09:20 grafana
drwxr-xr-x 2 root root           6 Apr 11 09:20 alertmanager
drwxr-xr-x 4 root root         132 Apr 11 09:20 prometheus
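
Checking file sizes with ls only shows that the downloads finished. The Prometheus download page also lists a SHA256 checksum next to each file, so a stricter check is possible. The following is a minimal sketch, not part of the original procedure; the function name verify_sha256 is hypothetical, and the expected checksum is left as a placeholder to be copied from the download page.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 when the SHA-256 digest of FILE equals EXPECTED_HEX, 1 otherwise.
verify_sha256() {
  # sha256sum prints "<digest>  <filename>"; keep only the digest field.
  [ "$(sha256sum "$1" | awk '{ print $1 }')" = "$2" ]
}

# Example (the checksum is a placeholder, copy the real value from the download page):
# verify_sha256 /data/prometheus-2.53.4.linux-amd64.tar.gz "<sha256 from the download page>" \
#   && echo "checksum OK"
```

If the function returns non-zero, the safest reaction is to delete the archive and download it again before extracting anything.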

2 Install the Prometheus-related services

2.1 Install Prometheus
2.1.1 Extract the package
[root@docker01 data]# cd /data
[root@docker01 data]# tar -xvf prometheus-2.53.4.linux-amd64.tar.gz 
prometheus-2.53.4.linux-amd64/
prometheus-2.53.4.linux-amd64/LICENSE
prometheus-2.53.4.linux-amd64/promtool
prometheus-2.53.4.linux-amd64/prometheus.yml
prometheus-2.53.4.linux-amd64/prometheus
prometheus-2.53.4.linux-amd64/NOTICE
prometheus-2.53.4.linux-amd64/console_libraries/
prometheus-2.53.4.linux-amd64/console_libraries/menu.lib
prometheus-2.53.4.linux-amd64/console_libraries/prom.lib
prometheus-2.53.4.linux-amd64/consoles/
prometheus-2.53.4.linux-amd64/consoles/node-disk.html
prometheus-2.53.4.linux-amd64/consoles/index.html.example
prometheus-2.53.4.linux-amd64/consoles/prometheus.html
prometheus-2.53.4.linux-amd64/consoles/node-cpu.html
prometheus-2.53.4.linux-amd64/consoles/node.html
prometheus-2.53.4.linux-amd64/consoles/node-overview.html
prometheus-2.53.4.linux-amd64/consoles/prometheus-overview.html
[root@docker01 data]# mv prometheus-2.53.4.linux-amd64/* /data/prometheus
2.1.2 Create the prometheus user
[root@docker01 data]# useradd -M -s /sbin/nologin prometheus
[root@docker01 data]# id prometheus 
uid=1000(prometheus) gid=1000(prometheus) groups=1000(prometheus)
2.1.3 Give the prometheus user ownership of the Prometheus directory
[root@docker01 data]# chown -R prometheus:prometheus /data/prometheus
2.1.4 Create a systemd service for Prometheus
[root@docker01 data]# cat /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=on-failure
ExecStart=/data/prometheus/prometheus \
  --config.file=/data/prometheus/prometheus.yml \
  --storage.tsdb.path=/data/prometheus/data \
  --storage.tsdb.retention.time=15d \
  --web.enable-lifecycle

[Install]
WantedBy=multi-user.target
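
The transcript above prints the finished unit file with cat but does not show how it was created. One possible way, sketched here under the assumption that a quoted here-document is acceptable, is to write it from the shell; the quoting keeps the trailing backslashes intact. The function name write_prometheus_unit is hypothetical, and the content simply mirrors the unit file shown above.

```shell
# write_prometheus_unit DEST
# Writes the Prometheus unit file shown above to DEST.
# DEST is normally /etc/systemd/system/prometheus.service;
# a scratch path can be used for a dry run.
write_prometheus_unit() {
  cat > "$1" <<'EOF'
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=on-failure
ExecStart=/data/prometheus/prometheus \
  --config.file=/data/prometheus/prometheus.yml \
  --storage.tsdb.path=/data/prometheus/data \
  --storage.tsdb.retention.time=15d \
  --web.enable-lifecycle

[Install]
WantedBy=multi-user.target
EOF
}

# Example:
# write_prometheus_unit /etc/systemd/system/prometheus.service
```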
2.1.5 Reload systemd, start the service, and enable it at boot
[root@docker01 data]# systemctl daemon-reload
[root@docker01 data]# systemctl enable --now prometheus.service 
Created symlink from /etc/systemd/system/multi-user.target.wants/prometheus.service to /etc/systemd/system/prometheus.service.
2.1.6 Check the status
[root@docker01 data]# systemctl status prometheus.service
● prometheus.service - Prometheus Server
   Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2025-04-11 10:37:25 CST; 4s ago
     Docs: https://prometheus.io/docs/introduction/overview
 Main PID: 2295 (prometheus)
    Tasks: 9
   Memory: 18.9M
   CGroup: /system.slice/prometheus.service
           └─2295 /data/prometheus/prometheus --config.file=/data/prometheus/prometheus.yml --storage.tsdb.path=/d...

Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.461Z caller=head.go:721 level=info componen...hile"
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.476Z caller=head.go:793 level=info componen...ent=0
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.476Z caller=head.go:830 level=info componen…41563ms
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.478Z caller=main.go:1169 level=info fs_type...MAGIC
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.478Z caller=main.go:1172 level=info msg="TS...rted"
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.478Z caller=main.go:1354 level=info msg="Lo...s.yml
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.547Z caller=main.go:1391 level=info msg="up...ew=75
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.547Z caller=main.go:1402 level=info msg="Complet…µs
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.547Z caller=main.go:1133 level=info msg="Se...sts."
Apr 11 10:37:25 docker01 prometheus[2295]: ts=2025-04-11T02:37:25.547Z caller=manager.go:164 level=info compo...r..."
Hint: Some lines were ellipsized, use -l to show in full.
2.1.7 Open the Prometheus web UI
In a local browser, visit http://10.0.0.101:9090
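
Right after a start or restart, the web UI may take a moment to respond. Prometheus exposes the management endpoints /-/healthy and /-/ready, so readiness can also be polled from the command line. The helper below is a small sketch of that idea; wait_until_ok and the WAIT_SECONDS variable are hypothetical names, not part of Prometheus.

```shell
# wait_until_ok ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping WAIT_SECONDS (default 2)
# between tries. Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_until_ok() {
  _attempts=$1
  shift
  _try=1
  while [ "$_try" -le "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$_try" -lt "$_attempts" ]; then
      sleep "${WAIT_SECONDS:-2}"
    fi
    _try=$((_try + 1))
  done
  return 1
}

# Example against the host used in this post:
# wait_until_ok 10 curl -fsS -o /dev/null http://10.0.0.101:9090/-/ready
```

Because the command to run is passed as arguments, the same helper can poll any of the other services in this walkthrough by swapping the URL.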

2.2 Install Alertmanager
2.2.1 Extract the package
[root@docker01 ~]# cd /data/
[root@docker01 data]# tar -xvf alertmanager-0.28.1.linux-amd64.tar.gz
alertmanager-0.28.1.linux-amd64/
alertmanager-0.28.1.linux-amd64/LICENSE
alertmanager-0.28.1.linux-amd64/alertmanager
alertmanager-0.28.1.linux-amd64/alertmanager.yml
alertmanager-0.28.1.linux-amd64/amtool
alertmanager-0.28.1.linux-amd64/NOTICE
[root@docker01 data]# mv /data/alertmanager-0.28.1.linux-amd64/* /data/alertmanager
2.2.2 Give the prometheus user ownership of the Alertmanager directory
[root@docker01 data]# chown -R prometheus:prometheus /data/alertmanager
2.2.3 Create a systemd service for Alertmanager
[root@docker01 data]# cat /etc/systemd/system/alertmanager.service
[Unit]
Description=Alert Manager
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/data/alertmanager/alertmanager \
--config.file=/data/alertmanager/alertmanager.yml \
--storage.path=/data/alertmanager/data
Restart=always

[Install]
WantedBy=multi-user.target
2.2.4 Reload systemd, start the service, and enable it at boot
[root@docker01 data]# systemctl daemon-reload
[root@docker01 data]# systemctl enable --now alertmanager.service 
Created symlink from /etc/systemd/system/multi-user.target.wants/alertmanager.service to /etc/systemd/system/alertmanager.service.
2.2.5 Check the status
[root@docker01 data]# systemctl status alertmanager.service 
● alertmanager.service
   Loaded: loaded (/etc/systemd/system/alertmanager.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2025-04-11 11:13:31 CST; 9s ago
 Main PID: 2449 (alertmanager)
    Tasks: 10
   Memory: 14.5M
   CGroup: /system.slice/alertmanager.service
           └─2449 /data/alertmanager/alertmanager --config.file=/data/alertmanager/alertmanager.yml --storage.path...

Apr 11 11:13:31 docker01 systemd[1]: Started alertmanager.service.
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.349Z level=INFO source=main.go:191 msg=...910)"
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.349Z level=INFO source=main.go:192 msg=...tgo)"
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.353Z level=INFO source=cluster.go:185 m...=9094
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.354Z level=INFO source=cluster.go:674 m...al=2s
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.397Z level=INFO source=coordinator.go:1...r.yml
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.398Z level=INFO source=coordinator.go:1...r.yml
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.400Z level=INFO source=tls_config.go:34...:9093
Apr 11 11:13:31 docker01 alertmanager[2449]: time=2025-04-11T03:13:31.400Z level=INFO source=tls_config.go:35...:9093
Apr 11 11:13:33 docker01 alertmanager[2449]: time=2025-04-11T03:13:33.355Z level=INFO source=cluster.go:699 m...6518s
Hint: Some lines were ellipsized, use -l to show in full.
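2.2.6 Add Alertmanager to the Prometheus configuration
Edit /data/prometheus/prometheus.yml so that the alerting and rule_files sections look like this: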
[root@docker01 prometheus]# cat /data/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
           # Replace with the actual address of your Alertmanager
           - 10.0.0.101:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files: 
  # Adjust the file name as needed; multiple rule files may be listed
  - "/data/alertmanager/rule/alert.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

2.2.7 Add the alert rule file
[root@docker01 data]# mkdir /data/alertmanager/rule
[root@docker01 data]# cat /data/alertmanager/rule/alert.yml 
groups:
  - name: host-status
    rules:
      - alert: HostDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} is down, please investigate as soon as possible"
          description: "{{ $labels.instance }} has been down for more than 1 minute. Please check the service status."
[root@docker01 data]# chown -R prometheus:prometheus /data/alertmanager
2.2.8 Check the configuration
[root@docker01 ~]# cd /data/prometheus
[root@docker01 prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
  SUCCESS: 1 rule files found
 SUCCESS: prometheus.yml is valid prometheus config file syntax

Checking /data/alertmanager/rule/alert.yml
  SUCCESS: 1 rules found

Note: restart Prometheus only after the check has passed.

2.2.9 Restart Prometheus
[root@docker01 prometheus]# systemctl restart prometheus
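
The rule of checking before restarting can be enforced rather than remembered. The following is a minimal sketch under that assumption: check_then_apply is a hypothetical helper that runs promtool check config and only executes the follow-up command when the check succeeds.

```shell
# check_then_apply PROMTOOL CONFIG APPLY_COMMAND [ARGS...]
# Runs "PROMTOOL check config CONFIG" and executes APPLY_COMMAND only
# when the check succeeds; otherwise returns 1 without applying anything.
check_then_apply() {
  _promtool=$1
  _config=$2
  shift 2
  if "$_promtool" check config "$_config"; then
    "$@"
  else
    echo "config check failed, not applying: $_config" >&2
    return 1
  fi
}

# Example with the paths used in this post:
# check_then_apply /data/prometheus/promtool /data/prometheus/prometheus.yml \
#   systemctl restart prometheus
```

Passing the apply step as arguments keeps the helper independent of how the new configuration is activated.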
2.2.10 Open the Alertmanager web UI
In a local browser, visit http://10.0.0.101:9093

2.3 Install node_exporter
2.3.1 Extract the package
[root@docker01 ~]# cd /data/
[root@docker01 data]# tar -xvf node_exporter-1.9.0.linux-amd64.tar.gz 
node_exporter-1.9.0.linux-amd64/
node_exporter-1.9.0.linux-amd64/LICENSE
node_exporter-1.9.0.linux-amd64/NOTICE
node_exporter-1.9.0.linux-amd64/node_exporter
[root@docker01 data]# mv node_exporter-1.9.0.linux-amd64/* /data/node_exporter
2.3.2 Give the prometheus user ownership of the node_exporter directory
[root@docker01 data]# chown -R prometheus:prometheus /data/node_exporter
2.3.3 Create a systemd service for node_exporter
[root@docker01 data]# cat /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target

[Service]
User=prometheus
Group=prometheus
ExecStart=/data/node_exporter/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
2.3.4 Reload systemd, start the service, and enable it at boot
[root@docker01 data]# systemctl daemon-reload
[root@docker01 data]# systemctl enable --now node_exporter.service
Created symlink from /etc/systemd/system/multi-user.target.wants/node_exporter.service to /etc/systemd/system/node_exporter.service.
2.3.5 Check the status
[root@docker01 data]# systemctl status node_exporter
● node_exporter.service - node_exporter
   Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2025-04-11 14:58:52 CST; 10s ago
     Docs: https://prometheus.io/
 Main PID: 3374 (node_exporter)
    Tasks: 5
   Memory: 4.8M
   CGroup: /system.slice/node_exporter.service
           └─3374 /data/node_exporter/node_exporter

Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...=time
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...timex
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...ueues
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...uname
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...mstat
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...chdog
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...g=xfs
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.380Z level=INFO source=node_exporter.g...g=zfs
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.381Z level=INFO source=tls_config.go:3...:9100
Apr 11 14:58:52 docker01 node_exporter[3374]: time=2025-04-11T06:58:52.381Z level=INFO source=tls_config.go:3...:9100
Hint: Some lines were ellipsized, use -l to show in full.
2.3.6 Open the node_exporter metrics page
In a local browser, visit http://10.0.0.101:9100/metrics
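
The metrics page is plain text with one sample per line, so it can also be inspected from the shell. As a rough sketch, the hypothetical helper metric_value below prints the value of a metric that carries no labels, such as node_load1; metrics with labels have the label set attached to the name and are not matched by this simple comparison.

```shell
# metric_value NAME
# Reads Prometheus text-format metrics on stdin and prints the value of each
# sample line whose first field is exactly NAME (label-less metrics only).
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Example against the host used in this post:
# curl -s http://10.0.0.101:9100/metrics | metric_value node_load1
```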

2.3.7 Add node_exporter to the Prometheus configuration
# Append a new job_name at the end of prometheus.yml; a job may list multiple targets
[root@docker01 prometheus]# tail -10 /data/prometheus/prometheus.yml
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node_exporter"
    static_configs:
      - targets: ["10.0.0.101:9100"]
        labels:
          instance: "10.0.0.101 server"

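2.3.8 Check the configuration, then reload or restart
[root@docker01 prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
  SUCCESS: 1 rule files found
 SUCCESS: prometheus.yml is valid prometheus config file syntax

Checking /data/alertmanager/rule/alert.yml
  SUCCESS: 1 rules found
# Option 1: hot reload (choose one of the two; relies on the --web.enable-lifecycle flag set in the unit file in 2.1.4)
[root@docker01 prometheus]# curl -X POST http://10.0.0.101:9090/-/reload
# Option 2: restart (choose one of the two)
[root@docker01 prometheus]# systemctl restart prometheus.service
2.3.9 Confirm in Prometheus that the node_exporter target is up
In the Prometheus web UI, open Status -> Targets and check that the node_exporter job is listed as UP.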

2.4 Install Grafana
2.4.1 Extract the package
[root@docker01 ~]# cd /data/
[root@docker01 data]# tar -xvf grafana-enterprise-11.5.3.linux-amd64.tar.gz
grafana-v11.5.3/VERSION
grafana-v11.5.3/LICENSE
grafana-v11.5.3/NOTICE.md
grafana-v11.5.3/README.md
grafana-v11.5.3/Dockerfile
grafana-v11.5.3/tools/zoneinfo.zip
............
[root@docker01 data]# mv grafana-v11.5.3/* /data/grafana
2.4.2 Give the prometheus user ownership of the Grafana directory
[root@docker01 data]# chown -R prometheus:prometheus /data/grafana
2.4.3 Create a systemd service for Grafana
[root@docker01 data]# cat /etc/systemd/system/grafana-server.service
[Unit]
Description=Grafana server
Documentation=http://docs.grafana.org

[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=on-failure
ExecStart=/data/grafana/bin/grafana-server --config=/data/grafana/conf/defaults.ini --homepath=/data/grafana

[Install]
WantedBy=multi-user.target
2.4.4 Reload systemd, start the service, and enable it at boot
[root@docker01 data]# systemctl daemon-reload
[root@docker01 data]# systemctl enable --now grafana-server.service
Created symlink from /etc/systemd/system/multi-user.target.wants/grafana-server.service to /etc/systemd/system/grafana-server.service.
2.4.5 Check the status
[root@docker01 data]# systemctl status grafana-server.service
● grafana-server.service - Grafana server
   Loaded: loaded (/etc/systemd/system/grafana-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2025-04-11 16:02:21 CST; 8s ago
 Main PID: 5024 (grafana)
    Tasks: 14
   Memory: 93.5M
   CGroup: /system.slice/grafana-server.service
           └─5024 grafana server --config=/data/grafana/conf/defaults.ini --homepath=/data/grafana

Apr 11 16:02:24 docker01 grafana-server[5024]: logger=migrator t=2025-04-11T16:02:24.07784359+08:00 level=info...ion"
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=migrator t=2025-04-11T16:02:24.078712974+08:00 level=inf….921µs
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=migrator t=2025-04-11T16:02:24.080007229+08:00 level=inf...umn"
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=migrator t=2025-04-11T16:02:24.086532635+08:00 level=inf...56ms
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=migrator t=2025-04-11T16:02:24.087565225+08:00 level=inf...029s
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=migrator t=2025-04-11T16:02:24.088075756+08:00 level=inf...ase"
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=sqlstore t=2025-04-11T16:02:24.101759062+08:00 level=inf...dmin
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=sqlstore t=2025-04-11T16:02:24.10213337+08:00 level=info...ion"
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=licensing t=2025-04-11T16:02:24.104527078+08:00 level=in...ound
Apr 11 16:02:24 docker01 grafana-server[5024]: logger=secrets t=2025-04-11T16:02:24.104610142+08:00 level=info...y.v1
Hint: Some lines were ellipsized, use -l to show in full.
2.4.6 Open the Grafana web UI
In a local browser, visit http://10.0.0.101:3000. Default username/password: admin/admin (Grafana prompts you to change the password at first login).

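2.4.7 Connect Grafana to Prometheus
Click in order: 1. Home -> 2. Connections -> 3. Add new connection -> 4. Prometheus -> 5. Add new data source. Then enter the Prometheus server URL (http://10.0.0.101:9090 in this setup) and save the data source.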
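
As an alternative to clicking through the UI, Grafana can load data sources at startup from YAML files in its provisioning directory. The sketch below assumes the default provisioning path (conf/provisioning under the homepath, so /data/grafana/conf/provisioning here); the function name write_prometheus_datasource and the file name prometheus.yaml are hypothetical choices.

```shell
# write_prometheus_datasource DIR URL
# Writes a Grafana data source provisioning file for Prometheus into DIR.
write_prometheus_datasource() {
  mkdir -p "$1" || return 1
  # Unquoted here-document so that $2 (the Prometheus URL) is expanded.
  cat > "$1/prometheus.yaml" <<EOF
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $2
    isDefault: true
EOF
}

# Example with the paths and address used in this post:
# write_prometheus_datasource /data/grafana/conf/provisioning/datasources http://10.0.0.101:9090
# chown -R prometheus:prometheus /data/grafana/conf/provisioning
# systemctl restart grafana-server
```

Grafana reads provisioning files when it starts, which is why the example ends with a restart.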

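2.4.8 Import a suitable dashboard from the Grafana website

Grafana dashboards: https://grafana.com/grafana/dashboards/

1. Import by ID

Look up the dashboard's ID on the website, paste it into the import form and load it, select the Prometheus data source, and click Import.

2. Import from a file

Download the dashboard's JSON file from the website, paste its contents into the import form and load it, select the Prometheus data source, and click Import.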

Posted on 2025-04-11 09:38 by 马俊南 (无敌小马爱学习)