1, Introduction to Prometheus and basic usage

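1, Categories of monitoring metrics:
Hardware monitoring     temperature, hardware faults, etc.
System monitoring       CPU, memory, disk, NIC traffic, TCP state, process count
Application monitoring  Nginx, Tomcat, PHP, MySQL, Redis, etc.
Log monitoring          system logs, service logs, access logs, error logs
Security monitoring     WAF, sensitive-file monitoring
API monitoring          availability, request counts, response time
Business monitoring     e.g. for an e-commerce site: orders per minute, new registrations, active users, effect of promotions
Traffic analysis        user information derived from traffic, e.g. geographic location, visits to a given page, time spent on a page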

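2, Prometheus offers a large number of official and third-party exporters:
    https://prometheus.io/docs/instrumenting/exporters/
    Entries marked (official) are maintained by the Prometheus project.
    Entries without (official) are maintained by the community.
    Prometheus collects data in pull mode by default, which is also the approach the project recommends.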
 
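3, Prometheus components and architecture:
    Prometheus Server: scrapes metrics, stores them as time series data, and provides a query interface
    Client Library: libraries for instrumenting application code
    Push Gateway: short-term storage for metrics. Mainly used for short-lived jobs, which push their metrics to the Pushgateway; Prometheus Server then pulls them from the Pushgateway.
    Exporters: collect metrics from existing third-party services and expose them as metrics
    Alertmanager: handles alerts
    Web UI: a simple web console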
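
As a rough sketch of how the Pushgateway fits in, a scrape job along these lines lets Prometheus Server pull whatever has been pushed. The address 192.168.0.11:9091 is an assumption (9091 is the Pushgateway's default port, and the host is simply the one used elsewhere in these notes). honor_labels keeps the job and instance labels set by the pushing task instead of overwriting them.

```yaml
scrape_configs:
  - job_name: 'pushgateway'
    honor_labels: true                  # keep the job/instance labels that were pushed
    static_configs:
    - targets: ['192.168.0.11:9091']    # assumed address; 9091 is the Pushgateway default port
```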


4, Download the server binary package:
prometheus-2.6.1.linux-amd64.tar.gz
[root@centos7 prometheus-2.6.1.linux-amd64]# ./prometheus --help

5, Start the Prometheus server:
[root@centos7 -amd64]# ./prometheus --config.file="./prometheus.yml"

6, Check the configuration file syntax:
[root@centos7 prometheus]# ./promtool check config prometheus.yml 


7, Run the server under systemctl:
[root@centos7 ~]# cat /usr/lib/systemd/system/prometheus.service
[Unit]
Description=prometheus

[Service]
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
[root@centos7 ~]#


relabel_configs: allows any target and its labels to be modified before the scrape
What is relabeling used for?
	Renaming labels
	Dropping labels
	Filtering targets

8, Monitor the Prometheus server itself:
global:
  scrape_interval:     15s
  evaluation_interval: 15s 

alerting:
  alertmanagers:
  - static_configs:
    - targets:

rule_files:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['192.168.0.11:9090']
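
The alerting and rule_files sections above are left empty. If an Alertmanager were running, they might be filled in roughly as follows. The address 192.168.0.11:9093 (9093 is Alertmanager's default port) and the rules/*.yml path are assumptions, not part of the original setup.

```yaml
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['192.168.0.11:9093']   # assumed Alertmanager address

rule_files:
  - 'rules/*.yml'                      # hypothetical location of rule files
```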
	
9, Add a custom label:
global:
  scrape_interval:     15s
  evaluation_interval: 15s 

alerting:
  alertmanagers:
  - static_configs:
    - targets:

rule_files:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['192.168.0.11:9090']
      labels:
        idc: bj
10, Query the metric:
process_cpu_seconds_total
Result:
Element 	Value
process_cpu_seconds_total{idc="bj",instance="192.168.0.11:9090",job="prometheus"}	0.34
process_cpu_seconds_total{instance="192.168.0.11:9090",job="prometheus"}			3.14


process_cpu_seconds_total{idc="bj"}
Result:
process_cpu_seconds_total{idc="bj",instance="192.168.0.11:9090",job="prometheus"}	0.86



11, Rename a label:
global:
  scrape_interval:     15s
  evaluation_interval: 15s 

alerting:
  alertmanagers:
  - static_configs:
    - targets:

rule_files:

scrape_configs:
  - job_name: 'bj'

    static_configs:
    - targets: ['192.168.0.11:9090']
    relabel_configs:
    - action: replace
      source_labels: ['job'] 
      regex: (.*)          # matched against the value of the job label: bj
      replacement: $1      # $1 is whatever (.*) captured
      target_label: idc    # writes the value to idc, so job='bj' also appears as idc='bj'

12, Drop a label:
global:
  scrape_interval:     15s
  evaluation_interval: 15s 

alerting:
  alertmanagers:
  - static_configs:
    - targets:

rule_files:

scrape_configs:
  - job_name: 'bj'

    static_configs:
    - targets: ['192.168.0.11:9090']
    relabel_configs:
    - action: replace
      source_labels: ['job']
      regex: (.*)
      replacement: $1
      target_label: idc
    - action: labeldrop
      regex: job
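
Filtering targets, the third use of relabeling listed earlier, is not shown in these notes. A minimal sketch, assuming a second target 192.168.0.12:9090 exists (hypothetical), could use the keep action so that only targets whose address matches the regex are scraped:

```yaml
scrape_configs:
  - job_name: 'bj'

    static_configs:
    - targets: ['192.168.0.11:9090', '192.168.0.12:9090']   # second target is hypothetical
    relabel_configs:
    - action: keep
      source_labels: ['__address__']     # host:port of the target before relabeling
      regex: '192\.168\.0\.11:.*'        # the regex must match the whole value
```

The drop action does the opposite and discards the targets that match.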

13, File-based service discovery:
[root@centos7 prometheus]# cat /usr/local/prometheus/sd_config/test.yml 
- targets: ['192.168.0.11:9090']
[root@centos7 prometheus]# 
[root@centos7 prometheus]# cat prometheus.yml
global:
  scrape_interval:     15s
  evaluation_interval: 15s 

alerting:
  alertmanagers:
  - static_configs:
    - targets:

rule_files:

scrape_configs:
  - job_name: 'bj'
    file_sd_configs:
      - files: ['/usr/local/prometheus/sd_config/*.yml']
        refresh_interval: 5s
[root@centos7 prometheus]# 
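
Entries in a target file can also carry labels, which are attached to every target in that entry. A sketch of what test.yml might look like with the idc label from step 9 (the label value is only an example):

```yaml
# /usr/local/prometheus/sd_config/test.yml
- targets: ['192.168.0.11:9090']
  labels:
    idc: bj        # example value, attached to every target in this entry
```

Prometheus re-reads the matching files when they change and at each refresh_interval, so editing this file should not require a restart.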

14, Monitor a Linux server with node_exporter:
Running the node_exporter executable is all it takes to start the exporter; by default it listens on port 9100.
[root@centos7 node]# cat /etc/systemd/system/node.service 
[Unit]
Description=node

[Service]
Restart=on-failure
ExecStart=/usr/local/node/node_exporter

[Install]
WantedBy=multi-user.target
[root@centos7 node]# 
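
To have Prometheus scrape the exporter, a job pointing at port 9100 can be added under scrape_configs. This sketch assumes node_exporter runs on 192.168.0.11, the same host used in the earlier examples; the job name node is arbitrary.

```yaml
scrape_configs:
  - job_name: 'node'
    static_configs:
    - targets: ['192.168.0.11:9100']   # assumed host; 9100 is the node_exporter default port
```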
posted @ 2020-07-16 21:14  pwcc