Zabbix custom monitoring items
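Monitoring metrics
- System metrics
  - Memory
  - CPU
  - Disk
- File monitoring
- Network monitoring
- Hardware monitoring (via IPMI)
  - Disk temperature
  - Power supply faults
  - CPU temperature
- Business monitoring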
Custom monitoring workflow
- Enable custom monitoring: edit the agent configuration file zabbix_agentd.conf and set the following 2 parameters
  - UnsafeUserParameters=1
  - UserParameter=<key>,<command>
- Write the script
- Configure the item and the trigger in the web interface
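To make the `UserParameter=<key>,<command>` format concrete, here is a minimal Python sketch that splits such lines into key and command. The function name `parse_user_parameters` is hypothetical and the parsing is simplified (it assumes the key itself contains no comma); it is not how the agent parses its configuration.

```python
def parse_user_parameters(conf_text):
    """Return {key: command} for every UserParameter line in an agent config."""
    params = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == "UserParameter":
            # The key ends at the first comma; everything after it is the command
            key, comma, command = value.partition(",")
            if comma:
                params[key.strip()] = command.strip()
    return params


sample = """\
# custom checks
UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
UserParameter=check_logs[*],/scripts/log.py $1 $2 $3
"""
parsed = parse_user_parameters(sample)
for key, command in parsed.items():
    print(key, "->", command)
```

The two sample entries are the ones configured later in this walkthrough.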
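Environment
| Role | IP address | Hostname | Applications to install | OS version |
|---|---|---|---|---|
| Server | 192.168.110.30 | zabbix.example.com | LAMP stack, zabbix_server, zabbix_agent | Red Hat 8 |
| Client | 192.168.110.40 | zabbix-agent | zabbix_agent | Red Hat 8 |
For server and client setup details, see: "Zabbix monitoring configuration workflow with worked examples"
1. Custom process monitoring
This example monitors the httpd service process. For the steps to compile and install httpd, see: "Installing the httpd service with a shell script"
Preparation: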
//Client
# Disable the firewall and SELinux
[root@zabbix-agent ~]# systemctl stop firewalld
[root@zabbix-agent ~]# setenforce 0
# Compile and install Apache with the script from the article linked above
[root@zabbix-agent ~]# bash httpd.sh
# Uncomment ServerName to silence the startup warning, then symlink apachectl into the PATH
[root@zabbix-agent ~]# sed -i '/#ServerName/s/#//g' /etc/httpd24/httpd.conf
[root@zabbix-agent ~]# ln -s /usr/local/apache/bin/apachectl /usr/bin/apachectl
# Start the Apache service
[root@zabbix-agent ~]# apachectl start
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         128                  0.0.0.0:80                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:* 
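The listing above is checked by eye for port 80 (httpd) and port 10050 (the Zabbix agent's default port). As a small illustration, the following Python sketch extracts the listening ports from `ss -antl` text. The helper name `listening_ports` is hypothetical, and the sample text is copied from the output above.

```python
def listening_ports(ss_output):
    """Extract the set of listening TCP ports from `ss -antl` text."""
    ports = set()
    for line in ss_output.splitlines():
        fields = line.split()
        # Data rows start with the socket state; this skips the header row
        if len(fields) >= 4 and fields[0] == "LISTEN":
            local = fields[3]  # e.g. 0.0.0.0:80 or [::]:22
            ports.add(int(local.rsplit(":", 1)[1]))
    return ports


sample = """\
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port
LISTEN    0         128                  0.0.0.0:80                 0.0.0.0:*
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*
LISTEN    0         128                     [::]:22                    [::]:*
"""
ports = listening_ports(sample)
print(sorted(ports))
```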
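Step 1: write the script
//Client
# Create the scripts directory
[root@zabbix-agent ~]# mkdir /scripts
# Script contents
[root@zabbix-agent ~]# vim /scripts/check_process.sh 
# Filters the process list for the given name (Apache here); prints 1 if the process is missing, 0 if it is running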
#!/bin/bash
count=$(ps -ef | grep -Ev "grep|$0" | grep -c "$1")
if [ $count -eq 0 ];then
        echo "1"
else
        echo "0"
fi
# Make the script executable
[root@zabbix-agent ~]# chmod +x /scripts/check_process.sh 
[root@zabbix-agent ~]# ll /scripts/
total 4
-rwxr-xr-x. 1 root root 118 Apr 29 00:02 check_process.sh
# Test
[root@zabbix-agent ~]# apachectl start
[root@zabbix-agent ~]# bash /scripts/check_process.sh httpd
0
[root@zabbix-agent ~]# apachectl stop
[root@zabbix-agent ~]# bash /scripts/check_process.sh httpd
1
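The filtering logic of check_process.sh can be sketched in Python as a pure function over `ps -ef` text, which makes it easy to see why the `grep` and script lines have to be excluded. The name `process_status` and the sample process lines are illustrative assumptions, not output captured from the host.

```python
def process_status(ps_output, name, ignore=("grep", "check_process.sh")):
    """Return 0 if a process line mentions `name`, otherwise 1."""
    count = 0
    for line in ps_output.splitlines():
        # The check's own command line also contains `name`, so skip it
        if any(word in line for word in ignore):
            continue
        if name in line:
            count += 1
    return 0 if count > 0 else 1


# Illustrative `ps -ef` excerpts (made up for this sketch)
running = """\
root  1201     1  0 00:01 ?      00:00:00 /usr/local/apache/bin/httpd -k start
root  1530  1498  0 00:02 pts/0  00:00:00 bash /scripts/check_process.sh httpd
root  1533  1530  0 00:02 pts/0  00:00:00 grep -c httpd
"""
stopped = """\
root  1530  1498  0 00:02 pts/0  00:00:00 bash /scripts/check_process.sh httpd
root  1533  1530  0 00:02 pts/0  00:00:00 grep -c httpd
"""
print(process_status(running, "httpd"), process_status(stopped, "httpd"))
```

Without the `ignore` list, the second sample would wrongly report the process as running.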
Step 2: enable custom monitoring
//Client
# Enable custom monitoring and add the metric
[root@zabbix-agent ~]# vim /usr/local/etc/zabbix_agentd.conf
# Mandatory: no
# Default:
# TLSCipherAll=
# Append the following at the end of the file
UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
# Restart the Zabbix agent
[root@zabbix-agent ~]# pkill zabbix
[root@zabbix-agent ~]# zabbix_agentd 
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:*   
# From the server, test whether the client's metric can be fetched
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k check_apache
1
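Step 3: configure the web interface
- Add the item
Go to Configuration ---> Hosts ---> Items of the client host (192.168.110.40) ---> Create item (upper right)

- Add the trigger
Go to Configuration ---> Hosts ---> Triggers of the client host (192.168.110.40) ---> Create trigger (upper right)

- Configure media and actions
For how to configure media and actions, see: "Zabbix monitoring service: three ways to configure email alerts"
- Fire the trigger
//Client
# Stop the Apache service to fire the alert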
[root@zabbix-agent ~]# apachectl stop
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:*  
- Check the mailbox

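2. Custom log monitoring
Download log.py to the local machine
For the log.py file, see: leidazhuang_Github
Write the script
What log.py does: checks whether a log file contains a given keyword
- The first argument is the log file name (required; a relative or absolute path)
- The second argument is the path of the seek position file (optional; defaults to /tmp/logseek; a relative or absolute path)
- The third argument is the search keyword (defaults to Error)
//Client
# Install python36
[root@zabbix-agent ~]# yum -y install python36
# This script checks whether a log file contains the given keyword
[root@zabbix-agent scripts]# cat log.py 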
#!/usr/bin/env python3
import sys
import re
def prePos(seekfile):
    global curpos
    try:
        cf = open(seekfile)
    except IOError:
        curpos = 0
        return curpos
    except FileNotFoundError:
        curpos = 0
        return curpos
    else:
        try:
            curpos = int(cf.readline().strip())
        except ValueError:
            curpos = 0
            cf.close()
            return curpos
        cf.close()
    return curpos
def lastPos(filename):
    with open(filename) as lfile:
        if lfile.readline():
            lfile.seek(0,2)
        else:
            return 0
        lastPos = lfile.tell()
    return lastPos
def getSeekFile():
    try:
        seekfile = sys.argv[2]
    except IndexError:
        seekfile = '/tmp/logseek'
    return seekfile
def getKey():
    try:
        tagKey = str(sys.argv[3])
    except IndexError:
        tagKey = 'Error'
    return tagKey
def getResult(filename,seekfile,tagkey):
    destPos = prePos(seekfile)
    curPos = lastPos(filename)
    # If the log is now shorter than the saved offset (e.g. it was rotated),
    # scan again from the beginning
    if curPos < destPos:
        destPos = 0
    # Default to 0 so a value is returned even when nothing new is scanned
    result = 0
    try:
        f = open(filename)
    except IOError:
        # FileNotFoundError is a subclass of IOError, so one handler is enough
        print('Could not open file: %s' % filename)
        return result
    with f:
        f.seek(destPos)
        while curPos != 0 and f.tell() < curPos:
            rresult = f.readline().strip()
            if re.search(tagkey, rresult):
                result = 1
                break
        # Remember where this run stopped so the next run only sees new lines
        with open(seekfile,'w') as sf:
            sf.write(str(curPos))
    return result
if __name__ == "__main__":
    result = 0
    curpos = 0
    tagkey = getKey()
    seekfile = getSeekFile()
    result = getResult(sys.argv[1],seekfile,tagkey)
    print(result)
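The core idea of log.py is to remember the end-of-file offset between runs so that each check only scans newly appended lines. Below is a condensed sketch of that technique, not a replacement for log.py. The function name `check_new_lines` and the temporary file paths are assumptions for illustration, and it reads in binary mode so the saved offsets are plain byte counts.

```python
import os
import re
import tempfile


def check_new_lines(log_path, seek_path, keyword="Error"):
    """Return 1 if `keyword` appears in lines added since the last call, else 0."""
    # Read the offset saved by the previous run; start from 0 if unavailable
    try:
        with open(seek_path) as sf:
            start = int(sf.read().strip() or 0)
    except (OSError, ValueError):
        start = 0

    end = os.path.getsize(log_path)
    if end < start:  # the log shrank, e.g. after rotation
        start = 0

    found = 0
    with open(log_path, "rb") as log:
        log.seek(start)
        for raw in log.read(end - start).splitlines():
            if re.search(keyword, raw.decode(errors="replace")):
                found = 1
                break

    # Save the new offset for the next run
    with open(seek_path, "w") as sf:
        sf.write(str(end))
    return found


# Demo with throwaway files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    log_file = os.path.join(tmp, "error_log")
    seek_file = os.path.join(tmp, "logseek")
    with open(log_file, "w") as f:
        f.write("Error\n")
    first = check_new_lines(log_file, seek_file)
    second = check_new_lines(log_file, seek_file)
    print(first, second)
```

The first call sees the new `Error` line; the second call starts from the saved offset and finds nothing new.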
Add the metric
//Client
# Enable custom monitoring and add the metric
[root@zabbix-agent ~]# vim /usr/local/etc/zabbix_agentd.conf
# Mandatory: no
# Default:
# TLSCipherAll=
UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
# Append the following at the end of the file
UserParameter=check_logs[*],/scripts/log.py $1 $2 $3
# Restart the Zabbix agent
[root@zabbix-agent ~]# pkill zabbix
[root@zabbix-agent ~]# zabbix_agentd 
[root@zabbix-agent ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         80                   0.0.0.0:3306               0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:80                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:* 
# Append an Error line by hand to trigger the alert
[root@zabbix-agent ~]# echo 'Error' >> /usr/local/apache/logs/error_log
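//Server
# From the server, test whether the client's metric can be fetched
# Monitor /usr/local/apache/logs/error_log, with /tmp/seek as the seek position file and Error as the keyword
# The first result is 1, which means Error was found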
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k  check_logs["/usr/local/apache/logs/error_log","/tmp/seek","Error"]
1
# The second result is 0 because no new Error has been logged since the first check
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k  check_logs["/usr/local/apache/logs/error_log","/tmp/seek","Error"]
0
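The `check_logs[*]` definition is a flexible user parameter: the values inside the square brackets of the requested key are substituted for `$1`, `$2`, `$3` in the command. The Python sketch below mimics that mapping for the call above. The helper names are hypothetical, and the parsing is deliberately simplified (it assumes parameters contain no commas and fills missing parameters with an empty string), so it only illustrates the idea.

```python
import re


def split_key(item_key):
    """Split 'name[p1,p2,...]' into (name, [params]); surrounding quotes are stripped."""
    match = re.fullmatch(r"([^\[\]]+)(?:\[(.*)\])?", item_key)
    if match is None:
        raise ValueError("unsupported key: %r" % item_key)
    name, raw = match.group(1), match.group(2)
    if raw is None:
        return name, []
    # Simplification: parameters are assumed not to contain commas
    return name, [p.strip().strip('"') for p in raw.split(",")]


def expand_command(template, params):
    """Replace $1..$9 in the command template with the key parameters."""
    def substitute(m):
        index = int(m.group(1)) - 1
        return params[index] if index < len(params) else ""
    return re.sub(r"\$([1-9])", substitute, template)


name, params = split_key(
    'check_logs["/usr/local/apache/logs/error_log","/tmp/seek","Error"]'
)
command = expand_command("/scripts/log.py $1 $2 $3", params)
print(name)
print(command)
```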
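Configure the web interface
- Add the item
Go to Configuration ---> Hosts ---> Items of the client host (192.168.110.40) ---> Create item (upper right)

- Add the trigger
Go to Configuration ---> Hosts ---> Triggers of the client host (192.168.110.40) ---> Create trigger (upper right)

- Configure media and actions
For how to configure media and actions, see: "Zabbix monitoring service: three ways to configure email alerts"
- Fire the trigger
//Client
# Append an Error line by hand to fire the alert
[root@zabbix-agent ~]# echo 'Error' >> /usr/local/apache/logs/error_log 
//Server
# The value returned is 1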
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k  check_logs["/usr/local/apache/logs/error_log","/tmp/logseek","Error"]
1
- Check the mailbox

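3. Custom monitoring of MySQL replication status
Environment
Add one more machine, 192.168.110.50, to act as the master database
| Role | IP address | Applications to install | OS version |
|---|---|---|---|
| Master | 192.168.110.50 | mariadb | Red Hat 8 |
| Slave | 192.168.110.40 | mariadb | Red Hat 8 |
Preparation
//master
# Install the database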
[root@master ~]# yum -y install mariadb*
# Start the service and enable it at boot
[root@master ~]# systemctl enable --now mariadb
# Disable the firewall and SELinux
[root@master ~]# systemctl disable --now firewalld
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@master ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
[root@master ~]# setenforce 0
//slave
# Install the database
[root@slave ~]# yum -y install mariadb*
# Start the service and enable it at boot
[root@slave ~]# systemctl enable --now mariadb
# Disable the firewall and SELinux
[root@slave ~]# systemctl disable --now firewalld
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@slave ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
[root@slave ~]# setenforce 0
Configure the master database
//master
# Test the database connection
[root@master ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 22
Server version: 10.3.28-MariaDB-log MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
# Create the replication account and grant it the replication privilege
MariaDB [(none)]> grant replication slave on *.* to 'repl'@'192.168.110.40' identified by 'repl123!';
Query OK, 0 rows affected (0.001 sec)
# Reload the privilege tables
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> exit
Bye
# Edit my.cnf
[root@master ~]# vim /etc/my.cnf
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
# Append the following at the end of the file
[mysqld]
log-bin=mysql-bin
server-id=1
# Restart MariaDB and check the master status
[root@master ~]# systemctl restart mariadb
[root@master ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.28-MariaDB-log MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |      328 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.000 sec)
MariaDB [(none)]> exit
Bye
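The `File` and `Position` values shown by `show master status` are the ones the slave needs for `master_log_file` and `master_log_pos`. As an illustration, this Python sketch pulls both values out of the table text; the function name `parse_master_status` is hypothetical and the sample is the table printed above.

```python
def parse_master_status(table_text):
    """Return (binlog_file, position) from `show master status` table output."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        # Keep rows of the form | a | b | ... | and skip the +----+ borders
        if line.startswith("|") and line.endswith("|"):
            rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if len(rows) < 2:
        raise ValueError("no data row found")
    record = dict(zip(rows[0], rows[1]))
    return record["File"], int(record["Position"])


sample = """\
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |      328 |              |                  |
+------------------+----------+--------------+------------------+
"""
log_file, log_pos = parse_master_status(sample)
print(log_file, log_pos)
```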
Configure the slave database
//slave
# Test the database connection
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.3.28-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> exit
Bye
# Edit my.cnf
[root@slave ~]# vim /etc/my.cnf
#
# include all files from the config directory
#
!includedir /etc/my.cnf.d
# Append the following at the end of the file
[mysqld]
server-id=20
relay-log=myrelay
# Restart MariaDB, then configure and start replication
[root@slave ~]# systemctl restart mariadb
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.28-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> change master to \
    -> master_host='192.168.110.50',
    -> master_user='repl',
    -> master_password='repl123!',
    -> master_log_file='mysql-bin.000001',
    -> master_log_pos=328;
    
Query OK, 0 rows affected (0.003 sec)
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.002 sec)
MariaDB [(none)]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000003
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
    Slave_Transactional_Groups: 0
1 row in set (0.000 sec)
MariaDB [(none)]> exit
Bye
Write the script
//slave
# Script contents: prints 0 when both replication threads report Yes, otherwise 1
[root@slave ~]# vim /scripts/check_mysql_repl.sh 
#!/bin/bash
  
count=$(mysql -uroot -e 'show slave status\G'|grep 'Running:'|awk '{print $2}'|grep -c 'Yes')
if [ $count -ne 2 ];then
        echo '1'
else
        echo '0'
fi
# Make the script executable
[root@slave ~]# chmod +x /scripts/check_mysql_repl.sh 
[root@slave ~]# ll /scripts/
total 12
-rwxr-xr-x. 1 root root  179 Apr 29 15:33 check_mysql_repl.sh
-rwxr-xr-x. 1 root root  118 Apr 29 00:02 check_process.sh
# Test the script
[root@slave ~]# bash /scripts/check_mysql_repl.sh 
0
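The script counts how many of the two `..._Running:` fields say `Yes`. A Python sketch of the same check follows, written as a pure function over the `show slave status\G` text so it can be tried without a database. The function names are hypothetical, and the sample text is abbreviated from the output shown in this section.

```python
def parse_vertical(text):
    """Parse 'Name: value' lines of mysql vertical output into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        # The '*** 1. row ***' separator has no colon and is skipped
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def replication_status(text):
    """Return 0 if both replication threads are running, otherwise 1."""
    fields = parse_vertical(text)
    ok = (fields.get("Slave_IO_Running") == "Yes"
          and fields.get("Slave_SQL_Running") == "Yes")
    return 0 if ok else 1


healthy = """\
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.110.50
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
"""
stopped = healthy.replace("Running: Yes", "Running: No")
print(replication_status(healthy), replication_status(stopped))
```

Comparing the two named fields avoids depending on how many lines happen to contain `Running:`.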
Add the metric
//slave
# Enable custom monitoring and add the metric
[root@slave ~]# vim /usr/local/etc/zabbix_agentd.conf
# Mandatory: no
# Default:
# TLSCipherAll=
UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
UserParameter=check_logs[*],/scripts/log.py $1 $2 $3
# Append the following at the end of the file
UserParameter=check_mysql_repl,/scripts/check_mysql_repl.sh
# Restart the Zabbix agent
[root@slave ~]# pkill zabbix
[root@slave ~]# zabbix_agentd 
[root@slave ~]# ss -antl
State     Recv-Q    Send-Q         Local Address:Port          Peer Address:Port    
LISTEN    0         80                   0.0.0.0:3306               0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:80                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:22                 0.0.0.0:*       
LISTEN    0         128                  0.0.0.0:10050              0.0.0.0:*       
LISTEN    0         128                     [::]:22                    [::]:* 
# From the server, test whether the client's metric can be fetched
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k check_mysql_repl
0
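Configure the web interface
- Add the item
Go to Configuration ---> Hosts ---> Items of the client host (192.168.110.40) ---> Create item (upper right)

- Add the trigger
Go to Configuration ---> Hosts ---> Triggers of the client host (192.168.110.40) ---> Create trigger (upper right)

- Configure media and actions
For how to configure media and actions, see: "Zabbix monitoring service: three ways to configure email alerts"
- Fire the trigger
//slave
# Stop the slave threads to fire the alert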
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 70
Server version: 10.3.28-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> stop slave;
Query OK, 0 rows affected, 1 warning (0.015 sec)
MariaDB [(none)]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State: 
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000006
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: No
             Slave_SQL_Running: No
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
1 row in set (0.000 sec)
MariaDB [(none)]> exit
Bye
- Check the mailbox

4. Custom monitoring of MySQL replication delay
Write the script
//slave
# Script contents
[root@slave ~]# vim /scripts/check_mysql_delay.sh 
#!/bin/bash
  
mysql -uroot -e 'show slave status \G'|grep 'Seconds_Behind_Master:'|awk '{print $2}'
# Make the script executable
[root@slave ~]# chmod +x /scripts/check_mysql_delay.sh
[root@slave ~]# ll /scripts/
total 16
-rwxr-xr-x. 1 root root  100 Apr 29 17:11 check_mysql_delay.sh
-rwxr-xr-x. 1 root root  179 Apr 29 15:33 check_mysql_repl.sh
-rwxr-xr-x. 1 root root  118 Apr 29 00:02 check_process.sh
# Test the script
[root@slave ~]# mysql -uroot -e 'show slave status \G'
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000007
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
# Number of seconds the slave is behind the master
         Seconds_Behind_Master: 0
[root@slave ~]# bash /scripts/check_mysql_delay.sh 
0
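One case the one-line script does not cover: when the replication threads are stopped, `Seconds_Behind_Master` is reported as `NULL`, which a numeric item cannot store. The Python sketch below shows one way to handle it, by mapping any non-numeric value to a sentinel. The function name and the choice of `-1` as the sentinel are assumptions made for this illustration.

```python
def seconds_behind_master(status_text, unavailable=-1):
    """Return the replication delay in seconds, or `unavailable` if it is not a number."""
    for line in status_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "Seconds_Behind_Master":
            value = value.strip()
            # NULL (threads stopped) or any other non-numeric text maps to the sentinel
            return int(value) if value.isdigit() else unavailable
    return unavailable


running = "         Seconds_Behind_Master: 0\n"
stopped = "         Seconds_Behind_Master: NULL\n"
print(seconds_behind_master(running), seconds_behind_master(stopped))
```

A trigger could then treat the sentinel as a separate problem state from a large positive delay.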
Add the metric
//slave
# Enable custom monitoring and add the metric
[root@slave ~]# vim /usr/local/etc/zabbix_agentd.conf
# Mandatory: no
# Default:
# TLSCipherAll=
UnsafeUserParameters=1
UserParameter=check_apache,/scripts/check_process.sh httpd
UserParameter=check_logs[*],/scripts/log.py $1 $2 $3
UserParameter=check_mysql_repl,/scripts/check_mysql_repl.sh
# Append the following at the end of the file
UserParameter=check_mysql_delay,/scripts/check_mysql_delay.sh
# Restart the Zabbix agent
[root@slave ~]# pkill zabbix
[root@slave ~]# zabbix_agentd 
# From the server, test whether the client's metric can be fetched
[root@zabbix ~]# zabbix_get -s 192.168.110.40 -k check_mysql_delay
0
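Configure the web interface
- Add the item
Go to Configuration ---> Hosts ---> Items of the client host (192.168.110.40) ---> Create item (upper right)

- Add the trigger
Go to Configuration ---> Hosts ---> Triggers of the client host (192.168.110.40) ---> Create trigger (upper right)

- Configure media and actions
For how to configure media and actions, see: "Zabbix monitoring service: three ways to configure email alerts"
- Fire the trigger
//slave
# Start the slave; the delay is 0, and for this test a delay of 0 fires the alert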
[root@slave ~]# mysql -uroot
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 407
Server version: 10.3.28-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.110.50
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000001
           Read_Master_Log_Pos: 652
                Relay_Log_File: myrelay.000008
                 Relay_Log_Pos: 555
         Relay_Master_Log_File: mysql-bin.000001
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB: 
           Replicate_Ignore_DB: 
# The delay is 0
         Seconds_Behind_Master: 0
1 row in set (0.000 sec)
MariaDB [(none)]> exit
Bye
For this demonstration the trigger is temporarily changed to fire when the delay is 0

- Check the mailbox


 
                
            
         