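11 - Monitoring per-process CPU and memory usage with Zabbix low-level discovery (LLD) + combining it with active mode would probably work even better
Original: https://blog.csdn.net/u013272009/article/details/90486079
Zabbix automatic discovery (LLD)
LLD: Low-level discovery
Official documentation: https://www.zabbix.com/documentation/4.0/manual/discovery/low_level_discovery
Purpose: you define a rule, and the configuration for a number of monitored items that is not known in advance is generated automatically
For custom LLD rules, see the "Creating custom LLD rules" section of the official documentation above; it is quite useful
For example, monitoring the CPU and MEM usage of server processes with zabbix is a good fit for LLD
A practical example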

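As in the figure above, the Server CPU all graph covers a set of server processes whose number is only known once they have been started, and it may be different the next time they are started
Below, LLD is used to build that graph
1. Write a script that discovers the server names
For example, kgetserver.sh: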
#!/bin/bash
# Discovery rule: print LLD JSON with one {#PROCESSNAME} per process whose command line contains "Server".
# ps runs only once and awk places the commas, so the separators cannot disagree with a second, later ps call.
echo '{"data":['
ps aux | grep Server | grep -v grep | grep -v "$0" | grep -v kgetcpu | grep -v kgetmem | grep -v tail | awk '{if(NR>1)printf ",\n";printf "{\"{#PROCESSNAME}\":\"\\\"";for(i=11;i<=NF;i++){printf "%s",$i;if(i<NF)printf " "};printf "\\\"\"}"}END{if(NR>0)printf "\n"}'
echo ']}'
Running it prints:
{"data":[
{"{#PROCESSNAME}":"\"./MgrServer.dbg\""},
{"{#PROCESSNAME}":"\"./LogServer_Ex.dbg\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 1\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 2\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 3\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 4\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 5\""},
{"{#PROCESSNAME}":"\"./RecordServer_Ex.dbg --stderrthreshold 0 --log_dir ../log -s 6\""},
{"{#PROCESSNAME}":"\"./ProxyServer_Ex.dbg\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 1\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 2\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 3\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 4\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 5\""},
{"{#PROCESSNAME}":"\"./RedisSyncServer_Ex.dbg -s 6\""},
{"{#PROCESSNAME}":"\"./RobotServer_Ex.dbg\""},
{"{#PROCESSNAME}":"\"./LoginServer.dbg --stderrthreshold 0 --log_dir ../log -s 1\""},
{"{#PROCESSNAME}":"\"./LoginServer.dbg --stderrthreshold 0 --log_dir ../log -s 2\""},
{"{#PROCESSNAME}":"\"./LoginServer.dbg --stderrthreshold 0 --log_dir ../log -s 3\""}
]}
This script is the rule: it finds the services that are to be monitored
Another example, kgetproc.sh:
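The awk one-liner writes each command line into the JSON without escaping it, so a command line that contains a double quote or a backslash would produce invalid JSON. Below is a minimal sketch of a safer generator. `to_lld_json` is a hypothetical helper name, not part of the original scripts, and it assumes the command lines arrive one per line on stdin. It keeps the escaped quotes around each value so the output has the same shape as above.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original post):
# read one command line per input line, print Zabbix LLD JSON on stdout.
to_lld_json() {
    local line name first=1
    echo '{"data":['
    # "|| [ -n "$line" ]" keeps a last line that has no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        name=${line//\\/\\\\}     # escape backslashes first
        name=${name//\"/\\\"}     # then escape double quotes
        if [ "$first" -eq 1 ]; then first=0; else printf ',\n'; fi
        # wrap the value in escaped quotes, like the original script does
        printf '{"{#PROCESSNAME}":"\\"%s\\""}' "$name"
    done
    [ "$first" -eq 1 ] || printf '\n'
    echo ']}'
}
```

It could be fed with something like `ps -eo args= | grep Server | grep -v grep | to_lld_json`. That pipeline is an assumption as well, and it would still need the same `grep -v` exclusions as kgetserver.sh.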
#!/bin/bash
# Discovery rule: for every process matching $1, print {#PROCESSNAME}, {#PROCESSPID} and a running index {#PROCESSNO}.
# ps runs only once and awk places the commas, so the separators cannot disagree with a second, later ps call.
echo '{"data":['
ps aux | grep "$1" | grep -v grep | grep -v "$0" | grep -v kgetcpu | grep -v kgetmem | grep -v tail | grep -v defunct | awk '{if(NR>1)printf ",\n";printf "{\"{#PROCESSNAME}\":\"\\\"";for(i=11;i<=NF;i++){printf "%s",$i;if(i<NF)printf " "};printf "\\\"\", \"{#PROCESSPID}\":%s,\"{#PROCESSNO}\":%d}",$2,NR}END{if(NR>0)printf "\n"}'
echo ']}'
Running it prints:
[root@host-192-168-21-36 opt]# ./kgetproc.sh codis-server
{"data":[
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23790\"", "{#PROCESSPID}":28381,"{#PROCESSNO}":1},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23791\"", "{#PROCESSPID}":28486,"{#PROCESSNO}":2},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23792\"", "{#PROCESSPID}":28523,"{#PROCESSNO}":3},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23793\"", "{#PROCESSPID}":28576,"{#PROCESSNO}":4},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23794\"", "{#PROCESSPID}":28597,"{#PROCESSNO}":5},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23795\"", "{#PROCESSPID}":28633,"{#PROCESSNO}":6},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23796\"", "{#PROCESSPID}":28671,"{#PROCESSNO}":7},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23797\"", "{#PROCESSPID}":28707,"{#PROCESSNO}":8},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23798\"", "{#PROCESSPID}":28735,"{#PROCESSNO}":9},
{"{#PROCESSNAME}":"\"/home/fananchong/go/src/github.com/CodisLabs/codis/admin/../bin/codis-server 127.0.0.1:23799\"", "{#PROCESSPID}":28780,"{#PROCESSNO}":10}
]}
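The chain of `grep -v` filters exists mostly to keep the helper processes out of the result. As a rough alternative, `pgrep -af PATTERN` (procps-ng) prints "PID command line" pairs and never lists itself. The sketch below turns such lines into the same JSON shape. `pgrep_to_lld` is a hypothetical name, and the function only assumes "PID, space, command line" on each input line.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original post):
# convert "PID command line" lines on stdin into Zabbix LLD JSON.
pgrep_to_lld() {
    local pid cmd no=0
    echo '{"data":['
    # read puts the first word into pid and the rest of the line into cmd
    while read -r pid cmd || [ -n "$pid" ]; do
        [ -n "$pid" ] || continue
        no=$((no + 1))
        [ "$no" -eq 1 ] || printf ',\n'
        cmd=${cmd//\\/\\\\}      # JSON-escape backslashes
        cmd=${cmd//\"/\\\"}      # JSON-escape double quotes
        printf '{"{#PROCESSNAME}":"\\"%s\\"", "{#PROCESSPID}":%s,"{#PROCESSNO}":%d}' \
            "$cmd" "$pid" "$no"
    done
    [ "$no" -eq 0 ] || printf '\n'
    echo ']}'
}
```

A possible call is `pgrep -af codis-server | pgrep_to_lld`. When this runs inside a script that received the pattern as an argument, the script's own command line (and the `sudo` in front of it) also contains the pattern, so those entries would still have to be filtered out.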
2. Write scripts that report the CPU and MEM usage of a given process
For example, kgetcpu.sh:
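#!/bin/bash
# Print the %CPU that top reports for the process whose command line matches $1, or 0 if none is found.
# If several processes match, only the first PID is used. Note that "grep -v vi" also drops any command line containing "vi".
mypid=$(ps aux | grep "$1" | grep -v grep | grep -v "$0" | grep -v tail | grep -v defunct | grep -v vi | awk '{print $2}' | head -n 1)
# Column 9 of top's default batch output is %CPU.
getactive=$(top -b -n 1 | awk -v v="$mypid" '$1==v{print $9}')
if [[ -n $getactive ]]; then
    echo "$getactive"
else
    echo "0"
fi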
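Looking the process up by name on every poll is ambiguous when several instances share a command line. Since kgetproc.sh already reports a `{#PROCESSPID}`, the lookup could use the PID directly. The following is only a sketch: `cpu_from_top` is a hypothetical name, and it assumes top's default column layout, where %CPU is column 9 (the same assumption kgetcpu.sh makes).

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original post).
# $1: PID to look up; stdin: output of "top -b -n 1".
# Prints column 9 (%CPU in top's default layout) of the row whose first
# column equals the PID, or 0 when no such row exists.
cpu_from_top() {
    awk -v pid="$1" '
        $1 == pid { print $9; found = 1; exit }
        END       { if (!found) print 0 }
    '
}
```

A possible call is `top -b -n 1 -p 28381 | cpu_from_top 28381`, where `-p` limits top to the given PID.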
For example, kgetmem.sh:
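#!/bin/bash
# Print the resident memory (top's RES column) in bytes for the process matching $1, or 0 if none is found.
# If several processes match, only the first PID is used.
mypid=$(ps aux | grep "$1" | grep -v grep | grep -v "$0" | grep -v tail | grep -v defunct | grep -v vi | awk '{print $2}' | head -n 1)
# Column 6 of top's default batch output is RES: KiB by default, or a value with a "g" suffix for GiB.
getactive=$(top -b -n 1 | awk -v v="$mypid" '$1==v{print $6}')
if [[ -n $getactive ]]; then
    if [[ $getactive == *g* ]]; then
        getactive=${getactive%%g*}
        # GiB to bytes needs 1024^3; with only 1024*1024 this branch would return KiB while the other returns bytes.
        echo "$getactive*1024*1024*1024" | bc
    else
        # KiB to bytes
        echo $((getactive * 1024))
    fi
else
    echo "0"
fi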
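The script only distinguishes plain KiB values from values with a `g` suffix. As far as I know, top can also scale the column to other units such as `m` or `t`, so a single conversion routine may be more robust. Below is a small sketch; `res_to_bytes` is a hypothetical name, and the set of suffixes it handles (none, `m`, `g`, `t`) is an assumption about top's output.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original post).
# $1: a RES value as printed by top, e.g. "10240", "123.4m" or "1.2g".
# Prints the size in bytes; an empty argument yields 0.
res_to_bytes() {
    awk -v v="$1" 'BEGIN {
        mult = 1024                                  # no suffix: KiB
        if (v ~ /m$/)      mult = 1024 * 1024        # MiB
        else if (v ~ /g$/) mult = 1024 * 1024 * 1024 # GiB
        else if (v ~ /t$/) mult = 1024 * 1024 * 1024 * 1024
        sub(/[mgt]$/, "", v)                         # drop the unit suffix
        printf "%.0f\n", v * mult
    }'
}
```

For instance, `res_to_bytes 2048` converts 2048 KiB and `res_to_bytes 1g` converts 1 GiB. Unlike the `bc` branch, the result is always rounded to a whole number of bytes.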
The scripts above define what each monitored item measures
3. Configure the monitored items
For example, /etc/zabbix/zabbix_agentd.d/userparameter_mygraph.conf:
UserParameter=myGraph.server_cpu[*],sudo /opt/kgetcpu.sh $1
UserParameter=myGraph.server_mem[*],sudo /opt/kgetmem.sh $1
UserParameter=myGraph.server_process[*],sudo /opt/kgetserver.sh
UserParameter=myGraph.proc[*],sudo /opt/kgetproc.sh $1
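Each UserParameter entry has to sit on its own line in the agent configuration. If two end up on one line, for example after a copy and paste, the second one becomes part of the first entry's command. A quick sanity check before restarting the agent could look like the sketch below; `check_userparams` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original post).
# $1: path to a UserParameter config file.
# Reports every line that holds more than one "UserParameter=" entry and
# returns a non-zero status if any such line exists.
check_userparams() {
    awk '
        {
            line = $0
            # gsub returns how many times the pattern occurred on this line
            n = gsub(/UserParameter=/, "", line)
            if (n > 1) {
                printf "line %d: %d UserParameter entries on one line\n", NR, n
                bad = 1
            }
        }
        END { exit bad }
    ' "$1"
}
```

It would be called as `check_userparams /etc/zabbix/zabbix_agentd.d/userparameter_mygraph.conf`. Once the file looks right, a single key can be tried out on the agent host with `zabbix_agentd -t 'myGraph.proc[codis-server]'`.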
Restart the service:
systemctl restart zabbix-agent.service
The remaining steps are done on the pages of the zabbix frontend
4. Create a Discovery rule on the template (Templates)

Similar to the figure above
5. Create Item prototypes on the template (Templates)

Similar to the figure above
6. Create a Graph prototype on the template (Templates)

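Similar to the figure above
At this point, all monitored items are generated automatically
Create the Server CPU all graph on the Host
(For now it is created by hand; in principle it could be generated automatically as well. I will go through the documentation when I have time and add that.)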
