hank_gao


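Overall approach:
No Zabbix template is used; only a handful of parameters of interest are monitored. A single server connects remotely to every Redis node and collects the monitoring data. This may not be auto-discovery in the strict sense, but maintaining the node list in redis_nodes.txt is enough for the items and triggers to be generated automatically.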

Procedure:

Discovery source: read from a file
/var/lib/zabbix/redis_nodes.txt
172.17.99.111:8101
172.17.99.111:8102
172.17.99.112:8101
172.17.99.112:8102
172.17.99.113:8101
172.17.99.113:8102
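
Because every item and trigger is derived from this file, a malformed line would show up as a bogus discovered node. Below is a minimal sketch of a sanity check, assuming one IPv4 host:port entry per line. The function name check_nodes_file is hypothetical and not part of the original setup.

```shell
#!/bin/bash
# Hypothetical helper: print every non-blank line of a node list that is
# not of the form a.b.c.d:port, and return non-zero if any such line exists.
check_nodes_file() {
    # First grep drops blank lines; second grep keeps only the lines
    # that do NOT match the expected host:port pattern.
    bad=$(grep -v '^[[:space:]]*$' "$1" |
          grep -Ev '^[0-9]{1,3}(\.[0-9]{1,3}){3}:[0-9]{1,5}$')
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}
```

It could be called as check_nodes_file /var/lib/zabbix/redis_nodes.txt before the discovery script is trusted with a newly edited list.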

Discovery script:
UserParameter=discovery_consumers,sh /var/lib/zabbix/discovery/get_nodes.sh
#!/bin/bash
# item holds the unique host:port entries to discover; itemnum is their count.
item=$(awk 'NF {print $1}' /var/lib/zabbix/redis_nodes.txt | sort -u)
itemnum=$(awk 'NF {print $1}' /var/lib/zabbix/redis_nodes.txt | sort -u | wc -l)
# Print the discovery data as JSON. The inner double quotes must be escaped,
# otherwise the shell drops them and the output is not valid JSON.
num=0
echo '{"data":['
for name in $item
do
    num=$((num+1))
    if [ "$num" -eq "$itemnum" ]
    then
        # Last element: no trailing comma.
        echo "{\"{#REDISNODES}\":\"${name}\"}"
    else
        echo "{\"{#REDISNODES}\":\"${name}\"},"
    fi
done
echo "]}"
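
For the node file above, the agent has to return a single JSON object whose data array holds one {#REDISNODES} entry per node. As an alternative to the loop, here is a minimal sketch that builds the same structure in one awk pass, reading the node list from standard input. The function name nodes_to_lld is hypothetical, and unlike the script above it keeps the file order instead of sorting.

```shell
#!/bin/bash
# Hypothetical alternative: turn a host:port list on stdin into Zabbix
# low-level discovery JSON. Blank lines and duplicates are skipped.
nodes_to_lld() {
    awk 'BEGIN { printf "{\"data\":[" }
         NF && !seen[$1]++ {
             # Prefix every element except the first with a comma.
             printf "%s{\"{#REDISNODES}\":\"%s\"}", (n++ ? "," : ""), $1
         }
         END { print "]}" }'
}

# Example with two nodes, one of them listed twice:
printf '172.17.99.111:8101\n172.17.99.111:8101\n172.17.99.112:8101\n' | nodes_to_lld
# prints {"data":[{"{#REDISNODES}":"172.17.99.111:8101"},{"{#REDISNODES}":"172.17.99.112:8101"}]}
```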

Collection script, run from crontab on a schedule, for example once a minute:
#cat /var/lib/zabbix/monitor_redis_info.sh
#!/bin/bash
redis_pass=123456

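# Save the full INFO output of one node. $1 = host, $2 = port.
get_redis_info() {
    echo info | /usr/local/bin/redis-cli -c -h "${1}" -p "${2}" -a "${redis_pass}" 2>/dev/null > "/var/lib/zabbix/info/${1}-${2}-info.txt"
}

# Save the number of client connections (CLIENT LIST prints one line per client).
get_redis_conn() {
    echo "client list" | /usr/local/bin/redis-cli -c -h "${1}" -p "${2}" -a "${redis_pass}" 2>/dev/null | wc -l > "/var/lib/zabbix/conn/${1}-${2}-conn.txt"
}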

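# Record 0 if the node answers PING with PONG, 1 otherwise.
get_redis_ping() {
    # Use the full redis-cli path like the other functions; cron's PATH may not include /usr/local/bin.
    node_stats=$(echo ping | /usr/local/bin/redis-cli -c -h "${1}" -p "${2}" -a "${redis_pass}" 2>/dev/null)
    # The variable is quoted so that an empty reply (node down) does not break the test.
    if [ "$node_stats" = "PONG" ]
    then
        echo 0 > "/var/lib/zabbix/ping/${1}-${2}-ping.txt"
    else
        echo 1 > "/var/lib/zabbix/ping/${1}-${2}-ping.txt"
    fi
}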

# Record the replication health of one node: 0 = healthy, 1 = unhealthy.
# A master counts as healthy when INFO reports connected_slaves:1,
# a slave when it reports master_link_status:up.
get_redis_note_type() {
    # Field 3 of the node's own CLUSTER NODES line looks like "myself,master".
    node_type=$(echo cluster nodes | /usr/local/bin/redis-cli -c -p "$2" -h "$1" -a "${redis_pass}" 2>/dev/null | grep -F "$1:$2" | awk '{print $3}' | awk -F ',' '{print $2}')
    if [ "$node_type" = "master" ]
    then
        master_status=$(echo info replication | /usr/local/bin/redis-cli -c -p "$2" -h "$1" -a "${redis_pass}" 2>/dev/null | grep -c connected_slaves:1)
        if [ "$master_status" -eq 1 ]
        then
            echo 0 > "/var/lib/zabbix/note_type/${1}-${2}-note_type.txt"
        elif [ "$master_status" -eq 0 ]
        then
            echo 1 > "/var/lib/zabbix/note_type/${1}-${2}-note_type.txt"
        fi
    elif [ "$node_type" = "slave" ]
    then
        slave_status=$(echo info replication | /usr/local/bin/redis-cli -c -p "$2" -h "$1" -a "${redis_pass}" 2>/dev/null | grep -c master_link_status:up)
        if [ "$slave_status" -eq 1 ]
        then
            echo 0 > "/var/lib/zabbix/note_type/${1}-${2}-note_type.txt"
        elif [ "$slave_status" -eq 0 ]
        then
            echo 1 > "/var/lib/zabbix/note_type/${1}-${2}-note_type.txt"
        fi
    fi
}

# Collect every metric for each host:port entry in the node list.
for i in $(cat /var/lib/zabbix/redis_nodes.txt)
do
    redis_host=${i%%:*}
    redis_port=${i##*:}
    get_redis_info "${redis_host}" "${redis_port}"
    get_redis_conn "${redis_host}" "${redis_port}"
    get_redis_note_type "${redis_host}" "${redis_port}"
    get_redis_ping "${redis_host}" "${redis_port}"
done
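
Since cron rewrites these files while the agent may be reading them, an item can occasionally see an empty file: the > redirection truncates the target before redis-cli has answered. One possible hardening, which the original script does not use, is to write to a temporary file and rename it into place. A minimal sketch follows; the helper name write_metric is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read a value from stdin and publish it to the
# file named by $1 in one step. The target directory is created if needed.
write_metric() {
    target=$1
    dir=$(dirname "$target")
    mkdir -p "$dir" || return 1
    # The temp file lives in the same directory so that mv is a rename
    # on the same filesystem rather than a copy.
    tmp=$(mktemp "$dir/.tmp.XXXXXX") || return 1
    if cat > "$tmp"; then
        # mktemp creates the file with mode 0600; open it up so that the
        # agent user can still read the result.
        chmod 644 "$tmp"
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Possible use inside get_redis_ping:
#   echo 0 | write_metric "/var/lib/zabbix/ping/${1}-${2}-ping.txt"
```

A side effect of this design is that the ping, conn, info and note_type directories no longer have to be created by hand before the first cron run.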

Custom item:
UserParameter=monitor_redis.check.[*],sh /var/lib/zabbix/get_redis_info_detail.sh $1 $2

Custom item script:
/var/lib/zabbix/get_redis_info_detail.sh
#!/bin/bash
# Usage: get_redis_info_detail.sh <metric> <host:port>
discovery_redis_host=$(echo "${2}" | awk -F':' '{print $1}')
discovery_redis_port=$(echo "${2}" | awk -F':' '{print $2}')
redis_info_detail=$1
node=${discovery_redis_host}-${discovery_redis_port}
case "${redis_info_detail}" in
used_memory)
    # Memory used by Redis, converted from bytes to MB.
    grep -w used_memory "/var/lib/zabbix/info/${node}-info.txt" | awk -F":" '{print $2/1024/1024}'
    ;;
instantaneous_ops_per_sec)
    # Operations per second: a plain count, so it is not divided by 1024.
    grep -w instantaneous_ops_per_sec "/var/lib/zabbix/info/${node}-info.txt" | awk -F":" '{print $2}' | tr -d '\r'
    ;;
real_memory_use)
    # Resident set size in bytes; tr drops the CR that ends each INFO line.
    grep -w used_memory_rss "/var/lib/zabbix/info/${node}-info.txt" | awk -F":" '{print $2}' | tr -d '\r'
    ;;
conn)
    cat "/var/lib/zabbix/conn/${node}-conn.txt"
    ;;
cluster_stat)
    cat "/var/lib/zabbix/note_type/${node}-note_type.txt"
    ;;
ping)
    cat "/var/lib/zabbix/ping/${node}-ping.txt"
    ;;
*)
    # Unknown metric: print nothing.
    exit;
    ;;
esac
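
The INFO-based lookups above all follow one pattern: find the name:value line in the saved output and print the value. Redis ends INFO lines with CRLF, so a value taken as plain text can carry a trailing carriage return. Below is a minimal sketch of a single lookup helper that strips it. The name info_field is hypothetical, and the sample values are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the value of one field from INFO-style text
# on stdin. The field name is compared exactly, so "used_memory" does not
# also match "used_memory_rss".
info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# Made-up sample in the same name:value format, with CRLF line endings.
sample=$(printf 'used_memory:1572864\r\nused_memory_rss:4194304\r\ninstantaneous_ops_per_sec:12\r\n')
printf '%s\n' "$sample" | info_field used_memory_rss   # prints 4194304
```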

Zabbix configuration:

Some of the auto-discovered items:

Some of the triggers:

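Alert example. This was an early pitfall: either change the trigger to fire on a result of 1, or swap the 0 and 1 in the get_redis_ping function of /var/lib/zabbix/monitor_redis_info.sh; either works.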
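
The mismatch comes down to which value means "healthy". Here is a minimal sketch that states the convention in one place, assuming the meaning the script currently uses (0 = the node answered PONG, 1 = anything else). The helper name ping_status is hypothetical; whichever convention is chosen, the trigger expression has to use the same one.

```shell
#!/bin/bash
# Hypothetical helper: map a redis-cli PING reply to a status code.
# Convention assumed here: 0 = healthy (PONG), 1 = unhealthy.
ping_status() {
    if [ "$1" = "PONG" ]; then
        echo 0
    else
        echo 1
    fi
}

ping_status "PONG"   # prints 0
ping_status ""       # prints 1 (no reply, for example when the node is down)
```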

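Pitfall 1: the ping status codes were defined inconsistently between the script and the trigger and have to be aligned.
Pitfall 2: the information type of real_memory_use and used_memory should be changed to floating point.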
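
For used_memory at least, pitfall 2 follows from the arithmetic: dividing a byte count by 1024 twice rarely yields a whole number, so an item typed as an unsigned integer cannot store the result. Below is a small sketch of the conversion with a fixed number of decimals. The helper name bytes_to_mb is hypothetical and not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical helper: convert a byte count to MB with two decimals.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
bytes_to_mb() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1024 / 1024 }'
}

bytes_to_mb 1572864   # prints 1.50
```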

Posted on 2025-05-20 15:36 by hank_gao