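Status monitoring
To monitor the access quality of each business system, monitoring and alerting are built on the LB-layer Nginx logs. They cover the response time (upstream_response_time) and the HTTP status code (upstream_status) of requests from the LB layer to the real servers. The procedure is recorded below: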
Basic information:
Load balancing uses Nginx + Keepalived.
Load-balanced domain: bs7001.kevin-inc.com (there are many load-balanced domains; this one serves as the example)
Log: bs7001.kevin-inc.com-access.log

1) Set the log_format of Nginx on the LB layer (for reference: http://www.cnblogs.com/kevingrace/p/5893499.html)
[root@inner-lb01 ~]# cat /data/nginx/conf/nginx.conf
......
    ######
    ## set access log format
    ######
    log_format  main  '$remote_addr $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '$http_user_agent $http_x_forwarded_for $request_time $upstream_response_time $upstream_addr $upstream_status';
    #######
.....

2) Set up the monitoring and alerting scripts

Log path
[root@inner-lb01 ~]# ll /data/nginx/logs/bs7001.kevin-inc.com-access.log
-rw-r--r-- 1 root root 0 Dec 13 17:00 /data/nginx/logs/bs7001.kevin-inc.com-access.log

sendEmail installation and configuration (for installation, see: http://www.cnblogs.com/kevingrace/p/5961861.html)
[root@inner-lb01 ~]# cat /opt/sendemail.sh          //this script can be used as-is
#!/bin/bash
# Filename: SendEmail.sh
# Notes: uses sendEmail
#
# Log file of this script
LOGFILE="/tmp/Email.log"
:>"$LOGFILE"
exec 1>"$LOGFILE"
exec 2>&1
SMTP_server='smtp.kevin.com'
username='notice@kevin.com'
password='notice@123'
from_email_address='notice@kevin.com'
to_email_address="$1"
message_subject_utf8="$2"
message_body_utf8="$3"
# Convert the subject to GB2312 so that a subject containing Chinese is not garbled in the received mail.
message_subject_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_subject_utf8
EOF`
[ $? -eq 0 ] && message_subject="$message_subject_gb2312" || message_subject="$message_subject_utf8"
# Convert the body to GB2312 so that the received body is not garbled.
message_body_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_body_utf8
EOF`
[ $? -eq 0 ] && message_body="$message_body_gb2312" || message_body="$message_body_utf8"
# Send the mail
sendEmail='/usr/local/bin/sendEmail'
set -x
$sendEmail -s "$SMTP_server" -xu "$username" -xp "$password" -f "$from_email_address" -t "$to_email_address" -u "$message_subject" -m "$message_body" -o message-content-type=text -o message-charset=gb2312

[root@inner-lb01 ~]# cd /opt/lb_log_monit.sh/
[root@inner-lb01 lb_log_monit.sh]# ll
total 12
-rwxr-xr-x 1 root root 1180 Feb  1 13:03 bs7001_request_status_monit.sh
-rwxr-xr-x 1 root root  821 Feb  1 11:20 bs7001_request_time_monit_request.sh
-rwxr-xr-x 1 root root  559 Feb  1 13:01 bs7001_request_time_monit.sh

Alert script for request response time (the script takes columns 3 and 10 and the last three columns of the log file)
[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit.sh
#!/bin/bash
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log|awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log
# Column 3 of the check log is $upstream_response_time in seconds.
# Print each line at or above 1 second exactly once; "-" (no upstream contacted) is skipped.
awk '$3 != "-" && $3 + 0 >= 1' /root/lb_log_check/bs7001.kevin-inc.com-check.log

[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit_request.sh
#!/bin/bash
/bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit.sh > /root/lb_log_check/bs7001.kevin-inc.com_request_time.log
NUM=`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log|wc -l`
if [ $NUM != 0 ];then
   /bin/bash /opt/sendemail.sh wangshibo@kevin.com "Response time of requests from the LB layer to bs7001.kevin-inc.com" "Response time has exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
   /bin/bash /opt/sendemail.sh linan@kevin.com "Response time of requests from the LB layer to bs7001.kevin-inc.com" "Response time has exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
else
   echo "Responses to requests from the LB layer to bs7001.kevin-inc.com are normal"
fi

[root@inner-lb01 lb_log_monit.sh]# ll /root/lb_log_check/
total 152
-rw-r--r-- 1 root root 147766 Feb  1 15:00 bs7001.kevin-inc.com-check.log
-rw-r--r-- 1 root root    216 Feb  1 15:00 bs7001.kevin-inc.com_request_time.log

Alert script for HTTP status codes (alerts on status codes 500, 502, 503 and 504)
[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_status_monit.sh
#!/bin/bash
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log|awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log
for i in `awk '{print $5}' /root/lb_log_check/bs7001.kevin-inc.com-check.log|sort|uniq`
do
   case "$i" in
   500|502|503|504)
      # Match on column 5 only, so the code is not picked up inside timestamps, URLs or addresses
      detail=`awk -v s="$i" '$5 == s' /root/lb_log_check/bs7001.kevin-inc.com-check.log`
      /bin/bash /opt/sendemail.sh wangshibo@kevin.com "HTTP status code of requests from the LB layer to bs7001.kevin-inc.com" "HTTP status code: ${i}\nDetails:\n${detail}"
      ;;
   *)
      echo "it is ok"
      ;;
   esac
done

3) Schedule the monitoring with crontab
[root@inner-lb01 lb_log_monit.sh]# crontab -l
#Monitor the request response time and HTTP status codes of each business system between the LB and the back-end servers
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit_request.sh >/dev/null 2>&1
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_status_monit.sh >/dev/null 2>&1

Take columns 3 and 10 and the last three columns of the corresponding log file
[root@inner-lb01 lb_log_monit.sh]# /usr/bin/tail -10 /data/nginx/logs/bs7001.kevin-inc.com-access.log|awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}'
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:06:02 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.006 192.168.1.21:7001 200
[01/Feb/2018:15:07:12 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.22:7001 200
[01/Feb/2018:15:07:51 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.21:7001 200
[01/Feb/2018:15:07:57 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.007 192.168.1.22:7001 200
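Both checks read the same last three fields of each log line, so they could be folded into a single pass over the log. The following is only a minimal sketch under stated assumptions, not the scripts used above: the function name check_upstream and the SLOW/ERROR labels are hypothetical, and it assumes every request is answered by a single upstream (when Nginx retries on another server, these variables hold several comma-separated values and the field positions shift).

```shell
#!/bin/bash
# check_upstream (hypothetical helper, not part of the original scripts).
# Reads access-log lines in the "main" log_format on stdin and prints
#   SLOW  <time> <upstream_response_time> <upstream_addr> <upstream_status>
#   ERROR <time> <upstream_response_time> <upstream_addr> <upstream_status>
# $1 is the response-time threshold in seconds (assumed default: 1).
check_upstream() {
    awk -v limit="${1:-1}" '
    {
        t    = $(NF-2)   # $upstream_response_time
        addr = $(NF-1)   # $upstream_addr
        s    = $NF       # $upstream_status

        # Nginx logs "-" when no upstream was contacted; skip those values.
        if (t != "-" && t + 0 >= limit + 0)
            print "SLOW", $3, t, addr, s

        # Alert only on 500, 502, 503 and 504.
        if (s ~ /^50[0234]$/)
            print "ERROR", $3, t, addr, s
    }'
}
```

It could be fed the same input as the scripts above, for example `/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log | check_upstream 1`, with the output passed to /opt/sendemail.sh only when it is non-empty.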


