Basic information:
Load balancing: Nginx + Keepalived
Load-balanced domain: bs7001.kevin-inc.com (there are many load-balanced domains; this one is used as the example)
Log: bs7001.kevin-inc.com-access.log

1) Set the log_format on the LB-layer Nginx (see http://www.cnblogs.com/kevingrace/p/5893499.html)

[root@inner-lb01 ~]# cat /data/nginx/conf/nginx.conf
......
######
## set access log format
######
log_format main '$remote_addr $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '$http_user_agent $http_x_forwarded_for $request_time $upstream_response_time $upstream_addr $upstream_status';
#######
.....

2) Monitoring and alerting scripts

Log path:

[root@inner-lb01 ~]# ll /data/nginx/logs/bs7001.kevin-inc.com-access.log
-rw-r--r-- 1 root root 0 Dec 13 17:00 /data/nginx/logs/bs7001.kevin-inc.com-access.log

sendEmail installation and configuration (for installation see http://www.cnblogs.com/kevingrace/p/5961861.html). The script below can be used as-is:

[root@inner-lb01 ~]# cat /opt/sendemail.sh
#!/bin/bash
# Filename: SendEmail.sh
# Notes: uses sendEmail
#
# Log file for this script
LOGFILE="/tmp/Email.log"
:>"$LOGFILE"
exec 1>"$LOGFILE"
exec 2>&1
SMTP_server='smtp.kevin.com'
username='notice@kevin.com'
password='notice@123'
from_email_address='notice@kevin.com'
to_email_address="$1"
message_subject_utf8="$2"
message_body_utf8="$3"
# Convert the subject to GB2312 so that a subject containing Chinese is not garbled in the received mail.
message_subject_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_subject_utf8
EOF`
[ $? -eq 0 ] && message_subject="$message_subject_gb2312" || message_subject="$message_subject_utf8"
# Convert the body to GB2312 so that the received mail body is not garbled.
message_body_gb2312=`iconv -t GB2312 -f UTF-8 << EOF
$message_body_utf8
EOF`
[ $? -eq 0 ] && message_body="$message_body_gb2312" || message_body="$message_body_utf8"
# Send the mail
sendEmail='/usr/local/bin/sendEmail'
set -x
$sendEmail -s "$SMTP_server" -xu "$username" -xp "$password" -f "$from_email_address" -t "$to_email_address" -u "$message_subject" -m "$message_body" -o message-content-type=text -o message-charset=gb2312

[root@inner-lb01 ~]# cd /opt/lb_log_monit.sh/
[root@inner-lb01 lb_log_monit.sh]# ll
total 12
-rwxr-xr-x 1 root root 1180 Feb  1 13:03 bs7001_request_status_monit.sh
-rwxr-xr-x 1 root root  821 Feb  1 11:20 bs7001_request_time_monit_request.sh
-rwxr-xr-x 1 root root  559 Feb  1 13:01 bs7001_request_time_monit.sh

Response-time monitoring and alerting script. It extracts columns 3 and 10 plus the last three columns of the access log. With the log_format above these are the request time stamp, the referer, $upstream_response_time, $upstream_addr and $upstream_status, so the value compared against the 1-second threshold is the upstream response time.

[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit.sh
#!/bin/bash
# Keep only the five columns of interest from the last 1000 log entries
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log | awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log
for i in `awk '{print $3}' /root/lb_log_check/bs7001.kevin-inc.com-check.log`
do
    # Convert seconds to whole milliseconds so the values can be compared as integers
    a=$(printf "%f" `echo ${i}*1000|bc` | awk -F"." '{print $1}')
    b=$(printf "%f" `echo 1*1000|bc` | awk -F"." '{print $1}')
    if [ $a -ge $b ]; then
        # Print every extracted line that contains this response time
        cat /root/lb_log_check/bs7001.kevin-inc.com-check.log | grep $i
    else
        echo "it is ok" > /dev/null 2>&1
    fi
done

[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_time_monit_request.sh
#!/bin/bash
# Run the check and save any slow requests it prints
/bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit.sh > /root/lb_log_check/bs7001.kevin-inc.com_request_time.log
NUM=`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log | wc -l`
if [ $NUM != 0 ]; then
    /bin/bash /opt/sendemail.sh wangshibo@kevin.com "Response time of requests to bs7001.kevin-inc.com from the LB layer" "Response time has exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
    /bin/bash /opt/sendemail.sh linan@kevin.com "Response time of requests to bs7001.kevin-inc.com from the LB layer" "Response time has exceeded 1 second!\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com_request_time.log`"
else
    echo "Responses to requests for bs7001.kevin-inc.com from the LB layer are normal"
fi

[root@inner-lb01 lb_log_monit.sh]# ll /root/lb_log_check/
total 152
-rw-r--r-- 1 root root 147766 Feb  1 15:00 bs7001.kevin-inc.com-check.log
-rw-r--r-- 1 root root    216 Feb  1 15:00 bs7001.kevin-inc.com_request_time.log

HTTP status code monitoring and alerting script (alerts on status codes 500, 502, 503 and 504):

[root@inner-lb01 lb_log_monit.sh]# cat bs7001_request_status_monit.sh
#!/bin/bash
# Keep only the five columns of interest from the last 1000 log entries
/usr/bin/tail -1000 /data/nginx/logs/bs7001.kevin-inc.com-access.log | awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}' > /root/lb_log_check/bs7001.kevin-inc.com-check.log
# Column 5 of the extracted file is $upstream_status; check each distinct value once
for i in `awk '{print $5}' /root/lb_log_check/bs7001.kevin-inc.com-check.log | sort | uniq`
do
    if [ ${i} = 500 ]; then
        /bin/bash /opt/sendemail.sh wangshibo@kevin.com "HTTP status code of requests to bs7001.kevin-inc.com from the LB layer" "HTTP status code: 500\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com-check.log |grep ${i}`"
    elif [ ${i} = 502 ]; then
        /bin/bash /opt/sendemail.sh wangshibo@kevin.com "HTTP status code of requests to bs7001.kevin-inc.com from the LB layer" "HTTP status code: 502\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com-check.log |grep ${i}`"
    elif [ ${i} = 503 ]; then
        /bin/bash /opt/sendemail.sh wangshibo@kevin.com "HTTP status code of requests to bs7001.kevin-inc.com from the LB layer" "HTTP status code: 503\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com-check.log |grep ${i}`"
    elif [ ${i} = 504 ]; then
        /bin/bash /opt/sendemail.sh wangshibo@kevin.com "HTTP status code of requests to bs7001.kevin-inc.com from the LB layer" "HTTP status code: 504\nDetails:\n`cat /root/lb_log_check/bs7001.kevin-inc.com-check.log |grep ${i}`"
    else
        echo "it is ok"
    fi
done

3) Schedule the checks with crontab

[root@inner-lb01 lb_log_monit.sh]# crontab -l
#Monitor response time and HTTP status codes of requests from the LB to the backend servers for each business system
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_time_monit_request.sh > /dev/null 2>&1
*/2 * * * * /bin/bash -x /opt/lb_log_monit.sh/bs7001_request_status_monit.sh > /dev/null 2>&1

Columns 3 and 10 plus the last three columns of the log file look like this:

[root@inner-lb01 lb_log_monit.sh]# /usr/bin/tail -10 /data/nginx/logs/bs7001.kevin-inc.com-access.log|awk '{print $3,$10,$(NF-2),$(NF-1),$(NF)}'
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.001 192.168.1.22:7001 304
[01/Feb/2018:15:05:41 "http://bs7001.kevin-inc.com/I8QW/dataimport/qw/found_analysis_imp8.jsp" 0.002 192.168.1.22:7001 304
[01/Feb/2018:15:06:02 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.006 192.168.1.21:7001 200
[01/Feb/2018:15:07:12 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.22:7001 200
[01/Feb/2018:15:07:51 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.003 192.168.1.21:7001 200
[01/Feb/2018:15:07:57 "http://bs7001.kevin-inc.com/portal/main_new.do" 0.007 192.168.1.22:7001 200