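1. Where the problem comes from
For a website, what outside users see is its pages. Whether those pages can be reached, and whether they render correctly, is therefore the most direct outward sign of the site's overall quality.
So how can we detect a broken page as soon as it breaks and alert the responsible engineers, rather than scrambling to fix it after users phone in to complain? The key is to find the problem first: detect pages that fail to load or render incorrectly, and raise an alert promptly. This can of course be checked by hand, but that does not scale to large, complex sites, so monitoring software is the better fit. I use Nagios, whose plugins (Plugins) include a command for monitoring web pages.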
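2. How to solve this with Nagios
There is plenty of material online about installing and configuring Nagios, NRPE, and Nagios Plugins, and my blog has articles on it as well, so I will not go over it here.
To monitor pages with Nagios, the plugin command that matters is check_http, which checks whether a page is working and reachable. Its help text and usage are as follows.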
check_http v2.1.1 (nagios-plugins 2.1.1)
Copyright (c) 1999 Ethan Galstad <nagios@nagios.org>
Copyright (c) 1999-2014 Nagios Plugin Development Team <devel@nagios-plugins.org>

This plugin tests the HTTP service on the specified host. It can test
normal (http) and secure (https) servers, follow redirects, search for
strings and regular expressions, check connection times, and report on
certificate expiration times.

Usage:
 check_http -H <vhost> | -I <IP-address> [-u <uri>] [-p <port>]
       [-J <client certificate file>] [-K <private key>]
       [-w <warn time>] [-c <critical time>] [-t <timeout>] [-L] [-E] [-a auth]
       [-b proxy_auth] [-f <ok|warning|critcal|follow|sticky|stickyport>]
       [-e <expect>] [-d string] [-s string] [-l] [-r <regex> | -R <case-insensitive regex>]
       [-P string] [-m <min_pg_size>:<max_pg_size>] [-4|-6] [-N] [-M <age>]
       [-A string] [-k string] [-S <version>] [--sni] [-C <warn_age>[,<crit_age>]]
       [-T <content-type>] [-j method]
NOTE: One or both of -H and -I must be specified

Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 --extra-opts=[section][@file]
    Read options from an ini file. See
    https://www.nagios-plugins.org/doc/extra-opts.html
    for usage and examples.
 -H, --hostname=ADDRESS
    Host name argument for servers using host headers (virtual host)
    Append a port to include it in the header (eg: example.com:5000)
 -I, --IP-address=ADDRESS
    IP address or name (use numeric address if possible to bypass DNS lookup).
 -p, --port=INTEGER
    Port number (default: 80)
 -4, --use-ipv4
    Use IPv4 connection
 -6, --use-ipv6
    Use IPv6 connection
 -S, --ssl=VERSION
    Connect via SSL. Port defaults to 443. VERSION is optional, and prevents
    auto-negotiation (1 = TLSv1, 2 = SSLv2, 3 = SSLv3).
 --sni
    Enable SSL/TLS hostname extension support (SNI)
 -C, --certificate=INTEGER[,INTEGER]
    Minimum number of days a certificate has to be valid. Port defaults to 443
    (when this option is used the URL is not checked.)
 -J, --client-cert=FILE
    Name of file that contains the client certificate (PEM format)
    to be used in establishing the SSL session
 -K, --private-key=FILE
    Name of file containing the private key (PEM format)
    matching the client certificate
 -e, --expect=STRING
    Comma-delimited list of strings, at least one of them is expected in
    the first (status) line of the server response (default: HTTP/1.)
    If specified skips all other status line logic (ex: 3xx, 4xx, 5xx processing)
 -d, --header-string=STRING
    String to expect in the response headers
 -s, --string=STRING
    String to expect in the content
 -u, --url=PATH
    URL to GET or POST (default: /)
 -P, --post=STRING
    URL encoded http POST data
 -j, --method=STRING  (for example: HEAD, OPTIONS, TRACE, PUT, DELETE)
    Set HTTP method.
 -N, --no-body
    Don't wait for document body: stop reading after headers.
    (Note that this still does an HTTP GET or POST, not a HEAD.)
 -M, --max-age=SECONDS
    Warn if document is more than SECONDS old. the number can also be of
    the form "10m" for minutes, "10h" for hours, or "10d" for days.
 -T, --content-type=STRING
    specify Content-Type header media type when POSTing
 -l, --linespan
    Allow regex to span newlines (must precede -r or -R)
 -r, --regex, --ereg=STRING
    Search page for regex STRING
 -R, --eregi=STRING
    Search page for case-insensitive regex STRING
 --invert-regex
    Return CRITICAL if found, OK if not
 -a, --authorization=AUTH_PAIR
    Username:password on sites with basic authentication
 -b, --proxy-authorization=AUTH_PAIR
    Username:password on proxy-servers with basic authentication
 -A, --useragent=STRING
    String to be sent in http header as "User Agent"
 -k, --header=STRING
    Any other tags to be sent in http header. Use multiple times for additional headers
 -E, --extended-perfdata
    Print additional performance data
 -L, --link
    Wrap output in HTML link (obsoleted by urlize)
 -f, --onredirect=<ok|warning|critical|follow|sticky|stickyport>
    How to handle redirected pages. sticky is like follow but stick to the
    specified IP address. stickyport also ensures port stays the same.
 -m, --pagesize=INTEGER<:INTEGER>
    Minimum page size required (bytes) : Maximum page size required (bytes)
 -w, --warning=DOUBLE
    Response time to result in warning status (seconds)
 -c, --critical=DOUBLE
    Response time to result in critical status (seconds)
 -t, --timeout=INTEGER:<timeout state>
    Seconds before connection times out (default: 10)
    Optional ":<timeout state>" can be a state integer (0,1,2,3) or a state STRING
 -v, --verbose
    Show details for command-line debugging (Nagios may truncate output)

Notes:
 This plugin will attempt to open an HTTP connection with the host.
 Successful connects return STATE_OK, refusals and timeouts return STATE_CRITICAL
 other errors return STATE_UNKNOWN. Successful connects, but incorrect reponse
 messages from the host result in STATE_WARNING return values. If you are
 checking a virtual server that uses 'host headers' you must supply the FQDN
 (fully qualified domain name) as the [host_name] argument.

 This plugin can also check whether an SSL enabled web server is able to
 serve content (optionally within a specified time) or whether the X509
 certificate is still valid for the specified number of days.

 Please note that this plugin does not check if the presented server
 certificate matches the hostname of the server, or if the certificate
 has a valid chain of trust to one of the locally installed CAs.

Examples:
 CHECK CONTENT: check_http -w 5 -c 10 --ssl -H www.verisign.com

 When the 'www.verisign.com' server returns its content within 5 seconds,
 a STATE_OK will be returned. When the server returns its content but exceeds
 the 5-second threshold, a STATE_WARNING will be returned. When an error occurs,
 a STATE_CRITICAL will be returned.

 CHECK CERTIFICATE: check_http -H www.verisign.com -C 14

 When the certificate of 'www.verisign.com' is valid for more than 14 days,
 a STATE_OK is returned. When the certificate is still valid, but for less than
 14 days, a STATE_WARNING is returned. A STATE_CRITICAL will be returned when
 the certificate is expired.

 CHECK CERTIFICATE: check_http -H www.verisign.com -C 30,14

 When the certificate of 'www.verisign.com' is valid for more than 30 days,
 a STATE_OK is returned. When the certificate is still valid, but for less than
 30 days, but more than 14 days, a STATE_WARNING is returned.
 A STATE_CRITICAL will be returned when certificate expires in less than 14 days

Send email to help@nagios-plugins.org if you have questions regarding use
of this software. To submit patches or suggest improvements, send email to
devel@nagios-plugins.org
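The states named in the Notes of the help text reach the caller as the plugin's exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN under the usual Nagios plugin convention). When trying check_http by hand it can help to see that code as a name. The following is only a rough sketch: state_name and run_check are hypothetical helpers of my own, not part of nagios-plugins, and the two calls at the end use stand-in commands so the sketch runs without check_http installed.

```shell
#!/bin/sh
# Map a plugin exit code to its Nagios state name.
# Assumes the usual plugin convention: 0, 1, 2, and anything else as UNKNOWN.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical wrapper: run the given command, discard its output,
# and print the state name that corresponds to its exit code.
run_check() {
    "$@" >/dev/null 2>&1
    state_name "$?"
}

# Stand-ins for real plugin runs.
run_check true             # exit code 0
run_check sh -c 'exit 2'   # exit code 2
```

With the plugin installed, the same wrapper could be called as run_check ./check_http -H www.example.com -u /index.html, where www.example.com is a placeholder host.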
In practice:
1. Modify the check_http command in command.cfg
This case monitors by domain name, so check_http arguments such as -H XXX.com and -u /index.aspx are passed in through $ARG1$ and $ARG2$.
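A command definition along these lines would pass the domain and the URL through the two argument macros. This is a sketch rather than the exact definition used in this case: it assumes $USER1$ points at the plugin directory, as it does in the stock resource.cfg, and that -H and -u are the only options needed.

```
# command.cfg (sketch; $USER1$ is assumed to be the plugin directory)
define command {
    command_name    check_http
    command_line    $USER1$/check_http -H $ARG1$ -u $ARG2$
}
```

Keeping the host name and URL out of the command line means one command definition can serve every monitored page; only the service definitions differ.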
2. Add the hosts and services to monitor in host.cfg and service.cfg.
host.cfg:
service.cfg:
The monitored domain name and URL are both passed as arguments.
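As an illustration of how the arguments are passed, the two files might contain entries like the ones below. Everything specific here is a placeholder of my own: the host web01, the address 192.0.2.10, the domain www.example.com, and the templates linux-server and generic-service, which come from the Nagios sample configuration and are assumed to exist.

```
# host.cfg (hypothetical host; replace the name and address with your own)
define host {
    use         linux-server        ; sample template, assumed to exist
    host_name   web01
    alias       Example web server
    address     192.0.2.10
}

# service.cfg (the two values after "!" become $ARG1$ and $ARG2$)
define service {
    use                   generic-service   ; sample template, assumed to exist
    host_name             web01
    service_description   HTTP www.example.com /index.aspx
    check_command         check_http!www.example.com!/index.aspx
}
```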
3. Check the configuration files and reload Nagios.
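A minimal sketch of this step follows. The nagios -v option runs the configuration check; the two paths and the service nagios reload command are assumptions based on a default source install, so adjust them to your system.

```shell
#!/bin/sh
# Assumed locations from a default source install; change them as needed.
NAGIOS_BIN=/usr/local/nagios/bin/nagios
NAGIOS_CFG=/usr/local/nagios/etc/nagios.cfg

if [ -x "$NAGIOS_BIN" ]; then
    # Reload only if the configuration check passes.
    # "service nagios reload" is an assumption; systemd hosts may need systemctl instead.
    "$NAGIOS_BIN" -v "$NAGIOS_CFG" && service nagios reload
else
    echo "nagios binary not found at $NAGIOS_BIN; adjust NAGIOS_BIN" >&2
fi
```

Running the check before the reload keeps a typo in host.cfg or service.cfg from being loaded into the running monitor.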
Examples:
1. -u: test whether a page can be opened. Either a relative path or an absolute URL can be used.
#./check_http -H www.****.com -u /url1/url2/index.html
#./check_http -H www.****.com -u http://www.****.com/url1/url2/index.html
#./check_http -H www.****.com -p 80 -u http://www.****.com/url1/url2/index.html
#./check_http -I xxx.xxx.xxx -u /url1/url2/index.html
2. Encrypted transport (SSL): -S
# ./check_http -H "log.gw.com.cn" -S
Connection refused
HTTP CRITICAL - Unable to open TCP socket
3. Check the server's HTTP protocol version or status code: -e
# ./check_http -I 114.80.136.138 -k "HOST:log.gw.com.cn" -e "HTTP/1.1"
HTTP OK: Status line output matched "HTTP/1.1" - 3088 bytes in 0.206 second response time |time=0.205964s;;;0.000000 size=3088B;;;0
# ./check_http -I 114.80.136.138 -k "HOST:log.gw.com.cn" -e "HTTP/1.0"
HTTP CRITICAL - Invalid HTTP response received from host: HTTP/1.1 200 OK
4. Search the returned page for a string: -s
# ./check_http -I 114.80.136.138 -k "HOST:log.gw.com.cn" -s "Piwik"
HTTP OK: HTTP/1.1 200 OK - 3088 bytes in 0.196 second response time |time=0.196134s;;;0.000000 size=3088B;;;0
The string "Piwik" appears in the page returned by:
$ curl 114.80.136.138 -H "host:log.gw.com.cn"
5. Access a page that requires authentication, with a username and password: -a
# ./check_http -I 10.15.62.38 -u /nagios/
HTTP WARNING: HTTP/1.1 401 Authorization Required - 726 bytes in 0.019 second response time |time=0.019393s;;;0.000000 size=726B;;;0
# ./check_http -I 10.15.62.38 -u /nagios/ -a nagiosadmin:nagios
HTTP OK: HTTP/1.1 200 OK - 917 bytes in 0.066 second response time |time=0.066009s;;;0.000000 size=917B;;;0
6. Connection timeout: -t
# ./check_http -I 10.15.62.38 -t 1
HTTP OK: HTTP/1.1 200 OK - 38056 bytes in 0.017 second response time |time=0.017460s;;;0.000000 size=38056B;;;0
7. Set the warning and critical alert thresholds: -w and -c
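The help text's own example for these options is check_http -w 5 -c 10 --ssl -H www.verisign.com. To make the threshold logic concrete, here is a small sketch of how a response time might be classified against the two values. classify_response_time is a hypothetical helper, not part of the plugin, and it assumes a threshold counts as crossed only when the time is strictly greater than it.

```shell
#!/bin/sh
# Hypothetical helper: classify a response time (seconds) against
# warning and critical thresholds, similar in spirit to -w and -c.
# Assumption: the state changes only when the time exceeds the threshold.
classify_response_time() {
    awk -v t="$1" -v w="$2" -v c="$3" 'BEGIN {
        if (t + 0 > c + 0)      print "CRITICAL"
        else if (t + 0 > w + 0) print "WARNING"
        else                    print "OK"
    }'
}

classify_response_time 0.2 5 10    # below both thresholds
classify_response_time 6.5 5 10    # above warning, below critical
classify_response_time 12 5 10     # above critical
```

The critical threshold is tested first so that a very slow response is never reported as a mere warning.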
8. Check the size of the returned page: -m (minimum:maximum, in bytes)
# ./check_http -H "log.gw.com.cn" -I 114.80.136.151 -m 10:400
HTTP WARNING: HTTP/1.1 200 OK - page size 3128 too large - 3128 bytes in 0.223 second response time |time=0.223207s;;;0.000000 size=3128B;10;0;0
9. Check whether the certificate is about to expire: -C
#check_http -H www.verisign.com -C 14
10. Set the User-Agent sent in the HTTP request header: -A
# ./check_http -H "log.gw.com.cn" -v
GET / HTTP/1.1
User-Agent: check_http/v1.4.16 (nagios-plugins 1.4.16)
Connection: close
Host: log.gw.com.cn
# ./check_http -H "log.gw.com.cn" -v -A "check_http"
GET / HTTP/1.1
User-Agent: check_http
Connection: close
Host: log.gw.com.cn
11. Read only the response headers and skip the page body: -N
# ./check_http -H "log.gw.com.cn" -I 114.80.136.151 -N
HTTP OK: HTTP/1.1 200 OK - 1460 bytes in 0.219 second response time |time=0.218590s;;;0.000000 size=1460B;;;0
# ./check_http -H "log.gw.com.cn" -I 114.80.136.151
HTTP OK: HTTP/1.1 200 OK - 3128 bytes in 0.220 second response time |time=0.220288s;;;0.000000 size=3128B;;;0
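12. Check the document's modification time: -M
# ./check_http -I 114.80.136.138 -k "HOST:log.gw.com.cn" -M 1
HTTP CRITICAL: HTTP/1.1 200 OK - Document modification date unknown - 3088 bytes in 0.197 second response time |time=0.196606s;;;0.000000 size=3088B;;;0
Here the document's modification time is unknown, most likely because the response carries no Last-Modified header.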