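I. Background
Blackbox monitoring checks what is exposed externally and what users can actually see or observe, such as web responses, network reachability, and port connectivity. It does not look inside a program or system, so it directly reflects the problems users run into. The 建设网 business is deployed across many sites, and the environment and level of virtualization differ from one data center to another, so network problems of various kinds are likely. To keep an undetected network fluctuation from turning into a prolonged outage, the network, the service ports, and the URLs of important websites need to be monitored. The existing monitoring system is built on the Prometheus stack, and the officially provided Blackbox Exporter can collect all of these metrics.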
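II. Monitoring Model
Blackbox Exporter acts much like a proxy. The targets to monitor are written into Prometheus's prometheus.yml configuration file as jobs that scrape Blackbox Exporter. On each scrape, Prometheus passes the target to Blackbox Exporter as a URL query parameter, and Blackbox Exporter carries out the actual probe.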
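The request that Prometheus ends up sending can be sketched as follows. This is a minimal illustration, not Prometheus's own code: build_probe_url is a hypothetical helper, and the exporter address 192.168.1.11:9115 and the http_2xx module are taken from the configuration later in this document.

```python
from urllib.parse import urlencode


def build_probe_url(exporter: str, module: str, target: str) -> str:
    """Return the /probe URL for one target (hypothetical helper)."""
    # module selects a prober defined in blackbox.yml; target is what gets probed.
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


if __name__ == "__main__":
    print(build_probe_url("192.168.1.11:9115", "http_2xx",
                          "http://www.123.com/index?dl"))
```

The target is URL-encoded so that characters such as ":" and "?" in the probed address do not get confused with the exporter's own URL.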
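III. Deployment
1. Upload and unpack blackbox_exporter. Blackbox Exporter can run on any server. To keep the monitoring setup simple, it is usually deployed on the Prometheus server unless there is a specific reason not to.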
rz   # upload blackbox_exporter-0.18.0.linux-amd64.tar.gz to the server (sz sends files the other way)
tar zxvf blackbox_exporter-0.18.0.linux-amd64.tar.gz
mv blackbox_exporter-0.18.0.linux-amd64/ /usr/local/blackbox_exporter-0.18.0
2. Change the configuration file /usr/local/blackbox_exporter-0.18.0/blackbox.yml to the following:
modules:
  http_2xx:
    prober: http
    timeout: 10s
    http:
      preferred_ip_protocol: "ip4"
      no_follow_redirects: true
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
        - expect: "^SSH-2.0-"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
        - send: "NICK prober"
        - send: "USER prober prober prober :prober"
        - expect: "PING :([^ ]+)"
          send: "PONG ${1}"
        - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
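The expect entries are regular expressions matched against data read from the connection. As a rough illustration of the ssh_banner pattern, the sketch below uses Python's re module. Blackbox Exporter itself is written in Go and uses Go's regular expression syntax, so this only approximates its behaviour; banner_matches is a hypothetical helper and the banner strings are made-up examples.

```python
import re

# Pattern copied from the ssh_banner module in blackbox.yml.
SSH_BANNER = re.compile(r"^SSH-2.0-")


def banner_matches(line: str) -> bool:
    """Return True if a banner line satisfies the ssh_banner expectation."""
    return SSH_BANNER.search(line) is not None


if __name__ == "__main__":
    for banner in ("SSH-2.0-OpenSSH_7.4", "SSH-1.99-OpenSSH_3.9", "220 mail ready"):
        print(banner, banner_matches(banner))
```

Because the dots in the pattern are unescaped, they match any character; writing ^SSH-2\.0- would be stricter.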
3. Add blackbox_exporter.service
[root@prometheus ~]# cat /etc/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter v0.18.0 for sccin production environment.
ConditionFileIsExecutable=/usr/local/blackbox_exporter-0.18.0/blackbox_exporter
Requires=network-online.target
After=network-online.target
[Service]
Type=simple
User=root
Group=root
WorkingDirectory=/usr/local/blackbox_exporter-0.18.0/
ExecStart=/usr/local/blackbox_exporter-0.18.0/blackbox_exporter --config.file blackbox.yml
PrivateTmp=true
StartLimitInterval=0
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
4. Start blackbox_exporter
systemctl daemon-reload
systemctl start blackbox_exporter
systemctl enable blackbox_exporter
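After the service starts, it can be worth confirming that the exporter is accepting connections. The sketch below assumes the exporter listens on its default port 9115 on the local machine. is_port_open is a hypothetical helper, and the check is conceptually similar to what the tcp_connect module does for a target.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: blackbox_exporter runs locally on its default port 9115.
    print("exporter reachable:", is_port_open("127.0.0.1", 9115))
```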
5. Add the required monitoring jobs to prometheus.yml (under scrape_configs) as needed
(1) TCP port connectivity monitoring
  - job_name: "PortListening"
    scrape_interval: 5s
    metrics_path: /probe
    params:
      module: [tcp_connect]
    static_configs:
      - targets: ['192.168.1.11:2890','192.168.1.12:3891']
        labels:
          blackbox: 'TCPPort'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.1.11:9115 # The blackbox exporter's real hostname:port.
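The three relabel rules can be hard to follow at first. The sketch below mimics their effect on a single target in plain Python. It is an illustration of the rules above rather than Prometheus's actual relabeling code, and apply_relabel is a hypothetical name.

```python
def apply_relabel(labels: dict, exporter: str) -> dict:
    """Mimic the three relabel_configs rules for a single target."""
    out = dict(labels)
    # Rule 1: copy the original target into the ?target= URL parameter.
    out["__param_target"] = out["__address__"]
    # Rule 2: expose the probed target as the instance label.
    out["instance"] = out["__param_target"]
    # Rule 3: point the scrape at the Blackbox Exporter itself.
    out["__address__"] = exporter
    return out


if __name__ == "__main__":
    before = {"__address__": "192.168.1.11:2890", "blackbox": "TCPPort"}
    print(apply_relabel(before, "192.168.1.11:9115"))
```

The net effect is that Prometheus connects to the exporter, while the stored series still carry the probed address in their instance label.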
(2) Ping monitoring of network reachability
  - job_name: "IPPing"
    scrape_interval: 5s
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['192.168.1.11','192.168.1.12']
        labels:
          blackbox: 'Ping'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.1.11:9115 # The blackbox exporter's real hostname:port.
(3) Monitoring website URLs for a 2xx response code
  - job_name: "HTTPCheck"
    scrape_interval: 15s
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: ['http://www.123.com/index?dl','http://www.dwjoif.cn/id?dl']
        labels:
          blackbox: 'HTTPCheck'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.1.11:9115 # The blackbox exporter's real hostname:port.
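6. Restart Prometheus
IV. Checking the Monitoring Data
1. Data from a successful HTTP probe
2. Data from a successful TCP probe
3. Data from a successful Ping probe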
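One way to check the results outside the Prometheus UI is to read the exporter's /probe output directly. The sketch below parses text in the Prometheus exposition format and pulls out probe_success (1 for success, 0 for failure). The SAMPLE text is an invented example, not captured output, and parse_metrics is a hypothetical helper that ignores timestamps and other less common syntax.

```python
def parse_metrics(text: str) -> dict:
    """Parse simple 'name value' lines, skipping comments and blank lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


# Invented sample; real output contains more metrics and different values.
SAMPLE = """\
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
probe_http_status_code 200
"""

if __name__ == "__main__":
    metrics = parse_metrics(SAMPLE)
    print("probe succeeded:", metrics["probe_success"] == 1.0)
```

In Prometheus itself, querying probe_success returns the same value per instance label.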