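I. What is Prometheus?
Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud.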
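II. Why is Prometheus a better fit for cloud monitoring?
Cloud-native systems run on containers and Kubernetes (k8s). A monolithic architecture is split into many microservices, and because those microservices change and scale frequently, the set of targets to collect from changes frequently too. This places two requirements on time series monitoring:
It must collect from a very large number of pods and containers running across many hosts.
It must detect changes to those targets promptly.
A complete k8s monitoring ecosystem also has to be built around it. Prometheus was designed for exactly this kind of cloud monitoring.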
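III. What advantages does Prometheus have over Zabbix and similar tools?
Zabbix is written in C and PHP, whereas Prometheus is written in Go and generally runs fast.
Zabbix is a traditional host monitoring tool, used mainly for physical hosts, switches, and network monitoring. Prometheus covers host monitoring as well as cloud, SaaS, OpenStack, and container monitoring.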
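IV. Prometheus features
Prometheus is a one-stop monitoring and alerting platform with few dependencies and a complete feature set.
Prometheus supports monitoring of cloud and container environments, whereas other systems focus mainly on hosts.
Its query language (PromQL) is more expressive and has more powerful built-in statistical functions.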
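V. Prometheus use cases
Grafana visualizes the monitoring data and Alertmanager handles the configured alerts, which makes cloud-native monitoring and operations straightforward.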
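VI. Prometheus architecture
The Prometheus ecosystem consists of several components, many of which are optional:
Prometheus server: scrapes and stores time series data.
Client libraries: instrument the services to be monitored, generating metrics and exposing them to the Prometheus server. When the server pulls, they return the current state of the metrics.
Push gateway: mainly for short-lived jobs. Because such jobs exist only briefly, they may disappear before Prometheus gets a chance to pull from them. Instead, a job pushes its metrics to the Pushgateway, which the Prometheus server then scrapes. Machine-level metrics should be collected with the node exporter.
Exporters: expose the metrics of existing third-party services to Prometheus.
Alertmanager: after receiving alerts from the Prometheus server, it deduplicates them, groups them, and routes them to the matching receiver: webhook, PagerDuty, and so on.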
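The deduplicate-then-group behaviour described for Alertmanager can be pictured with a small simulation. This is only a sketch of the idea, not Alertmanager's actual implementation; the alert names `InstanceDown` and `HighLoad` and the function name are hypothetical.

```python
from collections import defaultdict


def dedupe_and_group(alerts, group_by):
    """Drop alerts with identical label sets, then group the rest by the group_by labels."""
    seen = set()
    groups = defaultdict(list)
    for alert in alerts:
        identity = tuple(sorted(alert["labels"].items()))
        if identity in seen:
            continue  # same label set already received: treat as a duplicate
        seen.add(identity)
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts; the second one repeats the first.
alerts = [
    {"labels": {"alertname": "InstanceDown", "instance": "172.20.0.27:9100"}},
    {"labels": {"alertname": "InstanceDown", "instance": "172.20.0.27:9100"}},
    {"labels": {"alertname": "InstanceDown", "instance": "172.20.0.27:9121"}},
    {"labels": {"alertname": "HighLoad", "instance": "172.20.0.27:9100"}},
]

groups = dedupe_and_group(alerts, ["alertname"])
for key, members in groups.items():
    print(key, len(members))
```

Grouping by `alertname` means one notification can cover every instance affected by the same problem instead of sending one message per instance.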
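Workflow:
- The Prometheus server periodically pulls metrics from the configured jobs or exporters, receives metrics that were pushed to a Pushgateway, or pulls metrics from other Prometheus servers.
- The Prometheus server stores the collected metrics locally and runs the defined alert.rules, either recording new time series or pushing alerts to Alertmanager.
- Alertmanager processes the alerts it receives according to its configuration file and sends out notifications.
- A graphical interface visualizes the collected data.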
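Data model:
- Prometheus stores data as time series. Each series is uniquely identified by a metric name and a set of labels (key-value pairs); a different label set means a different time series.
- Metric name: the name should be meaningful and normally describes what the metric measures, for example http_requests_total, the total number of HTTP requests. A metric name consists of ASCII letters, digits, underscores, and colons, and must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*
- Labels: labels give a time series its dimensions. For example, http_requests_total{method="GET"} selects the GET requests among all HTTP requests. With method="POST" it is a different time series. Label names consist of ASCII letters, digits, and underscores, and must match the regular expression [a-zA-Z_][a-zA-Z0-9_]*
- Samples: the actual data points of a series. Each sample consists of a float64 value and a millisecond-precision timestamp.
- Format: <metric name>{<label name>=<label value>, ...}, for example http_requests_total{method="POST", endpoint="/api/tracks"}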
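The naming rules can be checked mechanically. The sketch below assumes the classic Prometheus naming rules: colons are allowed in metric names but not in label names. It also shows why a different label set counts as a different series. The helper names are made up for illustration.

```python
import re

# Classic naming rules: colons are allowed in metric names only.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_metric_name(name):
    return METRIC_NAME_RE.fullmatch(name) is not None


def is_valid_label_name(name):
    return LABEL_NAME_RE.fullmatch(name) is not None


def series_id(metric, labels):
    """Identity of a time series: the metric name plus its sorted label pairs."""
    return (metric, tuple(sorted(labels.items())))


print(is_valid_metric_name("http_requests_total"))  # valid
print(is_valid_metric_name("2xx_total"))            # invalid: starts with a digit
print(series_id("http_requests_total", {"method": "GET"})
      == series_id("http_requests_total", {"method": "POST"}))
```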
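Four metric types:
The Prometheus client libraries provide four main metric types:
- Counter: a cumulative metric that only increases. Typical uses: number of requests, number of completed tasks, number of errors. For example, querying http_requests_total{method="get", job="Prometheus", handler="query"} returns 8, and querying again 10s later returns 14.
- Gauge: a metric whose value can go up or down arbitrarily. Typical uses: temperature, number of running goroutines. For example, go_goroutines{instance="172.17.0.2", job="Prometheus"} returns 147, and 10s later returns 124.
- Histogram (cumulative histogram): can be thought of as a bar chart. Typical uses: request duration, response size. It samples observations, groups them into buckets, and aggregates them. Example query: http_request_duration_microseconds_sum{job="prometheus", handler="query"}
- Summary: similar to a Histogram. Typical uses: request duration, response size. It provides the count and sum of the observations and can also report quantiles, so results can be tracked by percentile.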
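To make the Counter and Histogram semantics concrete, here is a minimal pure-Python sketch. It does not use the real client library; the class and function names are hypothetical, and the bucket bounds are arbitrary example values.

```python
import bisect


def counter_increase(samples):
    """Total increase of a counter across samples; a drop is treated as a reset to zero."""
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        total += curr - prev if curr >= prev else curr
    return total


class Histogram:
    """Counts observations per bucket and exposes them cumulatively, like *_bucket series."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        # First bucket whose upper bound is >= value (bounds are inclusive).
        index = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.sum += value
        self.count += 1

    def cumulative(self):
        running = 0
        result = []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            running += n
            result.append((bound, running))
        return result


# The counter example from the text: 8, then 14 ten seconds later.
print(counter_increase([8, 14]))

hist = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.4, 2.0):
    hist.observe(seconds)
print(hist.cumulative())
```

Because buckets are cumulative, each bucket count includes every observation at or below its upper bound, and the `+Inf` bucket always equals the total count.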
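Tasks:
Linux host monitoring (node_exporter), database monitoring (mysqld_exporter), Redis monitoring (redis_exporter), nginx monitoring (nginx_exporter), black-box testing (blackbox_exporter, that is, ICMP and port monitoring), Docker monitoring, and alerting through DingTalk or email.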
VII. Deployment and test steps
7. Install Prometheus
7.1 Download and extract
[root@CentOS7-TEST01 ~]# wget https://github.com/prometheus/prometheus/releases/download/v2.8.1/prometheus-2.8.1.linux-amd64.tar.gz
[root@CentOS7-TEST01 ~]# tar -zxvf prometheus-2.8.1.linux-amd64.tar.gz -C /usr/local/
[root@CentOS7-TEST01 ~]# cd /usr/local/
[root@CentOS7-TEST01 local]# mv prometheus-2.8.1.linux-amd64/ prometheus
[root@CentOS7-TEST01 local]# cd prometheus/
7.2 Edit the configuration file
[root@CentOS7-TEST01 prometheus]# vim prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['172.20.0.27:9090']

Start Prometheus:
[root@CentOS7-TEST01 prometheus]# ./prometheus
7.3 Register Prometheus as a service that starts at boot
[root@CentOS7-TEST01 ~]# vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=https://prometheus.io

[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml

[Install]
WantedBy=multi-user.target

[root@CentOS7-TEST01 ~]# systemctl daemon-reload
[root@CentOS7-TEST01 ~]# systemctl start prometheus
[root@CentOS7-TEST01 ~]# systemctl enable prometheus
[root@CentOS7-TEST01 ~]# systemctl status prometheus
[root@CentOS7-TEST01 ~]# netstat -lntp | grep 9090
7.4 Access test: http://172.20.0.27:9090/graph
8. Install node_exporter
Install node_exporter so that Prometheus can collect host metrics from it.
8.1 Download, extract, and install
[root@CentOS7-TEST01 ~]# wget https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
[root@CentOS7-TEST01 ~]# tar -zxf node_exporter-1.0.1.linux-amd64.tar.gz -C /usr/local/
[root@CentOS7-TEST01 ~]# mv /usr/local/node_exporter-1.0.1.linux-amd64 /usr/local/node_exporter-1.0.1
Create a symlink:
[root@CentOS7-TEST01 ~]# ln -sv /usr/local/node_exporter-1.0.1 /usr/local/node_exporter
8.2 Register as a service that starts at boot
[root@CentOS7-TEST01 ~]# vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/docs/introduction/overview
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=simple
PIDFile=/var/run/node_exporter.pid
ExecStart=/usr/local/node_exporter/node_exporter
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target

[root@CentOS7-TEST01 ~]# systemctl daemon-reload
[root@CentOS7-TEST01 ~]# systemctl enable node_exporter
[root@CentOS7-TEST01 ~]# systemctl start node_exporter
[root@CentOS7-TEST01 ~]# systemctl status node_exporter
8.3 Access test: http://172.20.0.27:9100/metrics
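An exporter's /metrics endpoint returns plain text in the Prometheus exposition format, so the access test can also be scripted. The sketch below parses such text into a dictionary. It assumes lines carry no trailing timestamp (node_exporter does not emit one), and the sample text is illustrative rather than real output from this host.

```python
from urllib.request import urlopen


def fetch_metrics_text(url, timeout=5):
    """Download the raw exposition text, e.g. from http://172.20.0.27:9100/metrics."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Map each series (name plus labels) to its float value, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and blank lines
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Illustrative sample in the exposition format (values are made up).
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""

samples = parse_metrics(SAMPLE)
print(samples["node_load1"])
```

Against the live exporter, `parse_metrics(fetch_metrics_text("http://172.20.0.27:9100/metrics"))` would return the same kind of dictionary.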
9. Install mysqld_exporter
9.1 Download and extract
[root@CentOS7-TEST01 ~]# wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz
[root@CentOS7-TEST01 ~]# tar -zxvf mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /usr/local/
[root@CentOS7-TEST01 ~]# mv /usr/local/mysqld_exporter-0.12.1.linux-amd64 /usr/local/mysqld_exporter
9.2 Install MariaDB
[root@CentOS7-TEST01 ~]# yum install mariadb-server mariadb -y
Check the installed version:
[root@CentOS7-TEST01 ~]# rpm -q mariadb-server mariadb
Enable the service at boot and start it:
[root@CentOS7-TEST01 ~]# systemctl enable mariadb
[root@CentOS7-TEST01 ~]# systemctl start mariadb
After installation the root password is empty by default, so run the security configuration script. MariaDB must be running before you do this:
[root@CentOS7-TEST01 ~]# mysql_secure_installation
Log in to the database:
[root@CentOS7-TEST01 ~]# mysql -uroot -p000000
9.3 Create a user and grant privileges
Create the exporter user in MySQL and grant it the required privileges:
MariaDB [(none)]> CREATE USER 'exporter'@'localhost' IDENTIFIED BY '000000';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
Query OK, 0 rows affected (0.00 sec)
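9.4 Edit the configuration file
Create a MySQL option file in the mysqld_exporter directory; the exporter reads it at startup to log in to MySQL. Each setting must be on its own line:
[root@CentOS7-TEST01 mysqld_exporter]# touch /usr/local/mysqld_exporter/.my.cnf
[root@CentOS7-TEST01 mysqld_exporter]# printf '[client]\nuser=exporter\npassword=000000\n' >> .my.cnf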
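A quick way to confirm that the option file is well formed is to parse it. The sketch below uses Python's `configparser`, which is only an approximation of the MySQL option-file syntax but is adequate for a file this simple; the function name is hypothetical.

```python
import configparser


def has_client_credentials(text):
    """Return True if the text has a [client] section with both user and password."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.Error:
        return False
    return parser.has_option("client", "user") and parser.has_option("client", "password")


GOOD = "[client]\nuser=exporter\npassword=000000\n"
# Everything on one line: the settings are not recognised as separate options.
BAD = "[client] user=exporter password=000000\n"

print(has_client_credentials(GOOD))
print(has_client_credentials(BAD))
```

To check the real file, pass the contents of /usr/local/mysqld_exporter/.my.cnf to the function.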
9.5 Start mysqld_exporter and enable it at boot
Start mysqld_exporter manually:
[root@CentOS7-TEST01 mysqld_exporter]# ./mysqld_exporter --config.my-cnf=.my.cnf
Register it as a service:
[root@CentOS7-TEST01 ~]# vim /usr/lib/systemd/system/mysql_exporter.service
[Unit]
Description=https://prometheus.io

[Service]
Restart=on-failure
ExecStart=/usr/local/mysqld_exporter/mysqld_exporter --config.my-cnf=/usr/local/mysqld_exporter/.my.cnf

[Install]
WantedBy=multi-user.target

[root@CentOS7-TEST01 ~]# systemctl daemon-reload
[root@CentOS7-TEST01 ~]# systemctl start mysql_exporter.service
[root@CentOS7-TEST01 ~]# systemctl enable mysql_exporter
9.6 Access test: http://172.20.0.27:9104/metrics
10. Install redis_exporter
10.1 Download and extract
[root@CentOS7-TEST01 local]# wget https://github.com/oliver006/redis_exporter/releases/download/v1.0.3/redis_exporter-v1.0.3.linux-amd64.tar.gz
[root@CentOS7-TEST01 local]# tar -xvf redis_exporter-v1.0.3.linux-amd64.tar.gz -C /usr/local
[root@CentOS7-TEST01 local]# mv redis_exporter-v1.0.3.linux-amd64 redis_exporter
[root@CentOS7-TEST01 local]# cd redis_exporter
Start redis_exporter.
With a password:
[root@CentOS7-TEST01 redis_exporter]# ./redis_exporter -redis.addr 172.20.0.27:6379 -redis.password 123456
Without a password:
[root@CentOS7-TEST01 redis_exporter]# ./redis_exporter -redis.addr 172.20.0.27:6379 &
10.2 Enable at boot
[root@CentOS7-TEST01 ~]# vim /etc/systemd/system/redis_exporter.service
[Unit]
Description=redis_exporter
Documentation=https://prometheus.io/docs/introduction/overview
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=simple
PIDFile=/var/run/redis_exporter.pid
ExecStart=/usr/local/redis_exporter/redis_exporter -redis.addr 172.20.0.27:6379
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target

[root@CentOS7-TEST01 ~]# systemctl daemon-reload
[root@CentOS7-TEST01 ~]# systemctl enable redis_exporter
[root@CentOS7-TEST01 ~]# systemctl start redis_exporter
[root@CentOS7-TEST01 ~]# systemctl status redis_exporter
10.3 Access test: http://172.20.0.27:9121/metrics
11. Install nginx_exporter
11.1 Download the nginx-module-vts module
[root@CentOS7-TEST01 ~]# cd /opt/
[root@CentOS7-TEST01 opt]# yum install git -y
[root@CentOS7-TEST01 opt]# git clone https://github.com/vozlt/nginx-module-vts.git
11.2 Download the nginx source to build with nginx-module-vts
[root@CentOS7-TEST01 opt]# wget http://nginx.org/download/nginx-1.18.0.tar.gz
[root@CentOS7-TEST01 opt]# tar -zxvf nginx-1.18.0.tar.gz
11.3 Compile and install
[root@CentOS7-TEST01 opt]# cd nginx-1.18.0
[root@CentOS7-TEST01 nginx-1.18.0]# ./configure --prefix=/usr/local/nginx --user=nginx --add-module=/opt/nginx-module-vts/
[root@CentOS7-TEST01 nginx-1.18.0]# make && make install
Verify the version:
Add the nginx user:
[root@CentOS7-TEST01 ~]# useradd -s /sbin/nologin -M nginx
11.4 Add settings to the nginx configuration file
Add to the server block:
Add to the http block:
11.5 Start nginx and access it
[root@CentOS7-TEST01 conf]# /usr/local/nginx/sbin/nginx
11.6 Install nginx-vts-exporter
[root@CentOS7-TEST01 ~]# wget https://github.com/hnlq715/nginx-vts-exporter/releases/download/v0.9.1/nginx-vts-exporter-0.9.1.linux-amd64.tar.gz
[root@CentOS7-TEST01 ~]# tar -zxvf nginx-vts-exporter-0.9.1.linux-amd64.tar.gz -C /usr/local/
[root@CentOS7-TEST01 ~]# mv /usr/local/nginx-vts-exporter-0.9.1.linux-amd64 /usr/local/nginx-vts-exporter
11.7 Enable at boot
[root@CentOS7-TEST01 ~]# vim /usr/lib/systemd/system/nginx_vts_exporter.service
[Unit]
Description=nginx_vts_exporter
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/nginx-vts-exporter/nginx-vts-exporter -nginx.scrape_uri http://172.20.0.27/status/format/json
Restart=on-failure

[Install]
WantedBy=multi-user.target

[root@CentOS7-TEST01 ~]# systemctl daemon-reload
[root@CentOS7-TEST01 ~]# systemctl enable nginx_vts_exporter
[root@CentOS7-TEST01 ~]# systemctl start nginx_vts_exporter
[root@CentOS7-TEST01 ~]# systemctl status nginx_vts_exporter
11.8 Access test in a browser: http://172.20.0.27:9913/metrics
12. Install blackbox_exporter
12.1 Download, extract, and install
[root@CentOS7-TEST01 ~]# wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.12.0/blackbox_exporter-0.12.0.linux-amd64.tar.gz
[root@CentOS7-TEST01 ~]# tar -zxvf blackbox_exporter-0.12.0.linux-amd64.tar.gz -C /usr/local/
[root@CentOS7-TEST01 ~]# mv /usr/local/blackbox_exporter-0.12.0.linux-amd64 /usr/local/blackbox_exporter
12.2 Enable at boot
[root@CentOS7-TEST01 ~]# vim /usr/lib/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target

[Service]
User=root
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target

[root@CentOS7-TEST01 ~]# systemctl daemon-reload
[root@CentOS7-TEST01 ~]# systemctl enable blackbox_exporter.service
[root@CentOS7-TEST01 ~]# systemctl start blackbox_exporter.service
12.3 Access test in a browser: http://172.20.0.27:9115
12.4 Monitor HTTP and ICMP
Monitored items: HTTP tests (define the request header information; check the HTTP status, HTTP response headers, and HTTP body content) and ICMP tests (host liveness checks).
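Each probe is an HTTP request to the exporter's /probe endpoint with `target` and `module` query parameters. As a sketch, the URL for a manual test can be built like this; the exporter address 172.20.0.27:9115 and the targets are the ones used in this walkthrough, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def probe_url(exporter, target, module):
    """Build the blackbox_exporter probe URL for one target and one module."""
    query = urlencode({"target": target, "module": module})
    return f"http://{exporter}/probe?{query}"


http_probe = probe_url("172.20.0.27:9115", "https://www.baidu.com/", "http_2xx")
icmp_probe = probe_url("172.20.0.27:9115", "172.18.5.100", "icmp")
print(http_probe)
print(icmp_probe)
```

Opening either URL in a browser or with curl returns the probe's metrics, including whether the probe succeeded.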
13. Docker monitoring
Docker monitoring uses cAdvisor to collect container information.
13.1 Install Docker and configure a registry mirror
[root@CentOS7-TEST02 ~]# yum -y install gcc && yum -y install gcc-c++
[root@CentOS7-TEST02 ~]# yum install docker -y
[root@CentOS7-TEST02 ~]# vim /etc/docker/daemon.json
{
  "registry-mirrors": ["https://e1****v.mirror.aliyuncs.com"]
}
13.2 Start Docker and enable it at boot
[root@CentOS7-TEST02 ~]# systemctl start docker
[root@CentOS7-TEST02 ~]# systemctl enable docker
[root@CentOS7-TEST02 ~]# systemctl status docker
13.3 Install cAdvisor
[root@CentOS7-TEST02 ~]# docker pull google/cadvisor:latest
[root@CentOS7-TEST02 ~]# vim cad.sh
docker run --privileged=true \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker:/var/lib/docker:ro \
  --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
  -p 8080:8080 \
  --detach=true \
  --name=cadvisor \
  google/cadvisor
[root@CentOS7-TEST02 ~]# chmod +x cad.sh
[root@CentOS7-TEST02 ~]# ./cad.sh
Check that the container is running:
[root@CentOS7-TEST02 ~]# docker ps
13.4 Access test: http://172.20.0.28:8080
14. Install Grafana
14.1 Download and extract
[root@CentOS7-TEST02 ~]# wget https://dl.grafana.com/oss/release/grafana-7.3.0.linux-amd64.tar.gz
[root@CentOS7-TEST02 ~]# tar -zxvf grafana-7.3.0.linux-amd64.tar.gz -C /usr/local/
[root@CentOS7-TEST02 ~]# mv /usr/local/grafana-7.3.0 /usr/local/grafana
[root@CentOS7-TEST02 ~]# cd /usr/local/grafana/
14.2 Start and test access
[root@CentOS7-TEST02 grafana]# ./bin/grafana-server
Open http://172.20.0.28:3000. The default account and password are admin/admin.
15. Install Alertmanager
15.1 Download and extract
[root@CentOS7-TEST02 ~]# wget https://github.com/prometheus/alertmanager/releases/download/v0.21.0/alertmanager-0.21.0.linux-amd64.tar.gz
[root@CentOS7-TEST02 ~]# tar -zxvf alertmanager-0.21.0.linux-amd64.tar.gz -C /usr/local/
[root@CentOS7-TEST02 ~]# mv /usr/local/alertmanager-0.21.0.linux-amd64 /usr/local/alertmanager
15.2 Email alert configuration
[root@CentOS7-TEST02 ~]# vim /usr/local/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_from: 'qw****ng@163.com' # sender address
  smtp_smarthost: 'smtp.163.com:465' # SMTP server of the mail provider
  smtp_auth_username: 'qw****ng@163.com' # mailbox login name
  smtp_auth_password: '*******' # authorization code
  smtp_require_tls: false
  smtp_hello: '163.com'
route:
  group_by: ['alertname'] # labels used to group alerts
  group_wait: 5s # how long to wait before sending the first notification for a new group
  group_interval: 5s # how long to wait before notifying about new alerts added to a group
  repeat_interval: 30m # how often to resend an alert; for email, do not set this too low, or the SMTP server may reject the mail for being sent too frequently
  receiver: 'email' # receiver of the alerts; must match a name under receivers below
receivers:
- name: 'email'
  email_configs: # email settings
  - to: '14*****91@qq.com' # address that receives the alerts
    headers: { Subject: "[WARN] Alert email" } # subject of the alert email
    send_resolved: true
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']
15.3 DingTalk alert configuration
Install the DingTalk alert plugin (the steps below use the release tarball).
Set up a robot in DingTalk.
[root@CentOS7-TEST02 ~]# wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v0.3.0/prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
Extract and install:
[root@CentOS7-TEST02 ~]# tar -zxvf prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
[root@CentOS7-TEST02 ~]# scp -r prometheus-webhook-dingtalk-0.3.0.linux-amd64 /usr/local/
[root@CentOS7-TEST02 ~]# cd /usr/local/prometheus-webhook-dingtalk-0.3.0.linux-amd64/
[root@CentOS7-TEST02 prometheus-webhook-dingtalk-0.3.0.linux-amd64]# vi dingding.sh
[root@CentOS7-TEST02 prometheus-webhook-dingtalk-0.3.0.linux-amd64]# chmod +x dingding.sh
Write the webhook configuration into alertmanager.yml:
[root@CentOS7-TEST02 alertmanager]# vim alertmanager.yml
global:
  resolve_timeout: 2m
route:
  receiver: webhook
  group_wait: 30s
  group_interval: 2m
  repeat_interval: 2m
  group_by: [alertname]
  routes:
  - receiver: webhook
    group_wait: 10s
receivers:
- name: webhook
  webhook_configs:
  - url: http://172.20.0.28:8060/dingtalk/webhook/send
    send_resolved: true
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']
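With a webhook receiver, Alertmanager POSTs a JSON document to the configured URL, and the receiving service turns it into a chat message. The sketch below renders such a payload into plain text. It only illustrates the payload shape and is not how prometheus-webhook-dingtalk formats messages; the alert name `RedisDown` and the `summary` annotation are hypothetical, since annotations depend on how the alert rules are written.

```python
def render_message(payload):
    """Turn an Alertmanager webhook payload into a short plain-text message."""
    lines = [f"[{payload['status'].upper()}] {len(payload['alerts'])} alert(s)"]
    for alert in payload["alerts"]:
        labels = alert.get("labels", {})
        summary = alert.get("annotations", {}).get("summary", "")
        name = labels.get("alertname", "unknown")
        instance = labels.get("instance", "unknown")
        lines.append(f"- {name} on {instance}: {summary}")
    return "\n".join(lines)


# Trimmed example payload; real payloads carry more fields (groupKey, externalURL, ...).
payload = {
    "version": "4",
    "status": "firing",
    "receiver": "webhook",
    "groupLabels": {"alertname": "RedisDown"},
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "RedisDown", "instance": "172.20.0.27:9121"},
            "annotations": {"summary": "redis is down"},
        }
    ],
}

message = render_message(payload)
print(message)
```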
15.4 Check the configuration and start Alertmanager
[root@CentOS7-TEST02 alertmanager]# ./amtool check-config alertmanager.yml
[root@CentOS7-TEST02 alertmanager]# ./alertmanager
16. Update the Prometheus configuration to add the node_exporter, mysqld_exporter, redis_exporter, nginx_exporter, blackbox_exporter, and Docker targets set up earlier
16.1 Configure the Prometheus targets
[root@CentOS7-TEST01 ~]# vim /usr/local/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['172.20.0.28:9093']

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "/usr/local/prometheus/rules/*.rules"
  #- "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['172.20.0.27:9090']
  - job_name: 'node-exporter'
    static_configs:
    - targets: ['172.20.0.27:9100']
      labels:
        instance: vm-172.20.0.27
        service: node-service
  - job_name: 'mysql'
    static_configs:
    - targets: ['172.20.0.27:9104']
      labels:
        service: mysql-service
  - job_name: 'redis'
    static_configs:
    - targets: ['172.20.0.27:9121']
      labels:
        service: redis-service
  - job_name: 'nginx'
    static_configs:
    - targets: ['172.20.0.27:9913']
      labels:
        service: nginx-service
  - job_name: 'blackbox_http_2xx'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
    - targets:
      - https://www.baidu.com/
      - http://172.18.5.100/
    relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 172.20.0.27:9115
  - job_name: 'blackbox_icmp'
    metrics_path: /probe
    params:
      module: [icmp] # without this the exporter falls back to its default http_2xx module
    static_configs:
    - targets:
      - 172.18.5.100 # ICMP probes take a host, not a URL
    relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 172.20.0.27:9115
  - job_name: 'docker'
    static_configs:
    - targets: ['172.20.0.28:8080']
      labels:
        service: docker-service
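The three relabel_configs steps in the blackbox jobs are easier to follow when traced by hand: the listed target becomes the `target` URL parameter and the `instance` label, and the scrape address is replaced by the exporter's address. The sketch below imitates just those three steps on a label dictionary; it is not Prometheus's relabeling engine, and the function name is hypothetical.

```python
def apply_blackbox_relabeling(target, exporter_address):
    """Trace the three relabel steps used by the blackbox jobs in this configuration."""
    labels = {"__address__": target}
    # 1. source_labels: [__address__] -> target_label: __param_target
    labels["__param_target"] = labels["__address__"]
    # 2. source_labels: [__param_target] -> target_label: instance
    labels["instance"] = labels["__param_target"]
    # 3. target_label: __address__, replacement: <exporter address>
    labels["__address__"] = exporter_address
    return labels


relabeled = apply_blackbox_relabeling("https://www.baidu.com/", "172.20.0.27:9115")
for name, value in relabeled.items():
    print(name, "=", value)
```

The result is that Prometheus scrapes the exporter at 172.20.0.27:9115 while the stored series still carry the probed site in their `instance` label.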
16.2 Edit the blackbox_exporter configuration file
Because black-box testing is involved, edit the blackbox_exporter configuration file:
[root@CentOS7-TEST01 ~]# vim /usr/local/blackbox_exporter/blackbox.yml
modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      valid_status_codes: [200]
      method: GET
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
    timeout: 5s
Because the Prometheus configuration references rule files, create the rules directory first; the alert rules themselves are written after the targets have been verified.
[root@CentOS7-TEST01 ~]# mkdir /usr/local/prometheus/rules/
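16.3 Restart blackbox_exporter and Prometheus, then verify that the services are being monitored
[root@CentOS7-TEST01 prometheus]# systemctl restart blackbox_exporter.service
[root@CentOS7-TEST01 ~]# systemctl restart prometheus
All targets are now being monitored. With monitoring in place, the next step is to configure the alert rules.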
16.4 Write rules for all monitored items
[root@CentOS7-TEST01 rules]# vim /usr/local/prometheus/rules/node_exporter.rules
[root@CentOS7-TEST01 rules]# vim /usr/local/prometheus/rules/mysql_alert.rules
[root@CentOS7-TEST01 rules]# vim /usr/local/prometheus/rules/redis_alert.rules
[root@CentOS7-TEST01 rules]# vim /usr/local/prometheus/rules/nginx_alert.rules
[root@CentOS7-TEST01 rules]# vim /usr/local/prometheus/rules/docker_alert.rules
[root@CentOS7-TEST01 rules]# vim /usr/local/prometheus/rules/blackbox_alert.rules
16.5 Restart Prometheus
Once all rules are written, restart the Prometheus service and verify them:
[root@CentOS7-TEST01 ~]# systemctl restart prometheus
The alert rules are as follows:
16.6 Modify a rule to trigger an alert
Change any rule here so that it triggers an alert for testing:
16.7 Stop the redis service and check the alert
[root@CentOS7-TEST01 rules]# systemctl stop redis.service
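After a service is stopped, its alert does not necessarily fire at once: a rule with a `for` duration first goes to pending and only fires once the condition has held for that long. The rule files are not reproduced in this walkthrough, so the sketch below only simulates that lifecycle; the 30-second `for` value and the 15-second evaluation steps are assumptions chosen for illustration.

```python
def alert_states(evaluations, for_seconds):
    """Return the alert state after each (timestamp_seconds, condition_is_true) evaluation."""
    states = []
    active_since = None
    for timestamp, condition_is_true in evaluations:
        if not condition_is_true:
            active_since = None  # condition cleared: the alert resets
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held_for = timestamp - active_since
        states.append("firing" if held_for >= for_seconds else "pending")
    return states


# The service goes down at t=15 and comes back by t=60.
evaluations = [(0, False), (15, True), (30, True), (45, True), (60, False)]
states = alert_states(evaluations, for_seconds=30)
print(states)
```

This is why the alert page may show the rule as pending for a short while before a notification is sent.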
17. Connect Grafana to Prometheus
17.1 Add a data source
Select Prometheus.
Enter the Prometheus address and connect.
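The data source can also be added through Grafana's HTTP API (`POST /api/datasources`) instead of the web UI. The sketch below only builds the request and does not send it. It assumes the Grafana and Prometheus addresses used in this walkthrough and the default admin/admin login; the data source name and function name are arbitrary.

```python
import base64
import json
from urllib.request import Request


def build_add_datasource_request(grafana_url, prometheus_url, user, password):
    """Build (but do not send) the request that registers Prometheus as a data source."""
    payload = {
        "name": "Prometheus",  # arbitrary display name
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",     # Grafana's backend queries Prometheus on the browser's behalf
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_add_datasource_request(
    "http://172.20.0.28:3000", "http://172.20.0.27:9090", "admin", "admin"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running Grafana instance.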
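17.2 node_exporter dashboard
Enter the dashboard ID.
8919 is the dashboard template for node_exporter. Importing by ID requires internet access; without it, download the dashboard from the official site for offline import: https://grafana.com/grafana/dashboards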
17.3 mysqld_exporter dashboard
Import my2.sql.
Download address: https://codeload.github.com/john1337/my2Collector/zip/master
Upload it to the server and extract it:
[root@CentOS7-TEST01 ~]# unzip my2Collector-master.zip
[root@CentOS7-TEST01 ~]# cd my2Collector-master && ll
Log in to MySQL and import the SQL file:
[root@CentOS7-TEST01 ~]# mysql -uroot -p000000
MariaDB [(none)]> source /root/my2Collector-master/my2.sql
7991 is the dashboard template for mysqld_exporter. Importing by ID requires internet access; without it, download the dashboard from the official site for offline import: https://grafana.com/grafana/dashboards
17.4 redis_exporter dashboard
Download address of the custom template used here:
https://grafana.com/api/dashboards/763/revisions/1/download
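The download link above follows a pattern of dashboard ID plus revision number, which is handy when preparing dashboards for offline import. The helper below simply reproduces that pattern. Which revisions exist for a given dashboard is not stated here, so the revision is left as a parameter, and the function name is hypothetical.

```python
def dashboard_download_url(dashboard_id, revision):
    """Build a grafana.com download URL following the pattern of the link above."""
    return f"https://grafana.com/api/dashboards/{dashboard_id}/revisions/{revision}/download"


redis_dashboard_url = dashboard_download_url(763, 1)
print(redis_dashboard_url)
```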
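17.5 nginx_exporter dashboard
2949 is the dashboard template for nginx_exporter. Importing by ID requires internet access; without it, download the dashboard from the official site for offline import: https://grafana.com/grafana/dashboards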
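17.6 blackbox_exporter dashboard
9965 is the dashboard template for blackbox_exporter. Importing by ID requires internet access; without it, download the dashboard from the official site for offline import: https://grafana.com/grafana/dashboards
This template requires the pie chart plugin:
[root@CentOS7-TEST02 bin]# ./grafana-cli plugins install grafana-piechart-panel
Restart the Grafana service:
[root@CentOS7-TEST02 grafana]# ./bin/grafana-server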
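17.7 Docker dashboard
11558 is the dashboard template for Docker monitoring. Importing by ID requires internet access; without it, download the dashboard from the official site for offline import: https://grafana.com/grafana/dashboards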