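I. Installing Prometheus
Download page: https://prometheus.io/download/
Place the downloaded archive in a newly created prometheus directory.
cd /home/tong/prometheus/
yum install -y tar    # install tar if it is missing
tar -xvf prometheus-2.38.0.linux-amd64.tar.gz    # extract the archive
Create a symbolic link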
ln -s /home/tong/prometheus/prometheus-2.38.0.linux-amd64 /usr/local/prometheus
/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &
lsof -i:9090
Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
iptables -F
Verify: open IP:9090 in a browser to reach the monitoring page built into Prometheus.
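The browser check can also be done from the command line. The following is a minimal sketch, assuming curl is installed and that Prometheus serves its standard /-/ready management endpoint on the default port. The function names prom_ready_url and prom_is_ready are made up for this example.

```shell
# Build the readiness URL for a Prometheus instance.
# $1 = host (default: localhost), $2 = port (default: 9090)
prom_ready_url() {
  printf 'http://%s:%s/-/ready\n' "${1:-localhost}" "${2:-9090}"
}

# Return 0 if Prometheus answers the readiness probe, non-zero otherwise.
prom_is_ready() {
  curl -fsS --max-time 5 "$(prom_ready_url "$@")" >/dev/null
}

# Example (run on the Prometheus host):
#   prom_is_ready localhost 9090 && echo "Prometheus is ready"
```

A non-zero exit status usually means the process is not running yet or the port is blocked.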
II. Using Alertmanager (email alerts)
1. Original article
https://blog.csdn.net/weixin_45880055/article/details/120585024
2. Hands-on: CPU monitoring and alerting with Prometheus + Alertmanager
Alertmanager download link:
https://prometheus.io/download/#alertmanager
Place the downloaded archive in a newly created alertmanager directory.
cd /home/tong/alertmanager
yum install -y tar    # install tar if it is missing
Extract the archive
tar -xvf alertmanager-0.24.0.linux-amd64.tar.gz
Create a symbolic link
ln -s /home/tong/alertmanager/alertmanager-0.24.0.linux-amd64 /usr/local/alertmanager
Edit the Alertmanager configuration file
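smtp_auth_password: 'mgkrvlkublozdhja' # this is the mailbox authorization code, not the login password
Enable the SMTP service on your own mailbox to obtain an authorization code.
How to do this:
https://jingyan.baidu.com/article/ac6a9a5eb439f36b653eacc0.html
vim /usr/local/alertmanager/alertmanager.yml
The file after my changes (the original configuration is not reproduced here):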
global:
  resolve_timeout: 5m
  smtp_from: 3406747094@qq.com
  smtp_auth_username: 3406747094@qq.com
  # Use your own value here: the authorization code for third-party login to QQ Mail.
  smtp_auth_password: htvywgock
  # Whether to use TLS; enable or disable it depending on your environment.
  # If you see "email.loginAuth failed: 530 Must issue a STARTTLS command first", set this to true.
  # If TLS is enabled and you see "starttls failed: x509: certificate signed by unknown authority",
  # set tls_config.insecure_skip_verify: true under email_configs to skip TLS verification.
  smtp_require_tls: false
  # QQ Mail SMTP server. The official address is smtp.qq.com on port 465 or 587,
  # and the POP3/SMTP service must be enabled on the mailbox.
  smtp_smarthost: 'smtp.qq.com:465'
route:
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 10s
  repeat_interval: 2m
  receiver: 'email-demo'
receivers:
  - name: 'email-demo'
    email_configs:
      - to: 3406747094@qq.com
        send_resolved: true
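Explanation:
group_by: ['alertname'] # Grouping key. Alerts that share the same values for these labels (here, the same alertname) are aggregated into one group and sent as a single notification. To disable aggregation entirely, set group_by: ['...'].
group_wait: 5s # How long to wait after a new alert group is created before sending its first notification, so that more alerts with the same labels can be batched into one notification.
group_interval: 10s # After the first notification for a group has been sent, how long to wait before sending a notification about new alerts that joined that group. Roughly speaking, each group behaves like its own channel.
repeat_interval: 2m # How long to wait before re-sending a notification that was already delivered if the problem has still not been resolved.
receiver: 'email-demo' # Name of the receiver for alert messages; it must match an entry under receivers. Common notification channels include email, wechat, slack, and webhook.
After editing the configuration file, you can check it with the amtool utility: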
/usr/local/alertmanager/amtool check-config /usr/local/alertmanager/alertmanager.yml
vim /usr/local/alertmanager/alert.tmp
{{ define "email.from" }}3406747094@qq.com{{ end }}
{{ define "email.to" }}3406747094@qq.com{{ end }}
{{ define "email.to.html" }}
{{- if gt (len .Alerts.Firing) 0 -}}{{ range .Alerts.Firing }}
<h2>@Alert notification</h2>
Alerting program: prometheus_alert <br>
Severity: {{ .Labels.severity }} <br>
Alert type: {{ .Labels.alertname }} <br>
Affected host: {{ .Labels.instance }} <br>
Summary: {{ .Annotations.summary }} <br>
Details: {{ .Annotations.description }} <br>
Triggered at: {{ .StartsAt.Local.Format "2006-01-02 15:04:05" }} <br>
{{ end }}{{ end -}}
{{- if gt (len .Alerts.Resolved) 0 -}}{{ range .Alerts.Resolved }}
<h2>@Alert resolved</h2>
Alerting program: prometheus_alert <br>
Affected host: {{ .Labels.instance }}<br>
Summary: {{ .Annotations.summary }}<br>
Details: {{ .Annotations.description }}<br>
Triggered at: {{ .StartsAt.Local.Format "2006-01-02 15:04:05" }}<br>
Resolved at: {{ .EndsAt.Local.Format "2006-01-02 15:04:05" }}<br>
{{ end }}{{ end -}}
{{- end }}
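Alertmanager only uses a custom template if alertmanager.yml loads the template file and a receiver refers to it. The source does not show that wiring, so the fragment below is an assumption about how it could look, using Alertmanager's templates list and the html field of email_configs. It writes the fragment to a scratch file (the variable am_snippet is made up for this example) so that it can be reviewed and merged into alertmanager.yml by hand.

```shell
# Write a sample fragment to a scratch file; nothing under /usr/local is modified.
am_snippet="$(mktemp)"
cat > "$am_snippet" <<'EOF'
# Top level of alertmanager.yml: load custom template files
templates:
  - '/usr/local/alertmanager/alert.tmp'

receivers:
  - name: 'email-demo'
    email_configs:
      - to: 3406747094@qq.com
        send_resolved: true
        # Render the mail body with the "email.to.html" template from alert.tmp
        html: '{{ template "email.to.html" . }}'
EOF
cat "$am_snippet"
```

The quoted heredoc delimiter ('EOF') keeps the shell from touching the {{ ... }} template expression.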
Prometheus configuration
vim /usr/local/prometheus/prometheus.yml
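The source does not show what to change in prometheus.yml at this step. For alerts to reach Alertmanager, the file normally needs an alerting section and a rule_files entry. The fragment below is a sketch under two assumptions: Alertmanager runs on the same host on its default port 9093, and the rule files live in /usr/local/prometheus/rules. The variable prom_snippet is made up for this example, and the fragment is written to a scratch file for manual merging.

```shell
# Write a sample fragment to a scratch file; prometheus.yml itself is not modified.
prom_snippet="$(mktemp)"
cat > "$prom_snippet" <<'EOF'
# Send alerts to the Alertmanager instance on this host (default port 9093)
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

# Load every rule file in the rules directory
rule_files:
  - "/usr/local/prometheus/rules/*.yml"
EOF
cat "$prom_snippet"
```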
Create the alerting rule file
mkdir /usr/local/prometheus/rules
vim /usr/local/prometheus/rules/node_alerts.yml
groups:
  - name: Host
    rules:
      - alert: HostCPU
        expr: 100 * (1 - avg(irate(node_cpu_seconds_total{mode="idle"}[2m])) by(instance)) > 1
        for: 5s
        labels:
          severity: high
        annotations:
          summary: "{{$labels.instance}}: High CPU Usage Detected"
          description: "{{$labels.instance}}: CPU usage is {{$value}}, above 1%"
Check the configuration file
/usr/local/prometheus/promtool check config /usr/local/prometheus/prometheus.yml
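The rule file can also be checked on its own with promtool's check rules subcommand, which is handy before the file is referenced from prometheus.yml. A minimal sketch follows; the wrapper function check_rules and the PROMTOOL variable are made up for this example, and the default path assumes the symbolic link created earlier.

```shell
# Path to promtool; override by setting PROMTOOL before calling check_rules.
PROMTOOL="${PROMTOOL:-/usr/local/prometheus/promtool}"

# Validate one rule file. Returns 127 if promtool is not installed at $PROMTOOL,
# otherwise returns promtool's own exit status.
check_rules() {
  rules_file="$1"
  if [ ! -x "$PROMTOOL" ]; then
    echo "promtool not found at $PROMTOOL" >&2
    return 127
  fi
  "$PROMTOOL" check rules "$rules_file"
}

# Example:
#   check_rules /usr/local/prometheus/rules/node_alerts.yml
```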
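Start Prometheus
nohup /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml &
Start Alertmanager (run it from /usr/local/alertmanager so that it picks up alertmanager.yml in the current directory)
./alertmanager
Restart Prometheus (stop the running instance first with pkill prometheus)
/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &
The alert email then arrives in the mailbox.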
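To test mail delivery without waiting for a rule to fire, an alert can be posted to Alertmanager by hand. This is a sketch, assuming curl is installed and Alertmanager listens on its default port 9093 with the v2 API. The function names, the ALERTMANAGER_URL variable, and the alert name ManualTest are made up for this example.

```shell
# Build a one-element JSON array describing a test alert.
# $1 = alert name, $2 = instance label
build_test_alert() {
  printf '[{"labels":{"alertname":"%s","instance":"%s","severity":"high"},"annotations":{"summary":"manual test alert"}}]\n' "$1" "$2"
}

# Post the test alert to Alertmanager's v2 API.
send_test_alert() {
  build_test_alert "${1:-ManualTest}" "${2:-localhost:9100}" |
    curl -fsS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "${ALERTMANAGER_URL:-http://localhost:9093}/api/v2/alerts"
}

# Example:
#   send_test_alert ManualTest localhost:9100
```

With the route shown earlier, the notification should go out after group_wait has elapsed.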
3. CPU, memory, and disk monitoring and alerting with Prometheus + Alertmanager
Original article
https://blog.csdn.net/xiaoxiangzi520/article/details/115005765
mkdir -p /usr/local/prometheus/rules
vim /usr/local/prometheus/rules/node_alerts.yml
groups:
  - name: Host
    rules:
      - alert: HostCPU
        expr: 100 * (1 - avg(irate(node_cpu_seconds_total{mode="idle"}[2m])) by(instance)) > 10
        for: 5m
        labels:
          severity: high
        annotations:
          summary: "{{$labels.instance}}: High CPU Usage Detected"
          description: "{{$labels.instance}}: CPU usage is {{$value}}, above 10%"
      - alert: HostMemory
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 20
        for: 5m
        labels:
          severity: middle
        annotations:
          summary: "{{$labels.instance}}: High Memory Usage Detected"
          description: "{{$labels.instance}}: Memory Usage is {{ $value }}, above 20%"
      - alert: HostDisk
        expr: 100 * (node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_avail_bytes) / node_filesystem_size_bytes > 30
        for: 5m
        labels:
          severity: low
        annotations:
          summary: "{{$labels.instance}}: High Disk Usage Detected"
          description: "{{$labels.instance}}, mountpoint {{$labels.mountpoint}}: Disk Usage is {{ $value }}, above 30%"
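Note: the thresholds used here (CPU usage above 10%, memory usage above 20%, disk usage above 30%) are for testing only and may not suit your system. Adjust them to your own needs.
Restart the Prometheus service.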
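When choosing a memory threshold, it helps to see what the HostMemory expression evaluates to on a given machine. The sketch below applies the same formula, (MemTotal - MemAvailable) / MemTotal * 100, to /proc/meminfo, which is where node-exporter reads these values on Linux. The function name mem_used_percent is made up for this example.

```shell
# Print memory usage in percent, computed as
# (MemTotal - MemAvailable) / MemTotal * 100.
# $1 = meminfo-style file (default: /proc/meminfo)
mem_used_percent() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%.1f\n", (total - avail) / total * 100 }' \
    "${1:-/proc/meminfo}"
}

# Example:
#   mem_used_percent
```

If the printed value is already above the threshold on an idle machine, the alert will fire as soon as the for: duration has passed.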
III. Installing node-exporter on the monitored machine
tar -xvf node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
ln -s /home/tong/node-exporter/node_exporter-1.4.0-rc.0.linux-amd64 /usr/local/node_exporter
Start node-exporter
nohup /usr/local/node_exporter/node_exporter &
http://192.168.150.162:9100/metrics
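A quick way to confirm that the exporter exposes the metric used by the HostCPU rule is to count its sample lines in the /metrics output. This is a sketch, assuming curl is available; the helper count_metric is made up for this example and reads the metrics text from standard input.

```shell
# Count sample lines for one metric name in Prometheus text format read from stdin.
# Comment lines (# HELP, # TYPE) and metrics that merely share a prefix are not counted.
count_metric() {
  grep -c "^$1[{ ]" || true
}

# Example:
#   curl -s http://localhost:9100/metrics | count_metric node_cpu_seconds_total
```

A count of 0 suggests the collector is disabled or the exporter is not reachable.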
Add the scrape job to the Prometheus configuration file (under scrape_configs)
vim /usr/local/prometheus/prometheus.yml
  - job_name: "localhostnode"
    static_configs:
      - targets: ["localhost:9100"]
Stop Prometheus
pkill prometheus
Restart it
/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &
http://192.168.150.162:9090/targets?search=
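As an alternative to pkill followed by a manual restart, Prometheus re-reads its configuration file when it receives SIGHUP. A minimal sketch, assuming pgrep is available and the process is named prometheus; the function name reload_prometheus is made up for this example.

```shell
# Send SIGHUP to the first process with the given name (default: prometheus)
# so that it reloads its configuration. Returns 1 if no such process is found.
reload_prometheus() {
  pid="$(pgrep -x "${1:-prometheus}" | head -n 1)"
  if [ -z "$pid" ]; then
    echo "no running process named ${1:-prometheus}" >&2
    return 1
  fi
  kill -HUP "$pid"
}

# Example:
#   reload_prometheus
```

If the new configuration is invalid, Prometheus keeps running with the old one and logs the error, so it is still worth running promtool check config first.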
IV. Installing mysqld-exporter
Create a mysqld-exporter directory under /home/tong/ and put the downloaded archive in it.
cd /home/tong/mysqld-exporter
Extract the archive
tar -xvf mysqld_exporter-0.14.0.linux-amd64.tar.gz
Create a symbolic link
ln -s /home/tong/mysqld-exporter/mysqld_exporter-0.14.0.linux-amd64 /usr/local/mysqld_exporter
Install MySQL (MariaDB)
yum install mariadb\* -y
systemctl restart mariadb
systemctl enable mariadb
mysql
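Grant privileges to the monitoring user (run inside the mysql client)
grant select,replication client,process ON *.* to 'mysql_monitor'@'localhost' identified by '123';
Create a MariaDB client configuration file containing the user name and password (they must match the ones used in the grant above)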
vim /usr/local/mysqld_exporter/.my.cnf
[client]
user=mysql_monitor
password=123
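Because this file stores a password in plain text, it is reasonable to make it readable only by the account that runs mysqld_exporter. A sketch of writing the file with restrictive permissions follows; the function name write_my_cnf is made up for this example.

```shell
# Write a [client] section to the given file and restrict it to the owner.
# $1 = target file, $2 = user name, $3 = password
write_my_cnf() {
  target="$1"; user="$2"; password="$3"
  (
    umask 077   # a newly created file gets mode 600
    printf '[client]\nuser=%s\npassword=%s\n' "$user" "$password" > "$target"
  )
  chmod 600 "$target"   # also tighten the mode if the file already existed
}

# Example:
#   write_my_cnf /usr/local/mysqld_exporter/.my.cnf mysql_monitor 123
```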
Start mysqld_exporter
nohup /usr/local/mysqld_exporter/mysqld_exporter --config.my-cnf=/usr/local/mysqld_exporter/.my.cnf &
http://192.168.150.162:9104/metrics
Add the scrape job to the Prometheus configuration file (under scrape_configs)
vim /usr/local/prometheus/prometheus.yml
  - job_name: "localhostmysql"
    static_configs:
      - targets: ["localhost:9104"]
Stop Prometheus
pkill prometheus
Restart it
/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &
http://192.168.150.162:9090/targets?search=
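V. Downloading Grafana
Download page: https://grafana.com/grafana/download
Create a grafana directory under /home/tong/Downloads/ and upload the downloaded archive to it.
cd /home/tong/Downloads/grafana
Extract the archive
tar -zxvf grafana-enterprise-9.1.3.linux-amd64.tar.gz
cd grafana-9.1.3
Start Grafana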
nohup ./bin/grafana-server web > ./grafana.log 2>&1 &
http://192.168.150.162:3000/login
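The default user name and password are both admin.
Add the Prometheus data source
1. Basic usage
(1) Click "Add data source" on the home page
(2) Select Prometheus
(3) On the Settings page, enter the Prometheus address and save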
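The three steps above can also be scripted through Grafana's HTTP API. This is a sketch, assuming curl is installed, Grafana still uses the default admin/admin login, and Prometheus is reachable at the given URL. The function names and the GRAFANA_URL and GRAFANA_AUTH variables are made up for this example.

```shell
# Build the JSON body for a Prometheus data source.
# $1 = Prometheus URL (default: http://localhost:9090)
build_datasource_json() {
  printf '{"name":"Prometheus","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' \
    "${1:-http://localhost:9090}"
}

# Create the data source via POST /api/datasources using basic authentication.
add_prometheus_datasource() {
  build_datasource_json "$1" |
    curl -fsS -X POST -u "${GRAFANA_AUTH:-admin:admin}" \
      -H 'Content-Type: application/json' \
      --data-binary @- "${GRAFANA_URL:-http://localhost:3000}/api/datasources"
}

# Example:
#   add_prometheus_datasource http://192.168.150.162:9090
```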
Select Data sources
Select New dashboard
Select Add a new panel
Dashboard catalogue: https://grafana.com/grafana/dashboards/
2. Node-exporter
Node-exporter dashboard URL:
https://grafana.com/grafana/dashboards/1860-node-exporter-full/
Select Import
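3. Building a custom dashboard from the one above
On a panel, select More, then Copy; select Edit to see its settings
Copy and paste the four items highlighted in blue
Open a new page
Select New dashboard
Start copying and pasting
On the first page, select the panel to copy
On the custom dashboard page, choose to paste it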
4. MySQL Overview
Dashboard URL: https://grafana.com/grafana/dashboards/7362-mysql-overview/
Import it the same way as above
Save
Recommended reading
https://blog.csdn.net/ywd1992/article/details/85989259