Installing process-exporter
Download
https://github.com/ncabatoff/process-exporter
https://github.com/ncabatoff/process-exporter/releases/download/v0.7.5/process-exporter-0.7.5.linux-amd64.tar.gz
Extract the archive and change into the install directory
tar zxvf process-exporter-0.7.5.linux-amd64.tar.gz
mv process-exporter-0.7.5.linux-amd64 /usr/local/process-exporter
cd /usr/local/process-exporter/
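The download and extract steps above could be scripted roughly as follows. This is a minimal sketch, assuming wget is installed and the script runs as root; the variable names and the install_process_exporter helper are illustrative and not part of process-exporter.

```shell
#!/bin/sh
# Version and platform are taken from the release URL above.
VERSION="0.7.5"
PLATFORM="linux-amd64"
NAME="process-exporter-${VERSION}.${PLATFORM}"
URL="https://github.com/ncabatoff/process-exporter/releases/download/v${VERSION}/${NAME}.tar.gz"
INSTALL_DIR="/usr/local/process-exporter"

# Hypothetical helper: download the tarball, extract it, and move it into place.
install_process_exporter() {
    wget -O "/tmp/${NAME}.tar.gz" "$URL" || return 1
    tar zxvf "/tmp/${NAME}.tar.gz" -C /tmp || return 1
    mv "/tmp/${NAME}" "$INSTALL_DIR"
}

# Nothing is downloaded until the helper is called explicitly:
# install_process_exporter
```

Keeping the version in one variable means an upgrade only needs a single line changed.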
Create a YAML file and list the names of the processes to monitor
(any process that can be found with ps)
vim process-exporter.yaml
process_names:
  # '.+' matches all processes
  # - name: "{{.Comm}}"
  #   cmdline:
  #   - '.+'
  #
  - name: "{{.Matches}}"
    cmdline:
    - 'nginx'
  - name: "{{.Matches}}"
    cmdline:
    - 'vsftpd'
Set it up as a system service
vim /usr/lib/systemd/system/process-exporter.service
[Unit]
Description=Prometheus exporter for process metrics, written in Go.
Documentation=https://github.com/ncabatoff/process-exporter
After=network.target
[Service]
Type=simple
User=root
WorkingDirectory=/usr/local/process-exporter
ExecStart=/usr/local/process-exporter/process-exporter -config.path=/usr/local/process-exporter/process-exporter.yaml
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start the service
systemctl daemon-reload
systemctl enable process-exporter
systemctl start process-exporter
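Once the service is running, it may help to confirm that the exporter actually reports the configured processes. The sketch below assumes curl is available and that process-exporter listens on port 9256 on the local host (the port used for the Prometheus scrape target later in this guide); the zero_proc_groups helper is hypothetical.

```shell
#!/bin/sh
# Print every namedprocess_namegroup_num_procs series whose value is 0,
# i.e. name groups that currently have no running process.
# Reads Prometheus text-format metrics on stdin; comment lines are skipped
# because they start with '#'.
zero_proc_groups() {
    awk '/^namedprocess_namegroup_num_procs[{ ]/ && $NF == 0 { print $1 }'
}

# Usage against a live exporter (the address is an assumption):
# systemctl status process-exporter
# curl -s http://localhost:9256/metrics | zero_proc_groups
```

An empty result means every configured group has at least one running process.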
Setting up email alerts
Download and install alertmanager
cd /etc/alertmanager
tar zxvf alertmanager-0.22.2.linux-amd64.tar.gz --strip-components=1
vim alertmanager.yml
nohup ./alertmanager --config.file=alertmanager.yml &
Write the alertmanager.yml file
vim /etc/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  # Sender mailbox settings; a QQ mailbox is used here
  smtp_from: 'xxxxx@qq.com'
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: 'xxxxx@qq.com'
  # Note: this must be the QQ Mail authorization code, not the login password; the code can be found in the account settings
  smtp_auth_password: 'xxxxxx'
  smtp_require_tls: false
route:
  group_by: ['alert_node']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 5m
  receiver: 'email'
receivers:
- name: 'email'
  email_configs:
  # Replace the recipient with your own address; separate multiple addresses with commas
  - to: 'xxxx@163.com,xxxx@163.com'
    send_resolved: true
inhibit_rules:
- source_match:
I use Docker here: pull the image directly and mount the configuration directory
docker run -d --restart=always --name=alertmanager -p 9093:9093 -v /etc/alertmanager:/etc/alertmanager prom/alertmanager:latest
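To confirm the container came up, Alertmanager's health endpoint can be queried. This is a minimal sketch, assuming curl is installed and Alertmanager is reachable at 192.168.0.107:9093 (the address used in the Prometheus configuration in this guide); the alertmanager_health_url helper is hypothetical.

```shell
#!/bin/sh
ALERTMANAGER_ADDR="192.168.0.107:9093"

# Hypothetical helper: build the URL of Alertmanager's health endpoint
# for a given host:port.
alertmanager_health_url() {
    printf 'http://%s/-/healthy\n' "$1"
}

# Usage:
# docker ps --filter name=alertmanager
# docker logs alertmanager
# curl -s "$(alertmanager_health_url "$ALERTMANAGER_ADDR")"
```

If the container keeps restarting, the docker logs output is usually the quickest place to spot a YAML error in alertmanager.yml.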
Edit the Prometheus configuration file
# Change this to the address where Alertmanager is installed; otherwise Prometheus cannot reach it and no email will be sent
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
      - 192.168.0.107:9093
Set the rule file paths here
rule_files:
  - '/etc/prometheus/rules/*.yaml'
  - '/etc/prometheus/rules/*.yml'
# Add the targets to monitor (these entries go under scrape_configs)
  - job_name: 'alertmanager'
    static_configs:
    - targets: ['192.168.0.107:9093']
  #
  # process-exporter
  - job_name: 'process-exporter'
    static_configs:
    - targets: ['192.168.0.199:9256']
Create the rules
mkdir /etc/prometheus/rules/
cd /etc/prometheus/rules/
vim process.yml
groups:
- name: process_rule
  rules:
  - alert: ServiceAlert
    expr: (namedprocess_namegroup_num_procs) == 0
    for: 30s
    labels:
      severity: error
    annotations:
      summary: "{{ $labels.instance }}: the process has been down for more than 30 seconds"
      value: "{{ $value }}"
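Before restarting Prometheus, the rule file can be checked for syntax errors with promtool, which ships with Prometheus. This is a sketch assuming promtool is on the PATH; the check_rules wrapper is hypothetical.

```shell
#!/bin/sh
RULES_DIR="/etc/prometheus/rules"

# Hypothetical wrapper: validate one rule file with promtool.
# Returns non-zero if promtool is missing or reports an error.
check_rules() {
    command -v promtool >/dev/null 2>&1 || {
        echo "promtool not found" >&2
        return 127
    }
    promtool check rules "$1"
}

# Usage:
# check_rules "$RULES_DIR/process.yml"
```

Catching an indentation mistake here is cheaper than finding out after the restart that Prometheus refuses to load the rules.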
Restart the Prometheus service
After that you can verify the setup; the process names you configured earlier will be displayed.
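One way to exercise the whole chain is to stop a monitored process and watch for the alert. The sketch below assumes Prometheus listens on localhost:9090 (its default port; the address is not given in this guide) and that nginx is managed by systemd; the has_firing_alert helper is hypothetical.

```shell
#!/bin/sh
PROMETHEUS_ADDR="localhost:9090"   # assumption: adjust to your Prometheus host

# Hypothetical helper: succeed if the JSON on stdin (as returned by the
# Prometheus /api/v1/alerts endpoint) contains an alert in the firing state.
has_firing_alert() {
    grep -q '"state":"firing"'
}

# Usage:
# systemctl stop nginx
# sleep 60    # wait past the 30s "for" period of the rule
# curl -s "http://${PROMETHEUS_ADDR}/api/v1/alerts" | has_firing_alert && echo "alert is firing"
# systemctl start nginx
```

With send_resolved enabled in alertmanager.yml, starting nginx again should also produce a follow-up "resolved" email.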