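First, make sure the Prometheus service is running.
vim node_rules.yml
Note: do not use the Tab key when editing this file; indent with spaces only.
Then open localhost:9090/rules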
If the rules have not taken effect after a reload, restart the service:
netstat -lntp | grep prom
kill -9 <PID>
./prometheus &
Then open the page again.
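The edit-and-reload steps above can be sketched as a few shell helpers. This is a minimal sketch, not the author's procedure: the function names are hypothetical, it assumes promtool (shipped alongside the prometheus binary) is on PATH, and it assumes the server process is named prometheus. Sending SIGHUP asks a running Prometheus to reload its configuration and rule files, which is usually gentler than killing and restarting it.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative, not part of Prometheus.

# Print every line that contains a tab character (YAML must be indented with spaces).
# Exit status 0 means at least one tab was found.
find_tabs() {
    grep -n "$(printf '\t')" "$1"
}

# Validate the rule file syntax before reloading (assumes promtool is on PATH).
check_rules() {
    promtool check rules "$1"
}

# Ask a running Prometheus to reload its configuration and rules via SIGHUP.
# Assumes the process is named "prometheus" and that pgrep is available.
reload_prometheus() {
    pid=$(pgrep -x prometheus) || return 1
    kill -HUP $pid
}
```

A possible sequence is: find_tabs node_rules.yml, then check_rules node_rules.yml && reload_prometheus, and finally a look at localhost:9090/rules.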
CPU usage > 80%
100 - (avg(irate(node_cpu_seconds_total{mode='idle'}[5m])) by (instance) * 100) > 80
Memory usage
100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100
Disk usage
(node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_free_bytes{fstype=~"xfs|ext4"}) / node_filesystem_size_bytes{fstype=~"xfs|ext4"} * 100
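As a quick sanity check on the percentage arithmetic: the memory expression subtracts the free share from 100, which is the same as (total - free) / total * 100, and disk usage follows the same pattern. The sketch below computes that value with awk. The function name and the sample numbers are made up for illustration; they are not real metric values.

```shell
# Hypothetical helper: percentage of a resource in use, given total and free.
# used% = (total - free) / total * 100
usage_pct() {
    awk -v total="$1" -v free="$2" 'BEGIN { printf "%.1f\n", (total - free) / total * 100 }'
}

usage_pct 200 50   # 150 of 200 in use, prints 75.0
```

A result above 80 is what the alert thresholds in these notes compare against.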
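Node status
The up metric
Another useful metric for monitoring the state of a specific target is up: it is set to 1 when the instance is healthy and to 0 when the scrape fails.
Use it to check whether a node is healthy. A value of 1 means healthy; otherwise the node exporter service on that server may have stopped, or the node itself may be down, and it should be checked immediately.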
- alert: NodeDown
  expr: up == 0
  for: 0m
  labels:
    severity: serious
  annotations:
    summary: "NodeDown"
The following alerts can be configured with the same template:
MysqlDown
RedisDown
NginxDown
JavaDown
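Since these alerts share one template, the rule text can be generated instead of copied by hand. This is a minimal sketch, assuming each service is scraped under its own job label; the job names (mysql, redis, nginx, java) and the function name are hypothetical and must be replaced with whatever your scrape configuration actually uses. The job matcher matters because a bare up == 0 is true for any target that is down, so without it every one of these alerts would fire together.

```shell
# Hypothetical generator: emit one "<Name>Down" rule for a given job label.
# Usage: down_rule <Name> <job>
down_rule() {
    cat <<EOF
  - alert: ${1}Down
    expr: up{job="$2"} == 0
    for: 0m
    labels:
      severity: serious
    annotations:
      summary: "${1}Down"
EOF
}

# The job names below are assumptions; replace them with your own.
down_rule Mysql mysql
down_rule Redis redis
down_rule Nginx nginx
down_rule Java java
```

The output is indented to sit under the rules: key of a group in node_rules.yml.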
groups:
- name: Hoststate-alert()
  rules:
  # Note: a bare up == 0 matches every scrape target; add a job matcher
  # to each Down rule so that it covers only its own service.
  - alert: RedisDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Redisdown"
      description: "Redis instance is down"
  - alert: MysqlDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Mysqldown"
      description: "Mysql instance is down"
  - alert: NginxDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Nginxdown"
      description: "Nginx instance is down"
  - alert: NodeDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Nodedown"
      description: "Node instance is down"
  - alert: JavaDown
    expr: up == 0
    for: 0m
    labels:
      status: critical
    annotations:
      summary: "Javadown"
      description: "Java instance is down"
  - alert: CPUusage
    expr: 100 - (avg(irate(node_cpu_seconds_total{mode='idle'}[5m])) by (instance) * 100) > 80
    for: 5m
    labels:
      status: critical
    annotations:
      summary: "{{$labels.instance}} CPU usage high"
      description: "{{$labels.instance}} CPU usage above 80% (current usage: {{$value}})"
  - alert: Memoryusage
    expr: 100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 5m
    labels:
      status: critical
    annotations:
      summary: "Memory usage high"
      description: "Memory usage above 80% (current usage: {{$value}})"
  - alert: Diskusage
    expr: (node_filesystem_size_bytes{fstype=~"xfs|ext4"} - node_filesystem_free_bytes{fstype=~"xfs|ext4"}) / node_filesystem_size_bytes{fstype=~"xfs|ext4"} * 100 > 80
    for: 5m
    labels:
      status: critical
    annotations:
      summary: "Disk usage high"
      description: "Disk usage above 80% (current usage: {{$value}})"