Alerting on Linux outbound network traffic: Prometheus alert rules

A collection of Prometheus alerting rules.

Host and hardware monitoring

Available memory

Less than 10% of the host's memory is available.

- alert: HostOutOfMemory
  expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host out of memory (instance {{ $labels.instance }})"
    description: "Node memory is filling up (< 10% left)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
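The entries in this post start at `- alert:`, so they have to sit inside a rule group before Prometheus can load them. A minimal sketch of the surrounding file follows; the file name `host_alerts.yml` and the group name `host-and-hardware` are hypothetical and do not come from the original post.

```yaml
# host_alerts.yml (hypothetical file name)
groups:
  - name: host-and-hardware   # hypothetical group name
    rules:
      # Each "- alert:" entry from this post goes here, indented under "rules:".
      - alert: HostOutOfMemory
        expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host out of memory (instance {{ $labels.instance }})"
          description: "Node memory is filling up (< 10% left)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
```

The file is then listed under `rule_files` in `prometheus.yml`, and `promtool check rules host_alerts.yml` can validate it before Prometheus is reloaded.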

Memory pressure

The node is under heavy memory pressure: the rate of major page faults is high.

- alert: HostMemoryUnderMemoryPressure
  expr: rate(node_vmstat_pgmajfault[1m]) > 1000
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host memory under memory pressure (instance {{ $labels.instance }})"
    description: "The node is under heavy memory pressure. High rate of major page faults\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"

Unusual inbound traffic on host network interfaces

The host's network interfaces are probably receiving too much data (> 100 MB/s). Set the threshold according to your own machine's backplane and network card.

- alert: HostUnusualNetworkThroughputIn
  expr: sum by (instance) (rate(node_network_receive_bytes_total[2m])) / 1024 / 1024 > 100
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host unusual network throughput in (instance {{ $labels.instance }})"
    description: "Host network interfaces are probably receiving too much data (> 100 MB/s)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
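Because the right threshold depends on the network card, one option is to express it relative to the link speed rather than as a fixed number. The sketch below is an assumption, not one of the original rules: it presumes node_exporter exposes `node_network_speed_bytes` for your interfaces (the value can be missing or negative for virtual devices), and both the alert name and the 80% threshold are hypothetical.

```yaml
- alert: HostNetworkInterfaceSaturatedIn   # hypothetical alert name
  # Receive rate as a fraction of the reported link speed, per interface.
  # "node_network_speed_bytes > 0" drops interfaces without a usable speed value.
  expr: rate(node_network_receive_bytes_total[2m]) / on (instance, device) (node_network_speed_bytes > 0) > 0.8
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host network interface saturated (instance {{ $labels.instance }}, device {{ $labels.device }})"
```

For comparison, the fixed-threshold rule divides by 1024 twice, so its 100 is really 100 MiB/s, about 839 Mbit/s, or roughly 84% of a 1 Gbit/s link.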

Unusual outbound traffic on host network interfaces

The host's network interfaces are probably sending too much data (> 100 MB/s).

- alert: HostUnusualNetworkThroughputOut
  expr: sum by (instance) (rate(node_network_transmit_bytes_total[2m])) / 1024 / 1024 > 100
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host unusual network throughput out (instance {{ $labels.instance }})"
    description: "Host network interfaces are probably sending too much data (> 100 MB/s)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
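The throughput expressions sum over every interface on the host, which includes loopback and virtual devices. If that inflates the numbers, one possible variant filters devices with a label matcher. This is a sketch under assumptions: the device name pattern is hypothetical and has to be adapted to the interfaces your hosts actually have.

```yaml
- alert: HostUnusualNetworkThroughputOut
  # Assumption: ignore loopback, veth pairs and docker bridges; adjust the regex to your hosts.
  expr: sum by (instance) (rate(node_network_transmit_bytes_total{device!~"lo|veth.*|docker.*"}[2m])) / 1024 / 1024 > 100
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host unusual network throughput out (instance {{ $labels.instance }})"
```

Replacing `sum by (instance)` with `sum by (instance, device)` would instead raise one alert per interface, at the cost of more alerts on multi-NIC hosts.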

Host network receive errors

Interface {{ $labels.device }} on {{ $labels.instance }} has encountered {{ printf "%.0f" $value }} receive errors in the last 5 minutes.

- alert: HostNetworkReceiveErrors
  expr: increase(node_network_receive_errs_total[5m]) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host Network Receive Errors (instance {{ $labels.instance }})"
    description: "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} receive errors in the last five minutes.\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"

Host network transmit errors

Interface {{ $labels.device }} on {{ $labels.instance }} has encountered {{ printf "%.0f" $value }} transmit errors in the last 5 minutes.

- alert: HostNetworkTransmitErrors
  expr: increase(node_network_transmit_errs_total[5m]) > 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host Network Transmit Errors (instance {{ $labels.instance }})"
    description: "{{ $labels.instance }} interface {{ $labels.device }} has encountered {{ printf \"%.0f\" $value }} transmit errors in the last five minutes.\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
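The two error rules fire on any increase at all, which can be noisy on busy links. A possible alternative, offered as a sketch rather than as one of the original rules, compares errors with total packets. The alert name and the 1% threshold are assumptions; `node_network_receive_packets_total` is the packet counter node_exporter exposes next to the error counter.

```yaml
- alert: HostNetworkReceiveErrorRatio   # hypothetical alert name
  # Share of received packets that were errors over the last 5 minutes.
  # With no traffic the division yields NaN, which never satisfies the comparison.
  expr: rate(node_network_receive_errs_total[5m]) / rate(node_network_receive_packets_total[5m]) > 0.01
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host network receive error ratio (instance {{ $labels.instance }}, device {{ $labels.device }})"
```

The transmit side can be written the same way with `node_network_transmit_errs_total` and `node_network_transmit_packets_total`.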

Host disk read rate

The disk is reading more than 50 MB/s.

- alert: HostUnusualDiskReadRate
  expr: sum by (instance) (rate(node_disk_read_bytes_total[2m])) / 1024 / 1024 > 50
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host unusual disk read rate (instance {{ $labels.instance }})"
    description: "Disk is probably reading too much data (> 50 MB/s)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"

Host disk write rate

The disk is writing more than 50 MB/s.

- alert: HostUnusualDiskWriteRate
  expr: sum by (instance) (rate(node_disk_written_bytes_total[2m])) / 1024 / 1024 > 50
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host unusual disk write rate (instance {{ $labels.instance }})"
    description: "Disk is probably writing too much data (> 50 MB/s)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"

Host disk free space

Less than 10% of the disk space is available.

# please add ignored mountpoints in node_exporter parameters like
# "--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run)($|/)"
- alert: HostOutOfDiskSpace
  expr: (node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes < 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host out of disk space (instance {{ $labels.instance }})"
    description: "Disk is almost full (< 10% left)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
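Mount points can also be excluded inside the rule itself instead of, or in addition to, the node_exporter flag. The variant below is a sketch under assumptions: the `fstype` pattern is hypothetical, and the `node_filesystem_readonly == 0` clause is an added guard against read-only mounts that the original rule does not have.

```yaml
- alert: HostOutOfDiskSpace
  # Assumption: skip in-memory and overlay filesystems, and ignore read-only mounts.
  expr: >-
    (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} * 100)
    / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} < 10
    and on (instance, device, mountpoint) node_filesystem_readonly == 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Host out of disk space (instance {{ $labels.instance }})"
```

Newer node_exporter releases appear to name the exclusion flag `--collector.filesystem.mount-points-exclude`; `node_exporter --help` shows which spelling your version accepts.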

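Will the disk fill up within a few hours at its current growth rate?

Based on the last hour of disk growth, predict whether the disk will be full within 4 hours.

- alert: HostDiskWillFillIn4Hours
  expr: predict_linear(node_filesystem_free_bytes{fstype!~"tmpfs"}[1h], 4 * 3600) < 0
  for: 5m
  labels:
    severity: warning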
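`predict_linear(v[1h], 4 * 3600)` fits a least-squares line through the samples of the last hour and extrapolates it 4 hours past the evaluation time. As a sketch of the arithmetic, let $t_i$ be the sample times in seconds relative to the evaluation time and $y_i$ the free bytes at those times. The numbers in the example are invented for illustration.

```latex
\begin{aligned}
b &= \frac{\sum_i (t_i - \bar{t})(y_i - \bar{y})}{\sum_i (t_i - \bar{t})^2},
\qquad a = \bar{y} - b\,\bar{t}, \\
\hat{y}(T) &= a + b\,T, \qquad T = 4 \times 3600\ \mathrm{s}. \\[4pt]
% Example: free space falls steadily from 40 GiB to 30 GiB during the hour,
% so the slope is -10 GiB/h and the fitted value at evaluation time is 30 GiB.
\hat{y}(4\ \mathrm{h}) &= 30\ \mathrm{GiB} + (-10\ \mathrm{GiB/h}) \times 4\ \mathrm{h}
= -10\ \mathrm{GiB} < 0
\end{aligned}
```

A negative prediction means the fitted line crosses zero free bytes before the 4 hours are up, so in this example the rule would start pending and fire once the condition has held for the `for: 5m` period.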
