Deploying Prometheus and Related Monitoring Services from Binaries

I. Deploying Prometheus
1. Download the binary release

https://github.com/prometheus/prometheus/releases/download/v2.28.0/prometheus-2.28.0.linux-amd64.tar.gz

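2. Extract the archive; the binary is ready to use as-is

tar xf prometheus-2.28.0.linux-amd64.tar.gz

The systemd unit in step 3 expects the binary at /opt/monitor/prometheus/prometheus, so move the extracted prometheus-2.28.0.linux-amd64 directory to /opt/monitor/prometheus.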
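
A minimal sketch for unpacking a release straight into the directory a unit file expects, without the versioned top-level directory. It assumes a tar that supports --strip-components (GNU tar and bsdtar do); install_release is a hypothetical helper name, not part of Prometheus.

```shell
# install_release TARBALL DEST   (hypothetical helper)
# Extract TARBALL so that its contents land directly in DEST,
# dropping the single versioned top-level directory inside the archive.
install_release() {
  tarball=$1
  dest=$2
  mkdir -p "$dest" || return 1
  tar xf "$tarball" -C "$dest" --strip-components=1
}

# Example, using the path from the unit file in step 3:
# install_release prometheus-2.28.0.linux-amd64.tar.gz /opt/monitor/prometheus
```

The same helper would work for the node_exporter, Grafana and Alertmanager archives used later, since each unit file points at an unversioned directory under /opt/monitor.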

3. Add a systemd unit

[root@prometheus ~]# cat /usr/lib/systemd/system/prometheus.service 
[Unit]
Description=prometheus
[Service]
ExecStart=/opt/monitor/prometheus/prometheus --config.file=/opt/monitor/prometheus/prometheus.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target

4. Reload systemd and start the service

systemctl daemon-reload
systemctl start prometheus.service
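
Every service in this article uses the same unit layout and differs only in Description and ExecStart. As a rough sketch, the unit file could be generated instead of typed by hand; write_unit is a hypothetical helper name, and the output mirrors the unit shown in step 3.

```shell
# write_unit NAME EXEC_START DEST_FILE   (hypothetical helper)
# Write a systemd unit with the same layout as the one in step 3.
write_unit() {
  # \$MAINPID is escaped so that systemd, not this shell, expands it.
  cat > "$3" <<EOF
[Unit]
Description=$1
[Service]
ExecStart=$2
ExecReload=/bin/kill -HUP \$MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
}

# Example:
# write_unit prometheus \
#   "/opt/monitor/prometheus/prometheus --config.file=/opt/monitor/prometheus/prometheus.yml" \
#   /usr/lib/systemd/system/prometheus.service
```

Run systemctl daemon-reload after writing the file. Note that systemctl start does not enable the unit at boot; systemctl enable --now prometheus.service would do both.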

5. Edit the Prometheus configuration file as follows

[root@prometheus prometheus]# cat prometheus.yml 
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093    # remove the leading "#" to enable alerting through Alertmanager

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"      # rule files (alerting and recording rules) loaded by Prometheus
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'node server'
    static_configs:
     - targets: ['192.168.33.145:9100','192.168.33.142:9100']    # node_exporter targets: host-level metrics (memory, CPU, load, etc.)

  - job_name: 'docker'
    static_configs:
     - targets: ['192.168.33.145:8080']       # cAdvisor target: Docker container metrics
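
The rule_files entries in the configuration above are commented out and no rule file is shown. A minimal sketch of what first_rules.yml might contain, written from the shell: the group name, the alert name and the 1m hold time are illustrative assumptions, while the up metric is one Prometheus sets to 0 for a target whose scrape fails. write_example_rules is a hypothetical helper name.

```shell
# write_example_rules DEST_FILE   (hypothetical helper)
# Write a one-rule alerting file. The heredoc delimiter is quoted so that
# {{ $labels.* }} reaches Prometheus untouched by the shell.
write_example_rules() {
  cat > "$1" <<'EOF'
groups:
  - name: example-node-alerts        # illustrative group name
    rules:
      - alert: InstanceDown          # illustrative alert name
        expr: up == 0                # the target's last scrape failed
        for: 1m                      # must hold for 1 minute before firing
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} (job {{ $labels.job }}) is unreachable"
EOF
}

# Example:
# write_example_rules /opt/monitor/prometheus/first_rules.yml
```

After uncommenting the matching rule_files entry, the promtool binary shipped in the same archive can validate both files before a reload: ./promtool check rules first_rules.yml and ./promtool check config prometheus.yml.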

6. Hot-reload the Prometheus configuration

[root@prometheus prometheus]# ps -ef|grep prometheus
root       1081      1  0 13:25 ?        00:00:10 /opt/monitor/prometheus/prometheus --config.file=/opt/monitor/prometheus/prometheus.yml
root       3123   2619  0 14:10 pts/0    00:00:00 grep --color=auto prometheus
[root@prometheus prometheus]# kill -HUP 1081
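
Because the unit file defines ExecReload=/bin/kill -HUP $MAINPID, the same signal can be sent without looking up the PID by hand. A minimal sketch, assuming the unit is named prometheus.service as above and that pgrep is available for the fallback path; reload_prometheus is a hypothetical helper name.

```shell
# reload_prometheus   (hypothetical helper)
# Ask a running Prometheus to re-read its configuration.
# Returns non-zero if no running instance could be signalled.
reload_prometheus() {
  # If the systemd unit is active, let systemd run its ExecReload line.
  if command -v systemctl >/dev/null 2>&1 &&
     systemctl is-active --quiet prometheus.service 2>/dev/null; then
    systemctl reload prometheus.service
    return
  fi
  # Otherwise signal the oldest process named exactly "prometheus".
  pid=$(pgrep -o -x prometheus 2>/dev/null) || return 1
  kill -HUP "$pid"
}
```

If the new configuration is invalid, Prometheus logs an error and keeps running with the previous one, so it is worth checking the journal (journalctl -u prometheus) after a reload.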

7. Prometheus ships with a built-in web UI
Open the Prometheus host's address on port 9090 in a browser (here 192.168.33.139:9090).
[Screenshot: Prometheus web UI]
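
Before opening a browser, the server can be probed from the command line. A minimal sketch using the /-/healthy management endpoint that Prometheus exposes; it assumes curl is installed, and prom_healthy is a hypothetical helper name.

```shell
# prom_healthy BASE_URL   (hypothetical helper)
# Succeeds only if GET BASE_URL/-/healthy returns a 2xx response within 5 seconds.
prom_healthy() {
  curl -fsS --max-time 5 "$1/-/healthy" >/dev/null
}

# Example:
# prom_healthy http://192.168.33.139:9090 && echo "Prometheus is up"
```

Once the server is up, the Status > Targets page of the web UI shows whether each scrape target from prometheus.yml is reachable.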
II. Deploying node_exporter
1. Download the binary release

https://github.com/prometheus/node_exporter/releases/download/v1.2.2/node_exporter-1.2.2.linux-amd64.tar.gz

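2. Extract the archive

tar xf node_exporter-1.2.2.linux-amd64.tar.gz -C /opt/monitor

The unit file below expects the binary at /opt/monitor/node_exporter/node_exporter, so rename the extracted node_exporter-1.2.2.linux-amd64 directory to node_exporter.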

3. Add a systemd unit

[root@prometheus ~]# cat /usr/lib/systemd/system/node_exporter.service 
[Unit]
Description=node_exporter
[Service]
ExecStart=/opt/monitor/node_exporter/node_exporter  --collector.systemd --collector.systemd.unit-include=(docker|sshd|nginx).service
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target

4. Reload systemd and start the service

systemctl daemon-reload
systemctl start node_exporter.service
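
To confirm that the systemd collector flags took effect, the exporter's /metrics output can be filtered for the units it reports. A rough sketch, assuming the collector exports node_systemd_unit_state with the name label before the state label (as node_exporter 1.x prints them); active_units is a hypothetical helper name.

```shell
# active_units   (hypothetical helper)
# Read Prometheus text-format metrics on stdin and print the name of every
# systemd unit whose state="active" series has the value 1.
active_units() {
  sed -n 's/^node_systemd_unit_state{.*name="\([^"]*\)".*state="active".*} 1$/\1/p'
}

# Example:
# curl -s http://192.168.33.145:9100/metrics | active_units
```

With the unit-include pattern above, only docker.service, sshd.service and nginx.service can appear in the output, and only those currently active.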

III. Deploying Grafana
1. Download the binary release

wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.0.3.linux-amd64.tar.gz

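2. Extract the archive

tar -zxvf grafana-enterprise-8.0.3.linux-amd64.tar.gz -C /opt/monitor

The unit file below uses /opt/monitor/grafana as the home path, so rename the extracted directory to grafana.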

3. Add a systemd unit

[root@prometheus ~]# cat /usr/lib/systemd/system/grafana.service 
[Unit]
Description=grafana
[Service]
ExecStart=/opt/monitor/grafana/bin/grafana-server -homepath=/opt/monitor/grafana
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target

4. Reload systemd and start the service

systemctl daemon-reload
systemctl start grafana.service

5. Where to download Grafana dashboard templates

https://grafana.com/grafana/dashboards

Commonly used dashboards (IDs on grafana.com):
193   Docker monitoring dashboard
9276  node monitoring dashboard
7362  MySQL monitoring dashboard
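
Imported dashboards need a Prometheus data source to query. It can be added in the Grafana UI. As an alternative, here is a rough sketch of a provisioning file, assuming Grafana's default provisioning directory (conf/provisioning/datasources under the home path) and the Prometheus address used earlier in this article; the data source name is arbitrary and write_prometheus_datasource is a hypothetical helper name.

```shell
# write_prometheus_datasource DEST_FILE PROMETHEUS_URL   (hypothetical helper)
# Write a Grafana data source provisioning file pointing at PROMETHEUS_URL.
write_prometheus_datasource() {
  cat > "$1" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $2
    isDefault: true
EOF
}

# Example (path assumes -homepath=/opt/monitor/grafana and default settings):
# write_prometheus_datasource \
#   /opt/monitor/grafana/conf/provisioning/datasources/prometheus.yaml \
#   http://192.168.33.139:9090
```

Grafana reads provisioning files at startup, so restart grafana.service after writing the file.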

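6. Grafana dashboards (192.168.33.145:3000)
6.1 Monitoring node hosts
[Screenshots: Grafana dashboard for node hosts]

6.2 Monitoring a Kubernetes cluster
[Screenshots: Grafana dashboard for the Kubernetes cluster]
IV. Deploying Alertmanager
1. Download the Alertmanager binary release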

wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-amd64.tar.gz

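2. Extract the archive

tar xf alertmanager-0.23.0.linux-amd64.tar.gz -C /opt/monitor/

The unit file below expects the binary at /opt/monitor/alertmanager/alertmanager, so rename the extracted alertmanager-0.23.0.linux-amd64 directory to alertmanager.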

3. Add a systemd unit

[root@prometheus alertmanager]# cat /usr/lib/systemd/system/alertmanager.service 
[Unit]
Description=alertmanager
[Service]
ExecStart=/opt/monitor/alertmanager/alertmanager --config.file=/opt/monitor/alertmanager/alertmanager.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target

4. Reload systemd and start the service

systemctl daemon-reload
systemctl start alertmanager.service

5. Edit the Alertmanager configuration (DingTalk version)

[root@prometheus alertmanager]# cat alertmanager.yml
global:
  resolve_timeout: 5m

templates:
  - '/opt/monitor/alertmanager/template/*.tmpl'

route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 2m
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://localhost:8060/dingtalk/webhook1/send'
    send_resolved: true
inhibit_rules:
  - source_match:
      alertname: 'ApplicationDown'
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname',"target","job","instance"]

6. Edit the Alertmanager configuration (email version)

[root@prometheus alertmanager]# cat alertmanager.yml.bak20210830 
global:
  resolve_timeout: 5m
  # SMTP server
  smtp_smarthost: 'smtp.126.com:25'
  smtp_from: 'liujixiao6@126.com'
  smtp_auth_username: 'liujixiao6@126.com'
  smtp_auth_password: 'BBELDJWBPLMLIMUR' 
  smtp_require_tls: false

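# Routing tree
route:
  group_by: ['alertname'] # group alerts that share the same alertname label
  group_wait: 10s # wait 10s before the first notification for a new group, so alerts arriving in that window are sent together
  group_interval: 10s # wait before notifying about new alerts added to a group that has already been notified
  repeat_interval: 1h # wait before re-sending a notification for alerts that are still firing
  receiver: 'mail'

# Receivers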
receivers:
- name: 'mail'
  email_configs:
  - to: '1665111913@qq.com'

7. Restart Alertmanager

systemctl restart alertmanager
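
To check the notification path without waiting for a real alert, a test alert can be posted to Alertmanager's v2 API. A minimal sketch, assuming Alertmanager listens on its default port 9093 on the local host; the label values are made up for the test, and test_alert_payload is a hypothetical helper name.

```shell
# test_alert_payload ALERTNAME SEVERITY INSTANCE   (hypothetical helper)
# Print a JSON array containing one alert, in the shape POST /api/v2/alerts accepts.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"%s"},"annotations":{"summary":"manual test alert"}}]\n' \
    "$1" "$2" "$3"
}

# Example:
# test_alert_payload TestAlert warning 192.168.33.145:9100 |
#   curl -fsS -X POST -H 'Content-Type: application/json' \
#        --data-binary @- http://localhost:9093/api/v2/alerts
```

With the routes above, the notification should arrive after group_wait has elapsed. Because the payload sets no end time, Alertmanager treats the alert as resolved once resolve_timeout (5m here) passes without it being posted again.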

Note: for the detailed DingTalk alerting setup, see my other blog post:
https://blog.csdn.net/ljx1528/article/details/120070330

[Screenshot: DingTalk alert]
[Screenshot: email alert]
