Containerized deployment of Prometheus and Grafana
Environment

Hostname | IP |
---|---|
master | 192.168.10.202 |
client | 192.168.10.203 |
Deploy Prometheus on master

Run the Prometheus container with port and configuration-file mappings.

//Provide the configuration file
[root@master ~]# mkdir /prometheus
[root@master ~]# vim /prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]
//Run the container
[root@master ~]# docker run -d --name prometheus --restart always -p 9090:9090 -v /prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
35ed6a5bf7f1644f8ffe2570bcf888be9d20b319c692bf69ef85d7a5e32468d7
//Check that it is running
[root@master ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
35ed6a5bf7f1 prom/prometheus "/bin/prometheus --c…" 2 seconds ago Up 2 seconds 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp prometheus
[root@master ~]#
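Besides `docker ps`, Prometheus exposes a built-in health endpoint (`/-/healthy`) that can be probed from the shell. A minimal sketch, assuming the container publishes port 9090 on localhost; the fallback notice is only printed when the server is unreachable:

```shell
# Probe Prometheus's health endpoint. Assumes the container maps
# 9090 on localhost; prints a notice instead of failing silently
# if the server cannot be reached.
health=$(curl -s --max-time 3 http://localhost:9090/-/healthy || echo "prometheus unreachable")
echo "$health"
```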
Open http://192.168.10.202:9090 in a browser to verify.
How do we monitor other hosts (nodes)?

Prometheus can scrape metrics from the various components of a Kubernetes cluster, such as the cAdvisor built into the kubelet, the API server, and so on; node-exporter is one such source.

"Exporter" is the general term for Prometheus's family of data-collection components. An exporter gathers data from a target and converts it into a format Prometheus understands. Unlike traditional collection agents, it does not push data to a central server; instead it waits for the central server to come and scrape it. The default scrape address is CURRENT_IP:9100/metrics.

node-exporter collects host-level runtime metrics, including basics such as loadavg, filesystem, and meminfo, similar to what zabbix-agent covers in traditional host monitoring.

We use node-exporter to collect this information and expose it to Prometheus, which lets us monitor different nodes.
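The text format served at `:9100/metrics` is line-oriented: `# HELP`/`# TYPE` comment lines followed by `name value` pairs, which makes individual values easy to pick out with standard tools. A small sketch over hand-written sample lines (not live exporter output):

```shell
# Sample of the Prometheus exposition format (hand-written, not
# captured from a real exporter).
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_memory_MemTotal_bytes 2.0635648e+09'

# awk matches the metric name in column 1 and prints its value;
# the # HELP / # TYPE comment lines do not match.
load1=$(printf '%s\n' "$sample" | awk '$1 == "node_load1" { print $2 }')
echo "$load1"
```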
Deploy node-exporter on the client host

Copy the tarball to the client host, extract it, and rename the directory.
[root@client ~]# ls
anaconda-ks.cfg node_exporter-1.3.1.linux-amd64.tar.gz
[root@client ~]# tar xf node_exporter-1.3.1.linux-amd64.tar.gz
[root@client ~]# mv node_exporter-1.3.1.linux-amd64 /usr/local/node_exporter
[root@client ~]# ls /usr/local/
bin etc games include lib lib64 libexec node_exporter sbin share src
[root@client ~]#
Write the service file
[root@client ~]# vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=The node_exporter Server
After=network.target
[Service]
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
RestartSec=15s
SyslogIdentifier=node_exporter
[Install]
WantedBy=multi-user.target
[root@client ~]#
//Enable at boot and start now
[root@client ~]# systemctl daemon-reload && systemctl enable --now node_exporter
Created symlink /etc/systemd/system/multi-user.target.wants/node_exporter.service → /usr/lib/systemd/system/node_exporter.service.
[root@client ~]#
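Once the unit is active, the exporter should answer on port 9100. A quick verification sketch, assuming it listens on localhost; the fallback notice is printed when the port is unreachable:

```shell
# Fetch the first few metric lines from node_exporter.
# Assumes it listens on localhost:9100; prints a notice on failure.
metrics=$(curl -s --max-time 3 http://localhost:9100/metrics || echo "node_exporter unreachable")
printf '%s\n' "$metrics" | head -n 3
```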
On the master host, edit the prometheus.yml configuration file and add the node
[root@master ~]# vim /prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "Linux Server"                # added
    static_configs:
      - targets: ["192.168.10.203:9100"]    # node IP
Restart the container
[root@master ~]# docker restart prometheus
prometheus
[root@master ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1852abd8ca3a prom/prometheus "/bin/prometheus --c…" 3 minutes ago Up Less than a second 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp prometheus
4726816303f9 grafana/grafana "/run.sh" 31 minutes ago Up 31 minutes 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafana
[root@master ~]#
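After the restart, the new target's scrape status can also be read back from Prometheus's HTTP API (`/api/v1/targets`) instead of the web UI. A sketch, assuming Prometheus is reachable on localhost:9090; `grep -o` pulls out just the health fields from the JSON response:

```shell
# List the health of each configured scrape target via the API.
# Assumes Prometheus is reachable at localhost:9090; falls back
# to a notice when it is not.
targets=$(curl -s --max-time 3 http://localhost:9090/api/v1/targets || echo "prometheus unreachable")
printf '%s\n' "$targets" | grep -o '"health":"[a-z]*"' || printf '%s\n' "$targets"
```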
Deploy Grafana on master

//Run the grafana container
[root@master ~]# docker run -d --name grafana -p 3000:3000 grafana/grafana
Unable to find image 'grafana/grafana:latest' locally
latest: Pulling from grafana/grafana
97518928ae5f: Pull complete
5b58818b7f48: Pull complete
d9a64d9fd162: Pull complete
4e368e1b924c: Pull complete
867f7fdd92d9: Pull complete
387c55415012: Pull complete
07f94c8f51cd: Pull complete
ce8cf00ff6aa: Pull complete
e44858b5f948: Pull complete
4000fdbdd2a3: Pull complete
Digest: sha256:18d94ae734accd66bccf22daed7bdb20c6b99aa0f2c687eea3ce4275fe275062
Status: Downloaded newer image for grafana/grafana:latest
4726816303f9d56451052f089eb8ba3a1f1432e4da5a784e6e33fe8842aaf5a1
[root@master ~]#
//Check that it is running
[root@master ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4726816303f9 grafana/grafana "/run.sh" About a minute ago Up About a minute 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafana
35ed6a5bf7f1 prom/prometheus "/bin/prometheus --c…" 7 minutes ago Up 7 minutes 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp prometheus
[root@master ~]#
Open http://192.168.10.202:3000 in a browser to verify.

Add the Prometheus data source
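Instead of clicking through the UI, the data source can also be registered through Grafana's HTTP API (`POST /api/datasources`). A sketch, assuming the default `admin:admin` credentials are still in place and Grafana listens on port 3000; a notice is printed when the server is unreachable:

```shell
# Register Prometheus as a Grafana data source over the HTTP API.
# Assumes default admin:admin credentials (change them in production)
# and that Grafana is reachable at localhost:3000.
resp=$(curl -s --max-time 3 -u admin:admin \
  -H 'Content-Type: application/json' \
  -X POST http://localhost:3000/api/datasources \
  -d '{
        "name": "Prometheus",
        "type": "prometheus",
        "url": "http://192.168.10.202:9090",
        "access": "proxy"
      }' || echo "grafana unreachable")
echo "$resp"
```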