Using the Prometheus Monitoring Software

Latest recommended article published on 2024-06-17 08:09:10

pengxb0v0

Tags: prometheus, networking, docker, operations

Article link: https://blog.csdn.net/pengxb0v0/article/details/126669392

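Commands for inspecting system resources

CPU

top

To see which CPU a process is running on: in top, press f, select the P (Last Used Cpu) field, and press Space to toggle it.

top sorts by CPU usage by default. Press M to sort by memory usage and P to sort by CPU usage again.

htop

lscpu, or cat /proc/cpuinfo for the raw per-core data
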
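Memory

top

free -m (MiB) or free -h (human-readable units)

vm.swappiness: 30 on the author's system. The kernel's built-in default is 60, so check the current value with cat /proc/sys/vm/swappiness.

sync: flushes dirty cached data to disk. Run it before dropping caches.

Drop the page cache, dentries and inodes: echo 3 > /proc/sys/vm/drop_caches

cat /proc/meminfo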
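
The MemTotal and MemAvailable fields of /proc/meminfo can be condensed into one percentage for a quick check. The following is a minimal sketch: the function name mem_available_pct is made up for illustration, and it assumes a kernel that reports MemAvailable.

```shell
# mem_available_pct [FILE]
# Print MemAvailable as a whole-number percentage of MemTotal.
# FILE defaults to /proc/meminfo; the helper name is illustrative only.
mem_available_pct() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END { if (total > 0) printf "%d\n", avail * 100 / total }
    ' "${1:-/proc/meminfo}"
}
```

Calling mem_available_pct with no argument reads the live values; passing a file name makes it easy to try the function on a saved copy of /proc/meminfo.
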
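Disk

df -Th

iostat -x 2 10 (refresh every 2 seconds, 10 reports in total)

sar (for example sar -d): reports tps, the number of I/O transfers per second, which is the figure usually quoted as IOPS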
Network

netstat

lsof

ss

nc

nmap

ping

fping

dstat

glances

nethogs: shows per-process bandwidth usage

tcpdump: captures network packets

Processes

ps aux

ps -ef

top

dstat

pstree
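
vm.swappiness (30 on the author's system) controls how readily the kernel moves memory pages to swap: the higher the value, the earlier swap is used. It is a tuning weight, not a strict "only 30% of physical memory left" threshold.

Drop caches: echo 3 > /proc/sys/vm/drop_caches

What can you do when physical memory runs short?

1. Kill processes that are unrelated to the business workload to free memory.

2. Add RAM, which requires powering the machine off.

3. Add servers and spread the load with load balancing.
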
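dd if=/dev/zero of=sc.dd bs=1M count=1000

dd copies data block by block, so it is often used for backups and for generating test files.

if: input file

/dev/zero: a device that produces an endless stream of zero bytes

of: output file

bs=1M: block size (1 MiB per block)

count=1000: number of blocks to copy, so the command above writes a 1000 MiB file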
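
The same dd invocation can double as a rough write-throughput check. The sketch below wraps it in a helper (write_test_file is a hypothetical name) and adds conv=fdatasync, a GNU dd option, so that the rate dd prints includes the time needed to reach the disk rather than only the page cache.

```shell
# write_test_file PATH MIB
# Write MIB mebibytes of zeros to PATH with dd. conv=fdatasync (GNU dd) makes
# dd flush the data to disk before it prints its summary, so the reported
# rate reflects the disk rather than the page cache.
write_test_file() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fdatasync
}
```

For example, write_test_file sc.dd 1000 reproduces the command above with the extra flush. Remove the file afterwards, since it occupies real disk space.
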

Disk performance metrics

TPS: transfers per second ("Total number of transfers per second that were issued to physical devices. A transfer is an I/O request to a physical device.")

IOPS: input/output operations per second

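Monitoring software

Monitoring (monitor)

Operations work cannot be done without monitoring.

Monitoring tools watch our servers and software around the clock (24/7) to check that they are still working properly. When something stops working, they raise an alert immediately (SMS, phone call, WeChat, DingTalk) so that it can be handled in time.

The value of monitoring: preventing incidents before they happen and reducing the damage when they do.

Monitoring software:

1. Cacti ("cactus"): good at drawing graphs

2. Nagios: has a very large collection of monitoring scripts

3. Zabbix: combines the strengths of Cacti and Nagios; very widely used in enterprises

4. Open-Falcon: monitoring software open-sourced by Xiaomi

5. Prometheus: open-source monitoring software

Cloud native: k8s + Prometheus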

Prometheus

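Components

Prometheus server
- TSDB: time series database, stored on local disk (HDD/SSD)
  
  HDD: hard disk drive (mechanical disk)
  
  SSD: solid-state drive
- HTTP server: web server
- Retrieval: scrapes (pulls) metrics from the targets
Exporter: collects data on the monitored host and exposes it
Metrics: the measured values, such as CPU, memory, network and disk data
Pushgateway: an intermediary (proxy) that stores data pushed by short-lived jobs so that Prometheus can scrape it
PromQL: the Prometheus query language
Grafana: a dedicated graphing tool that pulls data from other databases and displays it in dashboards
Alertmanager: alerting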
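
To make the PromQL entry concrete, the sketch below sends an instant query to the Prometheus HTTP API with curl. The helper name prom_query and the PROM_URL variable are assumptions made for this example; /api/v1/query is the server's instant-query endpoint.

```shell
# prom_query EXPR
# Run an instant PromQL query against the HTTP API and print the JSON reply.
# PROM_URL is an assumed variable, not something the article defines.
prom_query() {
    curl --silent --get \
        --data-urlencode "query=$1" \
        "${PROM_URL:-http://localhost:9090}/api/v1/query"
}
```

For instance, prom_query up asks for the built-in up series, which is 1 for every target that was scraped successfully and 0 otherwise.
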

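Ways of getting data:

1. Pull (server -> pull -> client): the server actively fetches data on its own schedule, which avoids bursts of concurrent writes.

2. Push (client -> push -> server): the client actively sends data. The data is very fresh and timely, but large volumes can be pushed at the same moment.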
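
As an illustration of the push model, a short-lived job can hand a sample to a Pushgateway, which Prometheus then scrapes in the usual pull fashion. This is a sketch under assumptions: the helper names and the PUSHGATEWAY_URL variable are hypothetical, and the Pushgateway is assumed to listen on its default port 9091.

```shell
# format_sample NAME VALUE
# Print one sample in the Prometheus text exposition format.
format_sample() {
    printf '%s %s\n' "$1" "$2"
}

# push_sample JOB NAME VALUE
# Send one sample to a Pushgateway under the given job name.
# PUSHGATEWAY_URL is an assumed variable; 9091 is the default port.
push_sample() {
    format_sample "$2" "$3" |
        curl --silent --data-binary @- \
            "${PUSHGATEWAY_URL:-http://localhost:9091}/metrics/job/$1"
}
```

A backup script could finish with push_sample backup backup_duration_seconds 42, where the job and metric names are placeholders chosen for this example.
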

Installing Prometheus

Container install: docker run -d -p 9090:9090 --name sc-prom prom/prometheus
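
After starting the container it can take a moment before the server answers. A small polling loop against the /-/ready endpoint is one way to wait for it. The helper name wait_for_prometheus and its defaults are assumptions for this sketch.

```shell
# wait_for_prometheus [BASE_URL] [TRIES]
# Poll the readiness endpoint once per second until it answers with success.
# Returns 0 when the server is ready and 1 when all tries are used up.
wait_for_prometheus() {
    url="${1:-http://localhost:9090}/-/ready"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        if curl --silent --fail --output /dev/null "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Typical use would be wait_for_prometheus http://localhost:9090 60 && echo ready, run right after the docker run command.
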
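
Commands already exist for checking CPU, memory, disk, network and processes, so why do we still need monitoring software such as Prometheus or Zabbix?

1. Time

A command only returns data at the moment you run it; when nobody types it, nothing is collected. Monitoring software collects data for us continuously.

2. Simpler operation, a more intuitive interface, and a lower barrier to entry

3. Data is graphed, so it is easy to visualize

4. Operability and continuity: the setup can be handed over between staff

Ops development: making operations more automated, intelligent and visual
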
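The firewall must not block the ports used by Prometheus and the other services. On a lab machine you can set the default policy of the INPUT chain to ACCEPT and flush the iptables rules in advance. Set the policy first, so that flushing under a DROP policy cannot lock you out:
```
iptables -P INPUT ACCEPT
iptables -F
```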
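
Flushing every rule is convenient in a lab but leaves the host without a firewall. A narrower alternative, sketched here with a hypothetical helper name, is to allow only the ports that are actually needed.

```shell
# open_tcp_port PORT
# Insert an ACCEPT rule for one inbound TCP port at the top of the INPUT chain.
open_tcp_port() {
    iptables -I INPUT -p tcp --dport "$1" -j ACCEPT
}
```

With the ports used in this article that would be open_tcp_port 9090 for Prometheus, open_tcp_port 8090 for node_exporter and open_tcp_port 3000 for Grafana. Rules added this way are not persistent across reboots.
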

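Installing from the release tarball (the original post calls this a "source install", but the package contains precompiled binaries):

1. Download the Prometheus release tarball and upload it to the server: https://github.com/prometheus/prometheus/releases/download/v2.38.0/prometheus-2.38.0.linux-amd64.tar.gz (this link is for 2.38.0, while the session below used 2.34.0; adjust the file names to the version you downloaded)

2. Unpack the tarball on the Prometheus server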

[root@prometheus opt]# tar xf prometheus-2.34.0.linux-amd64.tar.gz 
[root@prometheus opt]# ls
apiserver         containerd      prometheus-2.34.0.linux-amd64
apiserver.tar.gz  docker-compose  prometheus-2.34.0.linux-amd64.tar.gz
[root@prometheus opt]# mkdir /prom
[root@prometheus opt]# mv prometheus-2.34.0.linux-amd64 /prom/prometheus
[root@prometheus opt]# cd /prom/
[root@prometheus prom]# ls
prometheus

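3. Add Prometheus to the PATH environment variable on the Prometheus server

Temporary change (current shell only)

[root@prometheus prometheus]# PATH=/prom/prometheus/:$PATH

Permanent change (append the same line to /root/.bashrc)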

[root@prometheus prometheus]# vim /root/.bashrc
[root@prometheus prometheus]# cat /root/.bashrc
# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
	. /etc/bashrc
fi

PATH=/prom/prometheus/:$PATH
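
Because .bashrc is read by every new interactive shell, including nested ones, a plain PATH=...:$PATH line can add the same directory repeatedly. The sketch below shows one way to guard against that; path_prepend is a hypothetical helper, not part of the original setup.

```shell
# path_prepend DIR
# Put DIR at the front of PATH unless PATH already contains it.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}
```

In .bashrc the last line would then become path_prepend /prom/prometheus, placed after the function definition.
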

4. Start Prometheus on the Prometheus server

[root@prometheus prometheus]# ./prometheus --config.file=prometheus.yml

5. Install node_exporter on the node server that is to be monitored: https://github.com/prometheus/node_exporter/releases/download/v1.4.0-rc.0/node_exporter-1.4.0-rc.0.linux-amd64.tar.gz

[root@node_exporter ~]# cd /opt
[root@node_exporter opt]# ls
apiserver         containerd      node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
apiserver.tar.gz  docker-compose
[root@node_exporter opt]# tar xf node_exporter-1.4.0-rc.0.linux-amd64.tar.gz 
[root@node_exporter opt]# ls
apiserver         containerd      node_exporter-1.4.0-rc.0.linux-amd64
apiserver.tar.gz  docker-compose  node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
[root@node_exporter opt]# mv node_exporter-1.4.0-rc.0.linux-amd64 /node_exporter
[root@node_exporter opt]# cd /node_exporter/
[root@node_exporter node_exporter]# ls
LICENSE  node_exporter  NOTICE

6. Add node_exporter to the PATH variable

[root@node_exporter node_exporter]# PATH=/node_exporter/:$PATH
[root@node_exporter node_exporter]# vim /root/.bashrc
[root@node_exporter node_exporter]# cat /root/.bashrc
# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
	. /etc/bashrc
fi

PATH=/node_exporter/:$PATH

7. Run the node_exporter agent, listening on port 8090 of the local machine

[root@node_exporter node_exporter]# nohup ./node_exporter --web.listen-address 0.0.0.0:8090 &
[1] 21505
[root@node_exporter node_exporter]# nohup: ignoring input and appending output to 'nohup.out'

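8. Access the metrics exposed by node_exporter on the node server, for example at http://192.168.175.145:8090/metrics (the node address used in the scrape configuration in step 9)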
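
The metrics page is plain text, one sample per line, so it can also be inspected from a terminal. The sketch below picks a single label-free metric out of that text; metric_value is a hypothetical helper, and node_load1 (the 1-minute load average) is used as the example metric.

```shell
# metric_value NAME
# Read Prometheus text-format metrics on stdin and print the value of the
# sample whose name is exactly NAME. Only handles metrics without labels.
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}
```

Combined with curl it could be used as curl -s http://192.168.175.145:8090/metrics | metric_value node_load1.
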

9. On the Prometheus server, add a scrape configuration for the node server. The scraped data is stored in the TSDB time series database.

[root@prometheus prometheus]# ls
console_libraries  consoles  data  LICENSE  NOTICE  prometheus  prometheus.yml  promtool
[root@prometheus prometheus]# pwd
/prom/prometheus
[root@prometheus prometheus]# vim prometheus.yml 
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:  # scrape configuration
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]  # scrape Prometheus itself

  # The configuration above needs no changes; just append the job below
  - job_name: "node-1"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["192.168.175.145:8090"]  # scrape the target node
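
YAML does not allow tab characters for indentation, and a stray tab is an easy mistake to make when editing in vim. The sketch below checks for tabs and then, if the promtool binary from the Prometheus tarball is on the PATH, runs its check config subcommand. The helper name precheck_config is an assumption for this example.

```shell
# precheck_config FILE
# Fail if FILE contains tab characters; otherwise let promtool validate it
# when promtool is installed. Returns 0 when no problem was found.
precheck_config() {
    if grep -q "$(printf '\t')" "$1"; then
        echo "tab characters found in $1" >&2
        return 1
    fi
    if command -v promtool >/dev/null 2>&1; then
        promtool check config "$1"
    fi
}
```

Running precheck_config prometheus.yml before restarting the server catches most editing mistakes early.
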

10. Restart Prometheus: stop the running process first, then start the program again

Because Prometheus was installed by unpacking the binary tarball, no prometheus service is installed with it, so service prometheus restart does not work yet. To manage it that way, register the binary as a service (see step 12). The session below uses kill -9; a plain kill (SIGTERM) also works and lets Prometheus shut down cleanly.

[root@prometheus prometheus]# ps aux|grep prom
root      3162  0.2  6.1 913668 62024 pts/1    Sl+  11:58   0:00 ./prometheus --config.file=prometheus.yml
root      3181  0.0  0.0 112824   972 pts/0    S+   12:00   0:00 grep --color=auto prom
[root@prometheus prometheus]# kill -9 3162
[root@prometheus prometheus]# ps aux|grep prom
root      3196  0.0  0.0 112824   976 pts/0    S+   12:01   0:00 grep --color=auto prom
[root@prometheus prometheus]# ./prometheus --config.file=prometheus.yml

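11. Check the result in a browser at http://192.168.175.144:9090 (the Prometheus server's address and default port). On the Status > Targets page, both the prometheus and node-1 jobs should be listed as UP, which shows that the installation succeeded.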

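12. Register Prometheus as a service by creating a prometheus.service file in the /usr/lib/systemd/system directory. Once it is registered, stop the Prometheus process that was started by hand; after that, Prometheus can be managed with the service command.

[root@prometheus prometheus]# vim /usr/lib/systemd/system/prometheus.service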
[Unit]
Description=prometheus

[Service]
ExecStart=/prom/prometheus/prometheus --config.file=/prom/prometheus/prometheus.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure

[Install]
WantedBy=multi-user.target

# Find the PID of prometheus
[root@docker grafana]# ps aux|grep prom
root      3812  0.4  6.2 982024 63656 ?        Ssl  14:55   0:21 /prom/prometheus/prometheus --config.file=/prom/prometheus/prometheus.yml
root      4162  0.0  0.0 112824   976 pts/2    S+   16:18   0:00 grep --color=auto prom
# Force-kill the prometheus process
[root@docker grafana]# kill -9 3812

# Start the prometheus service with the service command
[root@docker grafana]# service prometheus start
Redirecting to /bin/systemctl start prometheus.service
# Verify that it started
[root@docker grafana]# ps aux|grep prom
root      4180 49.5  4.5 782468 45684 ?        Ssl  16:19   0:00 /prom/prometheus/prometheus --config.file=/prom/prometheus/prometheus.yml
root      4189  0.0  0.0 112824   976 pts/2    S+   16:19   0:00 grep --color=auto prom
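
The unit file above also declares WantedBy=multi-user.target, which only takes effect once the unit is enabled. A minimal sketch of the systemctl calls that usually follow an edit to a unit file is given below; the wrapper name activate_prometheus_service is hypothetical.

```shell
# activate_prometheus_service
# Make systemd re-read its unit files, start the unit at boot, and restart it
# now. Stops at the first command that fails.
activate_prometheus_service() {
    systemctl daemon-reload &&
    systemctl enable prometheus &&
    systemctl restart prometheus
}
```

Since the unit defines ExecReload as a SIGHUP to the main process, systemctl reload prometheus can afterwards be used to re-read prometheus.yml without a full restart.
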

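Installing Grafana

Overview: Grafana is an attractive, powerful tool for visualizing monitoring metrics. It is an open-source application written in Go, used mainly to visualize metrics data at scale, and is among the most popular tools for displaying time series data in infrastructure and application analytics. It supports the vast majority of commonly used time series databases.

Official site: https://grafana.com/

Steps:

1. Download the RPM package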

[root@prometheus grafana]# wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.1.2-1.x86_64.rpm

2. Install the RPM package

[root@prometheus grafana]# yum install grafana-enterprise-9.1.2-1.x86_64.rpm

Grafana listens on port 3000 by default

[root@prometheus grafana]# service grafana-server start
Starting grafana-server (via systemctl):                   [  OK  ]
[root@prometheus grafana]# netstat -anplut|grep grafan
tcp        0      0 192.168.175.144:55190   34.120.177.193:443      ESTABLISHED 4056/grafana-server 
tcp        0      0 192.168.175.144:45442   185.199.108.133:443     ESTABLISHED 4056/grafana-server 
tcp6       0      0 :::3000                 :::*                    LISTEN      4056/grafana-server

Open http://192.168.175.144:3000 (this machine's IP address, port 3000)

The default username and password are both admin