About Prometheus: More Than a Little Bit

Latest recommended article published on 2024-04-17 20:04:55

一元十二會_

Tags: prometheus

Article link: https://blog.csdn.net/weixin_45059041/article/details/120453518

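Overview of Prometheus

Prometheus is an open-source suite that combines monitoring, alerting, and a time series database. It was originally developed at SoundCloud.

Prometheus works by periodically scraping the state of monitored components over HTTP. The advantage is that any component can be monitored as long as it exposes an HTTP endpoint; no SDK or other integration work is required. This makes it a good fit for virtualized environments such as VMs or Docker.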
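
To make the "just expose an HTTP endpoint" idea concrete, here is a minimal sketch of a component that serves its state at /metrics using only the Python standard library. The metric name demo_requests_total and the helper names are assumptions made up for illustration; they do not come from the original article, and a real service would more likely use an official client library.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical counter for illustration; not part of the original article.
REQUESTS_SERVED = 0


def render_metrics() -> str:
    """Render the current state in the Prometheus text exposition format."""
    return (
        "# HELP demo_requests_total Scrapes served by this demo process.\n"
        "# TYPE demo_requests_total counter\n"
        f"demo_requests_total {REQUESTS_SERVED}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global REQUESTS_SERVED
        if self.path != "/metrics":
            self.send_error(404)
            return
        REQUESTS_SERVED += 1
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def scrape_once() -> str:
    """Serve on a free local port, fetch /metrics once, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        url = f"http://127.0.0.1:{port}/metrics"
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(scrape_once())
```

Calling scrape_once() plays the role of the Prometheus server here: it performs one plain HTTP GET and receives the metrics as text, which is all the integration the monitored side needs.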

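Prometheus is one of the few monitoring systems well suited to Docker, Mesos, and Kubernetes environments. As k8s has grown in popularity in recent years, Prometheus has become an increasingly popular monitoring tool.

Prometheus is an open-source monitoring system built around a TSDB (time series database). It was modeled on Google's Borgmon monitoring system and works very well for monitoring k8s containers. Prometheus is good at recording any purely numeric time series. It is designed for reliability and suits both machine-centric monitoring and the monitoring of highly dynamic service-oriented architectures; it remains usable for diagnosing problems while other parts of the system are down.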

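Advantages of Prometheus

How Prometheus "scrapes" data

Prometheus ecosystem components

Time series data
1. What is time series data

2. Characteristics of time series data

3. Main features of Prometheus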
Deploying Prometheus

Server allocation

Hostname     Address          Package
prometheus   192.168.35.40    prometheus-2.27.1.linux-amd64.tar.gz
server1      192.168.35.10    node_exporter-1.1.2.linux-amd64.tar.gz
server2      192.168.35.20
server3      192.168.35.30

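hostnamectl set-hostname prometheus		# on the other hosts, use server1, server2, and server3 respectively
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
vim /etc/resolv.conf
nameserver 114.114.114.114		# line to add in /etc/resolv.conf
ntpdate ntp1.aliyun.com 		# time synchronization; this step is required, otherwise errors occur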

cat > /etc/yum.repos.d/prometheus.repo <<'EOF'
[prometheus]
name=prometheus
baseurl=https://packagecloud.io/prometheus-rpm/release/el/$releasever/$basearch
repo_gpgcheck=1
enabled=1
gpgkey=https://packagecloud.io/prometheus-rpm/release/gpgkey
       https://raw.githubusercontent.com/lest/prometheus-rpm/master/RPM-GPG-KEY-prometheus-rpm
gpgcheck=1
metadata_expire=300
EOF

tar -zxvf prometheus-2.27.1.linux-amd64.tar.gz -C /usr/local
cd /usr/local/prometheus-2.27.1.linux-amd64/
vim prometheus.yml
# my global config
global:  ## global settings
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.  ## how often metrics are scraped; 1 minute if unset
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.  ## how often the alerting rules are evaluated
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration  ## the Alertmanager (a separate alerting component) to connect to
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:  ## alerting rules; the rules are written in YAML files
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:  ## data collection section
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.  ## job_name identifies where the scraped metrics come from
  - job_name: 'prometheus'  ## attached to every metric as a label and usable in PromQL selectors, e.g. up{job="prometheus"}

    # metrics_path defaults to '/metrics'  ## path the metrics are collected from
    # scheme defaults to 'http'.  ## metrics are scraped over http by default

    static_configs:  ## static list of targets to collect from; Prometheus itself listens on port 9090 by default
    - targets: ['localhost:9090']

./prometheus		## start Prometheus directly
netstat -antp | grep 9090		## in a second terminal, check that port 9090 is listening

Visit http://192.168.35.40:9090/metrics to view the metrics that Prometheus exposes about itself.
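
The /metrics page is plain text, one sample per line, with `# HELP` and `# TYPE` comment lines in between. As a rough sketch of how that format can be read, the snippet below parses a small hand-written sample. The sample values are made up, and the parser is deliberately simplified (it does not unescape label values and ignores optional timestamps), so treat it as an illustration rather than a complete parser.

```python
import re

# Hand-written sample in the text exposition format; the values are made up.
SAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 31
# HELP prometheus_http_requests_total Counter of HTTP requests.
# TYPE prometheus_http_requests_total counter
prometheus_http_requests_total{code="200",handler="/metrics"} 12
"""

# metric_name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```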

tar zxvf node_exporter-1.1.2.linux-amd64.tar.gz
cd node_exporter-1.1.2.linux-amd64
cp node_exporter /usr/local/bin

./node_exporter --help		## list the available command-line options
Managing node_exporter as a service with a systemd unit file:
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/bin/node_exporter \
    --collector.ntp \
    --collector.mountstats \
    --collector.systemd \
    --collector.tcpstat
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
Restart=always
[Install]
WantedBy=multi-user.target

./node_exporter
netstat -antp | grep 9100		## in another session, check that port 9100 is listening

Visit http://192.168.35.10:9100/metrics to view the metrics that node_exporter exposes.

cd /usr/local/prometheus-2.27.1.linux-amd64/
vim prometheus.yml		## append the following to the end of the configuration file
  - job_name: 'nodes'
    static_configs:
    - targets:
      - 192.168.35.10:9100
      - 192.168.35.20:9100
      - 192.168.35.30:9100

Start the service

./prometheus	## start the service
Then open http://192.168.35.40:9090/targets#pool-nodes to check the targets.
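
The same information shown on the targets page is also available as JSON from the server's /api/v1/targets endpoint. The sketch below summarizes target health from such a response. The helper names are made up for illustration, and SAMPLE_PAYLOAD is a hand-written, trimmed example shaped like the API response rather than output captured from the setup above.

```python
import json
import urllib.request


def fetch_targets(base_url):
    """Fetch the raw JSON from Prometheus's /api/v1/targets endpoint."""
    url = f"{base_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def summarize_targets(payload):
    """Map 'job/instance' to the reported health ('up', 'down', 'unknown')."""
    summary = {}
    for target in payload["data"]["activeTargets"]:
        labels = target["labels"]
        summary[f'{labels["job"]}/{labels["instance"]}'] = target["health"]
    return summary


# Hand-written sample shaped like the API response; the values are made up.
SAMPLE_PAYLOAD = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "nodes", "instance": "192.168.35.10:9100"},
                "scrapeUrl": "http://192.168.35.10:9100/metrics",
                "health": "up",
                "lastError": "",
            },
            {
                "labels": {"job": "nodes", "instance": "192.168.35.20:9100"},
                "scrapeUrl": "http://192.168.35.20:9100/metrics",
                "health": "down",
                "lastError": "connection refused",
            },
        ],
        "droppedTargets": [],
    },
}

if __name__ == "__main__":
    # Against a live server: summarize_targets(fetch_targets("http://192.168.35.40:9090"))
    print(summarize_targets(SAMPLE_PAYLOAD))
```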

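node_cpu_seconds_total		## cumulative CPU time in seconds, per CPU core and mode

PromQL: irate(node_cpu_seconds_total{mode="idle"}[5m])		## per-second idle rate, taken from the last two samples within the past 5 minutes

Advanced 2:

PromQL: (1 - avg(irate(node_cpu_seconds_total{mode='idle'}[5m])) by (instance)) * 100		## CPU utilization of each host as a percentage, averaged across its cores (5-minute lookback window)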
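
To show the arithmetic behind that expression, here is a small sketch that mimics it on made-up numbers. It is not how Prometheus implements PromQL; the function names and the IDLE data are assumptions for illustration. The irate step uses only the last two samples in the window, and the utilization step averages the idle rates per instance before converting to a percentage.

```python
from collections import defaultdict


def irate(samples):
    """Per-second rate from the last two (timestamp, value) samples."""
    if len(samples) < 2:
        return None
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # A counter that went down was reset; count the new value from zero.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)


def cpu_utilization(idle_series):
    """idle_series maps (instance, cpu) to idle-counter samples in the window.

    Returns the utilization percentage per instance, i.e.
    (1 - avg by (instance) of the idle rate) * 100.
    """
    rates = defaultdict(list)
    for (instance, _cpu), samples in idle_series.items():
        rate = irate(samples)
        if rate is not None:
            rates[instance].append(rate)
    return {inst: (1 - sum(r) / len(r)) * 100 for inst, r in rates.items()}


# Made-up samples: two cores on one host, scraped 15 seconds apart.
IDLE = {
    ("192.168.35.10:9100", "0"): [(0, 1000.0), (15, 1013.5)],
    ("192.168.35.10:9100", "1"): [(0, 2000.0), (15, 2012.0)],
}

if __name__ == "__main__":
    print(cpu_utilization(IDLE))
```

In this sample the two cores were idle for 13.5 and 12 of the 15 seconds, so the host comes out at roughly 15 percent utilization.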

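Prometheus service discovery

The Prometheus server scrapes data using a pull model, so it must know the location of every target in advance before it can scrape data from the corresponding exporter or instrumentation.

For small environments, listing each target with static_configs solves the problem and is the simplest way to configure it. Each target is identified by a network endpoint (ip:port).

For medium and large environments, or highly dynamic cloud environments, static configuration is clearly impractical. Prometheus therefore provides a set of service discovery mechanisms that use a service registry (service bus) to automatically discover, probe, and classify the targets that can be monitored, and to pick up targets that have changed.

The metric scrape lifecycle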

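During each scrape_interval, Prometheus checks the jobs to run. Each job first generates a list of targets from the discovery configuration specified on the job; this is the service discovery process. Service discovery returns a list of targets, each carrying a set of labels known as metadata, all prefixed with "__meta_".

Service discovery also sets further labels based on the target configuration. These labels carry a "__" prefix and suffix and include "__scheme__", "__address__", and "__metrics_path__", which hold the protocol the target supports (http or https, default http), the target's address, and the URI path of the metrics (default /metrics), respectively.

If the URI path carries any parameters, they are set as labels prefixed with "__param_". These target lists and labels are returned to Prometheus, and some of the labels can be overridden in the configuration.

The configured labels are reused during the scrape lifecycle to generate other labels; for example, the default value of the instance label on a metric comes from the value of the __address__ label.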
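
The way these special labels fit together can be sketched in a few lines. The functions below are hypothetical helpers written for this illustration, not Prometheus code: one assembles the scrape URL from __scheme__, __address__, __metrics_path__, and any __param_ labels, and the other shows the instance label defaulting to __address__ while labels starting with "__" are dropped from the stored series.

```python
from urllib.parse import urlencode


def scrape_url(labels):
    """Build the URL a target would be scraped at from its special labels."""
    scheme = labels.get("__scheme__", "http")          # default protocol
    path = labels.get("__metrics_path__", "/metrics")  # default path
    address = labels["__address__"]                    # host:port, required
    params = {
        key[len("__param_"):]: value
        for key, value in labels.items()
        if key.startswith("__param_")
    }
    query = "?" + urlencode(sorted(params.items())) if params else ""
    return f"{scheme}://{address}{path}{query}"


def final_labels(labels):
    """Drop labels starting with '__' and default instance to __address__."""
    result = {k: v for k, v in labels.items() if not k.startswith("__")}
    result.setdefault("instance", labels["__address__"])
    return result


if __name__ == "__main__":
    target = {"__address__": "192.168.35.10:9100", "job": "nodes"}
    print(scrape_url(target))
    print(final_labels(target))
```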

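For each discovered target, Prometheus offers the chance to relabel it. This is defined under relabel_configs in the job's configuration section and is commonly used to implement the following: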
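
As a rough illustration of what a relabel rule does, the sketch below applies one rule to a target's label set. It is a simplified model written for this article, covering only the replace, keep, and drop actions; the rule is passed as a plain dict whose keys mirror the relabel_configs field names, and details such as removing a label when the replacement is empty are left out.

```python
import re


def relabel(labels, rule):
    """Apply one simplified relabel rule.

    Returns the new label dict, or None if the target is dropped.
    """
    sources = rule.get("source_labels", [])
    separator = rule.get("separator", ";")
    value = separator.join(labels.get(name, "") for name in sources)
    # The regex must match the whole concatenated value.
    match = re.fullmatch(rule.get("regex", "(.*)"), value)
    action = rule.get("action", "replace")

    if action == "keep":
        return dict(labels) if match else None
    if action == "drop":
        return None if match else dict(labels)
    if action == "replace":
        if match is None:
            return dict(labels)  # no match: labels are left unchanged
        # Translate $1 / ${1} references into Python's \g<1> syntax.
        template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", rule.get("replacement", "$1"))
        result = dict(labels)
        result[rule["target_label"]] = match.expand(template)
        return result
    raise ValueError(f"unsupported action: {action}")


if __name__ == "__main__":
    target = {"__address__": "192.168.35.10:9100", "job": "nodes"}
    # Hypothetical rule: copy the host part of the address into a 'host' label.
    rule = {
        "source_labels": ["__address__"],
        "regex": r"(.+):\d+",
        "target_label": "host",
        "replacement": "$1",
    }
    print(relabel(target, rule))
```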

Static configuration discovery

Dynamic discovery