Prometheus Installation, Usage, and Quick Start

进击的程序猿~

Article link: https://blog.csdn.net/qq_41822345/article/details/118877172

Prometheus Monitoring

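I. Prometheus Concepts

Prometheus, written in Go (golang), is an open-source system that combines monitoring, alerting, and a time series database. It is well suited to monitoring Docker containers, and the popularity of Kubernetes (commonly abbreviated k8s) has driven its adoption.

Time series data: data that records, in chronological order, how the state of a system or device changes.

Prometheus stores time series data very efficiently: each sample takes only about 3.5 bytes.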
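
The per-sample figure makes a rough disk estimate possible. The sketch below multiplies retention time, ingestion rate, and bytes per sample. The function name, the 15-day retention, and the rate of 1000 samples per second are assumptions chosen for illustration, not values from this article.

```go
package main

import "fmt"

const secondsPerDay = 86400

// estimateDiskBytes applies a rough capacity-planning formula:
// retention seconds * ingested samples per second * bytes per sample.
func estimateDiskBytes(retentionDays, samplesPerSecond, bytesPerSample float64) float64 {
	return retentionDays * secondsPerDay * samplesPerSecond * bytesPerSample
}

func main() {
	// Assumed workload: 1000 samples/s kept for 15 days at 3.5 bytes each.
	total := estimateDiskBytes(15, 1000, 3.5)
	fmt.Printf("about %.2f GB\n", total/1e9)
}
```

Under these assumptions the estimate comes to roughly 4.5 GB. Real usage depends on how well the data compresses.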

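Prometheus Features

A multi-dimensional data model.

A flexible query language.

No dependency on distributed storage; each server node is autonomous.

Time series data is collected over HTTP using a pull model.

A push model is also supported through an intermediary gateway.

Targets are discovered through service discovery or static configuration.

Multiple modes of graphing and dashboarding are supported.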
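
To make the multi-dimensional data model concrete: a sample is identified by a metric name plus a set of label pairs. The sketch below renders one sample in the text style that targets expose over HTTP. The helper name renderSample, the metric http_requests_total, and its labels are illustrative assumptions, and the quoting is simplified compared with the real exposition format.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// renderSample formats one sample as: metric_name{label="value",...} value
func renderSample(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output regardless of map iteration order

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+strconv.Quote(labels[k]))
	}
	v := strconv.FormatFloat(value, 'g', -1, 64)
	if len(pairs) == 0 {
		return name + " " + v
	}
	return name + "{" + strings.Join(pairs, ",") + "} " + v
}

func main() {
	// http_requests_total and its labels are illustrative names only.
	fmt.Println(renderSample("http_requests_total",
		map[string]string{"method": "GET", "status": "200"}, 42))
}
```

Each distinct combination of label values is a separate time series, which is what lets a query slice the same metric by method, status, and so on.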

Prometheus Architecture

[Figure: Prometheus architecture diagram]

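Prometheus Components
Prometheus Server: collects and stores time series data and analyzes it for alerting.
client libraries: provide metrics collection inside application code.
pushgateway: jobs push their data to the pushgateway, which then exposes it to the Prometheus Server.
exporters: expose metrics collected from third-party services.
alertmanager: deduplicates, groups, and sends the alerts generated by the Prometheus Server.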
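
The deduplicating and grouping role of alertmanager can be pictured with a toy model. This is only a sketch of the idea, not Alertmanager's actual implementation: the Alert type, fingerprint, and groupAlerts are hypothetical names, and an alert is reduced to a bare set of labels.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Alert is a hypothetical, simplified alert: just a set of labels.
type Alert map[string]string

// fingerprint builds a canonical string from the sorted label pairs so
// that two alerts with identical labels compare equal.
func fingerprint(a Alert) string {
	keys := make([]string, 0, len(a))
	for k := range a {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		b.WriteString(k + "=" + a[k] + ";")
	}
	return b.String()
}

// groupAlerts drops exact duplicates, then buckets the rest by the value
// of one label (for example "alertname").
func groupAlerts(alerts []Alert, groupBy string) map[string][]Alert {
	seen := make(map[string]bool)
	groups := make(map[string][]Alert)
	for _, a := range alerts {
		fp := fingerprint(a)
		if seen[fp] {
			continue
		}
		seen[fp] = true
		groups[a[groupBy]] = append(groups[a[groupBy]], a)
	}
	return groups
}

func main() {
	alerts := []Alert{
		{"alertname": "HighCPU", "instance": "node1"},
		{"alertname": "HighCPU", "instance": "node1"}, // duplicate
		{"alertname": "HighCPU", "instance": "node2"},
		{"alertname": "DiskFull", "instance": "node1"},
	}
	for name, group := range groupAlerts(alerts, "alertname") {
		fmt.Println(name, len(group))
	}
}
```

Grouping means one notification can cover every instance affected by the same problem instead of one message per instance.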
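
Prometheus Monitoring Workflow

step1: Prometheus finds its scrape targets through service discovery.

step2: It collects metrics from the targets over HTTP (pulled directly, or pushed via the pushgateway) and persists them.

step3: It periodically runs local rules that aggregate data into new time series or generate alerts.

step4: It sends alerts to Alertmanager, which groups and deduplicates them before sending alert or recovery notifications.

step5: The HTTP API provided by the Prometheus Server can be called to fetch the collected data for visualization.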
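
As a sketch of step5, the instant-query endpoint of the HTTP API is /api/v1/query, with the expression passed in the query parameter. The helper name instantQueryURL and the address 127.0.0.1:9090 are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// instantQueryURL builds a URL for the HTTP API's instant-query endpoint,
// /api/v1/query, URL-encoding the expression as the "query" parameter.
func instantQueryURL(baseURL, expr string) string {
	params := url.Values{}
	params.Set("query", expr)
	return strings.TrimRight(baseURL, "/") + "/api/v1/query?" + params.Encode()
}

func main() {
	fmt.Println(instantQueryURL("http://127.0.0.1:9090", "up"))
	// Expressions with brackets and braces are percent-encoded automatically.
	fmt.Println(instantQueryURL("http://127.0.0.1:9090", "rate(node_cpu_seconds_total[5m])"))
}
```

A GET on the resulting URL returns JSON, which is what a dashboard tool would consume.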
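
Prometheus Use Cases

Prometheus can record any purely numeric time series.

Prometheus is commonly used for machine-centric monitoring and for monitoring highly dynamic service-oriented architectures.

It is not suitable when 100% accuracy is required, for example for billing.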

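II. Installing and Using Prometheus

Download the installation packages from the official Prometheus site: https://prometheus.io/download/

1. Installation

Prometheus service: collects the monitoring metrics of all other machines and services.

node_exporter service: reports a machine's monitoring metrics.

mysqld_exporter service: reports the monitoring metrics of a MySQL service.

1. Prometheus service

step1: Unpack the downloaded package.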

tar xf prometheus-2.5.0.linux-amd64.tar.gz
mv prometheus-2.5.0.linux-amd64 /usr/local/prometheus

step2: Start it with the default configuration.

cd /usr/local/prometheus
/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &

step3: Open http://<server IP>:9090 in a browser to view the monitoring page. Open http://<server IP>:9090/metrics to view the metrics Prometheus exposes about itself.
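
The /metrics page is plain text: comment lines start with #, and each remaining line is a sample followed by its value. The sketch below parses such text into a map. It is a simplification that assumes no timestamps and no spaces inside label values; parseSamples and the sample body are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples reads "series value" lines, skipping blank lines and
// "#" comment lines (HELP/TYPE). The text before the last space is used
// as the map key, the text after it as the float value.
func parseSamples(text string) (map[string]float64, error) {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		i := strings.LastIndex(line, " ")
		if i < 0 {
			return nil, fmt.Errorf("malformed line: %q", line)
		}
		v, err := strconv.ParseFloat(line[i+1:], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		samples[strings.TrimSpace(line[:i])] = v
	}
	return samples, sc.Err()
}

func main() {
	// A made-up response body in the style of a /metrics page.
	body := "# HELP node_load1 1m load average.\n" +
		"# TYPE node_load1 gauge\n" +
		"node_load1 0.5\n" +
		"node_cpu_seconds_total{cpu=\"0\",mode=\"idle\"} 1234.5\n"
	samples, err := parseSamples(body)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(samples["node_load1"], samples[`node_cpu_seconds_total{cpu="0",mode="idle"}`])
}
```

In practice the body would come from an HTTP GET on the /metrics URL of a target.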

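2. node_exporter service

So far, only the local machine is monitored. To monitor a remote machine, install the node_exporter component on that machine.

step1: Unpack the downloaded package.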

tar xf node_exporter-0.16.0.linux-amd64.tar.gz
mv node_exporter-0.16.0.linux-amd64 /usr/local/node_exporter

step2: Start node_exporter in the background.

nohup /usr/local/node_exporter/node_exporter &

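step3: Back on the machine running the Prometheus service, add the node_exporter target at the end of the configuration file.

vim /usr/local/prometheus/prometheus.yml
## Append the following at the end of the file, under scrape_configs (keep the two-space indentation)
  - job_name: 'agent1' # a job name that identifies the monitored machine
    static_configs:
    - targets: ['192.168.168.102:9100'] # change to the monitored machine's IP; node_exporter listens on port 9100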
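
When many machines need to be added, the job entry can be generated instead of typed by hand. The sketch below only formats text; renderScrapeJob is a hypothetical helper, and its output is indented by two spaces on the assumption that it will be pasted under the scrape_configs key.

```go
package main

import (
	"fmt"
	"strings"
)

// renderScrapeJob returns a YAML fragment for one static scrape job,
// indented by two spaces so it can sit under the scrape_configs key.
func renderScrapeJob(job string, targets []string) string {
	quoted := make([]string, len(targets))
	for i, t := range targets {
		quoted[i] = "'" + t + "'"
	}
	return fmt.Sprintf("  - job_name: '%s'\n    static_configs:\n    - targets: [%s]\n",
		job, strings.Join(quoted, ", "))
}

func main() {
	fmt.Print(renderScrapeJob("agent1", []string{"192.168.168.102:9100"}))
}
```

The function does no escaping, so it is only suitable for plain job names and host:port targets.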

step4: Restart the Prometheus service.

## Stop the old process first: kill it by PID, or use pkill to kill it by name
ps -ef|grep prometheus
kill -9 <prometheus PID>
pkill prometheus
## Restart
/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &
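
A full restart is not the only way to pick up a changed file. Prometheus also reloads its configuration when it receives SIGHUP, or on an HTTP POST to /-/reload if the server was started with --web.enable-lifecycle. The sketch below builds that POST request; newReloadRequest and the address are assumptions, and the request is printed rather than sent.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newReloadRequest builds the POST request for the /-/reload endpoint.
// The endpoint only works when the server runs with --web.enable-lifecycle.
func newReloadRequest(baseURL string) (*http.Request, error) {
	return http.NewRequest(http.MethodPost, strings.TrimRight(baseURL, "/")+"/-/reload", nil)
}

func main() {
	req, err := newReloadRequest("http://127.0.0.1:9090")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```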

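step5: Open http://<server IP>:9090 in a browser to view the monitoring page. (There is now one more monitoring target.)

3. mysqld_exporter service

step1: Unpack the downloaded package.

tar xf mysqld_exporter-0.11.0.linux-amd64.tar.gz
mv mysqld_exporter-0.11.0.linux-amd64 /usr/local/mysqld_exporter

step2: Log in to the installed MySQL and grant privileges to a monitoring user.

## Note: GRANT ... IDENTIFIED BY was removed in MySQL 8.0; there, create the user with CREATE USER first
grant select,replication client,process ON *.* to 'mysql_monitor'@'localhost' identified by 'Aa123456';
flush privileges;
## Put the user's login credentials in /etc/my.cnf, as follows: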
[client] 
user=mysql_monitor 
password=Aa123456

How it works: the Prometheus server scrapes mysqld_exporter, and mysqld_exporter in turn queries MySQL for the monitoring metrics.
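
This chain can be sketched as a tiny exporter: on every scrape, the handler checks the backend and renders the result as a metric. This is not mysqld_exporter's real code. The names renderUp, metricsHandler, and pingDatabase are hypothetical, and the database check is a stub.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// renderUp turns the result of a backend check into a mysql_up sample:
// 1 when the backend is reachable, 0 when it is not.
func renderUp(ok bool) string {
	if ok {
		return "mysql_up 1\n"
	}
	return "mysql_up 0\n"
}

// metricsHandler runs the check on every scrape, mirroring the chain
// Prometheus -> exporter -> database.
func metricsHandler(check func() error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprint(w, renderUp(check() == nil))
	}
}

func main() {
	// pingDatabase is a hypothetical stand-in for a real MySQL ping.
	pingDatabase := func() error { return errors.New("connection refused") }
	fmt.Print(renderUp(pingDatabase() == nil))

	// Register the handler; serving is left commented out in this sketch.
	http.Handle("/metrics", metricsHandler(pingDatabase))
	// http.ListenAndServe(":9104", nil)
}
```

Because the check runs at scrape time, the value Prometheus stores reflects the state of the database at the moment of each scrape.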

step3: Start the mysqld_exporter service in the background.

nohup /usr/local/mysqld_exporter/mysqld_exporter --config.my-cnf=/etc/my.cnf &
## By default, mysqld_exporter exposes its metrics on port 9104

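step4: Back on the machine running the Prometheus service, add the mysqld_exporter target at the end of the configuration file.

vim /usr/local/prometheus/prometheus.yml
## Append the following at the end of the file, under scrape_configs (keep the two-space indentation)
  - job_name: 'mysqld-agent1' # a job name that identifies the monitored service
    static_configs:
    - targets: ['192.168.168.101:9104'] # change to the IP of the machine running mysqld; port 9104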

step5: Restart the Prometheus service.

## Stop the old process first: kill it by PID, or use pkill to kill it by name
ps -ef|grep prometheus
kill -9 <prometheus PID>
pkill prometheus
## Restart
/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &

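step6: Open http://<server IP>:9090 in a browser. (There is now yet another monitoring target.)

4. Grafana service

For the official Grafana installation guide, see https://grafana.com/

Grafana is an open-source metrics analytics and visualization tool. It queries and analyzes the collected data, presents it visually, and can also raise alerts.

2. PromQL

Prometheus provides PromQL, a query language for querying and aggregating time series data.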

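Examples follow. Each expr field holds a PromQL expression in which {{.Name}} and {{.Namespace}} are template placeholders: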

prometheusURL: "http://127.0.0.1:30065"
metrics:
  - name: metric_cpu_usage_ratio
    description: CPU usage ratio
    expr: (rate(container_cpu_usage_seconds_total{container="mysql", alertingTargetName="{{.Name}}", namespace="{{.Namespace}}"}[5m]) / ignoring(cpu) (container_spec_cpu_quota / 100000)) * 100
  - name: metric_memory_usage_ratio
    description: Memory usage ratio
    expr: (container_memory_rss{container="mysql", alertingTargetName="{{.Name}}", namespace="{{.Namespace}}"}/container_spec_memory_limit_bytes)*100
  - name: health
    description: Health status of the instance
    expr: mysql_up{AppName="{{.Name}}", namespace="{{.Namespace}}"}
  - name: metric_mysql_storage_iops
    description: Read and write I/O operations per second (IOPS) of the instance
    expr: sum(rate(container_fs_writes_total{container="mysql", alertingTargetName="{{.Name}}", namespace="{{.Namespace}}"}[5m])) by (namespace,pod,container)
      + sum(rate(container_fs_reads_total{container="mysql", alertingTargetName="{{.Name}}", namespace="{{.Namespace}}"}[5m])) by (namespace,pod,container)
  - name: metric_mysql_tps
    description: Transactions per second (TPS) of the instance
    expr: sum(rate(mysql_global_status_commands_total{command=~"(commit|rollback)", AppName="{{.Name}}", namespace="{{.Namespace}}"}[5m])) by (namespace,instance)
  - name: metric_mysql_qps
    description: Average number of SQL statements executed per second (QPS) by the instance
    expr: sum(rate(mysql_global_status_commands_total{AppName="{{.Name}}", namespace="{{.Namespace}}"}[5m])) by (namespace,instance)
  - name: metric_mysql_threads_connected
    description: Total number of current connections to the instance
    expr: mysql_global_status_threads_connected{AppName="{{.Name}}", namespace="{{.Namespace}}"}
  - name: metric_mysql_threads_running
    description: Number of currently active (running) threads of the instance
    expr: mysql_global_status_threads_running{AppName="{{.Name}}", namespace="{{.Namespace}}"}
  - name: metric_mysql_storage_volume_util
    description: Storage space utilization
    expr: 100 * sum(mysql_filesystem_used_bytes{AppName="{{.Name}}", namespace="{{.Namespace}}"}) by (namespace,instance) /
      sum(mysql_filesystem_size_bytes{AppName="{{.Name}}", namespace="{{.Namespace}}"}) by (namespace,instance)
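
The {{.Name}} and {{.Namespace}} placeholders in these expressions look like Go template fields. Assuming they are meant for Go's text/template package (the article does not say which tool renders them), a sketch of filling them in before sending the expression to Prometheus could look like this. The function renderExpr and the values "mysql-demo" and "default" are made up.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderExpr fills the {{.Name}} and {{.Namespace}} placeholders of an
// expression using Go's text/template package.
func renderExpr(expr, name, namespace string) (string, error) {
	tmpl, err := template.New("expr").Parse(expr)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	data := struct{ Name, Namespace string }{name, namespace}
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	expr := `mysql_up{AppName="{{.Name}}", namespace="{{.Namespace}}"}`
	// "mysql-demo" and "default" are made-up values for illustration.
	rendered, err := renderExpr(expr, "mysql-demo", "default")
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Println(rendered)
}
```

The single braces of the PromQL label matcher pass through untouched, because only the double-brace delimiters are treated as template actions.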

3. prometheus.yml Configuration

Example prometheus.yml configuration

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'agent1'
    static_configs:
    - targets: ['192.168.168.101:9100']
  - job_name: 'agent2'
    static_configs:
    - targets: ['192.168.168.102:9100']
  - job_name: 'mysql'
    static_configs:
    - targets: ['192.168.168.101:9104']
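
Each entry in a targets list has the form host:port. Before adding entries to the file, a quick sanity check can catch typos such as a missing port. The sketch below is a simple check of that form, not the validation Prometheus itself performs; validTarget is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// validTarget reports whether s has the host:port form used in the
// targets lists, with a non-empty host and a port in 1..65535.
func validTarget(s string) bool {
	host, port, err := net.SplitHostPort(s)
	if err != nil || host == "" {
		return false
	}
	n, err := strconv.Atoi(port)
	return err == nil && n >= 1 && n <= 65535
}

func main() {
	for _, t := range []string{"localhost:9090", "192.168.168.101:9100", "192.168.168.101", "host:notaport"} {
		fmt.Printf("%-22s %v\n", t, validTarget(t))
	}
}
```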

Example of manipulating prometheus.yml (in Go)

package main

import (
	"fmt"
	"gopkg.in/yaml.v2"
	"log"
	"os"
)

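// PrometheusConfig mirrors the top-level keys of prometheus.yml.
// Sections that are only passed through are kept as interface{}.
type PrometheusConfig struct {
	Global        interface{}     `yaml:"global"`
	Alerting      interface{}     `yaml:"alerting"`
	RuleFiles     []string        `yaml:"rule_files"`
	ScrapeConfigs []*ScrapeConfig `yaml:"scrape_configs"`
}

// ScrapeConfig holds the fields of one scrape_configs entry that this
// example reads and writes; any other fields are dropped on re-encoding.
type ScrapeConfig struct {
	JobName       interface{} `yaml:"job_name"`
	StaticConfigs interface{} `yaml:"static_configs"`
	//StaticConfigs interface{} `yaml:"static_configs,omitempty"`
}

// NewScrapeConfig returns a scrape job with only its name set.
func NewScrapeConfig(job string) *ScrapeConfig {
	return &ScrapeConfig{
		JobName: job,
	}
}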

func main() {
	open, err := os.Open("./Prometheus.yaml")
	if err != nil {
		log.Fatal(err)
	}
	defer open.Close()
	decoder := yaml.NewDecoder(open)
	var config PrometheusConfig
	err = decoder.Decode(&config)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%#v\n", config)
	//----------------------------------------------------------
	// Modify the config: append a new scrape job
	scrapeConfig := NewScrapeConfig("mysqld")
	config.ScrapeConfigs = append(config.ScrapeConfigs, scrapeConfig)
	//----------------------------------------------------------
	// Write the modified config to a copy of the YAML file
	create, err := os.Create("./Prometheus_copy.yaml")
	if err != nil {
		log.Fatal(err)
	}
	defer create.Close()
	encoder := yaml.NewEncoder(create)
	err = encoder.Encode(config)
	if err != nil {
		log.Fatal(err)
	}
}