Installing, Deploying, and Testing Prometheus + Pushgateway + Grafana

张dozen

Article link: https://blog.csdn.net/zjlgdxzzw/article/details/129158345


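# Prometheus Setup Tutorial

## Key Concepts

Prometheus's main job is to collect and store metric data. The data comes from various exporters: MySQL has the MySQL exporter, server performance metrics have their own exporter, and so on.

So to monitor something, such as a host's CPU usage, we need an exporter. Prometheus periodically pulls (scrapes) monitoring samples from the HTTP endpoint that the exporter exposes (usually /metrics).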
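To make the scrape model concrete, here is a minimal sketch of what an exporter's /metrics response looks like and how a single value can be read from it. The sample values are made up for illustration, and `metric_value` is a hypothetical helper, not part of Prometheus or any exporter.

```shell
# Illustrative sample of the text exposition format served at /metrics.
# The numbers are invented; a real exporter returns many more lines.
sample_metrics='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5'

# metric_value NAME: read exposition text on stdin and print the value
# of the unlabelled metric NAME (hypothetical helper for this sketch).
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

printf '%s\n' "$sample_metrics" | metric_value node_load1
```

Against a live exporter, the same helper could be fed from `curl -s http://<host>:9100/metrics`.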

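## Installing and Deploying Prometheus

Go to the official site

https://prometheus.io/download/

Choose the Linux build and download it

https://github.com/prometheus/prometheus/releases/download/v2.42.0/prometheus-2.42.0.linux-amd64.tar.gz

Upload it to the server

Extract it, for example with `tar -zxvf prometheus-2.42.0.linux-amd64.tar.gz`

Edit the configuration file

The prometheus.yml file: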

```
# my global config
global: # general global settings; only two parameters are listed here
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules (alert checks) every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files: # alert rule files to load
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # This job monitors Prometheus itself; other nodes to monitor can be added alongside it.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # targets may list several nodes separated by commas, each written as host:port.
    # The port is the exporter's port; 9100, for example, is node_exporter's default.
    # Once configured, Prometheus identifies the monitored nodes from this file
    # and starts scraping them continuously.
    static_configs:
      - targets: ["localhost:9090"] # the port Prometheus itself listens on
```

Start command

`nohup ./prometheus --config.file=prometheus.yml > ./prometheus.log 2>&1 &`
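Because the server is started in the background, it is worth confirming that it actually came up. The sketch below polls Prometheus's `/-/healthy` management endpoint; `wait_until_healthy` is a hypothetical helper name, and the retry count and timeout are arbitrary choices for illustration.

```shell
# wait_until_healthy BASE_URL [TRIES]: poll BASE_URL/-/healthy about once
# per second and return 0 as soon as it answers with a success status.
# Hypothetical helper; TRIES defaults to 10.
wait_until_healthy() {
  local base=${1%/} tries=${2:-10} i=0
  while [ "$i" -lt "$tries" ]; do
    if curl --silent --fail --output /dev/null --max-time 2 "$base/-/healthy"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt follows.
    [ "$i" -lt "$tries" ] && sleep 1
  done
  return 1
}
```

A typical call would be `wait_until_healthy http://localhost:9090 && echo "Prometheus is up"`. The `promtool` binary shipped in the same tarball can also validate the file before starting, with `./promtool check config prometheus.yml`.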

## Installing and Deploying the Exporter

Download link

https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz

Start command

`nohup ./node_exporter > node_exporter.log 2>&1 &`

Check that it started correctly

`ps -ef | grep node_exporter`

Then open http://10.50.51.30:9100/metrics in a browser

Configure node_exporter to start automatically

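```
# Create this file with: vi /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://github.com/prometheus/node_exporter
After=network.target

[Service]
Type=simple
# Adjust this path to the directory where node_exporter was actually extracted
ExecStart=/usr/local/node_exporter-1.4.0/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After saving the unit file, run `systemctl daemon-reload` and then `systemctl enable --now node_exporter` so that the service starts immediately and on every boot.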

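## Installing and Deploying Grafana

Official Grafana download page (it also serves as the installation guide): https://grafana.com/grafana/download?pg=get&plcmt=selfmanaged-box1-cta1

wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.3.6.linux-amd64.tar.gz

tar -zxvf grafana-enterprise-9.3.6.linux-amd64.tar.gz

Start command (run from the `bin` directory of the extracted package)

nohup ./grafana-server &

Mail settings need to be configured

Under the Grafana directory, create a directory named config and create the file grafana.ini inside it. Grafana normally reads only conf/defaults.ini and conf/custom.ini, so a file in this location most likely has to be passed explicitly with the `--config` option when starting grafana-server.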

```
#################################### SMTP / Emailing ##########################
# Mail server settings
[smtp]
enabled = true
# Outgoing mail server
host = smtp.qq.com:465
# SMTP account
user = 2469278741@qq.com
# SMTP authorization code (placeholder; use your own)
password = 123456
# Sender address
from_address = 2469278741@qq.com
# Sender name
from_name = zhiweiliao
```

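A data source file also needs to be configured: conf/provisioning/datasources/datasource.yml

```
# config file version
apiVersion: 1

# If a data source named Prometheus with orgId 1 already exists, delete it first
deleteDatasources:
  - name: Prometheus
    orgId: 1

# Configure the Prometheus data source
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    orgId: 1
    # Inside the same docker compose setup, the prometheus service name can be used directly.
    # For the bare-metal install in this article, use http://localhost:9090 instead.
    url: http://prometheus:9090
    basicAuth: false
    isDefault: true
    version: 1
    editable: true
```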

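Open the page

http://10.50.51.30:3000/

Skip past the username and password prompt

Add the Prometheus data source

Click the gear icon in the sidebar ==> Add data source

Select Prometheus, enter the URL http://localhost:9090, and click Save & test; it should report success

Test a query

Click the Explore icon

In the drop-down to the right of Explore at the top, select Prometheus

Under Metric, select go_gc_duration_seconds

Under Label filters, choose instance = localhost:9090, then click Run query; the chart data appears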
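The same query can be checked without Grafana through Prometheus's HTTP API (`/api/v1/query`). This is only a sketch: `promql_selector` and `prom_query` are hypothetical helper names, and the metric and instance are the ones chosen in the Explore steps above.

```shell
# promql_selector METRIC INSTANCE: build a PromQL selector that filters
# METRIC by its instance label (hypothetical helper).
promql_selector() {
  printf '%s{instance="%s"}\n' "$1" "$2"
}

# prom_query BASE_URL QUERY: run an instant query against the Prometheus
# HTTP API and print the JSON response (hypothetical helper).
prom_query() {
  curl --silent --get --data-urlencode "query=$2" "${1%/}/api/v1/query"
}

# Print the selector that corresponds to the Explore settings.
promql_selector go_gc_duration_seconds localhost:9090
```

With Prometheus running, `prom_query http://localhost:9090 "$(promql_selector go_gc_duration_seconds localhost:9090)"` should return the same series that Grafana plots.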

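## Installing and Deploying Pushgateway

Some metrics can be collected by pulling, but some data is event-triggered, or we may simply want to push it to Prometheus. That is the case Pushgateway is for.

Download link: https://github.com/prometheus/pushgateway/releases/download/v1.5.1/pushgateway-1.5.1.linux-amd64.tar.gz

Start command: `nohup ./pushgateway &`

Check the port

`netstat -apn | grep 9091`

View the Pushgateway page

Open the `pushgateway` web page at `http://10.50.51.30:9091`. The Metrics tab shows no data, because no client has pushed anything to `pushgateway` yet.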

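Modify the Prometheus server configuration and define a job

In the prometheus.yml file on the Prometheus server, define a job whose targets point to the IP of the Pushgateway host and port 9091, then restart Prometheus so that the change takes effect: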

```
[root@ip-10-50-51-30 prometheus-2.42.0.linux-amd64]# ps -ef|grep prometheus
root 25157 22948 0 Feb20 pts/0 00:02:01 ./prometheus --config.file=prometheus.yml
root 25310 22948 2 14:35 pts/0 00:00:00 grep --color=auto prometheus
[root@ip-10-50-51-30 prometheus-2.42.0.linux-amd64]# kill -9 25157
[root@ip-10-50-51-30 prometheus-2.42.0.linux-amd64]# nohup ./prometheus --config.file=prometheus.yml > ./prometheus.log 2>&1 &
ps -ef|grep prometheus
```
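The transcript above only restarts Prometheus; the job entry itself is not shown in this article. Below is a minimal sketch of what it might look like, printed by a hypothetical `pushgateway_job` helper so the snippet can be reviewed before being pasted under `scrape_configs`. The job name `pushgateway` is an assumption; `honor_labels: true` is the setting usually recommended for Pushgateway so that the job and instance labels sent by the pushing client are kept.

```shell
# pushgateway_job TARGET: print a scrape job entry for the Pushgateway at
# TARGET (host:port). The indentation matches an entry nested directly
# under scrape_configs. Hypothetical helper for this sketch.
pushgateway_job() {
  cat <<EOF
  - job_name: "pushgateway"   # assumed name, pick any
    honor_labels: true        # keep job/instance labels set by the pusher
    static_configs:
      - targets: ["$1"]
EOF
}

pushgateway_job 10.50.51.30:9091
```

If `scrape_configs` is the last section of prometheus.yml, as in the file shown earlier, the output could be appended with `pushgateway_job 10.50.51.30:9091 >> prometheus.yml`; otherwise paste it in by hand.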

Write a collection script that gathers host data and pushes it to Pushgateway

vim pushgateway.sh

```
#!/bin/bash
# Collect a host metric and push it to Pushgateway.

# Use the short host name (the part before the first dot) as the instance name.
instance_name=$(hostname -f | cut -d'.' -f1)

# The host name must not be "localhost"; otherwise hosts cannot be told apart.
if [ "$instance_name" = "localhost" ]; then
  echo "Must FQDN hostname"
  exit 1
fi

# Metric name (the key).
label="count_netstat_wait_connections"
# Metric value: the number of connections in a *WAIT state.
count_netstat_wait_connections=$(netstat -an | grep -i wait | wc -l)

# Push the sample to Pushgateway under job=<instance_name>.
echo "$label $count_netstat_wait_connections" | curl --data-binary @- "http://10.50.51.30:9091/metrics/job/${instance_name}"
```

Then make the script executable, for example with `chmod +x pushgateway.sh`, and run it
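The script above uses the host name as the `job` segment of the push URL. Pushgateway also accepts further label pairs after the job in the path, so one possible variation is to use a fixed job name and carry the host name in an `instance` label. The sketch below shows that layout; `pushgateway_url` and `push_metric` are hypothetical helpers, and the job name `netstat` is an assumption.

```shell
# pushgateway_url BASE JOB [INSTANCE]: build the push URL for a grouping
# key made of a job name and an optional instance label (hypothetical).
pushgateway_url() {
  local base=${1%/} job=$2 instance=${3:-}
  local url="$base/metrics/job/$job"
  if [ -n "$instance" ]; then
    url="$url/instance/$instance"
  fi
  printf '%s\n' "$url"
}

# push_metric URL NAME VALUE: send one sample in the text exposition
# format to the given push URL (hypothetical helper).
push_metric() {
  printf '%s %s\n' "$2" "$3" | curl --silent --show-error --fail --data-binary @- "$1"
}

# Print the URL that a host named host1 would push to.
pushgateway_url http://10.50.51.30:9091 netstat host1
```

A push would then look like `push_metric "$(pushgateway_url http://10.50.51.30:9091 netstat host1)" count_netstat_wait_connections 12`, where the value 12 is only an example.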

Then open Prometheus or Grafana to view the data

http://10.50.51.30:9090/graph?

Query the metric `count_netstat_wait_connections`

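Switching to pushing from Java

Pushing to Pushgateway from Java

Option 1: push to the gateway

Pushing means too much data. Each push is an HTTP request, which is acceptable for physical-machine metrics, but the volume of user data is too large. At present, user data is reported in real time over gRPC, while UDP data is reported periodically.

Option 2: write to Redis

Option 3: write to local logs

Option 4: consume Kafka when Prometheus scrapes

Return once a large batch has been fetched.

The edge nodes already run Prometheus, so it is unclear why everything has to be centralized on the central node (a management decision I disagree with).

After discussing with colleagues: Prometheus is suited to storing monitoring metrics, not to recording every individual record. It periodically records the instantaneous state of the monitored targets. If you want it to store complete records, does it even have a translog? Its storage is fed by pulling, so it is not suitable for use as a database.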

---------- The End ----------

## Docker Deployment

To be continued

## References

Official site: https://prometheus.io/download/

CSDN: https://blog.csdn.net/weixin_44352521/article/details/127947313

https://blog.csdn.net/MssGuo/article/details/127599745

Pushing from Java: https://blog.csdn.net/qq_21389711/article/details/125183313
