Python (3): Fetching Prometheus Data (Peak RPS, Response Time, and Other Monitoring Metrics) and Saving It to a Local CSV File

一闪一闪亮晶晶~

Last modified: 2022-11-25 17:41:22

Category: python. Tags: grafana, prometheus, python

First published: 2022-11-10 15:00:33

Article link: https://blog.csdn.net/weixin_42221654/article/details/127789033

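Background

Before every major sales promotion we have to collect performance figures for every interface of every service: requests per second, 99th-percentile response time, and CPU usage. There are many services and many interfaces, and the work is highly repetitive; one person working nonstop needs at least three days to finish it. I wrote this automated statistics script to save the performance testing team time and labor.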

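Strategy

Do not extract the data from Grafana dashboards. Query the Prometheus metrics directly instead; there is no need to detour through Grafana.

Reference: the official Prometheus HTTP API documentation

The API address used in this project is 172.16.xx.xx:9090 (ask your company's operations team for yours).

Note: this is a production endpoint with rate limiting in place, so keep the time span and frequency of your queries modest. The smaller the query, the faster it returns.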
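
One way to keep each request small is to cut a long time range into shorter windows and pause between requests. The sketch below is only an illustration of that idea: the helper names `split_time_range` and `throttled`, the 30-minute window, and the pause length are assumptions, not values taken from the production limits.

```python
import time
from typing import Iterable, Iterator, Tuple


def split_time_range(start: int, end: int, max_span: int) -> Iterator[Tuple[int, int]]:
    """Yield consecutive (window_start, window_end) pairs covering [start, end].

    Adjacent windows share their boundary timestamp. Prometheus treats both
    start and end as inclusive, so a sample at a boundary can appear twice.
    """
    if max_span <= 0:
        raise ValueError("max_span must be positive")
    window_start = start
    while window_start < end:
        window_end = min(window_start + max_span, end)
        yield window_start, window_end
        window_start = window_end


def throttled(windows: Iterable[Tuple[int, int]], pause_seconds: float = 1.0):
    """Yield each window, sleeping before every window except the first."""
    for index, window in enumerate(windows):
        if index > 0:
            time.sleep(pause_seconds)
        yield window


# 2022-11-11 00:55:00 to 02:00:00 (UTC+8) in 30-minute windows (assumed size)
for window in throttled(split_time_range(1668099300, 1668103200, 1800), pause_seconds=0):
    print(window)
```

Each printed pair can then be used as the start and end of one range query.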

Prometheus API

Prometheus exposes its stable HTTP API under /api/v1, and responses are in JSON. Some commonly used stable endpoints are listed below:

1. Instant queries

GET /api/v1/query
POST /api/v1/query

Parameters:

query=<string> The Prometheus expression query string.
time=<rfc3339 | unix_timestamp> Evaluation timestamp. Optional; defaults to the current server time.
timeout=<duration> Evaluation timeout. Optional; defaults to and is capped by the value of the -query.timeout flag.

Examples:
curl 'http://localhost:9090/api/v1/query?query=up'
curl 'http://localhost:9090/api/v1/query?query=up&time=2022-11-11T00:00:00.781Z'
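
An instant query returns a JSON envelope with a `status` field and, on success, a `data` object whose `resultType` is typically `vector`. The following is a minimal sketch of reading such a response; the helper name `parse_instant_vector` and the sample payload (labels and values) are made up for illustration, while the envelope layout follows the Prometheus API documentation.

```python
from typing import Dict, List, Tuple


def parse_instant_vector(payload: dict) -> List[Tuple[Dict[str, str], float, float]]:
    """Return (labels, timestamp, value) for each series of an instant query."""
    if payload.get("status") != "success":
        # Error responses carry "errorType" and "error" instead of "data"
        raise ValueError(f"{payload.get('errorType')}: {payload.get('error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample is [unix_timestamp, "value as string"]
    return [
        (series["metric"], float(series["value"][0]), float(series["value"][1]))
        for series in data["result"]
    ]


# Hypothetical response for query=up
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
             "value": [1668124800.781, "1"]},
            {"metric": {"__name__": "up", "job": "node", "instance": "localhost:9100"},
             "value": [1668124800.781, "0"]},
        ],
    },
}

for labels, timestamp, value in parse_instant_vector(sample):
    print(labels["instance"], timestamp, value)
```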

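2. Range queries

GET /api/v1/query_range
POST /api/v1/query_range

Parameters:

query=<string> The Prometheus expression query string.
start=<rfc3339 | unix_timestamp> Start timestamp, inclusive.
end=<rfc3339 | unix_timestamp> End timestamp, inclusive.
step=<duration | float> Query resolution step width, given as a duration or as a float number of seconds.
timeout=<duration> Evaluation timeout. Optional; defaults to and is capped by the value of the -query.timeout flag.

Example (-G with --data-urlencode keeps curl from interpreting the braces and encodes the expression):
curl -G 'http://localhost:9090/api/v1/query_range' --data-urlencode 'query=up{instance=~"172.19.xx.xx:9100"}' -d 'start=1668099300' -d 'end=1668103200' -d 'step=15' -d 'timeout=40s'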
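
A range query returns a `matrix`: one entry per series, each with a list of `[timestamp, "value"]` pairs. The sketch below estimates how many points a query will return per series and extracts the peak value across all series. The function names and the sample payload are illustrative assumptions; the point-count formula follows from both `start` and `end` being inclusive.

```python
import math
from typing import Optional


def expected_points(start: int, end: int, step: int) -> int:
    """Number of samples per series for an inclusive [start, end] range."""
    return (end - start) // step + 1


def peak_of_matrix(payload: dict) -> Optional[float]:
    """Return the largest non-NaN value over every series, or None if empty."""
    peak = None
    for series in payload["data"]["result"]:
        for _timestamp, raw_value in series["values"]:
            value = float(raw_value)  # values arrive as strings
            if math.isnan(value):
                continue
            if peak is None or value > peak:
                peak = value
    return peak


# Hypothetical response with two series
sample = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {"instance": "a"}, "values": [[1668099300, "1.5"], [1668099315, "NaN"]]},
            {"metric": {"instance": "b"}, "values": [[1668099300, "3.25"], [1668099315, "2"]]},
        ],
    },
}

print(expected_points(1668099300, 1668103200, 15))
print(peak_of_matrix(sample))
```

For the curl example above, a 3900-second range at a 15-second step gives 261 points per series. Prometheus rejects range queries that exceed 11,000 points per series, so a check like `expected_points` can be used to pick a larger step before sending the request.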

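3. Querying metadata (finding series)

GET /api/v1/series
POST /api/v1/series

Parameters:

match[]=<series_selector> Repeatable series selector argument that selects the series to return. At least one match[] argument must be provided.
start=<rfc3339 | unix_timestamp> Start timestamp, inclusive.
end=<rfc3339 | unix_timestamp> End timestamp, inclusive.

Example:
curl -g 'http://localhost:9090/api/v1/series?match[]=up'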
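
Because `match[]` may be repeated and selectors usually contain braces and quotes, it helps to let a library build the query string. This is a small sketch using only the standard library; `series_query_string` is a hypothetical helper name, and the second selector is just an example.

```python
from typing import Optional, Sequence
from urllib.parse import parse_qs, urlencode


def series_query_string(selectors: Sequence[str],
                        start: Optional[int] = None,
                        end: Optional[int] = None) -> str:
    """Build the query string for /api/v1/series with repeated match[] keys."""
    if not selectors:
        raise ValueError("at least one match[] selector is required")
    # A list of pairs lets the same key appear more than once
    pairs = [("match[]", selector) for selector in selectors]
    if start is not None:
        pairs.append(("start", str(start)))
    if end is not None:
        pairs.append(("end", str(end)))
    return urlencode(pairs)


query_string = series_query_string(
    ["up", 'process_start_time_seconds{job="prometheus"}'],
    start=1668099300,
    end=1668103200,
)
print("http://localhost:9090/api/v1/series?" + query_string)
# Decoding shows that both selectors survived the round trip
print(parse_qs(query_string)["match[]"])
```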

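4. Getting label names

GET /api/v1/labels
POST /api/v1/labels

Parameters:

match[]=<series_selector> Repeatable series selector argument that selects the series from which to read the label names. Optional.
start=<rfc3339 | unix_timestamp> Start timestamp, inclusive. Optional.
end=<rfc3339 | unix_timestamp> End timestamp, inclusive. Optional.

Example:
curl 'localhost:9090/api/v1/labels'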

A simple code example

import time

import requests


def time_format(tt):
    # Parse a "YYYY-MM-DD HH:MM:SS" string into a struct_time
    time_array = time.strptime(tt, "%Y-%m-%d %H:%M:%S")
    # Convert to a Unix timestamp (interpreted in the machine's local time zone)
    timestamp = time.mktime(time_array)
    return int(timestamp)


def get_data(sql):
    url = 'http://172.16.xx.xx:9090/api/v1/query_range'
    # Pass the expression through params so that requests URL-encodes it
    params = {
        'query': sql,
        'start': time_format(start_time),
        'end': time_format(end_time),
        'step': 30,
        'timeout': '40s',
    }
    res = requests.get(url=url, params=params)
    return res.json()


if __name__ == '__main__':
    start_time = '2022-11-11 00:55:00'
    end_time = '2022-11-11 02:00:00'
    # service (job), hosts joined by "|", interface path
    val = ["xxx-service","xxx.xx.xx.xx|xxx.xx.xx.xx","/capi/xxx/xx/xx/xx/xx/x"]
    rps_sql = 'sum(rate(al_meter_request_counter_total{job="' + val[0] + '", instance=~"(' + val[1] + ')", path=~"' + val[2] + '",code=~".*"}[2m]))'
    p99rt_sql = 'max(al_meter_request_summary{job="' + val[0] + '", instance=~"(' + val[1] + ')", path=~"' + val[2] + '",quantile="0.99"}) by(instance,path)'
    cpu_sql = '1 - avg(irate(node_cpu_seconds_total{instance=~"' + val[1].replace("|",":9100|") + ':9100' + '",mode="idle"}[30m])) by (instance)'
    get_data(rps_sql)
    get_data(p99rt_sql)
    get_data(cpu_sql)

Note: the code above is only a minimal usage example; it does not save the metric data it retrieves.
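
One detail worth spelling out: `time.mktime` interprets the time string in the local time zone of the machine running the script, so the same string yields different timestamps on differently configured hosts. The sketch below pins the zone explicitly. UTC+8 is an assumption here, chosen because it reproduces the `start=1668099300` and `end=1668103200` values used in the range-query curl example; `to_unix` and `ASSUMED_TZ` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the times in this article are written in UTC+8
ASSUMED_TZ = timezone(timedelta(hours=8))


def to_unix(time_text: str, tz: timezone = ASSUMED_TZ) -> int:
    """Convert "YYYY-MM-DD HH:MM:SS" in the given zone to a Unix timestamp."""
    naive = datetime.strptime(time_text, "%Y-%m-%d %H:%M:%S")
    return int(naive.replace(tzinfo=tz).timestamp())


print(to_unix("2022-11-11 00:55:00"))
print(to_unix("2022-11-11 02:00:00"))
```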

Complete project code

This version saves the queried data to a CSV file.

import csv
import time

import requests


def time_format(tt):
    # Parse a "YYYY-MM-DD HH:MM:SS" string into a struct_time
    time_array = time.strptime(tt, "%Y-%m-%d %H:%M:%S")
    # Convert to a Unix timestamp (interpreted in the machine's local time zone)
    timestamp = time.mktime(time_array)
    return int(timestamp)


def clear_csv(file):
    # Mode "w" empties the file and also creates it if it does not exist yet
    with open(file, "w"):
        pass


def get_data(sql):
    url = 'http://172.16.xx.xx:9090/api/v1/query_range'
    # Pass the expression through params so that requests URL-encodes it
    params = {
        'query': sql,
        'start': time_format(start_time),
        'end': time_format(end_time),
        'step': 30,
        'timeout': '40s',
    }
    res = requests.get(url=url, params=params)
    res.raise_for_status()
    vmax = 0
    # Peak over every returned series; an empty result yields 0
    for series in res.json()["data"]["result"]:
        for value in series["values"]:
            vd = float(value[1])
            if vd >= vmax:
                vmax = vd
    return round(vmax, 2)


def collect_flow_data():
    header = ['service', 'host', 'interface', 'RPS', 'P50RT', 'P99RT', 'MaxRT', 'CPU']
    with open("new_flow_data.csv", "a+", encoding='utf-8', newline='') as wf:
        csv_writer = csv.writer(wf)
        csv_writer.writerow(header)  # write the header row
        # Each row of interfacelist.csv: service (job), hosts joined by "|", interface path
        with open('interfacelist.csv', encoding='utf-8') as rf:
            reader = csv.reader(rf)
            for val in reader:
                rps_sql = 'sum(rate(al_meter_request_counter_total{job="' + val[0] + '", instance=~"(' + val[1] + ')", path=~"' + val[2] + '",code=~".*"}[2m]))'
                # Note: this is sum/count, i.e. the mean response time over 2m, reported in the P50RT column
                p50rt_sql = 'sum(increase(al_meter_request_summary_sum{job="' + val[0] + '",path=~"' + val[2] + '"}[2m])) by (path)/sum(increase(al_meter_request_summary_count{job="' + val[0] + '",path=~"' + val[2] + '"}[2m]))  by (path)'
                p99rt_sql = 'max(al_meter_request_summary{job="' + val[0] + '", instance=~"(' + val[1] + ')", path=~"' + val[2] + '",quantile="0.99"}) by(instance,path)'
                # Plain spaces instead of "+": the expression is now URL-encoded by requests
                maxrt_sql = 'max(al_meter_request_summary_max{job="' + val[0] + '", instance=~"(' + val[1] + ')", path=~"' + val[2] + '"}) by (path)'
                cpu_sql = '1 - avg(irate(node_cpu_seconds_total{instance=~"' + val[1].replace("|",":9100|") + ':9100' + '",mode="idle"}[30m])) by (instance)'

                m = [val[0], val[1], val[2], get_data(rps_sql), get_data(p50rt_sql), get_data(p99rt_sql), get_data(maxrt_sql), "%.2f%%" % (get_data(cpu_sql) * 100)]
                csv_writer.writerow(m)


if __name__ == '__main__':
    start_time = '2022-11-11 00:55:00'
    end_time = '2022-11-11 02:00:00'
    clear_csv("new_flow_data.csv")
    collect_flow_data()
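
The script reads its targets from interfacelist.csv, whose contents are not shown in this article. Judging from how `val[0]`, `val[1]`, and `val[2]` are used, each row appears to hold a service (job) name, one or more hosts joined by `|`, and an interface path. The sketch below uses made-up rows to show how two of the PromQL expressions are derived from a row; `SAMPLE_INTERFACE_LIST`, `node_exporter_instances`, and `build_queries` are hypothetical names, and the query text mirrors the strings built in the script.

```python
import csv
import io

# Hypothetical rows: service, hosts joined by "|", interface path
SAMPLE_INTERFACE_LIST = (
    "order-service,10.0.0.1|10.0.0.2,/capi/order/create\n"
    "user-service,10.0.0.3,/capi/user/.*\n"
)


def node_exporter_instances(hosts: str, port: int = 9100) -> str:
    """Append the node exporter port to every host in a "|"-joined list."""
    return "|".join(f"{host}:{port}" for host in hosts.split("|"))


def build_queries(job: str, hosts: str, path: str) -> dict:
    """Build the RPS and CPU expressions for one row of the interface list."""
    rps = ('sum(rate(al_meter_request_counter_total{'
           f'job="{job}", instance=~"({hosts})", path=~"{path}",code=~".*"'
           '}[2m]))')
    cpu = ('1 - avg(irate(node_cpu_seconds_total{'
           f'instance=~"{node_exporter_instances(hosts)}",mode="idle"'
           '}[30m])) by (instance)')
    return {"RPS": rps, "CPU": cpu}


for job, hosts, path in csv.reader(io.StringIO(SAMPLE_INTERFACE_LIST)):
    for name, promql in build_queries(job, hosts, path).items():
        print(name, promql)
```

Printing the expressions this way before a run is a cheap check that the label matchers look right, without sending any request to the production Prometheus.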