Fetching Prometheus Monitoring Data with Python


1 Fetching Prometheus target data

Call http://<prometheus.address>/api/v1/targets and parse the JSON response.

import requests


def getTargetsStatus(address):
    # Query the targets endpoint; the timeout keeps the call from hanging forever.
    url = address + '/api/v1/targets'
    response = requests.get(url, timeout=10)
    if response.status_code == 200:
        targets = response.json()['data']['activeTargets']
        aliveNum, totalNum = 0, 0
        downList = []
        for target in targets:
            totalNum += 1
            if target['health'] == 'up':
                aliveNum += 1
            else:
                # Collect the instance label of every target that is not up.
                downList.append(target['labels']['instance'])
        print('-----------------------TargetsStatus--------------------------')
        print(str(aliveNum) + ' in ' + str(totalNum) + ' Targets are alive !!!')
        print('--------------------------------------------------------------')
        # Print the down targets in bold red (ANSI escape codes).
        for down in downList:
            print('\033[31m\033[1m' + down + '\033[0m' + ' down !!!')
        print('-----------------------TargetsStatus--------------------------')
    else:
        print('\033[31m\033[1m' + 'Get targets status failed!' + '\033[0m')
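
To check the parsing logic without a running server, the counting step can be separated from the HTTP call. The following is only a minimal sketch: `summarizeTargets` and the sample payload are hypothetical, modelled on the `data.activeTargets` structure used above.

```python
def summarizeTargets(payload):
    """Return (aliveNum, totalNum, downList) for a /api/v1/targets JSON payload."""
    targets = payload.get('data', {}).get('activeTargets', [])
    # Anything whose health is not 'up' (for example 'down' or 'unknown') counts as down.
    downList = [t['labels'].get('instance', 'unknown')
                for t in targets if t.get('health') != 'up']
    return len(targets) - len(downList), len(targets), downList


# Hand-written sample payload (illustrative only, not real monitoring data).
samplePayload = {
    'status': 'success',
    'data': {'activeTargets': [
        {'labels': {'instance': 'node1:9100'}, 'health': 'up'},
        {'labels': {'instance': 'node2:9100'}, 'health': 'down'},
        {'labels': {'instance': 'node3:9100'}, 'health': 'unknown'},
    ]},
}

alive, total, down = summarizeTargets(samplePayload)
print('%d in %d targets are alive, down: %s' % (alive, total, down))
```

Keeping the summary as a pure function makes it easy to reuse, for instance to return the counts to a caller instead of printing them.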

2 Fetching Prometheus monitoring data (CPU, memory, disks)

Call http://<prometheus.address>/api/v1/query?query=<expr> and parse the JSON response, where expr is a Prometheus query expression (PromQL).
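
Because a PromQL expression often contains characters such as `>`, `=`, `{` or spaces, it should be URL-encoded before it is placed in the query string. Below is a minimal sketch using only the standard library; `buildQueryUrl` and the `http://localhost:9090` address are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def buildQueryUrl(address, expr):
    """Build an instant-query URL with expr percent-encoded."""
    return address.rstrip('/') + '/api/v1/query?' + urlencode({'query': expr})


expr = 'node:cpu_usage:ratio>=60'
url = buildQueryUrl('http://localhost:9090', expr)
print(url)

# Decoding the query string again should give back the original expression.
decoded = parse_qs(urlsplit(url).query)['query'][0]
print(decoded == expr)
```

When the request is sent with requests, passing `params={'query': expr}` achieves the same encoding without building the URL by hand.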

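import time

import requests

### Empty dicts holding CPU, memory, and disk usage records
diskUsageDict = {}
cpuUsageDict = {}
memUsageDict = {}
### Sampling interval (s)
monitorInterval = 5
### How long usage must stay above the threshold before alerting (s)
diskAlertTime = 5
cpuAlertTime = 300
memAlertTime = 300
### Alert thresholds (%)
diskThreshold = 80
cpuThreshold = 60
memThreshold = 70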
def queryUsage(address, expr):
    # Pass expr via params so that characters such as '>=' are URL-encoded.
    url = address + '/api/v1/query'
    try:
        return requests.get(url, params={'query': expr}, timeout=10).json()
    except Exception as e:
        print(e)
        return {}
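def orderUsageDict(usageDict, currentTime, monitorInterval):
    '''
    :param usageDict: resource usage dict
    :param currentTime: timestamp of the current sample
    :param monitorInterval: sampling interval
    :return:
    :description: drop entries that did not stay above the threshold continuously
    '''
    for key in list(usageDict.keys()):
        # An entry that was not refreshed in this round has fallen below the threshold.
        if currentTime - usageDict[key][1] >= monitorInterval:
            usageDict.pop(key)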
def getCurrentUsageGreater(address, record, threshold, usageDict, monitorInterval):
    '''
    :param address: Prometheus address
    :param record: Prometheus recording rule name
    :param threshold: threshold
    :param usageDict: resource usage dict
    :param monitorInterval: sampling interval
    :return:
    :description: collect the series whose usage is at or above the threshold
    '''
    expr = record + '>=' + str(threshold)
    usage = queryUsage(address=address, expr=expr)
    # Fall back to the local clock so that stale entries are still dropped when
    # the query returns nothing (assumes local and Prometheus clocks roughly agree).
    currentTime = time.time()
    if 'data' in usage and usage['data']['result']:
        for metric in usage['data']['result']:
            instance = metric['metric']['instance']
            if record in ('node:fs_usage:ratio', 'node:fs_root_usage:ratio'):
                # Filesystem series are keyed per mount point.
                metricLabel = instance + ':' + metric['metric']['mountpoint']
            else:
                metricLabel = instance
            # Each sample is [unix timestamp, value as a string].
            utctime = metric['value'][0]
            value = metric['value'][1]
            describe = record.split(':')[1]
            if metricLabel not in usageDict:
                # Tuple layout: (first seen, last seen, description, value)
                usageDict[metricLabel] = (utctime, utctime, describe, value)
            else:
                startTime = usageDict[metricLabel][0]
                usageDict[metricLabel] = (startTime, utctime, describe, value)
            currentTime = utctime
    orderUsageDict(usageDict=usageDict, currentTime=currentTime, monitorInterval=monitorInterval)
def printUsageDict(usageDict, alertTime):
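    '''
    :param usageDict: resource usage dict
    :param alertTime: how long usage must stay above the threshold before alerting
    :return:
    :description: print the entries that have exceeded the alert time
    '''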
    for key, value in usageDict.items():
        deltaT = value[1] - value[0]
        if deltaT >= alertTime:
            print(key + ' ----- ' + value[2] + '\033[31m\033[1m ' + str(value[3]) + '\033[0m ----- lasted for\033[31m\033[1m %.2f \033[0mseconds' % deltaT)
def monitorUsageGreater(address):
    '''
    :param address: Prometheus address
    :return:
    :description: monitor continuously and print the results
    '''
    while True:
        getCurrentUsageGreater(address, 'node:fs_usage:ratio', diskThreshold, diskUsageDict, monitorInterval)
        printUsageDict(diskUsageDict, alertTime=diskAlertTime)
        # Memory is checked against memThreshold and CPU against cpuThreshold.
        getCurrentUsageGreater(address, 'node:memory_usage:ratio', memThreshold, memUsageDict, monitorInterval)
        printUsageDict(memUsageDict, alertTime=memAlertTime)
        getCurrentUsageGreater(address, 'node:cpu_usage:ratio', cpuThreshold, cpuUsageDict, monitorInterval)
        printUsageDict(cpuUsageDict, alertTime=cpuAlertTime)
        time.sleep(monitorInterval)
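
The bookkeeping above can be tried offline by feeding it hand-made samples instead of live query results. The following is a minimal sketch under that assumption: `updateUsage`, `overdue`, and the sample timestamps are hypothetical and only mirror the (first seen, last seen) idea of `usageDict`.

```python
def updateUsage(usageDict, samples, currentTime, monitorInterval):
    """Record samples that are above the threshold and drop entries not seen this round.

    samples maps a label to its value; all samples share currentTime.
    """
    for label, value in samples.items():
        startTime = usageDict[label][0] if label in usageDict else currentTime
        usageDict[label] = (startTime, currentTime, value)
    for label in list(usageDict):
        # Not refreshed in this round: the series is no longer above the threshold.
        if currentTime - usageDict[label][1] >= monitorInterval:
            usageDict.pop(label)


def overdue(usageDict, alertTime):
    """Return the labels that stayed above the threshold for at least alertTime seconds."""
    return sorted(k for k, v in usageDict.items() if v[1] - v[0] >= alertTime)


tracker = {}
updateUsage(tracker, {'node1': 91.0, 'node2': 85.0}, currentTime=0, monitorInterval=5)
updateUsage(tracker, {'node1': 92.5}, currentTime=5, monitorInterval=5)  # node2 recovered
updateUsage(tracker, {'node1': 93.0}, currentTime=10, monitorInterval=5)
print(overdue(tracker, alertTime=10))
```

In this run node2 is dropped as soon as it is missing from one round, while node1 is reported once it has been above the threshold for the full alert time.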
To re-export data fetched from the Prometheus API, the Prometheus Python client library can be combined with requests. Note that prometheus_client does not query Prometheus itself: the query is still sent with requests, and the client library only holds the value in a metric and pushes it to a Pushgateway. The basic steps are:

1. Install the libraries with pip:
```
pip install prometheus_client requests
```
2. Import the required libraries (prometheus_client and requests):
```python
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
import requests
```
3. Create a CollectorRegistry object to hold the metrics:
```python
registry = CollectorRegistry()
```
4. Create a Gauge object for the metric (for example, cpu_usage):
```python
cpu_usage = Gauge('cpu_usage', 'CPU usage percentage', registry=registry)
```
5. Fetch the metric data from the Prometheus API with requests and store it in the gauge:
```python
response = requests.get('http://prometheus-server/api/v1/query', params={'query': 'cpu_usage'})
data = response.json()
cpu_usage_value = data['data']['result'][0]['value'][1]
# Sample values arrive as strings, so convert before setting the gauge.
cpu_usage.set(float(cpu_usage_value))
```
6. Push the metric to the Prometheus Pushgateway with push_to_gateway:
```python
push_to_gateway('prometheus-pushgateway:9091', job='my_job', registry=registry)
```

These steps can be adapted to the monitoring data that is actually needed.
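
Indexing `result[0]` directly raises an `IndexError` when the query matches no series. The following is a minimal sketch of a more defensive extraction; `firstSampleValue` and both sample payloads are hypothetical.

```python
def firstSampleValue(payload):
    """Return the first instant-vector sample as a float, or None if there is none."""
    result = payload.get('data', {}).get('result', [])
    if not result:
        return None
    # Each sample is [unix timestamp, value as a string]; convert the value.
    return float(result[0]['value'][1])


# Hand-written payloads (illustrative only, not real monitoring data).
okPayload = {'status': 'success', 'data': {'resultType': 'vector', 'result': [
    {'metric': {'instance': 'node1:9100'}, 'value': [1700000000.0, '42.5']}]}}
emptyPayload = {'status': 'success', 'data': {'resultType': 'vector', 'result': []}}

print(firstSampleValue(okPayload), firstSampleValue(emptyPayload))
```

Returning None lets the caller decide whether a missing series should be skipped or reported.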