Setting Up Cross-Platform Application Process CPU and Memory Monitoring

1. Preface

As technology advances and the internet develops, whatever can be handled online is handled online: install an app client, connect it to a backend service, and as long as you have a network you are set. Convenient, fast, hassle-free. The flip side is that PCs accumulate more and more installed applications, and system resources grow scarce. This places demands on the applications themselves: besides delivering their features, their performance also deserves attention.

2. Purpose

Against that background, the purpose of this article is easy to state: how to monitor the performance of application processes. Such monitoring is usually tied to the business. The metrics are simple, the same as server-side metrics or even simpler: CPU and memory usage, for example. The scenarios are the core business scenarios. Say an application offers antivirus and baseline scan-and-repair features; then you want to watch its processes' CPU and memory usage while it scans for viruses or runs a baseline scan and repair. Stability is the other concern: when no business operations are running, what effect does leaving the application open for a long time have on the system?

3. Implementation

With the goal clear, the question is how to achieve it. This article uses a popular, widely adopted monitoring stack that has complete documentation and works across platforms: Prometheus + Grafana. The tools are ready-made; only data collection differs somewhat from the usual server setup. The approach here is to collect data on the PC client and actively push it to a Prometheus Pushgateway, from which Prometheus scrapes it and Grafana finally displays it.
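Before the full script in section 3.2.1, here is a minimal sketch of that push flow with the prometheus_client library (the Pushgateway address is the one used later in this article; the metric name is made up purely for illustration):

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
# 'demo_metric' is a hypothetical name, not one of the metrics used later
g = Gauge('demo_metric', 'minimal push example', registry=registry)
g.set(42)
# One push; Prometheus scrapes the Pushgateway on its own schedule
push_to_gateway('10.90.21.12:9091', job='demo', registry=registry)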

The final result looks like this:

[Screenshot: Grafana dashboard showing per-process CPU and memory usage]

3.1. Environment preparation

A Python environment, plus a Prometheus + Grafana server.

3.2. Steps

First, use the prometheus_client library from Python to push data to the Prometheus Pushgateway. The script below collects the CPU and memory usage of local application processes and pushes the samples to the Pushgateway.

3.2.1. Python script

# -*- coding: utf-8 -*-
# @Time    : 2024-3-30
# @Author  : zhh
# @Version :
# @File    : app_perf.py
# @Software: PyCharm

# pip install psutil matplotlib prometheus_client
import time
import threading
import psutil
# import matplotlib.pyplot as plt
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
import os
from datetime import datetime
import platform
import socket

# Gather basic information about the current host
def get_local_ip():
    system_type = platform.system()   # current OS type
    fqdn = socket.getfqdn()           # current host name
    try:
        # Create a UDP socket and "connect" it to a remote host (Google's
        # DNS server here) just to learn which local IP the OS would route
        # through; a UDP connect sends no actual packets.
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.connect(("8.8.8.8", 80))
        local_ip = s.getsockname()[0]
        s.close()
    except Exception as e:
        print("Failed to determine local IP:", e)
        local_ip = "127.0.0.1"        # fall back so callers can still unpack
    return fqdn, system_type, local_ip
def push_metrics(cpu, mem, process):
    fqdn, system_type, local_ip = get_local_ip()
    registry = CollectorRegistry()
    if system_type == "Linux":
        process = process + "Linux"
    # Gauge for CPU usage
    cpu_gauge = Gauge('cpu_usage', 'get cpu usage',
                      ['processName', 'platform', 'instance', 'hostname', 'hostIP'],
                      registry=registry)
    cpu_gauge.labels(processName=process, platform=system_type, instance="",
                     hostname=fqdn, hostIP=local_ip).set(cpu)
    # Gauge for memory usage
    mem_gauge = Gauge('memory_usage', 'get mem usage',
                      ['processName', 'platform', 'instance', 'hostname', 'hostIP'],
                      registry=registry)
    mem_gauge.labels(processName=process, platform=system_type, instance="",
                     hostname=fqdn, hostIP=local_ip).set(mem)

    push_to_gateway('10.90.21.12:9091', job=process, registry=registry)
def get_process_info(process_name):
    pid = None
    num_cores = psutil.cpu_count(logical=True)
    for proc in psutil.process_iter(['pid', 'name']):
        if proc.info['name'] == process_name:
            pid = proc.info['pid']
            break

    if pid is not None:
        process = psutil.Process(pid)
        # Percent of a single core, sampled over a 1-second window
        cpu_percent = process.cpu_percent(interval=1)
        # total_cpu_percent = cpu_percent / num_cores
        # memory_percent = process.memory_percent()
        # Memory figures are reported in bytes
        memory_info = process.memory_info()
        # RSS is the resident (physical) memory; convert bytes to MB
        memory_in_mb = memory_info.rss / (1024 * 1024)
        return cpu_percent, memory_in_mb
    else:
        return None, None
# Appends the timestamp, CPU, and memory values to <process>_perf.txt in the
# working directory (kept for reference; pushing to the gateway is used instead)
# def write_file(process, cpu, mem):
#     # Absolute path of the current working directory
#     current_path = os.path.abspath(os.getcwd())
#     file_path = os.path.join(current_path, process + "_perf.txt")
#     # Current date and time, formatted
#     current_datetime = datetime.now()
#     formatted_time = current_datetime.strftime("%Y-%m-%dT%H:%M:%S")
#
#     # Open the file (created automatically if missing) and append the sample
#     with open(file_path, 'a') as file:
#         content = formatted_time + " " + str(cpu) + " " + str(mem) + "\n"
#         file.write(content)


# Task executed by each worker thread: sample one process and push the result
def task(process):
    cpu_percent, memory_in_mb = get_process_info(process)
    print("*****", process, cpu_percent, memory_in_mb)
    if cpu_percent is not None:
        push_metrics(cpu_percent, memory_in_mb, process)
        # write_file(process, cpu_percent, memory_in_mb)


if __name__ == '__main__':
    fqdn, system_type, local_ip = get_local_ip()
    if system_type == "Darwin":
        # macOS: replace with the process names you want to monitor
        process_name = ["CAZeroTrust", "com.chiansecurity.caztpmac.helper", "CASAviraService",
                        "FileService", "NetAccess", "CASBaseEndpointSecurity"]
    elif system_type == "Windows":
        process_name = ["caztpaui.exe", "caztpasvc.exe", "caztpawh.exe", "caztpAV.exe", "caztpasw.exe"]
    elif system_type == "Linux":
        process_name = ["caztp", "CAZeroTrust", "catray"]
    else:
        process_name = []  # unsupported platform: nothing to monitor

    while True:
        # Create one thread per monitored process and start them all
        threads = []
        for name in process_name:
            t = threading.Thread(target=task, args=(name,))
            threads.append(t)
            t.start()

        # Wait for all threads to finish before the next sampling round
        for t in threads:
            t.join()

        # print("All threads have finished.")
        # break
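To confirm that samples are actually arriving, you can read back the Pushgateway's metrics endpoint; a minimal check, assuming the same Pushgateway address as in the script:

import urllib.request

# Print every cpu_usage / memory_usage sample the Pushgateway currently holds
with urllib.request.urlopen("http://10.90.21.12:9091/metrics") as resp:
    for line in resp.read().decode("utf-8").splitlines():
        if line.startswith(("cpu_usage", "memory_usage")):
            print(line)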

3.2.2. Setting up the Prometheus + Grafana server

A macOS machine is used here. After installing the Docker desktop application, the docker-compose component is available out of the box. Prepare the configuration files, then start everything directly with docker-compose.

Only two configuration files need attention: docker-compose.yml and prometheus.yml. The docker-compose.yml is as follows:

version: "3"
services:
  prometheus:
    image: prom/prometheus:v2.36.2
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./data/prometheus_data:/prometheus
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:9.0.1
    container_name: grafana
    volumes:
      - ./data/grafana_data:/var/lib/grafana
      #- ./grafana/provisioning:/etc/grafana/provisioning
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=hogwarts
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - "3000:3000"

  influxdb:
    image: influxdb:1.8.10
    container_name: influxdb
    ports:
      - "8086:8086"
    volumes:
      - ./data/influxdb_data:/var/lib/influxdb
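Note that the script in section 3.2.1 pushes to port 9091 and the prometheus.yml below scrapes a pushgateway:9091 target, but the compose file above does not define that service. A sketch of the missing service to add under services: (the image tag is an assumption):

  pushgateway:
    # Required by the push-based flow described above; pick any recent tag
    image: prom/pushgateway:v1.4.3
    container_name: pushgateway
    ports:
      - "9091:9091"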

The prometheus.yml file content is as follows:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: redis_exporter
    static_configs:
      - targets:
        - 10.1.1.11:9121

  - job_name: node_exporter
    static_configs:
      - targets:
        - 10.1.1.11:9100
        
  - job_name: pushgateway
    # Keep the job/instance labels set by the pushing clients instead of
    # overwriting them at scrape time
    honor_labels: true
    static_configs:
      - targets:
        - pushgateway:9091
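Before starting the stack, the file can be validated with Prometheus's bundled promtool: promtool check config prometheus.yml.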

Start docker-compose

Create a folder, put the files above into it, and also create an empty data folder for storage, e.g. "/docker/monitoring".

[Screenshot: contents of the /docker/monitoring folder]

cd /docker/monitoring  # enter the folder
docker-compose up -d   # one-command start; needs the data folder and correct docker-compose.yml and prometheus.yml
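docker-compose ps can then be used to confirm that all containers are running.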

Once startup succeeds, open "http://10.90.21.12:9090/targets?search=" in a browser:

[Screenshot: Prometheus targets page]
Open "http://10.90.21.12:9091/" in a browser to see the data pushed by each client:

[Screenshots: Pushgateway web UI showing the pushed job groups and their metrics]
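The Prometheus HTTP API offers an end-to-end check that the pushed series are queryable; a small sketch in Python, assuming the server address used above:

import json
import urllib.request

# Ask Prometheus for the current value of every cpu_usage series
url = "http://10.90.21.12:9090/api/v1/query?query=cpu_usage"
with urllib.request.urlopen(url) as resp:
    data = json.loads(resp.read())
for series in data["data"]["result"]:
    print(series["metric"].get("processName"), series["value"][1])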

3.2.3. Data source and dashboard panel configuration

Open "http://10.90.21.12:3000/" in a browser to reach Grafana, then configure the data source and the dashboard panels:

[Screenshot: Grafana data source configuration]

The panels are custom-built (example panel queries follow the screenshots below):

[Screenshots: custom Grafana dashboard panels for process CPU and memory]
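For reference, panel queries against the metrics pushed by the script could look like cpu_usage{processName="caztpasvc.exe"} or memory_usage{processName="CAZeroTrust"}; the process names shown are taken from the script's lists, so substitute the ones from your own environment.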
