Monitoring Items and Alerting Rules for the Image Registry

theo.wu

Last modified 2023-12-22 22:16:27

Tags: harbor

First published 2022-10-17 10:43:32

Article link: https://blog.csdn.net/niwoxiangyu/article/details/127359177

Exporter repository: http://gitlab.cpaas.com/cpaas/harbor_exporter

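A large number of push and pull operations on the same image within a short period (can be understood as the result of many pods being rescheduled; computed with a function over the second-to-last [...])
Remaining space on the mounted disk
Image replication results
Volume space (cannot be implemented as a generic service in the short term; see #172 (closed))
Whether the Harbor services are up (read directly from kube_deployment_status_replicas_available and kube_statefulset_status_replicas_ready)
Pull, push, and tag counts per repo (must be read directly via SQL, which may break across Harbor versions; on hold for now)
Network-level I/O statistics (read directly from container_network_xx)
Whether PostgreSQL is healthy
Number of PostgreSQL connections (too many may make core unusable)
Total space used by each project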

Current metrics:

| Metric | Meaning | Labels |
| --- | --- | --- |
| harbor_up | | |
| harbor_project_count_total | | type=[private_project, public_project, total_project] |
| harbor_repo_count_total | | type=[private_repo, public_repo, total_repo] |
| harbor_system_volumes_bytes | Current storage space usage in bytes (filesystem storage only) | storage=[free, total] |
| harbor_repositories_pull_total | Number of pulls for each repo | repo_id, repo_name |
| harbor_repositories_push_total | Number of pushes for each repo | repo_id, repo_name |
| harbor_repositories_tags_total | Number of tags in each repo | repo_id, repo_name |
| harbor_image_pull_count | Number of pulls for each image | repo_name, repo_tag |
| harbor_database_health | Whether the Harbor database is healthy | |
| harbor_database_connections | Number of connections to the Harbor database | |
| harbor_project_size | Total disk space used by the project | project_name |
| harbor_replication_status | Status of the last execution of this replication policy: Succeed = 1, any other status = 0 | repl_pol_name |
| harbor_replication_tasks | Number of replication tasks, by result, in the latest execution of this replication policy | repl_pol_name, result=[failed, succeed, in_progress, stopped] |
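To make the label layout of the metrics above more concrete, here is a minimal sketch that parses a few lines of Prometheus text exposition output and groups `harbor_system_volumes_bytes` by its `storage` label. The sample values are made up, and the parser is deliberately simplified (it does not handle escaped quotes in label values or timestamps); it is not taken from the exporter itself.

```python
import re

# Hypothetical exporter output; the values are invented for illustration.
SAMPLE = '''\
# HELP harbor_up (help text omitted)
harbor_up 1
harbor_system_volumes_bytes{storage="free"} 2.0e+10
harbor_system_volumes_bytes{storage="total"} 1.0e+11
harbor_project_size{project_name="library"} 5.0e+09
'''

# metric name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse simple exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


samples = parse_metrics(SAMPLE)
# Collect free/total bytes keyed by the "storage" label.
volumes = {
    labels['storage']: value
    for name, labels, value in samples
    if name == 'harbor_system_volumes_bytes'
}
print(volumes)
```

For anything beyond a quick check, a real exposition-format parser is a better choice than these regular expressions.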

Recommended alerting rules:

Harbor storage will be exhausted within 24 hours: predict_linear(harbor_system_volumes_bytes{storage="free"}[6h], 3600 * 24) < 0
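`predict_linear` fits a simple linear regression to the samples in the range and extrapolates it forward. The sketch below illustrates that idea in plain Python with made-up data (free space shrinking by 1 GiB per hour); the function name and signature are only for illustration and are not Prometheus's implementation.

```python
def predict_linear(samples, eval_time, horizon_seconds):
    """Least-squares line through (timestamp, value) samples,
    evaluated horizon_seconds after eval_time."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("samples must have distinct timestamps")
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return slope * (eval_time + horizon_seconds) + intercept


GIB = 1024 ** 3
# Hypothetical 6h window, one sample per hour: 26 GiB free down to 20 GiB.
samples = [(hour * 3600, (26 - hour) * GIB) for hour in range(7)]

predicted = predict_linear(samples, eval_time=6 * 3600, horizon_seconds=24 * 3600)
will_fire = predicted < 0  # same comparison as the alert expression
print(predicted / GIB, will_fire)
```

With this data the fitted line crosses zero before the 24-hour horizon, so the rule would fire even though 20 GiB is still free at evaluation time.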

Harbor storage usage is above 80%: 1 - sum(harbor_system_volumes_bytes{storage="free"}) / sum(harbor_system_volumes_bytes{storage="total"}) > 0.8
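Because the exporter reports free and total bytes rather than used bytes, the used fraction has to be derived as 1 - free / total, and the threshold is a plain number (0.8) since PromQL has no percent literal. A small worked example with made-up numbers:

```python
def storage_usage_ratio(free_bytes, total_bytes):
    """Fraction of the volume in use, derived from free and total bytes."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 1.0 - free_bytes / total_bytes


# Hypothetical volume: 15 GB free out of 100 GB.
ratio = storage_usage_ratio(15e9, 100e9)
alert = ratio > 0.8
print(ratio, alert)
```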

An image is pulled more than 5 times within 20 minutes: increase(harbor_image_pull_count[20m]) > 5
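As a rough illustration of what `increase()` measures, the sketch below sums the growth of a counter over a window and treats any drop as a counter reset. It is a simplification: PromQL's `increase()` also extrapolates to the window boundaries, so real results are usually not whole numbers. The sample values are made up.

```python
def simple_increase(values):
    """Total growth of a counter across consecutive samples.
    A drop is treated as a reset, so the new value counts from zero."""
    if len(values) < 2:
        return 0.0
    total = 0.0
    previous = values[0]
    for current in values[1:]:
        if current >= previous:
            total += current - previous
        else:
            total += current  # counter restarted from zero
        previous = current
    return total


# Hypothetical pull counts for one image scraped over 20 minutes,
# including one reset (15 -> 2), e.g. after an exporter restart.
pull_counts = [10, 12, 15, 2, 4]
pulls = simple_increase(pull_counts)
alert = pulls > 5
print(pulls, alert)
```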

PostgreSQL connection count is approaching the limit: harbor_database_connections > 45

PostgreSQL is unhealthy: harbor_database_health < 1

A Harbor component is unhealthy: kube_deployment_status_replicas_available{namespace="harbor-2"} < 1

The Harbor database is unhealthy: kube_statefulset_status_replicas_ready{namespace="harbor-2"} < 1
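The rules above could be collected into a Prometheus rule file. The sketch below renders one as text; the alert names and the group name are hypothetical, no `for:` durations or labels are set because the notes do not specify any, and the storage-usage rule is written as 1 - free / total > 0.8 so that it is valid PromQL.

```python
# Alert names are hypothetical; the expressions follow the rules listed above.
RULES = [
    ("HarborStorageFullIn24h",
     'predict_linear(harbor_system_volumes_bytes{storage="free"}[6h], 3600 * 24) < 0'),
    ("HarborStorageUsageHigh",
     '1 - sum(harbor_system_volumes_bytes{storage="free"})'
     ' / sum(harbor_system_volumes_bytes{storage="total"}) > 0.8'),
    ("HarborImagePulledFrequently", 'increase(harbor_image_pull_count[20m]) > 5'),
    ("HarborDatabaseConnectionsHigh", 'harbor_database_connections > 45'),
    ("HarborDatabaseUnhealthy", 'harbor_database_health < 1'),
    ("HarborComponentUnavailable",
     'kube_deployment_status_replicas_available{namespace="harbor-2"} < 1'),
    ("HarborStatefulSetNotReady",
     'kube_statefulset_status_replicas_ready{namespace="harbor-2"} < 1'),
]


def render_rule_file(group_name, rules):
    """Render (alert, expr) pairs as a Prometheus rule file in YAML."""
    lines = ["groups:", f"  - name: {group_name}", "    rules:"]
    for alert, expr in rules:
        if "'" in expr:
            # Expressions are single-quoted below; escaping is out of scope here.
            raise ValueError("single quotes are not supported in this sketch")
        lines.append(f"      - alert: {alert}")
        lines.append(f"        expr: '{expr}'")
    return "\n".join(lines) + "\n"


rule_file = render_rule_file("harbor", RULES)
print(rule_file)
```

The generated file can be checked with `promtool check rules` before it is loaded into Prometheus.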

Grafana dashboard: http://grafana.cpaas.com/d/Nhhla1VGk/harbor-dashbord?orgId=1
