目录
1. 前言
由于alertmanager告警的记录不支持持久化记录,发送的告警信息不会存储在数据库中,prometheus将所有数据存储为时间序列,却不会将alertmanager发送的告警信息做为一条记录存储下来,因此,如何对alertmanager发送的告警信息进行持久化,并存储到数据库(mysql、pg)中?
2. 解决方案
我们可以知道,prometheus通过抓取监控数据,将触发规则的告警,推送到Alertmanger,Alertmanger对告警进行分组、聚合等处理后,可以通过邮件、Slack、webhook等方式对用户进行发送告警信息。
因此 ,解决的思路就是通过webhook发送告警信息到指定服务,并将记录写入到数据库。 这里推荐一个开源软件:alertsnitch
工作原理如下
官方地址:Yakshaving Art / alertsnitch · GitLab
alertsnitch支持将alertmanager的告警数据写入mysql和pg, 并结合grafna面板,将数据进行展示。
3. alertsnitch部署及使用
3.1 部署方式:
- go 编译安装
- docker-compose
- k8s
3.2 安装前准备
- 安装Mysql(版本5.7,大于5.7有报错,安装过程略),创建数据库alertsnitch,alertsnitch用户及授权
- 初始化数据库,先执行建表脚本
# 代码仓库 db.d/mysql/0.0.1-bootstrap.sql
DROP PROCEDURE IF EXISTS bootstrap;
DELIMITER //
CREATE PROCEDURE bootstrap()
BEGIN
SET @exists := (SELECT 1 FROM information_schema.tables I WHERE I.table_name = "Model" AND I.table_schema = database());
IF @exists IS NULL THEN
CREATE TABLE `Model` (
`ID` enum('1') NOT NULL,
`version` VARCHAR(20) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `Model` (`version`) VALUES ("0.0.1");
ELSE
SIGNAL SQLSTATE '42000' SET MESSAGE_TEXT='Model Table Exists, quitting...';
END IF;
END;
//
DELIMITER ;
-- Execute the procedure
CALL bootstrap();
-- Drop the procedure
DROP PROCEDURE bootstrap;
-- Create the rest of the tables
CREATE TABLE `AlertGroup` (
`ID` INT NOT NULL AUTO_INCREMENT,
`time` TIMESTAMP NOT NULL,
`receiver` VARCHAR(100) NOT NULL,
`status` VARCHAR(50) NOT NULL,
`externalURL` TEXT NOT NULL,
`groupKey` VARCHAR(255) NOT NULL,
KEY `idx_time` (`time`) USING BTREE,
KEY `idx_status_ts` (`status`, `time`) USING BTREE,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `GroupLabel` (
`ID` INT NOT NULL AUTO_INCREMENT,
`AlertGroupID` INT NOT NULL,
`GroupLabel` VARCHAR(100) NOT NULL,
`Value` VARCHAR(1000) NOT NULL,
FOREIGN KEY (AlertGroupID) REFERENCES AlertGroup (ID) ON DELETE CASCADE,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `CommonLabel` (
`ID` INT NOT NULL AUTO_INCREMENT,
`AlertGroupID` INT NOT NULL,
`Label` VARCHAR(100) NOT NULL,
`Value` VARCHAR(1000) NOT NULL,
FOREIGN KEY (AlertGroupID) REFERENCES AlertGroup (ID) ON DELETE CASCADE,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `CommonAnnotation` (
`ID` INT NOT NULL AUTO_INCREMENT,
`AlertGroupID` INT NOT NULL,
`Annotation` VARCHAR(100) NOT NULL,
`Value` VARCHAR(1000) NOT NULL,
FOREIGN KEY (AlertGroupID) REFERENCES AlertGroup (ID) ON DELETE CASCADE,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `Alert` (
`ID` INT NOT NULL AUTO_INCREMENT,
`alertGroupID` INT NOT NULL,
`status` VARCHAR(50) NOT NULL,
`startsAt` DATETIME NOT NULL,
`endsAt` DATETIME DEFAULT NULL,
`generatorURL` TEXT NOT NULL,
FOREIGN KEY (alertGroupID) REFERENCES AlertGroup (ID) ON DELETE CASCADE,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `AlertLabel` (
`ID` INT NOT NULL AUTO_INCREMENT,
`AlertID` INT NOT NULL,
`Label` VARCHAR(100) NOT NULL,
`Value` VARCHAR(1000) NOT NULL,
FOREIGN KEY (AlertID) REFERENCES Alert (ID) ON DELETE CASCADE,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `AlertAnnotation` (
`ID` INT NOT NULL AUTO_INCREMENT,
`AlertID` INT NOT NULL,
`Annotation` VARCHAR(100) NOT NULL,
`Value` VARCHAR(1000) NOT NULL,
FOREIGN KEY (AlertID) REFERENCES Alert (ID) ON DELETE CASCADE,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
- 修改Mode verions数据
-
# db.d/mysql/0.1.0-fingerprint.sql ALTER TABLE Alert ADD `fingerprint` TEXT NOT NULL ; UPDATE `Model` SET `version`="0.1.0";
3.3 k8s部署alertsnitch
部署文件deployment.yml如下
apiVersion: apps/v1
kind: Deployment
metadata:
name: alertsnitch
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: alertsnitch
template:
metadata:
labels:
app.kubernetes.io/name: alertsnitch
spec:
containers:
- image: registry.gitlab.com/yakshaving.art/alertsnitch
name: alertsnitch
ports:
- containerPort: 9567
name: http
env:
- name: ALERTSNITCH_BACKEND
value: mysql
- name: ALERTSNITCH_DSN
value: "DB_USER:DB_PASSWORD@(DB_IP:DB_PORT)/DB_NAME" #注意这里要修改成实际的库名帐号信息等
readinessProbe:
httpGet:
path: /-/ready
port: 9567
initialDelaySeconds: 30
periodSeconds: 10
livenessProbe:
httpGet:
path: /-/health
port: 9567
initialDelaySeconds: 60
periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
name: alertsnitch
spec:
ports:
- name: http
port: 9567
targetPort: http
selector:
app.kubernetes.io/name: alertsnitch
注意:环境变量数据库配置格式ALERTSNITCH_DSN="${MYSQL_USER}:${MYSQL_PASSWORD}@/${MYSQL_DATABASE}"
部署完成后配置ingress,映射域名对外提供服务,过程略,配置域名后调用方式为:
https://alertsnitch.test.com
/webhook
4. 配置alertmanager
vim alertmanager.yml
# 配置接收者
- name: default
webhook_configs:
- send_resolved: true
http_config:
follow_redirects: true
url: https://alertsnitch.test.cn/webhook
配置后重启服务,告警信息将会推送alertsnitch,并写入到对应的Mysql库中
5. grafana展示
使用面板ID:15833
prometheus alert history | Grafana Labs
添加mysql数据源(略)
效果如下
以上就实现了alertmanager数据的持久化和简单展示