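1. Goal
When the system writes error logs, algorithm engineers can inspect them and also receive proactive notifications through channels such as DingTalk or SMS.
2. Configure Logstash to collect logs
2.1 Modify the k8s.yaml deployment file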
        volumeMounts:
        - name: host-time
          mountPath: /etc/localtime
        - name: app-log
          mountPath: /home/sfxs/logs # shared log directory, so the logstash container can read the application logs
      - name: logstash
        image: logstash:7.11.2
        imagePullPolicy: IfNotPresent
        # With docker-compose, use the equivalent key instead, e.g.
        # command: ["-f", "/usr/share/logstash/config/logstash.conf"]
        args: [ "-f", "/usr/share/logstash/config/logstash.conf" ]
        volumeMounts:
        - name: host-time
          mountPath: /etc/localtime
        - name: app-log
          mountPath: /home/sfxs/logs
        - name: nfs
          mountPath: /usr/share/logstash/config
      volumes:
      - name: host-time
        hostPath:
          path: /etc/localtime
      - name: nfs
        cephfs: # we use CephFS shared storage; adjust to your environment
          monitors:
          - 192.168.0.XXX:6789
          - 192.168.0.XXX:6789
          - 192.168.0.XXX:6789
          user: admin
          path: /k8stest/logstash/nlp
          secretRef:
            name: ceph-secret
          readOnly: false
      - name: app-log
        emptyDir: { }
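The application container has to write its log files into the shared /home/sfxs/logs directory in a format the Logstash filter can parse. The following is only a sketch of how a Python service might do that; the original does not show the application side, so the function name `build_logger`, the file name `app.log`, the use of the host name as the address field, and the exact line layout are all assumptions chosen to fit the bracketed `【... | ... | ...】` pattern used in logstash.conf.

```python
import logging
import os
import socket


def build_logger(log_dir: str, name: str = "nlp") -> logging.LoggerAdapter:
    """Return a logger that appends bracketed, pipe-separated lines to <log_dir>/app.log.

    Hypothetical helper: the layout below is an assumption, not the author's code.
    """
    os.makedirs(log_dir, exist_ok=True)
    handler = logging.FileHandler(os.path.join(log_dir, "app.log"), encoding="utf-8")
    handler.setFormatter(
        logging.Formatter(
            # level 【timestamp | logger name | line number | message | address】
            fmt="%(levelname)s 【%(asctime)s | %(name)s | %(lineno)d | %(message)s | %(address)s】",
            datefmt="%Y-%m-%d %H:%M:%S",
        )
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    # The adapter injects the extra "address" attribute into every record.
    return logging.LoggerAdapter(logger, {"address": socket.gethostname()})
```

In the pod, this would be called once at startup as `build_logger("/home/sfxs/logs")`, after which `log.error("...")` produces one line per error in the shared volume.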
2.2 Add the Logstash files
The main file to configure is logstash.conf:
input {
  file {
    path => "/home/sfxs/logs/*.log"
  }
}
filter {
  # Drop every line that does not contain ERROR
  if (!([message] =~ ".*ERROR.*")) {
    drop {}
  }
  grok {
    # Parse matching lines into the fields stored in ES, such as msg and row
    match => { "message" => ".*【%{TIMESTAMP_ISO8601:timestamp}\s+\|\s+%{DATA:class}\s+\|\s+%{DATA:row}\s+\|\s+%{DATA:msg}\s+\|\s+%{DATA:address}】.*"}
  }
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
    timezone => "Asia/Shanghai"
  }
  mutate {
    remove_field => ["host", "path", "@version", "message", "timestamp"]
  }
}
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    # Must match the name of the index created in section 3
    index => "nlp_log"
  }
}
For the remaining configuration, see https://gitee.com/SJshenjian/blog-code/tree/master/src/main/java/online/shenjian/logstash
3. Create the ES index to store the data
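To make the filter stage easier to follow, here is a rough Python imitation of what it does to a single line: skip lines without ERROR, extract the five bracketed fields, and turn the timestamp into `@timestamp`. It is a sketch under assumptions, not Logstash itself: `LINE_RE` and `parse_error_line` are hypothetical names, the sample log line is invented, the timestamp regex is a simplified stand-in for `TIMESTAMP_ISO8601`, and the Asia/Shanghai time zone handling is left out. Unlike this sketch, Logstash does not discard an event when grok fails; it tags it with `_grokparsefailure`.

```python
import re
from datetime import datetime
from typing import Optional

# Simplified equivalent of the grok pattern: five pipe-separated fields inside 【 】.
LINE_RE = re.compile(
    r"【(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r"\s+\|\s+(?P<cls>.*?)"
    r"\s+\|\s+(?P<row>.*?)"
    r"\s+\|\s+(?P<msg>.*?)"
    r"\s+\|\s+(?P<address>.*?)】"
)


def parse_error_line(line: str) -> Optional[dict]:
    """Return the fields that would be indexed, or None if the line is skipped."""
    if "ERROR" not in line:  # mirrors the drop {} branch
        return None
    match = LINE_RE.search(line)
    if match is None:  # simplification: Logstash would tag the event instead
        return None
    ts = datetime.strptime(match.group("timestamp"), "%Y-%m-%d %H:%M:%S")
    return {
        "@timestamp": ts.isoformat(),
        "class": match.group("cls"),
        "row": match.group("row"),
        "msg": match.group("msg"),
        "address": match.group("address"),
    }


# Invented sample line in the assumed format
sample = "ERROR 【2021-03-01 12:00:00 | online.shenjian.Demo | 42 | division by zero | 192.168.0.1】"
print(parse_error_line(sample))
```

Note that every extracted value is a string, which is consistent with the `text` field types used for the index.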
PUT nlp_log
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "address": {
        "type": "text"
      },
      "class": {
        "type": "text"
      },
      "level": {
        "type": "text"
      },
      "msg": {
        "type": "text"
      },
      "row": {
        "type": "text"
      }
    }
  }
}
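Because the mapping sets `"dynamic": "strict"`, Elasticsearch rejects any document that carries a field not listed under `properties` (it reports a `strict_dynamic_mapping_exception`). That is why the Logstash `mutate` step removes `host`, `path`, `@version`, `message` and `timestamp` before indexing. The sketch below only imitates that check locally so the effect is easy to see; `ALLOWED_FIELDS` and `unexpected_fields` are hypothetical names, and the real validation happens inside Elasticsearch.

```python
# Fields declared under "properties" in the nlp_log mapping
ALLOWED_FIELDS = {"@timestamp", "address", "class", "level", "msg", "row"}


def unexpected_fields(doc: dict) -> list:
    """Return the top-level fields a strict mapping would reject, sorted by name."""
    return sorted(set(doc) - ALLOWED_FIELDS)


# An event after remove_field: every field is declared, so it would be accepted.
clean_event = {"@timestamp": "2021-03-01T12:00:00", "class": "Demo", "row": "42",
               "msg": "division by zero", "address": "192.168.0.1"}
print(unexpected_fields(clean_event))

# An event that still carries undeclared fields would be rejected.
raw_event = dict(clean_event, host="pod-1", tags=["_grokparsefailure"])
print(unexpected_fields(raw_event))
```

The second case also shows why lines that fail the grok match can be a problem with a strict mapping: the `tags` field that Logstash adds on failure is not declared in the index.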
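4. Configure Grafana visualization
4.1 Configure the data source
4.2 Configure the dashboard panel
Choose the display format
Choose which fields to display and the aliases to show for them
Choose the aggregation; we use Count
Alert condition: more than 3 error logs counted within five minutes
Evaluation interval: check every minute; a notification is sent 2 s after the condition is met
Notification: configuration omitted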
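The alert condition above can be read as a sliding-window count. The sketch below expresses that rule in Python purely as an illustration; Grafana evaluates it internally against the Elasticsearch data source, and `should_alert`, its parameters, and the sample timestamps are hypothetical. The 2 s hold time before the notification fires is not modelled.

```python
from datetime import datetime, timedelta
from typing import Iterable


def should_alert(
    error_times: Iterable[datetime],
    now: datetime,
    window: timedelta = timedelta(minutes=5),
    threshold: int = 3,
) -> bool:
    """True when more than `threshold` errors fall inside the last `window`."""
    recent = sum(1 for t in error_times if now - window <= t <= now)
    return recent > threshold


now = datetime(2021, 3, 1, 12, 0, 0)
# Four errors in the last five minutes: above the threshold of 3
four_recent = [now - timedelta(minutes=m) for m in (1, 2, 3, 4)]
# Three recent errors plus one old one: the old one falls outside the window
three_recent = [now - timedelta(minutes=m) for m in (1, 2, 3, 10)]
print(should_alert(four_recent, now), should_alert(three_recent, now))
```

With a one-minute evaluation interval, this check would be repeated every minute with a new value of `now`.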