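Kubernetes has two types of events. A Warning event means that the state transition that produced it happened between unexpected states. A Normal event means that the state reached matches the expected state. Take the lifecycle of a Pod as an example: when a Pod is created, it first enters the Pending state while its image is pulled; once the image has been pulled and the health checks pass, the Pod moves to Running, and Normal events are generated along the way. If the Pod then crashes for some reason while running and enters the Failed state, that state is unexpected, so Kubernetes generates a Warning event. In scenarios like this, monitoring events lets us spot, in a timely way, problems that resource monitoring tends to miss.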
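Before wiring up any alerting, the Normal/Warning split can be inspected directly with kubectl. The following is a minimal sketch, assuming kubectl is already configured for the cluster; the function name `list_warning_events` is made up for illustration.

```shell
#!/bin/sh
# List only Warning events across all namespaces.
# `type` is one of the event fields that kubectl accepts in --field-selector.
# Extra arguments (for example -o wide) are passed through unchanged.
list_warning_events() {
    kubectl get events --all-namespaces --field-selector type=Warning "$@"
}
```

Calling `list_warning_events` on a healthy cluster usually returns few or no rows; swapping `type=Warning` for `type=Normal` shows the routine lifecycle events instead.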
eventer configuration
// Create the service account
kubectl -n kube-system create serviceaccount eventers
// Bind the cluster-admin role
kubectl create clusterrolebinding eventers --clusterrole=cluster-admin --serviceaccount=kube-system:eventers
// eventer YAML file
cat eventer.yaml
# extensions/v1beta1 Deployments were removed in Kubernetes 1.16; apps/v1 works on old and new clusters
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-eventer
  namespace: kube-system
spec:
  replicas: 1
  # apps/v1 requires an explicit selector that matches the pod template labels
  selector:
    matchLabels:
      k8s-app: kube-eventer
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: kube-eventer
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccount: eventers
      containers:
        - name: kube-eventer
          image: registry.cn-hangzhou.aliyuncs.com/acs/kube-eventer-amd64:v1.0.0-d9898e1-aliyun
          imagePullPolicy: IfNotPresent
          command:
            - /kube-eventer
            - --source=kubernetes:https://kubernetes.default
            # - --sink=dingtalk:[your_webhook_url]&label=[your_cluster_id]&level=[optional: Normal or Warning; defaults to Warning]
            - --sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=2b634377&label=victor&level=Warning
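The sink argument in the manifest follows the pattern `dingtalk:[your_webhook_url]&label=[your_cluster_id]&level=[Normal or Warning]`. As a rough sketch, a small helper can assemble that string and reject an invalid level before it ends up in the YAML. The function name `build_dingtalk_sink` and the `TOKEN` placeholder are assumptions for illustration; they are not part of kube-eventer.

```shell
#!/bin/sh
# Build the --sink argument for the DingTalk sink.
# $1: webhook URL including access_token
# $2: label (shown in the alert, for example a cluster id)
# $3: level, Normal or Warning (optional, defaults to Warning)
build_dingtalk_sink() {
    webhook_url=$1
    label=$2
    level=${3:-Warning}
    case "$level" in
        Normal|Warning) ;;
        *)
            echo "level must be Normal or Warning, got: $level" >&2
            return 1
            ;;
    esac
    printf '%s\n' "--sink=dingtalk:${webhook_url}&label=${label}&level=${level}"
}
```

For example, `build_dingtalk_sink 'https://oapi.dingtalk.com/robot/send?access_token=TOKEN' victor` prints a line that can be pasted into the `command:` list of the Deployment.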
// Deploy to the k8s cluster
kubectl create -f eventer.yaml
// Verify by checking the logs
kubectl -n kube-system logs -f kube-eventer-854b89fdf8-2msc9
I0821 02:01:50.405848 1 eventer.go:67] /kube-eventer --source=kubernetes:https://kubernetes.default --sink=dingtalk:https://oapi.dingtalk.com/robot/send?access_token=2b0377&label=victor&level=Warning
I0821 02:01:50.405895 1 eventer.go:68] Eventer version GoVersion: go1.12.6
Platform: linux/amd64
Version: v0.0.0-master+$Format:%h$
BuildTime: 1970-01-01T00:00:00Z
GitCommit: $Format:%H$
I0821 02:01:50.406947 1 eventer.go:94] Starting with DingTalkSink sink
I0821 02:01:50.406964 1 eventer.go:108] Starting eventer
I0821 02:01:50.406970 1 eventer.go:116] Starting eventer http service
I0821 02:02:00.000154 1 manager.go:102] Exporting 1 events
I0821 02:02:30.000137 1 manager.go:102] Exporting 0 events
I0821 02:03:00.000118 1 manager.go:102] Exporting 0 events
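The pod name in the log command above contains a generated suffix that differs on every deployment. One possible alternative, sketched here, is to wait for the rollout and then select the pod by the `k8s-app=kube-eventer` label from the manifest. The function name `watch_eventer` is hypothetical.

```shell
#!/bin/sh
# Wait until the kube-eventer Deployment reports ready, then follow its logs.
# Selecting by label avoids copying the generated pod name.
# Note: with a label selector, kubectl shows only the last 10 lines by default;
# add --tail=-1 to see the full log.
watch_eventer() {
    kubectl -n kube-system rollout status deployment/kube-eventer &&
    kubectl -n kube-system logs -f -l k8s-app=kube-eventer
}
```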
Alert test
// Change a deployment's image so that its pod fails, then check whether a DingTalk alert is sent
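The original post does not show how the image was changed. One way to do it, as a sketch, is `kubectl set image` with an image reference that cannot be pulled. The namespace, deployment and container names are placeholders for a disposable test workload, and `registry.invalid/does-not-exist:v0` is a deliberately unresolvable image chosen for this example.

```shell
#!/bin/sh
# Point one container of a Deployment at an image that cannot be pulled,
# so the new pod fails to start and Warning events are emitted.
# $1: namespace, $2: deployment name, $3: container name (all placeholders)
break_deployment_image() {
    kubectl -n "$1" set image "deployment/$2" \
        "$3=registry.invalid/does-not-exist:v0"
}

# Roll the Deployment back to its previous revision after the test.
restore_deployment() {
    kubectl -n "$1" rollout undo "deployment/$2"
}
```

Run `break_deployment_image default my-app my-container` against a workload you can afford to break, then `restore_deployment default my-app` once the alert has arrived.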
// Check the logs
kubectl -n kube-system logs -f kube-eventer-854b89fdf8-2msc9
I0821 02:03:30.000115 1 manager.go:102] Exporting 0 events
I0821 02:04:00.000143 1 manager.go:102] Exporting 9 events
I0821 02:04:30.000120 1 manager.go:102] Exporting 21 events
I0821 02:05:00.000143 1 manager.go:102] Exporting 10 events
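To get a quick total of how many events were exported during the test, the `Exporting N events` lines can be summed. This is a small sketch that relies only on the log format shown above; the function name `count_exported_events` is made up for illustration.

```shell
#!/bin/sh
# Sum the N in every "Exporting N events" line read from standard input.
count_exported_events() {
    awk '/Exporting [0-9]+ events/ {
        for (i = 1; i < NF; i++)
            if ($i == "Exporting") total += $(i + 1)
    }
    END { print total + 0 }'
}
```

It can be fed from the pod log, for example `kubectl -n kube-system logs -l k8s-app=kube-eventer --tail=-1 | count_exported_events`. For the four log lines shown above it prints 40.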
Check the alert message in DingTalk