A Walkthrough of the Official English Supervisor Documentation, and Implementing Process-Down Alert Notifications

Official documentation: Configuration File — Supervisor 4.2.5 documentation

Reference blog post

Have You Used Supervisor's Monitoring and Alerting Feature? - Tencent Cloud Developer Community - Tencent Cloud

Signals

The supervisord program may be sent signals which cause it to perform certain actions while it’s running.

You can send any of these signals to the single supervisord process id. This process id can be found in the file represented by the pidfile parameter in the [supervisord] section of the configuration file (by default it’s $CWD/supervisord.pid).

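In other words, the supervisord program can be sent signals, and a signal causes it to perform some action.

You send the signal to the supervisord process id. Where do you find that id? It is stored in a file.

The location of that file is given by the pidfile parameter in the [supervisord] section of the configuration file.

That file holds the supervisord process id. Signals sent to it control supervisord itself (for example, SIGTERM shuts it down, SIGHUP reloads the configuration and restarts the processes, and SIGUSR2 reopens the log files). Alert notifications are built on the Events mechanism covered in the next section, not on signals.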
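To make the pidfile step concrete, here is a minimal Python sketch that reads the process id from a pidfile and sends it a signal. The helper names `read_pid` and `signal_supervisord` are made up for this illustration, and the pidfile path in the comment is only an assumed example; use the pidfile value from your own [supervisord] section.

```python
import os
import signal


def read_pid(pidfile):
    """Return the process id stored in a pidfile as an int."""
    with open(pidfile) as f:
        return int(f.read().strip())


def signal_supervisord(pidfile, sig):
    """Send `sig` to the process recorded in `pidfile` and return its pid."""
    pid = read_pid(pidfile)
    os.kill(pid, sig)  # os.kill delivers any signal, not only SIGKILL
    return pid


# Example call (assumed path, not taken from the article):
#   signal_supervisord("/tmp/supervisord.pid", signal.SIGHUP)
# SIGHUP asks supervisord to reload its configuration and restart processes.
```

Passing signal number 0 instead of a real signal only checks that the process exists, which is a safe way to try the helper out.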

Original text:

Events

Events are an advanced feature of Supervisor introduced in version 3.0. You don’t need to understand events if you simply want to use Supervisor as a mechanism to restart crashed processes or as a system to manually control process state. You do need to understand events if you want to use Supervisor as part of a process monitoring/notification framework.

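Events are an advanced feature introduced in Supervisor 3.0. If you only want to use Supervisor as a tool that restarts crashed processes, or as a system for manually controlling process state, you do not need to understand Events.

If you want to use Supervisor as part of a process monitoring/notification framework, you do need to understand them.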

Event Listeners and Event Notifications

Supervisor provides a way for a specially written program (which it runs as a subprocess) called an “event listener” to subscribe to “event notifications”. An event notification implies that something happened related to a subprocess controlled by supervisord or to supervisord itself. Event notifications are grouped into types in order to make it possible for event listeners to subscribe to a limited subset of event notifications. Supervisor continually emits event notifications as its running even if there are no listeners configured. If a listener is configured and subscribed to an event type that is emitted during a supervisord lifetime, that listener will be notified.

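Supervisor provides a way for a specially written program (run as a subprocess), called an event listener, to subscribe to event notifications. An event notification means that something has happened, either to a subprocess controlled by supervisord or to supervisord itself.

Event notifications are grouped into types so that an event listener can subscribe to a limited subset (only the ones it needs). Supervisor keeps emitting event notifications even when no listener is configured (in that case nobody is listening, so they go nowhere). If a listener is configured and subscribed to an event type that is emitted while supervisord is running, that listener is notified.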

The purpose of the event notification/subscription system is to provide a mechanism for arbitrary code to be run (e.g. send an email, make an HTTP request, etc) when some condition is met. That condition usually has to do with subprocess state. For instance, you may want to notify someone via email when a process crashes and is restarted by Supervisor.

(What does "usually has to do with subprocess state" mean? Supervisor runs our own program as one of its subprocesses, so when something goes wrong in the project, the state of that subprocess changes.)

The event notification protocol is based on communication via a subprocess’ stdin and stdout.

Supervisor sends specially-formatted input to an event listener process’ stdin and expects specially-formatted output from an event listener’s stdout, forming a request-response cycle.

A protocol agreed upon between supervisor and the listener’s implementer allows listeners to process event notifications.
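One request-response cycle of this protocol can be sketched in Python as follows. This is an illustration rather than Supervisor's own code: the function names `parse_header` and `handle_one_event` are made up here, and the sketch assumes an ASCII payload so that the `len` header (a byte count) equals the number of characters to read.

```python
def parse_header(line):
    """Turn a 'key:value key:value ...' line into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def handle_one_event(stdin, stdout, handler):
    """Run a single request-response cycle with supervisord."""
    stdout.write("READY\n")  # tell supervisord we can accept an event
    stdout.flush()
    headers = parse_header(stdin.readline())   # one header line per event
    payload = stdin.read(int(headers["len"]))  # body length comes from `len`
    handler(headers, payload)                  # arbitrary user code runs here
    stdout.write("RESULT 2\nOK")               # acknowledge successful handling
    stdout.flush()
    return headers, payload
```

Because stdout is the protocol channel, any debugging output from a listener should go to stderr instead.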

Event listeners can be written in any language supported by the platform you’re using to run Supervisor. Although event listeners may be written in any language, there is special library support for Python in the form of a supervisor.childutils module, which makes creating event listeners in Python slightly easier than in other languages.

The documentation above shows that Supervisor supports monitoring and event subscription.

Configuring an Event Listener

A supervisor event listener is specified via a [eventlistener:x] section in the configuration file (in other words, we need to edit the configuration file and add an eventlistener section). Supervisor [eventlistener:x] sections are treated almost exactly like supervisor [program:x] section with the respect to the keys allowed in their configuration except that Supervisor does not respect “capture mode” output from event listener processes (ie. event listeners cannot be PROCESS_COMMUNICATIONS_EVENT event generators). Therefore it is an error to specify stdout_capture_maxbytes or stderr_capture_maxbytes in the configuration of an eventlistener. (The practical upshot of the exception: leave those two capture options out of an [eventlistener:x] section.)

There is no artificial constraint on the number of eventlistener sections that can be placed into the configuration file.

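Writing a Python program that sends an HTTP request to raise an alert through a DingTalk robot

This is the earlier Go code that sends the HTTP request; we used ChatGPT to convert it to Python.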

func (t *DingRobot) SendMessage(p *ParamCronTask) error {
	b := []byte{}
	if p.MsgText.Msgtype == "text" {
		msg := map[string]interface{}{}
		atMobileStringArr := make([]string, len(p.MsgText.At.AtMobiles))
		for i, atMobile := range p.MsgText.At.AtMobiles {
			atMobileStringArr[i] = atMobile.AtMobile
		}
		atUserIdStringArr := make([]string, len(p.MsgText.At.AtUserIds))
		for i, AtuserId := range p.MsgText.At.AtUserIds {
			atUserIdStringArr[i] = AtuserId.AtUserId
		}
		msg = map[string]interface{}{
			"msgtype": "text",
			"text": map[string]string{
				"content": p.MsgText.Text.Content,
			},
		}
		if p.MsgText.At.IsAtAll {
			msg["at"] = map[string]interface{}{
				"isAtAll": p.MsgText.At.IsAtAll,
			}
		} else {
			msg["at"] = map[string]interface{}{
				"atMobiles": atMobileStringArr, // slice of strings
				"atUserIds": atUserIdStringArr,
				"isAtAll":   p.MsgText.At.IsAtAll,
			}
		}
		b, _ = json.Marshal(msg)
	} 
	var resp *http.Response
	var err error
	if t.Type == "1" || t.Secret == "" {
		resp, err = http.Post(t.getURLV2(), "application/json", bytes.NewBuffer(b))
	} else {
		resp, err = http.Post(t.getURL(), "application/json", bytes.NewBuffer(b))
	}
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	data, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		// Previously this error was overwritten before being checked.
		return err
	}
	r := ResponseSendMessage{}
	if err = json.Unmarshal(data, &r); err != nil {
		return err
	}
	if r.Errcode != 0 {
		fmt.Println(r.Errmsg)
		return errors.New(r.Errmsg)
	}

	return nil
}
func (t *DingRobot) getURLV2() string {
	url := "https://oapi.dingtalk.com/robot/send?access_token=" + t.RobotId // append the token to the webhook URL
	return url
}
type DingRobot struct {
	RobotId            string         `gorm:"primaryKey;foreignKey:RobotId" json:"robot_id"` // robot token
	Deleted            gorm.DeletedAt `json:"deleted"`                                       // soft-delete field
	Type               string         `json:"type"`                                          // robot type: 1 = enterprise internal robot, 2 = custom webhook robot
	TypeDetail         string         `json:"type_detail"`                                   // specific robot type
	ChatBotUserId      string         `json:"chat_bot_user_id"`                              // encrypted robot id; this field is unused
	Secret             string         `json:"secret"`                                        // secret; an empty value selects the unsigned URL in SendMessage
	UserName           string         `json:"user_name"`                                     // name of the user that owns the robot
	DingUsers          []DingUser     `json:"ding_users" gorm:"many2many:user_robot"`        // a robot can @ many users, and a user can be @-ed by many robots
	ChatId             string         `json:"chat_id"`                                       // chatId of the group chat the robot is in
	OpenConversationID string         `json:"open_conversation_id"`                          // openConversationID of the group chat the robot is in
	Tasks              []Task         `gorm:"foreignKey:RobotId;references:RobotId"`         // a robot owns many tasks
	Name               string         `json:"name"`                                          // robot name
	DingToken          `json:"ding_token" gorm:"-"`
	IsShared           int `json:"is_shared"`
}

type DingToken struct {
	Token string `json:"token"`
}
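
For comparison, a Python counterpart of the Go `SendMessage` logic might look roughly like the sketch below. It is not the converted code from the original post. The function names `build_text_message` and `send_message` are chosen for this illustration, only the unsigned URL built by `getURLV2` is covered, and the `errcode`/`errmsg` response keys are assumed to correspond to the `Errcode`/`Errmsg` fields read by the Go code.

```python
import json
import urllib.request


def build_text_message(content, at_mobiles=(), at_user_ids=(), is_at_all=False):
    """Build the JSON body of a 'text' message, mirroring the Go branches."""
    msg = {"msgtype": "text", "text": {"content": content}}
    if is_at_all:
        msg["at"] = {"isAtAll": True}
    else:
        msg["at"] = {
            "atMobiles": list(at_mobiles),
            "atUserIds": list(at_user_ids),
            "isAtAll": False,
        }
    return msg


def send_message(access_token, msg, timeout=5):
    """POST the message to the robot webhook; raise if an error is reported."""
    url = "https://oapi.dingtalk.com/robot/send?access_token=" + access_token
    req = urllib.request.Request(
        url,
        data=json.dumps(msg).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        result = json.loads(resp.read().decode("utf-8"))
    if result.get("errcode") != 0:  # assumed response key, see the note above
        raise RuntimeError(result.get("errmsg"))
    return result
```

Keeping the message builder separate from the network call makes the payload easy to check without contacting DingTalk.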

Python code used by supervisor
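
The listener script itself does not appear in this text, so the following is only a minimal sketch of what such a listener could look like, not the author's script. It assumes a custom DingTalk webhook robot without signing. `WEBHOOK` holds a placeholder token, and the names `parse_tokens`, `notify_dingtalk`, and `run` are made up for this example.

```python
import json
import sys
import urllib.request

# Placeholder token (hypothetical); replace with your robot's access token.
WEBHOOK = "https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN"
ALERT_EVENTS = ("PROCESS_STATE_EXITED", "PROCESS_STATE_FATAL")


def parse_tokens(text):
    """Turn 'key:value key:value ...' into a dict."""
    return dict(tok.split(":", 1) for tok in text.split())


def notify_dingtalk(text):
    """Send a plain text message to the DingTalk robot webhook."""
    body = json.dumps({"msgtype": "text", "text": {"content": text}}).encode("utf-8")
    req = urllib.request.Request(
        WEBHOOK, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req, timeout=5).close()


def run(stdin=sys.stdin, stdout=sys.stdout, notify=notify_dingtalk):
    """Event loop: acknowledge every event, alert only on ALERT_EVENTS."""
    while True:
        stdout.write("READY\n")
        stdout.flush()
        line = stdin.readline()
        if not line:  # stdin was closed, so stop the loop
            return
        headers = parse_tokens(line)
        payload = stdin.read(int(headers["len"]))  # assumes an ASCII payload
        if headers["eventname"] in ALERT_EVENTS:
            info = parse_tokens(payload)
            message = "supervisor alert: %s entered %s" % (
                info.get("processname"), headers["eventname"])
            try:
                notify(message)
            except Exception as exc:
                # stdout is reserved for the protocol, so log to stderr
                sys.stderr.write("notify failed: %s\n" % exc)
                sys.stderr.flush()
        stdout.write("RESULT 2\nOK")
        stdout.flush()
```

To use it as the listener command, call `run()` at the bottom of the script. Other subscribed events such as TICK_60 are acknowledged without sending an alert.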

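The configuration file is as follows. The main parts are the program and event listener sections; the complete configuration file is attached at the end.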

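; Include other configuration files. Another approach is to split the configuration into separate files and include them here
[include]
files = /etc/supervisord.conf
[program:test]
; Command that starts the program; this one is straightforward
command=/usr/local/goproject/ding_server_v3/test
; Whether to start together with supervisord; set to true
autostart=true
; Restart the program automatically after it exits; set to true
autorestart=true
; When stopping the process, send the stop signal to the whole process group, including child processes; set to true
stopasgroup=true
; Send the kill signal to the whole process group, including child processes; set to true
killasgroup=true
; The following lines set the log files, their maximum size, and the number of backups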
stdout_logfile=/var/log/simpleHttp.std.log
stdout_logfile_maxbytes = 50MB
stdout_logfile_backups  = 10
stderr_logfile=/var/log/simpleHttp.err.log
stderr_logfile_maxbytes=50MB
stderr_logfile_backups=10

[eventlistener:testgolang]
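command=/usr/bin/python3 /opt/my_custom_listener_testgolang.py  ; Custom listener script. Give the full /usr/bin/python3 path; otherwise python2 may be picked up and the script will fail to run
events=PROCESS_STATE_EXITED,PROCESS_STATE_FATAL,TICK_60  ; Subscribed events: process exited, process failed to start, every 60 seconds
; The settings below are used the same way as in `[program:x]`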
autostart=true
autorestart=true
log_stdout=true
log_stderr=true
stdout_logfile=/opt/supervisor_event_exited-stdout.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=3
buffer_size=100
stderr_logfile=/opt/supervisor_event_exited-stderr.log
stderr_logfile_maxbytes=50MB
stderr_logfile_backups=3

Start supervisor: go to the supervisor installation directory and run

./supervisord -c /etc/supervisord.conf

Restart; the username and password are set in the configuration file

./supervisorctl -c /etc/supervisord.conf -u user -p 123 restart all

Kill the process that the program section runs

A message is then sent to the DingTalk group

Check the logs

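Pitfalls

kill -9, kill, a normal exit, and an abnormal exit are unrelated to one another as far as the listener is concerned. So when listening, we can simply subscribe to the process-exited event (PROCESS_STATE_EXITED). That way the exit is detected and the alert is triggered whether the exit was normal or abnormal.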
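
If the alert should still distinguish the two cases, the PROCESS_STATE_EXITED payload described in the Supervisor events documentation carries an `expected` field (1 when the exit code is one of the configured exitcodes, 0 otherwise). A small sketch of using it follows; the helper names `parse_payload` and `describe_exit` are made up for this example.

```python
def parse_payload(payload):
    """Turn 'key:value key:value ...' into a dict."""
    return dict(tok.split(":", 1) for tok in payload.split())


def describe_exit(payload):
    """Summarize a PROCESS_STATE_EXITED payload for an alert message."""
    info = parse_payload(payload)
    kind = "expected" if info.get("expected") == "1" else "unexpected"
    return "%s exited (%s, pid %s)" % (info.get("processname"), kind, info.get("pid"))
```

For a payload such as `processname:test groupname:test from_state:RUNNING expected:0 pid:2766`, the summary marks the exit as unexpected.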

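How do we check the logs?

Several logs are involved here: the Go program's output log, the event listener's log, and supervisor's own log. The first two are written to paths we specified ourselves; the supervisor log is located at

The Go program's log needs little explanation: it shows bugs in our own project.

On the supervisor side, we need to check the event listener log to judge whether the listener is working properly. We should also check supervisor's own log, because if supervisor is not working properly, the event listener most likely is not either.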
