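At first, I assumed that the BROKER_HEARTBEAT parameter in the Celery configuration file set a ten-second heartbeat for both the worker and RabbitMQ. Testing showed otherwise: only RabbitMQ, after ten seconds without a response from its peer, drops its TCP connection to Celery. The worker's side of the TCP connection does not react at all, and it took more than ten minutes before the worker realized it could no longer reach RabbitMQ.
Very strange!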
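For reference, the setting in question looks roughly like the fragment below. This is a minimal sketch, not my full configuration: the broker URL is a placeholder I made up for illustration, and only the ten-second value comes from the test described above.

```python
# celeryconfig.py (sketch)

# Hypothetical broker URL, for illustration only; replace with your own.
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# Heartbeat interval in seconds, as used in the test above.
# Newer Celery versions spell this setting in lowercase: broker_heartbeat.
BROKER_HEARTBEAT = 10
```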
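After some digging, I learned that RabbitMQ's heartbeat mechanism lives at the AMQP protocol level, and that the Celery worker's heartbeat is a different thing entirely (some posts in the Chinese-language community conflate the two, though I could not tell whose article that started with). An answer on Stack Overflow explains the Celery worker heartbeat: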
the heartbeat of celery worker is application level heartbeat, not
AMQP protocol’s heartbeat. Each worker periodically send heartbeat
event message to “celeryev” event exchange in BROKER. The heartbeat
event is forwarded back to worker such worker can know the health
status of BROKER. If number of loss heartbeat exceeding a threshold,
the worker can do some reconnect action to BROKER.
From: https://stackoverflow.com/questions/20957134/celery-heartbeat-not-working/21038204#21038204
The quote should speak for itself :)
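To make the quoted idea concrete, here is a small sketch of an application-level heartbeat check as that answer describes it: count how many heartbeats have gone missing and reconnect once the count exceeds a threshold. This is not Celery's actual code. The class name `HeartbeatMonitor`, its methods, and the default interval and threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical helper that tracks heartbeats echoed back by the broker."""

    interval: float = 10.0   # seconds between expected heartbeats (assumed value)
    max_missed: int = 3      # tolerated number of missed heartbeats (assumed value)
    last_seen: float = 0.0   # timestamp of the most recent heartbeat received

    def on_heartbeat(self, now: float) -> None:
        # Record that a heartbeat came back from the broker at time `now`.
        self.last_seen = now

    def missed(self, now: float) -> int:
        # Number of whole heartbeat intervals elapsed since the last heartbeat.
        return int((now - self.last_seen) // self.interval)

    def should_reconnect(self, now: float) -> bool:
        # Reconnect only once the missed count exceeds the threshold.
        return self.missed(now) > self.max_missed


monitor = HeartbeatMonitor()
monitor.on_heartbeat(now=100.0)
still_healthy = not monitor.should_reconnect(now=125.0)  # 2 intervals missed
needs_reconnect = monitor.should_reconnect(now=145.0)    # 4 intervals missed
```

The point of the sketch is that this check runs in the application and is separate from the AMQP heartbeat that RabbitMQ enforces on the connection itself.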