Background
I ran an experiment with Celery: split a batch of tasks into groups and execute them group by group, starting the next group only after the current one has finished. However, when the worker is started with a single process and the dispatching function is itself invoked asynchronously, polling the group's ready() hangs forever.
Server-side source (proj/tasks_2.py):
# -*- coding: utf-8 -*-
from __future__ import absolute_import, unicode_literals, print_function
import time

from .config import app
from celery import group


@app.task()
def save(s):
    print(s)


@app.task()
def distribution(indexs):
    # Dispatch the tasks in batches of 5, waiting for each batch
    # to finish before submitting the next one.
    for i in range(0, len(indexs), 5):
        start = i
        end = (i + 5) if (i + 5) <= len(indexs) else len(indexs)
        L = []
        for j in range(start, end):
            L.append(save.s(indexs[j]))
        res = group(L)()
        while not res.ready():
            print('sky hang...')  # shows up as the WARNING lines in the worker log
            time.sleep(1)
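As an aside, the batch splitting done by the inner loop above can be expressed more compactly with list slicing. A small equivalent sketch (the helper name `batches` is my own, not from the original code):

```python
def batches(items, size=5):
    """Split items into consecutive chunks of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# For the experiment's input, 8 items split into batches of 5:
print(batches([1, 2, 3, 4, 5, 6, 7, 8]))  # → [[1, 2, 3, 4, 5], [6, 7, 8]]
```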
Client-side source (client.py):
from proj.tasks_2 import distribution

if __name__ == '__main__':
    indexs = [1, 2, 3, 4, 5, 6, 7, 8]
    res = distribution.delay(indexs)
Execution results:
# Start the worker
$ celery -A proj.tasks_2 worker --concurrency=1 --loglevel=info
# Run the client
$ python client.py
# Worker-side log
[2020-07-15 17:04:06,105: INFO/MainProcess] Received task: proj.tasks_2.save[9a7876dc-e4f5-4a26-8b73-ed1fa2f1a786]
[2020-07-15 17:04:06,110: INFO/MainProcess] Received task: proj.tasks_2.save[d4e56ee1-4180-477a-ae02-a526056e2042]
[2020-07-15 17:04:06,113: INFO/MainProcess] Received task: proj.tasks_2.save[f4cd1e81-7bb8-4462-8b5c-a194175456d8]
[2020-07-15 17:04:06,114: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:07,116: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:08,118: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:09,120: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:10,122: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:11,124: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:12,125: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:13,127: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:14,129: WARNING/ForkPoolWorker-1] sky hang...
[2020-07-15 17:04:15,131: WARNING/ForkPoolWorker-1] sky hang...
Root-cause analysis
While debugging, I found that if the client calls the dispatching function distribution synchronously instead of asynchronously, for example like this:
from proj.tasks_2 import distribution

if __name__ == '__main__':
    indexs = [1, 2, 3, 4, 5, 6, 7, 8]
    res = distribution(indexs)
then the hang no longer occurs. Alternatively, starting the worker with a concurrency greater than 1 also avoids the hang in this case.
Combining this behavior with Celery's overall architecture, the cause becomes clear. With --concurrency=1 there is exactly one pool process. While that sole process is busy running the distribution task, no other process is available to run the save tasks it has just dispatched. But distribution cannot finish until those save tasks finish, so it spins in its `while not res.ready()` loop forever: a deadlock by worker starvation.
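The starvation can be reproduced without Celery at all. The sketch below is my own stdlib illustration, not from the original post: a ThreadPoolExecutor stands in for the worker pool, and the outer job submits sub-jobs to the same pool and then blocks on them, exactly like distribution waiting on its group:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def save(s):
    return s

def distribution(pool, indexs):
    # Submit sub-jobs to the SAME pool, then block on their results,
    # mirroring group(L)() followed by the ready() polling loop.
    futures = [pool.submit(save, i) for i in indexs]
    try:
        return [f.result(timeout=1) for f in futures]
    except FutureTimeout:
        return None  # starved: no free worker left to run the sub-jobs

def run(workers):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return pool.submit(distribution, pool, [1, 2, 3]).result()

print(run(2))  # → [1, 2, 3]   a second worker is free to run the sub-jobs
print(run(1))  # → None        the sole worker is stuck waiting on work only it could run
```

The timeout is there only so the demo terminates; without it, the single-worker case would block forever, just like the Celery experiment.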
From this analysis we can draw a general conclusion: no matter how many workers are started, some workload can always occupy all of them, so in Celery an asynchronous task should never block waiting on another asynchronous task it has dispatched; doing so plants a hidden deadlock trap. (The Celery documentation itself warns against launching synchronous subtasks for exactly this reason, recommending asynchronous designs such as chords and callbacks instead.)
Workaround
If the client must make the call asynchronously, one option is to bypass Celery's delay() for the outer function and run it in a background thread of our own:
from threading import Thread

from proj.tasks_2 import distribution


def run_async(f):  # note: 'async' is a reserved word since Python 3.7, so avoid it as a name
    def wrapper(*args, **kwargs):
        thr = Thread(target=f, args=args, kwargs=kwargs)
        thr.start()
        return thr
    return wrapper


@run_async
def func(indexs):
    res = distribution(indexs)


if __name__ == '__main__':
    indexs = [1, 2, 3, 4, 5, 6, 7, 8]
    #res = distribution_middle.delay(indexs)
    res = func(indexs)  # returns the Thread; distribution runs in the background
    print(res)
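The thread-based decorator gives the caller no way to retrieve distribution's return value. A variant built on concurrent.futures (a stdlib alternative I'm sketching here, not part of the original post) hands back a Future, so the result stays reachable:

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)  # background threads for asynchronous calls

def run_in_background(f):
    """Decorator: run f in a pool thread and return a Future for its result."""
    def wrapper(*args, **kwargs):
        return _pool.submit(f, *args, **kwargs)
    return wrapper

@run_in_background
def func(indexs):
    # stand-in for distribution(indexs), returning something checkable
    return [i * 2 for i in indexs]

future = func([1, 2, 3])
print(future.result())  # → [2, 4, 6]
```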
Appendix:
Configuration file (proj/config.py):
from __future__ import absolute_import, unicode_literals
from celery import Celery

app = Celery('proj',
             broker='redis://127.0.0.1:6379',
             backend='redis://127.0.0.1:6379/0',
             include=['proj.tasks'])

app.conf.update(
    result_expires=3600,
    #task_routes = {'proj.tasks.add': {'queue': 'hipri'}}
)

if __name__ == '__main__':
    app.start()