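1. Reproducing the problem
One day, after finishing a Celery periodic task, I started the Celery worker and was about to start celery-beat when a flood of warnings appeared and the process exited. Scrolling up, sure enough, there was an error: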
celery beat v5.1.2 (sun-harmonics) is starting.
[2021-11-01 18:32:11,917: INFO/MainProcess] beat: Starting...
[2021-11-01 18:32:12,071: CRITICAL/MainProcess] beat raised exception <class 'django.core.exceptions.ImproperlyConfigured'>: ImproperlyConfigured('settings.DATABASES is improperly configured. Please supply the ENGINE value. Check settings documentation for more details.')
Traceback (most recent call last):
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/celery/apps/beat.py", line 105, in start_scheduler
service.start()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/celery/beat.py", line 636, in start
humanize_seconds(self.scheduler.max_interval))
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/kombu/utils/objects.py", line 29, in __get__
return super().__get__(instance, owner)
File "/Users/systemime/.pyenv/versions/3.9.6/lib/python3.9/functools.py", line 969, in __get__
val = self.func(instance)
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/celery/beat.py", line 680, in scheduler
return self.get_scheduler()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/celery/beat.py", line 671, in get_scheduler
return symbol_by_name(self.scheduler_cls, aliases=aliases)(
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django_celery_beat/schedulers.py", line 231, in __init__
Scheduler.__init__(self, *args, **kwargs)
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/celery/beat.py", line 271, in __init__
self.setup_schedule()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django_celery_beat/schedulers.py", line 239, in setup_schedule
self.install_default_entries(self.schedule)
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django_celery_beat/schedulers.py", line 340, in install_default_entries
self.update_from_dict(entries)
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django_celery_beat/schedulers.py", line 328, in update_from_dict
self.schedule.update(s)
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django_celery_beat/schedulers.py", line 355, in schedule
elif self.schedule_changed():
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django_celery_beat/schedulers.py", line 261, in schedule_changed
transaction.commit()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/db/transaction.py", line 35, in commit
get_connection(using).commit()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/db/backends/base/base.py", line 266, in commit
self._commit()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/db/backends/dummy/base.py", line 20, in complain
raise ImproperlyConfigured("settings.DATABASES is improperly configured. "
django.core.exceptions.ImproperlyConfigured: settings.DATABASES is improperly configured. Please supply the ENGINE value. Check settings documentation for more details.
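Hmm?? The last line of the traceback claims that my Django database configuration is wrong. Out of curiosity I checked the configuration and wrote a few ORM tests, but found nothing wrong, so I decided to trace the error path.
2. Locating the problem
From the traceback above, the relevant path starts where the scheduler class is initialized: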
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/celery/beat.py", line 271, in __init__
self.setup_schedule()
and fails when it tries to obtain a Django database connection:
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/db/transaction.py", line 35, in commit
get_connection(using).commit()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/db/backends/base/base.py", line 266, in commit
self._commit()
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django/db/backends/dummy/base.py", line 20, in complain
raise ImproperlyConfigured("settings.DATABASES is improperly configured. "
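Seeing the using argument in the connection lookup, I suddenly remembered that my Django project is configured with multiple databases. Django's using argument names the database to operate on; when it is omitted, ORM operations pick a database through the configured database routers or fall back to the default database.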
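For context, a multi-database settings fragment of the kind that leads to this situation could look like the sketch below. The aliases app_db and celery_db, the SQLite file names, and the router path are hypothetical, not my real configuration. The relevant detail is the empty default entry, which Django allows when the default database is not used.

```python
# settings.py (illustrative fragment; aliases, file names and router path are hypothetical)
DATABASES = {
    # Django requires the "default" key to exist, but its parameters may be
    # left blank when the project never uses a default database.
    "default": {},
    "app_db": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": "app.sqlite3",
    },
    "celery_db": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": "celery.sqlite3",
    },
}

# Routers decide which alias each ORM operation uses when "using" is omitted.
DATABASE_ROUTERS = ["your_project.routers.CeleryBeatRouter"]
```

With a layout like this, any code path that silently falls back to the default alias ends up on an entry with no ENGINE.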
Looking one step further up, just before the connection is fetched:
File "/Users/systemime/.pyenv/versions/3.9.6/envs/vps-django/lib/python3.9/site-packages/django_celery_beat/schedulers.py", line 261, in schedule_changed
transaction.commit()
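django-celery-beat commits a transaction here without specifying a database. Does the connection lookup really not go through the database routers to find the right connection?
Out of curiosity I looked at the source of transaction.commit: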
def commit(using=None):
    """Commit a transaction."""
    get_connection(using).commit()
and then at the source of get_connection:
DEFAULT_DB_ALIAS = 'default'
...
def get_connection(using=None):
    """
    Get a database connection by name, or the default database connection
    if no name is provided. This is a private API.
    """
    if using is None:
        using = DEFAULT_DB_ALIAS
    return connections[using]
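The fallback can be reproduced without Django at all. The sketch below is a simplified model of the lookup, not Django's real code: the connection classes, the exception class and the celery_db alias are stand-ins I made up for illustration. It only shows that a commit without using always lands on the default alias and never consults a router.

```python
DEFAULT_DB_ALIAS = "default"


class ImproperlyConfigured(Exception):
    """Stand-in for django.core.exceptions.ImproperlyConfigured."""


class DummyConnection:
    """Mimics the dummy backend that serves an alias with no ENGINE."""

    def commit(self):
        raise ImproperlyConfigured("Please supply the ENGINE value.")


class RealConnection:
    """Mimics a properly configured connection; it just counts commits."""

    def __init__(self):
        self.commits = 0

    def commit(self):
        self.commits += 1


# Hypothetical aliases: "default" is empty, "celery_db" is usable.
connections = {"default": DummyConnection(), "celery_db": RealConnection()}


def get_connection(using=None):
    # Same shape as the Django helper: no router lookup, only the fallback.
    if using is None:
        using = DEFAULT_DB_ALIAS
    return connections[using]


def commit(using=None):
    get_connection(using).commit()


commit(using="celery_db")  # explicit alias: works

try:
    commit()  # no alias: falls back to "default" and fails
    failed = False
except ImproperlyConfigured:
    failed = True
```

Passing the alias explicitly is therefore the only way to steer this call, which is what the fix in the next section does.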
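Case closed: django-celery-beat does not pass the using argument here, so Django fetches the default database connection. In my multi-database setup the default entry in DATABASES is empty, so the commit lands on the dummy backend and raises ImproperlyConfigured. In other words, django-celery-beat (at least the version I had installed) does not really support this kind of multi-database configuration.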
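3. Solution
After configuring multiple databases we normally also define database routers (otherwise everything still goes to the default entry). Once the routers are registered, Django's subsequent ORM operations consult them automatically to pick the database for each operation.
So the first step is to find out which database our routers assign to django-celery-beat's models.
Let's first try patching the library source at the spot that raises the error: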
"""
django_celery_beat/schedulers.py
"""
from django.conf import settings  # needed for settings.DATABASE_ROUTERS if not already imported
from django.db import router, DEFAULT_DB_ALIAS
......  # rest of the original source

class DatabaseScheduler(Scheduler):
    ......
    # Add the following
    @property
    def target_db(self):
        """Determine if there is a django route"""
        # First check whether any Django database routers are configured.
        if not settings.DATABASE_ROUTERS:
            return DEFAULT_DB_ALIAS
        # If the project does not actually implement db_for_write,
        # DEFAULT_DB_ALIAS is returned automatically, and any resulting
        # exception will point at the Django routing section.
        # With routers configured, ask them for the write database.
        # self.Model is the django-celery-beat model.
        db = router.db_for_write(self.Model)
        return db

    def schedule_changed(self):
        try:
            ......
            try:
                transaction.commit(using=self.target_db)
            except transaction.TransactionManagementError:
                pass  # not in transaction management.
        except DatabaseError as exc:
            ......
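Note ⚠️:
- If your database router does not implement db_for_write, Django falls back to the default entry in DATABASES, so db_for_write must be implemented and must resolve django-celery-beat's models to the correct database.
- Remember to add django_celery_beat to INSTALLED_APPS and run the migrations.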
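A router that satisfies the first point can be quite small. The following is a minimal sketch, assuming the scheduler tables live in a database alias named celery_db; the class name and the alias are hypothetical, while django_celery_beat is the app label the package registers. The two SimpleNamespace objects only imitate a model's _meta.app_label so the routing logic can be exercised without Django.

```python
from types import SimpleNamespace


class CeleryBeatRouter:
    """Hypothetical router that sends django_celery_beat models to one alias."""

    route_app_labels = {"django_celery_beat"}
    target_alias = "celery_db"  # assumed alias, adjust to your DATABASES

    def db_for_read(self, model, **hints):
        if model._meta.app_label in self.route_app_labels:
            return self.target_alias
        return None  # no opinion: let other routers or the default decide

    def db_for_write(self, model, **hints):
        if model._meta.app_label in self.route_app_labels:
            return self.target_alias
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Only create the scheduler tables in the target database.
        if app_label in self.route_app_labels:
            return db == self.target_alias
        return None


# Stand-ins for real model classes, used only to try the router out.
beat_model = SimpleNamespace(_meta=SimpleNamespace(app_label="django_celery_beat"))
other_model = SimpleNamespace(_meta=SimpleNamespace(app_label="blog"))

router_instance = CeleryBeatRouter()
beat_alias = router_instance.db_for_write(beat_model)
other_alias = router_instance.db_for_write(other_model)
```

Returning None for unrelated models keeps this router from interfering with whatever other routers the project already has.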
Start celery-beat again:
❯ celery -A skill_test.app beat -l INFO
celery beat v5.1.2 (sun-harmonics) is starting.
__ - ... __ - _
LocalTime -> 2021-11-03 15:24:59
Configuration ->
. broker -> redis://localhost:6379//
. loader -> celery.loaders.app.AppLoader
. scheduler -> django_celery_beat.schedulers.DatabaseScheduler
. logfile -> [stderr]@%INFO
. maxinterval -> 5.00 seconds (5s)
[2021-11-03 15:24:59,065: INFO/MainProcess] beat: Starting...
[2021-11-03 15:24:59,234: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2021-11-03 15:25:00,014: INFO/MainProcess] Scheduler: Sending due task xxx (apps.xxx.tasks.xxx)
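It starts successfully. That said, editing library source is generally not recommended. Remember that using celery-beat with the database scheduler requires setting CELERY_BEAT_SCHEDULER? We can add router support by subclassing the scheduler instead.
The usual setting is: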
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
Subclass DatabaseScheduler and override the relevant methods:
from functools import partial

from django.db import router, transaction
from django_celery_beat.schedulers import DatabaseScheduler


class SQDatabaseSchedule(DatabaseScheduler):
    def sync(self):
        # Run the sync inside a transaction on the routed database.
        with transaction.atomic(using=router.db_for_write(self.Model)):
            super().sync()

    def schedule_changed(self):
        db = router.db_for_write(self.Model)
        # Note: this rebinds transaction.commit for the whole process.
        transaction.commit = partial(transaction.commit, using=db)
        # Return the parent's result so beat knows whether the schedule changed.
        return super().schedule_changed()
Update the setting so that it points at the subclass:
CELERY_BEAT_SCHEDULER = 'your_project.your_project_path:SQDatabaseSchedule'
Start it once more:
❯ celery -A skill_test.app beat -l INFO
celery beat v5.1.2 (sun-harmonics) is starting.
__ - ... __ - _
LocalTime -> 2021-11-03 15:26:59
Configuration ->
. broker -> redis://localhost:6379//
. loader -> celery.loaders.app.AppLoader
. scheduler -> django_celery_beat.schedulers.DatabaseScheduler
. logfile -> [stderr]@%INFO
. maxinterval -> 5.00 seconds (5s)
[2021-11-03 15:26:59,065: INFO/MainProcess] beat: Starting...
[2021-11-03 15:26:59,234: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2021-11-03 15:26:00,014: INFO/MainProcess] Scheduler: Sending due task xxx (apps.xxx.tasks.xxx)
It starts successfully. Done!