APScheduler Source Code Analysis and Usage
Introduction to APScheduler
APScheduler is a popular and convenient Python framework for scheduled (timed) tasks; for an introduction and the full API, see the official documentation.
Core Concepts
- Job: defines the function a scheduled task runs, its arguments, and the task's execution-related configuration.
- Trigger: defines how a task is triggered; supports cron, date, interval, and combinations of these.
- JobStore: stores the scheduled jobs; in-memory by default, with support for MongoDB, Redis, SQLAlchemy, and others.
- Executor: runs the jobs; a thread pool by default, with support for a process pool, gevent, Tornado, asyncio, and others.
- Scheduler: schedules the jobs, ties the other components together, and exposes a convenient public API.
- Listener: listens for events inside the scheduler; four event categories are provided: scheduler events, job events, job-submission events, and job-execution events.
Scheduler Architecture Diagram
Module Definitions
Scheduler base class
A single scheduler can hold multiple JobStores, multiple Executors, and multiple Listeners.
The scheduler also maintains a state variable recording its current state.
class BaseScheduler(six.with_metaclass(ABCMeta)):
    def __init__(self, gconfig={}, **options):
        super(BaseScheduler, self).__init__()
        self._executors = {}                # maps executor aliases to instances
        self._executors_lock = self._create_lock()
        self._jobstores = {}                # maps jobstore aliases to instances
        self._jobstores_lock = self._create_lock()
        self._listeners = []                # list of event listeners
        self._listeners_lock = self._create_lock()
        self._pending_jobs = []             # jobs waiting to be added to a jobstore
        self.state = STATE_STOPPED          # scheduler state variable
        self.configure(gconfig, **options)  # configure the scheduler
JobStore
A JobStore maintains a list of jobs that can be viewed as sorted in ascending order of next run time. MemoryJobStore keeps this sorted list in memory, while database-backed stores (MongoDB, Redis, etc.) delegate the ordering to the database engine. A database-backed JobStore also gives jobs durable, persistent storage.
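The "sorted by next run time" invariant is easy to see in a simplified in-memory store. The following is a hypothetical sketch (not the library's actual MemoryJobStore code) that keeps a plain list sorted on insert, so due jobs are always a prefix of the list:

```python
import bisect


class TinyMemoryJobStore(object):
    """Illustrative in-memory jobstore: jobs kept sorted by next_run_time."""

    def __init__(self):
        self._jobs = []  # list of (next_run_time, job_id), kept sorted

    def add_job(self, job_id, next_run_time):
        # Insert at the position that preserves ascending order.
        bisect.insort(self._jobs, (next_run_time, job_id))

    def get_due_jobs(self, now):
        # Because the list is sorted, the due jobs form a prefix of it.
        return [job_id for ts, job_id in self._jobs if ts <= now]

    def get_next_run_time(self):
        # The earliest run time is always the head of the list.
        return self._jobs[0][0] if self._jobs else None


store = TinyMemoryJobStore()
store.add_job('b', 20)
store.add_job('a', 10)
store.add_job('c', 30)
print(store.get_due_jobs(20))     # ['a', 'b']
print(store.get_next_run_time())  # 10
```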
Job
class Job(object):
    __slots__ = (
        '_scheduler',          # the scheduler this job belongs to
        '_jobstore_alias',     # alias of the JobStore holding this job
        'id',                  # job id
        'trigger',             # the job's trigger
        'executor',            # alias of the job's executor
        'func',                # the callable the job runs
        'func_ref',            # serialized reference to the callable
        'args',
        'kwargs',
        'name',                # job description
        'misfire_grace_time',  # the time (in seconds) how much this job's execution is allowed to be late (None means "allow the job to run no matter how late it is")
        'coalesce',            # whether to only run the job once when several run times are due
        'max_instances',       # the maximum number of concurrently executing instances allowed for this job
        'next_run_time',       # the next scheduled run time of this job
        '__weakref__'
    )

    def __init__(self, scheduler, id=None, **kwargs):
        super(Job, self).__init__()
        self._scheduler = scheduler
        self._jobstore_alias = None
        self._modify(id=id or uuid4().hex, **kwargs)
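Two of these fields interact in a way worth spelling out: misfire_grace_time decides whether a late run still happens at all, and coalesce collapses several missed runs into one. A hypothetical sketch of that filtering logic (mirroring the behavior described in the comments above, not the library's exact code):

```python
def runnable_times(due_times, now, misfire_grace_time, coalesce):
    """Return the run times that should actually execute at `now`.

    due_times: scheduled run times (in seconds) that are <= now, ascending.
    misfire_grace_time: max allowed lateness in seconds, or None for unlimited.
    coalesce: if True, collapse all missed runs into the most recent one.
    """
    if misfire_grace_time is not None:
        # Drop runs that are already too late to be worth executing.
        due_times = [t for t in due_times if now - t <= misfire_grace_time]
    if coalesce and due_times:
        # Keep only the most recent missed run.
        due_times = due_times[-1:]
    return due_times


# Three runs were missed; with a 15-second grace period only the last two
# are still eligible, and coalescing keeps just the most recent one.
print(runnable_times([100, 110, 120], now=125, misfire_grace_time=15, coalesce=True))   # [120]
print(runnable_times([100, 110, 120], now=125, misfire_grace_time=15, coalesce=False))  # [110, 120]
```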
Workflow
The walkthrough below uses the BlockingScheduler + MemoryJobStore + ThreadPoolExecutor combination as an example.
- When time t1 arrives, jobstore.get_due_jobs() is called to fetch all due jobs from the jobstore.
- For each due job, its executor alias is resolved to an executor instance via scheduler._lookup_executor().
- executor.submit_job() hands the job to the executor for execution.
- On successful submission, trigger.get_next_fire_time() computes the job's next run time, and job._modify() updates the job with it.
- jobstore.update_job() writes the updated job back into the jobstore.
- jobstore.get_next_run_time() returns the earliest upcoming run time in the jobstore.
- The scheduler takes the earliest run time across jobstores, sleeps until then via scheduler._event.wait(), and enters the next loop iteration.
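The steps above can be condensed into a single simplified "tick", here with stdlib stand-ins for the jobstore, trigger, and executor (an illustrative sketch, not the library code; the job dicts and interval values are invented for the example):

```python
from concurrent.futures import ThreadPoolExecutor

# Fake components mimicking the interfaces named in the steps above.
jobs = [
    {'id': 'j1', 'next_run_time': 5, 'interval': 10},
    {'id': 'j2', 'next_run_time': 12, 'interval': 10},
]
executor = ThreadPoolExecutor(max_workers=2)
executed = []


def get_due_jobs(now):                     # jobstore.get_due_jobs()
    return [j for j in jobs if j['next_run_time'] <= now]


def get_next_fire_time(job, now):          # trigger.get_next_fire_time()
    return job['next_run_time'] + job['interval']


def process_jobs(now):
    for job in get_due_jobs(now):
        executor.submit(executed.append, job['id'])          # executor.submit_job()
        job['next_run_time'] = get_next_fire_time(job, now)  # job._modify() + update_job()
    next_run = min(j['next_run_time'] for j in jobs)         # jobstore.get_next_run_time()
    return max(next_run - now, 0)          # how long the scheduler sleeps


wait = process_jobs(now=5)   # only j1 is due; its next run moves to 15
executor.shutdown(wait=True)
print(executed)  # ['j1']
print(wait)      # 7  (the next due job is j2 at t=12)
```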
Scheduler Main Loop
# BlockingScheduler._main_loop()
def _main_loop(self):
    wait_seconds = TIMEOUT_MAX
    while self.state != STATE_STOPPED:
        self._event.wait(wait_seconds)
        self._event.clear()
        wait_seconds = self._process_jobs()
# BaseScheduler._process_jobs()
def _process_jobs(self):
    if self.state == STATE_PAUSED:
        self._logger.debug('Scheduler is paused -- not processing jobs')
        return None

    self._logger.debug('Looking for jobs to run')
    now = datetime.now(self.timezone)
    next_wakeup_time = None
    events = []

    with self._jobstores_lock:
        # Iterate over the jobstores and collect all due jobs from each
        for jobstore_alias, jobstore in six.iteritems(self._jobstores):
            try:
                due_jobs = jobstore.get_due_jobs(now)
            except Exception as e:
                # Schedule a wakeup at least in jobstore_retry_interval seconds
                self._logger.warning('Error getting due jobs from job store %r: %s',
                                     jobstore_alias, e)
                retry_wakeup_time = now + timedelta(seconds=self.jobstore_retry_interval)
                if not next_wakeup_time or next_wakeup_time > retry_wakeup_time:
                    next_wakeup_time = retry_wakeup_time
                continue

            for job in due_jobs:
                # Look up the job's executor
                try:
                    executor = self._lookup_executor(job.executor)  # resolve the job's executor instance
                except BaseException:
                    self._logger.error(
                        'Executor lookup ("%s") failed for job "%s" -- removing it from the '
                        'job store', job.executor, job)
                    self.remove_job(job.id, jobstore_alias)
                    continue

                run_times = job._get_run_times(now)
                run_times = run_times[-1:] if run_times and job.coalesce else run_times
                if run_times:
                    try:
                        executor.submit_job(job, run_times)  # hand the job to the executor
                    except MaxInstancesReachedError:
                        self._logger.warning(
                            'Execution of job "%s" skipped: maximum number of running '
                            'instances reached (%d)', job, job.max_instances)
                        event = JobSubmissionEvent(EVENT_JOB_MAX_INSTANCES, job.id,
                                                   jobstore_alias, run_times)
                        events.append(event)
                    except BaseException:
                        self._logger.exception('Error submitting job "%s" to executor "%s"',
                                               job, job.executor)
                    else:
                        event = JobSubmissionEvent(EVENT_JOB_SUBMITTED, job.id, jobstore_alias,
                                                   run_times)
                        events.append(event)

                    # Update the job if it has a next execution time.
                    # Otherwise remove it from the job store.
                    job_next_run = job.trigger.get_next_fire_time(run_times[-1], now)
                    if job_next_run:
                        job._modify(next_run_time=job_next_run)
                        jobstore.update_job(job)
                    else:
                        self.remove_job(job.id, jobstore_alias)

            # Set a new next wakeup time if there isn't one yet or
            # the jobstore has an even earlier one
            jobstore_next_run_time = jobstore.get_next_run_time()
            if jobstore_next_run_time and (next_wakeup_time is None or
                                           jobstore_next_run_time < next_wakeup_time):
                next_wakeup_time = jobstore_next_run_time.astimezone(self.timezone)

    # Dispatch collected events
    for event in events:
        self._dispatch_event(event)

    # Determine the delay until this method should be called again
    if self.state == STATE_PAUSED:
        wait_seconds = None
        self._logger.debug('Scheduler is paused; waiting until resume() is called')
    elif next_wakeup_time is None:
        wait_seconds = None
        self._logger.debug('No jobs; waiting until a job is added')
    else:
        wait_seconds = min(max(timedelta_seconds(next_wakeup_time - now), 0), TIMEOUT_MAX)
        self._logger.debug('Next wakeup is due at %s (in %f seconds)', next_wakeup_time,
                           wait_seconds)

    return wait_seconds
APScheduler in Distributed Deployments
In production, scheduled tasks usually need to run on multiple instances so that a single point of failure cannot take down all of them; at the same time, a given task should not run once on every APScheduler instance. This raises the question of how APScheduler behaves in a distributed setup. Although job storage supports distributed backends such as MongoDB and Redis, APScheduler never takes a lock when fetching jobs from the jobstore, so by itself it does not support distributed execution. To achieve it, an external distributed lock must be introduced: each APScheduler instance locks a job when fetching it from the jobstore, which prevents the same job from being executed on every instance.
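One way to add that lock is to acquire a per-job, per-run-time lock before executing, so only the instance that wins the lock runs the job. The sketch below simulates the shared lock store with a plain dict standing in for a Redis SETNX-style operation (the helper names and key format are hypothetical; a real deployment would use an actual distributed lock, e.g. Redis SET with NX and an expiry):

```python
lock_store = {}  # stand-in for a shared Redis instance


def acquire_lock(key, owner):
    """Mimics Redis SET key owner NX: only the first caller succeeds."""
    if key in lock_store:
        return False
    lock_store[key] = owner
    return True


def run_job_once(instance_id, job_id, run_time, func):
    """Each scheduler instance calls this; only the lock winner executes."""
    lock_key = 'apscheduler:%s:%s' % (job_id, run_time)
    if not acquire_lock(lock_key, instance_id):
        return False  # another instance already owns this run
    func()
    return True


results = []
for instance in ('node-1', 'node-2', 'node-3'):
    # All three instances see the same due job at the same run time ...
    run_job_once(instance, 'report-job', 1700000000, lambda: results.append('ran'))

# ... but the job body runs exactly once, on the lock winner.
print(results)  # ['ran']
```

Keying the lock on (job id, run time) rather than job id alone lets the next scheduled run acquire a fresh lock; in a real Redis-backed version, the expiry on the lock also guards against a crashed instance holding it forever.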