Fixing duplicate execution of APScheduler scheduled jobs

python_tty

Category: Advanced Python

Article link: https://blog.csdn.net/python_tty/article/details/87256011

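The project needed to add scheduled jobs dynamically. I looked at Celery first, but Celery could not do this: after a periodic task is added, beat has to be restarted before the task takes effect. Then I looked at APScheduler. It can add jobs dynamically, but it has one drawback: it is not a standalone service and runs inside the main service.

The project is deployed with gunicorn, and this is where the problem appeared. Every gunicorn worker initializes its own Flask app instance, and each app holds its own APScheduler instance. When a job comes due, every worker runs it, so the job is executed multiple times.

I read the gunicorn documentation and tried the --preload option, which initializes the Flask app first and only then starts the workers. With this option the APScheduler instance in the workers was None, so jobs could no longer be added dynamically.

In the end I overrode the _process_jobs method of APScheduler's BackgroundScheduler. The main idea is to add a file lock: a process must hold the file lock before it can fetch the jobs that are due. Once one process has acquired the lock and run the due jobs, it releases the lock. When the other processes then acquire the lock, the number of due jobs is 0, which solves the duplicate execution problem.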
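The locking idea can be tried in isolation before reading the scheduler override. The sketch below is not from the original project. It assumes a Unix-like system (the fcntl module is not available on Windows), and the helper name `try_lock` and the temporary lock path are made up for illustration. It opens the same lock file twice to stand in for two workers; because flock locks belong to the open file, the second non-blocking attempt is refused while the first one is held.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an open file holding an exclusive lock on path, or None if it is taken.

    Hypothetical helper, for illustration only.
    """
    f = open(path, "wb")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the holder
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        f.close()
        return None
    return f


# Assumed throwaway location; the article's code uses "scheduler.lock" in the working directory
lock_path = os.path.join(tempfile.mkdtemp(), "scheduler.lock")

first = try_lock(lock_path)    # plays the worker that wins the lock
second = try_lock(lock_path)   # plays a worker that arrives while the lock is held

# The winner releases the lock once it is done
fcntl.flock(first, fcntl.LOCK_UN)
first.close()

third = try_lock(lock_path)    # after the release, the lock can be taken again
```

Here `second` is None because the lock was still held, while `first` and `third` are open file objects.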

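# Imports assume APScheduler 3.x, whose _process_jobs this method is adapted from
import fcntl
import logging
import os
from datetime import datetime, timedelta

import six
from apscheduler.events import (EVENT_JOB_MAX_INSTANCES, EVENT_JOB_SUBMITTED,
                                JobSubmissionEvent)
from apscheduler.executors.base import MaxInstancesReachedError
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.schedulers.base import STATE_PAUSED
from apscheduler.util import TIMEOUT_MAX, timedelta_seconds

# Module-level logger used alongside the scheduler's own self._logger
logger = logging.getLogger(__name__)


class CuBackgroundScheduler(BackgroundScheduler):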

    def _process_jobs(self):
        """
        Iterates through jobs in every jobstore, starts jobs that are due and figures out how long
        to wait for the next round.

        If the ``get_due_jobs()`` call raises an exception, a new wakeup is scheduled in at least
        ``jobstore_retry_interval`` seconds.

        """
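        f = open("scheduler.lock", "wb")
        wait_seconds = None
        try:
            # Non-blocking exclusive lock: fails at once if another worker holds it
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            logger.info("pid {%s} acquired the lock" % os.getpid())
        except (IOError, OSError) as exc:
            # Another process holds the lock, so this worker skips the round
            logger.error("[CuBackgroundScheduler] error: %s    os.pid:%s" % (str(exc), os.getpid()))
            f.close()
        else:
            logger.info("pid {%s} Scheduler is {%s}" % (os.getpid(), self.state))
            if self.state == STATE_PAUSED:
                self._logger.debug('Scheduler is paused -- not processing jobs')
                # Release the lock explicitly before the early return
                fcntl.flock(f, fcntl.LOCK_UN)
                f.close()
                return None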

            self._logger.debug('Looking for jobs to run and os.pid is {%s}' % os.getpid())
            logger.info('Looking for jobs to run and os.pid is {%s}' % os.getpid())
            now = datetime.now(self.timezone)
            next_wakeup_time = None
            events = []

            with self._jobstores_lock:
                for jobstore_alias, jobstore in six.iteritems(self._jobstores):
                    try:
                        due_jobs = jobstore.get_due_jobs(now)
                        self._logger.info("due_jobs:%s     os.pid: %s\n" % (len(due_jobs), os.getpid()))
                        logger.info("due_jobs:%s     os.pid: %s\n" % (len(due_jobs), os.getpid()))
                    except Exception as e:
                        # Schedule a wakeup at least in jobstore_retry_interval seconds
                        self._logger.warning('Error getting due jobs from job store %r: %s',
                                             jobstore_alias, e)
                        retry_wakeup_time = now + timedelta(seconds=self.jobstore_retry_interval)
                        if not next_wakeup_time or next_wakeup_time > retry_wakeup_time:
                            next_wakeup_time = retry_wakeup_time

                        continue

                    for job in due_jobs:
                        # Look up the job's executor
                        try:
                            executor = self._lookup_executor(job.executor)
                        except BaseException:
                            self._logger.error(
                                'Executor lookup ("%s") failed for job "%s" -- removing it from the '
                                'job store', job.executor, job)
                            logger.error(
                                'Executor lookup ("%s") failed for job "%s" -- removing it from the '
                                'job store', job.executor, job)
                            self.remove_job(job.id, jobstore_alias)
                            continue

                        run_times = job._get_run_times(now)
                        run_times = run_times[-1:] if run_times and job.coalesce else run_times
                        if run_times:
                            try:
                                executor.submit_job(job, run_times)
                            except MaxInstancesReachedError:
                                self._logger.warning(
                                    'Execution of job "%s" skipped: maximum number of running '
                                    'instances reached (%d)', job, job.max_instances)
                                event = JobSubmissionEvent(EVENT_JOB_MAX_INSTANCES, job.id,
                                                           jobstore_alias, run_times)
                                events.append(event)
                            except BaseException:
                                self._logger.exception('Error submitting job "%s" to executor "%s"',
                                                       job, job.executor)
                            else:
                                event = JobSubmissionEvent(EVENT_JOB_SUBMITTED, job.id, jobstore_alias,
                                                           run_times)
                                events.append(event)

                            # Update the job if it has a next execution time.
                            # Otherwise remove it from the job store.
                            job_next_run = job.trigger.get_next_fire_time(run_times[-1], now)
                            if job_next_run:
                                job._modify(next_run_time=job_next_run)
                                jobstore.update_job(job)
                            else:
                                self.remove_job(job.id, jobstore_alias)

                    # Set a new next wakeup time if there isn't one yet or
                    # the jobstore has an even earlier one
                    jobstore_next_run_time = jobstore.get_next_run_time()
                    if jobstore_next_run_time and (next_wakeup_time is None or
                                                   jobstore_next_run_time < next_wakeup_time):
                        next_wakeup_time = jobstore_next_run_time.astimezone(self.timezone)

            # Dispatch collected events
            for event in events:
                self._dispatch_event(event)

            # Determine the delay until this method should be called again
            if self.state == STATE_PAUSED:
                self._logger.debug('Scheduler is paused; waiting until resume() is called')
            elif next_wakeup_time is None:
                self._logger.debug('No jobs; waiting until a job is added')
            else:
                wait_seconds = min(max(timedelta_seconds(next_wakeup_time - now), 0), TIMEOUT_MAX)
                self._logger.debug('Next wakeup is due at %s (in %f seconds)', next_wakeup_time,
                                   wait_seconds)
                logger.info('os.pid {%s}  Next wakeup is due at %s (in %f seconds)', os.getpid(), next_wakeup_time,
                             wait_seconds)

            fcntl.flock(f, fcntl.LOCK_UN)
            logger.info("unlocked the pid {%s} success!!!" % os.getpid())
            f.close()

        return wait_seconds
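One possible tidy-up, offered as a sketch rather than as the author's code: the acquire, skip and release steps could be wrapped in a context manager so that the lock is released on every exit path, including early returns and exceptions. The name `scheduler_lock` and its `path` parameter are hypothetical, and the sketch again assumes a Unix-like system.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def scheduler_lock(path="scheduler.lock"):
    """Yield True if this process got the lock, False if another holder has it.

    Hypothetical helper, for illustration only.
    """
    f = open(path, "wb")
    try:
        try:
            # Non-blocking attempt, same flags as in the override above
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Lock is held elsewhere: let the caller know and skip the unlock
            yield False
            return
        try:
            yield True
        finally:
            # Runs on normal exit, early return, and exceptions alike
            fcntl.flock(f, fcntl.LOCK_UN)
    finally:
        f.close()
```

Inside _process_jobs, the body would then start with `with scheduler_lock() as acquired:` followed by `if not acquired: return None`, and the explicit unlock and close calls could be dropped.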
