Scheduled Task Series (6): Core Principles of Quartz Startup with RAMJobStore

The previous article bound the job to its trigger. This one continues with scheduler.start():

    public void start() throws SchedulerException {

        if (shuttingDown || closed) {
            throw new SchedulerException(
                    "The Scheduler cannot be restarted after shutdown() has been called.");
        }

        // QTZ-212 : calling new schedulerStarting() method on the listeners
        // right after entering start()
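        // notify the scheduler listeners that the scheduler is starting
        notifySchedulerListenersStarting();
        // initialStart is null on the first start, so do the first-time initialization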
        if (initialStart == null) {
            initialStart = new Date();
            this.resources.getJobStore().schedulerStarted();            
            startPlugins();
        } else {
            // already started once before, so resume the JobStore instead
            resources.getJobStore().schedulerResumed();
        }

        schedThread.togglePause(false);

        getLog().info(
                "Scheduler " + resources.getUniqueIdentifier() + " started.");
        // notify the scheduler listeners that the scheduler has started
        notifySchedulerListenersStarted();
    }

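Within scheduler.start(), look at schedThread.togglePause(false) first; the initialization guarded by initialStart is discussed afterwards.
The article [Scheduled Task Series (4): Core Principles of Creating the Quartz Scheduler] already touched on schedThread.togglePause(false): its section [QuartzSchedulerThread main-thread scheduling] noted that the main loop waits until togglePause(false) is called.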

    void togglePause(boolean pause) {
        synchronized (sigLock) {
            paused = pause;

            if (paused) {
                signalSchedulingChange(0);
            } else {
                sigLock.notifyAll();
            }
        }
    }

This sets paused to false and wakes any thread waiting on sigLock. Now look back at the main loop of QuartzSchedulerThread:

	 // check if we're supposed to pause...
	 synchronized (sigLock) {
	     while (paused && !halted.get()) {
	         try {
	             // wait until togglePause(false) is called...
	             log.info("QuartzSchedulerThread waiting for togglePause(false) to be called... " + LocalDateTime.now());
	             sigLock.wait(1000L);
	         } catch (InterruptedException ignore) {
	         }
	
	         // reset failure counter when paused, so that we don't
	         // wait again after unpausing
	         acquiresFailed = 0;
	     }
	
	     if (halted.get()) {
	         break;
	     }
	 }

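Once paused is false, the while condition no longer holds, so the thread stops waiting and falls through to the rest of the loop body. The break is taken only if the scheduler has been halted.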
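
The interplay between togglePause and the wait loop can be reproduced without Quartz. The following is a minimal sketch, not the Quartz implementation: the class PauseGateSketch and its methods awaitUnpaused and demo are hypothetical, and only the names sigLock, paused and halted are borrowed from the listings above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Simplified model of the pause gate; hypothetical class, not Quartz code. */
final class PauseGateSketch {
    private final Object sigLock = new Object();
    private boolean paused = true;   // the loop starts out paused
    private boolean halted = false;  // never set in this sketch

    /** Same shape as togglePause: flip the flag, wake waiters when unpausing. */
    void togglePause(boolean pause) {
        synchronized (sigLock) {
            paused = pause;
            if (!paused) {
                sigLock.notifyAll();
            }
        }
    }

    /** Blocks while paused; returns true once released, false if halted or interrupted. */
    boolean awaitUnpaused() {
        synchronized (sigLock) {
            while (paused && !halted) {
                try {
                    sigLock.wait(1000L); // timed wait; the flags are re-checked on every wake-up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return !halted;
        }
    }

    /** Starts a waiting thread, unpauses, and reports whether the thread got through. */
    static boolean demo() {
        PauseGateSketch gate = new PauseGateSketch();
        CountDownLatch passed = new CountDownLatch(1);
        Thread loop = new Thread(() -> {
            if (gate.awaitUnpaused()) {
                passed.countDown();
            }
        });
        loop.setDaemon(true);
        loop.start();
        gate.togglePause(false); // the call that start() makes
        try {
            return passed.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because the flag is checked inside the synchronized block, the waiting thread gets through whether togglePause(false) runs before or after it starts waiting.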

	// wait a bit, if reading from job store is consistently
	// failing (e.g. DB is down or restarting)..
	if (acquiresFailed > 1) {
	    try {
	        long delay = computeDelayForRepeatedErrors(qsRsrcs.getJobStore(), acquiresFailed);
	        Thread.sleep(delay);
	    } catch (Exception ignore) {
	    }
	}

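If acquisition has failed more than once in a row, the thread sleeps for the delay returned by computeDelayForRepeatedErrors before trying again.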

	 // get the number of available worker threads; qsRsrcs is the QuartzSchedulerResources object
	 int availThreadCount = qsRsrcs.getThreadPool().blockForAvailableThreads();
	 if(availThreadCount > 0) { // will always be true, due to semantics of blockForAvailableThreads...
	
	     List<OperableTrigger> triggers;
	
	     long now = System.currentTimeMillis();
	     // clear the scheduling-change signal
	     clearSignaledSchedulingChange();
	     try {
	         /**
	          * ====== the most important part ======
	          */
	         // ask the JobStore for the triggers that are due to fire next
	         triggers = qsRsrcs.getJobStore().acquireNextTriggers(
	                 now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
	         acquiresFailed = 0;
	         if (log.isDebugEnabled())
	             log.debug("batch acquisition of " + (triggers == null ? 0 : triggers.size()) + " triggers");
	     } catch (JobPersistenceException jpe) {
	         if (acquiresFailed == 0) {
	             qs.notifySchedulerListenersError(
	                 "An error occurred while scanning for the next triggers to fire.",
	                 jpe);
	         }
	         if (acquiresFailed < Integer.MAX_VALUE)
	             acquiresFailed++;
	         continue;
	     } catch (RuntimeException e) {
	         if (acquiresFailed == 0) {
	             getLog().error("quartzSchedulerThreadLoop: RuntimeException "
	                     +e.getMessage(), e);
	         }
	         if (acquiresFailed < Integer.MAX_VALUE)
	             acquiresFailed++;
	         continue;
	     }

The key step in this block is asking the JobStore for the next triggers to fire. The RAMJobStore implementation of acquireNextTriggers:

    public List<OperableTrigger> acquireNextTriggers(long noLaterThan, int maxCount, long timeWindow) {
        synchronized (lock) {
            List<OperableTrigger> result = new ArrayList<OperableTrigger>();
            Set<JobKey> acquiredJobKeysForNoConcurrentExec = new HashSet<JobKey>();
            Set<TriggerWrapper> excludedTriggers = new HashSet<TriggerWrapper>();
            long batchEnd = noLaterThan;
            
            // return empty list if store has no triggers.
            if (timeTriggers.size() == 0)
                return result;
            
            while (true) {
                TriggerWrapper tw;

                try {
                    // take the first TriggerWrapper from timeTriggers, which is kept sorted
                    // (by fire time, then priority), and remove it from the set
                    tw = timeTriggers.first();
                    if (tw == null)
                        break;
                    timeTriggers.remove(tw);
                } catch (java.util.NoSuchElementException nsee) {
                    break;
                }

                // if the trigger has no next fire time, move on to the next TriggerWrapper
                if (tw.trigger.getNextFireTime() == null) {
                    continue;
                }

                // if a misfire was applied, put the trigger back (when it still has a fire time) and move on to the next one
                if (applyMisfire(tw)) {
                    if (tw.trigger.getNextFireTime() != null) {
                        timeTriggers.add(tw);
                    }
                    continue;
                }

                // if the next fire time is later than batchEnd, put the trigger back and stop
                // (timeTriggers is sorted, so no later entry can qualify either)
                if (tw.getTrigger().getNextFireTime().getTime() > batchEnd) {
                    timeTriggers.add(tw);
                    break;
                }
                
                // If trigger's job is set as @DisallowConcurrentExecution, and it has already been added to result, then
                // put it back into the timeTriggers set and continue to search for next trigger.
                JobKey jobKey = tw.trigger.getJobKey();
                JobDetail job = jobsByKey.get(tw.trigger.getJobKey()).jobDetail;
                if (job.isConcurrentExectionDisallowed()) {
                    if (acquiredJobKeysForNoConcurrentExec.contains(jobKey)) {
                        excludedTriggers.add(tw);
                        continue; // go to next trigger in store.
                    } else {
                        acquiredJobKeysForNoConcurrentExec.add(jobKey);
                    }
                }

                tw.state = TriggerWrapper.STATE_ACQUIRED;
                tw.trigger.setFireInstanceId(getFiredTriggerRecordId());
                OperableTrigger trig = (OperableTrigger) tw.trigger.clone();
                if (result.isEmpty()) {
                    batchEnd = Math.max(tw.trigger.getNextFireTime().getTime(), System.currentTimeMillis()) + timeWindow;
                }
                result.add(trig);
                if (result.size() == maxCount)
                    break;
            }
            // If we did excluded triggers to prevent ACQUIRE state due to DisallowConcurrentExecution, we need to add them back to store.
            if (excludedTriggers.size() > 0)
                timeTriggers.addAll(excludedTriggers);
            return result;
        }
    }

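The method first checks whether timeTriggers holds any entries; these were stored when the trigger was bound in the previous article. It then repeatedly takes and removes the first element. timeTriggers is a TreeSet, which is backed by a red-black tree, so its elements are always kept in sorted order.
If a misfire was applied to the trigger, it is put back into timeTriggers, provided it still has a next fire time.
If the trigger's next fire time is later than the batch cut-off (batchEnd), the trigger is put back and the loop exits.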
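
To make the ordering concrete, here is a small stand-alone sketch of a TreeSet sorted by fire time and then priority. It is an illustration under assumptions, not the Quartz classes: Entry is a hypothetical stand-in for TriggerWrapper, and "higher priority value first, key as final tie-breaker" is assumed for the tie-breaking rules.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Simplified model of an ordered trigger set; hypothetical, not the Quartz classes. */
final class TimeTriggersSketch {
    /** Hypothetical stand-in for TriggerWrapper. */
    static final class Entry {
        final String key;
        final long nextFireTime; // epoch millis
        final int priority;      // assumption: a higher value is more urgent

        Entry(String key, long nextFireTime, int priority) {
            this.key = key;
            this.nextFireTime = nextFireTime;
            this.priority = priority;
        }
    }

    /** Earlier fire time first; on a tie, higher priority first; the key breaks remaining ties. */
    static final Comparator<Entry> ORDER = (a, b) -> {
        int byTime = Long.compare(a.nextFireTime, b.nextFireTime);
        if (byTime != 0) {
            return byTime;
        }
        int byPriority = Integer.compare(b.priority, a.priority);
        if (byPriority != 0) {
            return byPriority;
        }
        return a.key.compareTo(b.key);
    };

    /** Takes and removes the head repeatedly, as the acquire loop does, and returns the keys in that order. */
    static List<String> drainInOrder(List<Entry> entries) {
        TreeSet<Entry> timeTriggers = new TreeSet<>(ORDER);
        timeTriggers.addAll(entries);
        List<String> keys = new ArrayList<>();
        while (!timeTriggers.isEmpty()) {
            Entry head = timeTriggers.first();
            timeTriggers.remove(head);
            keys.add(head.key);
        }
        return keys;
    }
}
```

The final tie-breaker on the key matters for a TreeSet: two entries that compare as equal would be treated as the same element, and one of them would silently be dropped.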

Here is how a misfire is detected:

    protected boolean applyMisfire(TriggerWrapper tw) {

        long misfireTime = System.currentTimeMillis();
        if (getMisfireThreshold() > 0) {
            misfireTime -= getMisfireThreshold();
        }

        Date tnft = tw.trigger.getNextFireTime();
        // nothing to do in three cases: no next fire time, still within the threshold, or the ignore-misfire policy
        if (tnft == null || tnft.getTime() > misfireTime 
                || tw.trigger.getMisfireInstruction() == Trigger.MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY) { 
            return false; 
        }

        Calendar cal = null;
        if (tw.trigger.getCalendarName() != null) {
            cal = retrieveCalendar(tw.trigger.getCalendarName());
        }
        // notify the trigger listeners that this trigger misfired
        signaler.notifyTriggerListenersMisfired((OperableTrigger)tw.trigger.clone());
        // update the next fire time according to the misfire instruction
        tw.trigger.updateAfterMisfire(cal);

        if (tw.trigger.getNextFireTime() == null) {
            tw.state = TriggerWrapper.STATE_COMPLETE;
            signaler.notifySchedulerListenersFinalized(tw.trigger);
            synchronized (lock) {
                timeTriggers.remove(tw);
            }
        } else if (tnft.equals(tw.trigger.getNextFireTime())) {
            return false;
        }

        return true;
    }

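This shows the three conditions a trigger must meet to be treated as misfired:

  1. its next fire time is not null
  2. current time minus next fire time is at least the threshold (5 s)
  3. its misfire instruction is not MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY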
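
The guard at the top of applyMisfire can be written as a pure function, which makes the boundary case easy to check. This is a minimal sketch of that guard only: the class and method names are hypothetical, and the later steps of applyMisfire (calling updateAfterMisfire and comparing the fire time before and after) are left out.

```java
/** Simplified model of the misfire guard; hypothetical helper, not Quartz code. */
final class MisfireCheckSketch {
    /**
     * @param nextFireTime        next fire time in epoch millis, or null if the trigger will not fire again
     * @param now                 current time in epoch millis
     * @param thresholdMillis     misfire threshold; values <= 0 mean "no tolerance"
     * @param ignoreMisfirePolicy true if the trigger uses the ignore-misfire instruction
     */
    static boolean isMisfired(Long nextFireTime, long now, long thresholdMillis, boolean ignoreMisfirePolicy) {
        if (nextFireTime == null || ignoreMisfirePolicy) {
            return false;
        }
        long misfireTime = now;
        if (thresholdMillis > 0) {
            misfireTime -= thresholdMillis;
        }
        // the listing returns false when tnft > misfireTime, so a misfire means tnft <= misfireTime
        return nextFireTime <= misfireTime;
    }
}
```

With now = 10000 and a threshold of 5000, a trigger due at 5000 counts as misfired, while one due at 5001 does not.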

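When a trigger really has misfired, its next fire time is updated according to its misfire instruction.
Back in acquireNextTriggers, a job annotated with @DisallowConcurrentExecution gets at most one trigger per batch. Further triggers of the same job are collected in the temporary set excludedTriggers and put back into timeTriggers at the end.
The loop stops when no more triggers qualify or when the result reaches maxCount, which is the smaller of the available thread count and the maximum batch size.
The result is the batch of triggers that need to be fired.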
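
The batching rules can be traced in a reduced model of the loop. The sketch below is a simplification under stated assumptions, not the RAMJobStore code: Trig is a hypothetical stand-in for TriggerWrapper, misfire handling is omitted, and batchEnd is computed from the fire time alone (the listing also takes the current time into account) so that the result is deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Simplified model of batch acquisition; hypothetical, not the RAMJobStore code. */
final class AcquireSketch {
    /** Hypothetical stand-in for TriggerWrapper. */
    static final class Trig {
        final String key;
        final String jobKey;
        final long fireTime;
        final boolean noConcurrent; // models @DisallowConcurrentExecution on the job

        Trig(String key, String jobKey, long fireTime, boolean noConcurrent) {
            this.key = key;
            this.jobKey = jobKey;
            this.fireTime = fireTime;
            this.noConcurrent = noConcurrent;
        }
    }

    /** Creates an empty store sorted by fire time, then key. */
    static TreeSet<Trig> newStore() {
        Comparator<Trig> order = (a, b) -> {
            int byTime = Long.compare(a.fireTime, b.fireTime);
            return byTime != 0 ? byTime : a.key.compareTo(b.key);
        };
        return new TreeSet<>(order);
    }

    /** Returns the keys of the acquired triggers; acquired triggers are removed from the store. */
    static List<String> acquire(TreeSet<Trig> timeTriggers, long noLaterThan, int maxCount, long timeWindow) {
        List<String> result = new ArrayList<>();
        Set<String> acquiredJobs = new HashSet<>();
        List<Trig> excluded = new ArrayList<>();
        long batchEnd = noLaterThan;

        while (result.size() < maxCount) {
            Trig t = timeTriggers.pollFirst();
            if (t == null) {
                break; // store is empty
            }
            if (t.fireTime > batchEnd) {
                timeTriggers.add(t); // too late for this batch; nothing after it can qualify
                break;
            }
            if (t.noConcurrent && !acquiredJobs.add(t.jobKey)) {
                excluded.add(t); // this job already has a trigger in the batch
                continue;
            }
            if (result.isEmpty()) {
                batchEnd = t.fireTime + timeWindow; // the first hit narrows the batch window
            }
            result.add(t.key);
        }
        timeTriggers.addAll(excluded); // excluded triggers go back into the store
        return result;
    }
}
```

For example, with two triggers of a non-concurrent job A due at 100, a trigger of job B due at 150 and a trigger of job C due at 500, calling acquire with noLaterThan = 1000, maxCount = 10 and timeWindow = 100 yields the first trigger of A and the trigger of B. The second trigger of A is excluded and returned to the store, and the trigger of C falls outside the narrowed window.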

3. Blocking the thread

     if (triggers != null && !triggers.isEmpty()) {

         now = System.currentTimeMillis();
         long triggerTime = triggers.get(0).getNextFireTime().getTime();
         // interval between now and the fire time
         long timeUntilTrigger = triggerTime - now;
         // keep waiting until the trigger is within 2 ms of its fire time, then go fire it
         while(timeUntilTrigger > 2) {
             synchronized (sigLock) {
                 if (halted.get()) {
                     break;
                 }
                 // check whether a scheduling change has introduced an earlier trigger
                 if (!isCandidateNewTimeEarlierWithinReason(triggerTime, false)) {
                     try {
                         // we could have blocked a long while
                         // on 'synchronize', so we must recompute
                         now = System.currentTimeMillis();
                         timeUntilTrigger = triggerTime - now;
                         if(timeUntilTrigger >= 1)
                             sigLock.wait(timeUntilTrigger);
                     } catch (InterruptedException ignore) {
                     }
                 }
             }
             // check whether a scheduling change was signaled while waiting
             if(releaseIfScheduleChangedSignificantly(triggers, triggerTime)) {
                 break;
             }
             now = System.currentTimeMillis();
             timeUntilTrigger = triggerTime - now;
         }

         // this happens if releaseIfScheduleChangedSignificantly decided to release triggers
         if(triggers.isEmpty())
             continue;

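The loop first checks whether a scheduling change has introduced a trigger that fires earlier. If so, the acquired triggers are released so that the earlier one can be scheduled.
Otherwise the thread blocks until the fire time arrives.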

         // set triggers to 'executing'
         List<TriggerFiredResult> bndles = new ArrayList<TriggerFiredResult>();

         boolean goAhead = true;
         synchronized(sigLock) {
             goAhead = !halted.get();
         }
         if(goAhead) {
             // the scheduler has not been halted
             try {
                 // tell the JobStore that the scheduler is now firing the given triggers
                 List<TriggerFiredResult> res = qsRsrcs.getJobStore().triggersFired(triggers);
                 if(res != null)
                     bndles = res;
             } catch (SchedulerException se) {
                 qs.notifySchedulerListenersError(
                         "An error occurred while firing triggers '"
                                 + triggers + "'", se);
                 //QTZ-179 : a problem occurred interacting with the triggers from the db
                 //we release them and loop again
                 for (int i = 0; i < triggers.size(); i++) {
                     qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
                 }
                 continue;
             }

         }

The block above tells the JobStore that the scheduler is now firing the given triggers. The work is done in triggersFired:

    public List<TriggerFiredResult> triggersFired(List<OperableTrigger> firedTriggers) {

        synchronized (lock) {
            List<TriggerFiredResult> results = new ArrayList<TriggerFiredResult>();

            for (OperableTrigger trigger : firedTriggers) {
                TriggerWrapper tw = triggersByKey.get(trigger.getKey());
                // was the trigger deleted since being acquired?
                if (tw == null || tw.trigger == null) {
                    continue;
                }
                // was the trigger completed, paused, blocked, etc. since being acquired?
                if (tw.state != TriggerWrapper.STATE_ACQUIRED) {
                    continue;
                }

                Calendar cal = null;
                if (tw.trigger.getCalendarName() != null) {
                    cal = retrieveCalendar(tw.trigger.getCalendarName());
                    if(cal == null)
                        continue;
                }
                Date prevFireTime = trigger.getPreviousFireTime();
                // in case trigger was replaced between acquiring and firing
                timeTriggers.remove(tw);
                // call triggered on our copy, and the scheduler's copy
                tw.trigger.triggered(cal);
                trigger.triggered(cal);
                //tw.state = TriggerWrapper.STATE_EXECUTING;
                tw.state = TriggerWrapper.STATE_WAITING;

                TriggerFiredBundle bndle = new TriggerFiredBundle(retrieveJob(
                        tw.jobKey), trigger, cal,
                        false, new Date(), trigger.getPreviousFireTime(), prevFireTime,
                        trigger.getNextFireTime());

                JobDetail job = bndle.getJobDetail();

                if (job.isConcurrentExectionDisallowed()) {
                    ArrayList<TriggerWrapper> trigs = getTriggerWrappersForJob(job.getKey());
                    for (TriggerWrapper ttw : trigs) {
                        if (ttw.state == TriggerWrapper.STATE_WAITING) {
                            ttw.state = TriggerWrapper.STATE_BLOCKED;
                        }
                        if (ttw.state == TriggerWrapper.STATE_PAUSED) {
                            ttw.state = TriggerWrapper.STATE_PAUSED_BLOCKED;
                        }
                        timeTriggers.remove(ttw);
                    }
                    blockedJobs.add(job.getKey());
                } else if (tw.trigger.getNextFireTime() != null) {
                    synchronized (lock) {
                        timeTriggers.add(tw);
                    }
                }

                results.add(new TriggerFiredResult(bndle));
            }
            return results;
        }
    }

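Here each trigger's fire times are advanced through triggered(cal), which takes the associated Calendar into account, and the trigger is then wrapped in a TriggerFiredBundle.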
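
The state changes that triggersFired applies for a @DisallowConcurrentExecution job are easy to miss in the listing. The following sketch isolates them. It is a simplified model, not Quartz code: the State enum and the blockSiblings helper are hypothetical, and the map stands in for the list returned by getTriggerWrappersForJob.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of the blocking step in triggersFired; hypothetical, not Quartz code. */
final class BlockStateSketch {
    enum State { WAITING, ACQUIRED, PAUSED, BLOCKED, PAUSED_BLOCKED }

    /** Returns the states of all triggers of one job after one of them has fired. */
    static Map<String, State> blockSiblings(Map<String, State> triggersOfJob) {
        Map<String, State> out = new LinkedHashMap<>();
        for (Map.Entry<String, State> e : triggersOfJob.entrySet()) {
            State s = e.getValue();
            if (s == State.WAITING) {
                s = State.BLOCKED;        // cannot fire while the job is running
            } else if (s == State.PAUSED) {
                s = State.PAUSED_BLOCKED; // stays paused, and is additionally blocked
            }
            out.put(e.getKey(), s);
        }
        return out;
    }
}
```

Note that the listing sets the fired trigger itself back to STATE_WAITING just before this step, so for a non-concurrent job the fired trigger ends up blocked along with its siblings.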

     for (int i = 0; i < bndles.size(); i++) {
          TriggerFiredResult result =  bndles.get(i);
          TriggerFiredBundle bndle =  result.getTriggerFiredBundle();
          Exception exception = result.getException();

          if (exception instanceof RuntimeException) {
              getLog().error("RuntimeException while firing trigger " + triggers.get(i), exception);
              qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
              continue;
          }

          // it's possible to get 'null' if the triggers was paused,
          // blocked, or other similar occurrences that prevent it being
          // fired at this time...  or if the scheduler was shutdown (halted)
          if (bndle == null) {
              qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
              continue;
          }
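          // from here on, start executing the job
          JobRunShell shell = null;
          try {
              // build the execution object; JobRunShell implements Runnable
              shell = qsRsrcs.getJobRunShellFactory().createJobRunShell(bndle);
              // initialize() instantiates our own Job class and passes the data needed for execution to a JobExecutionContextImpl (the argument of our job's execute method)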
              shell.initialize(qs);
          } catch (SchedulerException se) {
              qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
              continue;
          }
          // hand the shell to the thread pool, to be run by SimpleThreadPool
          if (!qsRsrcs.getThreadPool().runInThread(shell)) {
              // this case should never happen, as it is indicative of the
              // scheduler being shutdown or a bug in the thread pool or
              // a thread pool being used concurrently - which the docs
              // say not to do...
              getLog().error("ThreadPool.runInThread() return false!");
              qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
          }

      }

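The loop walks through the results built above. If a result carries an exception or has no bundle, releaseAcquiredTrigger moves the trigger's state from STATE_ACQUIRED back to STATE_WAITING. Otherwise a JobRunShell, which implements Runnable, is created and handed to the thread pool through runInThread: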

    public boolean runInThread(Runnable runnable) {
        if (runnable == null) {
            return false;
        }

        synchronized (nextRunnableLock) {

            handoffPending = true;

            // Wait until a worker thread is available
            while ((availWorkers.size() < 1) && !isShutdown) {
                try {
                    nextRunnableLock.wait(500);
                } catch (InterruptedException ignore) {
                }
            }

            if (!isShutdown) {
                WorkerThread wt = (WorkerThread)availWorkers.removeFirst();
                busyWorkers.add(wt);
                wt.run(runnable);
            } else {
                // If the thread pool is going down, execute the Runnable
                // within a new additional worker thread (no thread from the pool).
                WorkerThread wt = new WorkerThread(this, threadGroup,
                        "WorkerThread-LastJob", prio, isMakeThreadsDaemons(), runnable);
                busyWorkers.add(wt);
                workers.add(wt);
                wt.start();
            }
            nextRunnableLock.notifyAll();
            handoffPending = false;
        }

        return true;
    }

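This method starts running the job. While no worker thread is available it waits in 500 ms slices, looping until a worker frees up or the pool is shut down.
It then takes one worker out of availWorkers, adds it to busyWorkers, and hands it the Runnable.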
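
The hand-off can be modelled with two counters and one lock. This is a minimal sketch under assumptions, not SimpleThreadPool: the class HandoffSketch is hypothetical, it starts a fresh thread per task instead of reusing WorkerThread objects, and it leaves out the shutdown branch.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified model of the worker hand-off; hypothetical, not SimpleThreadPool. */
final class HandoffSketch {
    private final Object nextRunnableLock = new Object();
    private int availWorkers;
    private int busyWorkers = 0;

    HandoffSketch(int workers) {
        this.availWorkers = workers;
    }

    /** Blocks in 500 ms slices until a worker slot is free, then runs the task on a new thread. */
    boolean runInThread(Runnable runnable) {
        if (runnable == null) {
            return false;
        }
        synchronized (nextRunnableLock) {
            while (availWorkers < 1) {
                try {
                    nextRunnableLock.wait(500);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            availWorkers--;
            busyWorkers++;
        }
        new Thread(() -> {
            try {
                runnable.run();
            } finally {
                synchronized (nextRunnableLock) {
                    busyWorkers--;
                    availWorkers++;
                    nextRunnableLock.notifyAll(); // wake a caller waiting for a free slot
                }
            }
        }).start();
        return true;
    }

    /** Waits until no task is running. */
    void awaitIdle() {
        synchronized (nextRunnableLock) {
            while (busyWorkers > 0) {
                try {
                    nextRunnableLock.wait(500);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    /** Runs the given number of tasks on a single slot; returns {completed, maxConcurrentlyRunning}. */
    static int[] demo(int tasks) {
        HandoffSketch pool = new HandoffSketch(1);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger maxRunning = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.runInThread(() -> {
                int now = running.incrementAndGet();
                maxRunning.accumulateAndGet(now, Math::max);
                running.decrementAndGet();
                completed.incrementAndGet();
            });
        }
        pool.awaitIdle();
        return new int[] { completed.get(), maxRunning.get() };
    }
}
```

With a single slot, each call to runInThread has to wait for the previous task to finish, so the tasks never overlap.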

Finally, here is what the JobRunShell does when it runs:

    public void run() {
        qs.addInternalSchedulerListener(this);

        try {
            OperableTrigger trigger = (OperableTrigger) jec.getTrigger();
            JobDetail jobDetail = jec.getJobDetail();

            do {

                JobExecutionException jobExEx = null;
                Job job = jec.getJobInstance();

                try {
                    begin();
                } catch (SchedulerException se) {
                    qs.notifySchedulerListenersError("Error executing Job ("
                            + jec.getJobDetail().getKey()
                            + ": couldn't begin execution.", se);
                    break;
                }

                // notify job & trigger listeners...
                try {
                    if (!notifyListenersBeginning(jec)) {
                        break;
                    }
                } catch(VetoedException ve) {
                    try {
                        CompletedExecutionInstruction instCode = trigger.executionComplete(jec, null);
                        qs.notifyJobStoreJobVetoed(trigger, jobDetail, instCode);
                        
                        // QTZ-205
                        // Even if trigger got vetoed, we still needs to check to see if it's the trigger's finalized run or not.
                        if (jec.getTrigger().getNextFireTime() == null) {
                            qs.notifySchedulerListenersFinalized(jec.getTrigger());
                        }

                        complete(true);
                    } catch (SchedulerException se) {
                        qs.notifySchedulerListenersError("Error during veto of Job ("
                                + jec.getJobDetail().getKey()
                                + ": couldn't finalize execution.", se);
                    }
                    break;
                }

                long startTime = System.currentTimeMillis();
                long endTime = startTime;

                // execute the job
                try {
                    log.debug("Calling execute on job " + jobDetail.getKey());
                    job.execute(jec);
                    endTime = System.currentTimeMillis();
                } catch (JobExecutionException jee) {
                    endTime = System.currentTimeMillis();
                    jobExEx = jee;
                    getLog().info("Job " + jobDetail.getKey() +
                            " threw a JobExecutionException: ", jobExEx);
                } catch (Throwable e) {
                    endTime = System.currentTimeMillis();
                    getLog().error("Job " + jobDetail.getKey() +
                            " threw an unhandled Exception: ", e);
                    SchedulerException se = new SchedulerException(
                            "Job threw an unhandled exception.", e);
                    qs.notifySchedulerListenersError("Job ("
                            + jec.getJobDetail().getKey()
                            + " threw an exception.", se);
                    jobExEx = new JobExecutionException(se, false);
                }

                jec.setJobRunTime(endTime - startTime);

                // notify all job listeners
                if (!notifyJobListenersComplete(jec, jobExEx)) {
                    break;
                }

                CompletedExecutionInstruction instCode = CompletedExecutionInstruction.NOOP;

                // update the trigger
                try {
                    instCode = trigger.executionComplete(jec, jobExEx);
                } catch (Exception e) {
                    // If this happens, there's a bug in the trigger...
                    SchedulerException se = new SchedulerException(
                            "Trigger threw an unhandled exception.", e);
                    qs.notifySchedulerListenersError(
                            "Please report this error to the Quartz developers.",
                            se);
                }

                // notify all trigger listeners
                if (!notifyTriggerListenersComplete(jec, instCode)) {
                    break;
                }

                // update job/trigger or re-execute job
                if (instCode == CompletedExecutionInstruction.RE_EXECUTE_JOB) {
                    jec.incrementRefireCount();
                    try {
                        complete(false);
                    } catch (SchedulerException se) {
                        qs.notifySchedulerListenersError("Error executing Job ("
                                + jec.getJobDetail().getKey()
                                + ": couldn't finalize execution.", se);
                    }
                    continue;
                }

                try {
                    complete(true);
                } catch (SchedulerException se) {
                    qs.notifySchedulerListenersError("Error executing Job ("
                            + jec.getJobDetail().getKey()
                            + ": couldn't finalize execution.", se);
                    continue;
                }

                qs.notifyJobStoreJobComplete(trigger, jobDetail, instCode);
                break;
            } while (true);

        } finally {
            qs.removeInternalSchedulerListener(this);
        }
    }

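JobRunShell provides a begin() hook that runs before the job executes and a complete() hook that runs afterwards.
Before the job runs, notifyListenersBeginning notifies the listeners. The trigger listeners come first, and any of them may veto the execution. On a veto the job listeners are told the execution was vetoed; otherwise they are told the job is about to execute.
Finally the job itself is invoked through job.execute(jec), which is why our job class has to implement the Job interface and its execute method.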
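
The do/while structure of run(), in particular the RE_EXECUTE_JOB branch, can be reduced to a few lines. The sketch below is a simplified model, not JobRunShell: SketchJob, Completion and the two-value Instruction enum are hypothetical stand-ins for Job, the trigger's executionComplete and CompletedExecutionInstruction, and listener notification is omitted.

```java
/** Simplified model of the execute/re-execute loop; hypothetical, not JobRunShell. */
final class RunShellSketch {
    /** Hypothetical subset of the completion instructions. */
    enum Instruction { NOOP, RE_EXECUTE_JOB }

    /** Hypothetical stand-in for a job; receives the current refire count. */
    interface SketchJob {
        void execute(int refireCount) throws Exception;
    }

    /** Hypothetical stand-in for the trigger's decision after an execution. */
    interface Completion {
        Instruction executionComplete(int refireCount, Exception error);
    }

    /** Runs the job until the trigger stops asking for re-execution; returns the final refire count. */
    static int run(SketchJob job, Completion trigger) {
        int refireCount = 0;
        while (true) {
            Exception error = null;
            try {
                job.execute(refireCount);
            } catch (Exception e) {
                error = e; // captured and handed to the trigger, like jobExEx in the listing
            }
            Instruction inst = trigger.executionComplete(refireCount, error);
            if (inst == Instruction.RE_EXECUTE_JOB) {
                refireCount++; // corresponds to jec.incrementRefireCount()
                continue;
            }
            return refireCount;
        }
    }
}
```

A job that fails on its first two attempts, paired with a trigger that asks for re-execution after every failure, is executed three times and ends with a refire count of 2.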
