The previous post bound the job to its trigger. This post picks up from there with the scheduler's start() method:
public void start() throws SchedulerException {
if (shuttingDown || closed) {
throw new SchedulerException(
"The Scheduler cannot be restarted after shutdown() has been called.");
}
// QTZ-212 : calling new schedulerStarting() method on the listeners
// right after entering start()
// Notify the scheduler listeners that the scheduler is starting
notifySchedulerListenersStarting();
// initialStart is null on the first start, so perform the first-time initialization
if (initialStart == null) {
initialStart = new Date();
this.resources.getJobStore().schedulerStarted();
startPlugins();
} else {
// Already started once before, so just resume the JobStore
resources.getJobStore().schedulerResumed();
}
schedThread.togglePause(false);
getLog().info(
"Scheduler " + resources.getUniqueIdentifier() + " started.");
// Notify the scheduler listeners that the scheduler has started
notifySchedulerListenersStarted();
}
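Let's look first at the schedThread.togglePause(false); call inside scheduler.start(); and come back to the initialStart initialization afterwards.
This schedThread.togglePause(false) call was described in "Scheduled Tasks Series (4): Core Principles of How Quartz Creates the Scheduler". The section "QuartzSchedulerThread main-thread scheduling" mentioned that the thread waits until togglePause(false) is called.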
void togglePause(boolean pause) {
synchronized (sigLock) {
paused = pause;
if (paused) {
signalSchedulingChange(0);
} else {
sigLock.notifyAll();
}
}
}
This sets paused to false and wakes any thread waiting on sigLock. Now look back at the main scheduling loop of QuartzSchedulerThread:
// check if we're supposed to pause...
synchronized (sigLock) {
while (paused && !halted.get()) {
try {
// wait until togglePause(false) is called...
log.info("QuartzSchedulerThread waiting for togglePause(false) to be called... " + LocalDateTime.now());
sigLock.wait(1000L);
} catch (InterruptedException ignore) {
}
// reset failure counter when paused, so that we don't
// wait again after unpausing
acquiresFailed = 0;
}
if (halted.get()) {
break;
}
}
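Once paused is false, the while condition no longer holds, so the thread stops waiting and moves on to the scheduling logic. The break is taken only when halted is true.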
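The handshake between togglePause and the wait loop can be sketched in isolation. This is a simplified model, not Quartz code: the class name PauseGate and the methods halt and awaitUnpaused are made up for illustration, and only the sigLock/paused/halted pattern comes from the excerpts above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified model of the pause gate. PauseGate is a hypothetical name, not a Quartz class.
final class PauseGate {
    private final Object sigLock = new Object();
    private final AtomicBoolean halted = new AtomicBoolean(false);
    private boolean paused = true; // starts paused; guarded by sigLock

    // Called by the controlling thread, mirroring togglePause(boolean).
    void togglePause(boolean pause) {
        synchronized (sigLock) {
            paused = pause;
            if (!paused) {
                sigLock.notifyAll(); // wake the waiting loop right away
            }
        }
    }

    // Hypothetical helper: marks the gate as halted and wakes any waiter.
    void halt() {
        synchronized (sigLock) {
            halted.set(true);
            sigLock.notifyAll();
        }
    }

    // Called by the scheduling loop. Returns true if it may proceed, false if halted.
    boolean awaitUnpaused() {
        synchronized (sigLock) {
            while (paused && !halted.get()) {
                try {
                    sigLock.wait(1000L); // timed wait, then re-check the condition
                } catch (InterruptedException ignore) {
                    // as in the excerpt: ignore and re-check the loop condition
                }
            }
            return !halted.get();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PauseGate gate = new PauseGate();
        Thread loop = new Thread(() ->
                System.out.println("proceed = " + gate.awaitUnpaused()));
        loop.start();
        gate.togglePause(false); // plays the role of start() calling togglePause(false)
        loop.join();
    }
}
```

Because the condition is re-checked inside a while loop under the same lock, it does not matter whether togglePause(false) runs before or after the loop thread starts waiting.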
// wait a bit, if reading from job store is consistently
// failing (e.g. DB is down or restarting)..
if (acquiresFailed > 1) {
try {
long delay = computeDelayForRepeatedErrors(qsRsrcs.getJobStore(), acquiresFailed);
Thread.sleep(delay);
} catch (Exception ignore) {
}
}
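If acquisition has failed more than once in a row, the thread waits for a while before trying again.
// Get the number of available worker threads; qsRsrcs is the QuartzSchedulerResources object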
int availThreadCount = qsRsrcs.getThreadPool().blockForAvailableThreads();
if(availThreadCount > 0) { // will always be true, due to semantics of blockForAvailableThreads...
List<OperableTrigger> triggers;
long now = System.currentTimeMillis();
// Clear the scheduling-change signal
clearSignaledSchedulingChange();
try {
/**
 * ====== The most important part ======
 */
// Ask the JobStore for the triggers that are due to fire next
triggers = qsRsrcs.getJobStore().acquireNextTriggers(
now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
acquiresFailed = 0;
if (log.isDebugEnabled())
log.debug("batch acquisition of " + (triggers == null ? 0 : triggers.size()) + " triggers");
} catch (JobPersistenceException jpe) {
if (acquiresFailed == 0) {
qs.notifySchedulerListenersError(
"An error occurred while scanning for the next triggers to fire.",
jpe);
}
if (acquiresFailed < Integer.MAX_VALUE)
acquiresFailed++;
continue;
} catch (RuntimeException e) {
if (acquiresFailed == 0) {
getLog().error("quartzSchedulerThreadLoop: RuntimeException "
+e.getMessage(), e);
}
if (acquiresFailed < Integer.MAX_VALUE)
acquiresFailed++;
continue;
}
The key step in this block is asking the JobStore for the next triggers to fire. Here is RAMJobStore's implementation of acquireNextTriggers:
public List<OperableTrigger> acquireNextTriggers(long noLaterThan, int maxCount, long timeWindow) {
synchronized (lock) {
List<OperableTrigger> result = new ArrayList<OperableTrigger>();
Set<JobKey> acquiredJobKeysForNoConcurrentExec = new HashSet<JobKey>();
Set<TriggerWrapper> excludedTriggers = new HashSet<TriggerWrapper>();
long batchEnd = noLaterThan;
// return empty list if store has no triggers.
if (timeTriggers.size() == 0)
return result;
while (true) {
TriggerWrapper tw;
try {
// Take the first TriggerWrapper from timeTriggers.
// timeTriggers is a sorted set of TriggerWrappers (ordered by fire time and priority); the wrapper is then removed from it.
tw = timeTriggers.first();
if (tw == null)
break;
timeTriggers.remove(tw);
} catch (java.util.NoSuchElementException nsee) {
break;
}
// If the trigger has no next fire time, skip it and fetch the next TriggerWrapper.
if (tw.trigger.getNextFireTime() == null) {
continue;
}
// On a misfire, put it back into timeTriggers (if it still has a fire time) and move on to the next trigger
if (applyMisfire(tw)) {
if (tw.trigger.getNextFireTime() != null) {
timeTriggers.add(tw);
}
continue;
}
// If the next fire time is past the cutoff, put the trigger back and leave the loop (timeTriggers is sorted, so if the first one does not qualify, none of the rest will)
if (tw.getTrigger().getNextFireTime().getTime() > batchEnd) {
timeTriggers.add(tw);
break;
}
// If trigger's job is set as @DisallowConcurrentExecution, and it has already been added to result, then
// put it back into the timeTriggers set and continue to search for next trigger.
JobKey jobKey = tw.trigger.getJobKey();
JobDetail job = jobsByKey.get(tw.trigger.getJobKey()).jobDetail;
if (job.isConcurrentExectionDisallowed()) {
if (acquiredJobKeysForNoConcurrentExec.contains(jobKey)) {
excludedTriggers.add(tw);
continue; // go to next trigger in store.
} else {
acquiredJobKeysForNoConcurrentExec.add(jobKey);
}
}
tw.state = TriggerWrapper.STATE_ACQUIRED;
tw.trigger.setFireInstanceId(getFiredTriggerRecordId());
OperableTrigger trig = (OperableTrigger) tw.trigger.clone();
if (result.isEmpty()) {
batchEnd = Math.max(tw.trigger.getNextFireTime().getTime(), System.currentTimeMillis()) + timeWindow;
}
result.add(trig);
if (result.size() == maxCount)
break;
}
// If we excluded triggers to prevent ACQUIRE state due to DisallowConcurrentExecution, we need to add them back to store.
if (excludedTriggers.size() > 0)
timeTriggers.addAll(excludedTriggers);
return result;
}
}
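The method first checks whether timeTriggers contains anything. Its entries were stored when the trigger was bound earlier, and the loop removes the first element on each pass. timeTriggers is a TreeSet, which is backed by a red-black tree, so its elements stay in sorted order.
If a trigger has misfired, it is put back into timeTriggers.
If the trigger's next fire time is later than the requested cutoff, the loop exits.
How is a misfire determined?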
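To make the role of the sorted set concrete, here is a minimal sketch of taking due entries from the front of a TreeSet. It is a simplified model under stated assumptions: Trig, ORDER, and acquire are hypothetical names, the ordering (earlier fire time first, then higher priority) follows the comment in the excerpt, and misfire handling, the batch time window, and @DisallowConcurrentExecution are left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class TimeTriggersDemo {
    // Hypothetical stand-in for TriggerWrapper.
    static final class Trig {
        final String name;
        final long nextFireTime; // epoch millis
        final int priority;

        Trig(String name, long nextFireTime, int priority) {
            this.name = name;
            this.nextFireTime = nextFireTime;
            this.priority = priority;
        }
    }

    // Earlier fire time first; on a tie, higher priority first; name breaks remaining ties.
    static final Comparator<Trig> ORDER = (a, b) -> {
        int c = Long.compare(a.nextFireTime, b.nextFireTime);
        if (c != 0) return c;
        c = Integer.compare(b.priority, a.priority);
        if (c != 0) return c;
        return a.name.compareTo(b.name);
    };

    // Removes and returns the names of triggers due by noLaterThan, at most maxCount of them.
    static List<String> acquire(TreeSet<Trig> timeTriggers, long noLaterThan, int maxCount) {
        List<String> result = new ArrayList<>();
        while (!timeTriggers.isEmpty() && result.size() < maxCount) {
            Trig first = timeTriggers.first();
            if (first.nextFireTime > noLaterThan) {
                break; // sorted set: nothing after this one can be due either
            }
            timeTriggers.remove(first);
            result.add(first.name);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<Trig> timeTriggers = new TreeSet<>(ORDER);
        timeTriggers.add(new Trig("a", 1000L, 5));
        timeTriggers.add(new Trig("b", 1000L, 9));
        timeTriggers.add(new Trig("c", 500L, 1));
        timeTriggers.add(new Trig("d", 9000L, 5));
        System.out.println(acquire(timeTriggers, 2000L, 10)); // prints [c, b, a]
    }
}
```

The early break is only safe because the set is sorted: once the first element is past the cutoff, every later element is too.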
protected boolean applyMisfire(TriggerWrapper tw) {
long misfireTime = System.currentTimeMillis();
if (getMisfireThreshold() > 0) {
misfireTime -= getMisfireThreshold();
}
Date tnft = tw.trigger.getNextFireTime();
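// Nothing to do in these three cases: no next fire time, not yet late by the threshold, or the misfire policy says to ignore misfires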
if (tnft == null || tnft.getTime() > misfireTime
|| tw.trigger.getMisfireInstruction() == Trigger.MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY) {
return false;
}
Calendar cal = null;
if (tw.trigger.getCalendarName() != null) {
cal = retrieveCalendar(tw.trigger.getCalendarName());
}
// Notify the TriggerListeners that this trigger misfired
signaler.notifyTriggerListenersMisfired((OperableTrigger)tw.trigger.clone());
// Update the next fire time according to the misfire policy
tw.trigger.updateAfterMisfire(cal);
if (tw.trigger.getNextFireTime() == null) {
tw.state = TriggerWrapper.STATE_COMPLETE;
signaler.notifySchedulerListenersFinalized(tw.trigger);
synchronized (lock) {
timeTriggers.remove(tw);
}
} else if (tnft.equals(tw.trigger.getNextFireTime())) {
return false;
}
return true;
}
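From this we can read off three conditions for a misfire:
- the next fire time is not null
- current time - next fire time >= threshold (5 s by default in RAMJobStore)
- the misfire policy is not set to ignore misfires
If a trigger really has misfired, its next fire time is updated according to its misfire policy.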
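The three conditions can be written as one small predicate. This is a sketch for illustration only: MisfireCheck and isMisfireCandidate are hypothetical names, and the ignore policy is passed as a plain boolean instead of comparing against the Trigger constant.

```java
final class MisfireCheck {
    // Mirrors the early-return test at the top of applyMisfire:
    // a trigger is a misfire candidate only if it has a next fire time,
    // that time is at least thresholdMs in the past, and misfires are not ignored.
    static boolean isMisfireCandidate(Long nextFireTime, long now,
                                      long thresholdMs, boolean ignoreMisfirePolicy) {
        long misfireTime = now;
        if (thresholdMs > 0) {
            misfireTime -= thresholdMs;
        }
        if (nextFireTime == null || nextFireTime > misfireTime || ignoreMisfirePolicy) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        long now = 100_000L;
        long threshold = 5_000L; // assumed here to match the 5 s mentioned above
        System.out.println(isMisfireCandidate(99_000L, now, threshold, false)); // 1 s late: false
        System.out.println(isMisfireCandidate(94_000L, now, threshold, false)); // 6 s late: true
        System.out.println(isMisfireCandidate(94_000L, now, threshold, true));  // ignored: false
        System.out.println(isMisfireCandidate(null, now, threshold, false));    // no fire time: false
    }
}
```

Passing this check is necessary but not sufficient: in the excerpt, applyMisfire still returns false when updateAfterMisfire leaves the next fire time unchanged.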
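For a job annotated with @DisallowConcurrentExecution, once one of its triggers has been acquired, its other triggers must go back into timeTriggers. They are held in the temporary set excludedTriggers until the loop ends.
The loop stops when there are no more eligible triggers or when the result reaches maxCount (the smaller of the available thread count and the maximum batch size).
That gives us the batch of triggers to fire.
3. Blocking the thread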
if (triggers != null && !triggers.isEmpty()) {
now = System.currentTimeMillis();
long triggerTime = triggers.get(0).getNextFireTime().getTime();
// Interval between now and the fire time
long timeUntilTrigger = triggerTime - now;
// Keep waiting while more than 2 ms remain; only then stop waiting and go fire
while(timeUntilTrigger > 2) {
synchronized (sigLock) {
if (halted.get()) {
break;
}
// Check whether an earlier trigger has shown up
if (!isCandidateNewTimeEarlierWithinReason(triggerTime, false)) {
try {
// we could have blocked a long while
// on 'synchronize', so we must recompute
now = System.currentTimeMillis();
timeUntilTrigger = triggerTime - now;
if(timeUntilTrigger >= 1)
sigLock.wait(timeUntilTrigger);
} catch (InterruptedException ignore) {
}
}
}
// After waiting, check whether a scheduling-change signal arrived in the meantime
if(releaseIfScheduleChangedSignificantly(triggers, triggerTime)) {
break;
}
now = System.currentTimeMillis();
timeUntilTrigger = triggerTime - now;
}
// this happens if releaseIfScheduleChangedSignificantly decided to release triggers
if(triggers.isEmpty())
continue;
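The loop first checks whether an earlier trigger has appeared. If so, the triggers acquired so far are released so that the earlier one can be scheduled.
Otherwise the thread blocks until the fire time arrives.
// set triggers to 'executing'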
List<TriggerFiredResult> bndles = new ArrayList<TriggerFiredResult>();
boolean goAhead = true;
synchronized(sigLock) {
goAhead = !halted.get();
}
if(goAhead) {
// the scheduler has not been halted
try {
// Tell the JobStore that the scheduler is now firing the given triggers
List<TriggerFiredResult> res = qsRsrcs.getJobStore().triggersFired(triggers);
if(res != null)
bndles = res;
} catch (SchedulerException se) {
qs.notifySchedulerListenersError(
"An error occurred while firing triggers '"
+ triggers + "'", se);
//QTZ-179 : a problem occurred interacting with the triggers from the db
//we release them and loop again
for (int i = 0; i < triggers.size(); i++) {
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
}
continue;
}
}
The block above tells the JobStore that the scheduler is now firing the given triggers. The main work happens in triggersFired:
public List<TriggerFiredResult> triggersFired(List<OperableTrigger> firedTriggers) {
synchronized (lock) {
List<TriggerFiredResult> results = new ArrayList<TriggerFiredResult>();
for (OperableTrigger trigger : firedTriggers) {
TriggerWrapper tw = triggersByKey.get(trigger.getKey());
// was the trigger deleted since being acquired?
if (tw == null || tw.trigger == null) {
continue;
}
// was the trigger completed, paused, blocked, etc. since being acquired?
if (tw.state != TriggerWrapper.STATE_ACQUIRED) {
continue;
}
Calendar cal = null;
if (tw.trigger.getCalendarName() != null) {
cal = retrieveCalendar(tw.trigger.getCalendarName());
if(cal == null)
continue;
}
Date prevFireTime = trigger.getPreviousFireTime();
// in case trigger was replaced between acquiring and firing
timeTriggers.remove(tw);
// call triggered on our copy, and the scheduler's copy
tw.trigger.triggered(cal);
trigger.triggered(cal);
//tw.state = TriggerWrapper.STATE_EXECUTING;
tw.state = TriggerWrapper.STATE_WAITING;
TriggerFiredBundle bndle = new TriggerFiredBundle(retrieveJob(
tw.jobKey), trigger, cal,
false, new Date(), trigger.getPreviousFireTime(), prevFireTime,
trigger.getNextFireTime());
JobDetail job = bndle.getJobDetail();
if (job.isConcurrentExectionDisallowed()) {
ArrayList<TriggerWrapper> trigs = getTriggerWrappersForJob(job.getKey());
for (TriggerWrapper ttw : trigs) {
if (ttw.state == TriggerWrapper.STATE_WAITING) {
ttw.state = TriggerWrapper.STATE_BLOCKED;
}
if (ttw.state == TriggerWrapper.STATE_PAUSED) {
ttw.state = TriggerWrapper.STATE_PAUSED_BLOCKED;
}
timeTriggers.remove(ttw);
}
blockedJobs.add(job.getKey());
} else if (tw.trigger.getNextFireTime() != null) {
synchronized (lock) {
timeTriggers.add(tw);
}
}
results.add(new TriggerFiredResult(bndle));
}
return results;
}
}
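Here each acquired trigger is checked against its Calendar (if it has one), its fire times are advanced by triggered(cal), and it is then wrapped in a TriggerFiredBundle object.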
for (int i = 0; i < bndles.size(); i++) {
TriggerFiredResult result = bndles.get(i);
TriggerFiredBundle bndle = result.getTriggerFiredBundle();
Exception exception = result.getException();
if (exception instanceof RuntimeException) {
getLog().error("RuntimeException while firing trigger " + triggers.get(i), exception);
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
continue;
}
// it's possible to get 'null' if the triggers was paused,
// blocked, or other similar occurrences that prevent it being
// fired at this time... or if the scheduler was shutdown (halted)
if (bndle == null) {
qsRsrcs.getJobStore().releaseAcquiredTrigger(triggers.get(i));
continue;
}
// From here on the job gets executed
JobRunShell shell = null;
try {
// Build the execution object; JobRunShell implements Runnable
shell = qsRsrcs.getJobRunShellFactory().createJobRunShell(bndle);
// This instantiates our custom Job class and passes the data the job needs to JobExecutionContextImpl (the argument of our job's execute method)
shell.initialize(qs);
} catch (SchedulerException se) {
qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
continue;
}
// Hand the task to the thread pool so that SimpleThreadPool can run it
if (!qsRsrcs.getThreadPool().runInThread(shell)) {
// this case should never happen, as it is indicative of the
// scheduler being shutdown or a bug in the thread pool or
// a thread pool being used concurrently - which the docs
// say not to do...
getLog().error("ThreadPool.runInThread() return false!");
qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
}
}
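This loops over the bundles built above. If firing a trigger raised an exception earlier, the trigger is released and its state goes from STATE_ACQUIRED back to STATE_WAITING. The main purpose of the loop is to create a JobRunShell, which implements Runnable, and pass it to the thread pool's runInThread: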
public boolean runInThread(Runnable runnable) {
if (runnable == null) {
return false;
}
synchronized (nextRunnableLock) {
handoffPending = true;
// Wait until a worker thread is available
while ((availWorkers.size() < 1) && !isShutdown) {
try {
nextRunnableLock.wait(500);
} catch (InterruptedException ignore) {
}
}
if (!isShutdown) {
WorkerThread wt = (WorkerThread)availWorkers.removeFirst();
busyWorkers.add(wt);
wt.run(runnable);
} else {
// If the thread pool is going down, execute the Runnable
// within a new additional worker thread (no thread from the pool).
WorkerThread wt = new WorkerThread(this, threadGroup,
"WorkerThread-LastJob", prio, isMakeThreadsDaemons(), runnable);
busyWorkers.add(wt);
workers.add(wt);
wt.start();
}
nextRunnableLock.notifyAll();
handoffPending = false;
}
return true;
}
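This code starts the task. While no worker thread is available, it waits in a loop, up to 500 ms per wait, and proceeds only once a worker becomes free or the pool is shutting down.
It then takes one worker from the available list, adds it to the busy list, and hands it the Runnable to run.
Finally, let's see what the JobRunShell does when it runs: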
public void run() {
qs.addInternalSchedulerListener(this);
try {
OperableTrigger trigger = (OperableTrigger) jec.getTrigger();
JobDetail jobDetail = jec.getJobDetail();
do {
JobExecutionException jobExEx = null;
Job job = jec.getJobInstance();
try {
begin();
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't begin execution.", se);
break;
}
// notify job & trigger listeners...
try {
if (!notifyListenersBeginning(jec)) {
break;
}
} catch(VetoedException ve) {
try {
CompletedExecutionInstruction instCode = trigger.executionComplete(jec, null);
qs.notifyJobStoreJobVetoed(trigger, jobDetail, instCode);
// QTZ-205
// Even if trigger got vetoed, we still needs to check to see if it's the trigger's finalized run or not.
if (jec.getTrigger().getNextFireTime() == null) {
qs.notifySchedulerListenersFinalized(jec.getTrigger());
}
complete(true);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error during veto of Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
}
break;
}
long startTime = System.currentTimeMillis();
long endTime = startTime;
// execute the job
try {
log.debug("Calling execute on job " + jobDetail.getKey());
job.execute(jec);
endTime = System.currentTimeMillis();
} catch (JobExecutionException jee) {
endTime = System.currentTimeMillis();
jobExEx = jee;
getLog().info("Job " + jobDetail.getKey() +
" threw a JobExecutionException: ", jobExEx);
} catch (Throwable e) {
endTime = System.currentTimeMillis();
getLog().error("Job " + jobDetail.getKey() +
" threw an unhandled Exception: ", e);
SchedulerException se = new SchedulerException(
"Job threw an unhandled exception.", e);
qs.notifySchedulerListenersError("Job ("
+ jec.getJobDetail().getKey()
+ " threw an exception.", se);
jobExEx = new JobExecutionException(se, false);
}
jec.setJobRunTime(endTime - startTime);
// notify all job listeners
if (!notifyJobListenersComplete(jec, jobExEx)) {
break;
}
CompletedExecutionInstruction instCode = CompletedExecutionInstruction.NOOP;
// update the trigger
try {
instCode = trigger.executionComplete(jec, jobExEx);
} catch (Exception e) {
// If this happens, there's a bug in the trigger...
SchedulerException se = new SchedulerException(
"Trigger threw an unhandled exception.", e);
qs.notifySchedulerListenersError(
"Please report this error to the Quartz developers.",
se);
}
// notify all trigger listeners
if (!notifyTriggerListenersComplete(jec, instCode)) {
break;
}
// update job/trigger or re-execute job
if (instCode == CompletedExecutionInstruction.RE_EXECUTE_JOB) {
jec.incrementRefireCount();
try {
complete(false);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
}
continue;
}
try {
complete(true);
} catch (SchedulerException se) {
qs.notifySchedulerListenersError("Error executing Job ("
+ jec.getJobDetail().getKey()
+ ": couldn't finalize execution.", se);
continue;
}
qs.notifyJobStoreJobComplete(trigger, jobDetail, instCode);
break;
} while (true);
} finally {
qs.removeInternalSchedulerListener(this);
}
}
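It provides a begin() hook that runs before the job executes and a complete() hook that runs after it finishes.
It notifies the listeners in order: first the trigger listeners; if one of them vetoes the execution, the job listeners are told about the veto; otherwise the job listeners are told that the job is about to run.
Finally it calls job.execute(jec);, which is why our task class has to implement the Job interface and its execute method.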