Preface
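I have been using the XxlJob scheduling framework for a while, but I did not understand it well enough, and that led to a production incident. For business reasons my job runs for a long time, and its routing strategy was set to round robin. While one run was still in progress, the next trigger time arrived. Because the blocking strategy was set to "discard later schedules" (DISCARD_LATER), I assumed the second trigger would simply be dropped. It was not: the two runs executed in parallel and corrupted the data.
Roughly, XxlJob works like this: a background thread in the scheduling center periodically fetches the jobs whose trigger time has arrived, reads from the job's configuration the addresses of the executor nodes registered with the scheduling center, picks one node using the configured routing strategy, and makes a remote call to it. What gets called is the job handler we wrote (my project runs in Bean mode).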
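To make the routing step concrete, here is a minimal sketch of round-robin selection over a list of registered executor addresses. The class name RoundRobinRouterSketch, the route method, and the node addresses are made up for illustration; this is not XxlJob's own router code, only the general idea.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of round-robin routing; not XxlJob's router classes.
class RoundRobinRouterSketch {
    private final AtomicInteger counter = new AtomicInteger();

    // Pick the next address in rotation from the registered executor list.
    String route(List<String> addresses) {
        int index = Math.floorMod(counter.getAndIncrement(), addresses.size());
        return addresses.get(index);
    }

    public static void main(String[] args) {
        RoundRobinRouterSketch router = new RoundRobinRouterSketch();
        // Assumed addresses, for demonstration only.
        List<String> nodes = Arrays.asList("http://node-a:9999", "http://node-b:9999");
        for (int i = 0; i < 4; i++) {
            System.out.println("trigger " + i + " -> " + router.route(nodes));
        }
    }
}
```

With two nodes, consecutive triggers of the same job alternate between them, which is the precondition for the incident described above.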
On the executor side, the handler that receives this remote call dispatches on the request URI:
private Object process(HttpMethod httpMethod, String uri, String requestData, String accessTokenReq) {
    if (HttpMethod.POST != httpMethod) {
        return new ReturnT(500, "invalid request, HttpMethod not support.");
    } else if (uri != null && uri.trim().length() != 0) {
        if (this.accessToken != null && this.accessToken.trim().length() > 0 && !this.accessToken.equals(accessTokenReq)) {
            return new ReturnT(500, "The access token is wrong.");
        } else {
            try {
                if ("/beat".equals(uri)) {
                    return this.executorBiz.beat();
                } else if ("/idleBeat".equals(uri)) {
                    IdleBeatParam idleBeatParam = (IdleBeatParam)GsonTool.fromJson(requestData, IdleBeatParam.class);
                    return this.executorBiz.idleBeat(idleBeatParam);
                } else if ("/run".equals(uri)) {
                    TriggerParam triggerParam = (TriggerParam)GsonTool.fromJson(requestData, TriggerParam.class);
                    return this.executorBiz.run(triggerParam);
                } else if ("/kill".equals(uri)) {
                    KillParam killParam = (KillParam)GsonTool.fromJson(requestData, KillParam.class);
                    return this.executorBiz.kill(killParam);
                } else if ("/log".equals(uri)) {
                    LogParam logParam = (LogParam)GsonTool.fromJson(requestData, LogParam.class);
                    return this.executorBiz.log(logParam);
                } else {
                    return new ReturnT(500, "invalid request, uri-mapping(" + uri + ") not found.");
                }
            } catch (Exception var6) {
                logger.error(var6.getMessage(), var6);
                return new ReturnT(500, "request error:" + ThrowableUtil.toString(var6));
            }
        }
    } else {
        return new ReturnT(500, "invalid request, uri-mapping empty.");
    }
}
The /run path is the one matched when the scheduling center triggers a job; the actual work is done by the ExecutorBizImpl.run method:
public ReturnT<String> run(TriggerParam triggerParam) {
    JobThread jobThread = XxlJobExecutor.loadJobThread(triggerParam.getJobId());
    IJobHandler jobHandler = jobThread != null ? jobThread.getHandler() : null;
    String removeOldReason = null;
    GlueTypeEnum glueTypeEnum = GlueTypeEnum.match(triggerParam.getGlueType());
    IJobHandler originJobHandler;
    if (GlueTypeEnum.BEAN == glueTypeEnum) {
        originJobHandler = XxlJobExecutor.loadJobHandler(triggerParam.getExecutorHandler());
        if (jobThread != null && jobHandler != originJobHandler) {
            removeOldReason = "change jobhandler or glue type, and terminate the old job thread.";
            jobThread = null;
            jobHandler = null;
        }
        if (jobHandler == null) {
            jobHandler = originJobHandler;
            if (originJobHandler == null) {
                return new ReturnT(500, "job handler [" + triggerParam.getExecutorHandler() + "] not found.");
            }
        }
    } else if (GlueTypeEnum.GLUE_GROOVY == glueTypeEnum) {
        if (jobThread != null && (!(jobThread.getHandler() instanceof GlueJobHandler) || ((GlueJobHandler)jobThread.getHandler()).getGlueUpdatetime() != triggerParam.getGlueUpdatetime())) {
            removeOldReason = "change job source or glue type, and terminate the old job thread.";
            jobThread = null;
            jobHandler = null;
        }
        if (jobHandler == null) {
            try {
                originJobHandler = GlueFactory.getInstance().loadNewInstance(triggerParam.getGlueSource());
                jobHandler = new GlueJobHandler(originJobHandler, triggerParam.getGlueUpdatetime());
            } catch (Exception var7) {
                logger.error(var7.getMessage(), var7);
                return new ReturnT(500, var7.getMessage());
            }
        }
    } else {
        if (glueTypeEnum == null || !glueTypeEnum.isScript()) {
            return new ReturnT(500, "glueType[" + triggerParam.getGlueType() + "] is not valid.");
        }
        if (jobThread != null && (!(jobThread.getHandler() instanceof ScriptJobHandler) || ((ScriptJobHandler)jobThread.getHandler()).getGlueUpdatetime() != triggerParam.getGlueUpdatetime())) {
            removeOldReason = "change job source or glue type, and terminate the old job thread.";
            jobThread = null;
            jobHandler = null;
        }
        if (jobHandler == null) {
            jobHandler = new ScriptJobHandler(triggerParam.getJobId(), triggerParam.getGlueUpdatetime(), triggerParam.getGlueSource(), GlueTypeEnum.match(triggerParam.getGlueType()));
        }
    }
    if (jobThread != null) {
        ExecutorBlockStrategyEnum blockStrategy = ExecutorBlockStrategyEnum.match(triggerParam.getExecutorBlockStrategy(), (ExecutorBlockStrategyEnum)null);
        if (ExecutorBlockStrategyEnum.DISCARD_LATER == blockStrategy) {
            if (jobThread.isRunningOrHasQueue()) {
                return new ReturnT(500, "block strategy effect:" + ExecutorBlockStrategyEnum.DISCARD_LATER.getTitle());
            }
        } else if (ExecutorBlockStrategyEnum.COVER_EARLY == blockStrategy && jobThread.isRunningOrHasQueue()) {
            removeOldReason = "block strategy effect:" + ExecutorBlockStrategyEnum.COVER_EARLY.getTitle();
            jobThread = null;
        }
    }
    if (jobThread == null) {
        jobThread = XxlJobExecutor.registJobThread(triggerParam.getJobId(), (IJobHandler)jobHandler, removeOldReason);
    }
    ReturnT<String> pushResult = jobThread.pushTriggerQueue(triggerParam);
    return pushResult;
}
How the three strategies are handled
public static JobThread registJobThread(int jobId, IJobHandler handler, String removeOldReason) {
    JobThread newJobThread = new JobThread(jobId, handler);
    newJobThread.start();
    logger.info(">>>>>>>>>>> xxl-job regist JobThread success, jobId:{}, handler:{}", new Object[]{jobId, handler});
    JobThread oldJobThread = (JobThread)jobThreadRepository.put(jobId, newJobThread);
    if (oldJobThread != null) {
        oldJobThread.toStop(removeOldReason);
        oldJobThread.interrupt();
    }
    return newJobThread;
}
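XxlJobExecutor.registJobThread registers and starts a thread keyed by job ID. Registering a new thread stops the old thread stored under the same ID, which is exactly what the cover strategy needs. JobThread's run method takes triggers off its queue and executes them.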
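The "one thread per job, fed by a queue" idea can be sketched as follows. QueueWorkerSketch and all of its members are hypothetical names, and the behavior is deliberately simplified: this worker drains whatever is still queued before it stops, which is an assumption of the sketch and not a statement about how the real JobThread treats leftover triggers.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in for a per-job worker thread.
class QueueWorkerSketch extends Thread {
    private final LinkedBlockingQueue<String> triggerQueue = new LinkedBlockingQueue<>();
    private final List<String> executed = new CopyOnWriteArrayList<>();
    private volatile boolean toStop = false;

    void pushTrigger(String param) { triggerQueue.add(param); }

    void requestStop() { toStop = true; }

    @Override
    public void run() {
        // Keep going until a stop was requested AND the queue is empty.
        while (!toStop || !triggerQueue.isEmpty()) {
            try {
                // Wait briefly for the next trigger so the stop flag is re-checked regularly.
                String param = triggerQueue.poll(100, TimeUnit.MILLISECONDS);
                if (param != null) {
                    executed.add(param); // "execute" the trigger, strictly one at a time
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Start a worker, queue the given triggers, stop it, and return the execution order.
    static List<String> runAndCollect(String... params) {
        QueueWorkerSketch worker = new QueueWorkerSketch();
        worker.start();
        for (String p : params) {
            worker.pushTrigger(p);
        }
        worker.requestStop();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.executed;
    }

    public static void main(String[] args) {
        System.out.println(runAndCollect("trigger-1", "trigger-2", "trigger-3"));
    }
}
```

Because a single thread consumes a FIFO queue, triggers for the same job on the same node can never overlap; they run in the order they were pushed.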
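Blocking strategies
1. When the blocking strategy is DISCARD_LATER (discard later schedules) and the job's thread is still running an earlier trigger or has triggers queued, the new trigger is discarded and the executor returns new ReturnT(500, "block strategy effect:" + ExecutorBlockStrategyEnum.DISCARD_LATER.getTitle()).
2. When the blocking strategy is COVER_EARLY (cover earlier schedules) and the thread is busy, jobThread is set to null, so execution enters XxlJobExecutor.registJobThread, which stops the previous thread and starts a new one. That is how the override is achieved.
3. With any other blocking strategy (that is, single-machine serial execution), the trigger is simply added to the queue and waits for the thread to reach it. Because the thread takes its work from that same queue, triggers run one after another.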
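The three branches above boil down to a small decision table. The sketch below restates it as a pure function; the Strategy and Action enums and the decide method are local, illustrative names, and the sketch ignores the handler-change checks that the real run method also performs.

```java
// Hypothetical restatement of the blocking-strategy decision; not XxlJob source code.
class BlockStrategySketch {
    enum Strategy { SERIAL_EXECUTION, DISCARD_LATER, COVER_EARLY }

    enum Action { ENQUEUE, REJECT, REPLACE_THREAD_THEN_ENQUEUE }

    // Decide what an executor does with a new trigger, given the local thread state.
    static Action decide(Strategy strategy, boolean runningOrHasQueue) {
        if (!runningOrHasQueue) {
            return Action.ENQUEUE; // idle thread: every strategy just accepts the trigger
        }
        switch (strategy) {
            case DISCARD_LATER:
                return Action.REJECT; // drop the new trigger
            case COVER_EARLY:
                return Action.REPLACE_THREAD_THEN_ENQUEUE; // stop the old thread, start a new one
            default:
                return Action.ENQUEUE; // serial: wait in the queue behind the running trigger
        }
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            System.out.println(s + " while busy -> " + decide(s, true));
        }
    }
}
```

Note that the only input besides the strategy is the state of the thread on the node that received the trigger; nothing here looks at other nodes.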
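Summary
All three blocking strategies only work within a single machine, because the code for every one of them lives on the client (executor) side. If a job runs for a long time, the next trigger fires, and the routing strategy sends that trigger to a different machine, the two runs execute in parallel: blocking strategies have no effect across nodes. If you need that guarantee, set the routing strategy to "first" or "last". Whichever you pick, the point is that every trigger lands on the same machine, and then the blocking strategy takes effect.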
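The cross-node gap can be reproduced with a toy simulation. Everything here (CrossNodeBlockingSketch, Node, acceptedTriggers) is hypothetical and only models the DISCARD_LATER check; the job is assumed to still be running when the second trigger arrives.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: each executor node only knows about jobs running on itself.
class CrossNodeBlockingSketch {
    static class Node {
        final String name;
        final Set<Integer> runningJobs = new HashSet<>();

        Node(String name) { this.name = name; }

        // Local DISCARD_LATER check; returns true when the trigger is accepted.
        boolean trigger(int jobId) {
            if (runningJobs.contains(jobId)) {
                return false; // this node is already running the job: discard
            }
            runningJobs.add(jobId); // the long-running job never finishes in this model
            return true;
        }
    }

    // Fire overlapping triggers of job 1 and count how many runs are accepted.
    static int acceptedTriggers(List<Node> nodes, boolean roundRobin, int triggers) {
        int accepted = 0;
        for (int i = 0; i < triggers; i++) {
            Node target = roundRobin ? nodes.get(i % nodes.size()) : nodes.get(0);
            if (target.trigger(1)) {
                accepted++;
            }
        }
        return accepted;
    }

    static List<Node> twoNodes() {
        return Arrays.asList(new Node("node-a"), new Node("node-b"));
    }

    public static void main(String[] args) {
        System.out.println("round robin accepted: " + acceptedTriggers(twoNodes(), true, 2));
        System.out.println("first-node accepted: " + acceptedTriggers(twoNodes(), false, 2));
    }
}
```

With round robin, the second trigger reaches an idle node and is accepted, so two runs overlap; when every trigger is pinned to the first node, the second one is discarded as intended.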