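Problem
Service: jobmgr (job scheduling module)
There are currently three kinds of job:
- HttpPing probe job: periodically issues a GET against a URL to check that the service is alive (for multi-node services, the domain name is resolved to individual IPs)
- Periodic aggregation job that produces reports/alerts (every hour it queries the data to see whether a threshold has been reached and, if so, triggers an alert)
- Alert recovery job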
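To make the relationship between the periodic alert job and the recovery job concrete, here is a minimal sketch of the decision a single periodic run might make. It is not the jobmgr code: the class `ThresholdCheck`, the `Outcome` enum, the `evaluate` method, and the rule "sum of samples >= threshold" are all hypothetical names and assumptions chosen for illustration, and it uses only the JDK, with no Quartz involved.

```java
import java.util.List;

public class ThresholdCheck {
    /** Possible results of one periodic evaluation (hypothetical). */
    enum Outcome { ALERT, RECOVER, NONE }

    /**
     * Decide what one periodic run should do.
     *
     * @param samples     values aggregated over the last period (assumed input)
     * @param threshold   alert threshold (assumed to apply to the sum)
     * @param alertActive whether an alert is currently open
     */
    static Outcome evaluate(List<Double> samples, double threshold, boolean alertActive) {
        double sum = 0;
        for (double v : samples) {
            sum += v;
        }
        if (sum >= threshold) {
            return Outcome.ALERT;   // threshold reached: raise an alert
        }
        // Below threshold: recover only if an alert is currently open.
        return alertActive ? Outcome.RECOVER : Outcome.NONE;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(List.of(3.0, 4.0), 5.0, false));
        System.out.println(evaluate(List.of(1.0), 5.0, true));
    }
}
```

Keeping the decision in a pure function like this means it can be tested without a scheduler, which helps when a job goes quiet and it is unclear whether the logic or the scheduling is at fault.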
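Symptoms:
The alert recovery job is a new job added last night, and it also touches the periodic alert job (logic added to the original: once a periodic alert run completes without error, it creates a recovery job).
After that I deployed to the dev environment and got errors. The code had some bugs; after fixing them I debugged locally (with the dev environment stopped) and found that the probe jobs ran normally, while none of the periodic alert jobs showed any activity, and no exception was thrown.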
Investigation
List the running jobs:
@Override
public List<JobView> listJobs() throws JobSchedueException { // TODO: too many records?
    List<JobView> jobViews = new ArrayList<>();
    try {
        for (String groupName : sched.getJobGroupNames()) {
            for (JobKey jobKey : sched.getJobKeys(GroupMatcher.jobGroupEquals(groupName))) {
                String jobName = jobKey.getName();
                String jobGroup = jobKey.getGroup();
                // name and group
                JobDetail jobDetail = new JobDetail();
                jobDetail.setName(jobName);
                jobDetail.setGroup(jobGroup);
                JobView jobView = new JobView();
                jobView.setJobDetail(jobDetail);
                // params
                org.quartz.JobDetail qJobDetail = sched.getJobDetail(jobKey);
                if (null != qJobDetail.getJobDataMap()) {
                    Map<String, Object> params = new HashMap<>();
                    params.putAll(qJobDetail.getJobDataMap());
                    jobDetail.setParams(params);
                }
                List<Trigger> triggers = (List<Trigger>) sched.getTriggersOfJob(jobKey);
                for (Trigger trigger : triggers) { // there should be only one
                    trigger.getNextFireTime();
                    if (trigger instanceof CronTrigger) {
                        CronTrigger cronTrigger = (Cron