WARN messages appearing in Quartz cluster scheduling

1. The warning messages are as follows:

This scheduler instance xxxx is still active but was recovered by another instance in the cluster. This may cause inconsistent behavior.
ClusterManager detected 1 failed or restarted instances.

Analysis:
1. The log is printed by LocalDataSourceJobStore, but that class itself contains no such log statement; searching up through its parent classes and interfaces leads to JobStoreSupport, whose key source code is as follows:

protected void clusterRecover(Connection conn, List<SchedulerStateRecord> failedInstances)
        throws JobPersistenceException {

        if (failedInstances.size() > 0) {

            long recoverIds = System.currentTimeMillis();

            logWarnIfNonZero(failedInstances.size(),
                    "ClusterManager: detected " + failedInstances.size()
                            + " failed or restarted instances.");
            // ... remaining lines omitted ...
        }
    }
protected List<SchedulerStateRecord> findFailedInstances(Connection conn)
        throws JobPersistenceException {
        try {
            List<SchedulerStateRecord> failedInstances = new LinkedList<SchedulerStateRecord>();
            boolean foundThisScheduler = false;
            long timeNow = System.currentTimeMillis();
            
            List<SchedulerStateRecord> states = getDelegate().selectSchedulerStateRecords(conn, null);

            for(SchedulerStateRecord rec: states) {
        
                // find own record...
                if (rec.getSchedulerInstanceId().equals(getInstanceId())) {
                    foundThisScheduler = true;
                    if (firstCheckIn) {
                        failedInstances.add(rec);
                    }
                } else {
                    // find failed instances...
                    if (calcFailedIfAfter(rec) < timeNow) {
                        failedInstances.add(rec);
                    }
                }
            }
            
            // The first time through, also check for orphaned fired triggers.
            if (firstCheckIn) {
                failedInstances.addAll(findOrphanedFailedInstances(conn, states));
            }
            
            // If not the first time but we didn't find our own instance, then
            // another instance in the cluster must have recovered on our behalf.
            if ((!foundThisScheduler) && (!firstCheckIn)) {
                // FUTURE_TODO: revisit when handle self-failed-out impl'ed (see FUTURE_TODO in clusterCheckIn() below)
                getLog().warn(
                    "This scheduler instance (" + getInstanceId() + ") is still " + 
                    "active but was recovered by another instance in the cluster.  " +
                    "This may cause inconsistent behavior.");
            }
            
            return failedInstances;
        } catch (Exception e) {
            lastCheckin = System.currentTimeMillis();
            throw new JobPersistenceException("Failure identifying failed instances when checking-in: "
                    + e.getMessage(), e);
        }
    }

The calcFailedIfAfter method called under the // find failed instances... comment looks like this:

protected long calcFailedIfAfter(SchedulerStateRecord rec) {
    return rec.getCheckinTimestamp() +
        Math.max(rec.getCheckinInterval(),
                (System.currentTimeMillis() - lastCheckin)) +
        7500L;
}
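
To make the effect of clock skew concrete, here is a minimal sketch of the same arithmetic. Every timestamp and the check-in interval below are made-up example values; only the formula mirrors calcFailedIfAfter (Math.max(...) is reduced to the interval for simplicity):

public class ClockSkewDemo {

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        long checkinInterval  = 15_000L;              // assumed cluster check-in interval (ms)
        long nodeBClock       = 1_700_000_000_000L;   // node B's current time, by its own clock
        long nodeAClock       = nodeBClock + 30_000L; // node A's clock runs 30 s ahead of node B

        // Node B has just checked in; its row is stamped with node B's clock.
        long checkinTimestamp = nodeBClock;

        // Node A evaluates node B's row with node A's clock, as findFailedInstances does.
        long failedIfAfter = checkinTimestamp + checkinInterval + 7500L;
        boolean consideredFailed = failedIfAfter < nodeAClock;

        // Skew (30 s) > interval (15 s) + grace period (7.5 s), so node B is treated as
        // failed even though it checked in a moment ago.
        System.out.println("considered failed: " + consideredFailed); // prints: true
    }
}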

Because the current instance's own record is not found in the database and this is not its first check-in (!foundThisScheduler && !firstCheckIn), the following log is printed:

This scheduler instance xxxx is still active but was recovered by another instance in the cluster. This may cause inconsistent behavior.

At the same time, another node's check-in looks timed out. An instance is added to failedInstances only when its last check-in timestamp plus its check-in interval plus the 7500 ms grace period is already earlier than the local clock, i.e. when the time difference between the servers is large. Because at least one node appears to have timed out, clusterRecover is called and the following log is printed:

ClusterManager detected 1 failed or restarted instances.
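
One way to check the clock-skew hypothesis is to compare each instance's LAST_CHECKIN_TIME in the scheduler state table against the clock of the machine running the query. The sketch below assumes the standard QRTZ_ table prefix and a hypothetical JDBC URL and credentials; adjust both to your own environment:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CheckinSkewReport {

    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; replace with your own.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/quartz", "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT INSTANCE_NAME, LAST_CHECKIN_TIME, CHECKIN_INTERVAL "
                             + "FROM QRTZ_SCHEDULER_STATE")) {

            long now = System.currentTimeMillis(); // clock of the machine running this query

            while (rs.next()) {
                String instance  = rs.getString("INSTANCE_NAME");
                long lastCheckin = rs.getLong("LAST_CHECKIN_TIME");
                long interval    = rs.getLong("CHECKIN_INTERVAL");

                // A healthy node should have checked in within roughly one interval + 7.5 s;
                // a much larger gap, or a negative one, points to skew between server clocks.
                System.out.printf("%s: last check-in %d ms ago (interval %d ms)%n",
                        instance, now - lastCheckin, interval);
            }
        }
    }
}

If a node that is known to be running shows a gap much larger than its interval (or a negative gap), the servers' clocks have drifted apart.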

So this problem is essentially caused by unsynchronized server clocks; synchronizing the time of the machines in the cluster resolves it. My study of the source code is still in progress, so if anything above is wrong, corrections are very welcome and much appreciated!
