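In a clustered deployment, several servers may try to access the same job at once. Without concurrency control, the data can become inconsistent. Quartz controls concurrent access by taking locks in the database.
Database locks come into play in clustered mode. To enable clustering, set the following in quartz.properties: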
org.quartz.jobStore.isClustered = true
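Quartz takes a lock in many situations, for example when acquiring triggers and when updating state. The analysis below follows the lock taken while acquiring triggers.
When the QuartzSchedulerThread schedules work, it first has to acquire the triggers. The code is: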
    triggers = qsRsrcs.getJobStore().acquireNextTriggers(
            now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
The acquireNextTriggers() method is as follows:
    public List<OperableTrigger> acquireNextTriggers(final long noLaterThan, final int maxCount, final long timeWindow)
            throws JobPersistenceException {
        String lockName;
        if(isAcquireTriggersWithinLock() || maxCount > 1) {
            lockName = LOCK_TRIGGER_ACCESS;
        } else {
            lockName = null;
        }
        return executeInNonManagedTXLock(lockName,
                new TransactionCallback<List<OperableTrigger>>() {
                    public List<OperableTrigger> execute(Connection conn) throws JobPersistenceException {
                        return acquireNextTrigger(conn, noLaterThan, maxCount, timeWindow);
                    }
                },
                new TransactionValidator<List<OperableTrigger>>() {
                    public Boolean validate(Connection conn, List<OperableTrigger> result) throws JobPersistenceException {
                        try {
                            List<FiredTriggerRecord> acquired = getDelegate().selectInstancesFiredTriggerRecords(conn, getInstanceId());
                            Set<String> fireInstanceIds = new HashSet<String>();
                            for (FiredTriggerRecord ft : acquired) {
                                fireInstanceIds.add(ft.getFireInstanceId());
                            }
                            for (OperableTrigger tr : result) {
                                if (fireInstanceIds.contains(tr.getFireInstanceId())) {
                                    return true;
                                }
                            }
                            return false;
                        } catch (SQLException e) {
                            throw new JobPersistenceException("error validating trigger acquisition", e);
                        }
                    }
                });
    }
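The two decisions made in acquireNextTriggers() can be isolated in a small, dependency-free sketch: when a lock name is chosen, and how the validator decides whether an acquisition was actually persisted. The class name, the method names chooseLockName and acquisitionPersisted, and the literal value given to LOCK_TRIGGER_ACCESS are assumptions made for illustration; plain strings stand in for FiredTriggerRecord and OperableTrigger.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AcquireValidationSketch {

    // Assumed value; stands in for the LOCK_TRIGGER_ACCESS constant used above.
    public static final String LOCK_TRIGGER_ACCESS = "TRIGGER_ACCESS";

    // Lock when locking is forced by configuration or when a batch (maxCount > 1) is requested.
    public static String chooseLockName(boolean acquireTriggersWithinLock, int maxCount) {
        if (acquireTriggersWithinLock || maxCount > 1) {
            return LOCK_TRIGGER_ACCESS;
        }
        return null; // no lock: a single-trigger acquisition runs unlocked
    }

    // Same rule as the validator: the acquisition counts as persisted if at least
    // one returned fire instance id is among the ids recorded for this instance.
    public static boolean acquisitionPersisted(List<String> recordedIds, List<String> returnedIds) {
        Set<String> recorded = new HashSet<String>(recordedIds);
        for (String id : returnedIds) {
            if (recorded.contains(id)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(chooseLockName(false, 1));
        System.out.println(chooseLockName(false, 5));
        System.out.println(acquisitionPersisted(Arrays.asList("a1", "a2"), Arrays.asList("a2")));
        System.out.println(acquisitionPersisted(Arrays.asList("a1"), Arrays.asList("b9")));
    }
}
```

The validator only matters when the commit itself fails: it lets the caller check whether the work reached the database anyway before rethrowing the error.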
The executeInNonManagedTXLock() method is as follows:
    protected <T> T executeInNonManagedTXLock(
            String lockName,
            TransactionCallback<T> txCallback, final TransactionValidator<T> txValidator) throws JobPersistenceException {
        boolean transOwner = false;
        Connection conn = null;
        try {
            if (lockName != null) {
                // If we aren't using db locks, then delay getting DB connection
                // until after acquiring the lock since it isn't needed.
                if (getLockHandler().requiresConnection()) {
                    conn = getNonManagedTXConnection();
                }
                transOwner = getLockHandler().obtainLock(conn, lockName);
            }
            if (conn == null) {
                conn = getNonManagedTXConnection();
            }
            final T result = txCallback.execute(conn);
            try {
                commitConnection(conn);
            } catch (JobPersistenceException e) {
                rollbackConnection(conn);
                if (txValidator == null || !retryExecuteInNonManagedTXLock(lockName, new TransactionCallback<Boolean>() {
                    @Override
                    public Boolean execute(Connection conn) throws JobPersistenceException {
                        return txValidator.validate(conn, result);
                    }
                })) {
                    throw e;
                }
            }
            Long sigTime = clearAndGetSignalSchedulingChangeOnTxCompletion();
            if(sigTime != null && sigTime >= 0) {
                signalSchedulingChangeImmediately(sigTime);
            }
            return result;
        } catch (JobPersistenceException e) {
            rollbackConnection(conn);
            throw e;
        } catch (RuntimeException e) {
            rollbackConnection(conn);
            throw new JobPersistenceException("Unexpected runtime exception: "
                    + e.getMessage(), e);
        } finally {
            try {
                releaseLock(lockName, transOwner);
            } finally {
                cleanupConnection(conn);
            }
        }
    }
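Stripped of the JDBC details, the method above follows a fixed order: take the lock, run the callback, commit, and always release the lock in a finally block, rolling back if the callback fails. The sketch below reproduces only that ordering. It is not Quartz code: a ReentrantLock stands in for the database row lock, an event list stands in for the real commit and rollback calls, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class LockedTxSketch {

    // Records the order of the steps so it can be inspected afterwards.
    public final List<String> events = new ArrayList<String>();

    // Stand-in for the database row lock (assumption: in-process lock only).
    private final ReentrantLock lock = new ReentrantLock();

    public <T> T executeInLock(boolean useLock, Supplier<T> work) {
        boolean owner = false;
        try {
            if (useLock) {
                lock.lock();
                owner = true;
                events.add("lock");
            }
            T result = work.get();   // the guarded operation
            events.add("commit");    // placeholder for commitConnection(conn)
            return result;
        } catch (RuntimeException e) {
            events.add("rollback");  // placeholder for rollbackConnection(conn)
            throw e;
        } finally {
            if (owner) {
                lock.unlock();       // released on success and on failure alike
                events.add("unlock");
            }
        }
    }

    public static void main(String[] args) {
        LockedTxSketch sketch = new LockedTxSketch();
        sketch.executeInLock(true, () -> "triggers");
        System.out.println(sketch.events);
    }
}
```

With a real row lock the commit or rollback is what makes the database drop the lock, so in this sketch "unlock" corresponds to the bookkeeping done by releaseLock() rather than to a separate SQL statement.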
getLockHandler() returns a semaphore: SimpleSemaphore in a standalone setup, DBSemaphore in a cluster. In a cluster, the obtainLock() method is as follows:
    public boolean obtainLock(Connection conn, String lockName)
            throws LockException {
        if(log.isDebugEnabled()) {
            log.debug(
                    "Lock '" + lockName + "' is desired by: "
                    + Thread.currentThread().getName());
        }
        if (!isLockOwner(lockName)) {
            executeSQL(conn, lockName, expandedSQL, expandedInsertSQL);
            if(log.isDebugEnabled()) {
                log.debug(
                        "Lock '" + lockName + "' given to: "
                        + Thread.currentThread().getName());
            }
            getThreadLocks().add(lockName);
            //getThreadLocksObtainer().put(lockName, new
            // Exception("Obtainer..."));
        } else if(log.isDebugEnabled()) {
            log.debug(
                    "Lock '" + lockName + "' Is already owned by: "
                    + Thread.currentThread().getName());
        }
        return true;
    }
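obtainLock() only runs the locking SQL when the current thread does not already own the lock, which makes repeated requests from the same thread cheap. A minimal sketch of that bookkeeping follows, assuming the set returned by getThreadLocks() is kept per thread (the excerpt does not show how it is stored). The class name and the sqlCalls counter are hypothetical, and the counter increment is a placeholder for the executeSQL() call.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLockOwnershipSketch {

    // Assumption: lock names owned by a thread are tracked in a per-thread set.
    private final ThreadLocal<Set<String>> threadLocks =
            ThreadLocal.withInitial(() -> new HashSet<String>());

    // Counts how often the (simulated) locking SQL would have been executed.
    public final AtomicInteger sqlCalls = new AtomicInteger();

    public boolean isLockOwner(String lockName) {
        return threadLocks.get().contains(lockName);
    }

    public boolean obtainLock(String lockName) {
        if (!isLockOwner(lockName)) {
            sqlCalls.incrementAndGet();      // placeholder for executeSQL(...)
            threadLocks.get().add(lockName); // remember ownership for this thread
        }
        return true;
    }

    public void releaseLock(String lockName) {
        threadLocks.get().remove(lockName);
    }

    public static void main(String[] args) {
        ThreadLockOwnershipSketch sketch = new ThreadLockOwnershipSketch();
        sketch.obtainLock("TRIGGER_ACCESS");
        sketch.obtainLock("TRIGGER_ACCESS"); // already owned, no second SQL call
        System.out.println(sketch.sqlCalls.get());
    }
}
```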
The executeSQL() method locks the row pessimistically. The code is:
    protected void executeSQL(Connection conn, final String lockName, final String expandedSQL, final String expandedInsertSQL) throws LockException {
        PreparedStatement ps = null;
        ResultSet rs = null;
        SQLException initCause = null;
        // attempt lock two times (to work-around possible race conditions in inserting the lock row the first time running)
        int count = 0;
        do {
            count++;
            try {
                ps = conn.prepareStatement(expandedSQL);
                ps.setString(1, lockName);
                if (getLog().isDebugEnabled()) {
                    getLog().debug(
                            "Lock '" + lockName + "' is being obtained: " +
                            Thread.currentThread().getName());
                }
                rs = ps.executeQuery();
                if (!rs.next()) {
                    getLog().debug(
                            "Inserting new lock row for lock: '" + lockName + "' being obtained by thread: " +
                            Thread.currentThread().getName());
                    rs.close();
                    rs = null;
                    ps.close();
                    ps = null;
                    ps = conn.prepareStatement(expandedInsertSQL);
                    ps.setString(1, lockName);
                    int res = ps.executeUpdate();
                    if(res != 1) {
                        if(count < 3) {
                            // pause a bit to give another thread some time to commit the insert of the new lock row
                            try {
                                Thread.sleep(1000L);
                            } catch (InterruptedException ignore) {
                                Thread.currentThread().interrupt();
                            }
                            // try again ...
                            continue;
                        }
                        throw new SQLException(Util.rtp(
                                "No row exists, and one could not be inserted in table " + TABLE_PREFIX_SUBST + TABLE_LOCKS +
                                " for lock named: " + lockName, getTablePrefix(), getSchedulerNameLiteral()));
                    }
                }
                return; // obtained lock, go
            } catch (SQLException sqle) {
                //Exception src =
                // (Exception)getThreadLocksObtainer().get(lockName);
                //if(src != null)
                // src.printStackTrace();
                //else
                // System.err.println("--- ***************** NO OBTAINER!");
                if(initCause == null)
                    initCause = sqle;
                if (getLog().isDebugEnabled()) {
                    getLog().debug(
                            "Lock '" + lockName + "' was not obtained by: " +
                            Thread.currentThread().getName() + (count < 3 ? " - will try again." : ""));
                }
                if(count < 3) {
                    // pause a bit to give another thread some time to commit the insert of the new lock row
                    try {
                        Thread.sleep(1000L);
                    } catch (InterruptedException ignore) {
                        Thread.currentThread().interrupt();
                    }
                    // try again ...
                    continue;
                }
                throw new LockException("Failure obtaining db row lock: "
                        + sqle.getMessage(), sqle);
            } finally {
                if (rs != null) {
                    try {
                        rs.close();
                    } catch (Exception ignore) {
                    }
                }
                if (ps != null) {
                    try {
                        ps.close();
                    } catch (Exception ignore) {
                    }
                }
            }
        } while(count < 4);
        throw new LockException("Failure obtaining db row lock, reached maximum number of attempts. Initial exception (if any) attached as root cause.", initCause);
    }
expandedSQL: SELECT * FROM qrtz_LOCKS WHERE SCHED_NAME = 'chhSchedule' AND LOCK_NAME = ? FOR UPDATE
expandedInsertSQL: INSERT INTO qrtz_LOCKS(SCHED_NAME, LOCK_NAME) VALUES ('chhSchedule', ?)
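If a row for lockName already exists, the SELECT ... FOR UPDATE locks it directly; if not, the row is inserted. Either the locking query or the insert can fail. On failure the method sleeps for one second and tries again, and when the third attempt also fails it throws a LockException.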
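The retry behaviour can be shown on its own as a bounded loop: try, pause on failure, and give up after a fixed number of attempts. This is a simplified sketch, not the Quartz implementation. The class name, the Attempt interface, and the runWithRetry method are hypothetical; the attempt count and pause are parameters here, whereas executeSQL() hard-codes three attempts and a 1000 ms sleep. The sketch reports the first failure as the cause, like the final throw in executeSQL().

```java
public class BoundedRetrySketch {

    // Hypothetical callback type for one locking attempt.
    public interface Attempt {
        void run() throws Exception;
    }

    // Returns the number of attempts used; throws once maxAttempts have all failed.
    public static int runWithRetry(Attempt attempt, int maxAttempts, long pauseMillis) {
        Exception firstFailure = null;
        for (int count = 1; count <= maxAttempts; count++) {
            try {
                attempt.run();
                return count; // succeeded, stop retrying
            } catch (Exception e) {
                if (firstFailure == null) {
                    firstFailure = e; // keep the initial failure as the root cause
                }
                if (count < maxAttempts) {
                    // Pause to give a competing thread time to commit its lock row.
                    try {
                        Thread.sleep(pauseMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }
        throw new IllegalStateException(
                "Lock not obtained after " + maxAttempts + " attempts", firstFailure);
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Fails twice, then succeeds on the third attempt.
        int used = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new Exception("row busy");
            }
        }, 3, 10L);
        System.out.println("attempts used: " + used);
    }
}
```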
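Once the guarded operation (here, acquiring the triggers) has finished, executeInNonManagedTXLock() commits the transaction and releases the lock in its finally block.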