CheckPointer
Location: package org.neo4j.kernel.impl.transaction.log.checkpoint;
This interface represents a check pointer, which is responsible for writing check points in the transaction log.
The interface only declares these operations. The actual check point work (writing to disk, or calling the methods that write to disk) is not done here.
Methods:
* long checkPointIfNeeded(TriggerInfo triggerInfo) throws IOException;
* long tryCheckPoint(TriggerInfo triggerInfo) throws IOException;
* long tryCheckPoint(TriggerInfo triggerInfo, BooleanSupplier timeout) throws IOException;
* long tryCheckPointNoWait(TriggerInfo triggerInfo) throws IOException;
* long forceCheckPoint(TriggerInfo triggerInfo) throws IOException;
* long lastCheckPointedTransactionId();
checkPointIfNeeded
* This method will verify that the conditions for triggering a check point hold and in such a case it will write a check point in the transaction log.
* This method does NOT handle concurrency since there should be only one check point thread running. (Open question: is this a loophole? Several callers could trigger a check point manually at the same time; could that leave the data out of sync?)
* @param triggerInfo the info describing why check pointing has been triggered pending approval of the threshold check
* @return the transaction id used for the check pointing or -1 if check pointing wasn't needed
* @throws IOException if writing the check point fails
tryCheckPoint
* This method tries to write a check point in the transaction log. If no check point is currently running it will check point; otherwise it will wait for the running check point to complete.
* @param triggerInfo the info describing why check pointing has been triggered
* @return the transaction id used for the check pointing.
* @throws IOException if writing the check point fails
lastCheckPointedTransactionId
* @return the transaction id at which the last checkpoint was made. If there's no checkpoint then {@link TransactionIdStore#BASE_TX_ID} is returned.
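The return conventions quoted above can be made concrete with a toy model. This is a minimal sketch, not Neo4j code: `ToyCheckPointer`, its transaction-count threshold, the `commit()` helper and the in-memory counters are all hypothetical, and the value 1 used for `BASE_TX_ID` is an assumption. Only the contracts come from the Javadoc: -1 when no check point was needed, the last closed transaction id otherwise, and waiting in `tryCheckPoint` when another check point is running.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of the CheckPointer contract. Hypothetical; not Neo4j code. */
final class ToyCheckPointer {
    // Assumption: stands in for TransactionIdStore#BASE_TX_ID.
    static final long BASE_TX_ID = 1;
    static final long NOT_NEEDED = -1;

    private final long txThreshold;
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicLong lastClosedTxId = new AtomicLong(BASE_TX_ID);
    private volatile long lastCheckPointedTxId = BASE_TX_ID;

    ToyCheckPointer(long txThreshold) {
        this.txThreshold = txThreshold;
    }

    /** Simulates one committed (closed) transaction. */
    long commit() {
        return lastClosedTxId.incrementAndGet();
    }

    /** Check points only if enough transactions piled up. No locking, as in the documented contract. */
    long checkPointIfNeeded() {
        if (lastClosedTxId.get() - lastCheckPointedTxId < txThreshold) {
            return NOT_NEEDED;
        }
        return doCheckPoint();
    }

    /** Check points if nobody else is; otherwise waits for the running check point to finish. */
    long tryCheckPoint() {
        if (lock.tryLock()) {
            try {
                return doCheckPoint();
            } finally {
                lock.unlock();
            }
        }
        lock.lock(); // another caller is check pointing: block until it is done
        try {
            return lastCheckPointedTxId;
        } finally {
            lock.unlock();
        }
    }

    long lastCheckPointedTransactionId() {
        return lastCheckPointedTxId;
    }

    private long doCheckPoint() {
        long txId = lastClosedTxId.get();
        lastCheckPointedTxId = txId; // a real check point would flush and write a log record here
        return txId;
    }

    public static void main(String[] args) {
        ToyCheckPointer checkPointer = new ToyCheckPointer(3);
        checkPointer.commit();
        System.out.println(checkPointer.checkPointIfNeeded()); // below the threshold
        checkPointer.commit();
        checkPointer.commit();
        System.out.println(checkPointer.checkPointIfNeeded()); // threshold reached
    }
}
```

The toy also hints at the open question above: `checkPointIfNeeded` takes no lock at all, so it is only safe if a single thread ever calls it.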
CheckPointerImpl
The implementation of the interface above.
The most important work is done in this method:
private long doCheckPoint(TriggerInfo triggerInfo) throws IOException {
var databaseTracer = tracers.getDatabaseTracer();
try (var cursorContext = cursorContextFactory.create(CHECKPOINT_TAG);
LogCheckPointEvent checkPointEvent = databaseTracer.beginCheckPoint()) {
var lastClosedTxData = metadataProvider.getLastClosedTransaction();
var lastClosedTransaction = new TransactionId(
lastClosedTxData.transactionId(), lastClosedTxData.checksum(), lastClosedTxData.commitTimestamp());
long lastClosedTransactionId = lastClosedTransaction.transactionId();
cursorContext.getVersionContext().initWrite(lastClosedTransactionId);
LogPosition logPosition = lastClosedTxData.logPosition();
String checkpointReason = triggerInfo.describe(lastClosedTransactionId);
/*
* Check kernel health before going into waiting for transactions to be closed, to avoid
* getting into a scenario where we would await a condition that would potentially never
* happen.
*/
databaseHealth.assertHealthy(IOException.class);
/*
* First we flush the store. If we fail now or during the flush, on recovery we'll find the
* earlier check point and replay from there all the log entries. Everything will be ok.
*/
log.info(checkpointReason + " checkpoint started...");
Stopwatch startTime = Stopwatch.start();
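// Note: this block seems to be where the IO happens. flushEvent looks like a tracing handle; the flush itself is forceOperation.flushAndForce.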
try (var flushEvent = checkPointEvent.beginDatabaseFlush()) {
forceOperation.flushAndForce(flushEvent, cursorContext);
flushEvent.ioControllerLimit(ioController.configuredLimit());
}
/*
* Check kernel health before going to write the next check point. In case of a panic this check point
* will be aborted, which is the safest alternative so that the next recovery will have a chance to
* repair the damages.
*/
databaseHealth.assertHealthy(IOException.class);
checkpointAppender.checkPoint(
checkPointEvent, lastClosedTransaction, logPosition, clock.instant(), checkpointReason);
threshold.checkPointHappened(lastClosedTransactionId, logPosition);
long durationMillis = startTime.elapsed(MILLISECONDS);
checkPointEvent.checkpointCompleted(durationMillis);
log.info(createCheckpointMessageDescription(checkPointEvent, checkpointReason, durationMillis));
/*
* Prune up to the version pointed from the latest check point,
* since it might be an earlier version than the current log version.
*/
logPruning.pruneLogs(logPosition.getLogVersion());
lastCheckPointedTx = lastClosedTransactionId;
return lastClosedTransactionId;
} catch (Throwable t) {
// Why only log failure here? It's because check point can potentially be made from various
// points of execution e.g. background thread triggering check point if needed and during
// shutdown where it's better to have more control over failure handling.
log.error("Checkpoint failed", t);
throw t;
}
}
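The comments in doCheckPoint argue that flushing first and writing the check point record second is safe, because a failure in between only means recovery starts from the earlier check point. The following is a minimal sketch of that argument, not Neo4j code: `ToyStore`, its maps and the `crashBeforeRecord` flag are hypothetical, and log entries are assumed to be idempotent "set key to value" operations so that replaying them twice is harmless.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy store illustrating "flush first, then record the check point". Hypothetical. */
final class ToyStore {
    /** One idempotent log entry: set key to value. */
    static final class LogEntry {
        final String key;
        final int value;

        LogEntry(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    final List<LogEntry> txLog = new ArrayList<>();     // durable, append-only
    final Map<String, Integer> disk = new HashMap<>();  // durable store files
    final Map<String, Integer> cache = new HashMap<>(); // volatile, lost on crash
    int checkPointPosition = 0;                         // durable: log index of the last check point

    void commit(String key, int value) {
        txLog.add(new LogEntry(key, value)); // the log is written first
        cache.put(key, value);               // the store change stays in memory for now
    }

    void checkPoint(boolean crashBeforeRecord) {
        int position = txLog.size();
        disk.putAll(cache);            // step 1: flush the store
        if (crashBeforeRecord) {
            return;                    // simulated failure between flush and record
        }
        checkPointPosition = position; // step 2: write the check point record
    }

    /** Drops volatile state, then replays the log from the last recorded check point. */
    Map<String, Integer> crashAndRecover() {
        cache.clear();
        for (LogEntry entry : txLog.subList(checkPointPosition, txLog.size())) {
            disk.put(entry.key, entry.value);
        }
        cache.putAll(disk);
        return Map.copyOf(disk);
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.commit("a", 1);
        store.commit("b", 2);
        store.checkPoint(false);
        store.commit("a", 3);
        store.checkPoint(true); // flushed, but the record was never written
        System.out.println(store.crashAndRecover());
    }
}
```

In the toy, the second check point fails after the flush, so recovery replays from position 2 and reapplies `a = 3` on top of a disk that already contains it. Writing the record before the flush would be the dangerous order: a crash in between would skip log entries whose changes never reached disk.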
Classes that may need a closer look here: CursorContext, LogCheckPointEvent, ForceOperation
cursorContext = cursorContextFactory.create(CHECKPOINT_TAG);
flushEvent = checkPointEvent.beginDatabaseFlush()
forceOperation.flushAndForce(flushEvent, cursorContext);
public interface ForceOperation {
void flushAndForce(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException;
}
The implementation of flushAndForce:
public class DefaultForceOperation implements CheckPointerImpl.ForceOperation {
// ... fields and initialization omitted here to save space
@Override
public void flushAndForce(DatabaseFlushEvent databaseFlushEvent, CursorContext cursorContext) throws IOException {
FlushGuard flushGuard = databasePageCache.flushGuard(databaseFlushEvent);
indexingService.checkpoint(databaseFlushEvent, cursorContext);
storageEngine.checkpoint(databaseFlushEvent, cursorContext);
flushGuard.flushUnflushed();
}
}
So... it is still a pile of interfaces.
The most important call is probably storageEngine.checkpoint().
StorageEngine
Location: package org.neo4j.storageengine.api
A StorageEngine provides the functionality to durably store data and read it back.
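StorageEngine provides the ability to write data to disk and read it back. "Durably" means that stored data survives a restart or crash (the D in ACID), not that the service is available for as long as the database runs.
RecordStorageEngine contains the implementation of checkpoint: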
@Override
public void checkpoint(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException {
try (var fileFlushEvent = flushEvent.beginFileFlush()) {
countsStore.checkpoint(fileFlushEvent, cursorContext);
}
try (var fileFlushEvent = flushEvent.beginFileFlush()) {
groupDegreesStore.checkpoint(fileFlushEvent, cursorContext);
}
neoStores.checkpoint(flushEvent, cursorContext);
}
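I don't know yet what GBPTreeCountsStore and RelationshipGroupDegreesStore do.
checkpoint is called three times here.
This is hard going.
First, look at the checkpoint of NeoStores.
Another open question: who calls checkPoint? There are two ways in: the call sites of .checkPoint in the code, and the place that writes the log message (checkPoint is … by database shutdown…).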
NeoStores
public void flush(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException {
pageCache.flushAndForce(flushEvent);
checkpoint(flushEvent, cursorContext);
}
public void checkpoint(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException {
visitStores(store -> {
try (var fileFlushEvent = flushEvent.beginFileFlush()) {
store.getIdGenerator().checkpoint(fileFlushEvent, cursorContext);
}
});
}
SimpleTriggerInfo
Location: package org.neo4j.kernel.impl.transaction.log.checkpoint;
Simple implementation of a trigger info taking in construction the name/description of what triggered the check point and offering the possibility to be enriched with a single optional extra description.
@Override
public String describe(long transactionId) {
    String info = description == null ? triggerName : triggerName + " for " + description;
    return "Checkpoint triggered by \"" + info + "\" @ txId: " + transactionId;
}
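To see which log lines this produces, the describe logic can be copied into a small standalone class. This is a sketch for experimenting, not the Neo4j class: `TriggerInfoSketch` and its `withDescription` method are hypothetical names, and "Scheduler" and "time threshold" are made-up inputs. The string concatenation itself, and the " checkpoint started..." suffix from doCheckPoint, are taken from the code quoted in these notes.

```java
/** Standalone copy of the describe logic for experimenting. Hypothetical class. */
final class TriggerInfoSketch {
    private final String triggerName;
    private String description; // optional extra description, may stay null

    TriggerInfoSketch(String triggerName) {
        this.triggerName = triggerName;
    }

    /** Hypothetical helper that sets the single optional extra description. */
    TriggerInfoSketch withDescription(String description) {
        this.description = description;
        return this;
    }

    String describe(long transactionId) {
        String info = description == null ? triggerName : triggerName + " for " + description;
        return "Checkpoint triggered by \"" + info + "\" @ txId: " + transactionId;
    }

    public static void main(String[] args) {
        String reason = new TriggerInfoSketch("Database shutdown").describe(42);
        // Same concatenation as log.info(checkpointReason + " checkpoint started...") in doCheckPoint.
        System.out.println(reason + " checkpoint started...");
        // prints: Checkpoint triggered by "Database shutdown" @ txId: 42 checkpoint started...

        System.out.println(new TriggerInfoSketch("Scheduler").withDescription("time threshold").describe(42));
        // prints: Checkpoint triggered by "Scheduler for time threshold" @ txId: 42
    }
}
```

Searching the code base for the fixed prefix `Checkpoint triggered by` is therefore one way to trace a log line back to describe and from there to the callers that construct a trigger info.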
Where are the methods of this class called?
CheckpointerLifecycle
@Override
public void shutdown() throws Exception {
// Write new checkpoint in the log only if the database is healthy.
// We cannot throw here since we need to shutdown without exceptions.
if (checkpointOnShutdown && databaseHealth.isHealthy()) {
checkPointer.forceCheckPoint(new SimpleTriggerInfo("Database shutdown"));
}
}
This is where a check point is triggered when the database shuts down.
CheckPointScheduler
This class is where the need for a check point is checked periodically. So which class drives this class?
CheckPointScheduler is used from Database.java.
PeriodicThresholdPolicy
The {@code periodic} check point threshold policy uses the {@link GraphDatabaseSettings#check_point_interval_time} and {@link GraphDatabaseSettings#check_point_interval_tx} to decide when check point processes should be started.
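A threshold built from a time interval and a transaction interval could look roughly like the sketch below. It is not the Neo4j implementation: `PeriodicThresholdSketch` is a hypothetical class, and it rests on two assumptions that the Javadoc above does not confirm, namely that reaching either interval is enough to trigger a check point, and that the time interval is ignored when no transaction has been closed since the last check point. The method name `checkPointHappened` mirrors the threshold.checkPointHappened call in doCheckPoint, but its parameters here are simplified.

```java
import java.time.Duration;

/** Hypothetical sketch of a threshold combining a time interval and a transaction interval. */
final class PeriodicThresholdSketch {
    private final long txInterval;
    private final long timeIntervalNanos;
    private long lastCheckPointedTxId;
    private long lastCheckPointNanos;

    PeriodicThresholdSketch(long txInterval, Duration timeInterval, long initialTxId, long nowNanos) {
        this.txInterval = txInterval;
        this.timeIntervalNanos = timeInterval.toNanos();
        this.lastCheckPointedTxId = initialTxId;
        this.lastCheckPointNanos = nowNanos;
    }

    /** Assumption: either interval being reached is enough to ask for a check point. */
    boolean isCheckPointingNeeded(long lastClosedTxId, long nowNanos) {
        long newTransactions = lastClosedTxId - lastCheckPointedTxId;
        boolean enoughTransactions = newTransactions >= txInterval;
        // Assumption: a time trigger with nothing new to check point is skipped.
        boolean enoughTime = newTransactions > 0 && nowNanos - lastCheckPointNanos >= timeIntervalNanos;
        return enoughTransactions || enoughTime;
    }

    /** Resets both intervals after a check point has been written. */
    void checkPointHappened(long checkPointedTxId, long nowNanos) {
        this.lastCheckPointedTxId = checkPointedTxId;
        this.lastCheckPointNanos = nowNanos;
    }

    public static void main(String[] args) {
        long minute = Duration.ofMinutes(1).toNanos();
        // Illustrative values: 100 transactions or 15 minutes.
        PeriodicThresholdSketch threshold =
                new PeriodicThresholdSketch(100, Duration.ofMinutes(15), 1, 0);
        System.out.println(threshold.isCheckPointingNeeded(50, minute));      // neither interval reached
        System.out.println(threshold.isCheckPointingNeeded(101, minute));     // transaction interval reached
        System.out.println(threshold.isCheckPointingNeeded(2, 15 * minute));  // time interval reached
    }
}
```

A scheduler like CheckPointScheduler would then only need to ask such a threshold at regular intervals and call the check pointer when the answer is yes.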