Classes in Neo4j that keep in-memory contents in sync with on-disk data (CHECKPOINT AND FLUSH) (part 1)


Ella486900


Tags: neo4j

Article link: https://blog.csdn.net/Ella486900/article/details/129331188


CheckPointer

Location: package org.neo4j.kernel.impl.transaction.log.checkpoint;
This interface represent a check pointer which is responsible to write check points in the transaction log.
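In other words, the job of this interface is to write check points into the transaction log.
The actual checkpoint work (writing to disk, or calling the methods that write to disk) does not happen here, though.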

Methods:
* long checkPointIfNeeded(TriggerInfo triggerInfo) throws IOException;
* long tryCheckPoint(TriggerInfo triggerInfo) throws IOException;
* long tryCheckPoint(TriggerInfo triggerInfo, BooleanSupplier timeout) throws IOException;
* long tryCheckPointNoWait(TriggerInfo triggerInfo) throws IOException;
* long forceCheckPoint(TriggerInfo triggerInfo) throws IOException;
* long lastCheckPointedTransactionId();

checkPointIfNeeded

 * This method will verify that the conditions for triggering a check point hold and in such a case it will write a check point in the transaction log.
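 * This method does NOT handle concurrency since there should be only one check point thread running. (Author's note: the method itself does nothing about concurrent callers, because only one check point thread is expected to run. Is this a gap? Several callers could invoke a check point by hand at the same time; could that leave memory and disk out of sync? By contrast, tryCheckPoint below is documented to wait for a running check point to complete.)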
 * @param triggerInfo the info describing why check pointing has been triggered pending approval of the threshold check
 * @return the transaction id used for the check pointing or -1 if check pointing wasn't needed
 * @throws IOException if writing the check point fails

tryCheckPoint

 * This method tries the write of a check point in the transaction log. If there is no running check pointing it will check point otherwise it will wait for the running check pointing to complete.
 * @param triggerInfo the info describing why check pointing has been triggered
 * @return the transaction id used for the check pointing.
 * @throws IOException if writing the check point fails
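
To make the contract above concrete ("if nothing is running, check point; otherwise wait for the running one"), here is a minimal sketch. It is not Neo4j code: the class `SingleCheckPointSketch`, its `commit()` helper and the use of a plain `ReentrantLock` are all assumptions made for illustration, and the real CheckPointerImpl may coordinate callers differently.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, NOT Neo4j code: at most one check point runs at a time.
class SingleCheckPointSketch {
    // Assumption: a plain lock stands in for whatever Neo4j really uses.
    private final ReentrantLock mutex = new ReentrantLock();
    private final AtomicLong lastClosedTxId = new AtomicLong(1);
    private volatile long lastCheckPointedTx = 1;
    private int checkPointsWritten; // only touched while holding the mutex

    // Pretend that one more transaction was committed and closed.
    void commit() {
        lastClosedTxId.incrementAndGet();
    }

    // If nobody is check pointing, do it now. Otherwise wait for the
    // running check point and return the transaction id it covered.
    long tryCheckPoint() {
        if (mutex.tryLock()) {
            try {
                return doCheckPoint();
            } finally {
                mutex.unlock();
            }
        }
        mutex.lock(); // blocks until the running check point is done
        try {
            return lastCheckPointedTx;
        } finally {
            mutex.unlock();
        }
    }

    private long doCheckPoint() {
        long txId = lastClosedTxId.get();
        checkPointsWritten++;
        lastCheckPointedTx = txId;
        return txId;
    }

    int checkPointsWritten() {
        mutex.lock();
        try {
            return checkPointsWritten;
        } finally {
            mutex.unlock();
        }
    }

    public static void main(String[] args) {
        SingleCheckPointSketch sketch = new SingleCheckPointSketch();
        sketch.commit();
        System.out.println(sketch.tryCheckPoint()); // prints 2
    }
}
```

With a lock like this, concurrent callers are serialized, which is one possible answer to the question raised under checkPointIfNeeded: the unsafe case would only arise if a method skipped such coordination and was still called from several threads.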

lastCheckPointedTransactionId

 * @return the transaction id which the last checkpoint was made it. If there's no checkpoint then {@link TransactionIdStore#BASE_TX_ID} is returned.

CheckPointerImpl

The implementation of the interface above.
The most important work is done in this method:

    private long doCheckPoint(TriggerInfo triggerInfo) throws IOException {
        var databaseTracer = tracers.getDatabaseTracer();
        try (var cursorContext = cursorContextFactory.create(CHECKPOINT_TAG);
                LogCheckPointEvent checkPointEvent = databaseTracer.beginCheckPoint()) {
            var lastClosedTxData = metadataProvider.getLastClosedTransaction();
            var lastClosedTransaction = new TransactionId(
                    lastClosedTxData.transactionId(), lastClosedTxData.checksum(), lastClosedTxData.commitTimestamp());
            long lastClosedTransactionId = lastClosedTransaction.transactionId();
            cursorContext.getVersionContext().initWrite(lastClosedTransactionId);
            LogPosition logPosition = lastClosedTxData.logPosition();
            String checkpointReason = triggerInfo.describe(lastClosedTransactionId);
            /*
             * Check kernel health before going into waiting for transactions to be closed, to avoid
             * getting into a scenario where we would await a condition that would potentially never
             * happen.
             */
            databaseHealth.assertHealthy(IOException.class);
            /*
             * First we flush the store. If we fail now or during the flush, on recovery we'll find the
             * earlier check point and replay from there all the log entries. Everything will be ok.
             */
            log.info(checkpointReason + " checkpoint started...");
            Stopwatch startTime = Stopwatch.start();
            
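            // Author's note: this block appears to be where the IO happens!
            // flushAndForce does the flushing; flushEvent seems to only trace it.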
            try (var flushEvent = checkPointEvent.beginDatabaseFlush()) {
                forceOperation.flushAndForce(flushEvent, cursorContext);
                flushEvent.ioControllerLimit(ioController.configuredLimit());
            }

            /*
             * Check kernel health before going to write the next check point.  In case of a panic this check point
             * will be aborted, which is the safest alternative so that the next recovery will have a chance to
             * repair the damages.
             */
            databaseHealth.assertHealthy(IOException.class);
            checkpointAppender.checkPoint(
                    checkPointEvent, lastClosedTransaction, logPosition, clock.instant(), checkpointReason);
            threshold.checkPointHappened(lastClosedTransactionId, logPosition);
            long durationMillis = startTime.elapsed(MILLISECONDS);
            checkPointEvent.checkpointCompleted(durationMillis);
            log.info(createCheckpointMessageDescription(checkPointEvent, checkpointReason, durationMillis));

            /*
             * Prune up to the version pointed from the latest check point,
             * since it might be an earlier version than the current log version.
             */
            logPruning.pruneLogs(logPosition.getLogVersion());
            lastCheckPointedTx = lastClosedTransactionId;
            return lastClosedTransactionId;
        } catch (Throwable t) {
            // Why only log failure here? It's because check point can potentially be made from various
            // points of execution e.g. background thread triggering check point if needed and during
            // shutdown where it's better to have more control over failure handling.
            log.error("Checkpoint failed", t);
            throw t;
        }
    }

Classes that may deserve attention here: CursorContext, LogCheckPointEvent, ForceOperation
cursorContext = cursorContextFactory.create(CHECKPOINT_TAG);
flushEvent = checkPointEvent.beginDatabaseFlush()
forceOperation.flushAndForce(flushEvent, cursorContext);
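
The comments in doCheckPoint explain why the order is "flush the store, then write the check point record, then prune the log": if anything fails before the record is written, recovery still finds the earlier check point and replays from there. The toy model below is a sketch of that reasoning only. Every name in it (`CheckPointOrderSketch`, `LogEntry`, `crashAndRecover`) is hypothetical, the "disk" is just a map, and real log pruning works on whole log files rather than single entries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, NOT Neo4j code: flush, then check point, then prune.
class CheckPointOrderSketch {
    static final class LogEntry {
        final long txId;
        final String key;
        final String value;

        LogEntry(long txId, String key, String value) {
            this.txId = txId;
            this.key = key;
            this.value = value;
        }
    }

    final List<LogEntry> txLog = new ArrayList<>();          // survives a crash
    final Map<String, String> storeOnDisk = new HashMap<>(); // survives a crash
    final Map<String, String> pageCache = new HashMap<>();   // lost on a crash
    long lastCheckPointedTx = 0;                             // survives a crash
    long lastClosedTx = 0;

    // A commit appends to the log first and then changes the in-memory pages.
    void commit(String key, String value) {
        lastClosedTx++;
        txLog.add(new LogEntry(lastClosedTx, key, value));
        pageCache.put(key, value);
    }

    long checkPoint() {
        long txId = lastClosedTx;
        storeOnDisk.putAll(pageCache);         // 1. flush the store
        lastCheckPointedTx = txId;             // 2. write the check point record
        txLog.removeIf(e -> e.txId <= txId);   // 3. prune what is now redundant
        return txId;
    }

    // Simulated recovery: start from the store files and replay every log
    // entry that is newer than the last check point.
    Map<String, String> crashAndRecover() {
        Map<String, String> recovered = new HashMap<>(storeOnDisk);
        for (LogEntry e : txLog) {
            if (e.txId > lastCheckPointedTx) {
                recovered.put(e.key, e.value);
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        CheckPointOrderSketch db = new CheckPointOrderSketch();
        db.commit("a", "1");
        db.checkPoint();
        db.commit("a", "2"); // never flushed, only in the log
        System.out.println(db.crashAndRecover().get("a")); // prints 2
    }
}
```

If step 2 ran before step 1 and the process died in between, recovery would trust a check point whose data never reached the store, which is the situation the real ordering avoids.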

public interface ForceOperation {
    void flushAndForce(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException;
}

The implementation of flushAndForce:

public class DefaultForceOperation implements CheckPointerImpl.ForceOperation {
    // ... some initialization code omitted here to save space

    @Override
    public void flushAndForce(DatabaseFlushEvent databaseFlushEvent, CursorContext cursorContext)
            throws IOException {
        FlushGuard flushGuard = databasePageCache.flushGuard(databaseFlushEvent);
        indexingService.checkpoint(databaseFlushEvent, cursorContext);
        storageEngine.checkpoint(databaseFlushEvent, cursorContext);
        flushGuard.flushUnflushed();
    }
}

So... it is still just a stack of interfaces.
The most important call is probably storageEngine.checkpoint().

StorageEngine

Location: package org.neo4j.storageengine.api
A StorageEngine provides the functionality to durably store data and read it back.
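StorageEngine provides the ability to read and write the files on disk. "Durably" refers to durability: once data has been stored, it survives restarts and crashes of the database, not merely that the service is available while the database is running.
RecordStorageEngine contains an implementation of checkpoint: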

@Override
public void checkpoint(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException {
    try (var fileFlushEvent = flushEvent.beginFileFlush()) {
        countsStore.checkpoint(fileFlushEvent, cursorContext);
    }
    try (var fileFlushEvent = flushEvent.beginFileFlush()) {
        groupDegreesStore.checkpoint(fileFlushEvent, cursorContext);
    }
    neoStores.checkpoint(flushEvent, cursorContext);
}

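I do not yet know what GBPTreeCountsStore and RelationshipGroupDegreesStore (the types behind countsStore and groupDegreesStore) are for.
checkpoint is called three times here.
Sigh, this is hard.
Let's look at the checkpoint in NeoStores first.
Another open question: who calls checkPoint? There are two ways in. One is to find the places in the code that call .checkPoint, the other is to find where the log message is produced (checkPoint is … by database shutdown…).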
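
Each checkpoint call above is wrapped in `try (var fileFlushEvent = flushEvent.beginFileFlush())`. As far as I can tell this is the usual try-with-resources tracing pattern: the event is opened, the work runs, and the event is closed even if the work throws. The sketch below only illustrates that pattern. `ToyDatabaseFlushEvent` and `ToyFileFlushEvent` are hypothetical stand-ins and do not mirror the real Neo4j tracing interfaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, NOT Neo4j code: one traced event per store flush.
class FlushEventSketch {
    static class ToyFileFlushEvent implements AutoCloseable {
        private final List<String> trace;
        private final int id;

        ToyFileFlushEvent(List<String> trace, int id) {
            this.trace = trace;
            this.id = id;
            trace.add("begin#" + id);
        }

        void note(String storeName) {
            trace.add("flush#" + id + " " + storeName);
        }

        @Override
        public void close() {
            trace.add("end#" + id); // runs even if the body throws
        }
    }

    static class ToyDatabaseFlushEvent {
        final List<String> trace = new ArrayList<>();
        private int nextId = 1;

        ToyFileFlushEvent beginFileFlush() {
            return new ToyFileFlushEvent(trace, nextId++);
        }
    }

    // Same shape as the checkpoint method above: one file flush event per store.
    static void checkpoint(ToyDatabaseFlushEvent flushEvent, List<String> storeNames) {
        for (String name : storeNames) {
            try (ToyFileFlushEvent fileFlushEvent = flushEvent.beginFileFlush()) {
                fileFlushEvent.note(name);
            }
        }
    }

    public static void main(String[] args) {
        ToyDatabaseFlushEvent event = new ToyDatabaseFlushEvent();
        checkpoint(event, Arrays.asList("counts", "degrees"));
        System.out.println(event.trace);
    }
}
```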

NeoStores

public void flush(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException {
    pageCache.flushAndForce(flushEvent);
    checkpoint(flushEvent, cursorContext);
}

public void checkpoint(DatabaseFlushEvent flushEvent, CursorContext cursorContext) throws IOException {
    visitStores(store -> {
        try (var fileFlushEvent = flushEvent.beginFileFlush()) {
            store.getIdGenerator().checkpoint(fileFlushEvent, cursorContext);
        }
    });
}

SimpleTriggerInfo

Location: package org.neo4j.kernel.impl.transaction.log.checkpoint;

Simple implementation of a trigger info taking in construction the name/description of what triggered the check point and offering the possibility to be enriched with a single optional extra description.

 @Override
 public String describe(long transactionId) {
     String info = description == null ? triggerName : triggerName + " for " + description;
     return "Checkpoint triggered by \"" + info + "\" @ txId: " + transactionId;
 }
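
To see what string describe produces, here is a small standalone copy of the logic. The describe body is taken from the snippet above. Everything else is an assumption for the sake of a runnable example: the class name `DemoTriggerInfo`, the constructor shape, and the `enrich` method that sets the optional extra description.

```java
// Hypothetical standalone copy for illustration, NOT the Neo4j class itself.
class TriggerInfoSketch {
    static class DemoTriggerInfo {
        private final String triggerName;
        private String description; // optional extra description, may stay null

        DemoTriggerInfo(String triggerName) {
            this.triggerName = triggerName;
        }

        // Assumed name for the "enrich with one extra description" step.
        void enrich(String description) {
            this.description = description;
        }

        // Same logic as the describe method shown above.
        String describe(long transactionId) {
            String info = description == null ? triggerName : triggerName + " for " + description;
            return "Checkpoint triggered by \"" + info + "\" @ txId: " + transactionId;
        }
    }

    public static void main(String[] args) {
        DemoTriggerInfo info = new DemoTriggerInfo("Database shutdown");
        // prints: Checkpoint triggered by "Database shutdown" @ txId: 42
        System.out.println(info.describe(42));
    }
}
```

Since doCheckPoint logs `checkpointReason + " checkpoint started..."`, this string is the prefix of the log line, which makes it a handy search term when looking for who triggered a check point.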

Where is this class used?
Places that use SimpleTriggerInfo:

CheckpointerLifecycle

@Override
public void shutdown() throws Exception {
    // Write new checkpoint in the log only if the database is healthy.
    // We cannot throw here since we need to shutdown without exceptions.
    if (checkpointOnShutdown && databaseHealth.isHealthy()) {
        checkPointer.forceCheckPoint(new SimpleTriggerInfo("Database shutdown"));
    }
}

This is where a check point is triggered when the database shuts down.

CheckPointScheduler

This class is where the need for a check point is examined periodically. So which class sets it up to run periodically?

CheckPointScheduler is used from Database.java.

PeriodicThresholdPolicy

The {@code periodic} check point threshold policy uses the {@link GraphDatabaseSettings#check_point_interval_time} and {@link GraphDatabaseSettings#check_point_interval_tx} to decide when check points processes should be started.
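
A rough sketch of how a "time interval or transaction count" threshold could decide that a check point is due. This is not the Neo4j implementation. The class `PeriodicThresholdSketch`, its method signatures and the rule "no new transactions means no check point" are assumptions. The 15 minute interval is the period this post starts from, and the transaction count of 100000 is only an illustrative number.

```java
import java.time.Duration;

// Hypothetical sketch, NOT Neo4j code: check point when enough time has
// passed or enough transactions have been closed since the last check point.
class PeriodicThresholdSketch {
    private final long txInterval;
    private final long timeIntervalMillis;
    private long lastCheckPointedTx;
    private long lastCheckPointMillis;

    PeriodicThresholdSketch(long txInterval, Duration timeInterval, long startTx, long startMillis) {
        this.txInterval = txInterval;
        this.timeIntervalMillis = timeInterval.toMillis();
        this.lastCheckPointedTx = startTx;
        this.lastCheckPointMillis = startMillis;
    }

    boolean isCheckPointingNeeded(long lastClosedTx, long nowMillis) {
        if (lastClosedTx == lastCheckPointedTx) {
            return false; // assumption: nothing new to check point
        }
        boolean enoughTxs = lastClosedTx - lastCheckPointedTx >= txInterval;
        boolean enoughTime = nowMillis - lastCheckPointMillis >= timeIntervalMillis;
        return enoughTxs || enoughTime;
    }

    // Called after a successful check point to restart both counters.
    void checkPointHappened(long txId, long nowMillis) {
        lastCheckPointedTx = txId;
        lastCheckPointMillis = nowMillis;
    }

    public static void main(String[] args) {
        PeriodicThresholdSketch threshold =
                new PeriodicThresholdSketch(100_000, Duration.ofMinutes(15), 1, 0);
        long fifteenMinutes = Duration.ofMinutes(15).toMillis();
        System.out.println(threshold.isCheckPointingNeeded(2, 60_000));         // prints false
        System.out.println(threshold.isCheckPointingNeeded(2, fifteenMinutes)); // prints true
    }
}
```

A method like checkPointIfNeeded would consult such a threshold first and return -1 when it says no, which matches the return value documented at the top of this post.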


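Summary: Neo4j uses a two-level cache structure in memory (roughly, Entity and PageCache). Data is stored in the PageCache as records and is exchanged with Entity objects or with the disk through cursors. These three need to be kept consistent with each other. This note takes the CheckPoint, which flushes to disk on a 15 minute cycle, as the starting point for understanding how the database keeps its data in sync.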