PostgreSQL Source Code Study (40): Crash Recovery ② - Recovery Starting Point

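I. Determining the Recovery Starting Point

1. The default recovery starting point

Normally, crash recovery starts from the most recent checkpoint, whose location is stored in the control file (the code below also picks up that checkpoint's redo location from the same place). As we saw earlier in the checkpoint-creation functions, the control file is refreshed every time a checkpoint is created.

The following code is from the StartupXLOG function (xlog.c):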

/* Get the last valid checkpoint record.
 * Take the checkpoint and redo locations from the control file, then read
 * the checkpoint record from the WAL.
 */
        checkPointLoc = ControlFile->checkPoint;
        RedoStartLSN = ControlFile->checkPointCopy.redo;
        record = ReadCheckpointRecord(xlogreader, checkPointLoc, 1, true);

/* If the record was read, emit a debug message */
        if (record != NULL)
        {
            ereport(DEBUG1,
                    (errmsg_internal("checkpoint record is at %X/%X",
                                     LSN_FORMAT_ARGS(checkPointLoc))));
        }
        else
        {
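            /* If the record cannot be read, fail immediately. Older versions, when not in standby mode, would fall back to the previous checkpoint; newer versions dropped this for simplicity */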
            ereport(PANIC,
                    (errmsg("could not locate a valid checkpoint record")));
        }
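To make the "%X/%X" message above concrete, here is a minimal sketch of how a 64-bit WAL position can be split into two 32-bit halves for printing, and of which of the two control-file values replay would start from. The Demo*/demo_* names are hypothetical and only mirror the fields in the excerpt; the real LSN_FORMAT_ARGS macro and control-file struct in the PostgreSQL source carry more than is shown here. The sketch also assumes, as the name RedoStartLSN suggests, that replay begins at the redo location rather than at the checkpoint record itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/* Hypothetical, trimmed-down view of the two control-file fields used above. */
typedef struct DemoControlFile
{
    XLogRecPtr  checkPoint;     /* location of the last checkpoint record */
    XLogRecPtr  checkPointRedo; /* stands in for checkPointCopy.redo */
} DemoControlFile;

/*
 * Format an LSN in the "%X/%X" style: high 32 bits, a slash, low 32 bits.
 * Returns the number of characters snprintf would have written.
 */
static int
demo_format_lsn(XLogRecPtr lsn, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%" PRIX32 "/%" PRIX32,
                    (uint32_t) (lsn >> 32), (uint32_t) lsn);
}

/* Assumption: replay starts at the redo location of the last checkpoint. */
static XLogRecPtr
demo_replay_start(const DemoControlFile *cf)
{
    assert(cf != NULL);
    return cf->checkPointRedo;
}
```

With this sketch, formatting the LSN 0x116B3D2A8 produces the text 1/16B3D2A8, which is the shape of value that appears in the "checkpoint record is at" message.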

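2. Special case

As mentioned earlier, an exclusive-mode backup creates a backup_label file. If that file exists, the checkpoint information is taken from it in preference to the control file and used as the starting point for recovery.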
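As an illustration of what "getting the checkpoint from backup_label" involves, the sketch below extracts a checkpoint LSN from label text. It assumes the label contains a line of the form "CHECKPOINT LOCATION: hi/lo" in hexadecimal; that line format is not shown in this article, and demo_parse_checkpoint_loc is a hypothetical helper, not the real read_backup_label, which also returns backupEndRequired and backupFromStandby.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for PostgreSQL's XLogRecPtr. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper: look for a "CHECKPOINT LOCATION: hi/lo" entry in
 * backup_label-style text and store the LSN it names in *loc.
 * Returns false if the entry is missing or malformed.
 */
static bool
demo_parse_checkpoint_loc(const char *label_text, XLogRecPtr *loc)
{
    const char *key = "CHECKPOINT LOCATION: ";
    const char *p;
    unsigned int hi;
    unsigned int lo;

    assert(label_text != NULL && loc != NULL);

    p = strstr(label_text, key);
    if (p == NULL)
        return false;

    /* Both halves are written in hexadecimal, separated by a slash. */
    if (sscanf(p + strlen(key), "%X/%X", &hi, &lo) != 2)
        return false;

    *loc = ((XLogRecPtr) hi << 32) | lo;
    return true;
}
```

A caller would pass the whole file contents as one string; a false return corresponds to the case where no usable checkpoint location can be taken from the label.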

if (read_backup_label(&checkPointLoc, &backupEndRequired,
                          &backupFromStandby))
    {
        List       *tablespaces = NIL;

        /*
         * Archive recovery was requested, and thanks to the backup label file, we know how far we need to replay to reach consistency. Enter archive recovery directly.
         */
        InArchiveRecovery = true;
        if (StandbyModeRequested)
            StandbyMode = true;

        /*
         * When a backup_label file is present, we want to roll forward from
         * the checkpoint it identifies, rather than using pg_control.
         */
        record = ReadCheckpointRecord(xlogreader, checkPointLoc, 0, true);

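/* Although backup_label records the checkpoint location, the WAL segment that contains it may already have been removed (especially when the backup took a long time), so the record may not be readable from the WAL */

/* If the checkpoint record was read */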
        if (record != NULL)
        {
            memcpy(&checkPoint, XLogRecGetData(xlogreader), sizeof(CheckPoint));
            wasShutdown = ((record->xl_info & ~XLR_INFO_MASK) == XLOG_CHECKPOINT_SHUTDOWN);
            ereport(DEBUG1,
                    (errmsg_internal("checkpoint record is at %X/%X",
                                     LSN_FORMAT_ARGS(checkPointLoc))));
            InRecovery = true;  /* force recovery even if SHUTDOWNED */

            /*
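             * Make sure that REDO location exists. This may not be the case
             * if there was a crash during an online backup, which left a
             * backup_label around that references a WAL segment that's
             * already been archived. Even when the checkpoint record can be
             * read, the redo record (slightly earlier than the checkpoint)
             * may still be unreadable.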
             */
            if (checkPoint.redo < checkPointLoc)
            {
                XLogBeginRead(xlogreader, checkPoint.redo);
                if (!ReadRecord(xlogreader, LOG, false))
                    ereport(FATAL,
                            (errmsg("could not find redo location referenced by checkpoint record"),
                             errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery options.\n"
                                     "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n"
                                     "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.",
                                     DataDir, DataDir, DataDir)));
            }
        }
/* If the checkpoint record could not be read */
        else
        {
            ereport(FATAL,
                    (errmsg("could not locate required checkpoint record"),
                     errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add required recovery options.\n"
                             "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n"
                             "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.",
                             DataDir, DataDir, DataDir)));
            wasShutdown = false;    /* keep compiler quiet */
        }

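If a backup_label file is present but the checkpoint record or the redo location cannot be read, the database fails to start.

According to the error hints:

  • If you are restoring from a backup, create a recovery.signal file and add the required recovery options.
  • If you are not restoring from a backup, remove the backup_label file.
  • Be careful: once backup_label is removed, the corresponding exclusive backup can no longer be used for recovery (the hint warns that removing it while restoring from a backup results in a corrupt cluster), so the backup has effectively failed.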

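3. Important variables

Before moving on, let's take a closer look at a few variables that appear frequently in the code that follows:

  • InRecovery: when true, it should be read as "this process is replaying WAL records", not "the system is in recovery mode"; the latter should be checked with RecoveryInProgress().
  • ArchiveRecoveryRequested: archive recovery was requested (signal files were present).
  • InArchiveRecovery: when true, recovery is currently using archived WAL (typically PITR or a standby); when false, recovery is using only the WAL files in the pg_wal directory (typically the crash recovery phase).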
/*
 * Are we doing recovery from XLOG?
 *
 * This is only ever true in the startup process; it should be read as meaning
 * "this process is replaying WAL records", rather than "the system is in
 * recovery mode".  It should be examined primarily by functions that need
 * to act differently when called from a WAL redo function (e.g., to skip WAL
 * logging).  To check whether the system is in recovery regardless of which
 * process you're running in, use RecoveryInProgress() but only after shared
 * memory startup and lock initialization.
 */
bool        InRecovery = false;
/*
 * When ArchiveRecoveryRequested is set, archive recovery was requested,
 * ie. signal files were present. When InArchiveRecovery is set, we are
 * currently recovering using offline XLOG archives. These variables are only
 * valid in the startup process.
 *
 * When ArchiveRecoveryRequested is true, but InArchiveRecovery is false, we're
 * currently performing crash recovery using only XLOG files in pg_wal, but
 * will switch to using offline XLOG archives as soon as we reach the end of
 * WAL in pg_wal.
*/
bool        ArchiveRecoveryRequested = false;
bool        InArchiveRecovery = false;
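The relationship between the two archive-related flags, as described in the source comment above, can be written down as a small classification function. The enum and function names below are hypothetical and exist only for this sketch; the startup process itself just tests the two booleans directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the combinations described in the comment. */
typedef enum DemoRecoveryPhase
{
    DEMO_PLAIN_CRASH_RECOVERY,  /* neither flag set: only WAL in pg_wal */
    DEMO_CRASH_THEN_ARCHIVE,    /* requested, but still replaying pg_wal */
    DEMO_ARCHIVE_RECOVERY       /* currently using archived WAL */
} DemoRecoveryPhase;

/* Classify the current phase from the two flags. */
static DemoRecoveryPhase
demo_recovery_phase(bool archive_recovery_requested, bool in_archive_recovery)
{
    if (in_archive_recovery)
        return DEMO_ARCHIVE_RECOVERY;
    if (archive_recovery_requested)
        return DEMO_CRASH_THEN_ARCHIVE;
    return DEMO_PLAIN_CRASH_RECOVERY;
}

/* Human-readable name, for logging in this sketch only. */
static const char *
demo_phase_name(DemoRecoveryPhase phase)
{
    switch (phase)
    {
        case DEMO_PLAIN_CRASH_RECOVERY:
            return "crash recovery";
        case DEMO_CRASH_THEN_ARCHIVE:
            return "crash recovery, archive recovery to follow";
        case DEMO_ARCHIVE_RECOVERY:
            return "archive recovery";
    }
    assert(!"unknown phase");
    return NULL;
}
```

The middle case is the one the comment singles out: archive recovery was requested, but the process is still working through pg_wal and will switch to the archive once it reaches the end of the local WAL.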

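II. Entering Recovery Mode

WAL replay proper begins in the if (InRecovery) block. It first updates the control file to record that the system has entered recovery, and also saves the checkpoint information that was just read into the control file.

/* REDO starts here */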
    if (InRecovery)
    {
        int         rmid;

        /*
         * Update pg_control to show that we are recovering and to show the
         * selected checkpoint as the place we are starting from. We also mark
         * pg_control with any minimum recovery stop point obtained from a
         * backup history file.
         */

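        /* Save the state currently recorded in the control file, then update it */
        dbstate_at_startup = ControlFile->state;
        /* If we are recovering from archived WAL (PITR or a standby), set the state accordingly */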
        if (InArchiveRecovery)
        {
            ControlFile->state = DB_IN_ARCHIVE_RECOVERY;

            SpinLockAcquire(&XLogCtl->info_lck);
            XLogCtl->SharedRecoveryState = RECOVERY_STATE_ARCHIVE;
            SpinLockRelease(&XLogCtl->info_lck);
        }
        /* Otherwise we are recovering from the WAL files in pg_wal (crash recovery) */
        else
        {
            ereport(LOG,
                    (errmsg("database system was not properly shut down; "
                            "automatic recovery in progress")));

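            /* If a target timeline was specified and it is newer than the current timeline recorded in the control file, log that fact (nothing is updated here) */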
            if (recoveryTargetTLI > ControlFile->checkPointCopy.ThisTimeLineID)
                ereport(LOG,
                        (errmsg("crash recovery starts in timeline %u "
                                "and has target timeline %u",
                                ControlFile->checkPointCopy.ThisTimeLineID,
                                recoveryTargetTLI)));
            /* Change the control file state to show that crash recovery is in progress */
            ControlFile->state = DB_IN_CRASH_RECOVERY;

            SpinLockAcquire(&XLogCtl->info_lck);
            XLogCtl->SharedRecoveryState = RECOVERY_STATE_CRASH;
            SpinLockRelease(&XLogCtl->info_lck);
        }

        /* Update the checkpoint information in the control file */
        ControlFile->checkPoint = checkPointLoc;
        ControlFile->checkPointCopy = checkPoint;
        if (InArchiveRecovery)
        {
            /* initialize minRecoveryPoint if not set yet; the minimum recovery point must not be earlier than the redo point */
            if (ControlFile->minRecoveryPoint < checkPoint.redo)
            {
                ControlFile->minRecoveryPoint = checkPoint.redo;
                ControlFile->minRecoveryPointTLI = checkPoint.ThisTimeLineID;
            }
        }

        /* If a backup_label file was found */
        if (haveBackupLabel)
        {
            ControlFile->backupStartPoint = checkPoint.redo;
            ControlFile->backupEndRequired = backupEndRequired;

            /* If the backup was taken from a standby */
            if (backupFromStandby)
            {
                /* The state at startup must be one of the following two; otherwise report an error */
                if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY &&
                    dbstate_at_startup != DB_SHUTDOWNED_IN_RECOVERY)
                    ereport(FATAL,
                            (errmsg("backup_label contains data inconsistent with control file"),
                             errhint("This means that the backup is corrupted and you will "
                                     "have to use another backup for recovery.")));
                /* If it is, set the backup end point to the current minimum recovery point */
                ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
            }
        }
        /* Update the timestamp */
        ControlFile->time = (pg_time_t) time(NULL);
        /* No need to hold ControlFileLock yet, we aren't up far enough */
        UpdateControlFile();
…
