Deleting backup_label on restore will corrupt your database!

The quick summary of this issue is that the backup_label file is an integral part of your database cluster binary backup, and removing it to allow the recovery to proceed without error is very likely to corrupt your database.  Don't do that.

Note that this post does not attempt to provide complete instructions for restoring from a binary backup.  The documentation covers all of that, and there is no benefit in duplicating it here.  The aim is to warn about a common error that can corrupt databases when people take short-cuts rather than following the steps described in the documentation.

How to Lose Data


The Proximate Cause

If you are not careful to follow the documentation's instructions for archiving, binary backup, and PITR restore, the attempt to start the restored database may fail, and you may see this in the log:

FATAL:  could not locate required checkpoint record
HINT:  If you are not restoring from a backup, try removing the file "$PGDATA/backup_label".

... where $PGDATA is the path to the data directory.  It is critically important to note that the hint says to try removing the file "If you are not restoring from a backup".  If you are restoring from a backup, removing the file will prevent recovery from knowing which WAL records need to be applied to the copy to put it into a coherent state; it will assume that it is just recovering from a crash "in place" and will be happy to apply WAL forward from the last completed checkpoint.  If that last checkpoint happened after you started the backup process, you will not replay all the WAL needed to achieve a coherent state, and you are very likely to have corruption in the restored database.  This corruption could result in anything from the database failing to start, to errors about bad pages, to silently returning incorrect results from queries when a particular index is used.  These problems may appear immediately or lie dormant for months before becoming visible.

Note that you might sometimes get lucky and not experience corruption.  That doesn't mean that deleting the file when restoring from a backup is any more safe than stepping out onto a highway without checking for oncoming traffic -- failure to get clobbered one time provides no guarantee that you will not get clobbered if you try it again.


Secondary Conditions

Now, if you had followed all the other instructions from the documentation for how to restore, making the above mistake would not corrupt your database.  It can only do so as the last step in a chain of mistakes.  Note that for restoring a backup you are supposed to make sure that the postmaster.pid file and the files in the pg_xlog subdirectory have been deleted.  Failure to do so can cause corruption if the database manages to recover in spite of the transgressions.  But if you have deleted (or excluded from backup) the files in the pg_xlog directory, deleting the backup_label file is likely to result in another failure to start, with this in the log:

LOG:  invalid primary checkpoint record
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record

What the hint from the first error above doesn't say is that if you are restoring from a backup, you should check that you don't have any files in pg_xlog from the time of the backup, you should check that you do not have a postmaster.pid file, and you should make sure you have a recovery.conf file with appropriate contents (including a restore_command entry that will copy from your archive location).
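The checks in the paragraph above can be sketched as a small pre-flight script to run before starting a restored cluster.  This is only a sketch in Python and is not part of PostgreSQL: the function name preflight_problems and its messages are hypothetical, and it assumes the directory layout used in this post (a pg_xlog subdirectory and a recovery.conf file in the data directory).  It does not replace the steps in the documentation.

```python
import os


def preflight_problems(pgdata):
    """Return a list of problems found in a restored data directory.

    Hypothetical helper: it only looks at file names and at whether
    recovery.conf mentions restore_command; it validates nothing else.
    """
    problems = []

    # backup_label is part of the backup; it must still be there.
    if not os.path.isfile(os.path.join(pgdata, "backup_label")):
        problems.append("backup_label is missing; the backup is incomplete")

    # A postmaster.pid copied from the source server must be removed.
    if os.path.exists(os.path.join(pgdata, "postmaster.pid")):
        problems.append("postmaster.pid is present; remove it")

    # WAL files from the time of the backup must not remain in pg_xlog.
    xlog_dir = os.path.join(pgdata, "pg_xlog")
    if os.path.isdir(xlog_dir):
        stale = [name for name in os.listdir(xlog_dir)
                 if os.path.isfile(os.path.join(xlog_dir, name))]
        if stale:
            problems.append("pg_xlog contains %d file(s) from the backup" % len(stale))

    # recovery.conf must exist and contain an uncommented restore_command.
    conf_path = os.path.join(pgdata, "recovery.conf")
    if not os.path.isfile(conf_path):
        problems.append("recovery.conf is missing")
    else:
        with open(conf_path) as conf:
            has_restore = any(line.strip().startswith("restore_command")
                              for line in conf)
        if not has_restore:
            problems.append("recovery.conf has no restore_command entry")

    return problems
```

An empty list means none of these particular mistakes was found; it does not prove the restore is otherwise correct.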


Why Does This Happen?


The Recovery Process

Restoring from a binary backup makes use of the same recovery process that prevents data loss on a crash of the server.  As pages for relations (tables, indexes, etc.) and other internal structures are modified, these changes are made in RAM buffers which are not written out to the OS until they have been journalled to the Write Ahead Log (WAL) files and flushed to persistent storage (e.g., disk).  Periodically there is a checkpoint, which writes all of the modified pages out to the OS and tells the OS to flush them to permanent storage.  So, if there is a crash, the recovery process can look to the last checkpoint and apply all WAL from that point forward to reach a consistent state.  WAL replay will create, extend, truncate, or remove tables as needed, modify data within files, and will tolerate the case that these changes were already flushed to the main files or have not yet made it to persistent storage.  To handle possible race conditions around the checkpoint, the system tracks the last two checkpoints, and if it can't use one of them it will go to the other.

When you run pg_start_backup() it waits for a distributed (or "paced") checkpoint in process to complete, or (if requested to do so with the "fast" parameter) forces an immediate checkpoint at maximum speed.  You can then copy the files in any order while they are being modified as long as the copy is completed before pg_stop_backup() is called.  Even though there is not consistency among the files (or even within a single file), WAL replay (if it starts from the point of the checkpoint related to the call to pg_start_backup()) will bring things to a coherent state just as it would in crash recovery.
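To see why the starting point of replay matters, here is a toy model in Python.  It is a deliberately simplified sketch, not how PostgreSQL stores pages or WAL: the two "pages" A and B, the integer LSNs, and the replay function are all invented for illustration.  Page A is copied before it is modified, a checkpoint happens in the middle of the backup, and page B is copied afterwards.

```python
def replay(pages, wal, start_lsn):
    """Apply every WAL record at or after start_lsn to a copy of pages."""
    restored = dict(pages)
    for lsn, page, value in wal:
        if lsn >= start_lsn:
            restored[page] = value
    return restored


# Timeline on the running server (toy LSNs):
#   backup starts, checkpoint taken; replay must begin at LSN 1
#   page A is copied while A == 0
#   LSN 1: A = 1
#   LSN 2: B = 1
#   a later checkpoint completes; crash recovery would begin at LSN 3
#   page B is copied while B == 1
#   LSN 3: B = 2
#   backup stops
wal = [(1, "A", 1), (2, "B", 1), (3, "B", 2)]
backup_copy = {"A": 0, "B": 1}      # inconsistent, as expected for a live copy
server_state = {"A": 1, "B": 2}     # what the restore should reproduce

# With backup_label: replay starts at the checkpoint of the backup start.
with_label = replay(backup_copy, wal, start_lsn=1)

# Without backup_label: replay starts at the last checkpoint in the copy.
without_label = replay(backup_copy, wal, start_lsn=3)

print(with_label == server_state)     # True
print(without_label == server_state)  # False: page A keeps its stale value
```

In the second case the change to page A is never replayed, which is the toy equivalent of the hidden corruption described above.  If no checkpoint had completed during the backup, both start points would coincide, which is the "getting lucky" case.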

The backup_label File

How does the recovery process know where in the WAL stream it has to start replay for it to be possible to reach a consistent state?  For crash recovery it's simple: it goes to the last checkpoint that is in the WAL based on data saved in the global/pg_control file.  For restoring a backup, the starting point in the WAL stream must be recorded somewhere for the recovery process to find and use.  That is the purpose of the backup_label file.  The presence of the file indicates to the recovery process that it is restoring from a backup, and tells it what WAL is needed to reach a consistent state.  It also contains information that may be of interest to a DBA, and is in a human-readable format; but that doesn't change the fact that it is an integral part of a backup, and the backup is not complete or guaranteed to be usable if it is removed.
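Because the file is plain text, it is easy to inspect rather than delete.  The Python sketch below reads the "KEY: value" lines and pulls out the starting WAL location.  The sample contents are made up for illustration, the exact set of lines varies between PostgreSQL versions, and parse_backup_label is a hypothetical helper, so treat this as an assumption-laden example rather than a description of the server's own parser.

```python
import re

# Illustrative contents only; values are invented.
SAMPLE_LABEL = """\
START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000060
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2015-07-01 12:00:00 CDT
LABEL: nightly
"""


def parse_backup_label(text):
    """Return the KEY: value lines of a backup_label as a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value.strip()

    # Split "0/2000028 (file 0000...)" into the location and the WAL file name.
    match = re.match(r"([0-9A-F]+/[0-9A-F]+) \(file ([0-9A-F]{24})\)",
                     fields.get("START WAL LOCATION", ""))
    if match:
        fields["start_lsn"], fields["start_wal_file"] = match.groups()
    return fields


info = parse_backup_label(SAMPLE_LABEL)
print(info["start_wal_file"])       # first WAL file recovery will ask for
print(info["CHECKPOINT LOCATION"])  # checkpoint record where replay begins
```

The WAL file named on the first line is the earliest one your restore_command must be able to supply from the archive.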

Recovery


If you delete the file and cannot prove that there were no checkpoints after pg_start_backup() was run and before the backup copy was completed, you should assume that the database has hidden corruption.  If you can restore from a backup correctly, that is likely to be the best course; if not, you should probably use pg_dump and/or pg_dumpall to get a logical dump, and restore it to a fresh cluster (i.e., use initdb to get a cluster free from corruption to restore into).

Avoidance

If you read the documentation for restoring a binary backup, and follow the steps provided, you will never see this error during a restore and will not suffer the corruption problems.

 

Reference:

http://tbeitr.blogspot.com/2015/07/deleting-backuplabel-on-restore-will.html

 

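Notes: 1. The backup_label file produced during a physical backup records the checkpoint from which recovery must start.  If you delete that file, the backup is unusable unless you can prove that no checkpoint occurred while the backup was being taken.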

2. Crash-style recovery, by contrast, reads its starting checkpoint from the global/pg_control file.

Reposted from: https://www.cnblogs.com/xiaotengyi/p/5307955.html
