数据库的PITR是一般数据库都必须满足的技术。其原理是依据之前的物理备份文件加上wal的预写日志模式备份做的恢复。
直接拿实验说话:
环境 CentOS 7 + PG 10.8
0. 初始化数据库
initdb -D /usr/local/pgsql/data/
......
Success. You can now start the database server using:
pg_ctl -D /usr/local/pgsql/data/ -l logfile start
service postgresql_5432 start
psql -U postgres
postgres=# create user dbadmin superuser ;
CREATE ROLE
postgres=# create user mytest superuser ;
CREATE ROLE
postgres=# create database mytest owner mytest;
CREATE DATABASE
1. 配置参数
vim postgresql.conf
archive_mode = on
archive_command = 'cp %p /home/postgres/pg_wal/%f'
archive_timeout = 3600
wal_level = replica
service postgresql_5432 restart
2. 备份
su - postgres
mkdir /home/postgres/backup/
pg_basebackup -Ft -X -D /home/postgres/backup/
3. 测试数据
mytest=# create table test (id serial primary key, col varchar(1000), ctime timestamp without time zone default now());
CREATE TABLE
insert into test (col) values ('a');
insert into test (col) values ('b');
insert into test (col) values ('c');
mytest=# select * from test;
id | col | ctime
----+-----+----------------------------
1 | a | 2019-06-18 03:57:12.767601
2 | b | 2019-06-18 03:57:12.772068
3 | c | 2019-06-18 03:57:13.572814
4 | a | 2019-06-18 03:57:14.982699
5 | b | 2019-06-18 03:57:14.985398
6 | c | 2019-06-18 03:57:16.181587
7 | a | 2019-06-18 03:57:20.958266
8 | b | 2019-06-18 03:57:20.960301
9 | c | 2019-06-18 03:57:21.653256
(9 rows)
mytest=# select pg_switch_wal();
pg_switch_wal
---------------
0/30251D0
(1 row)
4. 删除数据库
模拟数据库宕机或修改错误数据
service postgresql_5432 stop
rm -rf /usr/local/pgsql/data/*
5. 恢复
5.1 解压基础备份文件
找到备份文件
cd /home/postgres/backup/
tar xvf base.tar -C /usr/local/pgsql/data/
tar xvf pg_wal.tar -C /usr/local/pgsql/data/
5.2 配置恢复文件 recovery.conf
vim recovery.conf
restore_command = 'cp /home/postgres/pg_wal/%f %p'
recovery_target_time = '2019-06-18 03:57:13.572814'
chown -R postgres:postgres recovery.conf
chmod 700 /var/lib/pgsql/10/data
chmod 700 /usr/local/pgsql/data
5.3 启动
service postgresql_5432 start
6. 查看
mytest=# select * from test;
id | col | ctime
----+-----+----------------------------
1 | a | 2019-06-18 03:57:12.767601
2 | b | 2019-06-18 03:57:12.772068
(2 rows)
7. 向前推进
7.1
service postgresql_5432 stop
vim recovery.conf
restore_command = 'cp /home/postgres/pg_wal/%f %p'
recovery_target_time = '2019-06-18 03:57:14'
观察日志
2019-06-18 04:10:29.787 EDT [24681] LOG: database system was shut down in recovery at 2019-06-18 04:09:37 EDT
2019-06-18 04:10:29.788 EDT [24681] LOG: starting point-in-time recovery to 2019-06-18 03:57:14-04
2019-06-18 04:10:29.841 EDT [24681] LOG: restored log file "000000010000000000000002" from archive
2019-06-18 04:10:29.895 EDT [24681] LOG: redo starts at 0/2000060
2019-06-18 04:10:29.943 EDT [24681] LOG: restored log file "000000010000000000000003" from archive
2019-06-18 04:10:29.996 EDT [24681] LOG: consistent recovery state reached at 0/3024D00
2019-06-18 04:10:29.996 EDT [24681] LOG: recovery stopping before commit of transaction 568, time 2019-06-18 03:57:14.983236-04
2019-06-18 04:10:29.996 EDT [24681] LOG: recovery has paused
2019-06-18 04:10:29.996 EDT [24681] HINT: Execute pg_wal_replay_resume() to continue.
2019-06-18 04:10:29.998 EDT [24679] LOG: database system is ready to accept read only connections
7.2
mytest=# select * from test;
id | col | ctime
----+-----+----------------------------
1 | a | 2019-06-18 03:57:12.767601
2 | b | 2019-06-18 03:57:12.772068
3 | c | 2019-06-18 03:57:13.572814
(3 rows)
7.3 继续推进
service postgresql_5432 stop
vim recovery.conf
restore_command = 'cp /home/postgres/pg_wal/%f %p'
recovery_target_time = '2019-06-18 03:57:17'
service postgresql_5432 start
mytest=# select * from test;
id | col | ctime
----+-----+----------------------------
1 | a | 2019-06-18 03:57:12.767601
2 | b | 2019-06-18 03:57:12.772068
3 | c | 2019-06-18 03:57:13.572814
4 | a | 2019-06-18 03:57:14.982699
5 | b | 2019-06-18 03:57:14.985398
6 | c | 2019-06-18 03:57:16.181587
(6 rows)
log
2019-06-18 04:13:27.951 EDT [25020] LOG: database system was shut down in recovery at 2019-06-18 04:13:16 EDT
2019-06-18 04:13:27.952 EDT [25020] LOG: starting point-in-time recovery to 2019-06-18 03:57:17-04
2019-06-18 04:13:27.997 EDT [25020] LOG: restored log file "000000010000000000000002" from archive
2019-06-18 04:13:28.054 EDT [25020] LOG: redo starts at 0/2000060
2019-06-18 04:13:28.103 EDT [25020] LOG: restored log file "000000010000000000000003" from archive
2019-06-18 04:13:28.157 EDT [25020] LOG: consistent recovery state reached at 0/3024DB0
2019-06-18 04:13:28.158 EDT [25020] LOG: recovery stopping before commit of transaction 571, time 2019-06-18 03:57:20.958823-04
2019-06-18 04:13:28.158 EDT [25020] LOG: recovery has paused
2019-06-18 04:13:28.158 EDT [25020] HINT: Execute pg_wal_replay_resume() to continue.
2019-06-18 04:13:28.159 EDT [25018] LOG: database system is ready to accept read only connections
8. 开启读写模式
如果我们确定恢复完成之后,可以打开读写模式了。只需要执行 pg_wal_replay_resume()
mytest=# create table test0618 (id int);
ERROR: cannot execute CREATE TABLE in a read-only transaction
mytest=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
mytest=# select * from pg_wal_replay_resume();
pg_wal_replay_resume
----------------------
(1 row)
mytest=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
mytest=# create table test0618 (id int);
CREATE TABLE