zalando postgres-operator backup/clone

postgres 使用 wal 备份

Administrator
Unable to backup data into minio
Minio test
PG Environment Variables

描述

什么是备份啊,什么是 wal 备份啊这些通通省略。 通过 wal 备份到 s3, s3 用的是 ceph 的 rgw。

自定义配置文件

custom-env.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-env-overrides
data:
  BACKUP_SCHEDULE: "*/10 * * * *" # Every 10 minus,测试方便嘛
  BACKUP_NUM_TO_RETAIN: "14"
  USE_WALG_BACKUP: "true"
  AWS_ENDPOINT: "http://rook-ceph-rgw-my-store.rook-ceph.svc:80"
  AWS_ACCESS_KEY_ID: "8DPG4580SPX27TDG2RL7"
  AWS_SECRET_ACCESS_KEY: "TgKL0S7pgqcOo7SnF7S0h5pbT7XqCUaGMN1uKetX"
  WALG_DISABLE_S3_SSE: "true"
  AWS_S3_FORCE_PATH_STYLE: "true"
  # USE_WALG_RESTORE 和 CLONE_USE_WALG_RESTORE 从 s3 中恢复的时候需要 
  USE_WALG_RESTORE: "true"
  CLONE_USE_WALG_RESTORE: "true"
  ### 访问 PG 的用户、密码
  PGPASSWORD: u14b3MrK
  PGUSER: postgres
  # 如果没有用户和密码的话,那么可能会出现下面的情况
# envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
## 2022-03-02 10:01:44.872 - /scripts/postgres_backup.sh - I was called as: /scripts/postgres_backup.sh /home/postgres/pgdata/pgroot/data
## psql: error: could not connect to server: FATAL:  role "root" does not exist
##2022-03-02 10:01:45.220 - /scripts/postgres_backup.sh - ERROR: Recovery state unknown: 
kubectl -n default apply -f custom-env.yaml

修改 operator 的 configmap

kubectl edit configmaps postgres-operator
pod_environment_configmap: default/pod-env-overrides
wal_s3_bucket: pg-backup

note: 如果集群已经运行起来的,那可能需要 delete pg 的 operator pod, 然后 pg 的 pod 会重启。

check 一下下

进入一个 pg 的 pod 里面

# envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
2022-03-02 10:08:15.281 - /scripts/postgres_backup.sh - I was called as: /scripts/postgres_backup.sh /home/postgres/pgdata/pgroot/data
2022-03-02 10:08:15.679 - /scripts/postgres_backup.sh - Cluster is in recovery, not running backup
# envdir "/run/etc/wal-e.d/env" wal-g backup-list
INFO: 2022/03/02 10:09:12.437445 No backups found
### 先忽略输出,说明方式对了。只不过还在 recovery 嘛(说明我进入到了备库节点,可以换一个进去试试)
# psql -tXqAc "select pg_is_in_recovery()"
t

-------
# envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
2022-03-03 03:23:44.254 - /scripts/postgres_backup.sh - I was called as: /scripts/postgres_backup.sh /home/postgres/pgdata/pgroot/data
2022-03-03 03:23:44.932 - /scripts/postgres_backup.sh - producing a new backup
INFO: 2022/03/03 03:23:45.108109 Doing full backup.
INFO: 2022/03/03 03:23:45.208363 Calling pg_start_backup()
INFO: 2022/03/03 03:23:51.828511 Walking ...
INFO: 2022/03/03 03:23:51.829479 Starting part 1 ...
INFO: 2022/03/03 03:23:55.497596 Finished writing part 1.
INFO: 2022/03/03 03:23:56.387468 Starting part 2 ...
INFO: 2022/03/03 03:23:56.387568 /global/pg_control
INFO: 2022/03/03 03:23:56.394214 Finished writing part 2.
INFO: 2022/03/03 03:23:56.402528 Calling pg_stop_backup()
INFO: 2022/03/03 03:24:01.003815 Starting part 3 ...
INFO: 2022/03/03 03:24:01.030920 backup_label
INFO: 2022/03/03 03:24:01.031287 tablespace_map
INFO: 2022/03/03 03:24:01.031559 Finished writing part 3.
INFO: 2022/03/03 03:24:01.125669 Wrote backup with name base_000000100000000000000032

# envdir "/run/etc/wal-e.d/env" wal-g backup-list
name                          last_modified        wal_segment_backup_start
base_0000000F000000000000001C 2022-03-03T01:56:33Z 0000000F000000000000001C
base_0000000F000000000000001E 2022-03-03T02:00:19Z 0000000F000000000000001E
base_0000000F0000000000000020 2022-03-03T02:10:18Z 0000000F0000000000000020
base_0000000F0000000000000022 2022-03-03T02:11:49Z 0000000F0000000000000022
base_0000000F0000000000000024 2022-03-03T02:20:17Z 0000000F0000000000000024
base_0000000F0000000000000026 2022-03-03T02:30:17Z 0000000F0000000000000026
base_0000000F0000000000000028 2022-03-03T02:40:21Z 0000000F0000000000000028
base_0000000F000000000000002A 2022-03-03T02:50:21Z 0000000F000000000000002A
base_00000010000000000000002E 2022-03-03T03:10:17Z 00000010000000000000002E
base_000000100000000000000030 2022-03-03T03:20:26Z 000000100000000000000030
base_000000100000000000000032 2022-03-03T03:24:01Z 000000100000000000000032

Clone

创建 2 个用于验证 clone 的库

### 创建 2 个库试试
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(3 rows)

postgres=# create database clone_test_1;
CREATE DATABASE
postgres=# create database clone_test_2;
CREATE DATABASE
postgres=# \l
                                   List of databases
     Name     |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
--------------+----------+----------+-------------+-------------+-----------------------
 clone_test_1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 clone_test_2 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0    | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
              |          |          |             |             | postgres=CTc/postgres
 template1    | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
              |          |          |             |             | postgres=CTc/postgres
(5 rows)

手动触发一次备份

# envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
2022-03-07 08:24:27.748 - /scripts/postgres_backup.sh - I was called as: /scripts/postgres_backup.sh /home/postgres/pgdata/pgroot/data
2022-03-07 08:24:28.499 - /scripts/postgres_backup.sh - producing a new backup
INFO: 2022/03/07 08:24:28.629076 Doing full backup.
INFO: 2022/03/07 08:24:28.720848 Calling pg_start_backup()
INFO: 2022/03/07 08:24:32.295808 Walking ...
INFO: 2022/03/07 08:24:32.296847 Starting part 1 ...
INFO: 2022/03/07 08:24:36.499136 Finished writing part 1.
INFO: 2022/03/07 08:24:38.118588 Starting part 2 ...
INFO: 2022/03/07 08:24:38.118700 /global/pg_control
INFO: 2022/03/07 08:24:38.121875 Finished writing part 2.
INFO: 2022/03/07 08:24:38.127394 Calling pg_stop_backup()
INFO: 2022/03/07 08:24:41.387076 Starting part 3 ...
INFO: 2022/03/07 08:24:41.418115 backup_label
INFO: 2022/03/07 08:24:41.418362 tablespace_map
INFO: 2022/03/07 08:24:41.418678 Finished writing part 3.
INFO: 2022/03/07 08:24:41.821536 Wrote backup with name base_00000006000000000000002D

### 这就是刚才的备份了 `base_00000006000000000000002D 2022-03-07T08:24:41Z 00000006000000000000002D`
# envdir "/run/etc/wal-e.d/env" wal-g backup-list
name                          last_modified        wal_segment_backup_start
base_000000010000000000000006 2022-03-03T09:42:33Z 000000010000000000000006
base_000000020000000000000009 2022-03-03T09:52:22Z 000000020000000000000009
base_00000002000000000000000B 2022-03-03T12:00:10Z 00000002000000000000000B
base_00000002000000000000000D 2022-03-04T00:00:14Z 00000002000000000000000D
base_00000002000000000000000F 2022-03-04T12:00:08Z 00000002000000000000000F
base_000000020000000000000011 2022-03-05T00:00:21Z 000000020000000000000011
base_000000030000000000000016 2022-03-05T12:00:20Z 000000030000000000000016
base_000000030000000000000018 2022-03-06T00:00:12Z 000000030000000000000018
base_00000003000000000000001A 2022-03-06T12:00:08Z 00000003000000000000001A
base_00000004000000000000001E 2022-03-07T00:00:17Z 00000004000000000000001E
base_00000006000000000000002D 2022-03-07T08:24:41Z 00000006000000000000002D

Clone

## 定义一下新的集群
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: clone-pg
  namespace: pg-test
spec:
  numberOfInstances: 3
  patroni:
    initdb:
      data-checksums: "true"
      encoding: UTF8
      locale: en_US.UTF-8
    pg_hba:
    - hostssl all all 0.0.0.0/0 md5
    - host    all all 0.0.0.0/0 md5
  postgresql:
    version: "12"
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 500m
      memory: 1024Mi
  teamId: clone
  volume:
    size: 128Gi
    storageClass: rook-ceph-block
  ### 重点开始
  clone:
    # 可以不填写,填的话就填下面 s3_wal_path 中的 uuid, 其他的值没有测试过。
    #uid: "ece6b770-4aef-4a17-b4a4-1b808c6f6f47"
    # 需要恢复的集群的名称
    cluster: "xxx-pg"
    s3_endpoint: http://rook-ceph-rgw-my-store.rook-ceph.svc
    s3_access_key_id: 8DPG4580SPN27TDG2RL7
    s3_secret_access_key: TgKL0S7pgdcOo7SnF7S0h5pbT7XqCUaGMN1uKetX
    s3_force_path_style: true
    # 时间戳很关键, 是关于需要恢复到哪一个点的备份,这个就是上面测试手动触发的创建的备份点,注意一下时间戳的格式。
    ## https://github.com/zalando/postgres-operator/blob/master/docs/user.md#clone-from-s3
    ## Note, that a time zone is required for timestamp in the format of +00:00 which is UTC.
    timestamp: "2022-03-07T08:24:41+00:00"
    # 不清楚的话可以用 s5cmd 看看
    ## https://github.com/peak/s5cmd
    s3_wal_path: "s3://pg-backup/spilo/xxx-pg/ece6b770-4aef-4a17-b4a4-1b808c6f6f47/wal/"
$ s5cmd --endpoint-url http://10.233.13.148 ls 
2022/03/03 09:39:24  s3://pg-backup
hchen@node1:~$ s5cmd --endpoint-url http://10.233.13.148 ls s3://pg-backup/
                                  DIR spilo/
hchen@node1:~$ s5cmd --endpoint-url http://10.233.13.148 ls s3://pg-backup/spilo/
                                  DIR xxx-pg/
hchen@node1:~$ s5cmd --endpoint-url http://10.233.13.148 ls s3://pg-backup/spilo/xxx-pg/
                                  DIR ece6b770-4aef-4a17-b4a4-1b808c6f6f47/
hchen@node1:~$ s5cmd --endpoint-url http://10.233.13.148 ls s3://pg-backup/spilo/xxx-pg/ece6b770-4aef-4a17-b4a4-1b808c6f6f47/
                                  DIR wal/
hchen@node1:~$ s5cmd --endpoint-url http://10.233.13.148 ls s3://pg-backup/spilo/xxx-pg/ece6b770-4aef-4a17-b4a4-1b808c6f6f47/wal
                                  DIR wal/
hchen@node1:~$ s5cmd --endpoint-url http://10.233.13.148 ls s3://pg-backup/spilo/xxx-pg/ece6b770-4aef-4a17-b4a4-1b808c6f6f47/wal/
                                  DIR basebackups_005/
                                  DIR wal_005/

验证一下

## clone_test_1, clone_test_2 都在,说明成功了吧?
# psql 
psql (12.4 (Ubuntu 12.4-1.pgdg18.04+1))
Type "help" for help.

postgres=# \l
                                   List of databases
     Name     |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
--------------+----------+----------+-------------+-------------+-----------------------
 clone_test_1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 clone_test_2 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0    | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
              |          |          |             |             | postgres=CTc/postgres
 template1    | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
              |          |          |             |             | postgres=CTc/postgres
(5 rows)

  • 2
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值