续“hgdb HAC集群中,计划重新初始化数据库使用原集群配置的操作方法(一)”
1、停止所有节点的hghac服务并删除所有节点的data目录
[root@db ~]# systemctl stop hghac-vip
[root@db hgdbdata]# rm -rf data/
[root@db2 ~]# systemctl stop hghac-vip
[root@db2 hgdbdata]# rm -rf data/
[root@db3 ~]# systemctl stop hghac-vip
[root@db3 hgdbdata]# rm -rf data/
2、在节点1 initdb,不在sm4选项
[root@db hgdbdata]# initdb -c "echo 12345678" -D /db/hgdbdata/data
initdb: error: encryption cipher must be specified when encryption passphrase command is specified
[root@db hgdbdata]# initdb -D /db/hgdbdata/data
The files belonging to this database system will be owned by user "root".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
Data encryption is disabled.
Enter new sysdba password:
Enter it again:
Enter new syssao password:
Enter it again:
Enter new syssso password:
Enter it again:
creating directory /db/hgdbdata/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... PRC
creating configuration files ... ok
running bootstrap script ... 2022-04-02 14:05:55.364 CST [15293] LOG: Switchover the SSHA Role. Current is NONE
ok
performing post-bootstrap initialization ... 2022-04-02 14:05:55.663 CST [15295] LOG: Switchover the SSHA Role. Current is NONE
ok
syncing data to disk ... ok
initdb: warning: enabling "sm3" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
hg_sslkeygen.sh /db/hgdbdata/data
pg_ctl -D /db/hgdbdata/data -l logfile start
[root@db hgdbdata]# hg_sslkeygen.sh /db/hgdbdata/data
Setting the ssl configuration to /db/hgdbdata/data
Generating RSA private key, 2048 bit long modulus
........................................................................................+++
.....+++
e is 65537 (0x10001)
[root@db hgdbdata]#
[root@db hgdbdata]# cd data/
[root@db data]# ls
base pg_commit_ts pg_ident.conf pg_notify pg_snapshots pg_subtrans PG_VERSION postgresql.auto.conf secure_param.conf
global pg_dynshmem pg_logical pg_replslot pg_stat pg_tblspc pg_wal postgresql.conf server.crt
hgaudit pg_hba.conf pg_multixact pg_serial pg_stat_tmp pg_twophase pg_xact root.crt server.key
3、此时在节点1启动hghac服务,显示如下
[root@db data]# systemctl start hghac-vip
[root@db data]# hghactl list
+ Cluster: ha (7076286699020760566) -----+---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Replica | stopped | | unknown |
+--------+---------------------+---------+---------+----+-----------+
此时启动节点2和节点3的hghac服务,状态显示如下:
[root@db2 hgdbdata]# systemctl start hghac-vip
[root@db3 hgdbdata]# systemctl start hghac-vip
[root@db data]# hghactl list
+ Cluster: ha (7076286699020760566) -----+---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghacb | 192.168.80.112:5866 | Replica | stopped | | unknown |
| hghacc | 192.168.80.113:5866 | Replica | stopped | | unknown |
+--------+---------------------+---------+---------+----+-----------+
此时节点2和节点3并没有同步数据(未生成data目录):
[root@db2 hgdbdata]# pwd
/db/hgdbdata
[root@db2 hgdbdata]# ls
audit_archive hgdb_log hghalog root.crt server.crt server.key
[root@db3 hgdbdata]# pwd
/db/hgdbdata
[root@db3 hgdbdata]# ls
audit_archive hgdb_log hghalog root.crt server.crt server.key
4、此时在节点1查看hghac服务状态为failed
[root@db data]# systemctl status hghac-vip
● hghac-vip.service - hghac
Loaded: loaded (/usr/lib/systemd/system/hghac-vip.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Sat 2022-04-02 14:07:45 CST; 2min 4s ago
Process: 15622 ExecStart=/opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml (code=exited, status=1/FAILURE)
Main PID: 15622 (code=exited, status=1/FAILURE)
Apr 02 14:07:42 db systemd[1]: Started hghac.
Apr 02 14:07:45 db systemd[1]: hghac-vip.service: main process exited, code=exited, status=1/FAILURE
Apr 02 14:07:45 db systemd[1]: Unit hghac-vip.service entered failed state.
Apr 02 14:07:45 db systemd[1]: hghac-vip.service failed.
尝试手动启动该服务后,立即停止:
[root@db data]# systemctl start hghac-vip
[root@db data]# systemctl status hghac-vip
● hghac-vip.service - hghac
Loaded: loaded (/usr/lib/systemd/system/hghac-vip.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2022-04-02 14:09:58 CST; 1s ago
Main PID: 16173 (hghac)
Tasks: 3
CGroup: /system.slice/hghac-vip.service
├─16173 /opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml
└─16178 /opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml
Apr 02 14:09:58 db systemd[1]: Started hghac.
[root@db data]# systemctl status hghac-vip
● hghac-vip.service - hghac
Loaded: loaded (/usr/lib/systemd/system/hghac-vip.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Sat 2022-04-02 14:10:00 CST; 1s ago
Process: 16173 ExecStart=/opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml (code=exited, status=1/FAILURE)
Main PID: 16173 (code=exited, status=1/FAILURE)
Apr 02 14:09:58 db systemd[1]: Started hghac.
Apr 02 14:10:00 db systemd[1]: hghac-vip.service: main process exited, code=exited, status=1/FAILURE
Apr 02 14:10:00 db systemd[1]: Unit hghac-vip.service entered failed state.
Apr 02 14:10:00 db systemd[1]: hghac-vip.service failed.
查看hghac日志文件显示如下:提示system ID mismatch
[root@db hghalog]# tail -f patroni.log
2022-04-02 14:03:32,222 INFO: Lock owner: hghaca; I am hghaca
2022-04-02 14:03:32,250 INFO: no action. i am the leader with the lock
2022-04-02 14:03:42,222 INFO: Lock owner: hghaca; I am hghaca
2022-04-02 14:03:42,245 INFO: no action. i am the leader with the lock
2022-04-02 14:07:44,469 INFO: Selected new etcd server http://192.168.80.112:2379
2022-04-02 14:07:44,490 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-04-02 14:07:44,586 CRITICAL: system ID mismatch, node hghaca belongs to a different cluster: 7076286699020760566 != 7081883765097335741
2022-04-02 14:10:00,103 INFO: Selected new etcd server http://192.168.80.111:2379
2022-04-02 14:10:00,123 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-04-02 14:10:00,175 CRITICAL: system ID mismatch, node hghaca belongs to a different cluster: 7076286699020760566 != 7081883765097335741
5、查看重新初始化的数据库的identifier为7081883765097335741,与hghactl list中显示的7076286699020760566不一致
[root@db data]# pg_controldata -D /db/hgdbdata/data/
pg_control version number: 1201
Catalog version number: 201909212
Database system identifier: 7081883765097335741
Database cluster state: shut down
pg_control last modified: Sat 02 Apr 2022 02:05:58 PM CST
Latest checkpoint location: 0/19F9D28
Latest checkpoint's REDO location: 0/19F9D28
Latest checkpoint's REDO WAL file: 000000010000000000000001
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:519
Latest checkpoint's NextOID: 15290
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 512
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Sat 02 Apr 2022 02:05:58 PM CST
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: off
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1988
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Data encryption: off
Mock authentication nonce: 1d8c81e2ab5469dac2b162f0ac4f03f88ffff41f73f7982233fd54d6b707f5f0
Data encryption cipher: off
6、停止所有节点的hghac服务
[root@db data]# systemctl stop hghac-vip
[root@db2 data]# systemctl stop hghac-vip
[root@db3 data]# systemctl stop hghac-vip
执行如下命令:(其中ha为hac集群名称,依据实际填写)
[root@db data]# hghactl remove ha
+ Cluster: ha (7076286699020760566) -----+---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghacb | 192.168.80.112:5866 | Replica | stopped | | unknown |
| hghacc | 192.168.80.113:5866 | Replica | stopped | | unknown |
+--------+---------------------+---------+---------+----+-----------+
Please confirm the cluster name to remove: ha
You are about to remove all information in DCS for ha, please type: "Yes I am aware": Yes I am aware
[root@db data]# hghactl list
+ Cluster: ha (uninitialized) -+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+------+------+-------+----+-----------+
+--------+------+------+-------+----+-----------+
7、此时再启动所有节点的hghac服务,状态显示正常
[root@db data]# systemctl start hghac-vip
[root@db data]# hghactl list
+ Cluster: ha (7081883765097335741) ----+---------+----+-----------+-----------------+
| Member | Host | Role | State | TL | Lag in MB | Pending restart |
+--------+---------------------+--------+---------+----+-----------+-----------------+
| hghaca | 192.168.80.111:5866 | Leader | running | 1 | | * |
+--------+---------------------+--------+---------+----+-----------+-----------------+
[root@db2 hgdbdata]# systemctl start hghac-vip
[root@db3 hgdbdata]# systemctl start hghac-vip
[root@db data]# hghactl list
+ Cluster: ha (7081883765097335741) -----+---------+----+-----------+-----------------+
| Member | Host | Role | State | TL | Lag in MB | Pending restart |
+--------+---------------------+---------+---------+----+-----------+-----------------+
| hghaca | 192.168.80.111:5866 | Leader | running | 2 | | * |
| hghacb | 192.168.80.112:5866 | Replica | running | 2 | 0 | * |
| hghacc | 192.168.80.113:5866 | Replica | running | 2 | 0 | * |
+--------+---------------------+---------+---------+----+-----------+-----------------+