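Original work by ChenxiRoc. Please credit the source when reposting.
Cluster Management: Node Rejoin
A common scenario for node rejoin: the primary goes down, a standby takes over, and the old primary then needs to rejoin the cluster as a standby.
For example:
- The database on node101 went down, and the primary role switched to node102.
- node101 now needs to be rejoined to the cluster.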
**[kingbase@nn01 archive]$ repmgr node rejoin -h 192.168.237.102 -U esrep -d esrep**
WARNING: the item in /home/kingbase/.kbpass is not end in the right way
INFO: local node 1 can attach to rejoin target node 2
DETAIL: local node's recovery point: 0/4B000028; rejoin target node's fork point: 0/4B0000A0
NOTICE: setting node 1's upstream to node 2
WARNING: unable to ping "host=192.168.237.101 user=esrep dbname=esrep port=54321 connect_timeout=3 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: begin to start server at 2020-10-02 15:02:27.685836
NOTICE: starting server using "/home/kingbase/cluster/PROJ01/DBCL/kingbase/bin/sys_ctl -w -t 90 -D '/home/kingbase/cluster/PROJ01/DBCL/kingbase/data' -l /home/kingbase/cluster/PROJ01/DBCL/kingbase/bin/logfile start"
NOTICE: start server finish at 2020-10-02 15:02:27.792657
NOTICE: replication slot "repmgr_slot_2" deleted on node 1
WARNING: 1 inactive replication slots detected
DETAIL: inactive replication slots:
repmgr_slot_3 (physical)
HINT: these replication slots may need to be removed manually
NOTICE: NODE REJOIN successful
DETAIL: node 1 is now attached to node 2
- Verify the cluster status
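One way to do this check is with `repmgr cluster show`, which lists each registered node with its role and upstream. The sketch below wraps that command in a small guard. The helper name `verify_cluster` is hypothetical, and the sketch assumes repmgr can find its configuration file on its own when run as the kingbase user.

```shell
#!/bin/sh
# Sketch: show cluster status after the rejoin.
# verify_cluster is a hypothetical helper name used only for illustration.

verify_cluster() {
    # $1 = repmgr executable to use (defaults to "repmgr" found on PATH)
    bin="${1:-repmgr}"
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "cannot find $bin; run this as the kingbase user on a cluster node" >&2
        return 1
    fi
    # Lists every registered node with its role, status and upstream
    "$bin" cluster show
}

# Usage (on any cluster node):
#   verify_cluster
```

After the rejoin shown above, node 1 would be expected to appear as a standby whose upstream is node 2, matching the final DETAIL line of the log.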
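Note: In the example above, node101 was not started between the crash and the rejoin, so its data had not "diverged" from the primary's. If node101 had been started after the crash, or had even had its data modified, the rejoin must be run with the "--force-rewind" option.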
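For the diverged case, a minimal sketch of the invocation might look like the following. It reuses the connection parameters from the example above and assumes this repmgr build accepts the upstream repmgr options `--force-rewind` and `--dry-run`; that has not been verified for the Kingbase build. The helper name `rejoin_cmd` and the variable names are hypothetical. The script only prints the commands so they can be reviewed before being run on node101.

```shell
#!/bin/sh
# Sketch: build the rejoin command for a node whose data may have diverged.
# rejoin_cmd and the variables below are hypothetical names for illustration.

# Connection parameters of the current primary (node102), from the example above
PRIMARY_HOST=192.168.237.102
REPL_USER=esrep
REPL_DB=esrep

rejoin_cmd() {
    # $1 = "yes" to append --dry-run (assumed to check prerequisites only)
    cmd="repmgr node rejoin -h $PRIMARY_HOST -U $REPL_USER -d $REPL_DB --force-rewind"
    if [ "$1" = "yes" ]; then
        cmd="$cmd --dry-run"
    fi
    printf '%s\n' "$cmd"
}

# Print the check-only command first, then the real one; nothing is executed here.
rejoin_cmd yes
rejoin_cmd no
```

Running the check-only form first is a cautious choice: a rewind overwrites changes made on the old primary after the fork point, so it is worth confirming the prerequisites before committing to it.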