One cause of every node showing Replica, with no Leader, after an hgdb HAC cluster restart

1. After the HighGo HAC cluster was restarted, every node showed Replica and no primary node could be brought up
[root@db ~]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
+ Cluster: ha (7072987311974756506) -----+---------+----+-----------+
| Member | Host                | Role    | State   | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Replica | running |    |   unknown |
| hghacb | 192.168.80.112:5866 | Replica | running |    |   unknown |
| hghacc | 192.168.80.113:5866 | Replica | running |    |   unknown |
+--------+---------------------+---------+---------+----+-----------+
2. After stopping the HAC service on all nodes and restarting only node 1 (the original primary), its Role still showed Replica
[root@db ~]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
+ Cluster: ha (7072987311974756506) -----+---------+----+-----------+
| Member | Host                | Role    | State   | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Replica | running |    |   unknown |
| hghacb | 192.168.80.112:5866 | Replica | stopped |    |   unknown |
| hghacc | 192.168.80.113:5866 | Replica | stopped |    |   unknown |
+--------+---------------------+---------+---------+----+-----------+
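3. Stopped the HAC service on all nodes, manually deleted the standby.signal file on node 1, and started the HAC service on node 1; the state was still abnormal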
[root@db ~]# systemctl stop hghac-vip
[root@db ~]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
+ Cluster: ha (7072987311974756506) -----+---------+----+-----------+
| Member | Host                | Role    | State   | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Replica | stopped |    |   unknown |
| hghacb | 192.168.80.112:5866 | Replica | stopped |    |   unknown |
| hghacc | 192.168.80.113:5866 | Replica | stopped |    |   unknown |
+--------+---------------------+---------+---------+----+-----------+
[root@db ~]# cd /db/hgdbdata/data/
[root@db data]# ls
audit_param.conf  global                pg_commit_ts        pg_ident.conf         pg_notify     pg_stat      pg_twophase  postgresql.auto.conf         postgresql.conf.backup
backup_label.old  hgaudit               pg_dynshmem         pg_ident.conf.backup  pg_replslot   pg_stat_tmp  PG_VERSION   postgresql.base.conf         postmaster.opts
base              hgdb.lic              pg_hba.conf         pg_logical            pg_serial     pg_subtrans  pg_wal       postgresql.base.conf.backup  secure_param.conf
current_logfiles  patroni.dynamic.json  pg_hba.conf.backup  pg_multixact          pg_snapshots  pg_tblspc    pg_xact      postgresql.conf              standby.signal
[root@db data]# rm -rf standby.signal 
[root@db data]# systemctl start hghac-vip
[root@db data]# systemctl status hghac-vip
● hghac-vip.service - hghac
   Loaded: loaded (/etc/systemd/system/hghac-vip.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2022-03-18 10:22:55 CST; 4s ago
 Main PID: 21221 (hghac)
    Tasks: 14
   CGroup: /system.slice/hghac-vip.service
           ├─21221 /opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml
           ├─21227 /opt/HighGo/tools/hghac/hghac /opt/HighGo/tools/hghac/hghac.yaml
           ├─21241 /opt/HighGo4.5.7-see/bin/postgres -D /db/hgdbdata/data --config-file=/db/hgdbdata/data/postgresql.conf --listen_addresses=0.0.0.0 --port=5866 --cluster_name=ha --wal_l...
           ├─21243 postgres: ha: logger   
           ├─21244 postgres: ha: auditwriter   
           ├─21245 postgres: ha: startup   recovering 00000019000000000000002D
           ├─21247 postgres: ha: checkpointer   
           ├─21248 postgres: ha: background writer   
           ├─21249 postgres: ha: stats collector   
           └─21250 postgres: ha: audit archiver or cleanup   

Mar 18 10:22:55 db systemd[1]: Started hghac.
Mar 18 10:22:57 db hghac[21221]: 2022-03-18 10:22:57 CST [21241]: [1-1] 6233ed01.52f9 0     LOG:  Password detection module is disabled
Mar 18 10:22:57 db hghac[21221]: 2022-03-18 10:22:57 CST [21241]: [2-1] 6233ed01.52f9 0     LOG:  starting HighGo Security Enterprise Edition Database System 4.5.7 on CentOS...d on 20210804
Mar 18 10:22:57 db hghac[21221]: 2022-03-18 10:22:57 CST [21241]: [3-1] 6233ed01.52f9 0     LOG:  listening on IPv4 address "0.0.0.0", port 5866
Mar 18 10:22:57 db hghac[21221]: 2022-03-18 10:22:57 CST [21241]: [4-1] 6233ed01.52f9 0     LOG:  listening on Unix socket "/tmp/.s.PGSQL.5866"
Mar 18 10:22:57 db hghac[21221]: 2022-03-18 10:22:57 CST [21241]: [5-1] 6233ed01.52f9 0     LOG:  redirecting log output to logging collector process
Mar 18 10:22:57 db hghac[21221]: 2022-03-18 10:22:57 CST [21241]: [6-1] 6233ed01.52f9 0     HINT:  Future log output will appear in directory "../hgdb_log".
Mar 18 10:22:57 db hghac[21221]: localhost:5866 - accepting connections
Mar 18 10:22:57 db hghac[21221]: localhost:5866 - accepting connections
Mar 18 10:22:58 db hghac[21221]: localhost:5866 - accepting connections
Hint: Some lines were ellipsized, use -l to show in full.
[root@db data]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
+ Cluster: ha (7072987311974756506) -----+---------+----+-----------+
| Member | Host                | Role    | State   | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Replica | running |    |   unknown |
+--------+---------------------+---------+---------+----+-----------+
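4. Stopped the HAC service on all nodes, manually deleted the standby.signal file on node 1, and started the database service on node 1 directly with pg_ctl. The database started normally and no standby.signal file was regenerated, which indicates the database itself runs in normal read-write mode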
[root@db data]# systemctl stop hghac-vip
[root@db data]# pwd
/db/hgdbdata/data
[root@db data]# ls
audit_param.conf  global                pg_commit_ts        pg_ident.conf         pg_notify     pg_stat      pg_twophase  postgresql.auto.conf         postgresql.conf.backup
backup_label.old  hgaudit               pg_dynshmem         pg_ident.conf.backup  pg_replslot   pg_stat_tmp  PG_VERSION   postgresql.base.conf         postmaster.opts
base              hgdb.lic              pg_hba.conf         pg_logical            pg_serial     pg_subtrans  pg_wal       postgresql.base.conf.backup  secure_param.conf
current_logfiles  patroni.dynamic.json  pg_hba.conf.backup  pg_multixact          pg_snapshots  pg_tblspc    pg_xact      postgresql.conf              standby.signal
[root@db data]# rm -rf standby.signal 
[root@db data]# pg_ctl start -D /db/hgdbdata/data/
waiting for server to start....2022-03-18 10:24:22 CST [21576]: [1-1] 6233ed56.5448 0     LOG:  Password detection module is disabled
2022-03-18 10:24:22 CST [21576]: [2-1] 6233ed56.5448 0     LOG:  starting HighGo Security Enterprise Edition Database System 4.5.7 on CentOS7 x86_64,build on 20210804
2022-03-18 10:24:22 CST [21576]: [3-1] 6233ed56.5448 0     LOG:  listening on IPv4 address "0.0.0.0", port 5866
2022-03-18 10:24:22 CST [21576]: [4-1] 6233ed56.5448 0     LOG:  listening on Unix socket "/tmp/.s.PGSQL.5866"
2022-03-18 10:24:22 CST [21576]: [5-1] 6233ed56.5448 0     LOG:  redirecting log output to logging collector process
2022-03-18 10:24:22 CST [21576]: [6-1] 6233ed56.5448 0     HINT:  Future log output will appear in directory "../hgdb_log".
 done
server started
[root@db data]# ls
audit_param.conf  global                pg_commit_ts        pg_ident.conf         pg_notify     pg_stat      pg_twophase  postgresql.auto.conf         postgresql.conf.backup
backup_label.old  hgaudit               pg_dynshmem         pg_ident.conf.backup  pg_replslot   pg_stat_tmp  PG_VERSION   postgresql.base.conf         postmaster.opts
base              hgdb.lic              pg_hba.conf         pg_logical            pg_serial     pg_subtrans  pg_wal       postgresql.base.conf.backup  postmaster.pid
current_logfiles  patroni.dynamic.json  pg_hba.conf.backup  pg_multixact          pg_snapshots  pg_tblspc    pg_xact      postgresql.conf              secure_param.conf
[root@db data]# pg_ctl stop -D /db/hgdbdata/data/
waiting for server to shut down.... done
server stopped
[root@db data]# 

5. Started the HAC service on node 1 and checked the database log file, which reported the following errors
[root@db data]# systemctl start hghac-vip
[root@db data]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
+ Cluster: ha (7072987311974756506) -----+---------+----+-----------+
| Member | Host                | Role    | State   | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Replica | running |    |   unknown |
+--------+---------------------+---------+---------+----+-----------+
[root@db data]# 
[root@db hgdb_log]# pwd
/db/hgdbdata/hgdb_log
[root@db hgdb_log]# tail -f hgdb-5.csv
2022-03-18 10:28:10.388 CST,"sysdba","highgo",22463,"127.0.0.1:14914",6233ee3a.57bf,2,"authentication",2022-03-18 10:28:10 CST,2/12,0,FATAL,28P01,"password authentication failed for user ""sysdba""","Password does not match for user ""sysdba"".
Connection matched pg_hba.conf line 4: ""host    all             all             0.0.0.0/0            sm3""",,,,,,,,""
2022-03-18 10:28:10.393 CST,,,22464,"127.0.0.1:14916",6233ee3a.57c0,1,"",2022-03-18 10:28:10 CST,,0,LOG,00000,"connection received: host=127.0.0.1 port=14916",,,,,,,,,""
2022-03-18 10:28:10.398 CST,"sysdba","highgo",22464,"127.0.0.1:14916",6233ee3a.57c0,2,"authentication",2022-03-18 10:28:10 CST,2/13,0,FATAL,28P01,"password authentication failed for user ""sysdba""","Password does not match for user ""sysdba"".
Connection matched pg_hba.conf line 4: ""host    all             all             0.0.0.0/0            sm3""",,,,,,,,""
2022-03-18 10:28:10.419 CST,,,22467,"127.0.0.1:14920",6233ee3a.57c3,1,"",2022-03-18 10:28:10 CST,,0,LOG,00000,"connection received: host=127.0.0.1 port=14920",,,,,,,,,""
2022-03-18 10:28:10.459 CST,,,22468,"127.0.0.1:14924",6233ee3a.57c4,1,"",2022-03-18 10:28:10 CST,,0,LOG,00000,"connection received: host=127.0.0.1 port=14924",,,,,,,,,""
2022-03-18 10:28:10.483 CST,"sysdba","highgo",22468,"127.0.0.1:14924",6233ee3a.57c4,2,"authentication",2022-03-18 10:28:10 CST,2/15,0,FATAL,28P01,"password authentication failed for user ""sysdba""","Password does not match for user ""sysdba"".
Connection matched pg_hba.conf line 4: ""host    all             all             0.0.0.0/0            sm3""",,,,,,,,""
2022-03-18 10:28:10.490 CST,,,22469,"127.0.0.1:14926",6233ee3a.57c5,1,"",2022-03-18 10:28:10 CST,,0,LOG,00000,"connection received: host=127.0.0.1 port=14926",,,,,,,,,""
2022-03-18 10:28:10.495 CST,"sysdba","highgo",22469,"127.0.0.1:14926",6233ee3a.57c5,2,"authentication",2022-03-18 10:28:10 CST,2/16,0,FATAL,28P01,"password authentication failed for user ""sysdba""","Password does not match for user ""sysdba"".
Connection matched pg_hba.conf line 4: ""host    all             all             0.0.0.0/0            sm3""",,,,,,,,""
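
The repeated 28P01 entries in the log excerpt can be summarized per user rather than read line by line. The following is a minimal sketch, assuming the log is in the csvlog layout shown here (user name in the second comma-separated field); the function name `count_auth_failures` is hypothetical and not part of hghac.

```shell
#!/bin/sh
# Count "password authentication failed" records (SQLSTATE 28P01) per user.
# Reads the csvlog file(s) given as arguments, or stdin when none are given.
count_auth_failures() {
    # Keep only the first line of each FATAL/28P01 record, then tally field 2
    # (the user name) after stripping its surrounding double quotes.
    grep -F ',FATAL,28P01,' "$@" \
        | awk -F',' '{ gsub(/"/, "", $2); n[$2]++ } END { for (u in n) print u, n[u] }'
}
```

For example, `count_auth_failures /db/hgdbdata/hgdb_log/hgdb-5.csv` prints one line per user with the number of failed logins. A steadily growing count for the user that HAC connects as points at a credential mismatch rather than a replication problem.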

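Investigation showed that the passwords in hghac.yaml did not match the actual ones (the passwords of the users involved had been changed earlier because of a special requirement)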
[root@db hghac]# pwd
/opt/HighGo/tools/hghac
[root@db hghac]# vi hghac.yaml
  authentication:
    replication:
      password: High@123
      username: sysdba
    rewind:
      password: High@123
      username: sysdba
    sysdba:
      password: High@123
    syssso:
      password: High@123
    syssao:
      password: High@123
      
Change the passwords in hghac.yaml to the correct ones:
  authentication:
    replication:
      password: High@789
      username: sysdba
    rewind:
      password: High@789
      username: sysdba
    sysdba:
      password: High@789
    syssso:
      password: High@789
    syssao:
      password: High@789
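
Before restarting the HAC service, it can help to confirm that a password from the edited file really logs in. The sketch below makes several assumptions: a `psql` client compatible with the database is on the PATH, the connection goes through 127.0.0.1 on port 5866 to the `highgo` database (as in the log excerpt), and `check_db_password` and the `PSQL` override variable are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Try a single login with the password that hghac.yaml is going to use.
# Usage: check_db_password USER PASSWORD
# Exit status 0 means the login succeeded; non-zero means it failed.
check_db_password() {
    user=$1
    password=$2
    # -w: never prompt for a password, so a wrong one fails instead of waiting.
    # PSQL is an assumed hook for pointing at a different client binary.
    PGPASSWORD=$password "${PSQL:-psql}" -w -h 127.0.0.1 -p 5866 \
        -U "$user" -d highgo -c 'select 1' >/dev/null 2>&1
}
```

A call such as `check_db_password sysdba 'High@789' && echo ok` gives a quick yes/no answer for each user listed under `authentication`, with the database running as in step 4.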
      
Restart the HAC service; the database Role now shows Leader:
[root@db data]# systemctl restart hghac-vip
[root@db data]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
+ Cluster: ha (7072987311974756506) ----+---------+----+-----------+
| Member | Host                | Role   | State   | TL | Lag in MB |
+--------+---------------------+--------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Leader | running | 25 |           |
+--------+---------------------+--------+---------+----+-----------+
[root@db data]# 

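On the other two standby nodes, update hghac.yaml with the correct passwords as well and start the HAC service on each; the timeline (TL) advances from 25 to 26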
[root@db data]# /opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list
+ Cluster: ha (7072987311974756506) -----+---------+----+-----------+
| Member | Host                | Role    | State   | TL | Lag in MB |
+--------+---------------------+---------+---------+----+-----------+
| hghaca | 192.168.80.111:5866 | Leader  | running | 26 |           |
| hghacb | 192.168.80.112:5866 | Replica | running | 26 |         0 |
| hghacc | 192.168.80.113:5866 | Replica | running | 26 |         0 |
+--------+---------------------+---------+---------+----+-----------+
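
The "is there a Leader?" check done by eye on these tables can also be scripted, for instance in a post-restart health check. This is a minimal sketch that assumes the table layout shown in this article (Role in the third column, State in the fourth); `has_running_leader` is a hypothetical helper, not an hghactl feature.

```shell
#!/bin/sh
# Read `hghactl ... list` table output on stdin.
# Exit status 0: some member is Leader and running; 1: no such member.
has_running_leader() {
    awk -F'|' '
        # Table rows begin with a pipe, so field 1 is empty:
        # field 4 holds Role and field 5 holds State.
        /^\|/ {
            role = $4; state = $5
            gsub(/ /, "", role); gsub(/ /, "", state)
            if (role == "Leader" && state == "running") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Typical use would be `/opt/HighGo/tools/hghac/hghactl -c /opt/HighGo/tools/hghac/hghac.yaml list | has_running_leader || echo "no running Leader"`.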

6. At this point the cluster is back to normal.
 
