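1.0 Primary failure ("suicide") test
In automatic failover mode a confirm monitor must be configured, and at most one confirm monitor is allowed.
#Take the primary server down by rebooting it
[root@Centos7-STD root]# reboot
- Check the monitor
#The primary failure is detected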
[monitor] 2020-11-13 15:22:52: Received message timeout from(GRP1_PR)
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 02:22:43 ERROR OK GRP1_PR OPEN PRIMARY VALID 7 45393 45393
- The automatic primary/standby role switch begins
[monitor] 2020-11-13 15:22:52: Check primary instance error in group(GRP1), start to auto takeover
[monitor] 2020-11-13 15:22:52: Notify group(GRP1)'s active dmwatcher to set MID
[monitor] 2020-11-13 15:22:52: Notify group(GRP1)'s active dmwatcher to set MID success
[monitor] 2020-11-13 15:22:52: Start to takeover use instance GRP1_SD
[monitor] 2020-11-13 15:22:52: Notify dmwatcher(GRP1_SD) switch to TAKEOVER status
[monitor] 2020-11-13 15:22:52: Dmwatcher process GRP1_SD status switching [OPEN-->TAKEOVER]
[monitor] 2020-11-13 15:22:52: Switch dmwatcher GRP1_SD to TAKEOVER status success
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 7)
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD execute sql SP_SET_GLOBAL_DW_STATUS(0, 7) success
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD start to execute sql SP_APPLY_KEEP_PKG()
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD execute sql SP_APPLY_KEEP_PKG() success
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD start to execute sql ALTER DATABASE MOUNT
[monitor] 2020-11-13 15:22:54: Instance GRP1_SD execute sql ALTER DATABASE MOUNT success
[monitor] 2020-11-13 15:22:54: Instance GRP1_SD start to execute sql ALTER DATABASE PRIMARY
[monitor] 2020-11-13 15:22:54: Instance GRP1_SD execute sql ALTER DATABASE PRIMARY success
[monitor] 2020-11-13 15:22:57: Notify instance GRP1_SD to change all arch status to be invalid
[monitor] 2020-11-13 15:22:57: Succeed to change all instances arch status to be invalid
[monitor] 2020-11-13 15:22:57: Instance GRP1_SD start to execute sql ALTER DATABASE OPEN FORCE
[monitor] 2020-11-13 15:22:59: Instance GRP1_SD execute sql ALTER DATABASE OPEN FORCE success
[monitor] 2020-11-13 15:22:59: Instance GRP1_SD start to execute sql SP_SET_GLOBAL_DW_STATUS(7, 0)
[monitor] 2020-11-13 15:22:59: Instance GRP1_SD execute sql SP_SET_GLOBAL_DW_STATUS(7, 0) success
[monitor] 2020-11-13 15:22:59: Notify dmwatcher(GRP1_SD) switch to OPEN status
[monitor] 2020-11-13 15:22:59: Dmwatcher process GRP1_SD status switching [TAKEOVER-->OPEN]
[monitor] 2020-11-13 15:22:59: Switch dmwatcher GRP1_SD to OPEN status success
[monitor] 2020-11-13 15:22:59: Notify group(GRP1)'s dmwatcher to do clear
[monitor] 2020-11-13 15:22:59: Clean request of dmwatcher processer GRP1_SD success
[monitor] 2020-11-13 15:22:59: Success to takeover use instance GRP1_SD
[monitor] 2020-11-13 15:22:59: Group(GRP1) use instance GRP1_SD auto takeover success
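The takeover log above can be summarized programmatically. The following is a minimal sketch, not part of any DM tooling: it assumes the monitor output has been saved as text in the format shown above, and the helper names (`parse`, `completed_sql`, `takeover_seconds`) are hypothetical. It lists the SQL steps the monitor reported as successful and measures how long the automatic takeover took.

```python
import re
from datetime import datetime

# A subset of the monitor lines shown above, copied verbatim.
LOG = """\
[monitor] 2020-11-13 15:22:52: Check primary instance error in group(GRP1), start to auto takeover
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 7)
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD execute sql SP_SET_GLOBAL_DW_STATUS(0, 7) success
[monitor] 2020-11-13 15:22:52: Instance GRP1_SD start to execute sql ALTER DATABASE MOUNT
[monitor] 2020-11-13 15:22:54: Instance GRP1_SD execute sql ALTER DATABASE MOUNT success
[monitor] 2020-11-13 15:22:54: Instance GRP1_SD execute sql ALTER DATABASE PRIMARY success
[monitor] 2020-11-13 15:22:59: Instance GRP1_SD execute sql ALTER DATABASE OPEN FORCE success
[monitor] 2020-11-13 15:22:59: Group(GRP1) use instance GRP1_SD auto takeover success
"""

LINE_RE = re.compile(r"^\[monitor\] (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")
SQL_OK_RE = re.compile(r"^Instance (\S+) execute sql (.+) success$")


def parse(log_text):
    """Return (timestamp, message) pairs for every [monitor] line."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((ts, m.group(2)))
    return events


def completed_sql(events):
    """SQL statements the monitor reported as executed successfully, in order."""
    done = []
    for _, msg in events:
        m = SQL_OK_RE.match(msg)
        if m:
            done.append(m.group(2))
    return done


def takeover_seconds(events):
    """Seconds between 'start to auto takeover' and 'auto takeover success'."""
    start = next(ts for ts, msg in events if "start to auto takeover" in msg)
    end = next(ts for ts, msg in events if msg.endswith("auto takeover success"))
    return (end - start).total_seconds()


events = parse(LOG)
print(completed_sql(events))
print(takeover_seconds(events))
```

For the run recorded above, the measured window is the seven seconds between 15:22:52 and 15:22:59.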
- After the old primary node reboots successfully, it rejoins the cluster as a standby
[monitor] 2020-11-13 15:23:26: Dmwatcher process GRP1_PR status switching [NONE-->STARTUP]
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 02:23:47 UNIFY EP OK GRP1_PR MOUNT PRIMARY VALID 7 45624 45624
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 02:23:48 STARTUP OK GRP1_PR MOUNT STANDBY INVALID 7 45624 45624
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 02:23:48 UNIFY EP OK GRP1_PR MOUNT STANDBY INVALID 7 45624 45624
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 02:23:48 STARTUP OK GRP1_PR OPEN STANDBY INVALID 7 45624 45624
[monitor] 2020-11-13 15:23:46: Dmwatcher process GRP1_PR status switching [STARTUP-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 02:23:48 OPEN OK GRP1_PR OPEN STANDBY INVALID 7 45624 45624
[monitor] 2020-11-13 15:23:48: Dmwatcher process GRP1_SD status switching [OPEN-->RECOVERY]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 15:23:48 RECOVERY OK GRP1_SD OPEN PRIMARY VALID 8 46880 46880
[monitor] 2020-11-13 15:23:52: Dmwatcher process GRP1_SD status switching [RECOVERY-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 15:23:52 OPEN OK GRP1_SD OPEN PRIMARY VALID 8 46880 46880
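The state changes above are easier to follow when grouped per instance. The sketch below is illustrative only: it assumes the `status switching [A-->B]` line format shown above, and the function names are hypothetical. It collects each watcher's transitions and reports the final state of each.

```python
import re
from collections import defaultdict

# The status-switching lines from the output above, copied verbatim.
LOG = """\
[monitor] 2020-11-13 15:23:26: Dmwatcher process GRP1_PR status switching [NONE-->STARTUP]
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [STARTUP-->UNIFY EP]
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [UNIFY EP-->STARTUP]
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [STARTUP-->UNIFY EP]
[monitor] 2020-11-13 15:23:45: Dmwatcher process GRP1_PR status switching [UNIFY EP-->STARTUP]
[monitor] 2020-11-13 15:23:46: Dmwatcher process GRP1_PR status switching [STARTUP-->OPEN]
[monitor] 2020-11-13 15:23:48: Dmwatcher process GRP1_SD status switching [OPEN-->RECOVERY]
[monitor] 2020-11-13 15:23:52: Dmwatcher process GRP1_SD status switching [RECOVERY-->OPEN]
"""

# State names may contain spaces (for example "UNIFY EP").
SWITCH_RE = re.compile(r"Dmwatcher process (\S+) status switching \[(.+?)-->(.+?)\]")


def transitions(log_text):
    """Map each instance name to its ordered list of (old, new) states."""
    result = defaultdict(list)
    for m in SWITCH_RE.finditer(log_text):
        name, old, new = m.groups()
        result[name].append((old, new))
    return dict(result)


def final_state(log_text):
    """Last reported watcher state per instance."""
    return {name: steps[-1][1] for name, steps in transitions(log_text).items()}


print(final_state(LOG))
```

Both watchers ending in OPEN matches the snapshot that follows, where GRP1_SD is the primary and GRP1_PR the standby.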
- Check the result
GROUP OGUID MON_CONFIRM MODE MPP_FLAG
GRP1 453332 TRUE AUTO FALSE
<<DATABASE GLOBAL INFO:>>
IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
10.10.3.104 52142 2020-11-13 15:26:39 GLOBAL VALID OPEN GRP1_SD OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID
EP INFO:
INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
32142 OK GRP1_SD OPEN PRIMARY 0 0 REALTIME VALID 3942 46880 3942 46880 NONE
<<DATABASE GLOBAL INFO:>>
IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
10.10.3.103 52141 2020-11-13 02:26:41 GLOBAL VALID OPEN GRP1_PR OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID
EP INFO:
INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
32141 OK GRP1_PR OPEN STANDBY 0 0 REALTIME VALID 3941 46880 3941 46880 NONE
DATABASE(GRP1_PR) APPLY INFO FROM (GRP1_SD):
DSC_SEQNO[0], (ASEQ, SSEQ, KSEQ)[3942, 3942, 3942], (ALSN, SLSN, KLSN)[46880, 46880, 46880], N_TSK[0], TSK_MEM_USE[0]
1.1 Manual switchover
- Manually switch the node that was just rebooted back to primary
#Log in first
login
username:
password:
[monitor] 2020-11-13 15:27:22: Login dmmonitor success!
#Manual switchover
switchover grp1.grp1_pr
#Switchover process
[monitor] 2020-11-13 15:27:50: Start to switchover instance GRP1_PR
[monitor] 2020-11-13 15:27:50: Notify dmwatcher(GRP1_SD) switch to SWITCHOVER status
[monitor] 2020-11-13 15:27:50: Dmwatcher process GRP1_SD status switching [OPEN-->SWITCHOVER]
[monitor] 2020-11-13 15:27:50: Switch dmwatcher GRP1_SD to SWITCHOVER status success
[monitor] 2020-11-13 15:27:50: Notify dmwatcher(GRP1_PR) switch to SWITCHOVER status
[monitor] 2020-11-13 15:27:50: Dmwatcher process GRP1_PR status switching [OPEN-->SWITCHOVER]
[monitor] 2020-11-13 15:27:50: Switch dmwatcher GRP1_PR to SWITCHOVER status success
[monitor] 2020-11-13 15:27:50: Instance GRP1_SD start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 6)
[monitor] 2020-11-13 15:27:50: Instance GRP1_SD execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) success
[monitor] 2020-11-13 15:27:50: Instance GRP1_PR start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 6)
[monitor] 2020-11-13 15:27:50: Instance GRP1_PR execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) success
[monitor] 2020-11-13 15:27:50: Instance GRP1_SD start to execute sql ALTER DATABASE MOUNT
[monitor] 2020-11-13 15:27:52: Instance GRP1_SD execute sql ALTER DATABASE MOUNT success
[monitor] 2020-11-13 15:27:52: Instance GRP1_PR start to execute sql SP_APPLY_KEEP_PKG()
[monitor] 2020-11-13 15:27:52: Instance GRP1_PR execute sql SP_APPLY_KEEP_PKG() success
[monitor] 2020-11-13 15:27:52: Instance GRP1_PR start to execute sql ALTER DATABASE MOUNT
[monitor] 2020-11-13 15:27:54: Instance GRP1_PR execute sql ALTER DATABASE MOUNT success
[monitor] 2020-11-13 15:27:54: Instance GRP1_SD start to execute sql ALTER DATABASE STANDBY
[monitor] 2020-11-13 15:27:54: Instance GRP1_SD execute sql ALTER DATABASE STANDBY success
[monitor] 2020-11-13 15:27:54: Instance GRP1_PR start to execute sql ALTER DATABASE PRIMARY
[monitor] 2020-11-13 15:27:54: Instance GRP1_PR execute sql ALTER DATABASE PRIMARY success
[monitor] 2020-11-13 15:27:54: Notify instance GRP1_PR to change all arch status to be invalid
[monitor] 2020-11-13 15:27:54: Succeed to change all instances arch status to be invalid
[monitor] 2020-11-13 15:27:54: Instance GRP1_SD start to execute sql ALTER DATABASE OPEN FORCE
[monitor] 2020-11-13 15:27:54: Instance GRP1_SD execute sql ALTER DATABASE OPEN FORCE success
[monitor] 2020-11-13 15:27:54: Instance GRP1_PR start to execute sql ALTER DATABASE OPEN FORCE
[monitor] 2020-11-13 15:27:56: Instance GRP1_PR execute sql ALTER DATABASE OPEN FORCE success
[monitor] 2020-11-13 15:27:56: Instance GRP1_SD start to execute sql SP_SET_GLOBAL_DW_STATUS(6, 0)
[monitor] 2020-11-13 15:27:56: Instance GRP1_SD execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) success
[monitor] 2020-11-13 15:27:56: Instance GRP1_PR start to execute sql SP_SET_GLOBAL_DW_STATUS(6, 0)
[monitor] 2020-11-13 15:27:56: Instance GRP1_PR execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) success
[monitor] 2020-11-13 15:27:56: Notify dmwatcher(GRP1_SD) switch to OPEN status
[monitor] 2020-11-13 15:27:56: Dmwatcher process GRP1_SD status switching [SWITCHOVER-->OPEN]
[monitor] 2020-11-13 15:27:56: Switch dmwatcher GRP1_SD to OPEN status success
[monitor] 2020-11-13 15:27:56: Notify dmwatcher(GRP1_PR) switch to OPEN status
[monitor] 2020-11-13 15:27:56: Dmwatcher process GRP1_PR status switching [SWITCHOVER-->OPEN]
[monitor] 2020-11-13 15:27:57: Switch dmwatcher GRP1_PR to OPEN status success
[monitor] 2020-11-13 15:27:57: Notify group(GRP1)'s dmwatcher to do clear
[monitor] 2020-11-13 15:27:57: Clean request of dmwatcher processer GRP1_PR success
2020-11-13 15:27:57
#================================================================================#
GROUP OGUID MON_CONFIRM MODE MPP_FLAG
GRP1 453332 TRUE AUTO FALSE
<<DATABASE GLOBAL INFO:>>
IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
10.10.3.103 52141 2020-11-13 02:27:59 GLOBAL VALID OPEN GRP1_PR OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID
EP INFO:
INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
32141 OK GRP1_PR OPEN PRIMARY 0 0 REALTIME VALID 3943 46880 3943 48238 NONE
<<DATABASE GLOBAL INFO:>>
IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
10.10.3.104 52142 2020-11-13 15:27:57 GLOBAL VALID OPEN GRP1_SD OK 1 1 OPEN STANDBY DSC_OPEN REALTIME INVALID
EP INFO:
INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
32142 OK GRP1_SD OPEN STANDBY 0 0 REALTIME INVALID 3943 46880 3943 46880 NONE
DATABASE(GRP1_SD) APPLY INFO FROM (GRP1_PR):
DSC_SEQNO[0], (ASEQ, SSEQ, KSEQ)[3943, 3943, 3943], (ALSN, SLSN, KLSN)[46880, 46880, 46880], N_TSK[0], TSK_MEM_USE[0]
- Check the result
GROUP OGUID MON_CONFIRM MODE MPP_FLAG
GRP1 453332 TRUE AUTO FALSE
<<DATABASE GLOBAL INFO:>>
IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
10.10.3.103 52141 2020-11-13 02:28:40 GLOBAL VALID OPEN GRP1_PR OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID
EP INFO:
INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
32141 OK GRP1_PR OPEN PRIMARY 0 0 REALTIME VALID 3944 48238 3944 48238 NONE
<<DATABASE GLOBAL INFO:>>
IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
10.10.3.104 52142 2020-11-13 15:28:37 GLOBAL VALID OPEN GRP1_SD OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID
EP INFO:
INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
32142 OK GRP1_SD OPEN STANDBY 0 0 REALTIME VALID 3943 48238 3943 48238 NONE
DATABASE(GRP1_SD) APPLY INFO FROM (GRP1_PR):
DSC_SEQNO[0], (ASEQ, SSEQ, KSEQ)[3944, 3944, 3944], (ALSN, SLSN, KLSN)[48238, 48238, 48238], N_TSK[0], TSK_MEM_USE[0]
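The two snapshots above differ in the standby's APPLY INFO line: right after the switchover the LSN triple is still 46880 while the new primary's CLSN is 48238, and in the later snapshot all values agree. A small sketch of that comparison follows. It assumes, as an interpretation of this output rather than a documented rule, that the standby has caught up when ALSN, SLSN and KLSN all equal the primary's CLSN; the function names are hypothetical.

```python
import re

# Copied from the two monitor snapshots above.
APPLY_AFTER_SWITCH = ("DSC_SEQNO[0], (ASEQ, SSEQ, KSEQ)[3943, 3943, 3943], "
                      "(ALSN, SLSN, KLSN)[46880, 46880, 46880], N_TSK[0], TSK_MEM_USE[0]")
APPLY_LATER = ("DSC_SEQNO[0], (ASEQ, SSEQ, KSEQ)[3944, 3944, 3944], "
               "(ALSN, SLSN, KLSN)[48238, 48238, 48238], N_TSK[0], TSK_MEM_USE[0]")
PRIMARY_CLSN = 48238  # CLSN of GRP1_PR in the EP INFO rows above

# Matches groups such as "(ALSN, SLSN, KLSN)[48238, 48238, 48238]".
GROUP_RE = re.compile(r"\(([A-Z, ]+)\)\[([\d, ]+)\]")


def parse_apply_info(line):
    """Turn the grouped '(A, B, C)[1, 2, 3]' fields into a flat dict."""
    info = {}
    for names, values in GROUP_RE.findall(line):
        keys = [n.strip() for n in names.split(",")]
        nums = [int(v) for v in values.split(",")]
        info.update(zip(keys, nums))
    return info


def standby_caught_up(line, primary_clsn):
    """Assumption: caught up means ALSN == SLSN == KLSN == primary CLSN."""
    info = parse_apply_info(line)
    return info["ALSN"] == info["SLSN"] == info["KLSN"] == primary_clsn


print(standby_caught_up(APPLY_AFTER_SWITCH, PRIMARY_CLSN))
print(standby_caught_up(APPLY_LATER, PRIMARY_CLSN))
```

The first check corresponds to the snapshot in which GRP1_SD still shows RSTAT INVALID, the second to the one in which it is VALID again.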
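1.3 Rebuilding the standby
In real environments, server hardware failures or operator mistakes can break the high-availability cluster.
Because this example uses automatic failover, when the primary crashes the standby takes over automatically and is promoted to primary, so we only need to consider how to rebuild a standby.
- Simulate an accidental deletion
#Delete the standby's data directory directly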
[root@Centos7-STD log]# rm -rf /data/dmdbs/data/sd/sd/
#After the deletion, dmserver is still running
[root@Centos7-STD log]# ps -ef|grep dmserver
dmdba 1258 1 0 15:32 ? 00:00:01 /data/dmdbs/bin/dmserver /data/dmdbs/data/sd/sd/dm.ini mount
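The process survives because rm only unlinks the directory entries; dmserver still holds open file descriptors on the deleted files, so their contents remain accessible until the process exits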
[dmdba@Centos7-STD 1258]$ cd /proc/1258/fd
lrwx------ 1 dmdba dinstall 64 Nov 13 15:49 10 -> /data/dmdbs/data/sd/sd/sd01.log (deleted)
lrwx------ 1 dmdba dinstall 64 Nov 13 15:49 11 -> /data/dmdbs/data/sd/sd/sd02.log (deleted)
lrwx------ 1 dmdba dinstall 64 Nov 13 15:49 12 -> socket:[19173]
lrwx------ 1 dmdba dinstall 64 Nov 13 15:49 13 -> socket:[19174]
lrwx------ 1 dmdba dinstall 64 Nov 13 15:49 14 -> socket:[19176]
lrwx------ 1 dmdba dinstall 64 Nov 13 15:49 15 -> /data/dmdbs/data/sd/sd/ROLL.DBF (deleted)
lrwx------ 1 dmdba dinstall 64 Nov 13 15:49 16 -> /data/dmdbs/data/sd/sd/MAIN.DBF (deleted)
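The listing above can be checked from a script as well. On Linux, each entry in /proc/<pid>/fd is a symbolic link, and the link target of an unlinked file carries the suffix " (deleted)". The following sketch relies only on that behaviour; the function names are hypothetical and the PID is the one from this example.

```python
import os

DELETED_SUFFIX = " (deleted)"


def filter_deleted(links):
    """Keep only descriptors whose link target is marked as deleted.

    `links` maps fd number -> link target string; the suffix is stripped
    from the returned paths.
    """
    return {fd: target[:-len(DELETED_SUFFIX)]
            for fd, target in links.items()
            if target.endswith(DELETED_SUFFIX)}


def read_fd_links(pid):
    """Read every /proc/<pid>/fd symlink (Linux only, needs permission)."""
    fd_dir = f"/proc/{pid}/fd"
    links = {}
    for fd in os.listdir(fd_dir):
        try:
            links[int(fd)] = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
    return links


# Sample targets taken from the listing above.
sample = {
    10: "/data/dmdbs/data/sd/sd/sd01.log (deleted)",
    12: "socket:[19173]",
    16: "/data/dmdbs/data/sd/sd/MAIN.DBF (deleted)",
}
print(filter_deleted(sample))
```

On the standby host, `filter_deleted(read_fd_links(1258))` would apply the same filter to the live process, run as the dmdba user or root.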
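At this point, if you perform some operations on the primary, you will find that the standby still stays in sync
#Operations on the primary: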
create user test identified by test123456789;
create table test.t1(id int);
insert into test.t1 values(1),(2),(3);
commit;
#Query on the standby:
[dmdba@Centos7-STD fd]$ disql sysdba@localhost:32142
SQL> select * from test.t1;
LINEID ID
---------- -----------
1 1
2 2
3 3
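Again, the standby keeps working only because dmserver still holds the deleted files open; once the standby database process is killed, the high-availability environment breaks.
- Kill the leftover standby process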
[dmdba@Centos7-STD fd]$ ps -ef|grep dmserver
dmdba 1258 1 0 15:32 ? 00:00:01 /data/dmdbs/bin/dmserver /data/dmdbs/data/sd/sd/dm.ini mount
dmdba 2716 2184 0 15:59 pts/0 00:00:00 grep --color=auto dmserver
[dmdba@Centos7-STD fd]$ kill -9 1258
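As shown above, after the leftover standby process is killed, the standby's dmwatcher normally restarts the standby automatically as long as the watcher itself is still running.
This time, however, the standby environment has been deliberately destroyed, so the cluster may report a series of errors.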
[monitor] 2020-11-13 02:59:18: Instance GRP1_SD[STANDBY, OPEN, ISTAT_SAME:TRUE] error
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 15:59:15 STARTUP ERROR GRP1_SD OPEN STANDBY VALID 9 48311 48311
dmmonitor is the first to notice and reports that the standby is abnormal
2020-11-13 16:03:04.664 [INFO] dmwatcher P0000000988 T0000140654251452160 [!!! Local instance restarted by dmwatcher. start info is in dmserver log!!!]
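Next, the dmwatcher process on the standby node reports the message above: the local instance was restarted automatically by the watcher, and the details are in the local dmserver log
In the log directory you will find a new instance log, dm_unknown_202011.log
This appears to be DM's default log for an instance whose name cannot be determined; because we deleted the entire data directory, the startup failures are recorded here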
[dmdba@Centos7-STD log]$ tail dm_unknown_202011.log
2020-11-13 16:06:49.906 [WARNING] database P0000003138 T0000000000000003138 open ini file /data/dmdbs/data/sd/sd/dm.ini failed!
2020-11-13 16:06:49.906 [FATAL] database P0000003138 T0000000000000003138 dmserver startup failed, code = -104 [Invalid INI file]
2020-11-13 16:06:49.906 [FATAL] database P0000003138 T0000000000000003138 nsvr_ini_file_read failed, [code: -104]
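Clearly the standby cannot be recovered at this point and has to be rebuilt
- Preparation and checks before the rebuild
The tip command in dmmonitor summarizes the problems currently present in the Data Watch cluster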
tip
[monitor] 2020-11-13 03:08:37: Instance GRP1_SD[STANDBY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently
[monitor] 2020-11-13 03:08:37: Instance GRP1_SD[STANDBY, OPEN, ISTAT_SAME:TRUE] error, please wait dmwatcher auto restart it
[monitor] 2020-11-13 03:08:37: Group(GRP1) has PRIMARY&OPEN instance, but still exists other instances not OK, please choose appropriate processing according to the above information!
[monitor] 2020-11-13 03:08:37: All groups' have PRIMARY&OPEN instances, but there still exist instances not OK!
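According to the tips, the primary is currently healthy, in OPEN state, and can still serve requests. However, according to the official documentation, a primary with no available standby only stays OPEN at first:
in that state it can serve queries, but once a transaction occurs the primary suspends automatically because it cannot synchronously ship and persist the logs. This is also done to keep primary and standby data consistent
- Rebuild the standby
Since the standby environment is already broken, we do not force the primary to exit as in the approach above. The primary keeps serving business traffic, and the standby is restored by rebuilding it from an online backup
- Take an online backup of the primary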
[dmdba@centos7 ~]$ disql sysdba@localhost:32141
SQL> BACKUP DATABASE FULL BACKUPSET '/data/dmdbms/data/pr_bak_20201113.bak';
executed successfully
used time: 00:00:01.472. Execute id is 108.
- Re-initialize the standby
[dmdba@Centos7-STD log]$ dminit path=/data/dmdbs/data/sd db_name=sd
#Copy the backup set to the standby machine for the restore
[dmdba@centos7 ~]$ scp -r /data/dmdbms/data/pr_bak_20201113.bak 192.168.3.104:/data/dmdbs/data/
- Restore the standby
dmrman
restore database '/data/dmdbs/data/sd/sd/dm.ini' from backupset '/data/dmdbs/data/pr_bak_20201113.bak';
recover database '/data/dmdbs/data/sd/sd/dm.ini' from backupset '/data/dmdbs/data/pr_bak_20201113.bak';
recover database '/data/dmdbs/data/sd/sd/dm.ini' update db_magic;
Reconfigure the standby's dm.ini, dmarch.ini, dmmal.ini and dmwatcher.ini files
vim /data/dmdbs/data/sd/sd/dm.ini
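#Standby node
INSTANCE_NAME = GRP1_SD
PORT_NUM = 32142
ALTER_MODE_STATUS = 0 #Do not allow manual changes to the instance mode/status/OGUID
ENABLE_OFFLINE_TS = 2 #Do not allow the standby to OFFLINE tablespaces
MAL_INI = 1 #Enable the MAL system
ARCH_INI = 1 #Enable the archive configuration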
vim /data/dmdbs/data/sd/sd/dmarch.ini
[ARCHIVE_REALTIME]
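ARCH_TYPE = REALTIME #Real-time archive type
ARCH_DEST = GRP1_PR #Target instance name for real-time archiving
[ARCHIVE_LOCAL1]
ARCH_TYPE = LOCAL #Local archive type
ARCH_DEST = /data/dmdbs/data/sd/sd/arch #Directory for local archive files
ARCH_FILE_SIZE = 128 #Unit: MB; maximum size of a single local archive file
ARCH_SPACE_LIMIT = 0 #Unit: MB; 0 means no limit; valid range 1024~4294967294 MB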
mkdir -p /data/dmdbs/data/sd/sd/arch
vim /data/dmdbs/data/sd/sd/dmmal.ini
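MAL_CHECK_INTERVAL = 5 #MAL link check interval
MAL_CONN_FAIL_INTERVAL = 5 #Time after which a MAL link is considered broken
[MAL_INST1]
MAL_INST_NAME = GRP1_PR #Instance name; must match INSTANCE_NAME in dm.ini
MAL_HOST = 10.10.3.103 #IP address on which the MAL system listens for TCP connections
MAL_PORT = 61141 #Port on which the MAL system listens for TCP connections
MAL_INST_HOST = 192.168.3.103 #IP address on which the instance serves clients
MAL_INST_PORT = 32141 #Port on which the instance serves clients; must match PORT_NUM in dm.ini
MAL_DW_PORT = 52141 #Port on which the instance's local watcher listens for TCP connections
MAL_INST_DW_PORT = 33141 #Port on which the instance listens for TCP connections from the watcher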
[MAL_INST2]
MAL_INST_NAME = GRP1_SD
MAL_HOST = 10.10.3.104
MAL_PORT = 61142
MAL_INST_HOST = 192.168.3.104
MAL_INST_PORT = 32142
MAL_DW_PORT = 52142
MAL_INST_DW_PORT = 33142
vim /data/dmdbs/data/sd/sd/dmwatcher.ini
[GRP1]
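DW_TYPE = GLOBAL #Global watch type
DW_MODE = AUTO #Automatic failover mode
DW_ERROR_TIME = 10 #Time after which a remote watcher is considered failed
INST_RECOVER_TIME = 60 #Interval at which the primary's watcher starts recovery
INST_ERROR_TIME = 10 #Time after which the local instance is considered failed
INST_OGUID = 453332 #Unique OGUID of the Data Watch system
INST_INI = /data/dmdbs/data/sd/sd/dm.ini #Path to the dm.ini configuration file
INST_AUTO_RESTART = 1 #Enable automatic restart of the instance
INST_STARTUP_CMD = /data/dmdbs/bin/dmserver #Start the instance from the command line
RLOG_APPLY_THRESHOLD = 0 #Time threshold for redo log replay on the standby; disabled by default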
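The comments in dmmal.ini state that MAL_INST_NAME must match INSTANCE_NAME in dm.ini and that MAL_INST_PORT must match PORT_NUM. When the files are rewritten by hand during a rebuild, that is easy to get wrong, so a quick cross-check can help. The sketch below is a hypothetical helper, not a DM utility: it uses a deliberately simple parser that treats everything after `#` as a comment (so it would truncate any value that itself contains `#`), and the sample text is abbreviated from the files above.

```python
def parse_ini(text):
    """Parse DM-style ini text into {section: {key: value}}.

    Keys that appear before any [SECTION] header are stored under ''.
    Everything after '#' on a line is treated as a comment.
    """
    sections = {"": {}}
    current = ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
        elif "=" in line:
            key, value = line.split("=", 1)
            sections[current][key.strip()] = value.strip()
    return sections


# Abbreviated from the standby's dm.ini and dmmal.ini shown above.
DM_INI = """
INSTANCE_NAME = GRP1_SD
PORT_NUM = 32142
"""
DMMAL_INI = """
MAL_CHECK_INTERVAL = 5 #MAL link check interval
[MAL_INST1]
MAL_INST_NAME = GRP1_PR
MAL_INST_PORT = 32141
[MAL_INST2]
MAL_INST_NAME = GRP1_SD
MAL_INST_PORT = 32142
"""


def check_port_consistency(dm_ini, dmmal_ini):
    """True if dmmal.ini has an entry for this instance with a matching port."""
    dm = parse_ini(dm_ini)[""]
    for section in parse_ini(dmmal_ini).values():
        if section.get("MAL_INST_NAME") == dm["INSTANCE_NAME"]:
            return section.get("MAL_INST_PORT") == dm["PORT_NUM"]
    return False  # no MAL entry for this instance at all


print(check_port_consistency(DM_INI, DMMAL_INI))
```

In practice the two strings would be read from /data/dmdbs/data/sd/sd/dm.ini and dmmal.ini before the standby is started.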
- Start the standby in mount mode
[dmdba@Centos7-STD log]$ dmserver /data/dmdbs/data/sd/sd/dm.ini mount
#Log in to the standby with disql, then set the OGUID and the database mode
[dmdba@Centos7-STD ~]$ disql sysdba@localhost:32142
SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);
sp_set_oguid(453332);
alter database standby;
SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);
- Start the dmwatcher process on the standby
[dmdba@Centos7-STD ~]$ dmwatcher /data/dmdbs/data/sd/sd/dmwatcher.ini
#Wait for the standby to rejoin automatically
#================================================================================#
[monitor] 2020-11-13 04:15:54: Dmwatcher process GRP1_SD status switching [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 17:15:50 UNIFY EP OK GRP1_SD MOUNT STANDBY INVALID 9 48323 48323
[monitor] 2020-11-13 04:15:54: Dmwatcher process GRP1_SD status switching [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 17:15:50 STARTUP OK GRP1_SD OPEN STANDBY INVALID 9 48323 48323
[monitor] 2020-11-13 04:15:54: Dmwatcher process GRP1_SD status switching [STARTUP-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 17:15:50 OPEN OK GRP1_SD OPEN STANDBY INVALID 9 48323 48323
[monitor] 2020-11-13 04:15:54: Dmwatcher process GRP1_PR status switching [OPEN-->RECOVERY]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 04:15:54 RECOVERY OK GRP1_PR OPEN PRIMARY VALID 9 48329 48329
[monitor] 2020-11-13 04:15:56: Dmwatcher process GRP1_PR status switching [RECOVERY-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2020-11-13 04:15:56 OPEN OK GRP1_PR OPEN PRIMARY VALID 9 48329 48329