DM备库维护操作

一、主要步骤

假如需要动态维护的备库名称为 GRP1_RT_02,完整的维护步骤说明如下:
1. 在监视器上执行 login,输入登录口令
2. 在监视器上执行 detach database GRP1_RT_02,将备库分离出守护系统。
3. 在监视器上执行 stop database GRP1_RT_02,将备库正常退出,如果不需要
退出备库,则需要执行 stop dmwatcher database GRP1_RT_02 关闭备库的守护进程
监控功能,避免将备库分离成功后,主库又发生切换,导致之前的分离操作失效,又将备库
重新加回守护系统。
4. 执行具体的备库维护操作。
5. 备库维护完成后,如果是关闭状态,在监视器上执行 startup database
GRP1_RT_02 将备库重新启动,如果已经是运行状态,根据情况看是否需要执行 startup
dmwatcher database GRP1_RT_02 打开备库的守护进程监控功能。
6. 待备库重新 Open 后,在监视器上执行 attach database GRP1_RT_02,将备
库重新加回到守护系统中。
以上步骤中,在步骤 3 执行完成后,会将备库的守护进程切换到 Shutdown 状态,备
库维护完成后,如果是用手工方式重启,需要先使用 startup dmwatcher 命令将备库的
守护进程监控功能打开,备库才能自动 Open。如果是用 startup database 命令方式重
启,此命令会自动将守护进程的监控功能打开。
   需要注意的是,如果在分离出去的备库维护完成之前,主库故障,其他备库接管成为新
主库,则之前的分离操作会失效,可以通过对备库的守护进程执行 stop dmwatcher
database 关闭指定备库的监控,避免这种情况发生,或者重新对此备库执行 detach 命
令,通知当前的新主库不去主动恢复正在维护中的备库。

二、具体测试

场景一:在主库GRP1_RWW_01没有down机下,备库GRP1_RWW_02分离

下边测试以一主两备,环境为:GRP1_RWW_01为主,GRP1_RWW_02为备,GRP1_RWW_03为备。现在维护GRP1_RWW_02备库。
1)、在监视器上执行 login,输入登录口令
2)、detach database grp1.GRP1_RWW_02
3)、stop database grp1.GRP1_RWW_02    (此命令执行完成后,会将守护进程切换为 Shutdown 状态(当然库也是Shutdown)。对 DMDSC 集群,会将每个 dmcss 的自动拉起功能关闭。)
     如果不需要退出备库,则需要执行 stop dmwatcher database grp1.GRP1_RWW_02 关闭备库的守护进程监控功能,避免将备库分离成功后,主库又发生切换,导致之前的分离操作失效,又将备库重新加回守护系统。
     最好执行这一步,以防止现在的主库故障:stop dmwatcher database grp1.GRP1_RWW_02    (守护进程切换至 Shutdown 状态后,不会再自动处理故障,也不会自动启动实例,但仍然能够正常接收、发送消息,监视器仍然能够监控到最新的实例状态。)        
4)、执行具体的备库维护操作。
5)、备库维护完成后,做: startup database grp1.GRP1_RWW_02

tip
[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE] cannot join other instances, dmwatcher status is OPEN, SYSOPENHISTORY status is VALID
[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently
[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE] is OK, dmwatcher status is OPEN, dw_type is GLOBAL

[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE] can join instance GRP1_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE]
[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently
[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE] arch status is invalid, but it cannot be recovered now, please retry later, can use CHECK RECOVER command to find detail reason

[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_03[STANDBY, OPEN, ISTAT_SAME:TRUE] can join instance GRP1_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE]
[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_03[STANDBY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently
[monitor]         2022-04-17 17:02:14: Instance GRP1_RWW_03[STANDBY, OPEN, ISTAT_SAME:TRUE] is OK, dmwatcher status is OPEN, dw_type is GLOBAL

[monitor]         2022-04-17 17:02:14: Group(GRP1) has PRIMARY&OPEN instance, but still exists other instances not OK, please choose appropriate processing according to the above information!

[monitor]         2022-04-17 17:02:14: All groups' have PRIMARY&OPEN instances, but there still exist instances not OK!

此时备库 GRP1_RWW_02 的规档还是无效的!是因为还没有执行 attach database grp1.GRP1_RWW_02

check recover grp1.GRP1_RWW_02
[monitor]         2022-04-17 17:04:38: Instance(grp1_rww_02) check result: Instance(GRP1_RWW_02) has not reached recovery time interval 86400(s), cannot be recovered, please retry after 83796(s)!	

场景二:在备库GRP1_RWW_02维护期间,主库GRP1_RWW_01  down机了,由备库GRP1_RWW_03自动接管成为新的主库。然后GRP1_RWW_02又修复好了,最后将其加入到集群中去。

现在模拟 GRP1_RWW_01 为主库关机,然后由 GRP1_RWW_03 自动接管为新的主库,注意此时 GRP1_RWW_02 还在修复中。。。。

也即是如下的场景:
            需要注意的是,如果在分离出去的备库维护完成之前,主库故障,其他备库接管成为新
        主库,则之前的分离操作会失效,可以通过对备库的守护进程执行 stop dmwatcher
        database 关闭指定备库的监控,避免这种情况发生,或者重新对此备库执行 detach 命
        令,通知当前的新主库不去主动恢复正在维护中的备库。

在上边第3)步时如果没有停库,只停了 dmwatcher;即: stop dmwatcher database grp1.GRP1_RWW_02     ,所以现在在备库还在维护期间,就不需要再执行 detach 命令。

我直接将 GRP1_RWW_01 为主库关机,监示器上显示如下:

[monitor]         2022-04-17 17:14:18: Received message timeout from(GRP1_RWW_01)
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN            
                  2022-04-17 17:14:07  ERROR          OK        GRP1_RWW_01      OPEN        PRIMARY   VALID    10       37660           37660           

[monitor]         2022-04-17 17:14:18: Check primary instance error in group(GRP1), start to auto takeover

[monitor]         2022-04-17 17:14:18: Notify group(GRP1)'s active dmwatcher to set MID
[monitor]         2022-04-17 17:14:18: Notify group(GRP1)'s active dmwatcher to set MID success
[monitor]         2022-04-17 17:14:18: Start to takeover use instance GRP1_RWW_03
[monitor]         2022-04-17 17:14:18: Notify dmwatcher(GRP1_RWW_03) switch to TAKEOVER status
[monitor]         2022-04-17 17:14:18: Dmwatcher process GRP1_RWW_03 status switching [OPEN-->TAKEOVER] 
[monitor]         2022-04-17 17:14:19: Switch dmwatcher GRP1_RWW_03 to TAKEOVER status success
[monitor]         2022-04-17 17:14:19: Instance GRP1_RWW_03 start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 7)
[monitor]         2022-04-17 17:14:19: Instance GRP1_RWW_03 execute sql SP_SET_GLOBAL_DW_STATUS(0, 7) success
[monitor]         2022-04-17 17:14:19: Instance GRP1_RWW_03 start to execute sql ALTER DATABASE MOUNT
[monitor]         2022-04-17 17:14:19: Instance GRP1_RWW_03 execute sql ALTER DATABASE MOUNT success
[monitor]         2022-04-17 17:14:19: Instance GRP1_RWW_03 start to execute sql ALTER DATABASE PRIMARY
[monitor]         2022-04-17 17:14:20: Instance GRP1_RWW_03 execute sql ALTER DATABASE PRIMARY success
[monitor]         2022-04-17 17:14:20: Notify instance GRP1_RWW_03 to change all arch status to be invalid
[monitor]         2022-04-17 17:14:20: Succeed to change all instances arch status to be invalid
[monitor]         2022-04-17 17:14:20: Instance GRP1_RWW_03 start to execute sql ALTER DATABASE OPEN FORCE
[monitor]         2022-04-17 17:14:20: Instance GRP1_RWW_03 execute sql ALTER DATABASE OPEN FORCE success
[monitor]         2022-04-17 17:14:20: Instance GRP1_RWW_03 start to execute sql SP_SET_GLOBAL_DW_STATUS(7, 0)
[monitor]         2022-04-17 17:14:20: Instance GRP1_RWW_03 execute sql SP_SET_GLOBAL_DW_STATUS(7, 0) success
[monitor]         2022-04-17 17:14:20: Notify dmwatcher(GRP1_RWW_03) switch to OPEN status
[monitor]         2022-04-17 17:14:21: Dmwatcher process GRP1_RWW_03 status switching [TAKEOVER-->OPEN] 
[monitor]         2022-04-17 17:14:21: Switch dmwatcher GRP1_RWW_03 to OPEN status success
[monitor]         2022-04-17 17:14:21: Notify group(GRP1)'s dmwatcher to do clear
[monitor]         2022-04-17 17:14:21: Clean request of dmwatcher processer GRP1_RWW_02 success
[monitor]         2022-04-17 17:14:22: Clean request of dmwatcher processer GRP1_RWW_03 success
[monitor]         2022-04-17 17:14:22: Success to takeover use instance GRP1_RWW_03

[monitor]         2022-04-17 17:14:22: Group(GRP1) use instance GRP1_RWW_03 auto takeover success

show
2022-04-17 17:16:39 
#================================================================================#
GROUP            OGUID       MON_CONFIRM     MODE            MPP_FLAG  
GRP1             453332      TRUE            AUTO            FALSE     


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT    
192.168.0.24        52143        2022-04-17 17:16:38  GLOBAL    VALID     OPEN           GRP1_RWW_03      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID    

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG          
192.168.1.24        32143      OK        GRP1_RWW_03      OPEN        PRIMARY   0          0            TIMELY    VALID    4412            39018           4412            39018           NONE                  

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT    
192.168.0.22        52142        2022-04-17 17:16:38  GLOBAL    VALID     SHUTDOWN       GRP1_RWW_02      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    INVALID  

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG          
192.168.1.22        32142      OK        GRP1_RWW_02      OPEN        STANDBY   0          0            TIMELY    INVALID  4381            37657           4381            37657           NONE                  

DATABASE(GRP1_RWW_02) APPLY INFO FROM (GRP1_RWW_03), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[4410, 4410, 4410], (RLSN, SLSN, KLSN)[37657, 37657, 37657], N_TSK[0], TSK_MEM_USE[0] 
REDO_LSN_ARR: (37657)


ERROR DATABASE:

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT    
192.168.0.21        52141        2022-04-17 17:14:07  GLOBAL    VALID     ERROR          GRP1_RWW_01      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID    

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG          
192.168.1.21        32141      OK        GRP1_RWW_01      OPEN        PRIMARY   0          0            TIMELY    VALID    4411            37660           4411            37660           NONE                  

#================================================================================#

可以看到:GRP1_RWW_01 关机后,监示器不会更新其库信息,只会更新 dmwatcher 的状态为: ERROR; 现在 GRP1_RWW_03 成为了新的主库。

6)、修复完备库GRP1_RWW_02后,在监视器上执行 attach database grp1.GRP1_RWW_02 ,将备库重新加回到守护系统中。

attach database grp1.GRP1_RWW_02
[monitor]         2022-04-17 17:21:38: Dmwatcher GRP1_RWW_02 not in STARTUP or OPEN status, is SHUTDOWN status

可以看到:提示要先做 startup dmwatcher database grp1.GRP1_RWW_02

startup dmwatcher database grp1.GRP1_RWW_02 
[monitor]         2022-04-17 17:22:28: Notify group(GRP1)'s active dmwatcher to set MID
[monitor]         2022-04-17 17:22:28: Notify group(GRP1)'s active dmwatcher to set MID success
[monitor]         2022-04-17 17:22:28: Startup dmwatcher process of instance GRP1_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE]
[monitor]         2022-04-17 17:22:28: Dmwatcher process GRP1_RWW_02 status switching [SHUTDOWN-->STARTUP] 
[monitor]         2022-04-17 17:22:29: Startup dmwatcher process of instance GRP1_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE] success

[monitor]         2022-04-17 17:22:29: Notify group(GRP1)'s dmwatcher to do clear
[monitor]         2022-04-17 17:22:29: Clean request of dmwatcher processer GRP1_RWW_02 success
[monitor]         2022-04-17 17:22:29: Clean request of dmwatcher processer GRP1_RWW_03 success
[monitor]         2022-04-17 17:22:29: Dmwatcher process GRP1_RWW_03 status switching [OPEN-->RECOVERY] 
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN            
                  2022-04-17 17:22:29  RECOVERY       OK        GRP1_RWW_03      OPEN        PRIMARY   VALID    11       39018           39018           

[monitor]         2022-04-17 17:22:30: Dmwatcher process GRP1_RWW_02 status switching [STARTUP-->OPEN] 
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN            
                  2022-04-17 17:22:30  OPEN           OK        GRP1_RWW_02      OPEN        STANDBY   VALID    10       37660           37660           

[monitor]         2022-04-17 17:22:31: Dmwatcher process GRP1_RWW_03 status switching [RECOVERY-->OPEN] 
                  WTIME                WSTATUS        INST_OK   INAME            ISTATUS     IMODE     RSTAT    N_OPEN   FLSN            CLSN            
                  2022-04-17 17:22:31  OPEN           OK        GRP1_RWW_03      OPEN        PRIMARY   VALID    11       39018           39018  
				  
				  
show
2022-04-17 17:22:41 
#================================================================================#
GROUP            OGUID       MON_CONFIRM     MODE            MPP_FLAG  
GRP1             453332      TRUE            AUTO            FALSE     


<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT    
192.168.0.24        52143        2022-04-17 17:22:41  GLOBAL    VALID     OPEN           GRP1_RWW_03      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID    

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG          
192.168.1.24        32143      OK        GRP1_RWW_03      OPEN        PRIMARY   0          0            TIMELY    VALID    4413            39018           4413            39018           NONE                  

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT    
192.168.0.22        52142        2022-04-17 17:22:41  GLOBAL    VALID     OPEN           GRP1_RWW_02      OK        1     1     OPEN        STANDBY   DSC_OPEN       TIMELY    VALID    

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG          
192.168.1.22        32142      OK        GRP1_RWW_02      OPEN        STANDBY   0          0            TIMELY    VALID    4381            39018           4381            39018           NONE                  

DATABASE(GRP1_RWW_02) APPLY INFO FROM (GRP1_RWW_03), REDOS_PARALLEL_NUM (1):
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[4413, 4413, 4413], (RLSN, SLSN, KLSN)[39018, 39018, 39018], N_TSK[0], TSK_MEM_USE[0] 
REDO_LSN_ARR: (39018)


ERROR DATABASE:

<<DATABASE GLOBAL INFO:>>
DW_IP               MAL_DW_PORT  WTIME                WTYPE     WCTLSTAT  WSTATUS        INAME            INST_OK   N_EP  N_OK  ISTATUS     IMODE     DSC_STATUS     RTYPE     RSTAT    
192.168.0.21        52141        2022-04-17 17:14:07  GLOBAL    VALID     ERROR          GRP1_RWW_01      OK        1     1     OPEN        PRIMARY   DSC_OPEN       TIMELY    VALID    

EP INFO:
INST_IP             INST_PORT  INST_OK   INAME            ISTATUS     IMODE     DSC_SEQNO  DSC_CTL_NODE RTYPE     RSTAT    FSEQ            FLSN            CSEQ            CLSN            DW_STAT_FLAG          
192.168.1.21        32141      OK        GRP1_RWW_01      OPEN        PRIMARY   0          0            TIMELY    VALID    4411            37660           4411            37660           NONE                  

#================================================================================#

可以看到: startup dmwatcher database grp1.GRP1_RWW_02  ,因为是新的主库,所以直接规档变为 VALID ,不再需要 attach database grp1.GRP1_RWW_02  ,如果是以前的老主库的话,则要 attach database grp1.GRP1_RWW_02;

三、总结:

本文主要是熟悉detach/attach database的使用,较简单。

 更多学习内容参考:达梦数据库 - 新一代大型通用关系型数据库 | 达梦云适配中心 https://eco.dameng.com

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值