1 安装前准备
软硬件环境建议参考“数据库规范化部署 - 单机安装部署”,其内容与SERIES7中内容相似,这包括网络环境、磁盘IO、集群架构、切换模式说明等;仅有归档类型、dm_svc.conf配置文件等不同。在数据守护集群规划基础上适应性更改后的集群规划如下:
A 机器 | B 机器 | |
---|---|---|
业务 IP | 172.16.1.5 | 172.16.1.6 |
心跳 IP | 192.168.1.5 | 192.168.1.6 |
实例名 | GRP2_RWW_01 | GRP2_RWW_02 |
实例端口 | 5236 | 5236 |
MAL 端口 | 5336 | 5336 |
MAL 守护进程端口 | 5436 | 5436 |
守护进程端口 | 5536 | 5536 |
OGUID | 45332 | 45332 |
守护组 | GRP2 | GRP2 |
安装目录 | /home/dmdba/dmdbms | /home/dmdba/dmdbms |
实例目录 | /dmdata/data | /dmdata/data |
归档上限 | 51200 | 51200 |
与SERIES7中配置的数据守护集群相比,有修改项:业务IP、心跳IP、实例名(组名)、OGUID、守护组。
SERIES7数据守护集群搭建时创建的2个Docker Network在此处复用。
2 集群搭建
2.1 配置容器GRP2_RWW_01
2.1.1 初始化实例并备份数据
初始化实例:
docker run -d --restart=always --name=GRP2_RWW_01 --network dmwatch-test-business --ip 172.16.1.5 --privileged=true -e LD_LIBRARY_PATH=/opt/dmdbms/bin -e PAGE_SIZE=32 -e EXTENT_SIZE=32 -e LOG_SIZE=2048 -e UNICODE_FLAG=1 -e INSTANCE_NAME=GRP2_RWW_01 -v GRP2_RWW_01:/opt/dmdbms/data dm8_single:dm8_20240715_rev232765_x86_rh6_64
将其连接至heartbeat网络:
docker network connect --ip 192.168.1.5 dmwatch-test-heartbeat GRP2_RWW_01
进行数据库备份的方式有两种。如果是初始搭建集群环境,可以采用脱机备份、脱机还原方式;如果主库已在运行,可以采用联机备份、脱机还原方式来准备数据;
无论哪种方式,都需要首先配置本地归档,内容如下:
# 1. 配置dm.ini,修改ARCH_INI参数为1
# 2. 配置dmarch.ini
[ARCHIVE_LOCAL1]
ARCH_TYPE = LOCAL
ARCH_DEST = /opt/dmdbms/data/DAMENG/arch
ARCH_FILE_SIZE = 128
ARCH_SPACE_LIMIT = 0
脱机备份
正常关闭数据库:
root@46e945759cef:/opt/dmdbms/data/DAMENG$ cd /opt/dmdbms/bin
root@46e945759cef:/opt/dmdbms/bin$ ./DmService stop
Stopping DmService: [ OK ]
进行脱机备份:
root@46e945759cef:/opt/dmdbms/bin$ ./dmrman
dmrman V8
RMAN> BACKUP DATABASE '/opt/dmdbms/data/DAMENG/dm.ini';
BACKUP DATABASE '/opt/dmdbms/data/DAMENG/dm.ini';
file dm.key not found, use default license!
Processing backupset /opt/dmdbms/data/DAMENG/bak/DB_DAMENG_FULL_20240914_165714_960788
[Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00]
backup successfully!
time used: 00:00:03.253
RMAN> exit;
time used: 0.199(ms)
root@46e945759cef:/opt/dmdbms/bin$ ls /opt/dmdbms/data/DAMENG/bak
DB_DAMENG_FULL_20240914_165714_960788
使用dmrman
工具进行脱机备份时,不使用BACKUPSET工具指定【备份路径名】和【类型】时,默认为【FULL全库备份】和dm.ini
中BAK_PATH
配置的路径;若未配置BAK_PATH,则默认使用SYSTEM_PATH
下的bak
目录。个人更推荐这种方式,因为自动生成的备份目录命名按照规则自动设置,注明了数据库、备份类型、精确时间,清晰明了。
联机备份
联机备份于主库的备份方式上与脱机备份不同。在disql工具中执行语句的示例有:
SQL> BACKUP DATABASE BACKUPSET '/dm/data/BACKUP_FILE_01';
联机备份的参数以及选项见官方文档物理备份还原 - 联机备份还原 - 数据备份 - 手动备份。
2.1.2 配置dm.ini
部分dm.ini
参数修改如下:
ALTER_MODE_STATUS = 0 #Whether to permit database user to alter database mode and status by SQLs, 2: yes, 1: yes when dmwatcher ERROR, 0: no
ENABLE_OFFLINE_TS = 2 #Whether tablespace can be offline,此处为禁止备库,其他放开
MAL_INI = 1 #dmmal.ini
ARCH_INI = 1 #dmarch.ini
2.1.3 配置dmmal.ini
MAL_HOST
使用内部网络IP,MAL_PORT
与dm.ini
中PORT_NUM
需使用不同的端口值,MAL_DW_PORT
是各实例对应的守护进程之间、以及守护进程与Monitor通信的端口,其配置如下:
MAL_CHECK_INTERVAL = 5
MAL_CONN_FAIL_INTERVAL = 5
[MAL_INST1]
MAL_INST_NAME = GRP2_RWW_01
MAL_HOST = 192.168.1.5
MAL_PORT = 5336
MAL_INST_HOST = 172.16.1.5
MAL_INST_PORT = 5236
MAL_DW_PORT = 5436
MAL_INST_DW_PORT = 5536
[MAL_INST2]
MAL_INST_NAME = GRP2_RWW_02
MAL_HOST = 192.168.1.6
MAL_PORT = 5336
MAL_INST_HOST = 172.16.1.6
MAL_INST_PORT = 5236
MAL_DW_PORT = 5436
MAL_INST_DW_PORT = 5536
2.1.4 配置dmarch.ini
除本地归档外,其他归档配置项中的ARCH_DEST表示实例是Primary模式时,需要同步归档数据的目标实例名。
对于读写分离集群,其与数据守护的主要区别就在于归档类型、dm_svc.conf文件配置和接口调用方式不同两方面。两者的归档类型区别:
- 数据守护集群:实时归档(REALTIME)
- 读写分离集群:即时归档(TIMELY)
在GRP2_RWW_01
的dmarch.ini
中已有的[ARCHIVE_LOCAL1]有关配置的基础上,增加如下内容:
[ARCHIVE_TIMELY1]
ARCH_TYPE = TIMELY
ARCH_DEST = GRP2_RWW_02
2.1.5 配置dmwatcher.ini
修改dmwatcher.ini
配置守护进程,配置为GLOBAL全局守护类型,使用AUTO自动切换模式:
[GRP2]
DW_TYPE = GLOBAL
DW_MODE = AUTO
DW_ERROR_TIME = 10
INST_RECOVER_TIME = 60
INST_ERROR_TIME = 10
INST_OGUID = 45332
INST_INI = /opt/dmdbms/data/DAMENG/dm.ini
INST_AUTO_RESTART = 1
INST_STARTUP_CMD = /opt/dmdbms/bin/dmserver
RLOG_SEND_THRESHOLD = 0
RLOG_APPLY_THRESHOLD = 0
2.1.6 以mount方式启动主库 & 设置OGUID & 修改数据库模式
以mount方式启动主库:
root@46e945759cef:/opt/dmdbms/bin$ ./dmserver /opt/dmdbms/data/DAMENG/dm.ini mount
设置OGUID:
root@46e945759cef:/opt/dmdbms/bin$ ./disql SYSDBA/SYSDBA001@localhost:5236
Server[localhost:5236]:mode is normal, state is mount
login used time : 2.313(ms)
disql V8
SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);
DMSQL executed successfully
used time: 20.507(ms). Execute id is 1.
SQL> SP_SET_OGUID(45332);
DMSQL executed successfully
used time: 4.681(ms). Execute id is 2.
SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);
DMSQL executed successfully
used time: 10.770(ms). Execute id is 3.
修改数据库模式:
SQL> ALTER DATABASE PRIMARY;
executed successfully
used time: 11.298(ms). Execute id is 0.
2.2 配置容器GRP2_RWW_02
2.2.1 初始化实例
初始化实例:
docker run -d --restart=always --name=GRP2_RWW_02 --network dmwatch-test-business --ip 172.16.1.6 --privileged=true -e LD_LIBRARY_PATH=/opt/dmdbms/bin -e PAGE_SIZE=32 -e EXTENT_SIZE=32 -e LOG_SIZE=2048 -e UNICODE_FLAG=1 -e INSTANCE_NAME=GRP2_RWW_02 -v GRP2_RWW_02:/opt/dmdbms/data dm8_single:dm8_20240715_rev232765_x86_rh6_64
将其连接到heartbeat网络:
docker network connect --ip 192.168.1.6 dmwatch-test-heartbeat GRP2_RWW_02
2.2.2 从GRP2_RWW_01的备份数据文件恢复数据到GRP2_RWW_02
使用临时容器同时挂载Docker Volume GRP2_RWW_01和GRP2_RWW_02方式,将备份文件从实例GRP2_RWW_01拷贝到GRP2_RWW_02:
[root@VM-8-6-centos ~]$ docker run -it --rm -v GRP2_RWW_01:/mnt/volume1 -v GRP2_RWW_02:/mnt/volume2 alpine sh
/mnt/volume1/DAMENG/bak $ cp -r DB_DAMENG_FULL_20240914_165714_960788 /mnt/volume2/DAMENG/bak
在备库中查看,备份文件已被成功拷贝。在确保备库服务已经关闭的前提下,使用DMRMAN工具进行数据库的Restore和Recover操作。恢复过程必须使用dmdba用户执行,使用root用户会报错。有执行语句:
RMAN> RESTORE DATABASE '/opt/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/opt/dmdbms/data/DAMENG/bak/DB_DAMENG_FULL_20240914_165714_960788';
RESTORE DATABASE '/opt/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/opt/dmdbms/data/DAMENG/bak/DB_DAMENG_FULL_20240914_165714_960788';
file dm.key not found, use default license!
[Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00]
restore successfully.
time used: 00:00:02.873
RMAN> RECOVER DATABASE '/opt/dmdbms/data/DAMENG/dm.ini' UPDATE DB_MAGIC;
RECOVER DATABASE '/opt/dmdbms/data/DAMENG/dm.ini' UPDATE DB_MAGIC;
recover successfully!
time used: 00:00:01.056
可见已成功根据备份还原恢复数据。
2.2.3 配置dm.ini
dm.ini
中需要修改的数据项与GRP2_RWW_01实例中相同;
同2.1.2,将dm.ini中的部分内容作如下修改:
ALTER_MODE_STATUS = 0 #Whether to permit database user to alter database mode and status by SQLs, 2: yes, 1: yes when dmwatcher ERROR, 0: no
ENABLE_OFFLINE_TS = 2 #Whether tablespace can be offline,此处为禁止备库,其他放开
MAL_INI = 1 #dmmal.ini
ARCH_INI = 1 #dmarch.ini
2.2.4 配置 dmarch.ini
对于dmarch.ini
中需要修改的数据项,除ARCH_DEST
数据项需要做适应性更改外,其他内容保持与先前的配置相同:
[ARCHIVE_LOCAL1]
ARCH_TYPE = LOCAL
ARCH_DEST = /opt/dmdbms/data/DAMENG/arch
ARCH_FILE_SIZE = 128
ARCH_SPACE_LIMIT = 0
[ARCH_TIMELY1]
ARCH_TYPE = TIMELY
ARCH_DEST = GRP2_RWW_01
2.2.5 配置dmmal.ini
dmmal.ini与GRP2_RWW_01中对应内容保持相同:
MAL_CHECK_INTERVAL = 5
MAL_CONN_FAIL_INTERVAL = 5
[MAL_INST1]
MAL_INST_NAME = GRP2_RWW_01
MAL_HOST = 192.168.1.5
MAL_PORT = 5336
MAL_INST_HOST = 172.16.1.5
MAL_INST_PORT = 5236
MAL_DW_PORT = 5436
MAL_INST_DW_PORT = 5536
[MAL_INST2]
MAL_INST_NAME = GRP2_RWW_02
MAL_HOST = 192.168.1.6
MAL_PORT = 5336
MAL_INST_HOST = 172.16.1.6
MAL_INST_PORT = 5236
MAL_DW_PORT = 5436
MAL_INST_DW_PORT = 5536
2.2.6 配置dmwatcher.ini
dmwatcher.ini内容保持与先前配置相同:
[GRP2]
DW_TYPE = GLOBAL
DW_MODE = AUTO
DW_ERROR_TIME = 10
INST_RECOVER_TIME = 60
INST_ERROR_TIME = 10
INST_OGUID = 45332
INST_INI = /opt/dmdbms/data/DAMENG/dm.ini
INST_AUTO_RESTART = 1
INST_STARTUP_CMD = /opt/dmdbms/bin/dmserver
RLOG_SEND_THRESHOLD = 0
RLOG_APPLY_THRESHOLD = 0
2.2.7 以mount方式启动备库 & 设置OGUID & 修改数据库模式
以mount方式启动备库:
root@bfb5213c97c2:/opt/dmdbms/bin$ ./dmserver /opt/dmdbms/data/DAMENG/dm.ini mount
设置OGUID:
root@bfb5213c97c2:/opt/dmdbms/bin$ ./disql SYSDBA/SYSDBA001@localhost:5236
Server[localhost:5236]:mode is normal, state is mount
login used time : 2.365(ms)
disql V8
SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1);
DMSQL executed successfully
used time: 10.802(ms). Execute id is 1.
SQL> SP_SET_OGUID(45332);
DMSQL executed successfully
used time: 4.260(ms). Execute id is 2.
SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);
DMSQL executed successfully
used time: 10.748(ms). Execute id is 3.
修改数据库模式:
SQL> ALTER DATABASE STANDBY;
executed successfully
used time: 9.266(ms). Execute id is 0.
如果当前数据库不是 Normal 模式,需要先修改 dm.ini 中 ALTER_MODE_STATUS 值为 1,允许修改数据库模式,修改 Standby 模式成功后再改回为 0。详见官方文档 - 修改数据库模式。
2.3 配置监视器
配置非确认监视器(也称普通监视器)和确认监视器的惯例配置:
- 手动切换:集群各节点的 bin 目录中,存放非确认监视器配置文件。
- 自动切换:在确认监视器上(非集群节点),存放确认监视器配置文件,并注册后台自启服务。
由于在先前主备库的配置过程中,dmwatcher.ini
的DW_MODE
参数值为AUTO
,即自动切换模式,所以这里必须配置确认监视器而不是非确认监视器;除了与非确认监视器提供了相同的命令支持外,确认监视器能够在主库发生故障时自动通知备库接管为新的主库,具有自动故障处理的功能。
2.3.1 配置单实例监视器
运行如下命令,创建Monitor实例并同时连接到dmwatch-test-heartbeat
网络,分配子网IP192.168.1.7
,创建同名Docker Volume GRP2_RWW_03
:
docker run -d --restart=always --name=GRP2_RWW_03 --network dmwatch-test-heartbeat --ip 192.168.1.7 --privileged=true -e LD_LIBRARY_PATH=/opt/dmdbms/bin -e PAGE_SIZE=32 -e EXTENT_SIZE=32 -e LOG_SIZE=2048 -e UNICODE_FLAG=1 -e INSTANCE_NAME=GRP2_RWW_03 -v GRP2_RWW_03:/opt/dmdbms/data dm8_single:dm8_20240715_rev232765_x86_rh6_64
进入容器GRP2_RWW_03
;按照惯例,在/opt/dmdbms/data
目录下创建dmmonitor.ini
。其中 MON_DW_IP
中的 IP 和 PORT
和 dmmal.ini
中的 MAL_HOST
和 MAL_DW_PORT
配置项保持一致:
MON_DW_CONFIRM = 1
MON_LOG_PATH = /opt/dmdbms/data/log
MON_LOG_INTERVAL = 60
MON_LOG_FILE_SIZE = 32
MON_LOG_SPACE_LIMIT = 0
[GRP2]
MON_INST_OGUID = 45332
MON_DW_IP = 192.168.1.5:5436
MON_DW_IP = 192.168.1.6:5436
2.3.2 配置多实例监视器
2.3.3 启动读写分离集群
2.3.3.1 启动主备库守护进程
在主库、备库上,使用如下命令分别开启守护进程:
./dmwatcher /opt/dmdbms/data/DAMENG/dmwatcher.ini
2.3.3.2 启动监视器
在Monitor实例上启动监视器:
root@ba6b0171922b:/opt/dmdbms/bin# ./dmmonitor /opt/dmdbms/data/dmmonitor.ini
[monitor] 2024-09-18 15:32:04: DMMONITOR[4.0] V8
[monitor] 2024-09-18 15:32:05: DMMONITOR[4.0] IS READY.
[monitor] 2024-09-18 15:32:05:
#-----------------------------------------------------------------------------------------------#
GET MONITOR CONNECT INFO FROM DMWATCHER(GRP2_RWW_02), THE FIRST LINE IS SELF INFO.
DW_CONN_TIME MON_CONFIRM MID MON_IP MON_VERSION
2024-09-18 15:32:05 TRUE 1316047694 ::ffff:192.168.1.7 DMMONITOR[4.0] V8
#-----------------------------------------------------------------------------------------------#
[monitor] 2024-09-18 15:32:05: Received message from(GRP2_RWW_02)
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2024-09-18 15:32:05 OPEN OK GRP2_RWW_02 OPEN STANDBY NULL 3 43564 43564
[monitor] 2024-09-18 15:32:06: Received message from(GRP2_RWW_01)
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2024-09-18 15:32:06 OPEN OK GRP2_RWW_01 OPEN PRIMARY VALID 3 43564 43564
使用show监视器命令查看所有实例组信息:
show
2024-09-18 15:32:55
#================================================================================#
GROUP OGUID MON_CONFIRM MODE MPP_FLAG
GRP2 45332 TRUE AUTO FALSE
<<DATABASE GLOBAL INFO:>>
DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
192.168.1.5 5436 2024-09-18 15:32:55 GLOBAL VALID OPEN GRP2_RWW_01 OK 1 1 OPEN PRIMARY DSC_OPEN TIMELY VALID
EP INFO:
INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
172.16.1.5 5236 OK GRP2_RWW_01 OPEN PRIMARY 0 0 TIMELY VALID 3183 43564 3183 43564 NONE
<<DATABASE GLOBAL INFO:>>
DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
192.168.1.6 5436 2024-09-18 15:32:55 GLOBAL VALID OPEN GRP2_RWW_02 OK 1 1 OPEN STANDBY DSC_OPEN TIMELY VALID
EP INFO:
INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
172.16.1.6 5236 OK GRP2_RWW_02 OPEN STANDBY 0 0 TIMELY VALID 3183 43564 3183 43564 NONE
DATABASE(GRP2_RWW_02) APPLY INFO FROM (GRP2_RWW_01), REDOS_PARALLEL_NUM (1), WAIT_APPLY[TRUE]:
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[3183, 3183, 3183], (RLSN, SLSN, KLSN)[43564, 43564, 43564], N_TSK[0], TSK_MEM_USE[0]
REDO_LSN_ARR: (43564)
使用tip命令查看守护进程配置信息:
tip
[monitor] 2024-09-18 15:33:11: Instance GRP2_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE] cannot join other instances, dmwatcher status is OPEN, SYSOPENHISTORY status is VALID
[monitor] 2024-09-18 15:33:11: Instance GRP2_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently
[monitor] 2024-09-18 15:33:11: Instance GRP2_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE] is OK, dmwatcher status is OPEN, dw_type is GLOBAL
[monitor] 2024-09-18 15:33:11: Instance GRP2_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE] can join instance GRP2_RWW_01[PRIMARY, OPEN, ISTAT_SAME:TRUE]
[monitor] 2024-09-18 15:33:11: Instance GRP2_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently
[monitor] 2024-09-18 15:33:11: Instance GRP2_RWW_02[STANDBY, OPEN, ISTAT_SAME:TRUE] is OK, dmwatcher status is OPEN, dw_type is GLOBAL
[monitor] 2024-09-18 15:33:11: Group(GRP2) current active instances are OK
[monitor] 2024-09-18 15:33:11: All groups' current active instances are OK!
3 dm_svc.conf 配置
此部分较简单,略。详细信息可见“数据库规范化部署 - 读写分离集群安装部署”或查阅有关资料。
社区地址:https://eco.dameng.com