Dynamically Extending Nodes in a DSC Cluster
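Environment preparation
On the new node, add the dmdba user, install the database software, and configure the system parameters. If you are testing in virtual machines, also configure VM shared disks.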
[root@localhost dmsetup]# mkdir -p /opt/dsc/setup /opt/dsc/dmdbms /opt/dsc/config /opt/dsc/arch0 /opt/dsc/arch0_remote /opt/dsc/bak
[root@localhost dmsetup]# chown -R dmdba:dinstall /opt/dsc; chmod -R 777 /opt/dsc
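The directory layout can be scripted so that it is created identically on every node. The following is a minimal sketch for a POSIX shell; the function name `make_dsc_dirs` and its optional base-directory argument are illustrative and not part of the DM tooling.

```shell
# Hypothetical helper: create the DSC directory layout under a base directory.
# The sub-directory names come from the mkdir command in the text;
# the base directory defaults to /opt/dsc.
make_dsc_dirs() {
    base="${1:-/opt/dsc}"
    for d in setup dmdbms config arch0 arch0_remote bak; do
        mkdir -p "$base/$d" || return 1
    done
}

# Example (as root), followed by the ownership change from the text:
#   make_dsc_dirs /opt/dsc && chown -R dmdba:dinstall /opt/dsc
```

Passing a different base directory makes it possible to try the layout in a scratch location before touching /opt/dsc.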
Before extending, check the state of the DSC cluster: every node in all three groups (CSS, ASM, DB) must be in a normal state.
Procedure
Add log files
Export the cluster's dmdcr_cfg.ini file:
[dmdba@localhost ~]$ /opt/dsc/dmdbms/bin/dmasmcmd
DMASMCMD V8
ASM>export dcrdisk '/dev/raw/raw1' to '/opt/dsc/dmdcr_cfg_bak.ini'
ASMCMD export DCRDISK success.
Used time: 1.863(ms).
Before adding log files, log in to dmasmtool and check the path of the existing log files:
[dmdba@localhost ~]$ /opt/dsc/dmdbms/bin/dmasmtool DCR_INI=/opt/dsc/config/dmdcr.ini
DMASMTOOL V8
ASM>ls
+
disk groups total [4]......
NO.1 name: DMLOG
NO.2 name: DMDATA
NO.3 name: VOTE
NO.4 name: DCR
Used time: 1.516(ms).
ASM>ls +DMLOG
file : DAMENG0_01.log
file : DAMENG0_02.log
file : DAMENG1_01.log
file : DAMENG1_02.log
total count 4.
Used time: 4.761(ms).
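The log path is +DMLOG. Log in to any node with disql and add at least two RLOG files for the new node, with the same size as the log files of the existing nodes.
[dmdba@localhost ~]$ /opt/dsc/dmdbms/bin/disql SYSDBA/'"Dameng@1234"':15236
Server[LOCALHOST:15236]:mode is normal, state is open
login used time : 2.841(ms)
disql V8
SQL> alter database add node logfile '+DMLOG/DAMENG2_01.log' size 256, '+DMLOG/DAMENG2_02.log' size 256;
executed successfully
used time: 363.206(ms). Execute id is 0.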
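Because the file names follow the node number, the statement can be generated instead of typed by hand. This is a small sketch; `gen_add_logfile_sql` is a hypothetical helper, and the `+DMLOG` disk group and the `DAMENG<n>_0<k>.log` naming pattern are taken from the listing in this section, so adjust them if your cluster differs.

```shell
# Hypothetical helper: print the ALTER DATABASE statement that adds two redo
# log files for a new node.
# Arguments: node number, log file size (defaults to 256 as in the text).
gen_add_logfile_sql() {
    node="$1"
    size="${2:-256}"
    printf "alter database add node logfile '+DMLOG/DAMENG%s_01.log' size %s, '+DMLOG/DAMENG%s_02.log' size %s;\n" \
        "$node" "$size" "$node" "$size"
}

# Print the statement for node 2; paste it into disql after reviewing it.
gen_add_logfile_sql 2
```

The helper only prints text. Nothing is sent to the database, so the statement can be reviewed before it is run.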
Verify that dm.ctl now contains the new RLOG entries:
[dmdba@localhost ~]$ /opt/dsc/dmdbms/bin/dmctlcvt type=1 src=+DMDATA/data/DAMENG/dm.ctl dest=/opt/dsc/dmctl.txt dcr_ini=/opt/dsc/config/dmdcr.ini
DMCTLCVT V8
convert ctl to txt success!
[dmdba@localhost ~]$ vi /opt/dsc/dmctl.txt
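Alternatively, log in to the ASM file system with dmasmtool and list +DMLOG to confirm that the new log files exist.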
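Instead of opening the converted control file in an editor, the check can be done with a fixed-string search. The sketch below assumes only that the new log file names appear somewhere in the text produced by dmctlcvt; `ctl_has_logs` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if every given name occurs in the file.
# A fixed-string grep is used, so no assumption is made about the layout
# of the converted control file.
ctl_has_logs() {
    file="$1"
    shift
    for name in "$@"; do
        grep -qF -- "$name" "$file" || { echo "missing: $name" >&2; return 1; }
    done
}

# Example:
#   ctl_has_logs /opt/dsc/dmctl.txt DAMENG2_01.log DAMENG2_02.log && echo "RLOG entries present"
```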
Copy the configuration directory of node 0 to the new node:
[dmdba@localhost config]$ scp -r dsc0_config dmdba@192.168.150.132:/opt/dsc/config/dsc2_config
[dmdba@localhost ~]$ ll /opt/dsc/config/dsc2_config/
total 96
-rw-r--r--. 1 dmdba dinstall 659 Apr 27 15:13 dmarch.ini
-rw-r--r--. 1 dmdba dinstall 67064 Apr 27 15:13 dm.ini
-rw-r--r--. 1 dmdba dinstall 1005 Apr 27 15:13 dminit20230426113537.log
-rw-r--r--. 1 dmdba dinstall 1154 Apr 27 15:13 dminit20230426113923.log
-rw-r--r--. 1 dmdba dinstall 1154 Apr 27 15:13 dminit20230426115341.log
-rw-r--r--. 1 dmdba dinstall 1154 Apr 27 15:13 dminit20230426123005.log
-rw-r--r--. 1 dmdba dinstall 210 Apr 27 15:13 dmmal.ini
-rw-r--r--. 1 dmdba dinstall 686 Apr 27 15:13 sqllog.ini
drwxr-xr-x. 2 dmdba dinstall 6 Apr 27 15:13 trace
Configure dm.ini
In the copied dm.ini, change CONFIG_PATH and INSTANCE_NAME:
CONFIG_PATH = /opt/dsc/config/dsc2_config
INSTANCE_NAME = DSC2
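The two parameters can also be changed non-interactively, which helps when the procedure is repeated for several nodes. The sketch below is one possible approach using awk; `set_ini_param` is a hypothetical helper, and it assumes the key contains no regular-expression metacharacters. Any trailing comment on the replaced line is dropped.

```shell
# Hypothetical helper: set KEY = VALUE in an ini-style file.
# An existing assignment is replaced; if the key is absent it is appended.
set_ini_param() {
    file="$1"; key="$2"; value="$3"
    tmp="$file.tmp.$$"
    awk -v k="$key" -v v="$value" '
        # Match "KEY = ..." with optional blanks before the key and the equals sign.
        $0 ~ ("^[ \t]*" k "[ \t]*=") { print k " = " v; found = 1; next }
        { print }
        END { if (!found) print k " = " v }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example for the new node:
#   set_ini_param /opt/dsc/config/dsc2_config/dm.ini CONFIG_PATH /opt/dsc/config/dsc2_config
#   set_ini_param /opt/dsc/config/dsc2_config/dm.ini INSTANCE_NAME DSC2
```

The file is rewritten through a temporary copy, so check ownership and permissions afterwards if they matter in your environment.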
Configure dmdcr.ini
Create a new dmdcr.ini file (make sure to change DMDCR_SEQNO):
[dmdba@localhost dsc2_config]$ vi /opt/dsc/config/dmdcr.ini
DMDCR_PATH = /dev/raw/raw1
DMDCR_MAL_PATH = /opt/dsc/config/dmasvrmal.ini
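DMDCR_SEQNO = 2 # sequence number of this node (used to obtain ASM login information)
DMDCR_AUTO_OPEN_CHECK = 90 # CSS kicks out a node that has not started within this time, in seconds
DMDCR_ASM_TRACE_LEVEL = 2 # log level WARN
#DMDCR_ASM_RESTART_INTERVAL = 30 # interval after which CSS restarts an ASM instance it considers failed
#DMDCR_ASM_STARTUP_CMD = /opt/dsc/dmdbms/bin/DmASMSvrServiceASM start
#DMDCR_DB_RESTART_INTERVAL = 60 # interval after which CSS restarts a DSC instance it considers failed
#DMDCR_DB_STARTUP_CMD = /opt/dsc/dmdbms/bin/DmServiceDSC start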
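A wrong DMDCR_SEQNO is easy to miss when dmdcr.ini is copied from another node, so it is worth reading the value back before starting any service. The following sketch uses a hypothetical helper, `get_ini_param`, which ignores commented-out lines and trailing `#` comments.

```shell
# Hypothetical helper: print the value of KEY from an ini-style file.
# Lines that are entirely commented out are skipped; trailing comments are removed.
get_ini_param() {
    awk -v k="$2" '
        {
            line = $0
            sub(/#.*/, "", line)                 # strip a trailing comment
            n = index(line, "=")
            if (n == 0) next                     # not an assignment
            key = substr(line, 1, n - 1)
            val = substr(line, n + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)     # trim blanks
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == k) { print val; exit }
        }
    ' "$1"
}

# Example on the new node (expected sequence number 2):
#   [ "$(get_ini_param /opt/dsc/config/dmdcr.ini DMDCR_SEQNO)" = "2" ] || echo "unexpected DMDCR_SEQNO"
```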
Configure dmasvrmal.ini
Add the new node's entry to the dmasvrmal.ini file of the existing nodes, then copy the modified file to /opt/dsc/config on the new node.
After the change:
[dmdba@localhost dsc2_config]$ vi /opt/dsc/config/dmasvrmal.ini
[MAL_INST1]
MAL_INST_NAME = ASM0 # ASM node name
MAL_HOST = 192.168.150.130 # heartbeat IP
MAL_PORT = 5636 # MAL listening port
[MAL_INST2]
MAL_INST_NAME = ASM1
MAL_HOST = 192.168.150.131
MAL_PORT = 5637
[MAL_INST3]
MAL_INST_NAME = ASM2
MAL_HOST = 192.168.150.132
MAL_PORT = 5638
Configure dmmal.ini
Edit the dmmal.ini configuration file on all three nodes so that it contains:
[dmdba@localhost config]$ vi /opt/dsc/config/dsc0_config/dmmal.ini
[mal_inst0]
mal_inst_name = DSC0
mal_host = 192.168.150.130
mal_port = 5736
[mal_inst1]
mal_inst_name = DSC1
mal_host = 192.168.150.131
mal_port = 5737
[mal_inst2]
mal_inst_name = DSC2
mal_host = 192.168.150.132
mal_port = 5738
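Both MAL files (dmasvrmal.ini and dmmal.ini) share the same section layout, so a single check can catch a copy-and-paste mistake such as a repeated instance name or a repeated host and port pair. The sketch below is an illustration only; `check_mal_ini` is a hypothetical helper that matches the three key names case-insensitively, as used in the two listings.

```shell
# Hypothetical helper: print "name host:port" for each section of a MAL ini file
# and return non-zero if an instance name or a host:port pair occurs twice.
check_mal_ini() {
    awk '
        function emit(   ep) {
            if (name == "") return
            ep = host ":" port
            if (seen_name[name]++ || seen_ep[ep]++) {
                print "duplicate: " name " " ep
                bad = 1
            } else {
                print name " " ep
            }
            name = host = port = ""
        }
        { sub(/#.*/, "") }                       # drop comments
        /^[ \t]*\[/ { emit(); next }             # a new section starts
        {
            n = index($0, "=")
            if (n == 0) next
            k = toupper(substr($0, 1, n - 1))
            v = substr($0, n + 1)
            gsub(/[ \t]/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == "MAL_INST_NAME") name = v
            else if (k == "MAL_HOST") host = v
            else if (k == "MAL_PORT") port = v
        }
        END { emit(); exit (bad ? 1 : 0) }
    ' "$1"
}

# Example:
#   check_mal_ini /opt/dsc/config/dmasvrmal.ini
#   check_mal_ini /opt/dsc/config/dsc2_config/dmmal.ini
```

Running the check on every node and comparing the printed lines is a quick way to confirm that all nodes carry the same MAL definitions.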
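Configure dmdcr_cfg.ini
Edit the exported dmdcr_cfg_bak.ini and add the new node's information.
Example of the modified dmdcr_cfg_bak.ini. The listing stops after the ASM1 entry; the ASM2 entry and the [GRP_DSC] entries are not reproduced here.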
# the file is auto-created by system, self edit is invalid!
#DCR HDR
DCR_N_GRP = 3
DCR_VTD_PATH = /dev/raw/raw2
DCR_OGUID = 45331
[GRP]
DCR_GRP_TYPE = CSS
DCR_GRP_NAME = GRP_CSS
DCR_GRP_N_EP = 3
DCR_GRP_EP_ARR = {0,1,2}
DCR_GRP_N_ERR_EP = 0
DCR_GRP_ERR_EP_ARR = {}
DCR_GRP_DSKCHK_CNT = 60
[GRP]
DCR_GRP_TYPE = ASM
DCR_GRP_NAME = GRP_ASM
DCR_GRP_N_EP = 3
DCR_GRP_EP_ARR = {0,1,2}
DCR_GRP_N_ERR_EP = 0
DCR_GRP_ERR_EP_ARR = {}
DCR_GRP_DSKCHK_CNT = 60
[GRP]
DCR_GRP_TYPE = DB
DCR_GRP_NAME = GRP_DSC
DCR_GRP_N_EP = 3
DCR_GRP_EP_ARR = {0,1,2}
DCR_GRP_N_ERR_EP = 0
DCR_GRP_ERR_EP_ARR = {}
DCR_GRP_DSKCHK_CNT = 60
[GRP_CSS]
DCR_EP_NAME = CSS0
DCR_EP_HOST = 192.168.150.130
DCR_EP_PORT = 5336
[GRP_CSS]
DCR_EP_NAME = CSS1
DCR_EP_HOST = 192.168.150.131
DCR_EP_PORT = 5337
[GRP_CSS]
DCR_EP_NAME = CSS2
DCR_EP_HOST = 192.168.150.132
DCR_EP_PORT = 5338
[GRP_ASM]
DCR_EP_NAME = ASM0
DCR_EP_SHM_KEY = 93360
DCR_EP_SHM_SIZE = 100
DCR_EP_HOST = 192.168.150.130
DCR_EP_PORT = 5436
DCR_EP_ASM_LOAD_PATH = /dev/raw
[GRP_ASM]
DCR_EP_NAME = ASM1
DCR_EP_SHM_KEY = 93361
DCR_EP_SHM_SIZE = 100
DCR_EP_HOST = 192.168.150.131
DCR_EP_PORT = 5437
DCR_EP_ASM_LOAD_PATH = /dev/raw
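When a node is added by hand, DCR_GRP_N_EP and DCR_GRP_EP_ARR must be changed together in each [GRP] block. A mismatch can be detected before the file is written back. The sketch below is a hypothetical check (`check_grp_counts` is not a DM tool) and assumes that DCR_GRP_N_EP precedes DCR_GRP_EP_ARR within each group, as in the listing.

```shell
# Hypothetical helper: for every group, compare DCR_GRP_N_EP with the number of
# elements in DCR_GRP_EP_ARR. Returns non-zero and prints a message on a mismatch.
check_grp_counts() {
    awk '
        { sub(/#.*/, "") }                       # drop comments
        /^[ \t]*\[/ { next }                     # section headers carry no values
        {
            n = index($0, "=")
            if (n == 0) next
            k = substr($0, 1, n - 1)
            v = substr($0, n + 1)
            gsub(/[ \t]/, "", k)
            gsub(/[ \t]/, "", v)
            if (k == "DCR_GRP_NAME") grp = v
            else if (k == "DCR_GRP_N_EP") n_ep = v + 0
            else if (k == "DCR_GRP_EP_ARR") {
                gsub(/[{}]/, "", v)              # "{0,1,2}" becomes "0,1,2"
                cnt = (v == "") ? 0 : split(v, parts, ",")
                if (cnt != n_ep) {
                    print grp ": DCR_GRP_N_EP=" n_ep " but DCR_GRP_EP_ARR has " cnt " entries"
                    bad = 1
                }
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example:
#   check_grp_counts /opt/dsc/dmdcr_cfg_bak.ini && echo "group counts consistent"
```

The check covers only the [GRP] headers. It does not verify that each group also has one endpoint section per member.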
Write the modified configuration file back to the DCR disk:
[dmdba@localhost dsc]$ /opt/dsc/dmdbms/bin/dmasmcmd
DMASMCMD V8
ASM>extend dcrdisk '/dev/raw/raw1' from '/opt/dsc/dmdcr_cfg_bak.ini'
ASMCMD extend node for dcr disk success.
ASMCMD extend node for vote disk success.
Used time: 45.934(ms).
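Extend the node
Enter the dmcssm monitor and run the extend command:
extend node
[monitor] 2023-04-27 15:42:50: Performing the extend-node action
[monitor] 2023-04-27 15:42:52: Notifying the currently active CSS to perform cleanup
[monitor] 2023-04-27 15:42:53: Cleanup request for CSS(0) succeeded
[monitor] 2023-04-27 15:42:54: Cleanup request for CSS(1) succeeded
[monitor] 2023-04-27 15:42:54: Command EXTENT NODE executed successfully
Check with the show command: at this point the new node is reported in the UNKNOW, ERROR state.
Start the CSS, ASM, and dmserver services
On the new machine, start the CSS, ASM, and dmserver services: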
[dmdba@localhost dsc]$ cd /opt/dsc/dmdbms/bin
[dmdba@localhost bin]$ ./dmcss DCR_INI=/opt/dsc/config/dmdcr.ini
DMCSS V8
DMCSS IS READY
[dmdba@localhost ~]$ cd /opt/dsc/dmdbms/bin/
[dmdba@localhost bin]$ ./dmasmsvr DCR_INI=/opt/dsc/config/dmdcr.ini
ASM SELF EPNO:2
[dmdba@localhost ~]$ cd /opt/dsc/dmdbms/bin
[dmdba@localhost bin]$ ./dmserver /opt/dsc/config/dsc2_config/dm.ini dcr_ini=/opt/dsc/config/dmdcr.ini
Update dmcssm.ini
Edit dmcssm.ini so that it contains:
CSSM_OGUID = 45331 # message identifier
CSSM_CSS_IP = 192.168.150.130:5336
CSSM_CSS_IP = 192.168.150.131:5337
CSSM_CSS_IP = 192.168.150.132:5338
CSSM_LOG_PATH = ../log
CSSM_LOG_FILE_SIZE = 512 # size of a single log file, in MB
CSSM_LOG_SPACE_LIMIT = 2048 # upper limit of total log space, in MB
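The CSSM_CSS_IP lines must list the same host and port pairs as the [GRP_CSS] entries in dmdcr_cfg.ini, so they can be derived from the exported file instead of being typed again. The sketch below is one way to do that; `gen_cssm_css_ip` is a hypothetical helper and assumes DCR_EP_HOST appears before DCR_EP_PORT in each [GRP_CSS] section, as in the exported listing.

```shell
# Hypothetical helper: print one "CSSM_CSS_IP = host:port" line for every
# [GRP_CSS] section found in a dmdcr_cfg.ini-style file.
gen_cssm_css_ip() {
    awk '
        { sub(/#.*/, "") }                                        # drop comments
        /^[ \t]*\[/ { in_css = ($0 ~ /^[ \t]*\[GRP_CSS\]/); next }  # track the current section
        in_css {
            n = index($0, "=")
            if (n == 0) next
            k = substr($0, 1, n - 1)
            v = substr($0, n + 1)
            gsub(/[ \t]/, "", k)
            gsub(/[ \t]/, "", v)
            if (k == "DCR_EP_HOST") host = v
            else if (k == "DCR_EP_PORT") print "CSSM_CSS_IP = " host ":" v
        }
    ' "$1"
}

# Example:
#   gen_cssm_css_ip /opt/dsc/dmdcr_cfg_bak.ini
```

Comparing the printed lines with the CSSM_CSS_IP entries in dmcssm.ini shows at a glance whether the monitor knows about the new CSS.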
After the instance services are started, CSS coordinates the new node joining the cluster:
[CSS0] [2023-04-27 16:18:40:962] [DB]: Set EP DSC2[2] as a failed EP rejoining the cluster
[CSS0] [2023-04-27 16:18:40:963] [DB]: Set command [START NOTIFY], target site DSC2[2], command sequence number [29]
[CSS0] [2023-04-27 16:18:41:972] [DB]: Set command [SUSPEND EP WORKER THREAD], target site DSC0[0], command sequence number [30]
[CSS0] [2023-04-27 16:18:41:972] [DB]: Set command [SUSPEND EP WORKER THREAD], target site DSC1[1], command sequence number [31]
[CSS0] [2023-04-27 16:18:42:487] [DB]: Suspending worker threads finished
[CSS0] [2023-04-27 16:18:42:496] [DB]: Set command [DCR_LOAD], target site DSC0[0], command sequence number [32]
[CSS0] [2023-04-27 16:18:42:503] [DB]: Set command [DCR_LOAD], target site DSC1[1], command sequence number [33]
[CSS0] [2023-04-27 16:18:42:511] [DB]: Set command [DCR_LOAD], target site DSC2[2], command sequence number [34]
[CSS0] [2023-04-27 16:18:43:432] [DB]: Failed EP rejoining DSC finished
[CSS0] [2023-04-27 16:18:43:432] [DB]: Set command [ERROR EP ADD], target site DSC0[0], command sequence number [36]
[CSS0] [2023-04-27 16:18:43:442] [DB]: Set command [ERROR EP ADD], target site DSC1[1], command sequence number [37]
[CSS0] [2023-04-27 16:18:43:451] [DB]: Set command [ERROR EP ADD], target site DSC2[2], command sequence number [38]
[CSS0] [2023-04-27 16:18:43:563] [DB]: Failed EP rejoining DSC finished
[CSS0] [2023-04-27 16:18:43:563] [DB]: Set command [EP RECV], target site DSC0[0], command sequence number [40]
[CSS0] [2023-04-27 16:18:43:572] [DB]: Set command [EP RECV], target site DSC1[1], command sequence number [41]
[CSS0] [2023-04-27 16:18:43:683] [DB]: Failed EP recovery finished
[CSS0] [2023-04-27 16:18:43:684] [DB]: Set command [EP START], target site DSC2[2], command sequence number [43]
[CSS0] [2023-04-27 16:18:43:794] [DB]: Set command [EP START2], target site DSC2[2], command sequence number [45]
[CSS0] [2023-04-27 16:18:44:502] [DB]: Set command [EP OPEN], target site DSC2[2], command sequence number [47]
[CSS0] [2023-04-27 16:18:45:511] [DB]: Set command [NONE], target site DSC2[2], command sequence number [0]
[CSS0] [2023-04-27 16:18:45:517] [DB]: Set command [RESUME EP WORKER THREAD], target site DSC0[0], command sequence number [49]
[CSS0] [2023-04-27 16:18:45:526] [DB]: Set command [RESUME EP WORKER THREAD], target site DSC1[1], command sequence number [50]
[CSS0] [2023-04-27 16:18:45:638] [DB]: Resuming worker threads finished
[CSS0] [2023-04-27 16:18:45:638] [DB]: Set command [NONE], target site DSC0[0], command sequence number [0]
[CSS0] [2023-04-27 16:18:45:648] [DB]: Set command [NONE], target site DSC1[1], command sequence number [0]
[CSS0] [2023-04-27 16:18:45:657] [DB]: Set command [EP REAL OPEN], target site DSC2[2], command sequence number [52]
[CSS0] [2023-04-27 16:18:46:576] [DB]: Set command [NONE], target site DSC2[2], command sequence number [0]
Check the monitor: all nodes are in a normal state.
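If the CSS output has been saved to a file, the final step of the join sequence can be confirmed with a search for the EP REAL OPEN command. This is a sketch under two assumptions: each message occupies a single line in the saved file, and the file path is whatever you chose when capturing the output. `node_real_opened` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the log contains an [EP REAL OPEN] command
# addressed to the given endpoint, for example "DSC2[2]".
# Both patterns are fixed strings, so the brackets are matched literally.
node_real_opened() {
    grep -F -- "[EP REAL OPEN]" "$1" | grep -qF -- "$2"
}

# Example (the log path is a placeholder):
#   node_real_opened /path/to/css_output.log "DSC2[2]" && echo "DSC2 reached REAL OPEN"
```

The search confirms only that the command was issued. The monitor's show output remains the authoritative view of node state.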