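1. Pre-installation preparation
- Prepare the system environment
- Install the DM database on both machines
- Make sure both databases use the same character set
- Disable the firewall on both hosts and upload the DMHS installation package
Database version | IP | Hostname | Instance name | Port | Machine |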
DM Database Server 64 V8 | 192.168.135.1 | DM1 | DMSERVER | 5236 | A |
DM Database Server 64 V8 | 192.168.135.2 | DM2 | DMSERVER | 5236 | B |
Install DMHS (source and target):
(1) Make the installer executable:
[dmdba@DM1 ~]$ mv dmhs_V4.1.48_pack4_dm8_rev104804_rh6_64_veri_20211228.bin /home/dmdba
[dmdba@DM1 ~]$ chmod +x dmhs_V4.1.48_pack4_dm8_rev104804_rh6_64_veri_20211228.bin
- Check that /tmp has at least 1 GB of space; if not, enlarge it as root:
mount -o remount,size=2G /tmp
- Run the DMHS installer:
[dmdba@DM1 ~]$ ./dmhs_V4.1.48_pack4_dm8_rev104804_rh6_64_veri_20211228.bin -i
DMHS is now installed. In this environment the installation path is /home/dmdba/dmhs. It must be installed on both the source VM and the target VM.
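The /tmp space check from the installation steps can also be scripted. The following is a minimal Python sketch, assuming the 1 GB threshold mentioned above; the function name has_enough_space is hypothetical and not part of DMHS.

```python
import shutil

# 1 GB, the minimum /tmp size mentioned in the installation steps.
REQUIRED_BYTES = 1 * 1024 ** 3


def has_enough_space(path: str, required: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem holding `path` has at least `required` free bytes."""
    return shutil.disk_usage(path).free >= required
```

Calling has_enough_space("/tmp") before running the installer tells you whether the remount step is needed.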
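1.2 Database parameter checks (source and target)
1.2.1 Archive parameter check
The DMHS source database must run in archive mode. Check whether archiving is enabled: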
SQL> select arch_mode from v$database;
LINEID ARCH_MODE
---------- ---------
1 Y
used time: 3.236(ms). Execute id is 123.
If ARCH_MODE is N, archiving is not enabled. Enable it as follows:
SQL> alter database mount;
SQL> alter database add archivelog 'DEST=/dm8/dmarch,TYPE=LOCAL,FILE_SIZE=256,SPACE_LIMIT=0';
SQL> alter database archivelog;
SQL> alter database open;
1.2.2 Supplemental logging parameter
Run the following statements in disql:
SP_SET_PARA_VALUE(1,'RLOG_APPEND_LOGIC',1);
select para_value from v$dm_ini where para_name in ('RLOG_APPEND_LOGIC');
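RLOG_APPEND_LOGIC set to 1 means supplemental logging is enabled.
Restart the database service on both the source and the target.
1.2.3 Configure the DDL auxiliary tables and triggers
After DMHS is installed, the script "ddl_sql_dm8.sql" is located in the script subdirectory of the installation directory. Run it in the DM management tool; running it in disql produces errors.
After it completes, check that the auxiliary tables were created with the following SQL: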
select owner, table_name from dba_tables where owner = 'SYSDBA' and table_name like 'DMHS%' and status = 'VALID';
Check that the triggers were created with the following SQL:
select owner, trigger_name from dba_triggers where owner = 'SYSDBA' and trigger_name like 'DMHS%' and status = 'Y';
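On success there are 9 DMHS tables and 4 triggers.
Run the script on the target as well.
1.3 Prepare the database user that DMHS connects with (source and target)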
CREATE TABLESPACE HSEXEC DATAFILE 'HSEXEC.DBF' size 128;
CREATE USER HSEXEC IDENTIFIED by "111111111" DEFAULT TABLESPACE HSEXEC DEFAULT INDEX TABLESPACE HSEXEC;
GRANT VTI TO HSEXEC;
GRANT PUBLIC TO HSEXEC;
GRANT RESOURCE TO HSEXEC;
GRANT DBA TO HSEXEC;
1.4 Prepare the test schemas and objects:
Create the sync objects on the source (for bidirectional sync):
create tablespace liu datafile '/dm8/dmdata/DAMENG/liu.dbf' size 256;
create user liu identified by "liu123456" default tablespace liu default index tablespace liu;
GRANT VTI TO LIU;
GRANT PUBLIC TO LIU;
GRANT RESOURCE TO LIU;
GRANT DBA TO LIU;
create table liu.test1 (id int,ename varchar(10));
insert into liu.test1 values (1,'liu');
insert into liu.test1 values (2,'you');
insert into liu.test1 values (3,'rui');
commit;
Create the sync objects on the source (for unidirectional sync):
create table liu.test_insert(id int,name varchar(100),addr varchar(200));
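declare
  sqlstr varchar;
begin
  -- 100 iterations, each running one multi-row INSERT of 10 rows (1000 rows in total)
  for i in 1..100 loop
    sqlstr:='insert into liu.test_insert values ';
    for j in 1..10 loop
      sqlstr:=sqlstr || '(' ||j|| ',''test'',''testestewt''),';
    end loop;
    -- strip the trailing comma before executing the statement
    sqlstr:=rtrim(sqlstr,',');
    execute immediate sqlstr;
  end loop;
  commit;
end;
/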
Create the receiving objects on the target (for bidirectional sync):
create tablespace liu1 datafile 'liu1.dbf' size 256;
create user liu1 identified by "liu123456" default tablespace liu1 default index tablespace liu1;
GRANT VTI TO LIU1;
GRANT PUBLIC TO LIU1;
GRANT RESOURCE TO LIU1;
GRANT DBA TO LIU1;
Create the receiving objects on the target (for unidirectional sync):
create tablespace liu2 datafile 'liu2.dbf' size 256;
create user liu2 identified by "liu123456" default tablespace liu2 default index tablespace liu2;
GRANT VTI TO liu2;
GRANT PUBLIC TO liu2;
GRANT RESOURCE TO liu2;
GRANT DBA TO liu2;
2. Deployment
2.1 Source deployment
2.1.1 Configure the DMHS service on the source
cd /home/dmdba/dmhs/bin
cp TemplateDmhsService DmhsService
vim DmhsService
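2.1.2 What the dmhs.hs configuration file does
base module:
Configures the local DMHS management service. Note that siteid must be unique and must differ from the value configured on node 2.
cpt module:
Configures the local data capture service.
1) The data to capture lives in the DM8 database on node 1, so the first part holds that database's connection information.
2) ddl_mask is the function mask for DDL operations. In principle a bidirectional setup has to synchronize every DDL operation, so the mask here lists all object types.
3) send configures delivery to the target. It is a container element whose child elements must be configured in turn: the connection information of the target DMHS service and the object selection element filter, which supports a whitelist (enable) and a blacklist (disable). This walkthrough synchronizes specific objects only, so a whitelist is configured. Because the target schema names differ from the source schema name, map rules are also configured to route each object to the intended schema on the target.
exec module:
Configures the local service that receives and applies data. From node 2's point of view, the local node is its target: the data collected and sent by the cpt module of node 2's DMHS service must be applied to the DM8 database on local node 1. The section therefore mainly holds the connection information for the local node 1 database plus the settings for the execution worker processes. exec_policy=2 means that when a transaction fails to apply, the failing operation is skipped and execution continues; ddl_continue=1 means that when a DDL operation fails to synchronize, processing continues.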
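To make the relationship between siteid, the filter whitelist and the map rules concrete, here is a minimal Python sketch that reads a trimmed-down configuration in the same shape as dmhs.hs and lists whitelisted objects that have no map rule. The helper name summarize and the SAMPLE text are illustrative assumptions, not part of DMHS, and the sketch does not validate the file the way DMHS itself does.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the shape of the dmhs.hs file used in this guide.
SAMPLE = """
<dmhs>
  <base><siteid>11</siteid></base>
  <cpt>
    <send>
      <filter>
        <enable>
          <item>LIU.TEST1</item>
          <item>LIU.TEST_INSERT</item>
        </enable>
      </filter>
      <map>
        <item>LIU.TEST1==LIU1.TEST1</item>
        <item>LIU.TEST_INSERT==LIU2.TEST_INSERT</item>
      </map>
    </send>
  </cpt>
</dmhs>
"""


def summarize(xml_text: str) -> dict:
    """Extract siteid, the whitelist and the map rules from the cpt/send section."""
    root = ET.fromstring(xml_text.strip())
    send = root.find("./cpt/send")
    enabled = [item.text.strip() for item in send.findall("./filter/enable/item")]
    mapping = {}
    for item in send.findall("./map/item"):
        source_name, target_name = item.text.strip().split("==")
        mapping[source_name] = target_name
    # Whitelisted objects without an explicit map rule.
    unmapped = [name for name in enabled if name not in mapping]
    return {
        "siteid": int(root.findtext("./base/siteid")),
        "enabled": enabled,
        "mapping": mapping,
        "unmapped": unmapped,
    }
```

Running summarize on the configuration of each node and comparing the two siteid values is a quick way to catch a duplicated site ID before starting the services.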
2.1.3 Create the dmhs.hs configuration file
cd /home/dmdba/dmhs/bin
vim dmhs.hs
---- Contents of the dmhs.hs configuration file ----
<?xml version="1.0" encoding="GB2312" standalone="no"?>
<dmhs>
<base>
<lang>en</lang>
<mgr_port>5345</mgr_port> <!-- management port -->
<ckpt_interval>60</ckpt_interval>
<siteid>11</siteid> <!-- source site ID -->
<version>2.0</version>
</base>
<cpt>
<name>cpt1911</name>
<db_type>dm8</db_type>
<db_server>192.168.135.1</db_server>
<db_user>HSEXEC</db_user> <!-- database user name -->
<db_ssl_path/>
<db_ssl_pwd/>
<db_pwd>111111111</db_pwd>
<char_code>PG_UTF8</char_code>
<db_port>5236</db_port>
<ddl_mask>op:TABLE:VIEW:PROCEDURE:FUNCTION:TRIGGER:INDEX:CHECK:SEQUENCE:TYPE:PACKAGE:SYNONYM</ddl_mask> <!-- DDL synchronization mask -->
<parse_thr>1</parse_thr>
<arch>
<clear_flag>1</clear_flag> <!-- archive cleanup flag -->
<clear_interval>600</clear_interval>
</arch>
<send>
<ip>192.168.135.2</ip>
<mgr_port>5345</mgr_port>
<data_port>5346</data_port>
<trigger>1</trigger>
<constraint>1</constraint>
<identity>1</identity>
<net_turns>0</net_turns>
<filter> <!-- filter rules -->
<enable>
<item>LIU.TEST1</item>
<item>LIU.TEST_INSERT</item>
</enable>
</filter>
<map> <!-- mapping rules -->
<item>LIU.TEST1==LIU1.TEST1</item>
<item>LIU.TEST_INSERT==LIU2.TEST_INSERT</item>
</map>
</send>
</cpt>
<exec>
<recv>
<mgr_port>5345</mgr_port>
<data_port>5346</data_port>
</recv>
<enable>1</enable>
<name>exec1911</name>
<db_type>DM8</db_type>
<db_server>192.168.135.1</db_server>
<db_user>HSEXEC</db_user>
<db_pwd>111111111</db_pwd>
<db_port>5236</db_port>
<exec_thr>1</exec_thr>
<exec_sql>1024</exec_sql>
<exec_trx> 5000 </exec_trx>
<exec_rows>1000</exec_rows>
<save_mask>EXEC</save_mask>
<exec_policy>2</exec_policy>
<ddl_continue>1</ddl_continue>
</exec>
</dmhs>
2.2 Target deployment
2.2.1 Configure the DMHS service on the target
cd /home/dmdba/dmhs/bin
cp TemplateDmhsService DmhsService
vim DmhsService
2.2.2 Create the dmhs.hs configuration file on the target
cd /home/dmdba/dmhs/bin
vim dmhs.hs
<?xml version="1.0" encoding="GB2312" standalone="no"?>
<dmhs>
<base>
<lang>en</lang>
<mgr_port>5345</mgr_port>
<ckpt_interval>60</ckpt_interval>
<siteid>12</siteid>
<version>2.0</version>
</base>
<cpt>
<name>cpt1912</name>
<db_type>dm8</db_type>
<db_server>192.168.135.2</db_server>
<db_user>HSEXEC</db_user> <!-- database user name -->
<db_ssl_path/>
<db_ssl_pwd/>
<db_pwd>111111111</db_pwd>
<char_code>PG_UTF8</char_code>
<db_port>5236</db_port>
<ddl_mask>op:TABLE:VIEW:PROCEDURE:FUNCTION:TRIGGER:INDEX:CHECK:SEQUENCE:TYPE:PACKAGE:SYNONYM</ddl_mask>
<parse_thr>1</parse_thr>
<arch>
<clear_flag>1</clear_flag>
<clear_interval>600</clear_interval>
</arch>
<send>
<ip>192.168.135.1</ip>
<mgr_port>5345</mgr_port> <!-- receiver management port -->
<data_port>5346</data_port>
<trigger>1</trigger>
<constraint>1</constraint>
<identity>1</identity>
<net_turns>0</net_turns>
<filter>
<enable>
<item>LIU1.TEST1</item>
</enable>
</filter>
<map>
<item>LIU1.TEST1==LIU.TEST1</item>
</map>
</send>
</cpt>
<exec>
<recv>
<mgr_port>5345</mgr_port>
<data_port>5346</data_port>
</recv>
<enable>1</enable>
<name>exec1912</name>
<db_type>DM8</db_type>
<db_server>192.168.135.2</db_server>
<db_user>HSEXEC</db_user>
<db_pwd>111111111</db_pwd>
<db_port>5236</db_port>
<exec_thr>1</exec_thr> <!-- number of execution threads -->
<exec_sql>1024</exec_sql>
<exec_trx> 5000 </exec_trx>
<exec_rows>1000</exec_rows>
<save_mask>EXEC</save_mask>
<exec_policy>2</exec_policy>
<ddl_continue>1</ddl_continue>
</exec>
</dmhs>
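3. Start the DMHS service:
Unidirectional sync:
3.1 Initial load
DMHS has two kinds of initial load: the initial dictionary load and the initial data load.
Before the source side starts log analysis for the first time, an initial dictionary load is required. It creates a DICT folder under the program directory and writes the dictionary information of the synchronized tables to files on disk. Only after that can the log analysis module on the source be started.
The initial data load copies the source data to the target so that both sides start from the same state. Skip it if no data needs to be loaded.
3.1.1 Initial dictionary load
- Start the execution service on the target
The target DMHS service must be running before the initial dictionary load. It can be started from the command line or as a background service. The steps below are for Linux; Windows is similar.
Start the DMHS management service: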
cd /home/dmdba/dmhs/bin
./DmhsService start
ps -ef |grep dmhs
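The service starts successfully; check the log to confirm that no errors were reported.
When the target DMHS service is started for the first time, the execution service is not yet running. Connect to the DMHS management service with the DMHS console tool and start it manually with the start exec command, for example: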
./dmhs_console
CSL[INFO]: DMHS console tool: V3.0.2.01-Build(2016.12.15-69394trunc)_64
DMHS >connect 192.168.135.2:5345
CSL[UNKNOW]: execute success
DMHS >start exec
- Run the initial dictionary load on the source
Connect to the management port:
DMHS >connect 192.168.135.1:5345
Reset the LSN on the target:
DMHS >clear exec lsn
Load the source dictionary:
DMHS> copy 0 "sch.name = 'LIU '" CLEAR|DICT
CSL[WARN]: Detect the CLEAR mask, the mask will empty all dict file, please confirm whether to continue?(Y/N)
Y
copy mask is : |CLEAR|DICT|PARTITION|REP
execute finish, please look up log file of exec module to check data load result
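3.1.2 Initial data load
The target execution service must also be running before the initial data load. Once it is up, start the DMHS management service on the source, connect to it with the DMHS console tool, and run the COPY command with the appropriate load options.
Here the target tables are created by DMHS, so the load uses the CREATE keyword.
If the synchronized tables already exist on the target, the initial load does not need to create them and CREATE can be omitted. The example below is the case with table creation: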
DMHS> copy 0 "sch.name = 'LIU'" drop|create|insert|nolock
CSL[WARN]: Detect the DROP mask, this mask will drop target table, confirm to continue?(Y/N)
Y
copy mask is : |CREATE|DROP|NOLOCK|INSERT|TABLE|PARTITION|OBJID|REP
execute finish, please look up log file of exec module to check data load result
DMHS> start cpt;
execute success
Check the logs; neither side reports errors:
Source log (/home/dmdba/dmhs/bin/log/dmhs*.log):
Target log (/home/dmdba/dmhs/bin/log/dmhs*.log):
Bidirectional sync:
On the source:
[dmdba@DM1 bin]$ ./DmhsService start
Starting DmhsService: [ OK ]
[dmdba@DM1 bin]$ ./dmhs_console
DMHS console tool: V4.2.94-Build(2022.08.11-113147trunc)_64_2208
Copyright (c) 2020, DMHS. All rights reserved.
Type ? or "help" for help, type "quit" to quit console.
Connected to DMHS: 127.0.0.1:5345
execute success
Dameng HS Server V4.2.94-Build(2022.08.11-113147trunc)_64_2208
DMHS> connect 192.168.135.1:5345
execute success
DMHS> state
MGR: do not run any module
execute success
DMHS> start exec;
execute success
DMHS> state
MGR: Execute
execute success
DMHS> clear exec lsn
execute success
DMHS> copy 0 "sch.name = 'LIU '" CLEAR|DICT
CSL[WARN]: Detect the CLEAR mask, the mask will empty all dict file, please confirm whether to continue?(Y/N)
Y
copy mask is : |CLEAR|DICT|PARTITION|REP
execute finish, please look up log file of exec module to check data load result
DMHS> start cpt
execute success
DMHS> state
MGR: Capture Execute
TYPE VID SITEID EXEC/CPT IP PORT DBNAME
------- --- ------ ------------- ---- ------
Capture 0 11 192.168.135.2 5345
Execute 0 12 192.168.135.2 5345
On the target:
[dmdba@DM2 bin]$ ./DmhsService start
Starting DmhsService: [ OK ]
[dmdba@DM2 bin]$ ./dmhs_console
DMHS console tool: V4.2.94-Build(2022.08.11-113147trunc)_64_2208
Copyright (c) 2020, DMHS. All rights reserved.
Type ? or "help" for help, type "quit" to quit console.
Connected to DMHS: 127.0.0.1:5345
execute success
Dameng HS Server V4.2.94-Build(2022.08.11-113147trunc)_64_2208
DMHS> connect 192.168.135.2:5345
execute success
DMHS> state
MGR: Execute
execute success
DMHS> start exec;
execute success
DMHS> clear exec lsn
execute success
DMHS> copy 0 "sch.name = 'LIU1'" CLEAR|DICT
CSL[WARN]: Detect the CLEAR mask, the mask will empty all dict file, please confirm whether to continue?(Y/N)
Y
copy mask is : |CLEAR|DICT|PARTITION|REP
execute finish, please look up log file of exec module to check data load result
DMHS> state
MGR: Execute
execute success
DMHS> start cpt
execute success
DMHS> state
MGR: Capture Execute
TYPE VID SITEID EXEC/CPT IP PORT DBNAME
------- --- ------ ------------- ---- ------
Capture 0 12 192.168.135.1 5345
Execute 0 11 192.168.135.1 5345
execute success
4. Verify synchronization
4.1 Unidirectional sync:
Query the sync table on the source:
SQL> select * from liu.test1;
LINEID ID ENAME
---------- ----------- -----
1 1 liu
2 2 you
3 3 rui
Query the sync table on the target:
SQL> select * from liu1.test1;
LINEID ID ENAME
---------- ----------- -----
1 1 liu
2 2 you
3 3 rui
Insert a row on the source:
SQL> insert into liu.test1 values(4,'rr');
affect rows 1
used time: 2.517(ms). Execute id is 2702.
SQL> commit;
executed successfully
used time: 2.109(ms). Execute id is 2703.
SQL> select * from liu.test1;
LINEID ID ENAME
---------- ----------- -----
1 1 liu
2 2 you
3 3 rui
4 4 rr
The row is synchronized to the target:
Insert into another table on the source (liu.test_insert):
SQL> select count(*) from liu.test_insert;
LINEID COUNT(*)
---------- --------------------
1 1000
SQL> insert into liu.test_insert values(1,'liu','ss');
affect rows 1
used time: 20.411(ms). Execute id is 5402.
SQL> /
affect rows 1
used time: 0.502(ms). Execute id is 5403.
SQL> commit;
SQL> select count(*) from liu.test_insert;
LINEID COUNT(*)
---------- --------------------
1 1002
Check on the target:
SQL> select count(*) from liu2.test_insert;
LINEID COUNT(*)
---------- --------------------
1 1002
Synchronization succeeded.
Verify DDL synchronization:
Drop the table on the source:
SQL> drop table liu.test_insert;
executed successfully
Verify on the target:
SQL> select count(*) from liu2.test_insert;
LINEID COUNT(*)
---------- --------------------
1 1000
/
SQL> select count(*) from liu2.test_insert;
select count(*) from liu2.test_insert;
Error near line 1 [-2106]: Invalid table or view name [TEST_INSERT].
used time: 0.608(ms). Execute id is 0.
Log output:
4.2 Bidirectional sync:
On machine B:
SQL> select * from liu1.test1;
LINEID ID ENAME
---------- ----------- -----
1 1 liu
2 2 you
3 3 rui
4 1 ll
used time: 1.743(ms). Execute id is 1503.
SQL> insert into liu1.test1 select * from liu1.test1;
affect rows 4
used time: 1.966(ms). Execute id is 1504.
SQL> commit;
executed successfully
Verify on machine A:
SQL> select * from liu.test1;
LINEID ID ENAME
---------- ----------- -----
1 1 liu
2 2 you
3 3 rui
4 1 ll
used time: 2.157(ms). Execute id is 2803.
SQL> /
LINEID ID ENAME
---------- ----------- -----
1 1 liu
2 2 you
3 3 rui
4 1 ll
5 1 liu
6 2 you
7 3 rui
8 1 ll
8 rows got
The log shows that the changes were received.
On machine A:
SQL> insert into liu.test1 select * from liu.test1;
affect rows 8
used time: 1.531(ms). Execute id is 2805.
SQL> commit;
Verify on machine B:
SQL> select * from liu1.test1;
LINEID ID ENAME
---------- ----------- -----
1 1 liu
2 2 you
3 3 rui
4 1 ll
5 1 liu
6 2 you
7 3 rui
8 1 ll
9 1 liu
10 2 you
11 3 rui
12 1 ll
13 1 liu
14 2 you
15 3 rui
16 1 ll
16 rows got
The log shows that the changes were received.
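The manual comparisons in this section can also be done programmatically once the rows from both sides have been fetched. Below is a minimal Python sketch, assuming the rows are already available as lists of tuples; how they are fetched from DM8 is out of scope here, and diff_rows is a hypothetical helper name. Because the test tables have no primary key and contain duplicate rows, the comparison treats each result set as a multiset.

```python
from collections import Counter


def diff_rows(source_rows, target_rows):
    """Compare two result sets as multisets, ignoring row order."""
    source_count = Counter(map(tuple, source_rows))
    target_count = Counter(map(tuple, target_rows))
    return {
        # Rows present on the source more often than on the target.
        "missing_on_target": sorted((source_count - target_count).elements()),
        # Rows present on the target more often than on the source.
        "extra_on_target": sorted((target_count - source_count).elements()),
    }
```

Two empty lists in the result mean both sides hold the same rows the same number of times.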
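5. Troubleshooting
5.1 The initial dictionary load fails with CSL[ERROR]: load execute module failure
(1) libdmoci.so is missing or its version does not match DMHS
Exit the console and inspect the library dependencies with ldd libdmhs_exec.so
Root cause:
libdmhs_exec.so has to link against libdmoci.so, the library used to operate on the target database, that is, the DM8 libdmoci.so. However, db/bin under the DMHS directory is the path of DM7, the metadata database embedded in DMHS. The executor was therefore linking against the DM7 libdmoci.so, which caused the failure.
So where is the DM8 libdmoci.so? A default DM8 installation from the bin installer does not include the OCI interface library. The DM8 installation package ships a dmdci.zip file alongside the bin installer; unzip it and copy the required libdmoci.so and the include files to the corresponding locations under dm8/bin.
Solution:
The error is mainly caused by a libdmoci.so that does not match the DMHS version. Replace it with a libdmoci.so that matches the DMHS version.
In this case the problem was resolved by uploading a separate DM8 libdmoci.so that matches the DMHS version.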
cp /home/dmdba/libdmoci.so /home/dmdba/dmhs/bin
chown dmdba:dinstall /home/dmdba/dmhs/bin/libdmoci.so
ls -lrt /home/dmdba/dmhs/bin/libdmoci.so
-rwxr-xr-x 1 dmdba dinstall 7694676 May 24 03:32 /home/dmdba/dmhs/bin/libdmoci.so
(2) Environment variable problem: the dynamic library libdmfldr.so is not found
Error message:
From /dmhs/bin/log/dmhs_202305.log:
2023-05-24 11:18:18 MGR[INFO]: monitor 127.0.0.1(dmhs_console) 's login
2023-05-24 11:18:20 MGR[INFO]: connection from 127.0.0.1(dmhs_console) has broken!
2023-05-24 11:18:20 MGR[INFO]: monitor 127.0.0.1(dmhs_console) 's login
2023-05-24 11:18:23 MGR[INFO]: loading the execute module...
2023-05-24 11:18:23 MGR[INFO]: loading the execute module...
2023-05-24 11:18:23 MGR[ERROR]: lib libdmhs_exec.so can not found,error code 0, /home/dmdba/dmdbms/bin/libdmfldr.so: undefined symbol: dpi_fldr_get_col_info
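After the error appeared, we confirmed that libcpt_dm8.so (in the DMHS installation path) and libdmfldr.so (in the DM database installation path) both exist, and that LD_LIBRARY_PATH already includes the corresponding paths.
Cause:
When DMHS is started as a service, the NEED_LIB_PATH variable in the service script takes precedence. Its value can be changed by editing the DMHS service script and takes effect after a restart. This also requires that the dmoci-related libraries in the DM database bin directory have been updated.
The error is therefore usually caused by a NEED_LIB_PATH value that was left incomplete during installation. Edit NEED_LIB_PATH in the service script directly; the right value depends on the actual environment.
Fix:
vi /home/dmdba/dmhs/bin/DmhsService
Change the variable to: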
NEED_LIB_PATH=/home/dmdba/dmdbms/bin:/home/dmdba/dmhs/bin:/home/dmdba/dmhs/hs_agent:/home/dmdba/dmhs/db/bin
Restart the service: ./DmhsService restart
Verify that the initial dictionary load now works.
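Before restarting, it can help to confirm which directories in the new NEED_LIB_PATH value actually exist and which of them contain the library named in the error. The following Python sketch only inspects a colon-separated path string passed in by the caller; it does not read the service script, and the helper names find_library and missing_dirs are hypothetical.

```python
import os


def find_library(lib_name: str, lib_path: str) -> list:
    """Return the directories of a colon-separated path that contain lib_name, in search order."""
    hits = []
    for directory in lib_path.split(":"):
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


def missing_dirs(lib_path: str) -> list:
    """Return the entries of a colon-separated path that are not existing directories."""
    return [d for d in lib_path.split(":") if d and not os.path.isdir(d)]
```

For example, find_library("libdmfldr.so", "/home/dmdba/dmdbms/bin:/home/dmdba/dmhs/bin") lists every directory that provides the library. More than one hit suggests that copies from different versions may be shadowing each other.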
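(3) Loading data on the target fails:
Cause: the database port in the configuration was wrong. Changing it to 5236 fixes the problem.
- Error when starting DmhsService:
Background: after cpt synchronization had been run on the source, the source was stopped abnormally, and the next start failed with CPT[ERROR]: invalid log format, rowid check fail, plsql_type: 28, null_flag: 1
Fix: delete dmhs_cpt.tmp and restart the DMHS service.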
[root@DM1 ~]# find / -name "dmhs_cpt.tmp"
/home/dmdba/dmhs/bin/dmhs_cpt.tmp
[root@DM1 ~]# rm -f /home/dmdba/dmhs/bin/dmhs_cpt.tmp
[dmdba@DM1 bin]$ ./DmhsService start
Starting DmhsService: [ OK ]
- After DMHS starts the cpt synchronization service, the DMHS service on the source shuts down abnormally
The log shows:
2023-06-01 02:40:15 CPT[INFO]: start LSN :111727 located in log file: /dm8/dmarch/ARCHIVE_LOCAL1_0x51E4F075_EP0_2023-05-31_20-47-36.log
2023-06-01 02:40:15 MGR[INFO]: log analysis start success
2023-06-01 02:40:15 CPT[ERROR]: invalid log format, rowid check fail, plsql_type: 28, null_flag: 1
2023-06-01 02:40:15 PUB[ERROR]: invalid log format, rowid check fail, system halt.
2023-06-01 02:40:15 SND[INFO]: Analysis module 11 are sending the map rules...
2023-06-01 02:40:15 SND[INFO]: LIU.TEST1==LIU1.TEST1...
2023-06-01 02:40:15 SND[INFO]: LIU.TEST_INSERT==LIU2.TEST_INSERT...
2023-06-01 02:40:15 SND[INFO]: The analysis module 11 are getting the min LSN from site 192.168.135.2:5346...
2023-06-01 02:40:15 SND[INFO]: analysis module 11 get LSN:111727 LFS:0 successfully
2023-06-01 02:40:15 PUB[INFO]: ============gdb thread info start===============
2023-06-01 02:40:15 PUB[INFO]: Thread 11 (LWP 4697):
#0 0x00007f70522eed30 in nanosleep () from /usr/lib64/libc.so.6
#1 0x00007f7051db5d39 in dmhs_os_thread_sleep_low () from ./libdmhs_pub.so
#2 0x00007f705049b833 in dm8_send_thread () from ./libcpt_dm8.so
#3 0x00007f7052082f2b in ?? () from /usr/lib64/libpthread.so.0
#4 0x00007f70523216bf in clone () from /usr/lib64/libc.so.6
Thread 10 (LWP 4696):
#0 0x00007f70523124a4 in read () from /usr/lib64/libc.so.6
#1 0x00007f70522a7088 in _IO_file_underflow () from /usr/lib64/libc.so.6
#2 0x00007f70522a83e5 in _IO_default_xsgetn () from /usr/lib64/libc.so.6
#3 0x00007f705229a7e1 in fread () from /usr/lib64/libc.so.6
#4 0x00007f7051dbd87d in dmhs_sys_halt_gdb_info () from ./libdmhs_pub.so
#5 0x00007f7051dbd91b in dmhs_report_halt_info () from ./libdmhs_pub.so
#6 0x00007f7051dbd957 in dmhs_sys_halt () from ./libdmhs_pub.so
#7 0x00007f7050473782 in dm8_log_rowid_type_check () from ./libcpt_dm8.so
#8 0x00007f7050485a30 in dm8_analyse_insert () from ./libcpt_dm8.so
#9 0x00007f7050464672 in dm8_dict_ddl_table () from ./libcpt_dm8.so
#10 0x00007f7050488af0 in dm8_log_parse () from ./libcpt_dm8.so
#11 0x00007f705049e17e in dm8_parse_thread () from ./libcpt_dm8.so
#12 0x00007f7052082f2b in ?? () from /usr/lib64/libpthread.so.0
#13 0x00007f70523216bf in clone () from /usr/lib64/libc.so.6
Thread 9 (LWP 4695):
#0 0x00007f70522eed30 in nanosleep () from /usr/lib64/libc.so.6
#1 0x00007f7051db5d39 in dmhs_os_thread_sleep_low () fr
2023-06-01 02:40:15 PUB[INFO]: ============gdb thread info end ===============
2023-06-01 02:40:15 PUB[INFO]: ============ulimit info start ===============
2023-06-01 02:40:15 PUB[INFO]: core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 11341
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 16384
cpu time (seconds, -t) unlimited
max user processes (-u) 65536
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
2023-06-01 02:40:15 PUB[INFO]: ============ulimit info end ===============
2023-06-01 02:40:15 PUB[INFO]: ============lsof info start ===============
2023-06-01 02:40:15 PUB[INFO]: 407 4643
2023-06-01 02:40:15 PUB[INFO]: ============lsof info end ===============
2023-06-01 05:06:17 MGR[INFO]: DMHS start up, current version: V4.2.94-Build(2022.08.11-113147trunc)_64_2208(Enterprise Edition)
2023-06-01 05:06:17 MGR[WARN]: License will expire on 2023-12-25
2023-06-01 05:06:17 MGR[INFO]: load config file successful,site no:11, manager port :5345, poll interval:3
2023-06-01 05:06:17 MGR[INFO]: manager listening port:5345
2023-06-01 05:06:43 MGR[INFO]: monitor 127.0.0.1(dmhs_console) 's login
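Investigation showed that dmhs.hs was configured correctly and that the host had enough memory and storage. The DMHS log reports an invalid log format with a failed rowid check (plsql_type: 28, null_flag: 1). The DMHS version is V4.2.94-Build(2022.08.11-113147trunc) and the database version is 03134283968-20230103-178822-20033, so the likely cause is a database build that is too new for this DMHS build, or a DMHS build that is too old.
Fix:
Reinstall DM8 with build 03134283890-20220720-165295-10045, which is close to the DMHS build (20220811). In testing, the service no longer hung after DMHS started synchronization, the log showed no similar errors, and both unidirectional and bidirectional sync worked normally.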
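The build-date comparison behind this diagnosis can be sketched in Python. The sketch assumes that the second dash-separated field of the DM8 build string is a YYYYMMDD date and that the DMHS version string carries its date as Build(YYYY.MM.DD-...); both are inferred from the strings shown above rather than taken from Dameng documentation, and the function names are hypothetical.

```python
import re
from datetime import date, datetime


def dmhs_build_date(version: str) -> date:
    """Extract the build date from a string like 'V4.2.94-Build(2022.08.11-113147trunc)'."""
    match = re.search(r"Build\((\d{4})\.(\d{2})\.(\d{2})", version)
    if match is None:
        raise ValueError("no build date found in " + repr(version))
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def dm_build_date(build_id: str) -> date:
    """Assumption: the second dash-separated field is the build date in YYYYMMDD form."""
    return datetime.strptime(build_id.split("-")[1], "%Y%m%d").date()


def gap_days(dm_build: str, dmhs_version: str) -> int:
    """Days by which the DM8 build is newer (positive) or older (negative) than DMHS."""
    return (dm_build_date(dm_build) - dmhs_build_date(dmhs_version)).days
```

With the strings from this case, the failing database build is 145 days newer than the DMHS build, while the replacement build is 22 days older. The gap is only a rough indicator of compatibility, not a supported compatibility rule.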
Dameng Cloud Adaptation Technical Community
https://eco.dameng.com/