初步认识Debezium,通过连接器实现Oracle数据库实时DML操作同步到Kafka - 知乎 (zhihu.com)https://zhuanlan.zhihu.com/p/654725490
引言
Debezium是一个开源的数据库事件捕捉和发布平台,旨在提供可靠的实时数据流。它基于分布式日志(如Apache Kafka)来捕获并传输数据库的变更事件,从而实现高效的数据同步和分发。通过使用Debezium的Oracle连接器,可以轻松地将Oracle数据库中的DML操作(包括INSERT、UPDATE、DELETE)的变更事件提取出来,并以实时的方式推送到Kafka消息队列中。
本文分四部分介绍Debezium及连接器相关概念、通过实践操作方式介绍Debezium的安装、最后通过Oracle连接器服务以Kafka Connect方式实现Oracle数据到Kafak的实时数据高效同步。从理论到实践、一步一步的了解Debezium、安装Debezium和通过Debezium实现数据同步。
Oracle 数据源权限配置(Xstream)
Xstream 方式权限配置
CDC(Change Data Capture),即数据变更抓取,通过为源端数据源开启CDC,作业可实现数据源的实时数据同步。 本章节主要介绍如何为oracle数据库开启Xstream方式CDC功能。
本文以非CDB 数据库作为案例进行说明,需要注意的是如果数据库为CDB数据库时,在用户分配的命令可能存在差异,一般情况下,角色使用 C##角色名 即可
前提条件
Oracle 数据库为11g 以上版本
Oracle数据库需具备OGG License(无需安装OGG)
若Oracle数据库为主从模式,不支持使用Oracle从库。
不支持采集二进制类型的字段,例如:BINARY,VARBINARY,TINYBLOB,BLOB,MEDIUMBLOB,LONGBLOB
操作步骤
建议由数据库管理员对数据源配置进行操作
必须为Oracle数据库启用日志归档,并为Debezium Oracle连接器监视的所有数据库定义一个具有适当权限的Oracle用户
-
开启日志归档和XStream
a. 在命令行工具中执行以下命令以sys用户连接到数据库。在实际使用过程中,可以有多种方式连接数据库,此处以命令行方式为例进行说明。sqlplus /nolog CONNECT sys/password@host:port AS SYSDBA;
b. 执行以下命令开启Xstream
alter system set enable_goldengate_replication=true;
c. 执行以下命令,检查日志归档是否已开启
archive log list; 若回显打印“Database log mode: No Archive Mode”,说明日志归档未开启,继续执行下一步。 若回显打印“Database log mode: Archive Mode”,说明日志归档已开启,直接跳到 g。
d. 执行以下命令配置归档日志参数
alter system set db_recovery_file_dest_size = 100G; alter system set db_recovery_file_dest = '/opt/oracle/oradata/recovery_area' scope=spfile; 100G为日志文件存储空间的大小,请根据实际情况设置。 /opt/oracle/oradata/recovery_area为日志存储路径,请根据实际规划设置,但须确保路径提前创建。
e. 执行以下命令开启日志归档
shutdown immediate; startup mount; alter database archivelog;
f. 执行以下命令,确认日志归档是否已成功开启
archive log list; 当回显打印“Database log mode: Archive Mode”,说明日志归档已开启。
g. 退出连接
`exit;`
-
创建XStream用户并给用户赋予权限
a. 在命令行工具中执行以下命令以sys用户连接到数据库实例。
sqlplus sys/password@host:port/SID as sysdba
b.执行以下命令创建XStream管理员用户并配置权限。
CREATE TABLESPACE xstream_adm_tbs DATAFILE '/opt/oracle/oradata/orcl/xstream_adm_tbs.dbf' SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED; CREATE USER xstrmadmin IDENTIFIED BY password DEFAULT TABLESPACE xstream_adm_tbs QUOTA UNLIMITED ON xstream_adm_tbs; GRANT CREATE SESSION TO xstrmadmin; BEGIN DBMS_XSTREAM_AUTH.GRANT_ADMIN_PRIVILEGE( grantee => 'xstrmadmin', privilege_type => 'CAPTURE', grant_select_privileges => TRUE, container => 'ALL' ); END; 其中: xstream_adm_tbs为XStream管理员用户的表空间名,请根据实际规划设置。 /opt/oracle/oradata/orcl/xstream_adm_tbs.dbf为XStream管理员用户的表空间文件,请根据实际规划设置。 xstrmadmin为XStream管理员用户名,请根据实际规划设置。 password为XStream管理员用户密码,请根据实际规划设置。 “container => 'ALL'”仅当Oracle为12c或以上版本时,才需要添加,否则删除此行内容。
c. 执行以下命令创建CDC Connect连接数据库的用户并配置权限。
CREATE TABLESPACE xstream_tbs DATAFILE '/opt/oracle/oradata/orcl/xstream_tbs.dbf' SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED; CREATE USER xstrm IDENTIFIED BY password DEFAULT TABLESPACE xstream_tbs QUOTA UNLIMITED ON xstream_tbs; GRANT CREATE SESSION TO xstrm; GRANT SELECT ON V_$DATABASE to xstrm; GRANT FLASHBACK ANY TABLE TO xstrm; GRANT SELECT ANY TABLE to xstrm; GRANT LOCK ANY TABLE TO xstrm; grant select_catalog_role to xstrm; 其中:x stream_tbs为CDC Connect连接用户的表空间名,请根据实际规划设置。 /opt/oracle/oradata/orcl/xstream_tbs.dbf为CDC Connect连接用户的表空间文件,请根据实际规划设置。 xstrm为CDC Connect连接用户名,请根据实际规划设置。 password为CDC Connect连接用户密码,请根据实际规划设置。
d. 执行以下命令修改日志记录参数。
alter database add supplemental log data (all) columns;
e. 退出连接
`exit;`
-
创建XStream出站服务器。
一个XStream出站服务器只能在一个任务中使用,若该数据库需要用于多个任务,需创建多个XStream出站服务器。a.在命令行工具中执行以下命令以xstrmadmin用户连接到数据库实例。
sqlplus xstrmadmin/password@host:port/SID
b.执行以下命令创建XStream出站服务器。
DECLARE tables DBMS_UTILITY.UNCL_ARRAY; schemas DBMS_UTILITY.UNCL_ARRAY; BEGIN tables(1) := NULL; schemas(1) := 'CDC'; DBMS_XSTREAM_ADM.CREATE_OUTBOUND( server_name => 'dbzxout', table_names => tables, schema_names => schemas ); END; 其中: CDC为要同步数据表所在的schemas,即CDC需要抓取哪个schemas下的数据,请根据实际情况设置。 dbzxout为出站服务器名称,需保证唯一,请根据实际规划设置。
c. 退出连接
`exit;`
d. 执行以下命令以sys用户连接到数据库实例
sqlplus sys/password@host:port/SID as sysdba
e. 执行以下命令允许xstrm用户连接XStream出站服务器。
BEGIN DBMS_XSTREAM_ADM.ALTER_OUTBOUND( server_name => 'dbzxout', connect_user => 'xstrm' ); END; 其中: dbzxout为出站服务器名称,在3.b中设置。 xstrm为CDC Connect连接用户名,在2.c中设置。
f. 退出连接
`exit;`
Xstream客户端配置
CDC作业运行时,需要通过Xstream client 去获取数据,因此需要在yarn 集群所在机器进行客户端的安装
a.下载oracle的instantclient客户端 进入下载页面,根据操作系统以及oracle版本选择合适的package包,建议下载客户端版本 instantclient_19_13。 解压后的包结构如下
```
instantclient_XX_X/
├── adrci
├── BASIC_LITE_LICENSE
├── BASIC_LITE_README
├── genezi
├── libclntshcore.so -> libclntshcore.so.21.1
├── libclntshcore.so.12.1 -> libclntshcore.so.21.1
...
├── ojdbc8.jar
├── ucp.jar
├── uidrvci
└── xstreams.jar
```
b. 机器环境配置
● linux
使用root用户权限将目录里内容复制到/usr/lib64下
平台数据源添加注意事项
a. cdc任务数据同步方式选用xstream时,需要在数据源管理新增oracle 数据源时,增加Xstream 出站服务器的账号密码配置
点击加号,填入如下两个key,如下图:
xstream.password
xstream.user
本文目标
为Oracle19c打开xstream,为了让Debezium能够读取日志变化
参考文章
https://support.huaweicloud.com/usermanual-roma/fdi-ug-190624013.html
打开归档
修改管理员密码
[oracle@hostyyx ~]$ sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Dec 5 16:18:04 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> alter user system identified by manager;
User altered.
SQL> alter user sys identified by manager;
User altered.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
用sys登录数据库修改配置
[oracle@hostyyx ~]$ sqlplus /nolog
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Dec 5 16:23:11 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
SQL> connect sys/manager@192.168.31.224/orclcdb as sysdba;
Connected.
SQL> alter system set enable_goldengate_replication=true;
System altered.
SQL> archive log list;
Database log mode No Archive Mode
Automatic archival Disabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 6
Current log sequence 8
SQL> alter system set db_recovery_file_dest_size = 10G;
System altered.
SQL> alter system set db_recovery_file_dest = '/opt/oracle/oradata/recovery_area' scope=spfile;
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect
descriptor
Warning: You are no longer connected to ORACLE.
SQL> exit
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
启动数据库
[oracle@hostyyx ~]$ sqlplus / as sysdba;
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Dec 5 16:25:12 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1593832664 bytes
Fixed Size 9135320 bytes
Variable Size 1006632960 bytes
Database Buffers 570425344 bytes
Redo Buffers 7639040 bytes
Database mounted.
SQL> alter database archivelog;
Database altered.
SQL> alter database open;
Database altered.
SQL> alter pluggable database ORCLPDB1 OPEN;
Pluggable database altered.
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 6
Next log sequence to archive 8
Current log sequence 8
SQL>
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
至此成功打开归档日志
打开XStream
创建xstream管理员及赋权
CREATE TABLESPACE xstream_adm_tbs DATAFILE '/opt/oracle/oradata/ORCLCDB/xstream_adm_tbs.dbf' SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;
alter session set container = ORCLPDB1;
CREATE TABLESPACE xstream_adm_tbs DATAFILE '/opt/oracle/oradata/ORCLCDB/ORCLPDB1/xstream_adm_tbs.dbf' SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;
alter session set container = CDB$ROOT;
CREATE USER c##xstrmadmin IDENTIFIED BY xstrmadmin DEFAULT TABLESPACE xstream_adm_tbs QUOTA UNLIMITED ON xstream_adm_tbs CONTAINER=ALL;
GRANT CREATE SESSION, SET CONTAINER TO c##xstrmadmin CONTAINER=ALL;
BEGIN
DBMS_XSTREAM_AUTH.GRANT_ADMIN_PRIVILEGE(
grantee => 'c##xstrmadmin',
privilege_type => 'CAPTURE',
grant_select_privileges => TRUE,
container => 'ALL'
);
END;
/
-- 注意最后是/执行多行
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
创建xstream账号及赋权
[oracle@hostyyx ~]$ sqlplus sys/manager@192.168.31.224:1521/ORCLCDB as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Dec 5 16:46:40 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> CREATE TABLESPACE xstream_tbs DATAFILE '/opt/oracle/oradata/ORCLCDB/xstream_tbs.dbf' SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;
Tablespace created.
SQL> alter session set container = ORCLPDB1;
Session altered.
SQL> CREATE TABLESPACE xstream_tbs DATAFILE '/opt/oracle/oradata/ORCLCDB/ORCLPDB1/xstream_tbs.dbf' SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;
Tablespace created.
SQL> alter session set container = CDB$ROOT;
Session altered.
SQL> CREATE USER c##xstrm IDENTIFIED BY xstrm DEFAULT TABLESPACE xstream_tbs QUOTA UNLIMITED ON xstream_tbs CONTAINER=ALL;
User created.
SQL> GRANT CREATE SESSION TO c##xstrm CONTAINER=ALL;
Grant succeeded.
SQL> GRANT SET CONTAINER TO c##xstrm CONTAINER=ALL;
Grant succeeded.
SQL> GRANT SELECT ON V_$DATABASE to c##xstrm CONTAINER=ALL;
Grant succeeded.
SQL> GRANT FLASHBACK ANY TABLE TO c##xstrm CONTAINER=ALL;
Grant succeeded.
SQL> GRANT SELECT ANY TABLE to c##xstrm CONTAINER=ALL;
Grant succeeded.
SQL> GRANT LOCK ANY TABLE TO c##xstrm CONTAINER=ALL;
Grant succeeded.
SQL> grant select_catalog_role to c##xstrm CONTAINER=ALL;
Grant succeeded.
SQL> alter database add supplemental log data (all) columns;
Database altered.
SQL> quit;
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
[oracle@hostyyx ~]$
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
创建XStream出站服务器
[oracle@hostyyx ~]$ sqlplus c##xstrmadmin/xstrmadmin@192.168.31.224/orclcdb
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Dec 5 16:52:26 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
DECLARE
tables DBMS_UTILITY.UNCL_ARRAY;
schemas DBMS_UTILITY.UNCL_ARRAY;
BEGIN
tables(1) := NULL;
schemas(1) := 'YINYX';
DBMS_XSTREAM_ADM.CREATE_OUTBOUND(
server_name => 'xstrmout',
table_names => tables,
schema_names => schemas);
END;
/
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
为出站服务器赋权
[oracle@hostyyx ~]$ sqlplus sys/manager@192.168.31.224:1521/ORCLCDB as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Mon Dec 5 17:18:04 2022
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> BEGIN
DBMS_XSTREAM_ADM.ALTER_OUTBOUND(
server_name => 'xstrmout',
connect_user => 'c##xstrm'
);
END;
/
2 3 4 5 6 7
PL/SQL procedure successfully completed.
SQL>