oracle cdc_Debezium-CDC /Oracle

oracle cdc

Imagine the following scenario, we have a CUSTOMERS table in the ORACLE database that we need to capture changes instantly (CDC — change data capture) and monitor these changes to analyze the changed data in real time.

想象一下以下情况,我们在ORACLE数据库中有一个CUSTOMERS表,我们需要立即捕获更改(CDC —更改数据捕获)并监视这些更改以实时分析更改的数据。

In our test environment, we set up a POC to validate this scenario. This article has this purpose, to describe how this POC was made and its outcome.

在我们的测试环境中,我们设置了一个POC来验证这种情况。 本文的目的是描述此POC的制作方法及其结果。

Image for post
Debezium — Conceptual Architecture
Debezium —概念体系结构

Oracle安装程序 (Oracle Setup)

Oracle 12c was installed on a GCP instance with the following config:

使用以下配置在GCP实例上安装了Oracle 12c:

Image for post

The .bashrc file that loads variables when accessing user oracle has the following content:

访问用户oracle时加载变量的.bashrc文件具有以下内容:

export ORACLE_BASE=/u02/app/oracle/product/12.2.0 
export ORACLE_HOME=$ORACLE_BASE/dbhome_1
export ORACLE_SID=dbz
export PATH=$PATH:$ORACLE_HOME/bin

Following, here is the step by step in how the instance configuration was done:

接下来,这是逐步完成实例配置的步骤:

Image for post
Image for post

IMPORTANT

重要

In the ‘Global database name’ parameter, verify if the FQDN has been deleted. We had problems when the domain name was included in global database parameter. Debezium was not able to publish the changes in Kafka despite showing in the logs. This specific point is one of the issues that must be taken into account as this connector is still in the development phase, as highlighted on it website.

“全局数据库名称”参数中,验证FQDN是否已删除。 当域名包含在全局数据库参数中时,我们遇到了问题。 尽管在日志中显示,Debezium仍无法发布Kafka中的更改。 正如其网站上突出显示的那样,此特定点是必须考虑的问题之一,因为此连接器仍处于开发阶段。

The good news is, it's in constantly developing status.

好消息是,它处于不断发展的状态。

Info: https://debezium.io/documentation/reference/1.1/connectors/oracle.html

信息: https : //debezium.io/documentation/reference/1.1/connectors/oracle.html

This connector is currently in incubating state, i.e. exact semantics, configuration options etc. may change in future revisions, based on the feedback we receive. Please let us know if you encounter any problems.

该连接器目前处于孵化状态,即确切的语义,配置选项等可能会在将来的版本中根据收到的反馈进行更改。 如果您遇到任何问题,请告诉我们。

Image for post
File spfile:
dbz.__data_transfer_cache_size=0
dbz.__db_cache_size=570425344
dbz.__inmemory_ext_roarea=0
dbz.__inmemory_ext_rwarea=0
dbz.__java_pool_size=16777216
dbz.__large_pool_size=150994944
dbz.__oracle_base=’/u02/app/oracle/product/12.2.0' #ORACLE_BASE set from environment
dbz.__pga_aggregate_target=520093696
dbz.__sga_target=973078528
dbz.__shared_io_pool_size=0
dbz.__shared_pool_size=218103808
dbz.streams_pool_size=0
*.audit_file_dest=’/u02/app/oracle/product/12.2.0/admin/dbz/adump’
*.audit_trail=’db’
*.compatible=’12.2.0'
*.control_files=’/u02/app/oracle/product/12.2.0/oradata/DBZ/controlfile/o1_mf_hfmglp6w.ctl’,’/u02/app/oracle/product/12.2.0/fast_recovery_area/dbz/DBZ/controlfile/o1_mf_hfmglphq.ctl’
*.db_block_size=8192
*.db_create_file_dest=’/u02/app/oracle/product/12.2.0/oradata’
*.db_name=’dbz’
*.db_recovery_file_dest_size=8016m
*.db_recovery_file_dest=’/u02/app/oracle/oradata/DBZ/recovery_area’
*.diagnostic_dest=’/u02/app/oracle/product/12.2.0'
*.dispatchers=’(PROTOCOL=TCP) (SERVICE=dbzXDB)’
*.enable_goldengate_replication=TRUE
*.enable_pluggable_database=true
*.memory_target=1414m*.open_cursors=300
*.processes=300
*.remote_login_passwordfile=’EXCLUSIVE’
*.undo_tablespace=’UNDOTBS1'

Now in the instance, it will be necessary to make a configuration for Debezium be able to capture applied changes to the tables which we want to monitor.

现在,在该实例中,有必要对Debezium进行配置,使其能够捕获对我们要监视的表的应用更改。

Following, bellow you can find settings as mentioned on the official page:

在下面,您可以找到官方页面上提到的设置:

sqlplus /nolog
CONNECT sys/your_password AS SYSDBA
alter system set db_recovery_file_dest_size = 5G;
alter system set db_recovery_file_dest = ‘/u02/app/oracle/oradata/DBZ/recovery_area’ scope=spfile;
alter system set enable_goldengate_replication=true;
shutdown immediate

startup mount
alter database archivelog;
alter database open;
— Should show “Database log mode: Archive Mode”
archive log list

The above paths were based on our environment. Change it to reflect yours.

以上路径均基于我们的环境。 更改它以反映您的。

Image for post

Although it is not necessary to install the Oracle Golden Gate binaries, it is necessary to enable replication as we did in the previous step. If you want to confirm if its already enabled, use the command:

尽管没有必要安装Oracle Golden Gate二进制文件 ,但是有必要像上一步中那样启用复制 。 如果要确认其是否已启用,请使用以下命令:

show parameters enable_goldengate_replication
Image for post

Debezium ingests data by accessing the XStream API and to use it, you must have an Oracle GoldenGate license. You don’t need any serial number to enable it, just use the command:

Debezium通过访问XStream API提取数据,并且要使用它,您必须具有Oracle GoldenGate许可证。 您不需要任何序列号即可启用它,只需使用以下命令:

alter system set enable_goldengate_replication = true;

Info: more details about XStream API

信息: 有关XStream API的详细信息

Now it will be necessary to create the tablespaces for the XStream user whose will capture the data:

现在,有必要为XStream用户创建表空间,该用户将捕获数据:

sqlplus /nolog
CONNECT sys/your_password AS SYSDBACREATE TABLESPACE xstream_adm_tbs DATAFILE ‘/u02/app/oracle/oradata/DBZ/xstream_adm_tbs.dbf’
SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;

The next step is to create an XStream administrator user and a user for the Debezium connector. Create an XStream administrator user in the container database (used by Oracle’s recommendation to administer XStream):

下一步是创建XStream管理员用户和Debezium连接器用户。 在容器数据库中创建一个XStream管理员用户(由Oracle建议使用它来管理XStream):

CREATE USER c##xstrmadmin IDENTIFIED BY xsa
DEFAULT TABLESPACE xstream_adm_tbs
QUOTA UNLIMITED ON xstream_adm_tbs
CONTAINER=ALL;GRANT CREATE SESSION, SET CONTAINER TO c##xstrmadmin CONTAINER=ALL;

Give the grant for the admin user be able to capture changes in XStream:

管理员用户授予授予权限,使其能够捕获XStream中的更改:

BEGIN
DBMS_XSTREAM_AUTH.GRANT_ADMIN_PRIVILEGE(
grantee => ‘c##xstrmadmin’,
privilege_type => ‘CAPTURE’,
grant_select_privileges => TRUE,
container => ‘ALL’
);
END;
/

Create an user which Debezium will use to connect to Oracle and give the necessary grant to work as it is needed:

创建一个供Debezium用于连接到Oracle的用户,并提供必要的授权以根据需要工作:

CREATE USER c##xstrm IDENTIFIED BY xs
DEFAULT TABLESPACE xstream_tbs
QUOTA UNLIMITED ON xstream_tbs
CONTAINER=ALL;GRANT CREATE SESSION TO c##xstrm CONTAINER=ALL;
GRANT SET CONTAINER TO c##xstrm CONTAINER=ALL;
GRANT SELECT ON V_$DATABASE to c##xstrm CONTAINER=ALL;
GRANT FLASHBACK ANY TABLE TO c##xstrm CONTAINER=ALL;

Now it's needed to create an outbound XStream server with the right privileges:

现在需要创建具有正确特权的出站XStream服务器:

connect c##xstrmadmin/xsaDECLARE
tables DBMS_UTILITY.UNCL_ARRAY;
schemas DBMS_UTILITY.UNCL_ARRAY;
BEGIN
tables(1) := NULL;
schemas(1) := ‘debezium’;
DBMS_XSTREAM_ADM.CREATE_OUTBOUND(
server_name => ‘dbzxout’,
table_names => tables,
schema_names => schemas);
END;
/

Create the CUSTOMERS table that will be monitored by the connector:

创建将由连接器监视的CUSTOMERS表

ALTER SESSION SET CURRENT_SCHEMA = debezium;create table CUSTOMERS (
id NUMBER(10),
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(50),
gender VARCHAR(50),
club_status VARCHAR(8),
comments VARCHAR(90),
create_ts timestamp DEFAULT CURRENT_TIMESTAMP ,
update_ts timestamp
); — Enable supplemental log on CUSTOMERS
ALTER TABLE debezium.customers ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS; — Grant select to debezium connector user
GRANT SELECT ON debezium.customers to c##xstrm;

Don’t forget to enable SUPPLEMENTAL LOG DATA in this table.

不要忘记在此表中启用SUPPLEMENTAL LOG DATA

Oracle Debezium连接器 (Oracle Debezium Connector)

As the installation and configuration of the instance that we will use to make the capture is finished, now it is necessary to configure Debezium connector in Kafka. For this, it is necessary to have a machine with a working Kafka setup, what it's not being covered in this article.

完成用于捕获的实例的安装和配置后,现在需要在Kafka中配置Debezium连接器。 为此,必须有一台具有可正常使用的Kafka设置的机器,而本文未介绍。

By licensing purposes, Debezium Oracle Connector is not provided with the Oracle JDBC driver and the XStream API .jar. The prerequisites for installing the Debezium Oracle Connector are:

出于许可目的,Oracle JDBC驱动程序和XStream API .jar不提供Debezium Oracle Connector。 安装Debezium Oracle Connector的前提条件是:

  • Kafka

    卡夫卡
  • Kafka Connect

    Kafka Connect
  • Oracle Instant Client (For a free download of Oracle Instant Client, click here)

    Oracle Instant Client(要免费下载Oracle Instant Client,请单击此处 )

At the time of this P.O.C., the current Debezium version that was tested was Debezium Oracle (1.2) and it can be downloaded from the official project page:https://debezium.io/documentation/reference/1.2/connectors/oracle.html

在发布此POC时,当前经过测试的Debezium版本是Debezium Oracle(1.2),可以从官方项目页面下载: https ://debezium.io/documentation/reference/1.2/connectors/oracle.html

After downloaded, extract the Oracle client file into a directory, for example ‘/path/to/instant_client/’ and copy the files * _ojdbc8.jar and xstreams.jar to the Kafka libs directory. Export the environment variable LD_LIBRARY_PATH, (preferably in the .bashrc of the user whose runs Kafka) pointing to the Instant Client directory:

下载后,将Oracle客户端文件解压缩到目录中,例如“ / path / to / instant_client / ”,然后将文件_ojdbc8.jarxstreams.jar复制到Kafka libs目录中。 导出环境变量LD_LIBRARY_PATH ,(最好在运行Kafka的用户的.bashrc中)指向Instant Client目录:

export LD_LIBRARY_PATH = /path/to/instant_client /

Debezium连接器配置 (Debezium Connector Configuration)

Register the Debezium Oracle connector instance using the configuration file that will be created. In our environment we created within the installation path of Kafka:

使用将创建的配置文件注册Debezium Oracle连接器实例。 在我们的环境中,我们在Kafka的安装路径中创建了:

/opt/kafka/config/debezium-oracle-connector.properties

Content:

内容:

name=debezium-ora-001
connector.class=io.debezium.connector.oracle.OracleConnector
db_type=oracle
tasks.max=1
database.server.name=hostname_of_database
database.tablename.case.insensitive=true
database.oracle.version=12+
database.hostname=10.23.131.132
database.port=1522
database.user=c##xstrm
database.password=xs
database.dbname=dbz
database.pdb.name=dbz1
database.out.server.name=dbzxout
database.history.kafka.bootstrap.servers=10.23.131.131:9092
database.history.kafka.topic=debezium.oracle
database.history.skip.unparseable.ddl=true
include.schema.changes=true
table.whitelist=DEBEZIUM.CUSTOMERS
errors.log.enable=true

Change it accordingly to your environment configuration.

相应地将其更改为您的环境配置。

If you are doing the first tests, I suggest you to use the curl command to register the instance settings until you find a properly configuration that matches your scenario:

如果您正在进行首次测试,建议您使用curl命令注册实例设置,直到找到与您的情况相匹配的正确配置:

curl -i -X POST -H “Accept:application/json” -H “Content-Type:application/json” http://ip_kafka_connect:8083/connectors/ -d ‘{
“name”:”debezium-ora-001",
“config”: {
“connector.class”:”io.debezium.connector.oracle.OracleConnector”,
“db_type”:”oracle”,
“tasks.max”:”1",
“database.server.name”:”hostname_of_database",
“database.tablename.case.insensitive”:”true”,
“database.oracle.version”:”12+”,
“database.hostname”:”10.23.131.132",
“database.port”:”1522",
“database.user”:”c##xstrm”,
“database.password”:”xs”,
“database.dbname”:”dbz”,
“database.pdb.name”:”dbz1",
“database.out.server.name”:”dbzxout”,
“database.history.kafka.bootstrap.servers”:”10.23.131.131:9092",
“database.history.kafka.topic”:”debezium.oracle”,
“database.history.skip.unparseable.ddl”:”true”,
“include.schema.changes”:”true”,
“table.whitelist”:”DEBEZIUM.CUSTOMERS”,
“errors.log.enable”:”true”
}
}’

To be successful running this, it is necessary to have Kafka Connect started. In our case, we used the following command:

要成功运行此程序,必须启动Kafka Connect 。 在我们的例子中,我们使用以下命令:

cd /opt/kafka
bin/connect-standalone.sh config/connect-standalone.properties config/debezium-oracle-connector.properties

After insert the line and execute the commit; at Oracle database, the connector that is running should already show on log the changed row that was inserted in the table:

在插入行并执行提交之后; 在Oracle数据库中,正在运行的连接器应该已经在日志中显示了插入表中的更改行:

Image for post

Our log was configured to output in DEBUG mode just to make easier the visualization of possible issues. It's possible to configure it by this way, modifying the following file:

我们的日志配置为以DEBUG模式输出,只是为了使可视化更容易。 通过修改以下文件,可以通过这种方式进行配置:

$KAFKA_HOME/config/connect-log4j.properties

Change the value INFO to DEBUG:

将值INFO更改为DEBUG:

log4j.rootLogger=DEBUG, stdout, connectAppender

At this moment, when the connector had showed in the log that it captured the CDC, the message will be already published in the configured Kafka topic in the connector (database.server.name + table.whitelist) ie hostname_of_database.DEBEZIUM.CUSTOMERS:

此时,当连接器在日志中显示它已捕获CDC时,该消息将已经在连接器中配置的Kafka主题中发布( database.server.name + table.whitelist ),即hostname_of_database.DEBEZIUM.CUSTOMERS

Image for post

To view, just execute the following command:

要查看,只需执行以下命令:


kafka-console-consumer.sh — bootstrap-server localhost:9092 — topic TOPIC_NAME — from-beginning

If you are not sure about the topic name, you can list all topics with the command:

如果不确定主题名称,可以使用以下命令列出所有主题:

kafka-topics.sh — list — bootstrap-server localhost:9092

With that, we've concluded the capture part and those messages published in Kafka topic can be explored in many ways.

到此为止,我们已经结束了捕获部分,并且可以用多种方式来探索在Kafka主题中发布的那些消息。

故障排除 (Troubleshooting)

These are some errors that we've encountered and was necessary to find a workaround. I believe in future versions it may no longer exists. But anyway, it worth mentioning:

这些是我们遇到的一些错误,是找到解决方法所必需的。 我相信在将来的版本中可能不再存在。 但是无论如何,值得一提的是:

Error

错误

Caused by: org.apache.kafka.connect.errors.ConnectException: ….Caused by: java.lang.IllegalArgumentException: No metadata registered for captured table dbz1.fqdn.DEBEZIUM.customers

引起原因:org.apache.kafka.connect.errors.ConnectException:…。引起原因:java.lang.IllegalArgumentException:未为捕获的表dbz1.fqdn.DEBEZIUM.customers注册任何元数据。

Solution

Change table.whitelist com with the <databaseName>.<tableName> format as show in the error log:“table.whitelist”: “dbz1.DEBEZIUM.customers”

如错误日志中所示,使用<databaseName>。<tableName>格式更改table.whitelist com:“ table.whitelist”:“ dbz1.DEBEZIUM.customers”

Error

错误

io.debezium.DebeziumException: The db history topic or its content is fully or partially missing. Please check database history topic configuration and re-execute the snapshot.

io.debezium.DebeziumException:数据库历史记录主题或其内容完全或部分丢失。 请检查数据库历史记录主题配置,然后重新执行快照。

Solution

Change the name of the database.history.kafka.topic in the connector config:“database.history.kafka.topic”:”debezium.oracle”, ← — — — — change

在连接器配置中更改database.history.kafka.topic的名称:“ database.history.kafka.topic”:“ debezium.oracle”,←— — — —更改

Error

错误

Caused by: java.lang.RuntimeException: java.sql.SQLException: ORA-08180: no snapshot found based on specified timeORA-06512: at “SYS.TIMESTAMP_TO_SCN”, line 1

由以下原因引起:java.lang.RuntimeException:java.sql.SQLException:ORA-08180:根据指定的时间未找到快照ORA-06512:在“ SYS.TIMESTAMP_TO_SCN”的第1行

Solution

ALTER TABLE ADD and REMOVE a column for example will generate a new snapshot on this object.

例如,ALTER TABLE ADD和REMOVE列将在此对象上生成一个新的快照。

Adding Column:alter table CUSTOMERS add (teste VARCHAR2(50));

添加Column:Alter表CUSTOMERS add(teste VARCHAR2(50));

Removing Column:alter table CUSTOMERS SET UNUSED (teste);alter table CUSTOMERS DROP UNUSED COLUMNS;

删除Column:alter table CUSTOMERS SET UNUSED(teste); alter table CUSTOMERS DROP UNUSED COLUMNS;

翻译自: https://medium.com/@lmramos.usa/debezium-cdc-oracle-931c8a62023b

oracle cdc

Debezium Connector for Oracle is a software tool that provides continuous data streaming and change data capture (CDC) capabilities for Oracle databases. It allows users to capture changes to the data in real-time and convert them into a stream of events that can be consumed by other applications, such as Apache Kafka. The Debezium Connector for Oracle uses Oracle's LogMiner technology to read the database's transaction logs and capture changes as they occur. It supports both on-premises and cloud-based Oracle databases and can be configured to capture changes to specific tables or entire databases. Some of the benefits of using the Debezium Connector for Oracle include: - Real-time data streaming: The connector captures changes to data in real-time, allowing other applications to consume the data as soon as it's available. - Reduced data integration complexity: With the Debezium Connector, users can easily integrate data from Oracle databases into other applications without the need for complex ETL processes. - Improved data accuracy: By capturing changes to data as they occur, the connector ensures that the data being consumed by other applications is always up-to-date and accurate. - Lower latency and improved performance: Because the connector captures changes in real-time, it eliminates the need for expensive polling and reduces the latency of data delivery. - High availability and scalability: The connector is designed to be highly available and scalable, making it suitable for use in large-scale enterprise environments.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值