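I recently developed DebeziumMySQL, a NiFi plugin that integrates Debezium to read the MySQL binlog and produce CDC (change data capture) events.
The configuration parameters are as follows: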
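1. Engine Name: name of the engine. It is used internally as the thread name, which makes debugging easier;
2. Offset Filename: file that stores the CDC offset position. By default it is placed in the temporary folder;
3. Offset Flush Interval: how often the CDC offset file is written. Default is 1 minute;
4. Database Host: host name or IP address of the MySQL database. The server must have binlog enabled with the format set to ROW;
5. Database Port: port of the MySQL server (the one used for JDBC connections). Default is 3306;
6. Database User: MySQL user name. The user needs the SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE and REPLICATION CLIENT privileges;
7. Database Password: password of the MySQL user;
8. Database Server ID: Debezium connects to MySQL as if it were a replica, so any number that does not conflict with the server-id of a node in the MySQL cluster will do;
9. Topic Prefix: prefix for topic names. Debezium was built as a Kafka connector, so it expects one; here it only serves as an identifier;
10. Schema History File: file that stores the schema history. By default it is placed in the temporary folder.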
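To make the list above more concrete, here is a rough sketch of how these ten parameters could map onto Debezium engine properties. This is not the plugin's actual source. The property keys are the ones I know from the Debezium 2.x embedded engine and MySQL connector, and the class name, method name, file names and example values (including the server ID 5400) are made up for illustration. It only builds a java.util.Properties object, so it runs without Debezium on the classpath.

```java
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper: assumes Debezium 2.x property names, not taken from the plugin itself.
public class DebeziumMySqlConfigSketch {

    static Properties buildConfig(String engineName, String offsetFile, long flushIntervalMs,
                                  String host, int port, String user, String password,
                                  long serverId, String topicPrefix, String schemaHistoryFile) {
        Properties props = new Properties();
        // 1. Engine Name
        props.setProperty("name", engineName);
        props.setProperty("connector.class", "io.debezium.connector.mysql.MySqlConnector");
        // 2. Offset Filename and 3. Offset Flush Interval (file-based offset store)
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", offsetFile);
        props.setProperty("offset.flush.interval.ms", Long.toString(flushIntervalMs));
        // 4-7. Connection settings
        props.setProperty("database.hostname", host);
        props.setProperty("database.port", Integer.toString(port));
        props.setProperty("database.user", user);
        props.setProperty("database.password", password);
        // 8. Must differ from the server-id of every MySQL node
        props.setProperty("database.server.id", Long.toString(serverId));
        // 9. Topic Prefix
        props.setProperty("topic.prefix", topicPrefix);
        // 10. Schema History File (file-based schema history)
        props.setProperty("schema.history.internal", "io.debezium.storage.file.history.FileSchemaHistory");
        props.setProperty("schema.history.internal.file.filename", schemaHistoryFile);
        return props;
    }

    public static void main(String[] args) {
        String tmp = System.getProperty("java.io.tmpdir");
        Properties props = buildConfig(
                "debezium-mysql-engine",                               // illustrative engine name
                Paths.get(tmp, "debezium-offsets.dat").toString(),     // illustrative file name
                60_000L,                                               // 1 minute
                "localhost", 3306, "debezium", "debezium",
                5400L,                                                 // illustrative server ID
                "nifi",                                                // illustrative prefix
                Paths.get(tmp, "debezium-schema-history.dat").toString());
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

In a real embedded setup, properties like these would be handed to the Debezium engine builder. Whether the plugin does exactly that internally is my assumption.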
MySQL must have binlog enabled. My configuration is as follows:
[mysqld]
# ----------------------------------------------
# Enable the binlog for replication & CDC
# ----------------------------------------------
# Enable binary replication log and set the prefix, expiration, and log format.
# The prefix is arbitrary, expiration can be short for integration tests but would
# be longer on a production system. Row-level info is required for ingest to work.
# Server ID is required, but this will vary on production systems
server-id = 223344
log_bin = C:/mysql-5.7.38-winx64/binlog/mysql-bin
expire_logs_days = 10
binlog_format = ROW
binlog_row_image = FULL
binlog_rows_query_log_events = on
# Mysql Packet Size may need to be re-configured. MySQL may have, by default, a ridiculously low allowable packet size.
# To increase it, you’ll need to have the property max_allowed_packet set to a higher number, say 1024M.
max_allowed_packet= 1024M
default-time-zone = "+08:00"
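Create the user and grant privileges (for simplicity I grant everything here; the privileges listed under Database User are sufficient):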
CREATE USER 'debezium'@'localhost' IDENTIFIED BY 'debezium';
GRANT ALL PRIVILEGES ON *.* TO debezium@'localhost';
FLUSH PRIVILEGES;
Plugin download: https://download.csdn.net/download/hejiangtju/87569518
After downloading, put it in NiFi's extensions directory. NiFi 1.15.3 or later is required.