第四章 FlinkSQL集成Iceberg实战案例之Mysql实时同步

在这里插入图片描述

1、开启binlog日志

1.1、在mysql中对需要进行实时数据监测的库开启binlog同步

sudo vim /etc/my.cnf

#添加数据库的binlog
server-id=1
log-bin=mysql-bin
binlog_format=row
binlog-do-db=m_iceberg


#重启MySQL服务
sudo systemctl restart mysqld
  • 具体如下

在这里插入图片描述

1.2、查询是否开启

cd /var/lib/mysql
  • 查询同步日志

在这里插入图片描述

2、实时写入前提

  • 实时写入必要配置:在flink-conf.yaml中配置checkpoints相关参数
execution.checkpointing.interval: 3min
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
execution.checkpointing.max-concurrent-checkpoints: 1
execution.checkpointing.min-pause: 0
execution.checkpointing.mode: EXACTLY_ONCE
execution.checkpointing.timeout: 10min
execution.checkpointing.tolerable-failed-checkpoints: 0
execution.checkpointing.unaligned: false
#
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the <class-name-of-factory>.
state.backend: filesystem

# Directory for checkpoints filesystem, when using any of the default bundled state backends.
state.checkpoints.dir: hdfs:///flink/flink-checkpoints

# Default target directory for savepoints, optional.
state.savepoints.dir: hdfs:///flink/flink-savepoints
  • 环境依赖包如下
flink-sql-connector-kafka_2.11-1.13.5.jar
flink-sql-connector-mysql-cdc-1.3.0.jar
#下载地址:https://repo.maven.apache.org/maven2/com/alibaba/ververica/flink-sql-connector-mysql-cdc/1.3.0/

3、数据源创建

CREATE TABLE `product_view` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`product_id` int(11) NOT NULL,
`server_id` int(11) NOT NULL,
`duration` int(11) NOT NULL,
`times` varchar(11) NOT NULL,
`time` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `time` (`time`),
KEY `user_product` (`user_id`,`product_id`) USING BTREE,
KEY `times` (`times`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- 样本数据
INSERT INTO `product_view` VALUES ('1', '1', '1', '1', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('2', '1', '1', '1', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('3', '1', '1', '3', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view_sink` VALUES ('4', '1', '1', '2', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('5', '8', '1', '1', '120', '120', '2020-05-14 13:14:00');
INSERT INTO `product_view` VALUES ('6', '8', '1', '2', '120', '120', '2020-05-13 13:14:00');
INSERT INTO `product_view` VALUES ('7', '8', '1', '3', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('8', '8', '1', '3', '120', '120', '2020-04-23 13:14:00');
INSERT INTO `product_view` VALUES ('9', '8', '1', '2', '120', '120', '2020-05-13 13:14:00');

INSERT INTO `product_view` VALUES (10, 8, 1, 2, 120, '120', '2020-05-13 13:14:00');

4、创建数据表关联Mysql

  • 含义:我们在flink sql client操作这个表相当于操作mysql里面的对应表。
CREATE TABLE `default_catalog`.`default_database`.`product_view_source` (
    `id` int,
    `user_id` int,
    `product_id` int,
    `server_id` int,
    `duration` int,
    `times` string,
    `time` timestamp,
    PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
    'connector' = 'mysql-cdc',
    'hostname' = '192.168.6.102',
    'port' = '3306',
    'username' = 'root',
    'password' =,
    'database-name' = 'm_iceberg',
    'table-name' = 'product_view'
);

5、Mysql表入湖

flink-cdc-mysql2kafka

  • Flink CDC入湖
CREATE EXTERNAL TABLE product_view_sink (
    `id` int,
    `user_id` int,
    `product_id` int,
    `server_id` int,
    `duration` int,
    `times` string,
    `time` timestamp,
    PRIMARY KEY (`id`) NOT ENFORCED
) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
TBLPROPERTIES ( 'write.metadata.delete-after-commit.enabled'='true',
 'write.metadata.previous-versions-max'='5',
 'format-version'='2');

insert into product_view_sink select * from default_catalog.default_database.product_view_source;
  • 验证结果:Source表(在default_catalog)可实时查看,Sink表(在hive_Iceberg)无法实时查看

在这里插入图片描述

6、数据的增删改同步

  • 验证结果:Flink1.14.5支持数据增删改同步
  • hive建表配置
TBLPROPERTIES ( 'write.metadata.delete-after-commit.enabled'='true',
 'write.metadata.previous-versions-max'='5',
 'format-version'='2');

6.1、数据增加验证

INSERT INTO `product_view` VALUES ('100', '8', '1', '2', '120', '120', '2020-05-13 13:14:00');
  • 验证结果

在这里插入图片描述

6.2、数据更新验证

UPDATE product_view SET user_id = 100 WHERE id = 100;
  • 验证结果:

在这里插入图片描述

6.3、数据删除验证

delete from product_view where id = 100;
  • 验证结果:

在这里插入图片描述

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

随缘清风殇

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值