Flink: Real-Time MySQL Data Sync to Iceberg

Data development platform: GitHub - 642933588/jiron-cloud: This project integrates several excellent open-source products into a full-featured data development platform, providing data integration, data development, data query, data services, data quality management, workflow scheduling, and metadata management. #dinky #dolphinscheduler #datavines #flinkcdc #openmetadata #flink #data-development #data-platform #big-data

Goal:

Use the jiron data development platform to sync MySQL data into an Iceberg data lake in real time.

Preparation:

Start the Hadoop file system: ./start-dfs.sh

Start the Hive metastore: nohup hive --service metastore 2>&1 &

Add the required connector dependencies to the Flink cluster, then start the cluster.
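The exact jars depend on your Flink, Iceberg, and Hive versions. As a rough sketch (the version numbers below are assumptions — match them to your installation), the connector jars are typically dropped into $FLINK_HOME/lib before starting the cluster:

```shell
# Hypothetical versions -- align with your Flink / Iceberg / Hive installation
cp flink-sql-connector-mysql-cdc-2.4.1.jar        $FLINK_HOME/lib/
cp iceberg-flink-runtime-1.17-1.4.2.jar           $FLINK_HOME/lib/
cp flink-sql-connector-hive-3.1.2_2.12-1.17.1.jar $FLINK_HOME/lib/
```

Restart the Flink cluster after copying so the new jars are picked up.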

Add the same dependencies in Dinky, then submit the following FlinkSQL job:

-- Checkpoint every 10s: the CDC source commits offsets and the Iceberg sink commits snapshots on checkpoints
SET 'execution.checkpointing.interval' = '10s';
-- Idle state retention; no unit given, so Flink interprets the value as milliseconds
SET 'table.exec.state.ttl' = '8640000';
-- Mini-batch: buffer input records to reduce per-record state access
SET 'table.exec.mini-batch.enabled' = 'true';
SET 'table.exec.mini-batch.allow-latency' = '60s';
SET 'table.exec.mini-batch.size' = '10000';
SET 'table.local-time-zone' = 'Asia/Shanghai';
-- Drop rows that violate NOT NULL constraints instead of failing the job
SET 'table.exec.sink.not-null-enforcer' = 'DROP';


CREATE TABLE activity_info_full_mq (
    `id` BIGINT NOT NULL COMMENT 'activity id',
    `activity_name` STRING NULL COMMENT 'activity name',
    `activity_type` STRING NULL COMMENT 'activity type',
    `activity_desc` STRING NULL COMMENT 'activity description',
    `start_time` TIMESTAMP(3) NULL COMMENT 'start time',
    `end_time` TIMESTAMP(3) NULL COMMENT 'end time',
    `create_time` TIMESTAMP(3) NULL COMMENT 'creation time',
 PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
 'connector' = 'mysql-cdc',
 'scan.startup.mode' = 'earliest-offset',
 'hostname' = '192.168.3.22',
 'port' = '3306',
 'username' = 'root',
 'password' = '*****',
 'database-name' = 'gmall',
 'table-name' = 'activity_info',
 'server-time-zone' = 'Asia/Shanghai'
);
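Before wiring up the sink, the CDC source can be sanity-checked from the SQL client. A hedged example — this runs as an unbounded streaming query against the change stream until you cancel it:

```sql
-- Preview the MySQL change stream (streams until cancelled)
SELECT id, activity_name, activity_type, create_time
FROM activity_info_full_mq
LIMIT 10;
```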


CREATE CATALOG iceberg_catalog WITH (
 'type' = 'iceberg',
 'metastore' = 'hive',
 'uri' = 'thrift://192.168.3.22:9083',
 'hive-conf-dir' = '/opt/module/hive/conf',
 'hadoop-conf-dir' = '/opt/module/hadoop/etc/hadoop',
 'warehouse' = 'hdfs://192.168.3.22:9000/user/hive/warehouse'
);


USE CATALOG iceberg_catalog;


CREATE DATABASE IF NOT EXISTS iceberg_ods;


CREATE TABLE IF NOT EXISTS iceberg_ods.ods_activity_info_full (
    `id`            BIGINT COMMENT 'activity id',
    `k1`            STRING COMMENT 'partition field',
    `activity_name` STRING COMMENT 'activity name',
    `activity_type` STRING COMMENT 'activity type',
    `activity_desc` STRING COMMENT 'activity description',
    `start_time`    STRING COMMENT 'start time',
    `end_time`      STRING COMMENT 'end time',
    `create_time`   STRING COMMENT 'creation time',
 PRIMARY KEY (`id`, `k1`) NOT ENFORCED
) PARTITIONED BY (`k1`) WITH (
 'catalog-name' = 'hive_prod',
 'uri' = 'thrift://192.168.3.22:9083',
 'warehouse' = 'hdfs://192.168.3.22:9000/user/hive/warehouse/',
 -- upsert writes require Iceberg format v2
 'format-version' = '2'
);
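To confirm the table has been registered in the Hive metastore, you can inspect it from the same session (a quick check, not part of the pipeline itself):

```sql
-- Verify the table exists and inspect its schema
SHOW TABLES FROM iceberg_ods;
DESCRIBE iceberg_ods.ods_activity_info_full;
```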


INSERT INTO iceberg_ods.ods_activity_info_full /*+ OPTIONS('upsert-enabled'='true') */
    (`id`, `k1`, `activity_name`, `activity_type`, `activity_desc`, `start_time`, `end_time`, `create_time`)
SELECT
    id,
    DATE_FORMAT(create_time, 'yyyy-MM-dd') AS k1,
    activity_name,
    activity_type,
    activity_desc,
    DATE_FORMAT(start_time, 'yyyy-MM-dd HH:mm:ss') AS start_time,
    DATE_FORMAT(end_time, 'yyyy-MM-dd HH:mm:ss') AS end_time,
    DATE_FORMAT(create_time, 'yyyy-MM-dd HH:mm:ss') AS create_time
FROM default_catalog.default_database.activity_info_full_mq
WHERE create_time IS NOT NULL;
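Once the streaming job is running, you can verify that rows are landing in the lake by querying the Iceberg table in batch mode from a separate SQL session (a sketch — run it in a new session so the streaming job keeps running):

```sql
-- Batch read of the Iceberg table, grouped by partition
SET 'execution.runtime-mode' = 'batch';

SELECT k1, COUNT(*) AS row_cnt
FROM iceberg_ods.ods_activity_info_full
GROUP BY k1;
```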

Start the job:

Monitor its progress in the Flink Web UI.

Synchronization succeeded.
