4. MySQL --> Kafka --> MySQL

This article shows how to use Flink CDC to stream MySQL data through a Kafka message queue and asynchronously load it into another MySQL table. It covers the complete code, the Kafka environment setup, and the end-to-end run, all on a recent Flink CDC release.

1. Kafka environment setup

Pull the Kafka and ZooKeeper images with Docker:

```bash
docker pull wurstmeister/kafka
docker pull wurstmeister/zookeeper
```

Start ZooKeeper first: Kafka registers its broker metadata in ZooKeeper, so it will fail to start without a running ZooKeeper.

```bash
docker run -it --name zookeeper -p 12181:2181 -d wurstmeister/zookeeper:latest
```

Then start Kafka. Note that `KAFKA_ZOOKEEPER_CONNECT` must point at the host address (`192.168.3.252:12181` here, matching the host used in the advertised listener), not `localhost`: inside the Kafka container, `localhost` resolves to the container itself, where nothing listens on 12181.

```bash
docker run -it --name kafka \
  -p 19092:9092 -d \
  -e KAFKA_BROKER_ID=0 \
  -e KAFKA_ZOOKEEPER_CONNECT=192.168.3.252:12181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://192.168.3.252:19092 \
  -e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9092 \
  wurstmeister/kafka:latest
```
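Before wiring Flink to the broker, it is worth a quick connectivity check against the advertised listener. The sketch below is a hypothetical helper (class name `KafkaSmokeTest` is mine, not from the article) that uses the `kafka-clients` AdminClient already declared in the POM in the next section:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

import java.util.Properties;

// Minimal connectivity check against the broker started above.
public class KafkaSmokeTest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Must match KAFKA_ADVERTISED_LISTENERS, or the client cannot reach the broker.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.3.252:19092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Listing topics forces a metadata round trip and fails fast
            // if the advertised listener is wrong.
            System.out.println("Topics: " + admin.listTopics().names().get());
        }
    }
}
```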

<?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>com.sport</groupId> <artifactId>bz-sport-realtime</artifactId> <version>1.0-SNAPSHOT</version> <properties> <maven.compiler.source>8</maven.compiler.source> <maven.compiler.target>8</maven.compiler.target> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> <!-- 统一版本管理 --> <flink.version>1.20.2</flink.version> <scala.binary.version>2.12</scala.binary.version> <!-- Doris Connector基于Scala 2.12 --> <mysql.cdc.version>2.5.0</mysql.cdc.version> <mysql.driver.version>8.0.33</mysql.driver.version> </properties> <dependencies> <!-- Flink核心依赖 --> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-java</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-core</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-base</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-java</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-clients</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-runtime-web</artifactId> <version>${flink.version}</version> </dependency> <!-- Flink Kafka Connector (适配Flink 1.17+) --> <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka --> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-kafka</artifactId> <version>3.4.0-1.20</version> </dependency> <!-- Flink table api 支持 --> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-api-java</artifactId> <version>1.20.2</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-api-java-bridge</artifactId> <version>1.20.2</version> <scope>provided</scope> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-planner_${scala.binary.version}</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-table-common</artifactId> <version>${flink.version}</version> </dependency> <!-- Doris Connector(适配Flink 1.20) --> <dependency> <groupId>org.apache.doris</groupId> <artifactId>flink-doris-connector-1.16</artifactId> <version>25.1.0</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-jdbc</artifactId> <version>3.3.0-1.20</version> </dependency> <!--flink cdc 依赖支持--> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-mysql-cdc</artifactId> <version>3.4.0</version> <!-- <exclusions>--> <!-- <exclusion>--> <!-- <groupId>org.apache.kafka</groupId>--> <!-- <artifactId>kafka-clients</artifactId>--> <!-- </exclusion>--> <!-- </exclusions>--> </dependency> <!--mysql 8 依赖支持--> <dependency> <groupId>com.mysql</groupId> <artifactId>mysql-connector-j</artifactId> <version>8.0.33</version> </dependency> <!-- redis 依赖 --> <dependency> <groupId>org.apache.bahir</groupId> <artifactId>flink-connector-redis_2.12</artifactId> <version>1.1.0</version> </dependency> 
<dependency> <groupId>redis.clients</groupId> <artifactId>jedis</artifactId> <version>5.2.0</version> </dependency> <!-- rocksdb 状态后端依赖支持 --> <!-- <dependency>--> <!-- <groupId>org.apache.flink</groupId>--> <!-- <artifactId>flink-statebackend-rocksdb</artifactId>--> <!-- <version>1.20.2</version>--> <!-- </dependency>--> <!-- Flink S3插件 --> <!-- <dependency>--> <!-- <groupId>org.apache.flink</groupId>--> <!-- <artifactId>flink-s3-fs-hadoop</artifactId>--> <!-- <version>${flink.version}</version>--> <!-- </dependency>--> <!-- <!– AWS SDK v2 (不依赖Hadoop) –>--> <!-- <dependency>--> <!-- <groupId>software.amazon.awssdk</groupId>--> <!-- <artifactId>s3</artifactId>--> <!-- <version>2.20.56</version>--> <!-- </dependency>--> <!-- 其他工具依赖(按需添加) --> <dependency> <groupId>org.apache.httpcomponents</groupId> <artifactId>httpclient</artifactId> <version>4.5.14</version> </dependency> <!-- https://mvnrepository.com/artifact/com.alibaba.fastjson2/fastjson2 --> <dependency> <groupId>com.alibaba.fastjson2</groupId> <artifactId>fastjson2</artifactId> <version>2.0.53</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-json</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.projectlombok</groupId> <artifactId>lombok</artifactId> <version>1.18.30</version> <scope>provided</scope> </dependency> <!-- Hadoop Common --> <dependency> <groupId>org.apache.hadoop</groupId> <artifactId>hadoop-common</artifactId> <version>3.3.6</version> <!-- 请根据你的 HDFS 集群版本选择,比如 2.7.x, 3.2.x, 3.3.x --> </dependency> <!-- Hadoop HDFS Client --> <dependency> <groupId>org.apache.hadoop</groupId> <artifactId>hadoop-hdfs-client</artifactId> <version>3.3.6</version> <!-- 与 hadoop-common 版本保持一致 --> </dependency> <!-- <dependency>--> <!-- <groupId>org.apache.logging.log4j</groupId>--> <!-- <artifactId>log4j-to-slf4j</artifactId>--> <!-- <version>2.14.0</version>--> <!-- </dependency>--> <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-clients</artifactId> <version>3.6.0</version> </dependency> </dependencies> <build> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>3.2.4</version> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> <configuration> <artifactSet> <excludes> <!-- 只排除测试依赖或不必要的依赖 --> <exclude>junit:junit</exclude> <!-- <exclude>org.apache.kafka:kafka-clients</exclude>--> </excludes> </artifactSet> <filters> <filter> <artifact>*:*</artifact> <excludes> <exclude>META-INF/*.SF</exclude> <exclude>META-INF/*.DSA</exclude> <exclude>META-INF/*.RSA</exclude> </excludes> </filter> </filters> </configuration> </execution> </executions> </plugin> </plugins> </build> </project>我的pom是这个
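With the environment and dependencies in place, the first half of the pipeline reads MySQL change events with the CDC source and writes Debezium-style JSON to Kafka. The sketch below is a minimal version of that job; the host `192.168.3.252`, credentials, database `source_db`, table `t_user`, and topic `cdc-topic` are placeholder assumptions, not the article's actual values:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.cdc.connectors.mysql.source.MySqlSource;
import org.apache.flink.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Job 1: MySQL --> Kafka. Placeholder hosts/credentials throughout.
public class MysqlToKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(5000); // the CDC source relies on checkpointing

        // MySQL CDC source emitting Debezium-style JSON change records.
        MySqlSource<String> mysqlSource = MySqlSource.<String>builder()
                .hostname("192.168.3.252")      // hypothetical source host
                .port(3306)
                .databaseList("source_db")      // hypothetical database
                .tableList("source_db.t_user")  // hypothetical table
                .username("root")
                .password("123456")
                .deserializer(new JsonDebeziumDeserializationSchema())
                .build();

        // Kafka sink writing the JSON change records to one topic.
        KafkaSink<String> kafkaSink = KafkaSink.<String>builder()
                .setBootstrapServers("192.168.3.252:19092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("cdc-topic")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
                .build();

        env.fromSource(mysqlSource, WatermarkStrategy.noWatermarks(), "MySQL CDC Source")
           .sinkTo(kafkaSink);

        env.execute("mysql-to-kafka");
    }
}
```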
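The second half consumes the same topic and writes into the target MySQL instance through the JDBC connector, parsing the Debezium JSON with fastjson2 (both declared in the POM). Again a minimal sketch under the same placeholder names: it upserts inserts, updates, and snapshot reads into a hypothetical `target_db.t_user(id, name)` table and skips deletes for brevity:

```java
import com.alibaba.fastjson2.JSON;
import com.alibaba.fastjson2.JSONObject;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcExecutionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Job 2: Kafka --> MySQL. Placeholder topic, table, and credentials.
public class KafkaToMysqlJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("192.168.3.252:19092")
                .setTopics("cdc-topic")        // hypothetical topic written by job 1
                .setGroupId("cdc-to-mysql")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source")
           // Keep creates/updates/snapshot reads; Debezium "op" is c/u/d/r.
           .filter(value -> {
               String op = JSON.parseObject(value).getString("op");
               return "c".equals(op) || "u".equals(op) || "r".equals(op);
           })
           .addSink(JdbcSink.sink(
                   "INSERT INTO t_user (id, name) VALUES (?, ?) "
                           + "ON DUPLICATE KEY UPDATE name = VALUES(name)",
                   (ps, value) -> {
                       // The row image after the change sits under "after".
                       JSONObject after = JSON.parseObject(value).getJSONObject("after");
                       ps.setLong(1, after.getLongValue("id"));
                       ps.setString(2, after.getString("name"));
                   },
                   JdbcExecutionOptions.builder()
                           .withBatchSize(100)
                           .withBatchIntervalMs(200)
                           .build(),
                   new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                           .withUrl("jdbc:mysql://192.168.3.252:3306/target_db")
                           .withDriverName("com.mysql.cj.jdbc.Driver")
                           .withUsername("root")
                           .withPassword("123456")
                           .build()));

        env.execute("kafka-to-mysql");
    }
}
```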
3. Troubleshooting: the job consumes Kafka in local IDEA but fails on YARN

A Flink 1.20.2 job that consumes Kafka fine in local IDEA can fail on a YARN cluster with `org.apache.kafka.common.KafkaException: Failed to construct kafka consumer` and the hint `class org.apache.kafka.common.serialization.ByteArrayDeserializer is not an instance of org.apache.kafka.common.serialization.Deserializer`. That message means the JVM sees two copies of the Kafka classes loaded by different classloaders. Work through the following points against the POM above.

### Check dependency version compatibility

Make sure the `flink-connector-kafka` version matches the Flink version. Since Flink 1.17 the Kafka connector is released separately from Flink and carries no Scala suffix; for Flink 1.20.x the matching artifact is:

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka</artifactId>
    <version>3.4.0-1.20</version>
</dependency>
```

Avoid pulling in incompatible or conflicting Kafka client dependencies, and keep everything on a single `kafka-clients` version.

### Exclude duplicate dependencies

Duplicate dependencies can cause classloading conflicts. Run `mvn dependency:tree` to inspect the dependency tree, find the duplicated Kafka artifacts, and exclude the transitive copies, for example from the CDC connector (the POM above already carries this exclusion commented out):

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-mysql-cdc</artifactId>
    <version>3.4.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

### Check the classloading order

The error is typically tied to Flink's child-first classloading. In `flink-conf.yaml`, either switch `classloader.resolve-order` from the default `child-first` to `parent-first`, or, less invasively, load only the Kafka packages parent-first:

```properties
classloader.resolve-order: parent-first
# or, narrower:
classloader.parent-first-patterns.additional: org.apache.kafka
```

### Check the deserializer configuration

Make sure the Kafka source declares its deserializer through the connector API rather than through raw consumer properties. Example:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FlinkKafkaConsumerExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("test-topic")
                .setGroupId("test-group")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source")
           .print();

        env.execute("Flink Kafka Consumer Example");
    }
}
```

### Check the YARN cluster environment

Make sure the Flink version on the YARN cluster matches the local development environment, and check whether Kafka client jars already sit on the cluster classpath. The `-Dyarn.provided.lib.dirs` option can pin the exact set of Flink jars shipped to the cluster and avoid clashing with jars already present there.
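When this kind of "class X is not an instance of Y" error appears, it also helps to print where the JVM actually loaded the Kafka classes from. A tiny hypothetical diagnostic (run it from the job's `main()` on the cluster, or as a standalone class):

```java
// Minimal diagnostic: print which jar provides the Kafka deserializer classes.
// If the two locations differ, or one points at a cluster-side jar, that is
// the classpath conflict behind the error.
public class WhichJar {
    public static void main(String[] args) {
        System.out.println(org.apache.kafka.common.serialization.ByteArrayDeserializer.class
                .getProtectionDomain().getCodeSource().getLocation());
        System.out.println(org.apache.kafka.common.serialization.Deserializer.class
                .getProtectionDomain().getCodeSource().getLocation());
    }
}
```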