0. Preface
Project (GitHub): FlinkStreamETL
Requirement: use Flink to compute the incremental data of a MySQL database in real time

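Approach: Canal reads the MySQL binlog in real time and acts as the Kafka producer; Flink then acts as the Kafka consumer and reads the data from Kafka. For now the job only reads the JSON-formatted data from Kafka; the real-time computation logic will be written later according to business requirements.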
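On the consumer side, this approach mostly comes down to a few Kafka client settings that are handed to Flink's Kafka source. The following is a minimal sketch rather than the project's actual code: the class name `CanalKafkaConfig`, the broker address `localhost:9092`, the group id `flink-canal-demo`, the topic name `example`, and the choice of `auto.offset.reset=latest` are all assumptions made for illustration. The Flink wiring is shown only in comments so that the class compiles without the Flink dependencies.

```java
import java.util.Properties;

/** Sketch: Kafka client settings for the Flink consumer side of the pipeline. */
public class CanalKafkaConfig {

    /**
     * Builds the Properties object passed to the Kafka source.
     * Both arguments are placeholders; substitute your own cluster's values.
     */
    public static Properties consumerProps(String bootstrapServers, String groupId) {
        if (bootstrapServers == null || bootstrapServers.isEmpty()) {
            throw new IllegalArgumentException("bootstrapServers must not be empty");
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Assumption: start from the newest records when the group has no committed offset.
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "flink-canal-demo");
        System.out.println(props.getProperty("bootstrap.servers"));

        // With flink-connector-kafka on the classpath, the wiring would look roughly like:
        //   StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        //   env.addSource(new FlinkKafkaConsumer<>("example", new SimpleStringSchema(), props)).print();
        //   env.execute("canal-kafka-reader");
    }
}
```

Reading each record as a plain string with `SimpleStringSchema` matches the current state of the job, which only reads the JSON text and leaves parsing to later business logic.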
Versions used (server):
Kafka: Kafka 2.1.0-cdh6.2.0
Flink: <flink.version>1.9.0</flink.version>
Java: <java.version>1.8</java.version>
Versions used (local machine):
Flink: <flink.version>1.9.0</flink.version>
Scala: <scala.binary.version>2.12</scala.binary.version>
Java: <java.version>1.8</java.version>
Kafka: flink-connector-kafka_${scala.binary.version} (Scala version 2.12)
1. Create the Maven project
mvn archetype:generate \
    -DarchetypeGroupId=org.apache.flink \
    -DarchetypeArtifactId=flink-quickstart-java \
    -DarchetypeVersion=1.9.0 \
    -DgroupId=flink-connector-kafka \
    -DartifactId=flink-connector-kafka \
    -Dversion=0.1 \
    -Dpackage=myflink \
    -DinteractiveMode=false
The pom file
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>flink-connector-kafka</groupId>
<artifactId>flink-connector-kafka</artifactId>
<version>0.1</version>
<packaging>jar</packaging>
<name>Flink Quickstart Job</name>
<url>http://www.myorganization.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<flink.version>1.9.0</flink.version>
<java.version>1.8</java.version>
<scala.binary.version>2.12</scala.binary.version>
<maven.compiler.source>${java.version}</maven.compiler.source>
<maven.compiler.target>${java.version}</maven.compiler.target>
</properties>
<repositories>
<repository>
<id>apache.snapshots</id>
<name>Apache Development Snapshot Repository</name>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
<dependencies>
<!-- Apache Flink dependencies -->
<!-- These dependencies are provided, because they should not be packaged into the JAR file. -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<

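This article describes how to use Flink to consume, from Kafka, the MySQL binlog captured in real time by Canal. The project uses Canal to send MySQL's incremental data to Kafka, and Flink then processes this JSON-formatted data as a consumer. The article covers creating the Maven project, writing the Java code, and running the example, and discusses the errors encountered and their solutions.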