Preface
A MySQL CDC source table is a streaming source table over MySQL: it first reads the full historical snapshot of the database and then switches seamlessly to reading the binlog, guaranteeing that no row is read twice and none is missed. Even after a failure, processing continues with exactly-once semantics. The MySQL CDC connector supports reading the full snapshot in parallel, and its incremental snapshot algorithm makes the whole process lock-free and resumable from checkpoints.
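The snapshot-then-binlog behavior described above is controlled by connector options. As a hedged sketch (option names from the flink-cdc 2.x documentation; hostnames, credentials, and table names below are placeholders):

```sql
CREATE TABLE src (
  id INT,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'user',
  'password' = 'pass',
  'database-name' = 'db',
  'table-name' = 'src',
  -- 'initial' = read the full snapshot first, then switch to the binlog (the default)
  'scan.startup.mode' = 'initial',
  -- lock-free, parallel, resumable snapshot reading (default true in 2.x)
  'scan.incremental.snapshot.enabled' = 'true'
)
```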
1. Reference documentation
2. Test
Maven dependency
<dependency>
    <groupId>com.ververica</groupId>
    <artifactId>flink-connector-mysql-cdc</artifactId>
    <!-- the dependency is available only for stable releases. -->
    <version>2.1.1</version>
</dependency>
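The job below also writes back to MySQL through the JDBC connector, so the build additionally needs the Flink JDBC connector and a MySQL driver. The versions here are illustrative and must match your Flink/Scala versions (flink-connector-mysql-cdc 2.1.x targets Flink 1.13):

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc_2.12</artifactId>
    <version>1.13.6</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.27</version>
</dependency>
```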
Code snippet
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api._
import org.apache.flink.table.api.bridge.scala._

object test_mysqltomysql {
  def main(args: Array[String]): Unit = {
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    val settings: EnvironmentSettings = EnvironmentSettings.newInstance()
      .useBlinkPlanner()
      .inStreamingMode()
      .build()
    val bsEnv: StreamTableEnvironment = StreamTableEnvironment.create(env, settings)

    // Checkpointing must be enabled: the CDC source commits binlog offsets on checkpoints.
    // 300 ms is very aggressive; intervals of a few seconds are more typical in production.
    env.enableCheckpointing(300)

    // CDC source table. A primary key is required because the incremental snapshot
    // algorithm (enabled by default in mysql-cdc 2.x) splits the table by key.
    bsEnv.executeSql(
      """
        |CREATE TABLE test1 (
        |  id INT,
        |  name STRING,
        |  sex STRING,
        |  class STRING,
        |  PRIMARY KEY (id) NOT ENFORCED
        |) WITH (
        |  'connector' = 'mysql-cdc',
        |  'hostname' = '*********',
        |  'port' = '3306',
        |  'username' = '******',
        |  'password' = '*****',
        |  'database-name' = 'study',
        |  'table-name' = 'test1'
        |)
      """.stripMargin)

    // JDBC sink table; the primary key makes the sink write in upsert mode.
    bsEnv.executeSql(
      """
        |CREATE TABLE test2 (
        |  sex STRING,
        |  num BIGINT,
        |  PRIMARY KEY (sex) NOT ENFORCED
        |) WITH (
        |  'connector' = 'jdbc',
        |  'url' = '*********',
        |  'table-name' = 'test2',
        |  'username' = '******',
        |  'password' = '******'
        |)
      """.stripMargin)

    // Continuously aggregate the changelog and upsert the counts into test2.
    bsEnv.executeSql(
      """
        |INSERT INTO test2
        |SELECT sex, COUNT(1) AS num FROM test1 GROUP BY sex
      """.stripMargin)
  }
}
Insert rows into table test1.
Table test2 then receives the updated per-sex counts.
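The end-to-end behavior can be sanity-checked without a cluster: the CDC source emits insert/delete change events, and the GROUP BY maintains one running count per sex, which is upserted into test2. A minimal sketch of that bookkeeping in plain Scala (no Flink; the event encoding and field names are illustrative, not the connector's actual types):

```scala
// Change events roughly as a CDC source emits them: +I (insert), -D (delete).
// An update arrives as a -U/+U pair; for counting it behaves like delete + insert.
final case class Change(op: String, sex: String)

// Fold the changelog into the per-sex counts that the sink table would hold.
def counts(events: Seq[Change]): Map[String, Long] =
  events.foldLeft(Map.empty[String, Long].withDefaultValue(0L)) { (acc, e) =>
    e.op match {
      case "+I" | "+U" => acc.updated(e.sex, acc(e.sex) + 1)
      case "-D" | "-U" => acc.updated(e.sex, acc(e.sex) - 1)
      case _           => acc
    }
  }.filter(_._2 > 0)

// Three inserts and one delete in study.test1 leave test2 holding M -> 1, F -> 1.
val log = Seq(Change("+I", "M"), Change("+I", "F"), Change("+I", "M"), Change("-D", "M"))
println(counts(log))
```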