I previously wrote an article on monitoring MySQL table changes with FlinkCDC's DataStream API.
This one covers doing the same with FlinkCDC's FlinkSQL approach.
1. Enable binlog in MySQL
Enable binlog in my.cnf (here I restrict it to the test database), then restart MySQL:
server-id=1
log-bin=mysql-bin
binlog-do-db=test
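As a quick sanity check on the settings above, the following sketch parses a my.cnf-style fragment and reports which of the three binlog-related keys are missing. The class name `BinlogConfigCheck` and the list of required keys are assumptions made for illustration; the code only inspects text and never connects to MySQL. Note that the option is spelled `server-id` (or `server_id`), so a key such as `server.id` is reported as missing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BinlogConfigCheck {
    // Keys this article sets in my.cnf; the list is an assumption for illustration.
    static final String[] REQUIRED = {"server-id", "log-bin", "binlog-do-db"};

    /** Parses "key=value" lines, skipping blanks, comments and [section] headers. */
    public static Map<String, String> parse(String text) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#") || line.startsWith(";") || line.startsWith("[")) {
                continue;
            }
            int eq = line.indexOf('=');
            String key = (eq < 0 ? line : line.substring(0, eq)).trim();
            String value = eq < 0 ? "" : line.substring(eq + 1).trim();
            // MySQL treats '-' and '_' in option names as interchangeable.
            settings.put(key.replace('_', '-'), value);
        }
        return settings;
    }

    /** Returns the required keys that do not appear in the fragment. */
    public static List<String> missingKeys(String text) {
        Map<String, String> settings = parse(text);
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            if (!settings.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String cnf = "[mysqld]\nserver-id=1\nlog-bin=mysql-bin\nbinlog-do-db=test\n";
        System.out.println("missing keys: " + missingKeys(cnf));
    }
}
```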
2. Create the test database and table in MySQL
mysql> create database test;
mysql> use test;
mysql> create table user_info(id int unsigned not null auto_increment primary key, username varchar(60), sex tinyint(1), nickname varchar(60), addr varchar(255))ENGINE=InnoDB default charset=utf8mb4;
Insert a few rows first:
mysql> insert into user_info values(null, 'zhangsan', 1, 'zhs','beijing');
mysql> insert into user_info values(null, 'lisi', 1, 'ls','shanghai');
mysql> insert into user_info values(null, 'wangwu', 1, 'ww','wangwu');
3. Code
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.zsoft.flinkcdc</groupId>
    <artifactId>flinkcdc</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <flink.version>1.13.1</flink.version>
    </properties>
    <dependencies>
        <!-- FlinkCDC, DataStream approach -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.12</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_2.12</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.1.3</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.22</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba.ververica</groupId>
            <artifactId>flink-connector-mysql-cdc</artifactId>
            <version>1.4.0</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.75</version>
        </dependency>
        <!-- FlinkCDC, FlinkSQL approach -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner-blink_2.12</artifactId>
            <version>${flink.version}</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.0.0</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
Main class
com.zsoft.flinkcdc.FlinkCdcSQL.java
package com.zsoft.flinkcdc;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * Implements FlinkCDC through Flink SQL.
 */
public class FlinkCdcSQL {
    public static void main(String[] args) throws Exception {
        // TODO 1. Prepare the basic environment
        // 1.1 Stream execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // 1.2 Table execution environment
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
        // 1.3 Set the parallelism
        env.setParallelism(1);
        // TODO 2. Define the dynamic table backed by the MySQL binlog
        tableEnv.executeSql("CREATE TABLE user_info_binlog (" +
                " id INT NOT NULL," +
                " username STRING," +
                " sex INT," +
                " nickname STRING," +
                " addr STRING" +
                ") WITH (" +
                " 'connector' = 'mysql-cdc'," +
                " 'hostname' = 's1'," +
                " 'port' = '3306'," +
                " 'username' = 'root'," +
                " 'password' = '123456'," +
                " 'database-name' = 'test'," +
                " 'table-name' = 'user_info'" + // table to capture (with the DataStream API, omitting the table list syncs every table in the database)
                ")");
        // executeSql() submits the query itself and print() keeps consuming the changelog,
        // so env.execute() is not needed; with no DataStream operators defined it would throw.
        tableEnv.executeSql("select * from user_info_binlog").print();
    }
}
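The WITH clause in the DDL above is just a list of quoted key/value pairs, so one way to keep connection settings out of a long string concatenation is to render the statement from two maps. The sketch below is plain Java with no Flink dependency; `CdcDdlBuilder`, `buildCreateTable` and `userInfoDdl` are hypothetical names introduced here, and the option values are the ones used in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class CdcDdlBuilder {
    /** Renders CREATE TABLE ... WITH (...) from column definitions and connector options. */
    public static String buildCreateTable(String table,
                                          Map<String, String> columns,
                                          Map<String, String> options) {
        StringJoiner cols = new StringJoiner(", ");
        columns.forEach((name, type) -> cols.add(name + " " + type));
        StringJoiner opts = new StringJoiner(", ");
        // A single quote inside a SQL string literal is escaped by doubling it.
        options.forEach((k, v) -> opts.add("'" + k + "' = '" + v.replace("'", "''") + "'"));
        return "CREATE TABLE " + table + " (" + cols + ") WITH (" + opts + ")";
    }

    /** Builds the same table definition as the article's FlinkCdcSQL class. */
    public static String userInfoDdl() {
        // LinkedHashMap keeps insertion order, so the output is deterministic.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "INT NOT NULL");
        columns.put("username", "STRING");
        columns.put("sex", "INT");
        columns.put("nickname", "STRING");
        columns.put("addr", "STRING");

        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "mysql-cdc");
        options.put("hostname", "s1");
        options.put("port", "3306");
        options.put("username", "root");
        options.put("password", "123456");
        options.put("database-name", "test");
        options.put("table-name", "user_info");

        return buildCreateTable("user_info_binlog", columns, options);
    }

    public static void main(String[] args) {
        System.out.println(userInfoDdl());
    }
}
```

The resulting string can be passed to `tableEnv.executeSql(...)` in place of the concatenated literal.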
4. Run and test
Run FlinkCdcSQL.java in IDEA.
The console prints the rows that already exist:
+----+-------------+--------------------------------+-------------+--------------------------------+--------------------------------+
| op | id | username | sex | nickname | addr |
+----+-------------+--------------------------------+-------------+--------------------------------+--------------------------------+
| +I | 1 | zhangsan | 1 | zhs | beijing |
| +I | 2 | lisi | 1 | ls | shanghai |
| +I | 3 | wangwu | 1 | ww | wangwu |
Add a row to the user_info table:
mysql> insert into user_info values(null, 'zhaoliu', 1, 'zl','zhaoliu');
The program console prints:
| +I | 4 | zhaoliu | 1 | zl | zhaoliu |
Run an update statement:
mysql> update user_info set addr='guangzhou' WHERE id=4;
The program console prints:
| -U | 4 | zhaoliu | 1 | zl | zhaoliu |
| +U | 4 | zhaoliu | 1 | zl | guangzhou |
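The op column uses Flink's changelog row kinds: +I is an insert, -U and +U are the before and after images of an update, and -D is a delete. As a rough illustration of how a consumer could fold such rows back into the current table state, here is a plain-Java sketch. `ChangelogReplay` and its methods are hypothetical names, not part of Flink, and each row is simplified to an id plus its addr value.

```java
import java.util.Map;
import java.util.TreeMap;

class ChangelogReplay {
    // Current table state keyed by primary key (id -> addr).
    private final Map<Integer, String> state = new TreeMap<>();

    /** Applies one changelog row; op is one of +I, -U, +U, -D. */
    public void apply(String op, int id, String addr) {
        switch (op) {
            case "+I": // insert
            case "+U": // update, new image
                state.put(id, addr);
                break;
            case "-U": // update, old image: retract the previous row
            case "-D": // delete
                state.remove(id);
                break;
            default:
                throw new IllegalArgumentException("unknown op: " + op);
        }
    }

    /** Returns a copy of the current state. */
    public Map<Integer, String> snapshot() {
        return new TreeMap<>(state);
    }

    /** Replays the rows printed in this article, in the order they appeared. */
    public static Map<Integer, String> replayArticleRows() {
        ChangelogReplay replay = new ChangelogReplay();
        replay.apply("+I", 1, "beijing");
        replay.apply("+I", 2, "shanghai");
        replay.apply("+I", 3, "wangwu");
        replay.apply("+I", 4, "zhaoliu");
        replay.apply("-U", 4, "zhaoliu");
        replay.apply("+U", 4, "guangzhou");
        return replay.snapshot();
    }

    public static void main(String[] args) {
        // prints {1=beijing, 2=shanghai, 3=wangwu, 4=guangzhou}
        System.out.println(replayArticleRows());
    }
}
```

Treating -U as a removal followed by +U as an insert keeps the logic symmetric with -D; a consumer keyed on the primary key could also simply ignore -U and let +U overwrite the row.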