Two Kinds of Connect with the Table API and Flink SQL
Dependencies:
<properties>
<scala.version>2.12.10</scala.version>
<mysql.version>8.0.11</mysql.version>
<flink.version>1.13.0</flink.version>
<encoding>UTF-8</encoding>
</properties>
<dependencies>
<!-- Scala dependency -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- flink scala -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- Flink graph processing (Gelly) -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-gelly_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>${mysql.version}</version>
</dependency>
<!-- Flink Kafka connector -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- Flink JDBC connector (for MySQL) -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-jdbc_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- RocksDB state backend -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-statebackend-rocksdb_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-api-scala-bridge_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner-blink_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-common</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-csv</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-json</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>ru.yandex.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
<version>0.2.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.73</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-cep-scala_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-sql-connector-kafka_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-hive_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>com.alibaba.ververica</groupId>
<artifactId>flink-connector-mysql-cdc</artifactId>
<version>1.1.0</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.baidubce</groupId>
<artifactId>api-explorer-sdk</artifactId>
<version>1.0.0</version>
</dependency>
</dependencies>
File Connect
Example:
1. Map the file flink_person.txt to a table
2. Query for rows with a salary of at least 18000
3. Write the query result to an output table
Step 1: Create a table environment
val bsEnv = StreamExecutionEnvironment.getExecutionEnvironment
bsEnv.setParallelism(1)
val bsSettings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val bsTableEnv = StreamTableEnvironment.create(bsEnv, bsSettings)
Step 2: Create an input table bound to the external system
bsTableEnv.executeSql(
"""
|create table t_person(
| name string,
| ts bigint,
| salary bigint,
| city string
|) with (
| 'connector' = 'filesystem',
| 'path' = 'file:///D:/Note/Projects/02/Flink/cha01/file/flink_person.txt',
| 'format' = 'csv'
|)
""".stripMargin)
Step 3: Create an output table bound to the external system
bsTableEnv.executeSql(
"""
|create table t_rich_person(
| name string,
| salary bigint,
| city string
|) with (
| 'connector' = 'filesystem',
| 'path' = 'file:///D:/Note/Projects/02/Flink/cha01/rst',
| 'format' = 'csv'
|)
""".stripMargin)
Step 4: Query
Two approaches: a Table API query and a SQL query.
Table API query:
val table1 = bsTableEnv.from("t_person").select($"name",$"salary",$"city").where($"salary" >= 18000)
SQL query:
val table2 = bsTableEnv.sqlQuery(
"""
select name,salary,city
from t_person
where salary >= 18000
""".stripMargin)
Step 5: Insert the query result into the output table
table2.executeInsert("t_rich_person")
Kafka Connect
Step 1: Create a table environment
val bsEnv = StreamExecutionEnvironment.getExecutionEnvironment
bsEnv.setParallelism(1)
val bsSettings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
val bsTableEnv = StreamTableEnvironment.create(bsEnv, bsSettings)
Step 2: Create an input table bound to the external system
bsTableEnv.executeSql(
"""
|create table t_person(
| uid bigint,
| phone bigint,
| addr string
|) with (
| 'connector' = 'kafka',
| 'topic' = 'test',
| 'properties.bootstrap.servers' = 'master:9092',
| 'properties.group.id' = 'jlx',
| 'scan.startup.mode' = 'earliest-offset',
| 'format' = 'json',
| 'json.fail-on-missing-field' = 'false',
| 'json.ignore-parse-errors' = 'true'
|)
""".stripMargin)
Step 3: Create an output table bound to the external system
bsTableEnv.executeSql(
"""
|create table person_cnts(
| addr string,
| cnts bigint,
| PRIMARY KEY (addr) NOT ENFORCED
|) with (
| 'connector' = 'upsert-kafka',
| 'topic' = 'test1',
| 'properties.bootstrap.servers' = 'master:9092',
| 'key.format' = 'json',
| 'value.format' = 'json'
|)
""".stripMargin)
Step 4: Query
val table1 = bsTableEnv.sqlQuery(
"""
select addr,count(1) cnts
from t_person
group by addr
""".stripMargin)
Step 5: Insert the query result into the output table
table1.executeInsert("person_cnts")
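The group-by aggregation produces an updating result, which is why the sink must be upsert-kafka rather than the plain kafka connector: each change is written as a key/value record keyed by addr. Assuming two input events for beijing arrive one after another, the test1 topic would receive records shaped roughly like:

```text
key: {"addr":"beijing"}   value: {"addr":"beijing","cnts":1}
key: {"addr":"beijing"}   value: {"addr":"beijing","cnts":2}
```

A downstream consumer that compacts by key ends up with the latest count per addr.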