1. Flink relational API concepts
At the lowest level, stateful event-driven applications are the hardest to build because they sit closest to the runtime. The mid-level DataStream API and DataSet API are approachable for most developers. The top level is the high-level relational layer that nearly every programmer can use: SQL-based operations that also serve as a unified high-level API over both stream and batch processing.
Note: (this post targets Flink 1.9) the Table API and SQL are not yet feature complete and are under active development. Not every operation is supported for every combination of [Table API, SQL] and [stream, batch] input, so use them with care.
Before using them, add the following Maven dependency:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-api-scala-bridge_2.11</artifactId>
  <version>1.9.0</version>
</dependency>
In addition, to run Table API and SQL programs locally in an IDE, you must add one of the following dependencies:
<!-- Either... (for the old planner that was available before Flink 1.9) -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-planner_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>
<!-- or... (for the new Blink planner) -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-planner-blink_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>
Using the Table API and SQL
Source file
name,age,job
zhangsan,30,Developer
lisi,32,Developer
Processing logic
package com.kun.flink.chapter06

import org.apache.flink.api.scala._
import org.apache.flink.table.api.scala.BatchTableEnvironment
import org.apache.flink.types.Row

object TableSQLAPI {

  def main(args: Array[String]): Unit = {
    val fbEnv = ExecutionEnvironment.getExecutionEnvironment
    val fbTableEnv = BatchTableEnvironment.create(fbEnv)
    val filePath = "test_files/test_csv/test01.csv"
    // read the CSV file into a DataSet, skipping the header row
    val csv = fbEnv.readCsvFile[People](filePath, ignoreFirstLine = true)
    // convert the DataSet into a Table
    val salesTable = fbTableEnv.fromDataSet(csv)
    // register the Table under the name "sales" so SQL can reference it
    fbTableEnv.registerTable("sales", salesTable)
    // run a SQL query against the registered table
    val resultTable = fbTableEnv.sqlQuery("select * from sales")
    // convert the result Table back to a DataSet and print it
    fbTableEnv.toDataSet[Row](resultTable).print()
  }

  case class People(name: String,
                    age: Int,
                    job: String)
}
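The same query can also be written with the Table API's expression DSL instead of a SQL string. A sketch continuing from the `salesTable` and `fbTableEnv` defined above (the `'symbol` expression syntax comes from the `org.apache.flink.table.api.scala._` import provided by the scala-bridge dependency):

```scala
import org.apache.flink.table.api.scala._   // enables 'symbol expressions

// Table API equivalent of: select name, age from sales where job = 'Developer'
val developers = salesTable
  .where('job === "Developer")   // filter rows by the job column
  .select('name, 'age)           // project only name and age
fbTableEnv.toDataSet[Row](developers).print()
```

Both forms go through the same planner, so choosing between SQL strings and the expression DSL is mostly a matter of style and compile-time checking.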
Result
zhangsan,30,Developer
lisi,32,Developer