This article presents Flink SQL DataSet examples (Scala).
Scan / Select
Description: query all rows in a table.
package flink_sql

import org.apache.flink.api.scala._
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._

/**
 * Flink_SQL_DataSet Demo
 */
object FlinkSqlDataSet {
  def main(args: Array[String]): Unit = {
    // Get the batch execution environment
    val env = ExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)
    // Build the sample data set: (name, age, sex); 男 = male, 女 = female
    val dataSet = env.fromElements(("小明", 18, "男"), ("小红", 19, "女"), ("张三", 9, "男"), ("李四", 26, "男"))
    // Get the table environment
    val tableEnv = TableEnvironment.getTableEnvironment(env)
    // Register the data set as table "Student" with fields name, age, sex
    tableEnv.registerDataSet("Student", dataSet, 'name, 'age, 'sex)
    tableEnv.sqlQuery(s"select name,age,sex from Student")
      .first(100) // Creates a new DataSet containing the first 100 elements of this DataSet.
      .print()
    /**
     * Output:
     * 小明,18,男
     * 小红,19,女
     * 张三,9,男
     * 李四,26,男
     */
  }
}
The SQL statement above can be replaced with any of the following to get the corresponding functionality:
as (table)
Description: give a table an alias.
tableEnv.sqlQuery(s"select s1.name,s1.age FROM Student as s1")
as (column)
Description: give a column an alias (the as keyword is optional).
tableEnv.sqlQuery(s"select name a,age as b FROM Student ")
Where / Filter
Description: filter the rows of a table by a condition on a column.
tableEnv.sqlQuery(s"select name,age,sex FROM Student where sex = '女'")
between and (where)
Description: filter rows by a range on a column; both bounds are inclusive (start <= data <= end).
tableEnv.sqlQuery(s"select name,age,sex FROM Student where age between 20 and 35")
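To make the inclusive bounds concrete, here is a minimal sketch in plain Scala collections (no Flink involved) that mirrors what BETWEEN does on the sample rows. The Student case class and the between helper are hypothetical names introduced only for this illustration.

```scala
// Plain-Scala illustration of SQL BETWEEN semantics; this does not use Flink.
// `Student` and `between` are hypothetical names introduced for this sketch.
object BetweenSketch {
  final case class Student(name: String, age: Int, sex: String)

  // Same rows as the Student table in the main example.
  val students: List[Student] = List(
    Student("小明", 18, "男"),
    Student("小红", 19, "女"),
    Student("张三", 9, "男"),
    Student("李四", 26, "男")
  )

  // BETWEEN lo AND hi keeps a row when lo <= age <= hi (both ends included).
  def between(rows: List[Student], lo: Int, hi: Int): List[Student] =
    rows.filter(s => s.age >= lo && s.age <= hi)

  def main(args: Array[String]): Unit = {
    // Mirrors: where age between 20 and 35
    between(students, 20, 35).foreach(println)
    // Both ends are included, so ages 18 and 19 both match here.
    between(students, 18, 19).foreach(println)
  }
}
```

With the sample data, only 李四 (age 26) falls inside 20..35, so the query above should return a single row.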
Sum
Description: sum a column over all rows.
tableEnv.sqlQuery(s"select sum(age) FROM Student")
max(min)
Description: find the maximum (or minimum) value of a column.
tableEnv.sqlQuery(s"select max(age) FROM Student")
sum (group by)
Description: sum ages grouped by sex.
tableEnv.sqlQuery(s"select sex,sum(age) from Student group by sex")
/**
 * Output:
 * 女,19
 * 男,53
 */
group by having
tableEnv.sqlQuery(s"select sex,sum(age) from Student group by sex having sum(age)>20")
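The difference between where and having is that where filters rows before grouping, while having filters whole groups after aggregation. A minimal sketch in plain Scala collections (not Flink) can show the two steps; sumAgeBySex and having are hypothetical helper names used only here.

```scala
// Plain-Scala illustration of GROUP BY ... HAVING; this does not use Flink.
// `sumAgeBySex` and `having` are hypothetical helper names for this sketch.
object GroupByHavingSketch {
  // (name, age, sex): the same rows as the Student table.
  val students: List[(String, Int, String)] = List(
    ("小明", 18, "男"), ("小红", 19, "女"), ("张三", 9, "男"), ("李四", 26, "男")
  )

  // GROUP BY sex + sum(age): one (sex -> total age) entry per group.
  def sumAgeBySex(rows: List[(String, Int, String)]): Map[String, Int] =
    rows.groupBy(_._3).map { case (sex, group) => sex -> group.map(_._2).sum }

  // HAVING sum(age) > threshold: drops groups after they have been aggregated.
  def having(sums: Map[String, Int], threshold: Int): Map[String, Int] =
    sums.filter { case (_, total) => total > threshold }

  def main(args: Array[String]): Unit = {
    val sums = sumAgeBySex(students)
    println(sums)             // per-group totals
    println(having(sums, 20)) // only the groups whose total exceeds 20
  }
}
```

On the sample data the 女 group sums to 19 and is dropped by having sum(age)>20, leaving only the 男 group with 53.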
distinct
Description: remove duplicates over one or more columns.
tableEnv.sqlQuery("select distinct name FROM Student")
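When several columns are selected, distinct applies to the combination of the selected columns rather than to each column separately. The following is a small sketch in plain Scala collections (not Flink) of that behavior; distinctBy and the Row alias are hypothetical names for this illustration.

```scala
// Plain-Scala illustration of SELECT DISTINCT; this does not use Flink.
// `Row` and `distinctBy` are hypothetical names introduced for this sketch.
object DistinctSketch {
  type Row = (String, Int, String) // (name, age, sex)

  val students: List[Row] = List(
    ("小明", 18, "男"), ("小红", 19, "女"), ("张三", 9, "男"), ("李四", 26, "男")
  )

  // Project each row onto the selected column(s), then drop repeated values.
  def distinctBy[K](rows: List[Row])(key: Row => K): List[K] =
    rows.map(key).distinct

  def main(args: Array[String]): Unit = {
    // Mirrors: select distinct sex from Student
    println(distinctBy(students)(_._3))
    // Mirrors: select distinct name, sex from Student (the pair must repeat to be dropped)
    println(distinctBy(students)(r => (r._1, r._3)))
  }
}
```

Every name in the sample data is already unique, so select distinct name returns all four names; applying distinct to sex instead collapses the four rows to two values.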