Flink-Table SQL 实践编程 BatchTableEnvironment(十)中

BatchTableEnvironment用于批处理场景,针对SQL相关信息,具体代码实践内容如下:

一、准备代码运行需要的文本信息,本文代码主要是读取csv文本内容,涉及student.csv和teacher.csv两个文件,文件具体内容分别如下:

学生姓名性别学科分数时间戳上报时间
张三语文90.515778511510002020/1/1 11:59
张三数学10015778512630002020/1/1 12:01
张三外语8015778511160002020/1/1 11:58
李四语文6815778511380002020/1/1 11:58
王二外语9915778514000002020/1/1 12:03
张三外语8015836315000002020/3/8 9:38
李四语文6815849400780002020/3/23 13:07
班主任所教学生姓名
教师一张三
教师一李四
教师二王二

二、java代码需要实体类信息:

import java.sql.Timestamp;

@Data
public class StudentInfo{

    private String name;
    private String sex;
    private String course;
    private Float score;
    private Long timestamp;  //sql 关键字,sql中不可以直接使用
    private Timestamp sysDate;
}
import lombok.Data;

@Data
public class Teacher {

    private String teacherName;
    private String studentName;

}

三、以下代码主要实现一些功能实例,具体如下:

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.BatchTableEnvironment;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;

import java.sql.Timestamp;

public class FlinkTableSQLBatchExample {
    public static void main(String[] args) throws Exception{

        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        BatchTableEnvironment tEnv = BatchTableEnvironment.create(env);

        //source,这里读取CSV文件,并转换为对应的Class
        DataSet<StudentInfo> studentCsvInput = env
                .readCsvFile("/Users/springk/Documents/student.csv")
                .ignoreFirstLine().pojoType(StudentInfo.class,"name","sex","course","score","timestamp");


        DataSet<StudentInfo> studentInfoDataSet = studentCsvInput.flatMap(new FlatMapFunction<StudentInfo, StudentInfo>() {
            @Override
            public void flatMap(StudentInfo studentInfo, Collector<StudentInfo> collector) throws Exception {
                //时间戳转换成Timestamp类型,group window函数的time_attr参数必须是TIMESTAMP类型的属性
                studentInfo.setSysDate(new Timestamp(studentInfo.getTimestamp()));
                collector.collect(studentInfo);
            }
        });

        studentInfoDataSet.print();

        DataSet<Teacher> teacherCsvInput = env
                .readCsvFile("/Users/springk/Documents/teacher.csv")
                .ignoreFirstLine().pojoType(Teacher.class,"teacherName","studentName");
        teacherCsvInput.print();

//        //将DataSet转换为Table
        Table studentInfo =tEnv.fromDataSet(studentInfoDataSet);
        //将studentInfo注册为一个表
        tEnv.registerTable("studentInfo",studentInfo);

        //GroupBy Window Aggregation 根据name分组,统计学科数量,按天统计
        Table groupWindAggrTable = tEnv.sqlQuery("SELECT  " +
                "TUMBLE_START(sysDate, INTERVAL '1' DAY) as wStart, " +
                "name, SUM(score) as score " +
                "FROM  studentInfo " +
                "GROUP BY TUMBLE(sysDate, INTERVAL '1' DAY), name");

        DataSet<Row> groupWindAggrTableResult = tEnv.toDataSet(groupWindAggrTable, Row.class);
        groupWindAggrTableResult.print();

        tEnv.registerDataSet("teacher",teacherCsvInput);

        Table joinTable = tEnv.sqlQuery("SELECT * " +
                "FROM studentInfo LEFT JOIN teacher ON studentInfo.name = teacher.studentName");

        DataSet<Row> joinTableResult = tEnv.toDataSet(joinTable, Row.class);
        joinTableResult.print();



    }
}

四、运行结果信息:

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值