Specifying Processing Time
① Specify when converting a DataStream to a Table
val dataStream = env.readTextFile("src/main/resources/sensor.txt")
  .map(data => {
    val arr = data.split(",")
    SensorReading(arr(0), arr(1).toLong, arr(2).toDouble)
  })
val sensorTableFromStream =
  oldTabEnv.fromDataStream(dataStream, 'id, 'timestamp, 'temperature, 'pt.proctime)
The proctime attribute can only extend the physical schema by appending a logical field, so it must be declared at the end of the schema definition.
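Once declared, the proctime attribute can be used like any other timestamp column, for example in a processing-time window. A minimal sketch building on the snippet above (the 10-second window size is an illustrative assumption, and `org.apache.flink.table.api.scala._` must be imported for the expression DSL):

```scala
// Processing-time tumbling window over the appended 'pt attribute.
// The 10 s window size is an assumption for illustration.
val windowedTable = sensorTableFromStream
  .window(Tumble over 10.seconds on 'pt as 'w) // group rows into 10 s processing-time windows
  .groupBy('w, 'id)
  .select('id, 'w.end, 'temperature.avg as 'avgTemp)
```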
② Specify when defining the Table Schema
oldTabEnv
  .connect(new FileSystem().path("src/main/resources/sensor.txt"))
  .withFormat(new Csv())
  .withSchema(new Schema()
    .field("id", DataTypes.STRING())
    .field("ts", DataTypes.BIGINT())
    .field("temp", DataTypes.DOUBLE())
    .field("pt", DataTypes.TIMESTAMP(3))
    .proctime()
  ).createTemporaryTable("input_table")
Because CsvTableSource does not implement DefinedProctimeAttribute, a time attribute cannot be specified when the descriptor is Csv(). The Kafka() descriptor does implement DefinedProctimeAttribute, so a time attribute can be defined there.
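For comparison, this is roughly what the same schema looks like with the Kafka descriptor, where appending a proctime attribute is supported. The topic name, Kafka version, and broker address are illustrative assumptions:

```scala
// Hypothetical Kafka source with a processing-time attribute; topic name,
// version and connection properties are assumptions for illustration.
oldTabEnv
  .connect(new Kafka()
    .version("universal")
    .topic("sensor")
    .property("bootstrap.servers", "localhost:9092"))
  .withFormat(new Csv())
  .withSchema(new Schema()
    .field("id", DataTypes.STRING())
    .field("ts", DataTypes.BIGINT())
    .field("temp", DataTypes.DOUBLE())
    .field("pt", DataTypes.TIMESTAMP(3))
    .proctime())
  .createTemporaryTable("kafka_input_table")
```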
③ Specify in the CREATE TABLE DDL
val sourceDDL =
  """
    |create table input_ddl_table (
    |  id string,
    |  ts bigint,
    |  temp double,
    |  pt as proctime()
    |) with (
    |  'connector.type' = 'filesystem',
    |  'connector.path' = 'file://src/main/resources/sensor.txt',
    |  'format.type' = 'csv'
    |)
    |""".stripMargin
blinkTabEnv.sqlUpdate(sourceDDL)
Running this DDL requires the Blink planner.
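A Blink-planner table environment can be obtained like this (API as of the Flink 1.10-era releases; the variable names match the snippets above):

```scala
// Build a Blink-planner table environment in streaming mode.
val settings = EnvironmentSettings.newInstance()
  .useBlinkPlanner()
  .inStreamingMode()
  .build()
val blinkTabEnv = StreamTableEnvironment.create(env, settings)
```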
Specifying Event Time
① Specify when converting a DataStream to a Table
val dataStream = env.readTextFile("src/main/resources/sensor.txt")
  .map(data => {
    val arr = data.split(",")
    SensorReading(arr(0), arr(1).toLong, arr(2).toDouble)
  })
  .assignTimestampsAndWatermarks(
    new BoundedOutOfOrdernessTimestampExtractor[SensorReading](Time.minutes(1)) {
      override def extractTimestamp(element: SensorReading): Long = element.timestamp * 1000L
    })
val tabEnv = StreamTableEnvironment.create(env)
// Either append a new rowtime field at the end of the schema:
val tableFromStream =
  tabEnv.fromDataStream(dataStream, 'id, 'timestamp as 'ts, 'temperature as 'temp, 'rt.rowtime)
// or replace an existing field with the rowtime attribute:
val tableFromStream2 =
  tabEnv.fromDataStream(dataStream, 'id, 'timestamp.rowtime as 'ts, 'temperature as 'temp)
The event time and watermarks were already assigned when the DataStream was created; the event-time field declared when creating the Table only marks which field serves as the rowtime attribute.
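With the rowtime attribute in place, event-time windows can be expressed directly on the Table. A minimal sketch using the appended 'rt field from the first variant above (the 10-second window size is an illustrative assumption):

```scala
// Event-time tumbling window on the 'rt rowtime attribute.
// The 10 s window size is an assumption for illustration.
val resultTable = tableFromStream
  .window(Tumble over 10.seconds on 'rt as 'w)
  .groupBy('w, 'id)
  .select('id, 'w.end as 'windowEnd, 'temp.avg as 'avgTemp)
```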
② Specify when defining the Table Schema
tabEnv.connect(new FileSystem().path("src/main/resources/sensor.txt"))
  .withFormat(new Csv())
  .withSchema(new Schema()
    .field("id", DataTypes.STRING())
    .field("ts", DataTypes.BIGINT())
    .field("temp", DataTypes.DOUBLE())
    .field("rt", DataTypes.TIMESTAMP(3))
    .rowtime(new Rowtime()
      .timestampsFromField("ts")              // extract the event time from the ts field
      .watermarksPeriodicBounded(1000L))      // bounded out-of-orderness of 1 second
  ).createTemporaryTable("input_table")
③ Specify in the CREATE TABLE DDL
val tableDDL =
  """
    |create table dataTable (
    |  id varchar(20) not null,
    |  ts bigint,
    |  temperature double,
    |  rt as TO_TIMESTAMP(FROM_UNIXTIME(ts)),
    |  watermark for rt as rt - interval '1' second
    |) with (
    |  'connector.type' = 'filesystem',
    |  'connector.path' = 'file://src/main/resources/sensor.txt',
    |  'format.type' = 'csv'
    |)
    |""".stripMargin
tabEnv.sqlUpdate(tableDDL)
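The computed rowtime column rt declared in the DDL can then be used in event-time SQL windows. A sketch of such a query on the table above (the 10-second window size is an illustrative assumption):

```scala
// Event-time window aggregation in SQL over the computed rowtime column rt.
// The 10-second window is an assumption for illustration.
val resultTable = tabEnv.sqlQuery(
  """
    |select id,
    |       tumble_end(rt, interval '10' second) as window_end,
    |       avg(temperature) as avg_temp
    |from dataTable
    |group by id, tumble(rt, interval '10' second)
    |""".stripMargin)
```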