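Flink Table and SQL: tables and views, Connectors, and the timestamp data types

This article introduces the concepts of tables and views in Apache Flink, including the difference between temporary and permanent tables. It shows how to use the Table API with different connectors such as filesystem, print, and blackhole, along with the required dependency configuration. It also explains the timestamp and timestamp_ltz types and how they are used, including timestamp precision and how timestamp_ltz represents an absolute point in time. Finally, it covers the functions Flink provides for obtaining the current time.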

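1. Tables and views

Tables are either temporary or permanent. When both exist under the same name, the temporary table takes precedence over the permanent one.
A permanent table needs a catalog that persists its metadata, for example one backed by Hive.

To connect to an external data system, createTemporaryTable is typically used; for intermediate result tables, createTemporaryView is typically used, as shown below: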

tEnv.createTemporaryTable("table_name", tableDescriptor)
tEnv.createTemporaryView("table_name", table)
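
As a minimal sketch of the two calls above, the following registers one temporary table and one temporary view and then lists them. It assumes Flink 1.14 with the table planner on the classpath; the datagen connector, the names source_table and positive_amounts, and the filter condition are illustrative choices, not taken from the article.

```scala
import org.apache.flink.table.api.{DataTypes, EnvironmentSettings, Schema, TableDescriptor, TableEnvironment}

object TemporaryObjectsSketch {

  // Registers one temporary table and one temporary view, then returns
  // (names of temporary tables and views, names of temporary views only).
  def register(): (Seq[String], Seq[String]) = {
    val settings = EnvironmentSettings.newInstance().inStreamingMode().build()
    val tEnv = TableEnvironment.create(settings)

    // Temporary table describing an external system (datagen is a stand-in here).
    tEnv.createTemporaryTable("source_table",
      TableDescriptor.forConnector("datagen")
        .schema(Schema.newBuilder()
          .column("name", DataTypes.STRING())
          .column("amount", DataTypes.BIGINT())
          .build())
        .option("number-of-rows", "3")
        .build())

    // Temporary view holding an intermediate result derived from the table.
    val positive = tEnv.sqlQuery("select name, amount from source_table where amount > 0")
    tEnv.createTemporaryView("positive_amounts", positive)

    (tEnv.listTemporaryTables().toSeq, tEnv.listTemporaryViews().toSeq)
  }

  def main(args: Array[String]): Unit = {
    val (tablesAndViews, views) = register()
    println(tablesAndViews.mkString(", "))
    println(views.mkString(", "))
  }
}
```

Nothing is executed against the source here; the sketch only registers the two objects. Both disappear when the session ends, since they are never written to a catalog.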

2. Table API Connectors

2.1 filesystem, print, blackhole

Add the following dependencies to pom.xml:

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-csv</artifactId>
            <version>1.14.3</version>
            <scope>provided</scope>
        </dependency>


        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.3.1</version>
            <scope>provided</scope>
        </dependency>

The program is as follows:

import org.apache.flink.api.common.RuntimeExecutionMode
import org.apache.flink.streaming.api.functions.sink.DiscardingSink
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment
import org.apache.flink.table.api.{DataTypes, FormatDescriptor, Schema, TableDescriptor}
import org.apache.flink.types.Row
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.ipc.StandbyException

import scala.util.control.Breaks.{break, breakable}




object flink_test {

  // Find the URI of the active HDFS NameNode by probing each candidate in turn
  def getActiveHdfsUri() = {
    val hadoopConf = new Configuration()
    val hdfsUris = Array(
      "hdfs://192.168.23.101:8020",
      "hdfs://192.168.23.102:8020",
      "hdfs://192.168.23.103:8020"
    )
    var hdfsCli: FileSystem = null
    var hdfsCapacity: Long = -1L
    var activeHdfsUri: String = null

    breakable {
      for (hdfsUri <- hdfsUris) {
        hadoopConf.set("fs.defaultFS", hdfsUri)
        hdfsCli = FileSystem.get(hadoopConf)

        try {
          hdfsCapacity = hdfsCli.getStatus.getCapacity
          activeHdfsUri = hdfsUri
          break
        } catch {
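          // A standby NameNode rejects the call; ignore it and try the next URI
          case _: StandbyException => ()
          // Over RPC the client usually sees the StandbyException wrapped in a RemoteException
          case e: org.apache.hadoop.ipc.RemoteException if e.getClassName == classOf[StandbyException].getName => ()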
        }

      }
    }

    activeHdfsUri

  }

  def main(args: Array[String]): Unit = {


    val senv = StreamExecutionEnvironment.getExecutionEnvironment
    senv.setRuntimeMode(RuntimeExecutionMode.STREAMING)
    val tEnv = StreamTableEnvironment.create(senv)

    val hdfsFilePath = s"${getActiveHdfsUri()}/test/test.txt"

    // HDFS table (filesystem connector, CSV format)
    val fileSystemTable = tEnv.from(
      TableDescriptor.forConnector("filesystem")
        .schema(Schema.newBuilder()
          .column("name", DataTypes.STRING())
          .column("amount", DataTypes.BIGINT())
          .build()
        )
        .option("path", hdfsFilePath)
        .format(FormatDescriptor
          .forFormat("csv")
          .option("field-delimiter", ",")
          .build()
        ).build()
    )
    tEnv.createTemporaryView("fileSystemTable", fileSystemTable)

    // print table
    tEnv.createTemporaryTable("printSink",
      TableDescriptor.forConnector("print")
        .schema(Schema.newBuilder()
          .column("name", DataTypes.STRING())
          .column("amount", DataTypes.BIGINT())
          .build()
        ).build()
    )

    // Read the HDFS table and write it to the print sink; the output matches converting to a DataStream and calling print
    fileSystemTable.executeInsert("printSink")

    // blackhole table, reusing the schema of printSink
    tEnv.executeSql("create temporary table blackholeSink with ('connector' = 'blackhole') like printSink")

    // Read the HDFS table and write it to the blackhole sink
    tEnv.executeSql("insert into blackholeSink select * from fileSystemTable")

    // Convert to a DataStream and discard the records (DataStream equivalent of blackhole)
    val fileSystemDatastream = tEnv.toDataStream(fileSystemTable)
    fileSystemDatastream.addSink(new DiscardingSink[Row]())

    senv.execute()

  }
}

The result of running the program is as follows:

6> +I[zhang_san, 30]
4> +I[li_si, 40]

3. timestamp and timestamp_ltz

  1. timestamp(p)
    p is the precision of the fractional seconds; it ranges from 0 to 9 and defaults to 6
    val table = tEnv.sqlQuery("select timestamp '1970-01-01 00:00:04.001'")

    table.execute().print()

The output is as follows:

+----+-------------------------+
| op |                  EXPR$0 |
+----+-------------------------+
| +I | 1970-01-01 00:00:04.001 |
+----+-------------------------+
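
The precision rule above can also be checked against the Table API type objects. This is a small sketch, assuming flink-table-common 1.14 is on the classpath; the object and value names are hypothetical.

```scala
import org.apache.flink.table.api.DataTypes
import org.apache.flink.table.types.DataType
import org.apache.flink.table.types.logical.TimestampType

object TimestampPrecisionSketch {

  // Reads the fractional-second precision p out of a TIMESTAMP(p) data type.
  def precisionOf(dataType: DataType): Int =
    dataType.getLogicalType.asInstanceOf[TimestampType].getPrecision

  // TIMESTAMP without an explicit precision falls back to the default.
  val defaultPrecision: Int = precisionOf(DataTypes.TIMESTAMP())

  // An explicit precision is kept as given.
  val millisPrecision: Int = precisionOf(DataTypes.TIMESTAMP(3))

  // Java class the Table API maps TIMESTAMP to by default.
  val conversionClass: Class[_] = DataTypes.TIMESTAMP(3).getConversionClass

  def main(args: Array[String]): Unit = {
    println(defaultPrecision)
    println(millisPrecision)
    println(conversionClass.getName)
  }
}
```

The default conversion class is java.time.LocalDateTime, which fits the idea that timestamp is a wall-clock value with no time zone attached.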
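  2. timestamp_ltz(p)
    It describes an absolute point in time on the timeline. Internally a long stores the milliseconds since the epoch and an int stores the nanoseconds within that millisecond
    It cannot be specified with a string literal, but it can be derived from a long epoch value. At any given instant, System.currentTimeMillis() returns the same value on every machine in the world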
    tEnv.executeSql("create view t1 as select to_timestamp_ltz(4001, 3)")
    val table = tEnv.sqlQuery("select * from t1")

    table.execute().print()

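The output is as follows. The value is rendered in the session time zone, so the 08:00:04.001 shown here is consistent with a session running at UTC+8: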

+----+-------------------------+
| op |                  EXPR$0 |
+----+-------------------------+
| +I | 1970-01-01 08:00:04.001 |
+----+-------------------------+
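
The behaviour of timestamp_ltz can be imitated with plain java.time classes, without Flink. The sketch below is only an analogy: java.time.Instant stands in for timestamp_ltz, the zone names are assumptions chosen for illustration, and the split helper mirrors the milliseconds-plus-nanoseconds storage described above rather than Flink's internal class.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}

object TimestampLtzSketch {

  // An absolute instant 4001 ms after the epoch, like to_timestamp_ltz(4001, 3).
  val instant: Instant = Instant.ofEpochMilli(4001L)

  // Rendering the same instant in a zone yields a zone-dependent wall-clock value.
  def render(zone: String): LocalDateTime =
    LocalDateTime.ofInstant(instant, ZoneId.of(zone))

  // Splits an instant into (epoch milliseconds, nanoseconds within that millisecond).
  def split(i: Instant): (Long, Int) =
    (i.toEpochMilli, i.getNano % 1000000)

  def main(args: Array[String]): Unit = {
    println(render("UTC"))                  // 1970-01-01T00:00:04.001
    println(render("Asia/Shanghai"))        // 1970-01-01T08:00:04.001
    println(split(instant.plusNanos(500L))) // (4001,500)
  }
}
```

The instant itself never changes; only its rendering depends on the zone. That is why the same query can print a different hour on a cluster configured with a different session time zone.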
  3. Functions that return the current time
    tEnv.executeSql("create view myView1 as select localtime, localtimestamp, current_date, current_time, current_timestamp, current_row_timestamp(), now(), proctime()")
    val table = tEnv.sqlQuery("select * from myView1")
    table.printSchema()

    table.execute().print()

The output is as follows:

(
  `localtime` TIME(0) NOT NULL,
  `localtimestamp` TIMESTAMP(3) NOT NULL,
  `current_date` DATE NOT NULL,
  `current_time` TIME(0) NOT NULL,
  `current_timestamp` TIMESTAMP_LTZ(3) NOT NULL,
  `EXPR$5` TIMESTAMP_LTZ(3) NOT NULL,
  `EXPR$6` TIMESTAMP_LTZ(3) NOT NULL,
  `EXPR$7` TIMESTAMP_LTZ(3) NOT NULL *PROCTIME*
)
+----+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
| op | localtime |          localtimestamp | current_date | current_time |       current_timestamp |                  EXPR$5 |                  EXPR$6 |                  EXPR$7 |
+----+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
| +I |  12:59:06 | 2022-02-07 12:59:06.859 |   2022-02-07 |     12:59:06 | 2022-02-07 12:59:06.859 | 2022-02-07 12:59:06.859 | 2022-02-07 12:59:06.859 | 2022-02-07 12:59:06.862 |
+----+-----------+-------------------------+--------------+--------------+-------------------------+-------------------------+-------------------------+-------------------------+
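
In the schema above, localtimestamp is a TIMESTAMP while current_timestamp is a TIMESTAMP_LTZ, yet both print the same digits. A rough java.time analogy is sketched below, assuming a UTC+8 session zone (Asia/Shanghai) and a fixed clock chosen to reproduce the printed instant; none of these names come from Flink.

```scala
import java.time.{Clock, Instant, LocalDate, LocalDateTime, LocalTime, ZoneId}

object CurrentTimeSketch {

  // Assumed session time zone and a fixed clock, so the results are reproducible.
  val zone: ZoneId = ZoneId.of("Asia/Shanghai")
  val clock: Clock = Clock.fixed(Instant.parse("2022-02-07T04:59:06.859Z"), zone)

  // Analogue of localtimestamp: a wall-clock value with no zone attached.
  def localTimestamp: LocalDateTime = LocalDateTime.now(clock)

  // Analogue of current_timestamp: an absolute instant.
  def currentTimestamp: Instant = Instant.now(clock)

  // Analogue of current_date.
  def currentDate: LocalDate = LocalDate.now(clock)

  // Analogue of localtime with precision 0 (fractional seconds dropped).
  def localTime: LocalTime = LocalTime.now(clock).withNano(0)

  def main(args: Array[String]): Unit = {
    println(localTimestamp)   // 2022-02-07T12:59:06.859
    println(currentTimestamp) // 2022-02-07T04:59:06.859Z
    println(currentDate)      // 2022-02-07
    println(localTime)        // 12:59:06
  }
}
```

The instant and the local value only look identical once the instant is rendered in the session zone, which is what the printed table does for the TIMESTAMP_LTZ columns.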