5. Operations on Streaming DataFrames/Datasets in Structured Streaming

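This article covers streaming DataFrames/Datasets in Structured Streaming: how to create them, the available input sources, selection, projection, and aggregation, Join operations (both Stream-static Joins and Stream-stream Joins), and the operations that are not supported. It focuses on inner and outer Stream-stream Joins, the watermark mechanism, and the global watermark policy, and also lists the DataFrame/Dataset operations that Structured Streaming does not support.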

Contents

1. Creation

2. Input Sources

3. Operations: Selection, Projection, and Aggregation

4. Join Operations

(1) Stream-static Joins

(2) Stream-stream Joins

6. Unsupported Operations


1. Creation

    import org.apache.spark.sql.SparkSession
    val sqLContext = SparkSession.builder().appName("event-time-window_App").getOrCreate()
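With a session in place, a streaming DataFrame is obtained through `readStream` instead of `read`. The following is a minimal sketch of that step, not the article's own code: the local master, the choice of the built-in rate source, and the object name `CreateStreamingDataFrame` are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical object name, used only for this sketch.
object CreateStreamingDataFrame {
  def main(args: Array[String]): Unit = {
    // Assumption: a local[2] master so the sketch runs without a cluster.
    val spark = SparkSession.builder()
      .appName("event-time-window_App")
      .master("local[2]")
      .getOrCreate()

    // readStream returns a DataStreamReader; load() yields a streaming DataFrame.
    val rateDf: DataFrame = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "100")
      .load()

    // A DataFrame created through readStream reports itself as streaming.
    println(rateDf.isStreaming)

    // The rate source produces a timestamp column and a value column.
    rateDf.printSchema()

    spark.stop()
  }
}
```

Nothing is read until a query is started with `writeStream ... start()`; the sketch only builds the DataFrame and inspects it.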

2. Input Sources

Each built-in source is described below by its options, whether it is fault-tolerant, and any notes.

File source
Options:
path: path to the input directory, common to all file formats.
maxFilesPerTrigger: maximum number of new files to be considered in every trigger (default: no max).
latestFirst: whether to process the latest new files first, useful when there is a large backlog of files (default: false).
fileNameOnly: whether to check new files based on only the filename instead of on the full path (default: false). With this set to `true`, the following files would be considered as the same file, because their filenames, "dataset.txt", are the same:
"file:///dataset.txt"
"s3://a/dataset.txt"
"s3n://a/b/dataset.txt"
"s3a://a/b/c/dataset.txt"
For file-format-specific options, see the related methods in DataStreamReader (Scala/Java/Python/R). E.g. for "parquet" format options see DataStreamReader.parquet().
In addition, there are session configurations that affect certain file formats. See the SQL Programming Guide for more details. E.g., for "parquet", see the Parquet configuration section.
Fault-tolerant: Yes
Notes: Supports glob paths, but does not support multiple comma-separated paths/globs.

Socket source
Options:
host: host to connect to, must be specified.
port: port to connect to, must be specified.
Fault-tolerant: No

Rate source
Options:
rowsPerSecond (e.g. 100, default: 1): how many rows should be generated per second.
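
The file source options listed above are passed through `option(...)` on the reader. The sketch below shows one way this might look; the schema columns, the temporary input directory, the local master, and the object name `FileSourceOptionsSketch` are assumptions made for illustration and do not come from the original.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

// Hypothetical object name, used only for this sketch.
object FileSourceOptionsSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local[2] master so the sketch runs without a cluster.
    val spark = SparkSession.builder()
      .appName("file-source-options-sketch")
      .master("local[2]")
      .getOrCreate()

    // File sources require a user-supplied schema by default.
    // These two columns are hypothetical.
    val schema = new StructType()
      .add("name", StringType)
      .add("age", IntegerType)

    // Assumption: an empty temporary directory stands in for the real input path.
    // The directory has to exist when the reader is created.
    val inputDir = Files.createTempDirectory("stream-input").toString

    val people = spark.readStream
      .schema(schema)
      .option("maxFilesPerTrigger", "10") // at most 10 new files per trigger
      .option("latestFirst", "true")      // process the newest files first
      .option("fileNameOnly", "true")     // identify files by name, not full path
      .csv(inputDir)                      // path: the input directory

    println(people.isStreaming)
    people.printSchema()

    spark.stop()
  }
}
```

The option values are given as strings here; `DataStreamReader.option` also has overloads for Boolean, Long, and Double values.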