Concepts: Timely Stream Processing

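This article introduces the concepts of timely stream processing and stresses the difference between event time and processing time. Event time is the time at which an event actually occurred, whereas processing time depends on the system clock of the machine executing the operation. Event time and the watermark mechanism play a key role when handling late and out-of-order events. Watermarks measure progress in an event-time stream and ensure that events in an out-of-order stream are processed correctly. The article also discusses watermarks in parallel streams, late elements, and windowing strategies.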

Timely Stream Processing

Introduction

Timely stream processing is an extension of stateful stream processing in which time plays some role in the computation. This is the case, among other things, when you do time series analysis, when you aggregate over certain time periods (typically called windows), or when you do event processing where the time at which an event occurred matters.

In the following sections we highlight some of the topics that you should consider when working with timely Flink applications.

Notions of Time: Event Time and Processing Time

When referring to time in a streaming program (for example, to define windows), one can refer to different notions of time:

  • Processing time: Processing time refers to the system time of the machine that is executing the respective operation.
    When a streaming program runs on processing time, all time-based operations (like time windows) use the system clock of the machines that run the respective operator. An hourly processing time window includes all records that arrived at a specific operator between the times when the system clock indicated the full hour. For example, if an application begins running at 9:15am, the first hourly processing time window will include events processed between 9:15am and 10:00am, the next window will include events processed between 10:00am and 11:00am, and so on.
    Processing time is the simplest notion of time and requires no coordination between streams and machines. It provides the best performance and the lowest latency. However, in distributed and asynchronous environments processing time does not provide determinism, because it is susceptible to the speed at which records arrive in the system (for example, from the message queue), to the speed at which the records flow between operators inside the system, and to outages (scheduled or otherwise).
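The hourly-window example above can be illustrated with a minimal sketch in plain Python. It does not use Flink's API: the helper `hourly_window` and the sample dates are hypothetical and only mimic how a clock-aligned hourly window would be chosen from the wall-clock time at which a record is processed.

```python
from datetime import datetime, timedelta
from typing import Tuple

HOUR = timedelta(hours=1)


def hourly_window(processing_time: datetime) -> Tuple[datetime, datetime]:
    """Return the [start, end) bounds of the clock-aligned hourly window
    that contains the given processing timestamp (hypothetical helper)."""
    start = processing_time.replace(minute=0, second=0, microsecond=0)
    return start, start + HOUR


# An application that starts at 9:15 processes its first records in the
# window that ends at 10:00, so that first window is only partially filled.
job_start = datetime(2024, 1, 1, 9, 15)
first = hourly_window(job_start)

# The same record can land in different windows depending on when it
# happens to be processed, which is why processing time is not deterministic.
on_time = hourly_window(datetime(2024, 1, 1, 9, 59, 30))
delayed = hourly_window(datetime(2024, 1, 1, 10, 0, 5))

print("first window:", first[0].time(), "to", first[1].time())
print("same window when on time:", on_time == first)
print("same window when delayed:", delayed == first)
```

Under these assumptions, a record that is held up for a few seconds around the full hour falls into the next window. That is the kind of sensitivity to arrival speed and outages described above.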
