Spark Streaming in Practice

北京小辉

Category: [Big Data] Hands-on Practice

Article link: https://blog.csdn.net/silentwolfyh/article/details/70156432

Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams. Data can be ingested from many sources like Kafka, Flume, Twitter, ZeroMQ, Kinesis, or TCP sockets, and can be processed using complex algorithms expressed with high-level functions like map, reduce, join and window. Finally, processed data can be pushed out to filesystems, databases, and live dashboards. In fact, you can apply Spark’s machine learning and graph processing algorithms on data streams.

![Figure](https://imgconvert.csdnimg.cn/aHR0cDovL2ltZy5ibG9nLmNzZG4ubmV0LzIwMTcwNDEzMTExNzAxMDYz?x-oss-process=image/format,png)

2. Spark Streaming: A Quick Example

<dependency>
   <groupId>org.apache.spark</groupId> 
   <artifactId>spark-streaming_2.11</artifactId> 
   <version>2.1.0</version> 
</dependency>

3. Discretized Streams (DStreams)

Recommended background reading: Spark Streaming startup and execution steps, and understanding DStreams
http://blog.csdn.net/silentwolfyh/article/details/70157445

Discretized Stream or DStream is the basic abstraction provided by Spark Streaming. A DStream is represented by a continuous series of RDDs.

![A DStream as a continuous series of RDDs](https://imgconvert.csdnimg.cn/aHR0cDovL2ltZy5ibG9nLmNzZG4ubmV0LzIwMTcwNDEzMTExOTA4ODc0?x-oss-process=image/format,png)

Note: a DStream is the basic abstraction, a pipeline of data in which each batch interval (Duration) corresponds to one RDD.
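To make the "one RDD per batch interval" idea concrete, here is a minimal plain-Python sketch. It does not use the Spark API: the function names `discretize` and `word_count` and the sample events are assumptions made up for illustration. Each inner list plays the role of one RDD, and the same transformation is applied to every batch in turn.

```python
from collections import Counter, defaultdict


def discretize(events, batch_seconds):
    """Group (timestamp, line) events into consecutive batch intervals.

    Returns the batches in time order. Each batch is a list of lines and
    stands in for one RDD of the DStream; intervals that received no data
    become empty batches.
    """
    if not events:
        return []
    buckets = defaultdict(list)
    for ts, line in events:
        buckets[int(ts // batch_seconds)].append(line)
    first, last = min(buckets), max(buckets)
    return [buckets.get(i, []) for i in range(first, last + 1)]


def word_count(batch):
    """Count words in one batch, as a per-batch transformation would."""
    return Counter(word for line in batch for word in line.split())


# Hypothetical input: (arrival time in seconds, text line).
events = [
    (0.5, "hello spark"),
    (1.2, "hello streaming"),
    (2.1, "spark"),
    (6.0, "hello"),
]

batches = discretize(events, batch_seconds=2)
counts_per_batch = [dict(word_count(b)) for b in batches]

for i, counts in enumerate(counts_per_batch):
    print(f"batch {i}: {counts}")
```

With a 2-second batch interval the four events fall into four consecutive batches, one of which is empty. The counts of one batch know nothing about the others, which is why windowing or external state is needed for longer-running statistics.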

4. DStream Time Windows

Spark Streaming also provides windowed computations, which allow you to apply transformations over a sliding window of data. The following figure illustrates this sliding window.

![Sliding window over a DStream](https://imgconvert.csdnimg.cn/aHR0cDovL2ltZy5ibG9nLmNzZG4ubmV0LzIwMTcwNDEzMTEyMDIzNjg4?x-oss-process=image/format,png)
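The sliding window can be sketched in plain Python as follows. This is a simplified simulation, not Spark code: `sliding_windows` is a hypothetical helper, and the window length and slide interval are expressed as a number of batches rather than as durations. It assumes a window is evaluated every `slide` batches and covers at most the last `window_len` batches, so the earliest windows may be shorter.

```python
from collections import Counter


def sliding_windows(batches, window_len, slide):
    """Return the merged contents of each window over a list of batches.

    A window is evaluated after every `slide` batches and contains the
    items of the most recent `window_len` batches (fewer at the start,
    when not enough batches have arrived yet).
    """
    windows = []
    for end in range(slide, len(batches) + 1, slide):
        start = max(0, end - window_len)
        merged = [item for batch in batches[start:end] for item in batch]
        windows.append(merged)
    return windows


# Hypothetical batches of words, one list per batch interval.
batches = [["a"], ["b", "a"], ["c"], ["a"], ["b"]]

windows = sliding_windows(batches, window_len=3, slide=2)
window_counts = [dict(Counter(w)) for w in windows]

for i, counts in enumerate(window_counts):
    print(f"window {i}: {counts}")
```

Because the window length (3 batches) is larger than the slide (2 batches), consecutive windows overlap and the overlapping batch is processed twice. That repeated work is cheap here but grows with the window size.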

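With Spark Streaming, each computation sees only the data in the current batch; window operations let you process data from a past period that spans multiple batches. As a simple example, suppose you need to compute the PV (page views) and UV (unique visitors) for the current hour every 10 seconds. When the data volume is very large, a window operation is not a good choice for this; the statistics are usually maintained with the help of an external store such as Redis or HBase.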

![Figure](https://imgconvert.csdnimg.cn/aHR0cDovL2ltZy5ibG9nLmNzZG4ubmV0LzIwMTcwNDEzMTEyMTAxMDY1?x-oss-process=image/format,png)
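One way to picture the external-store approach is the sketch below, written in plain Python under several assumptions. `InMemoryStore` is a hypothetical stand-in for Redis or HBase, the `pv:`/`uv:` key layout is invented for illustration, and events are assumed to be `(epoch_seconds, user_id)` pairs. Each batch only adds its own small contribution to per-hour state, so no hour-long window has to be held or recomputed. With a real Redis, the counter and set updates would roughly correspond to the `INCRBY`, `SADD`, and `SCARD` commands.

```python
from collections import defaultdict
from datetime import datetime, timezone


class InMemoryStore:
    """Hypothetical stand-in for an external store such as Redis."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.sets = defaultdict(set)

    def incr_by(self, key, amount):
        self.counters[key] += amount

    def add_members(self, key, members):
        self.sets[key].update(members)

    def get_count(self, key):
        return self.counters.get(key, 0)

    def set_size(self, key):
        return len(self.sets.get(key, ()))


def hour_key(epoch_seconds):
    """Format the UTC hour of an event as YYYYMMDDHH."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y%m%d%H")


def process_batch(store, batch):
    """Fold one batch of (epoch_seconds, user_id) events into the store."""
    page_views = defaultdict(int)
    visitors = defaultdict(set)
    for ts, user_id in batch:
        hour = hour_key(ts)
        page_views[hour] += 1
        visitors[hour].add(user_id)
    # One write per hour touched by the batch, not one per event.
    for hour, count in page_views.items():
        store.incr_by("pv:" + hour, count)
    for hour, users in visitors.items():
        store.add_members("uv:" + hour, users)


store = InMemoryStore()
# Hypothetical 10-second batches; the third one falls into the next hour.
process_batch(store, [(0, "u1"), (5, "u2"), (9, "u1")])
process_batch(store, [(12, "u3"), (18, "u1")])
process_batch(store, [(3600, "u1")])

first_hour = hour_key(0)
second_hour = hour_key(3600)
print(first_hour, store.get_count("pv:" + first_hour), store.set_size("uv:" + first_hour))
print(second_hour, store.get_count("pv:" + second_hour), store.set_size("uv:" + second_hour))
```

Keeping the UV set in the store means a visitor seen in several batches of the same hour is still counted once, which per-batch counts alone could not guarantee.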
