Spark Streaming Core Concepts

Core Concept: StreamingContext
In IDEA, search for StreamingContext.scala to see its constructors:
def this(sparkContext: SparkContext, batchDuration: Duration) = {
  this(sparkContext, null, batchDuration)
}

def this(conf: SparkConf, batchDuration: Duration) = {
  this(StreamingContext.createNewSparkContext(conf), null, batchDuration)
}
The batch interval can be set according to your application's latency requirements and the resources available on the cluster.
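As a minimal sketch of the second constructor above (the app name, master, and 5-second batch interval are assumptions, not values from the notes):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumed app name and master; adjust for your cluster.
val conf = new SparkConf().setAppName("StreamingApp").setMaster("local[2]")

// The second argument is the batch interval: here, one micro-batch every 5 seconds.
val ssc = new StreamingContext(conf, Seconds(5))
```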
After a context is defined, you have to do the following:
- Define the input sources by creating input DStreams.
- Define the streaming computations by applying transformation and output operations to DStreams.
- Start receiving data and processing it using streamingContext.start().
- Wait for the processing to be stopped (manually or due to any error) using streamingContext.awaitTermination().
- The processing can be manually stopped using streamingContext.stop().
Discretized Streams (DStreams)
Internally, a DStream is represented by a continuous series of RDDs
Each RDD in a DStream contains data from a certain interval
Operators applied to a DStream, such as map or flatMap, are translated under the hood into the same operation on each RDD in the DStream, because a DStream is made up of the RDDs of its successive batches.
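A sketch of this per-RDD translation (the socket host and port are assumptions): each batch of lines becomes one RDD, and the flatMap/map below are applied to every one of those RDDs in turn.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("DStreamOps").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(5))

// One RDD[String] of lines per batch interval.
val lines = ssc.socketTextStream("localhost", 9999)

// These run as RDD.flatMap / RDD.map on every batch's RDD.
val words = lines.flatMap(_.split(" "))   // RDD[String] per batch
val pairs = words.map(word => (word, 1))  // RDD[(String, Int)] per batch
```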
Input DStreams and Receivers
Input DStreams are DStreams representing the stream of input data received from streaming sources.
Every input DStream (except file stream, discussed later in this section) is associated with a Receiver (Scala doc, Java doc) object which receives the data from a source and stores it in Spark's memory for processing.
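Because each receiver runs as a long-lived task, it occupies a core; when running locally the master must be local[n] with n greater than the number of receivers. A sketch (host and port are assumptions):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

// local[2]: one core for the socket receiver, at least one for processing.
val conf = new SparkConf().setAppName("ReceiverDemo").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(5))

// socketTextStream creates a receiver that stores incoming lines
// in Spark's memory (MEMORY_AND_DISK_SER_2 by default).
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER_2)
```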
Transformations
Output Operations
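The two headings above can be illustrated together: transformations (flatMap, map, reduceByKey) lazily define new DStreams, while output operations (print, foreachRDD) are what actually trigger execution for each batch. A sketch with assumed host, port, and names:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("TransformAndOutput").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(5))
val lines = ssc.socketTextStream("localhost", 9999)

// Transformations: lazily build new DStreams, batch by batch.
val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

// Output operations: trigger the computation on each batch's RDD.
counts.print()                // prints the first elements of every batch
counts.foreachRDD { rdd =>    // arbitrary per-RDD side effects
  println(s"batch size: ${rdd.count()}")
}
```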
Hands-on Examples
Example 1: Processing socket data with Spark Streaming
Error encountered: java.lang.NoClassDefFoundError: net/jpountz/util/SafeUtils
Search the Maven repository for the missing class: net.jpountz.util.SafeUtils ships in the lz4 compression library (artifact net.jpountz.lz4:lz4), so adding that dependency to the project should resolve the error.
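A minimal sketch of Example 1 (the object name, host, port, and batch interval are assumptions; feed it input with `nc -lk 9999`):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object NetworkWordCount {
  def main(args: Array[String]): Unit = {
    // local[2]: the socket receiver needs its own core.
    val conf = new SparkConf().setAppName("NetworkWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Receive lines from the socket, split into words, count per batch.
    val lines = ssc.socketTextStream("localhost", 9999)
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```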
Example 2: Processing HDFS file data with Spark Streaming
Notes on using a file system as the source (from the official documentation):
- All files must be in the same data format.
- Files must be created in the monitored directory by atomically moving or renaming them into it; once moved, they must not be changed.
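A minimal sketch of Example 2 (the object name and HDFS directory path are assumptions); textFileStream monitors the directory and processes files that appear in it:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FileWordCount {
  def main(args: Array[String]): Unit = {
    // File streams use no receiver, so a single local core is enough.
    val conf = new SparkConf().setAppName("FileWordCount").setMaster("local")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Picks up files newly moved into the directory; existing files are ignored.
    val lines = ssc.textFileStream("hdfs://namenode:8020/streaming/input/")
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```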