[Spark] Spark Streaming: A Getting-Started Example

晚风中的自由

Categories: Spark, Big Data. Tags: Spark

Article link: https://blog.csdn.net/u014028317/article/details/103264665

Official example: http://spark.apache.org/docs/latest/streaming-programming-guide.html

Spark Streaming demo: read data from a socket in real time and process it as it arrives.

First, check whether nc is installed:

rpm -qa | grep nc

If it is not installed, install nc first.

Download: http://vault.centos.org/6.6/os/x86_64/Packages/nc-1.84-22.el6.x86_64.rpm

Install:

rpm -Uvh --force --nodeps nc-1.84-22.el6.x86_64.rpm

Start nc listening on port 9999:

nc -lk 9999
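Where nc cannot be installed, a small line server can stand in for it. The following is only a sketch under stated assumptions: `LineServer` and `serve` are hypothetical names, not part of Spark or of the original post, and unlike `nc -lk` it serves a single client and then exits.

```scala
import java.io.PrintWriter
import java.net.ServerSocket
import scala.io.StdIn

// Hypothetical stand-in for `nc -lk 9999`; not part of Spark.
object LineServer {
  // Waits for one client, sends it every line, then closes the connection.
  def serve(server: ServerSocket, lines: Iterator[String]): Unit = {
    val client = server.accept()
    try {
      // autoFlush = true, so each println is sent immediately
      val out = new PrintWriter(client.getOutputStream, true)
      lines.foreach(line => out.println(line))
    } finally client.close()
  }

  def main(args: Array[String]): Unit = {
    // Port 9999 matches the nc command above; an optional argument overrides it.
    val port = if (args.nonEmpty) args(0).toInt else 9999
    val server = new ServerSocket(port)
    try {
      // Forward lines typed on stdin until end of input (Ctrl-D).
      val typed = Iterator.continually(StdIn.readLine()).takeWhile(_ != null)
      serve(server, typed)
    } finally server.close()
  }
}
```

Start it before the Spark demo, then type lines into its terminal just as with nc.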

Run the demo (the hostname is the machine where nc is running):

bin/run-example streaming.NetworkWordCount hadoop-senior.ibeifeng.com 9999

Type some data into the nc terminal; the Spark terminal then shows the result after the application has processed the input.

$ nc -lk 9999
hello world

Result on the Spark side:

$ ./bin/run-example streaming.NetworkWordCount localhost 9999
...
-------------------------------------------
Time: 1357008430000 ms
-------------------------------------------
(hello,1)
(world,1)
...
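The per-batch result above comes from three steps: split each line into words, pair each word with 1, and sum the 1s per word. As a rough illustration, the same logic can be run on a single batch with plain Scala collections and no Spark at all. `BatchWordCount` and `countWords` are hypothetical names used only for this sketch, and `groupBy` plus a sum stands in for Spark's `reduceByKey`.

```scala
// Plain-Scala illustration of one batch; no Spark dependency.
object BatchWordCount {
  // Counts the words in one batch of lines.
  def countWords(batch: Seq[String]): Map[String, Int] =
    batch
      .flatMap(_.split(" "))    // split each line into words
      .map(word => (word, 1))   // pair each word with a count of 1
      .groupBy(_._1)            // gather the pairs by word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum per word

  def main(args: Array[String]): Unit = {
    // One batch holding the single line typed into nc above.
    countWords(Seq("hello world")).toSeq.sortBy(_._1).foreach(println)
    // prints (hello,1) then (world,1)
  }
}
```

In the streaming job this computation is repeated for every one-second batch, which is why each block of output carries its own timestamp.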

Example code:

import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._ // not necessary since Spark 1.3

// Create a local StreamingContext with two worker threads and a batch interval of 1 second.
// The master requires 2 cores to prevent a starvation scenario.

val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
val ssc = new StreamingContext(conf, Seconds(1))

// Create a DStream that will connect to hostname:port, like localhost:9999
val lines = ssc.socketTextStream("localhost", 9999)

// Split each line into words
val words = lines.flatMap(_.split(" "))

// Count each word in each batch
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)

// Print the first ten elements of each RDD generated in this DStream to the console
wordCounts.print()

ssc.start()             // Start the computation
ssc.awaitTermination()  // Wait for the computation to terminate

Under the hood, Spark Streaming runs on Spark Core: a DStream is a sequence of RDDs, one per batch interval.
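One way to see the relationship with Spark Core is `foreachRDD`, which hands each batch to your code as an ordinary RDD. The following is a minimal sketch, assuming the same socket source on localhost:9999 and Spark Streaming on the classpath. The object name `NetworkWordCountPerBatch` and the choice to print the ten most frequent words are illustrative and not part of the original example.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Illustrative variant of the word count above; the object name is hypothetical.
object NetworkWordCountPerBatch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCountPerBatch")
    val ssc = new StreamingContext(conf, Seconds(1))

    val wordCounts = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Each batch arrives as a plain RDD, so Spark Core operations such as
    // sortBy and take can be applied to it directly.
    wordCounts.foreachRDD { (rdd, time) =>
      val top = rdd.sortBy(_._2, ascending = false).take(10)
      println(s"Batch at ${time.milliseconds} ms: ${top.mkString(", ")}")
    }

    ssc.start()            // Start the computation
    ssc.awaitTermination() // Wait for the computation to terminate
  }
}
```

Compared with `print()`, this gives full control over what happens to each batch, for example writing it to external storage instead of the console.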
