Spark Streaming
霄嵩
updateStateByKey example (Scala version)
package SparkStreaming import org.apache.spark.SparkConf import org.apache.spark.streaming.{Seconds, StreamingContext} /** Created by tg on 11/4/16. */ object updateStateByKeyPro { …
Original · 2016-11-05 17:18:39 · 2168 views · 0 comments
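Spark Streaming architecture and internals, illustrated
I spent a whole evening drawing this diagram and finally finished it, so I am sharing it with everyone here.
Original · 2018-07-27 18:50:29 · 1402 views · 0 comments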
Developing a reduceByKeyAndWindow example in Spark Streaming
package SparkStreamingTest.Scala import org.apache.log4j.{Level, Logger} import org.apache.spark.SparkConf import org.apache.spark.streaming.{Seconds, StreamingContext} /** Created by TG. Every...
Original · 2018-07-01 11:03:14 · 813 views · 0 comments
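Spark Streaming performance tuning
I. Tuning the parallelism of data receiving. 1. When data is received over the network (for example from Kafka or Flume), it is deserialized and stored in Spark's memory. If data receiving becomes the bottleneck of the system, consider parallelizing it. Each input DStream starts one Receiver on an Executor of some Worker, and that Receiver receives a single data stream. You can therefore create multiple input DStreams and configure them to receive different partitions of the data source, so as to receive multiple...
Original · 2018-03-10 12:48:54 · 1837 views · 1 comment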
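The receiver-parallelism idea in the tuning entry above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the object name, the `localhost` socket sources, the three port numbers, and the word-count step are all assumptions chosen to keep the example small. It creates one input DStream per source and merges them with `StreamingContext.union`, so that several Receivers ingest data at the same time.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical sketch: several receivers running in parallel, merged into one DStream.
object ParallelReceiveSketch {
  def main(args: Array[String]): Unit = {
    // Each Receiver permanently occupies one core, so in local mode the number
    // of cores (4 here) must be larger than the number of receivers (3 here).
    val conf = new SparkConf()
      .setAppName("ParallelReceiveSketch")
      .setMaster("local[4]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Assumed sources: three socket servers on localhost (illustrative only).
    val ports = Seq(9997, 9998, 9999)

    // One input DStream per source; each one starts its own Receiver.
    val streams = ports.map(port => ssc.socketTextStream("localhost", port))

    // Merge the per-receiver streams so downstream logic sees a single DStream.
    val unified = ssc.union(streams)

    // Any processing can follow; a word count is used here as a placeholder.
    unified
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With a partitioned source such as Kafka, the same pattern would apply with one input DStream per group of partitions in place of the socket streams.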
Spark SQL + Spark Streaming example
package SparkStreaming import org.apache.spark.SparkConf import org.apache.spark.sql.{Row, SQLContext} import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType} import …
Original · 2016-11-10 16:34:41 · 2198 views · 0 comments
Real-time wordcount program
package gh.spark.SparkStreaming; import java.util.Arrays; import java.util.Iterator; import java.util.List; import org.apache.spark.SparkConf; import org.apache.spark.api.java.function.FlatMapF…
Original · 2016-11-02 18:21:56 · 452 views · 0 comments
Pushing data monitored by Flume to Spark Streaming (Scala version)
package SparkStreaming import org.apache.spark.SparkConf import org.apache.spark.streaming.flume.FlumeUtils import org.apache.spark.streaming.{Seconds, StreamingContext} /** Created by …
Original · 2016-11-09 16:53:14 · 545 views · 0 comments
Real-time hot search term statistics over a sliding window with reduceByKeyAndWindow (Java version)
package gh.spark.SparkStreaming; import java.util.List; import org.apache.spark.SparkConf; import org.apache.spark.api.java.JavaPairRDD; import org.apache.spark.api.java.function.Function; im…
Original · 2016-11-08 13:43:22 · 3777 views · 0 comments
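Real-time hot search term statistics over a sliding window with reduceByKeyAndWindow (Scala version)
package SparkStreaming import org.apache.spark.SparkConf import org.apache.spark.streaming.{Seconds, StreamingContext} /** Real-time hot search term statistics over a sliding window. Every 5 seconds, count how often each search term was searched during the last 20 seconds, and print …
Original · 2016-11-08 13:41:42 · 6128 views · 0 comments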
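The window described in the entry above (recomputed every 5 seconds over the last 20 seconds) might look like the sketch below. The original post is truncated, so this is not its code: the object name, the socket source, the assumed `user searchTerm` line format, and the choice to print the top 3 terms are all illustrative assumptions. Only the 20-second window and 5-second slide come from the entry itself.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical sketch of sliding-window search term counting.
object WindowHotWordsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WindowHotWordsSketch")
      .setMaster("local[2]")
    // Window length and slide interval must be multiples of the batch interval.
    val ssc = new StreamingContext(conf, Seconds(5))

    // Assumed input: one search log per line, formatted as "user searchTerm".
    val searchLogs = ssc.socketTextStream("localhost", 9999)

    // Keep well-formed lines and map each one to (searchTerm, 1).
    val termPairs = searchLogs
      .map(_.trim.split("\\s+"))
      .filter(_.length >= 2)
      .map(fields => (fields(1), 1))

    // Every 5 seconds, sum the counts of each term over the last 20 seconds.
    val windowedCounts = termPairs.reduceByKeyAndWindow(
      (a: Int, b: Int) => a + b,
      Seconds(20), // window length
      Seconds(5)   // slide interval
    )

    // Assumed output step: print the 3 most searched terms of each window.
    windowedCounts.foreachRDD { rdd =>
      rdd.sortBy(_._2, ascending = false)
        .take(3)
        .foreach { case (term, count) => println(s"$term: $count") }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With a 5-second batch interval, each window covers four batches, and consecutive windows overlap by three of them.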
Real-time blacklist filtering of ad billing logs with transform (Scala version)
package SparkStreaming import org.apache.spark.SparkConf import org.apache.spark.streaming.{Seconds, StreamingContext} /** Created by tg on 11/6/16. */ object transformDemo { def …
Original · 2016-11-08 13:39:58 · 790 views · 0 comments
Real-time blacklist filtering of ad billing logs with transform (Java version)
package gh.spark.SparkStreaming; import java.util.ArrayList; import org.apache.spark.SparkConf; import org.apache.spark.api.java.JavaPairRDD; import org.apache.spark.api.java.JavaRDD; import …
Original · 2016-11-08 13:37:55 · 655 views · 0 comments
updateStateByKey example (Java version)
package gh.spark.SparkStreaming; import java.util.Arrays; import java.util.Iterator; import java.util.List; import org.apache.spark.SparkConf; import org.apache.spark.api.java.Optional; // note: O…
Original · 2016-11-05 17:17:08 · 815 views · 0 comments
Real-time WordCount using the Kafka Receiver approach
import kafka.serializer.StringDecoder import org.apache.log4j.{Level, Logger} import org.apache.spark.SparkConf import org.apache.spark.storage.StorageLevel import org.apache.spark.streaming.kafka.Kaf...
Original · 2018-08-01 17:56:02 · 577 views · 0 comments