spark streaming (13 posts)
hjw199089
Working in big data:
(1) Query engine development: deep Presto customization, Hive development, in-house query engine development
(2) Big data user behavior analysis
(3) Hands-on experience with Spark, Spark Streaming, Storm, and Druid
(4) Data warehouse development
spark-streaming-[8]-Spark Streaming + Kafka Integration Guide 0.8.2.1 study notes
Spark Streaming + Kafka Integration Guide (Kafka broker version 0.8.2.1 or higher). Here we explain how to configure Spark Streaming to receive data from Kafka. There are two approaches to this - th…
Original · 2017-05-07 11:52:15 · 899 views · 0 comments
Spark Streaming Programming Guide
Spark Streaming Programming Guide notes
Translation · 2017-10-19 21:05:41 · 275 views · 0 comments
spark-streaming-[9]-Spark Streaming consuming Kafka: Direct Approach
As covered in the spark-streaming-[8] Spark Streaming + Kafka Integration Guide 0.8.2.1 study notes: There are two approaches to this - the old approach using Receivers and Kafka's high-level API, and a new approach (introd…
Original · 2017-05-07 15:34:23 · 2001 views · 0 comments
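The Direct Approach this entry covers can be sketched roughly as follows — a minimal sketch assuming the spark-streaming-kafka-0-8 artifact on the classpath; the broker address and the topic name "logs" are illustrative, not from the post:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectKafkaWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("DirectKafkaWordCount")
    val ssc = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics = Set("logs")

    // No receiver: each batch reads its own offset range straight from Kafka.
    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    messages.map(_._2)          // drop the Kafka key, keep the message value
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because there is no receiver and no write-ahead log, exactly-once semantics hinge on how the consumed offsets are tracked, which is what the ZooKeeper-offset entry below in this list deals with.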
spark-streaming-[3]-Transform
Transform Operation: Return a new DStream by applying an RDD-to-RDD function to every RDD of the source DStream. This can be used to do arbitrary RDD operations on the DStream. Th…
Original · 2017-05-01 22:00:12 · 345 views · 0 comments
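A common use of transform is joining each micro-batch against a static RDD, an RDD-to-RDD operation not otherwise exposed on DStreams. A minimal sketch — the blacklist contents and the host/port are illustrative assumptions:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TransformBlacklist {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("TransformBlacklist")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Static blacklist as a plain RDD of (word, true) pairs for the join.
    val blacklist = ssc.sparkContext.parallelize(Seq("spam", "ad")).map((_, true))

    val filtered = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map((_, 1))
      .transform { rdd =>
        // Inside transform any RDD operation is available, e.g. leftOuterJoin.
        rdd.leftOuterJoin(blacklist)
          .filter { case (_, (_, flagged)) => flagged.isEmpty } // keep non-blacklisted
          .map { case (word, (count, _)) => (word, count) }
      }

    filtered.reduceByKey(_ + _).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```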
Spark Streaming in practice: ways to read Kafka data
Reposted from: Spark Streaming场景应用-Kafka数据读取方式. Overview: Spark Streaming supports reading from a variety of real-time input sources, including Kafka, Flume, socket streams, and so on. Since our business scenarios involve no real-time sources other than Kafka, those are not discussed here. This article focuses on our current business scenario and looks only at the ways Spark Streaming reads Kafka data. Spark St…
Repost · 2017-05-31 19:05:26 · 3730 views · 0 comments
spark-streaming-[10]-Saving and reusing Kafka offsets in ZooKeeper with Spark Streaming
Reposted from: Spark Streaming 中使用 zookeeper 保存 offset 并重用 (thanks for sharing). When consuming Kafka data in Spark Streaming there are two approaches: 1) the Receiver-based createStream method, and 2) the Direct Approach (No Receivers) createDirectStream method…
Repost · 2017-05-08 11:56:59 · 1771 views · 0 comments
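With the Direct Approach, the offsets each batch consumed can be read from the stream's RDDs and persisted externally so a restarted job resumes where it left off. A sketch of reading the per-partition offset ranges — the actual ZooKeeper client calls are omitted, and broker/topic names are illustrative:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object OffsetTrackingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("OffsetTracking")
    val ssc = new StreamingContext(conf, Seconds(5))

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, Map("metadata.broker.list" -> "localhost:9092"), Set("logs"))

    stream.foreachRDD { rdd =>
      // Only the direct stream's own RDDs carry offset ranges, so the cast
      // must happen before any transformation.
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      rdd.map(_._2).count()  // process the batch first, then record offsets
      ranges.foreach { r =>
        // Persist r.untilOffset per (topic, partition) to ZooKeeper here.
        println(s"${r.topic}/${r.partition}: ${r.fromOffset} -> ${r.untilOffset}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```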
spark-streaming-[1]-streaming basics: NetworkWordCount
1. Programming framework. Define the context: val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount"); val ssc = new StreamingContext(conf, Seconds(1)). After a context is defined, you h…
Original · 2017-04-28 18:07:47 · 449 views · 0 comments
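The truncated snippet in the teaser completes into the standard NetworkWordCount program from the official examples (run `nc -lk 9999` first to provide the socket source):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object NetworkWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Each line arriving on the socket becomes a record in the DStream.
    val lines = ssc.socketTextStream("localhost", 9999)
    val wordCounts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    wordCounts.print()

    ssc.start()             // start the computation
    ssc.awaitTermination()  // wait for it to terminate
  }
}
```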
spark-streaming-[2]-stateful (running-update) operations (updateStateByKey)
Thanks for sharing; referenced from: 【Spark八十八】Spark Streaming累加器操作(updateStateByKey). updateStateByKey(func): Return a new "state" DStream where the state for each key is updated by applying the given function on the pre…
Original · 2017-04-28 19:43:23 · 1234 views · 0 comments
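A minimal running-word-count sketch with updateStateByKey — the checkpoint path and host/port are illustrative; checkpointing is mandatory because the state is carried across batches:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("StatefulWordCount")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint("/tmp/streaming-checkpoint")  // required for stateful ops

    // This batch's new counts for a key are folded into the running state.
    val updateFunc = (newValues: Seq[Int], running: Option[Int]) =>
      Some(newValues.sum + running.getOrElse(0))

    val counts = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map((_, 1))
      .updateStateByKey[Int](updateFunc)

    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```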
spark-streaming-[4]-Window Operations
Window Operations: As shown in the figure, every time the window slides over a source DStream, the source RDDs that fall within the window are combined and operated upo…
Original · 2017-05-02 10:34:58 · 604 views · 0 comments
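A sketch of a windowed word count: combine the RDDs of the last 30 seconds, recomputed every 10 seconds. Both durations must be multiples of the batch interval; the host/port are illustrative:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowedWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("WindowedWordCount")
    val ssc = new StreamingContext(conf, Seconds(10))

    val windowedCounts = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map((_, 1))
      // window length 30s, slide interval 10s
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(30), Seconds(10))

    windowedCounts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```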
spark-streaming-[5]-Design Patterns for using foreachRDD
References: 整合Kafka到Spark Streaming——代码示例和挑战 (integrating Kafka into Spark Streaming: code examples and challenges); the GitHub kafka examples; SparkStreaming之foreachRDD写mysql (writing to MySQL with foreachRDD). To be continued…
Repost · 2017-05-02 20:01:53 · 342 views · 0 comments
spark-streaming-[6]-KafkaWordCount and KafkaWordCountProducer (Receiver-based Approach)
Study notes on KafkaWordCount and KafkaWordCountProducer in Spark Streaming, based on the official GitHub code. References (thanks for sharing): 徽沪一郎, Apache Spark技术实战之1 -- KafkaWordCount; Spark-Streaming获取kafka数据的两种方式-Receiver与Direct的方式 (the two ways Spark Streaming fetches Kafka data: Receiver vs. Direct); setting up a Kafka cluster…
Original · 2017-05-03 21:21:48 · 1181 views · 0 comments
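The Receiver-based Approach of this entry looks roughly like the sketch below: a receiver consumes through Kafka's high-level API, which tracks offsets in ZooKeeper. The ZooKeeper address, group id, and topic map (topic -> receiver thread count) are illustrative:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ReceiverKafkaWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("ReceiverKafkaWordCount")
    val ssc = new StreamingContext(conf, Seconds(2))

    // createStream(ssc, zkQuorum, groupId, topics); values are (key, message).
    val lines = KafkaUtils.createStream(ssc, "localhost:2181", "wordcount-group", Map("logs" -> 1))
      .map(_._2)

    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```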
spark-streaming-[7]-Output Operations on DStreams: writing to MySQL with foreachRDD
foreachRDD(func): The most generic output operator that applies a function, func, to each RDD generated from the stream. This function should push the data in each RDD to an external system, such as…
Original · 2017-05-04 15:38:03 · 436 views · 0 comments
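The standard pattern for pushing to an external store is one connection per partition, created on the executors: not once on the driver (a connection is not serializable) and not per record (too expensive). A sketch assuming a DStream of word counts; the JDBC URL, credentials, and table are illustrative:

```scala
import java.sql.DriverManager
import org.apache.spark.streaming.dstream.DStream

object MysqlSink {
  def saveToMysql(counts: DStream[(String, Int)]): Unit = {
    counts.foreachRDD { rdd =>
      rdd.foreachPartition { partition =>
        // One connection and one prepared statement per partition.
        val conn = DriverManager.getConnection(
          "jdbc:mysql://localhost:3306/test", "user", "password")
        val stmt = conn.prepareStatement(
          "INSERT INTO wordcount (word, cnt) VALUES (?, ?)")
        partition.foreach { case (word, count) =>
          stmt.setString(1, word)
          stmt.setInt(2, count)
          stmt.executeUpdate()
        }
        stmt.close()
        conn.close()
      }
    }
  }
}
```

A connection pool shared across batches would cut the per-partition setup cost further; this sketch keeps the simpler open/close form for clarity.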
Pro Spark Streaming notes
To be continued…
Translation · 2017-10-19 21:40:11 · 293 views · 0 comments