Example: Spark Streaming + Flume Integration


push

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import org.apache.spark.streaming.flume.{FlumeUtils, SparkFlumeEvent}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object flume_push_streaming {
  Logger.getLogger("org").setLevel(Level.WARN)
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("flume_push_streaming").setMaster("local[*]")
    val ssc = new StreamingContext(sparkConf, Seconds(5))

    // Push mode: Spark hosts an Avro receiver that Flume's avro sink pushes events to.
    // "local-machine-ip" is a placeholder for the IP of the machine running this program.
    val flumeStreaming: ReceiverInputDStream[SparkFlumeEvent] =
      FlumeUtils.createStream(ssc, "local-machine-ip", 41414)
    // Decode each event body, split into words, and count them per 5-second batch.
    flumeStreaming.map(x => new String(x.event.getBody.array()).trim)
      .flatMap(_.split(" "))
      .map(x => (x, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Startup sequence:

  1. Start the Spark program first (its receiver must be listening before Flume pushes).

  2. Write flume_push_streaming.conf and start Flume (example).

  3. Run telnet against the VM's hostname, e.g. telnet hadoop01 44444, and type some test data.

  4. Check the program's console window for word-count output.

pull

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import org.apache.spark.streaming.flume.{FlumeUtils, SparkFlumeEvent}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object flume_pull_streaming {
  Logger.getLogger("org").setLevel(Level.WARN)
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("flume_pull_streaming").setMaster("local[*]")
    val ssc = new StreamingContext(sparkConf, Seconds(5))

    // Pull mode: Spark polls a custom SparkSink running inside the Flume agent on hadoop01.
    val flumeStreaming: ReceiverInputDStream[SparkFlumeEvent] =
      FlumeUtils.createPollingStream(ssc, "hadoop01", 41414)
    // Same word count as the push example.
    flumeStreaming.map(x => new String(x.event.getBody.array()).trim)
      .flatMap(_.split(" "))
      .map(x => (x, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Startup sequence:

  1. Write flume_pull_streaming.conf and start Flume first (example).
  2. Start the Spark program.
  3. Run telnet hadoop01 44444 and type some test data.
  4. Check the program's console window for word-count output.
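A matching flume_pull_streaming.conf sketch, again with assumed agent and component names: pull mode swaps the avro sink for Spark's custom SparkSink, which (per the Spark 2.2.0 integration guide linked below) requires the spark-streaming-flume-sink artifact, scala-library, and commons-lang3 jars on Flume's classpath:

```properties
# Hypothetical agent a1: netcat source -> memory channel -> Spark sink

a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop01
a1.sources.r1.port = 44444

a1.channels.c1.type = memory

# Custom sink that buffers events until the Spark receiver polls them
a1.sinks.k1.type = org.apache.spark.streaming.flume.sink.SparkSink
a1.sinks.k1.hostname = hadoop01
a1.sinks.k1.port = 41414

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Here the sink's hostname/port are what FlumeUtils.createPollingStream connects to, which is why Flume must be up before the Spark program starts.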

Comparison:

  • Code: the stream-creation call differs, createStream for push vs. createPollingStream for pull.
  • Config: the Flume sink differs, a plain avro sink for push vs. the custom SparkSink for pull.
  • Startup: the order is reversed; push starts the Spark program before Flume, pull starts Flume first.

Official docs: https://spark.apache.org/docs/2.2.0/streaming-flume-integration.html
