Approach 1: Spark Streaming integrated with Flume (push mode)
Flume uses a netcat-memory-avro pipeline (netcat source -> memory channel -> avro sink).
Local test
1. Start the Spark Streaming application locally, listening on 0.0.0.0:10000.
2. Start the Flume agent on the server.
3. Use telnet to send data to the netcat source port, and watch the output in the local IDEA console.
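The receiver class used below (com.tuzhihai.flumespark.FlumePushSpark) is not shown in these notes; a minimal sketch of what it might look like, using the push-based FlumeUtils.createStream API from spark-streaming-flume, is given here. The word-count processing is an assumption for illustration, not the original code:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Push-mode receiver sketch: createStream starts an Avro server on
// hostname:port and waits for Flume's avro sink to push events to it.
object FlumePushSpark {
  def main(args: Array[String]): Unit = {
    if (args.length != 2) {
      System.err.println("Usage: FlumePushSpark <hostname> <port>")
      System.exit(1)
    }
    val Array(hostname, port) = args

    val sparkConf = new SparkConf().setAppName("FlumePushSpark")
    val ssc = new StreamingContext(sparkConf, Seconds(5))

    // The Flume avro sink must point at this same hostname:port.
    val flumeStream = FlumeUtils.createStream(ssc, hostname, port.toInt)

    // Assumed processing: a simple word count over the event bodies.
    flumeStream
      .map(event => new String(event.event.getBody.array()).trim)
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because createStream binds a listening socket, this application must be running before Flume starts, which is why the startup order below matters.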
Server test
Package with Maven: mvn clean package -DskipTests
Upload the jar to the server.
Start Spark first:
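The mvn clean package step assumes the project declares the Spark Streaming and Flume-integration dependencies; a plausible pom.xml fragment (versions inferred from the --packages coordinate used below) might be:

```xml
<!-- Assumed dependencies; versions match the --packages coordinate -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.2.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-flume_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
```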
spark-submit \
--class com.tuzhihai.flumespark.FlumePushSpark \
--master local[2] \
--packages org.apache.spark:spark-streaming-flume_2.11:2.2.0 \
192.168.145.128 10000
Then start Flume:
flume-ng agent \
--name netcat-memory-avro \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/netcat-memory-avro.conf \
-Dflume.root.logger=INFO,console
Send data to the netcat source port (9999, as configured below):
telnet 192.168.145.128 9999
Watch the Flume console output.
Why does push mode require starting Spark first and Flume second?
Because this uses Flume push: Flume pushes data to a receiver, and that receiver must already exist before anything can be pushed to it. So start the Spark application (the Avro server that receives the data) first, then start Flume (the tool that collects and pushes the data).
netcat-memory-avro.conf (the Flume config used above):
# example netcat-memory-avro
netcat-memory-avro.sources = netcat-source
netcat-memory-avro.sinks = avro-sink
netcat-memory-avro.channels = memory-channel
# Describe/configure the source
netcat-memory-avro.sources.netcat-source.type = netcat
netcat-memory-avro.sources.netcat-source.bind = 192.168.145.128
netcat-memory-avro.sources.netcat-source.port = 9999
# Describe/configure the sink
netcat-memory-avro.sinks.avro-sink.type = avro
netcat-memory-avro.sinks.avro-sink.hostname = 192.168.145.128
netcat-memory-avro.sinks.avro-sink.port = 10000
# Use a channel which buffers events in memory
netcat-memory-avro.channels.memory-channel.type = memory
# Bind the source and sink to the channel
netcat-memory-avro.sources.netcat-source.channels = memory-channel
netcat-memory-avro.sinks.avro-sink.channel = memory-channel