A summary of Flume usage patterns: sending data to Kafka, HDFS, Hive, HTTP, netcat, and more

u011811966

Category: Big Data. Tags: flume, hive partitions, flume kafka

Article link: https://blog.csdn.net/u011811966/article/details/80952641

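This article summarizes several ways to move data with Flume: using HTTP and netcat as sources, and sending data to a logger, HDFS, or Hive, or relaying it through Kafka. It walks through the configuration files and the steps involved, such as sending data with telnet and over HTTP and storing data in HDFS and Hive, and it also covers using Avro to pass data between Flume agents.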

1. HTTP source with a logger sink: print the data to the console.

The conf file is as follows:

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

# http: accept data sent over HTTP

a1.sources.r1.type = http

# Host name or IP address of the machine running Flume

a1.sources.r1.bind = hadoop-master

# Listening port

a1.sources.r1.port = 9000

#a1.sources.r1.fileHeader = true

# Describe the sink

# logger: print the data to the console

a1.sinks.k1.type = logger

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1
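Flume loads this file as a Java properties file, where only a line that begins with `#` (or `!`) is a comment; text placed after a value on the same line becomes part of the value. The sketch below is a rough pre-flight check, not part of Flume: it parses a minimal subset of the properties format and verifies that every source and sink is bound to a declared channel. The function names `parse_properties` and `check_wiring` are made up for this illustration.

```python
CONFIG = """
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = http
a1.sources.r1.bind = hadoop-master
a1.sources.r1.port = 9000
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""


def parse_properties(text):
    """Parse 'key = value' lines (a minimal subset of the Java properties format)."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Only lines that START with '#' or '!' are comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent="a1"):
    """Return a list of problems with how sources and sinks are bound to channels."""
    problems = []
    channels = set(props.get(f"{agent}.channels", "").split())
    for src in props.get(f"{agent}.sources", "").split():
        bound = props.get(f"{agent}.sources.{src}.channels", "").split()
        if not bound:
            problems.append(f"source {src} has no channels")
        for ch in bound:
            if ch not in channels:
                problems.append(f"source {src} uses undeclared channel {ch}")
    for sink in props.get(f"{agent}.sinks", "").split():
        ch = props.get(f"{agent}.sinks.{sink}.channel")
        if ch is None:
            problems.append(f"sink {sink} has no channel")
        elif ch not in channels:
            problems.append(f"sink {sink} uses undeclared channel {ch}")
    return problems


print(check_wiring(parse_properties(CONFIG)))  # prints: []
```

Note the asymmetry the check relies on: a source takes a list under `channels`, while a sink takes exactly one name under `channel`.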

Start Flume with:

bin/flume-ng agent -c conf -f conf/http.conf -n a1 -Dflume.root.logger=INFO,console

Output like the following means Flume started successfully.

895 (lifecycleSupervisor-1-3) [INFO -org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SOURCE, name: r1 started

Open another terminal and send data with an HTTP POST:

curl -X POST -d '[{"headers":{"timestampe":"1234567","host":"master"},"body":"badou flume"}]' hadoop-master:9000

Here hadoop-master is the host name bound in the Flume configuration file, and 9000 is the bound port.
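The same request can be sent from a script. The following is a minimal sketch using only the Python standard library; it assumes the HTTP source is using its default JSON handler, which is what the curl payload above is shaped for. The helper names `build_payload` and `post_event` are invented for this example.

```python
import json
import urllib.request


def build_payload(body, headers=None):
    """Serialize one event as a JSON array with 'headers' and 'body' fields."""
    return json.dumps([{"headers": headers or {}, "body": body}]).encode("utf-8")


def post_event(url, body, headers=None, timeout=5):
    """POST one event to the HTTP source and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_payload(body, headers),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call (needs the agent from this section to be running):
# post_event("http://hadoop-master:9000", "badou flume",
#            {"timestampe": "1234567", "host": "master"})
```

The payload is an array so that several events can be sent in one request; header values are kept as strings, matching the curl example.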

The window running Flume then shows the following:

2018-06-12 08:24:04,472 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{timestampe=1234567, host=master} body: 62 61 64 6F 75 20 66 6C 75 6D 65 badou flume }
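The logger sink prints the event body twice: first as a hex dump, then as text. To check what the hex bytes stand for, a small helper like the one below can decode them. This is only an illustration (the name `decode_body` is made up), and it assumes the body is UTF-8 text and that the dump is complete; the logger sink shortens the dump for long bodies (by default it logs only the first 16 bytes).

```python
def decode_body(hex_dump, encoding="utf-8"):
    """Turn a space-separated hex dump, as printed by the logger sink, into text."""
    return bytes.fromhex(hex_dump).decode(encoding)


print(decode_body("62 61 64 6F 75 20 66 6C 75 6D 65"))  # prints: badou flume
```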

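2. netcat source (UDP or TCP mode) with a logger sink: print the data to the console. The configuration below uses type netcat, which listens on TCP.

The conf file is as follows: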

a1.sources = r1

a1.sinks = k1

a1.channels = c1

a1.sources.r1.type = netcat

# Host name or IP address to bind to

a1.sources.r1.bind = hadoop-master

a1.sources.r1.port = 44444

a1.sinks.k1.type = logger

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

Start Flume:

bin/flume-ng agent -c conf -f conf/netcat.conf -n a1 -Dflume.root.logger=INFO,console

Then, in another terminal, send data with telnet.

The command is: telnet hadoop-master 44444

[root@hadoop-master ~]# telnet hadoop-master 44444

Trying 192.168.194.6...

Connected to hadoop-master.

Escape character is '^]'.

The output above means the connection to Flume succeeded. Then type:

12213213213

12321313

Flume then receives the corresponding events:

2018-06-12 08:38:51,129 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 31 32 32 31 33 32 31 33 32 31 33 0D 12213213213. }

2018-06-12 08:38:51,130 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{} body: 31 32 33 32 31 33 31 33 0D 12321313. }
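Both bodies end in 0D, shown as a trailing dot in the text form. That is the carriage return telnet adds before the line feed: the netcat source splits events on the line feed and keeps the rest of the line. A client that ends each line with a bare LF avoids the extra byte. The sketch below is one way to do that with a plain TCP socket; `frame_lines` and `send_lines` are hypothetical helper names, and the host and port are the ones from the configuration above.

```python
import socket


def frame_lines(lines):
    """Join lines into one byte string, ending each line with a single LF."""
    return b"".join(line.encode("utf-8") + b"\n" for line in lines)


def send_lines(host, port, lines, timeout=5):
    """Send the lines over TCP and return the first chunk the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(frame_lines(lines))
        return conn.recv(1024)


# Example call (needs the agent from this section to be running):
# send_lines("hadoop-master", 44444, ["12213213213", "12321313"])
```

With its default settings the netcat source acknowledges each event with "OK", so the returned bytes should contain that reply.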

3. netcat/http source with an hdfs sink: the data…
