Flume Notes (Part 2): Usage Examples

Latest recommended article published on 2022-07-18 09:42:59

小维_

Category: Flume

Article link: https://blog.csdn.net/qq_38633279/article/details/107716686

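I still have not studied this framework seriously enough, and I do not feel I fully understand it yet. I only watched two usage examples and never got down to real hands-on work. I have been rather restless lately!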

cd app/flume/bin
./flume-ng version    # check the version

Flume core components
Reference: http://flume.apache.org/releases/content/1.9.0/FlumeUserGuide.html
1. Source
2. Channel (acts as a buffer for the data); the common types are memory-based and file-based
3. Sink
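
The two common channel types mentioned above can be sketched as configuration. This is only an illustration and not part of the original notes: it writes a sample file whose name (channel-sketch.conf) and /tmp/flume/... directories are made-up placeholders. The property names capacity, transactionCapacity, checkpointDir and dataDirs come from the Flume User Guide; the values are arbitrary.

```shell
# Write a sample channel configuration to a scratch directory.
# The file name and all paths are placeholders chosen for this sketch.
workdir="${TMPDIR:-/tmp}/flume-channel-sketch"
mkdir -p "$workdir"

cat > "$workdir/channel-sketch.conf" <<'EOF'
a1.channels = c1 c2

# Memory channel: fast, but buffered events are lost if the agent stops
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# File channel: slower, but events are persisted to local disk
a1.channels.c2.type = file
a1.channels.c2.checkpointDir = /tmp/flume/checkpoint
a1.channels.c2.dataDirs = /tmp/flume/data
EOF

# Show which channel types the file declares
grep '\.type' "$workdir/channel-sketch.conf"
```

Both examples in these notes use the memory channel, which is enough for console experiments.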

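1. Example 1

1. Requirement
Collect data from a specified network port and sink it to the console.
Agent component selection (each of Flume's three core components, Source, Channel and Sink, comes in several implementations), chosen from the list at http://flume.apache.org/releases/content/1.9.0/FlumeUserGuide.html
Source: NetCat (nc)
Channel: Memory (the memory-based channel)
Sink: Logger

Log collection is done by a Flume Agent, and the most important part (*****) is writing the Flume configuration file.

2. Writing the configuration for a Flume Agent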

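cd app/flume/conf
vi flume-first.sh    # any file name works, although a .conf extension is more conventional

# Name the components on this agent
# 1 ==> a1 is the name of the Agent
# 2 ==> give the Source/Channel/Sink their names

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# 3 ==> define the three components
# Comments must sit on their own lines: in a properties file, "#" after a value becomes part of the value.
# type: fixed value for the NetCat source
a1.sources.r1.type = netcat
# bind: hostname or IP address to listen on
a1.sources.r1.bind = localhost
# port: port number to listen on
a1.sources.r1.port = 44444

# Describe the sink
# type: fixed value for the Logger sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
# 4 ==> wire the Source/Channel/Sink together
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1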
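
One thing worth checking in a file like this: Flume loads the configuration as a Java properties file, where "#" starts a comment only at the beginning of a line, so any text after a value is kept as part of that value. The helper below is a rough sketch for spotting such lines; the function name find_inline_comments and the sample file are invented for illustration.

```shell
# List "key = value   # note" lines; full-line comments are not reported.
# Heuristic only: it also flags values that legitimately contain " #".
find_inline_comments() {
    grep -n '^[[:space:]]*[^#[:space:]][^=]*=.*[[:space:]]#' "$1"
}

# Sample file for the demonstration (hypothetical content)
sample="${TMPDIR:-/tmp}/inline-comment-sample.conf"
cat > "$sample" <<'EOF'
# a full-line comment is fine
a1.sources.r1.type = netcat    #fixed value
a1.sources.r1.port = 44444
EOF

find_inline_comments "$sample"
```

On the sample file only line 2 is reported. An empty result on a real configuration means no value is followed by a trailing comment.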

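3. Start the agent
cd app/flume

# Template:
#   --name       name of the Agent (a1 in the configuration file)
#   --conf       path of the conf directory
#   --conf-file  path of the configuration file
# Do not put comments after the trailing backslashes; they break the line continuation.
./bin/flume-ng agent \
--name a1 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/flume-first.sh \
-Dflume.root.logger=INFO,console

For example:
./bin/flume-ng agent \
--name a1 \
--conf /home/hadoop/app/flume/conf \
--conf-file /home/hadoop/app/flume/conf/flume-first.sh \
-Dflume.root.logger=INFO,console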
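
The launch command can be wrapped in a small helper that refuses to continue when the configuration file is missing. This is only a sketch under a few assumptions: the function name build_flume_cmd is invented here, FLUME_HOME is assumed to point at the Flume install directory (falling back to the /home/hadoop/app/flume path used above), and the function merely prints the command so it can be reviewed before it is run.

```shell
# Print the flume-ng command for a given agent name and config file.
# Usage: build_flume_cmd <agent-name> <config-file>
build_flume_cmd() {
    agent_name="$1"
    conf_file="$2"
    flume_home="${FLUME_HOME:-/home/hadoop/app/flume}"

    if [ ! -f "$conf_file" ]; then
        echo "config file not found: $conf_file" >&2
        return 1
    fi

    echo "$flume_home/bin/flume-ng agent" \
        "--name $agent_name" \
        "--conf $flume_home/conf" \
        "--conf-file $conf_file" \
        "-Dflume.root.logger=INFO,console"
}

# Example: print the command first, then run it once it looks right
# build_flume_cmd a1 /home/hadoop/app/flume/conf/flume-first.sh
```

Printing instead of executing keeps the helper harmless on a machine without Flume installed.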

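4. Check

telnet localhost 44444
# Check: type some text in this window; the data shows up in the window where the agent was started in step 3.
# The port is 44444 as configured, and the host is localhost because the source binds to localhost (use data001 only if a1.sources.r1.bind is set to that hostname).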
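
For a non-interactive check, a line can also be piped to the port with nc instead of typing into telnet. The sketch below assumes nc is installed and that the agent from step 3 is listening; the helper names valid_port and send_line are made up for this example. The port is validated first, since a TCP port must lie between 1 and 65535.

```shell
# Return success only if the argument is a valid TCP port number (1-65535).
valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Send one line of text to host:port using nc.
# Usage: send_line <host> <port> <text>
send_line() {
    if ! valid_port "$2"; then
        echo "invalid port: $2" >&2
        return 1
    fi
    printf '%s\n' "$3" | nc "$1" "$2"
}

# Example (requires the agent to be running):
# send_line localhost 44444 "hello flume"
```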

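2. Example 2

1. Requirement
Collect newly appended data from a specified file and sink it to the console.
Agent component selection (each of Flume's three core components, Source, Channel and Sink, comes in several implementations), chosen from the list at http://flume.apache.org/releases/content/1.9.0/FlumeUserGuide.html
Source: exec
Channel: Memory (the memory-based channel)
Sink: Logger

2. Writing the configuration for a Flume Agent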

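cd app/flume/conf
vi flume-exec-first.conf    # any file name works

# Name the components on this agent
# 1 ==> a1 is the name of the Agent
# 2 ==> give the Source/Channel/Sink their names

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# 3 ==> define the three components
# Comments must sit on their own lines: in a properties file, "#" after a value becomes part of the value.
# type: fixed value for the exec source
a1.sources.r1.type = exec
# command: tail -F followed by the path of the file to follow
a1.sources.r1.command = tail -F /home/hadoop/tmp/data.log
a1.sources.r1.shell = /bin/sh -c

# Describe the sink
# type: fixed value for the Logger sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
# 4 ==> wire the Source/Channel/Sink together
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1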

=========
The same configuration without the annotations:
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/tmp/data.log
a1.sources.r1.shell = /bin/sh -c

a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

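3. Start the agent

# Template:
#   --name       name of the Agent (a1 in the configuration file)
#   --conf       path of the conf directory
#   --conf-file  path of the configuration file
# Do not put comments after the trailing backslashes; they break the line continuation.
./bin/flume-ng agent \
--name a1 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/flume-exec-first.conf \
-Dflume.root.logger=INFO,console

For example:
./bin/flume-ng agent \
--name a1 \
--conf /home/hadoop/app/flume/conf \
--conf-file /home/hadoop/app/flume/conf/flume-exec-first.conf \
-Dflume.root.logger=INFO,console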

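4. Check

cd tmp
echo aaa >> data.log     # append to data.log; the line appears in the agent window, which confirms the setup works

# Every line appended to data.log shows up on the console in the window from step 3, which shows that Flume has picked up the new data. It is the same idea as canal, which I studied on my own, tracking newly inserted data in MySQL.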
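
To watch several events arrive in a row, lines can be appended in a loop. A small sketch, with the helper name append_lines invented here; in practice it would be pointed at the file named in a1.sources.r1.command rather than at the scratch file used below.

```shell
# Append <count> numbered lines to <file>, one at a time.
# Usage: append_lines <file> <count>
append_lines() {
    i=1
    while [ "$i" -le "$2" ]; do
        echo "test line $i" >> "$1"
        i=$((i + 1))
    done
}

# Demonstration on a scratch file (use the tailed data.log in practice)
demo_log="${TMPDIR:-/tmp}/append-lines-demo.log"
: > "$demo_log"
append_lines "$demo_log" 3
cat "$demo_log"
```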

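Note: Flume offers many implementations of Source, Channel and Sink, all listed on the official site. The main work is analyzing the requirement in order to pick suitable components. I have not studied this in depth yet and only understand the basic usage.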