1. Flume configuration explained

Latest recommended article published 2023-09-26 16:19:32

qq_35561207

Category: python (deep learning and machine learning)

Article link: https://blog.csdn.net/qq_35561207/article/details/89307597

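This article walks through Apache Flume configuration in detail, covering source, channel, and sink settings as well as how to configure data flow and failover strategies, to help readers understand how Flume is used for log collection and transport.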

Summary generated automatically by CSDN

# Define the names of the three components
agent1.sources = source1
agent1.sinks = sink1
agent1.channels = channel1

# Configure the source
# Spooling Directory Source: chosen because it keeps track of what it has already read, so it can resume after the Flume service goes down
agent1.sources.source1.type = spooldir
agent1.sources.source1.spoolDir = /home/hadoop/test/
agent1.sources.source1.ignorePattern = ^error.*\.log$
agent1.sources.source1.fileHeader = false
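# Illustration (not from the original post): ignorePattern is a regular expression
# checked against the names of files in spoolDir, and matching files are skipped.
# With the pattern above, and assuming these hypothetical file names:
#   error_2019.log   -> ignored  (starts with "error" and ends with ".log")
#   access_2019.log  -> ingested (does not start with "error")
#   error.log.bak    -> ingested (does not end with ".log")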

# Configure the interceptor
agent1.sources.source1.interceptors = i1
agent1.sources.source1.interceptors.i1.type = timestamp
# Configure the sink
# Data is written to HDFS
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://master:9000/weblog/%y-%m-%d/
# Files written to HDFS are prefixed with access_log
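# Note (illustrative, not from the original post): the %y-%m-%d escape sequences in
# hdfs.path are resolved from each event's "timestamp" header, which the timestamp
# interceptor i1 configured above adds. %y is the two-digit year, so an event stamped
# 2019-04-15 would be written under /weblog/19-04-15/.
# A possible alternative, if no timestamp interceptor were used, is to let the sink
# take the local time instead (shown commented out, as a sketch only):
# agent1.sinks.sink1.hdfs.useLocalTimeStamp = true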
agent1.sinks.sink1.hdfs.filePrefix = access_log
agent1.sinks.sink1.hdfs.maxOpenFiles = 5000
# Number of events written per batch before flushing to HDFS: 100
agent1.sinks.sink1.hdfs.batchSize = 100
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.hdfs.writeFormat = Text
# Roll over to a new file based on file size
agent1.sinks.sink1.hdfs.rollSize =