Flume Real-Time Data Monitoring: 3 Sources (Directory Monitoring) - 3 Sinks

Monitor the data in the log files under the /data_log directory and store it in Kafka topics.

Requirement: three topics named ChangeRecord, ProduceRecord, and EnvironmentData, each with 4 partitions.
Analysis: the three kinds of log files must be stored separately, so we need three different Source-Sink pairs, i.e. one Flume agent configuration per topic.

1. Create the Kafka topics

kafka-topics.sh creates one topic per invocation (and on Kafka 2.2+ it takes --bootstrap-server rather than --broker-list; older versions use --zookeeper master:2181 instead), so the three topics are created separately:

bin/kafka-topics.sh --create --bootstrap-server master:9092 --topic ChangeRecord --partitions 4 --replication-factor 2
bin/kafka-topics.sh --create --bootstrap-server master:9092 --topic ProduceRecord --partitions 4 --replication-factor 2
bin/kafka-topics.sh --create --bootstrap-server master:9092 --topic EnvironmentData --partitions 4 --replication-factor 2
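
To confirm that each topic exists with 4 partitions, the topics can be described afterwards (same broker address as above):

bin/kafka-topics.sh --describe --bootstrap-server master:9092 --topic ChangeRecord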

2. Flume configuration files:
produceRecord.conf:
# Name the sources, sinks, and channels
a1.sources = s1
a1.sinks = k1
a1.channels = c1

# producerecord source configuration
a1.sources.s1.type = spooldir
a1.sources.s1.spoolDir = /data_log
# Only pick up files whose names match this regex
a1.sources.s1.includePattern = ^producerecord.*$
a1.sources.s1.fileHeader = true

# Interceptor: drop the header lines
a1.sources.s1.interceptors = i1
a1.sources.s1.interceptors.i1.type = regex_filter
a1.sources.s1.interceptors.i1.regex = \s*Produce.*
a1.sources.s1.interceptors.i1.excludeEvents = true

# Kafka sink: write the events to the ProduceRecord topic
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = ProduceRecord
a1.sinks.k1.kafka.bootstrap.servers = master:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Bind the source and sink to the channel
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1
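
A quick way to sanity-check this agent once it is running, sketched under the assumption that the broker listens at master:9092: drop a file whose name matches the include pattern into the spool directory, then watch the topic with a console consumer.

# Create a file matching ^producerecord.*$ (hypothetical sample data)
echo "sample,produce,record,line" > /data_log/producerecord_test.log
# Consume from the target topic to verify the event arrived
bin/kafka-console-consumer.sh --bootstrap-server master:9092 --topic ProduceRecord --from-beginning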
changeRecord.conf:
# Name the sources, sinks, and channels
a1.sources = s1
a1.sinks = k1
a1.channels = c1

# changerecord source configuration
a1.sources.s1.type = spooldir
a1.sources.s1.spoolDir = /data_log
# Only pick up files whose names match this regex
a1.sources.s1.includePattern = ^changerecord.*$
a1.sources.s1.fileHeader = true

# Interceptor: drop the header lines
a1.sources.s1.interceptors = i1
a1.sources.s1.interceptors.i1.type = regex_filter
a1.sources.s1.interceptors.i1.regex = \s*Change.*
a1.sources.s1.interceptors.i1.excludeEvents = true

# Kafka sink: write the events to the ChangeRecord topic
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = ChangeRecord
a1.sinks.k1.kafka.bootstrap.servers = master:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Bind the source and sink to the channel
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1
environmentData.conf:
# Name the sources, sinks, and channels
a1.sources = s1
a1.sinks = k1
a1.channels = c1

# environmentdata source configuration
a1.sources.s1.type = spooldir
a1.sources.s1.spoolDir = /data_log
# Only pick up files whose names match this regex
a1.sources.s1.includePattern = ^environment.*$
a1.sources.s1.fileHeader = true

# Interceptor: drop the header lines
a1.sources.s1.interceptors = i1
a1.sources.s1.interceptors.i1.type = regex_filter
a1.sources.s1.interceptors.i1.regex = \s*PM25.*
a1.sources.s1.interceptors.i1.excludeEvents = true

# Kafka sink: write the events to the EnvironmentData topic
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = EnvironmentData
a1.sinks.k1.kafka.bootstrap.servers = master:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Bind the source and sink to the channel
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1
3. Start Flume with the configuration files:

bin/flume-ng agent -n a1 -c conf -f <config-file>
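
Each configuration runs as its own agent process. Assuming the three files above are saved as conf/produceRecord.conf, conf/changeRecord.conf, and conf/environmentData.conf (hypothetical paths), the startup looks like:

bin/flume-ng agent -n a1 -c conf -f conf/produceRecord.conf -Dflume.root.logger=INFO,console &
bin/flume-ng agent -n a1 -c conf -f conf/changeRecord.conf -Dflume.root.logger=INFO,console &
bin/flume-ng agent -n a1 -c conf -f conf/environmentData.conf -Dflume.root.logger=INFO,console &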

Backing Up the Data to HDFS

Requirement: back up the monitored data to /user/test/flumebackup in HDFS.

# Name the sources, sinks, and channels
a1.sources = s1
a1.sinks = k1
a1.channels = c1

# Spooling-directory source watching /data_log
a1.sources.s1.type = spooldir
a1.sources.s1.spoolDir = /data_log
a1.sources.s1.fileHeader = true
# Put the file's base name into a header so the sink can reference %{filename}
a1.sources.s1.basenameHeader = true
a1.sources.s1.basenameHeaderKey = filename

# HDFS sink: write plain-text backups under /user/test/flumebackup
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://master:8020/user/test/flumebackup/
a1.sinks.k1.hdfs.filePrefix = %{filename}.bak
a1.sinks.k1.hdfs.fileSuffix = .log
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.batchSize = 100
# Round timestamps down to the hour
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = hour
# Roll files every 20 s or ~13 MB, never by event count
a1.sinks.k1.hdfs.rollInterval = 20
a1.sinks.k1.hdfs.rollSize = 13421700
a1.sinks.k1.hdfs.rollCount = 0

# Memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Bind the source and sink to the channel
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1
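
After the agent has ingested a file, the backup can be checked from HDFS (the directory comes from the requirement above; the exact file name depends on the roll settings):

hdfs dfs -ls /user/test/flumebackup
hdfs dfs -cat /user/test/flumebackup/<backup-file>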