Flume reads a log file, writes it to a Kafka broker, and the console consumer prints the result



I have been studying big data recently, working through Flume and Kafka, and put together a small experiment to share.

Experiment environment: a single virtual machine, hostname: master

Software installation steps are not repeated here. Downloads: apache-flume-1.7.0-bin.tar.gz and kafka_2.10-0.10.1.1.tar.

Extract and install.

1. Log generation script:

[root@master software]# more createlog.sh
#!/bin/bash
# Append a numbered test line to the log file every 0.3 seconds, forever.
i=1
while [ "1" = "1" ]
do
    echo "Flume get log to hdfs bigdata02"$i >> /application/software/test.log
    sleep 0.3
    i=`expr $i + 1`
done
[root@master software]#

[root@master software]# touch /application/software/test.log
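If the script was just created, remember to make it executable before step 4 (a small housekeeping step, assuming the paths above):

[root@master software]# chmod +x createlog.sh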

2. Configure the Flume agent (in Flume's conf directory):

[root@master conf]# more agentkafka.properties
a11.sources = r11
a11.channels = c11
a11.sinks = k11
a11.sources.r11.type = exec
a11.sources.r11.command = tail -F /application/software/test.log
a11.sources.r11.channels = c11
#kafka config
a11.sinks.k11.channel = c11
a11.sinks.k11.type = org.apache.flume.sink.kafka.KafkaSink
a11.sinks.k11.brokerList = master:9092
a11.sinks.k11.topic = flume-data
a11.sinks.k11.batchSize = 20
a11.sinks.k11.requiredAcks = 1
a11.channels.c11.type = memory
a11.channels.c11.capacity = 10000
a11.channels.c11.transactionCapacity = 100
[root@master conf]#
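A side note: brokerList, topic, batchSize and requiredAcks are the older KafkaSink property names; Flume 1.7 still accepts them but documents the kafka.*-style names instead. An equivalent sink section using the newer names might look like this (same broker and topic as above):

a11.sinks.k11.type = org.apache.flume.sink.kafka.KafkaSink
a11.sinks.k11.kafka.bootstrap.servers = master:9092
a11.sinks.k11.kafka.topic = flume-data
a11.sinks.k11.flumeBatchSize = 20
a11.sinks.k11.kafka.producer.acks = 1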

Start the Flume agent (the bin/ and conf/ paths in the command are relative to the Flume installation directory):

[root@master conf]# bin/flume-ng agent --conf conf  --conf-file conf/agentkafka.properties --name a11 -Dflume.root.logger=DEBUG,console
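This keeps the agent in the foreground with DEBUG output on the console. If you would rather leave it running while you switch to the Kafka steps, one option (just a sketch, not required for the experiment) is to push it to the background and log to a file:

nohup bin/flume-ng agent --conf conf --conf-file conf/agentkafka.properties --name a11 > flume-agent.log 2>&1 &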

3. Start Kafka

Start ZooKeeper:

bin/zookeeper-server-start.sh  config/zookeeper.properties &

Start the Kafka broker:

bin/kafka-server-start.sh -daemon config/server.properties

Create a topic named flume-data with 5 partitions, a replication factor of 1, and delete.retention.ms set to 2 days (the default is 1 day):

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 5 --topic flume-data --config delete.retention.ms=172800000
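Optionally, verify that the topic exists with the expected 5 partitions (a quick check against the same ZooKeeper address):

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic flume-data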

Start a console consumer:

[root@master config]# bin/kafka-console-consumer.sh --bootstrap-server master:9092 --topic flume-data --from-beginning

4. Run the log generation script (in another terminal):

[root@master software]# ./createlog.sh
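To double-check that the script really is appending lines, you can watch the file from yet another terminal (same path as in the Flume source config):

[root@master software]# tail -f /application/software/test.log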

5. View the results (in the consumer terminal started in step 3):

Flume get log to hdfs bigdata02440
Flume get log to hdfs bigdata02441
Flume get log to hdfs bigdata02442
Flume get log to hdfs bigdata02443
Flume get log to hdfs bigdata02444
Flume get log to hdfs bigdata02445
Flume get log to hdfs bigdata02446
Flume get log to hdfs bigdata02447
Flume get log to hdfs bigdata02448
Flume get log to hdfs bigdata02449
Flume get log to hdfs bigdata02450
Flume get log to hdfs bigdata02451
Flume get log to hdfs bigdata02452
Flume get log to hdfs bigdata02453
Flume get log to hdfs bigdata02454
Flume get log to hdfs bigdata02455
Flume get log to hdfs bigdata02456
Flume get log to hdfs bigdata02457
Flume get log to hdfs bigdata02458
Flume get log to hdfs bigdata02459
Flume get log to hdfs bigdata02460
Flume get log to hdfs bigdata02461
Flume get log to hdfs bigdata02462
Flume get log to hdfs bigdata02463
Flume get log to hdfs bigdata02464
Flume get log to hdfs bigdata02465
Flume get log to hdfs bigdata02466
Flume get log to hdfs bigdata02467
Flume get log to hdfs bigdata02468
Flume get log to hdfs bigdata02469
Flume get log to hdfs bigdata02470
Flume get log to hdfs bigdata02471
Flume get log to hdfs bigdata02472
Flume get log to hdfs bigdata02473
Flume get log to hdfs bigdata02474
Flume get log to hdfs bigdata02475
Flume get log to hdfs bigdata02476
Flume get log to hdfs bigdata02477
Flume get log to hdfs bigdata02478
Flume get log to hdfs bigdata02479
Flume get log to hdfs bigdata02480
Flume get log to hdfs bigdata02481
Flume get log to hdfs bigdata02482
Flume get log to hdfs bigdata02483
Flume get log to hdfs bigdata02484
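If you also want a rough count of how many messages landed in the topic, Kafka ships a small offset tool; a sketch of using it against the broker from earlier:

bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list master:9092 --topic flume-data --time -1

Each output line is topic:partition:latest-offset, so summing the offsets over the 5 partitions gives the total number of messages written.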
