flume架构和启动

flume1.9概述

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.

webserver(源端)==》flume==》HDFS(目的)

设计目标:

     可靠性

     扩展性

     管理性

flume架构

  flume安装前置条件

  • Java Runtime Environment - Java 1.8 or later
  • Memory - Sufficient memory for configurations used by sources, channels or sinks
  • Disk Space - Sufficient disk space for configurations used by channels or sinks
  • Directory Permissions - Read/Write permissions for directories used by agen

flume使用案例:

  

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop000
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

   1.配置Source

   2.配置Channel

   3.配置Sink

   4.把以上三个组件串起来

启动flume

$ bin/flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/example.conf --name a1 -Dflume.root.logger=INFO,console

不同source的采集

    1、exec source +memory  channel +logger sink

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/calllog/data.log
a1.sources.r1.shell=/bin/sh -c

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory


# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

 

 

 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Kafka和Flume是两种常用的数据传输工具。它们有一些共同点和区别。 共同点是它们都可以用于数据采集和传输。它们都支持多个生产者的场景,可以从多个数据源获取数据。同时,它们都可以提供高吞吐量的数据传输能力。 Flume追求的是数据和数据源、数据流向的多样性。它有自己内置的多种source和sink组件,可以通过编写配置文件来定义数据的来源和目的地。Flume的配置文件中包含source、channel和sink的信息,通过启动Flume组件时关联配置文件来实现数据传输。 Kafka追求的是高吞吐量和高负载。它支持在同一个topic下拥有多个分区,适合多个消费者的场景。不同于Flume,Kafka没有内置的producer和consumer组件,需要用户自己编写代码来进行数据的发送和接收。 总的来说,Flume更适合于多个生产者的场景,而Kafka更适合于高吞吐量和高负载的场景,并且需要用户自己编写代码来操作数据的发送和接收。<span class="em">1</span><span class="em">2</span><span class="em">3</span> #### 引用[.reference_title] - *1* *3* [Flume和Kafka的区别与联系](https://blog.csdn.net/wx1528159409/article/details/88257693)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v92^chatsearchT3_1"}}] [.reference_item style="max-width: 50%"] - *2* [大数据之Kafka(三):Kafka 与 Flume的整合及架构之道](https://blog.csdn.net/weixin_44291548/article/details/119839752)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v92^chatsearchT3_1"}}] [.reference_item style="max-width: 50%"] [ .reference_list ]

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值