Flume Notes (1): How Flume Works and How to Collect Data from Sources

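Flume features:

	A distributed, reliable, highly available system for collecting, aggregating, and transporting large volumes of log data
	Sits between data producers and data consumers and coordinates the flow between them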

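How Flume works:

	An event runs through the entire Flume data flow. The event is Flume's basic unit of data: it carries the log data as its body, together with header information.
	Events are generated by a data source outside the agent. When the agent's source captures an event, it applies source-specific formatting and then pushes the event into a channel.
	The channel holds the event until a sink has finished processing it. The sink either persists the event (for example by writing it to HDFS or HBase) or pushes it on to another source.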

Flume configuration:

Collecting data with netcat (NetCat Source):
  1. Create the Flume configuration file /soft/flume/conf/xxx.conf:
    # example.conf: A single-node Flume configuration

     # Name the components on this agent
     a1.sources = r1
     a1.sinks = k1
     a1.channels = c1
    
     # Describe/configure the source
     a1.sources.r1.type = netcat
     a1.sources.r1.bind = localhost
     a1.sources.r1.port = 44444
    
     # Describe the sink
     a1.sinks.k1.type = logger
    
     # Use a channel which buffers events in memory
     a1.channels.c1.type = memory
     a1.channels.c1.capacity = 1000
     a1.channels.c1.transactionCapacity = 100
    
     # Bind the source and sink to the channel
     a1.sources.r1.channels = c1
     a1.sinks.k1.channel = c1
    
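  2. Start Flume:
    flume $> bin/flume-ng agent --conf conf --conf-file conf/xxx.conf --name a1 -Dflume.root.logger=INFO,console

  3. Connect a client to Flume (the IP and port are the ones set in the configuration file):
    $> nc localhost 44444

  4. Test the connection:
    In the client, type a line inside the nc session:
    hello world

     Flume receives the client's data and the logger sink prints it to the console:
     	(SinkRunner-PollingRunner-DefaultSinkProcessor) Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64 hello world }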
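
The logger sink prints the event body as hex bytes followed by the readable text. To check that the two agree, the bytes can be decoded by hand. This is a minimal sketch: `decode_body` is a hypothetical helper, not part of Flume, and the byte string used is the encoding of `hello world`.

```shell
#!/bin/sh
# decode_body (hypothetical helper): turn the space-separated hex bytes
# shown in a logger sink line back into text.
decode_body() {
    for byte in $1; do
        # Convert the hex byte to a 3-digit octal escape and let printf emit it.
        printf "\\$(printf '%03o' "0x$byte")"
    done
    printf '\n'
}

# Bytes for the line typed in the nc session.
decode_body "68 65 6C 6C 6F 20 77 6F 72 6C 64"
```

If the decoded text does not match what was typed, the bytes and the text were probably copied from different events.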
    
Real-time collection (Exec Source):
  1. Configuration (conf/execSource.conf)
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

     a1.sources.r1.type = exec
     a1.sources.r1.command = tail -F /home/ubuntu/data/flume/execSource.txt
     
     a1.sinks.k1.type = logger
     a1.channels.c1.type = memory
     
     a1.sources.r1.channels = c1
     a1.sinks.k1.channel = c1
    
  2. Start Flume
    flume $> bin/flume-ng agent --conf conf --conf-file conf/execSource.conf --name a1 -Dflume.root.logger=INFO,console

  3. Append to execSource.txt, the file that the source is tailing
    $> echo hello world >> /home/ubuntu/data/flume/execSource.txt
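
To watch several events arrive one after another, a small loop can append numbered test lines to the tailed file. This is a minimal sketch: `append_lines` is a hypothetical helper, and the path in the usage comment is the one configured for the exec source in this section.

```shell
#!/bin/sh
# append_lines (hypothetical helper): append COUNT numbered lines to FILE,
# one write per line, so a running `tail -F` picks each of them up.
append_lines() {
    file=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'test line %d\n' "$i" >> "$file"
        i=$((i + 1))
    done
}

# Usage (path taken from the exec source configuration):
#   append_lines /home/ubuntu/data/flume/execSource.txt 5
```

Each appended line should show up as a separate event in the logger sink output.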

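Batch collection (Spooling Directory Source):
  1. Configuration file
    The spooldir source watches the spoolDir directory for files moved into it. When a file is moved in, the source processes it and, once it is finished, renames or deletes the file.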
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

     a1.sources.r1.type = spooldir
     a1.sources.r1.spoolDir = /home/ubuntu/data/flume/flumeSpool
     a1.sources.r1.fileHeader = true
     
     a1.sinks.k1.type = logger
     a1.channels.c1.type = memory
     
     a1.sources.r1.channels = c1
     a1.sinks.k1.channel = c1
    
  2. Create the directory
    $> mkdir -p /home/ubuntu/data/flume/flumeSpool

  3. Start Flume
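
The spooling directory source expects a file to be complete and unchanged once it appears in spoolDir, and it expects file names not to be reused. One common way to meet both conditions is to write the file elsewhere and then move it in under a unique name. This is a minimal sketch: `spool_file` is a hypothetical helper, and the staging directory is an assumption that should sit on the same filesystem as spoolDir so that `mv` is a single rename.

```shell
#!/bin/sh
# spool_file (hypothetical helper): copy SRC into STAGING_DIR first, then
# move it into SPOOL_DIR in one step, so Flume never sees a half-written file.
spool_file() {
    src=$1
    spool_dir=$2
    staging_dir=$3   # assumption: same filesystem as spool_dir
    # Timestamp plus PID makes a name collision unlikely (not impossible).
    name="$(basename "$src").$(date +%s).$$"
    cp "$src" "$staging_dir/$name" && mv "$staging_dir/$name" "$spool_dir/$name"
}

# Usage (spool path taken from the configuration; the staging path is hypothetical):
#   spool_file app.log /home/ubuntu/data/flume/flumeSpool /home/ubuntu/data/flume/staging
```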

Sequence source test (Sequence Generator Source):
  1. Write the conf file
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    a1.sources.r1.type = seq
    a1.sinks.k1.type = logger
    a1.channels.c1.type = memory

    # The seq source needs no bind/port settings
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

  2. Start Flume


Reference: the official Flume documentation
