Flume (distributed log collection framework)
|
|-Main purpose: move data from all kinds of servers into distributed storage components such as Kafka/HDFS/Hive;
|
|-Component diagram
| \ _______________________________________________________
| |                                                       |
| |                 ______Agent________________           |
| |                |                           |          |
| | Web Server ---> Source --> Channel --> Sink ---> HDFS |
| |                |___________________________|          |
| |_______________________________________________________|
|
|-Components|--1. Source: collects
|           |--2. Channel: aggregates (buffers events)
|           |--3. Sink: outputs
|
|-Flume config|--file name: flume-env.sh
|             |--add: export JAVA_HOME=~/jdk1.7 (or later; no spaces around "=")
|
|-Writing a flume instance config
| |
| |--Key points|---1. Configure the Source
| |            |---2. Configure the Channel
| |            |---3. Configure the Sink
| |            |---4. Wire the three components together
| |
| |--Config steps|---1. Name the Agent (instance), e.g. a1
| |              |
| |              |---2. Name the components, e.g.|---a1.sources = r1
| |              |                               |---a1.sinks = k1
| |              |                               |---a1.channels = c1
| |              |
| |              |---3. Configure the Source, e.g.|---a1.sources.r1.type = netcat
| |              |                                |---a1.sources.r1.bind = hadoop001
| |              |                                |---a1.sources.r1.port = 44444
| |              |
| |              |---4. Configure the Channel, e.g.|---a1.channels.c1.type = memory
| |              |                                 |---a1.channels.c1.capacity = 1000
| |              |                                 |---a1.channels.c1.transactionCapacity = 100
| |              |
| |              |---5. Configure the Sink, e.g.: a1.sinks.k1.type = logger
| |              |
| |              |---6. Bind the Channel to the Source and Sink|---a1.sources.r1.channels = c1
| |                                                            |---a1.sinks.k1.channel = c1
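Putting steps 1-6 together, the complete a1.conf is (hadoop001 and 44444 are the example values from above):

```properties
# Name the components of agent a1
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Source: netcat, listening on hadoop001:44444
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop001
a1.sources.r1.port = 44444

# Channel: in-memory buffer of up to 1000 events,
# moved in transactions of up to 100 events
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: log each event to the agent's log/console
a1.sinks.k1.type = logger

# Wire them together (a source can feed several channels, hence "channels";
# a sink drains exactly one channel, hence "channel")
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```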
|
|-Start the flume agent|--~/bin> flume-ng agent \
|  (flume instance)    |         --name a1 \
|                      |         --conf $FLUME_HOME/conf \
|                      |         --conf-file $FLUME_HOME/conf/a1.conf \
|                      |         -Dflume.root.logger=INFO,console
|
|-Test: log in to hadoop001, run telnet (telnet hadoop001 44444), type some lines, and watch the flume agent's console output
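Any TCP client works in place of telnet for this test. A minimal Python sketch (host and port are the example values from a1.conf; with its default settings Flume's netcat source acknowledges each accepted event with an "OK" line):

```python
import socket

def send_lines(host, port, lines):
    """Send newline-terminated events to a Flume netcat source.

    Returns the source's per-event acknowledgements ("OK" by default).
    """
    replies = []
    with socket.create_connection((host, port), timeout=5) as conn:
        f = conn.makefile("rwb")
        for line in lines:
            f.write(line.encode("utf-8") + b"\n")
            f.flush()
            replies.append(f.readline().decode("utf-8").strip())
    return replies

# Example, assuming the a1 agent above is running:
# send_lines("hadoop001", 44444, ["hello flume"])
```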
Fan-in / fan-out architecture diagram
Official user guide
http://flume.apache.org/releases/content/1.9.0/FlumeUserGuide.html
User Guide
|
|-Introduction
|
|-Setup
|
|-Config|--Flume Sources|---Avro Source
| | |---Thrift Source
| | |---Exec Source
| | |---JMS Source
| | |---Kafka Source
| | |---...
| |
| |--Flume Channels|---Memory Channel
| | |---JDBC Channel
| | |---Kafka Channel
| |                |---File Channel
| |
| |--Flume Sinks|---HDFS Sink
| | |---Hive Sink
| | |---Logger Sink
| | |---Avro Sink
| | |---ElasticSearch Sink
| | |---...
| |
| |--...
|
|-...
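As a sketch of swapping in other components from the tables above, an agent that tails a log file (Exec Source) into HDFS (HDFS Sink). The command, HDFS URL, and directory layout are placeholder assumptions, not values from these notes:

```properties
a2.sources = r2
a2.sinks = k2
a2.channels = c2

# Exec Source: run a command and turn each output line into an event
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /var/log/app.log

# File Channel: buffered on disk, survives an agent restart
a2.channels.c2.type = file

# HDFS Sink: write events as plain text, bucketed by day
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://hadoop001:8020/flume/events/%Y%m%d
a2.sinks.k2.hdfs.fileType = DataStream
# Needed so %Y%m%d resolves without a timestamp interceptor
a2.sinks.k2.hdfs.useLocalTimeStamp = true

a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
```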