Flume log collection system

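Flume is a highly available, highly reliable, distributed system for collecting, aggregating, and transporting large volumes of log data. It supports customizable data senders inside a logging system for collecting data, and it can apply simple processing to that data and write it to a variety of (customizable) data receivers.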

The agent is Flume's smallest independently running unit. Each agent is a single JVM, so a JDK must be installed before Flume is installed.

I: Install Flume:

wget  http://mirrors.cnnic.cn/apache/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz

II: Extract the archive:

tar -xvzf apache-flume*.tar.gz

III: Edit the configuration files:

1: Edit flume-env.sh in the conf folder to set JAVA_HOME. If JAVA_HOME is not set, Flume may report an error at startup.

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64

JRE_HOME=$JAVA_HOME/jre
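
If you are not sure which path to use for JAVA_HOME, one option is to derive it from the location of the java binary. The following is only a minimal sketch: the function name java_home_from_bin and the example path are made up for illustration, and it assumes the binary lives under either <home>/bin/java or <home>/jre/bin/java.

```shell
#!/bin/sh
# Derive a JAVA_HOME candidate from the resolved path of a java binary.
# Pure string manipulation; the function name is hypothetical.
java_home_from_bin() {
    home="${1%/bin/java}"   # strip the trailing /bin/java
    home="${home%/jre}"     # strip a trailing /jre if present (JDK 8 layout)
    printf '%s\n' "$home"
}

# Example with a made-up path:
java_home_from_bin /usr/lib/jvm/java-1.8.0-openjdk/jre/bin/java
# prints /usr/lib/jvm/java-1.8.0-openjdk
```

On a real machine the argument would typically be the output of readlink -f "$(command -v java)", and the printed value is what goes after export JAVA_HOME= in flume-env.sh.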

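2: Create flume.conf in the conf folder:

This is the Flume configuration on the machine that stores the logs. It receives data sent by the Flume agents on other machines. The channel selector looks at the value of an event header to decide which channel each event goes into, and a separate sink per channel then writes each channel's data to its own files.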

a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# source configuration
a1.sources.r1.type = avro
a1.sources.r1.bind = 192.168.1.80
a1.sources.r1.port = 4444
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type=multiplexing
a1.sources.r1.selector.header=sn
a1.sources.r1.selector.mapping.game1=c1
a1.sources.r1.selector.mapping.game2=c2

# channel configuration
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
a1.channels.c1.byteCapacity = 800000
a1.channels.c1.keep-alive=3
a1.channels.c2.type = memory
a1.channels.c2.capacity = 10000
a1.channels.c2.transactionCapacity = 10000
a1.channels.c2.byteCapacity = 800000
a1.channels.c2.keep-alive=3


# sink configuration
a1.sinks.k1.type = com.fyx.flume.FileSink
a1.sinks.k1.file.path = /home/ayue/flume/flume-test/game1/
a1.sinks.k1.channel = c1
a1.sinks.k1.file.filePrefix = log-
a1.sinks.k1.file.txnEventMax = 100
a1.sinks.k1.file.maxOpenFiles = 5

a1.sinks.k2.type = com.fyx.flume.FileSink
a1.sinks.k2.file.path = /home/ayue/flume/flume-test/game2/
a1.sinks.k2.channel = c2
a1.sinks.k2.file.filePrefix = log-
a1.sinks.k2.file.txnEventMax = 100
a1.sinks.k2.file.maxOpenFiles = 5

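This is the Flume configuration on the machine where the logs are collected. After reading the data, a source interceptor adds a header key and value to every event, so that the channel selector downstream can route on it.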

a2.sources = r2
a2.sinks = k2
a2.channels = c2

# source configuration
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /home/ayue/flume/flume-source/serviceTest.log
a2.sources.r2.channels = c2
a2.sources.r2.interceptors = static
a2.sources.r2.interceptors.static.type=static
a2.sources.r2.interceptors.static.key=sn
a2.sources.r2.interceptors.static.value =game1
a2.sources.r2.interceptors.static.preserveExisting=false

# channel configuration
a2.channels.c2.type = memory
a2.channels.c2.capacity = 10000
a2.channels.c2.transactionCapacity = 10000
a2.channels.c2.byteCapacity = 800000
a2.channels.c2.keep-alive=3

# sink configuration
a2.sinks.k2.type = avro
a2.sinks.k2.hostname = 192.168.1.80
a2.sinks.k2.port = 4444
a2.sinks.k2.channel = c2
a2.sinks.k2.connect-timeout=20000
a2.sinks.k2.batch-size = 100
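
To make the routing across the two configurations concrete: a2's static interceptor stamps every event with the header sn=game1, and a1's multiplexing selector maps that header value to a channel. The shell function below is only a sketch of that decision, not Flume code; the name select_channel is hypothetical. The fallback branch reflects my understanding that, because no selector.default is configured here, an event whose sn value matches no mapping is not placed in any channel.

```shell
#!/bin/sh
# Sketch of the routing decision made by a1's multiplexing channel selector.
# Input: the value of the "sn" header. Output: the channel name.
select_channel() {
    case "$1" in
        game1) echo "c1" ;;     # a1.sources.r1.selector.mapping.game1=c1
        game2) echo "c2" ;;     # a1.sources.r1.selector.mapping.game2=c2
        *)     echo "none" ;;   # no mapping, and no selector.default in this config
    esac
}

select_channel game1   # events from a2 carry sn=game1
```

A second collection machine would use the same a2 configuration with the interceptor value set to game2, and its events would land in c2.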

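The sink type in the a1 configuration is com.fyx.flume.FileSink, a sink plugin I developed myself. Flume's file storage does not support output directories named in the log-2018-04-18-21 format, so I wrote this plugin.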
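
For reference, the directory name format mentioned above is the prefix log- followed by year, month, day, and hour. The snippet below only illustrates that naming pattern with the date command; it is not how the plugin is implemented. The function name hourly_dir_name is hypothetical, and the -d option assumes GNU date.

```shell
#!/bin/sh
# Format an hourly directory name such as log-2018-04-18-21.
# Assumes GNU date, which accepts a timestamp string via -d.
hourly_dir_name() {
    date -d "$1" +log-%Y-%m-%d-%H
}

hourly_dir_name "2018-04-18 21:05:00"
# prints log-2018-04-18-21
```

Calling hourly_dir_name now gives the name for the current hour.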

Plugin download:

https://download.csdn.net/download/newayue/10360684

or: https://github.com/ayue123/flume

3: Import the plugin:

Create a plugins.d directory directly under the Flume directory, with the following structure:

plugins.d/

plugins.d/FileSink/

plugins.d/FileSink/lib/flume-file-sink.jar

plugins.d/FileSink/libext/

plugins.d/FileSink/native/

lib holds the plugin JAR, libext holds the JARs the plugin depends on, and native holds any native libraries it uses.
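
The layout above can be created in one step. This is a small sketch; the function name make_plugin_dirs is hypothetical, and its argument is assumed to be the directory where Flume was extracted.

```shell
#!/bin/sh
# Create the plugins.d layout for the FileSink plugin under a Flume directory.
# $1: the Flume installation directory (assumed to exist).
make_plugin_dirs() {
    base="$1/plugins.d/FileSink"
    mkdir -p "$base/lib" "$base/libext" "$base/native"
}
```

For example, run make_plugin_dirs . from inside the Flume directory, then copy flume-file-sink.jar into plugins.d/FileSink/lib/.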

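Restart the Flume agent and Flume will load the plugin automatically. After that, the fully qualified class name can be used as the type property in flume.conf.

IV: Startup: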

Start a1 first:

bin/flume-ng agent --conf conf --conf-file ./conf/flume.conf --name a1 -Dflume.root.logger=INFO,console

Then start a2:

bin/flume-ng agent --conf conf --conf-file ./conf/flume.conf --name a2 -Dflume.root.logger=INFO,console
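
The two start commands differ only in the value passed to --name. As a convenience, a small helper can build the command line from the agent name. This is just a sketch: the function name flume_agent_cmd is hypothetical, and it prints the command instead of running it so that it can be checked first.

```shell
#!/bin/sh
# Print (not run) the flume-ng command line for a given agent name.
# $1: agent name; $2: optional config file, defaulting to ./conf/flume.conf
flume_agent_cmd() {
    name="$1"
    conf_file="${2:-./conf/flume.conf}"
    printf 'bin/flume-ng agent --conf conf --conf-file %s --name %s -Dflume.root.logger=INFO,console\n' \
        "$conf_file" "$name"
}

flume_agent_cmd a1
flume_agent_cmd a2
```

The agent name must match the prefix used in the configuration file (a1 or a2 here), since that prefix selects which set of properties the agent reads.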

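We can write an infinite loop to simulate log writes (run it in /home/ayue/flume/flume-source, the directory of the file that a2 tails):

while true
do
    date >> serviceTest.log
    sleep 2
done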

V: Some pitfalls to watch out for:

1: In a1.sources.r1.channels = c1, the key channels must not be written as channel. If it is, the following error is reported:

2018-04-19 01:37:45,867 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:589)] Could not configure source  r1 due to: Failed to configure component!
org.apache.flume.conf.ConfigurationException: Failed to configure component!
        at org.apache.flume.conf.source.SourceConfiguration.configure(SourceConfiguration.java:111)
        at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:566)
        at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:346)
        at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.access$000(FlumeConfiguration.java:212)
        at org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:126)
        at org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:108)
        at org.apache.flume.node.PropertiesFileConfigurationProvider.getFlumeConfiguration(PropertiesFileConfigurationProvider.java:189)
        at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:93)
        at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flume.conf.ConfigurationException: No channels set for r1
        at org.apache.flume.conf.source.SourceConfiguration.configure(SourceConfiguration.java:69)
        ... 15 more
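
This mistake is easy to make because sinks do use the singular form (a1.sinks.k1.channel = c1), while sources use the plural. A quick grep can catch it before the agent is started. The following is a sketch; the function name find_bad_source_channel is hypothetical, and the pattern assumes agent and component names consist of letters, digits, underscores, and hyphens.

```shell
#!/bin/sh
# Print source lines that use the singular "channel" key, with line numbers.
# Exit status is 0 if any such line is found, 1 if none.
# Usage: find_bad_source_channel conf/flume.conf
find_bad_source_channel() {
    grep -nE '^[[:space:]]*[A-Za-z0-9_-]+\.sources\.[A-Za-z0-9_-]+\.channel[[:space:]]*=' "$1"
}
```

Sink lines are deliberately not matched, since channel is the correct key there.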

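2: In a1.sinks.k1.sink.directory = /var/log/flume, the sink segment immediately before directory must not be omitted (the stack trace shows this is Flume's RollingFileSink). If it is omitted, the following error is reported: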

java.lang.IllegalArgumentException: Directory may not be null
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
        at org.apache.flume.sink.RollingFileSink.configure(RollingFileSink.java:90)
        at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
        at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
        at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
        at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)




