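I recently used Flume to collect multiple log files across servers in real time and load them into MySQL. The pipeline sends several log files from server A to server B, and server B writes the data into a MySQL database on a separate data server. I hit a few pitfalls along the way so you don't have to; the steps below are what finally worked. Without further ado, let's dig in!
1. Two servers, A and B, that can reach each other over the network, both with the JDK and Flume installed (I used Flume 1.9). MySQL can be installed on a separate data server (details omitted).
2. A custom MySQLSink, deployed on server B (its jar and the MySQL JDBC driver need to be on Flume's classpath, for example in Flume's lib directory). Someone has already written this up in detail, so you can refer to https://blog.csdn.net/weixin_43230682/article/details/108265427
3. On server A, add a new configuration file flume-taildir.conf to the conf directory under the Flume installation directory: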
# flume-taildir.conf
a95.sources = r95
a95.sinks = k95
a95.channels = c95
# source: set type to TAILDIR
a95.sources.r95.type = TAILDIR
a95.sources.r95.channels=c95
a95.sources.r95.positionFile= /data/testlogs/tail_position/taildir_position.json
# multiple file groups (path patterns) can be listed, separated by spaces
a95.sources.r95.filegroups=f1 f2
a95.sources.r95.filegroups.f1=/data/testlogs/.*log
a95.sources.r95.headers.f1.headerKey1 = value1
a95.sources.r95.filegroups.f2=/data/testlogs/.*txt
a95.sources.r95.headers.f2.headerKey1 = value2
a95.sources.r95.headers.f2.headerKey2 = value2-2
#a95.sources.r95.fileHeader=true
# TAILDIR reads local files and has no bind/port properties, so these two lines are not needed
#a95.sources.r95.bind = 0.0.0.0
#a95.sources.r95.port = 44444
# filter: optionally add a regex interceptor to filter log lines as needed
#a95.sources.r95.interceptors=i1
#a95.sources.r95.interceptors.i1.type=regex_filter
#a95.sources.r95.interceptors.i1.regex=(EVENT)(.*)
#channel
a95.channels.c95.type = file
a95.channels.c95.dataDirs = /data/testlogs/dataDirs
a95.channels.c95.checkpointDir = /data/testlogs/checkpointDir
a95.channels.c95.capacity = 1000
a95.channels.c95.transactionCapacity = 100
# sink: forward events to server B
a95.sinks.k95.type = avro
a95.sinks.k95.channel = c95
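# hostname and port must match the address and port that the avro source on server B binds to
a95.sinks.k95.hostname = <server B IP>
a95.sinks.k95.port = <port>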
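The configuration above points at three locations under /data/testlogs (the position file directory, dataDirs and checkpointDir). Flume normally creates these itself, but creating them up front and checking that the Flume user can write to them rules out one common startup failure. A minimal sketch; prepare_flume_dirs is a hypothetical helper name, not something Flume provides:

```shell
# Hypothetical helper: create the directories referenced in flume-taildir.conf
# and verify they are writable by the current user.
# $1 = base directory (defaults to /data/testlogs, as used in the config above)
prepare_flume_dirs() {
  base="${1:-/data/testlogs}"
  for d in "$base/tail_position" "$base/dataDirs" "$base/checkpointDir"; do
    mkdir -p "$d" || return 1
    if [ ! -w "$d" ]; then
      echo "not writable: $d" >&2
      return 1
    fi
  done
  return 0
}
```

Run it as the same user that will start the agent, for example `prepare_flume_dirs /data/testlogs`.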
On server B, add a new configuration file flume2mysql.conf to the conf directory under the Flume installation directory:
# flume2mysql.conf
a94.sources = r94
a94.channels = c94
a94.sinks = k94
# source: receives the events sent from server A
a94.sources.r94.type = avro
a94.sources.r94.channels = c94
a94.sources.r94.bind = 192.168.1.94
a94.sources.r94.port = 44444
#channel
a94.channels.c94.type = memory
a94.channels.c94.capacity = 1000
a94.channels.c94.transactionCapacity = 100
# sink: the custom sink, MysqlSink
a94.sinks.k94.type = com.flume.MysqlSink
a94.sinks.k94.mysqlurl=jdbc:mysql://IP:port/databaseName?useSSL=false&useUnicode=true&characterEncoding=utf8
a94.sinks.k94.username=username
a94.sinks.k94.password=password
a94.sinks.k94.tablename=tableName
a94.sinks.k94.batch-size=100
a94.sinks.k94.channel = c94
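Since the avro sink on server A has to point at exactly the address and port that the avro source on server B binds to, it can help to read those values back out of the files before starting anything. A rough sketch, assuming simple key = value lines as in the configs here; get_prop is a hypothetical helper, not part of Flume:

```shell
# Hypothetical helper: print the value of the first line whose key equals $2
# in the properties file $1. Comment lines starting with # are skipped.
get_prop() {
  awk -v key="$2" '
    /^[ \t]*#/ { next }
    {
      pos = index($0, "=")          # split on the first "=" only
      if (pos == 0) next
      k = substr($0, 1, pos - 1)
      v = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)  # trim whitespace around key and value
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) { print v; exit }
    }' "$1"
}
```

On server B, `get_prop $FLUME_HOME/conf/flume2mysql.conf a94.sources.r94.port` should print the same port that `get_prop $FLUME_HOME/conf/flume-taildir.conf a95.sinks.k95.port` prints on server A.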
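4. With both configuration files in place, start the agents. Server A tails its own log files and pushes the events over Avro to server B, so start server B first so that its Avro source is already listening when server A's sink connects. On server B, cd to Flume's bin directory and run: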
flume-ng agent \
--name a94 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/flume2mysql.conf \
-Dflume.root.logger=INFO,console
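Before starting server A, it is worth confirming that server B's Avro source is actually reachable from server A, since firewalls are a frequent cause of a sink that never connects. A minimal sketch, assuming bash (with /dev/tcp support) and the coreutils timeout command are available; port_open is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened
# within 3 seconds. Relies on bash's /dev/tcp pseudo-device.
# $1 = host, $2 = port
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

From server A, something like `port_open 192.168.1.94 44444 && echo reachable || echo "not reachable"` shows whether the agent on server B is accepting connections on the address and port from flume2mysql.conf.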
Then cd to Flume's bin directory on server A and start its agent:
flume-ng agent \
--name a95 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/flume-taildir.conf \
-Dflume.root.logger=INFO,console
Both agents are now running. Append some data to a monitored file on server A:
cd /data/testlogs
echo aaaaaa >> test.log
or append to a monitored file with the .txt suffix:
echo zzzzzz >> test.txt
Check the database: the rows are there, so it works.
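A single echo is enough for a smoke test, but appending a known number of distinct lines makes it easier to compare what was written with what arrived in MySQL. A small sketch; append_test_lines is a hypothetical helper and the line format is arbitrary:

```shell
# Hypothetical helper: append N numbered, timestamped lines to a file.
# $1 = target file, $2 = number of lines (defaults to 5)
append_test_lines() {
  file="$1"
  count="${2:-5}"
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'test-%s-%s\n' "$(date +%s)" "$i" >> "$file"
    i=$((i + 1))
  done
}
```

For example, run `append_test_lines /data/testlogs/test.log 10` and then check that ten new rows appear in the target table. On server A, the position file /data/testlogs/tail_position/taildir_position.json should also show the read offset (pos) for test.log advancing.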
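That is the whole process. I have tested it myself, though I did not take screenshots. If you run into problems in practice, search online or discuss them here. If anything is wrong, corrections are welcome!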