Big Data Tutorial (12.5): The Flume Log Collection Framework

        The previous chapters covered Hive; in this post I will share the log collection framework Flume. In a complete big data processing system, besides the analysis core made up of HDFS + MapReduce + Hive, indispensable auxiliary systems such as data collection, result export, and task scheduling are also needed. The Hadoop ecosystem offers convenient open-source frameworks for these auxiliary tools, as shown in the figure below:

[Figure: auxiliary data collection, export, and scheduling frameworks around the Hadoop analysis core]

    I. Flume Introduction

           (1) Overview

               Flume is a distributed, reliable, and highly available system for collecting, aggregating, and moving large volumes of log data.
               Flume can collect source data in many forms, such as files and socket packets, and can deliver the collected data to many external storage systems, including HDFS, HBase, Hive, and Kafka.
               Typical collection requirements can be met with a simple Flume configuration.
               Flume also offers good custom-extension capabilities for special scenarios, so it can handle most day-to-day data collection needs.
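               As a preview of what such a configuration looks like for the HDFS case, here is a minimal sketch of an agent that tails an application log and delivers it to HDFS. The agent name, the tailed file path, the HDFS directory, and the roll settings are illustrative assumptions, not values used later in this tutorial; adjust them to your environment.

# hypothetical example: write a tail-to-HDFS collection plan, then start the agent
cat > conf/tail-hdfs.conf <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# exec source: tail an application log file (path is an assumption)
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/logs/access.log

# hdfs sink: roll output files by time and size (values are examples only)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0

# memory channel buffers events between the source and the sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
bin/flume-ng agent -c conf -f conf/tail-hdfs.conf -n a1 -Dflume.root.logger=INFO,console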

           (2) How Flume runs

               a. The core role in a Flume deployment is the agent; a Flume collection system is formed by connecting agents together.
               b. Each agent acts as a data courier (data is passed from Source to Channel to Sink in the form of Events; an Event is a unit of the data flow) and contains three internal components:

[Figure: the three components inside an agent: Source, Channel, Sink]

           (3) Flume collection system architecture

                a. Simple structure (a single agent collects the data)

[Figure: simple structure with a single agent]

                b. Complex structure (multiple agents chained in series)

[Figure: complex structure with multiple agents chained in series]
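                In the chained layout, agents are usually linked by pointing an Avro sink on the upstream agent at an Avro source on the downstream agent. The snippet below is a minimal sketch of just that wiring; the host name downstream-host, the port 4141, and the agent names are assumptions, and each agent still needs its own full source/channel/sink definitions.

# hypothetical: on the upstream machine, agent a1 forwards its events over Avro
cat >> conf/upstream.conf <<'EOF'
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = downstream-host
a1.sinks.k1.port = 4141
a1.sinks.k1.channel = c1
EOF

# hypothetical: on the downstream machine, agent a2 receives those events
cat >> conf/downstream.conf <<'EOF'
a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4141
a2.sources.r1.channels = c1
EOF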

    II. Flume Hands-on Examples

          (1) Installing and deploying Flume

1. Installing Flume is simple: just unpack the tarball (the only prerequisite is a working Hadoop environment)
   Download the Flume package from the official site: http://flume.apache.org/
   Upload the package to the node where the data source lives
   Alt+p
   put  apache-flume-1.7.0-bin.tar.gz
   Then unpack it: tar -zxvf apache-flume-1.7.0-bin.tar.gz -C apps/
   Then go into the Flume directory and edit flume-env.sh under conf to set JAVA_HOME
2. Write a collection plan that matches your requirements in a configuration file (the file name can be anything)
3. Start the Flume agent on the target node, pointing it at that configuration file
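
For reference, here is the same install procedure as a shell sketch; the apps/ directory matches the command above, and the JAVA_HOME path is the one that appears in the startup log later in this post, so substitute your own JDK location if it differs.

# unpack the uploaded tarball, then point Flume at the local JDK
tar -zxvf apache-flume-1.7.0-bin.tar.gz -C apps/
cd apps/apache-flume-1.7.0-bin
# the binary distribution ships only a template; copy it and set JAVA_HOME
cp conf/flume-env.sh.template conf/flume-env.sh
echo 'export JAVA_HOME=/usr/local/jdk1.7.0_45' >> conf/flume-env.sh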

          Let's first use the simplest possible example to check that the environment works

          a. Create a new file under Flume's conf directory

vi   netcat-logger.conf
# Name the components of this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source component: r1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe/configure the sink component: k1
a1.sinks.k1.type = logger

# Describe/configure the channel component; a memory-backed channel is used here
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Wire the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

          b. Start the agent to collect data

bin/flume-ng agent -c conf -f conf/netcat-logger.conf -n a1  -Dflume.root.logger=INFO,console
-c conf   points to the directory containing Flume's own configuration files
-f conf/netcat-logger.conf  points to the collection plan we wrote
-n a1  gives the name of the agent defined in that plan

Execution result:

[Screenshot: flume-ng agent startup log]

          c. Test
               First send some data to the port the agent is listening on, so the agent has data to collect.
               Do this from any machine that has network access to the agent node.

# First check whether telnet is already installed
rpm -qa | grep telnet
# Check which telnet versions are available from yum
yum list | grep telnet
# Install
yum install -y telnet-server.x86_64
yum install -y telnet.x86_64
telnet agent-hostname port   (e.g. telnet localhost 44444)
[hadoop@centos-aaron-h1 ~]$ talnet localhost 44444
-bash: talnet: command not found
[hadoop@centos-aaron-h1 ~]$ telnet localhost 44444
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
121
OK
sdf
OK
sd
OK
fsd
OK
f
OK
sd
OK
f
OK
sdhello
OK
hellow

          (2) Collecting files from a directory with Flume

# Create spooldir-logger.conf in Flume's conf directory
vi spooldir-logger.conf
# Add the following content

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# Watch a directory: spoolDir sets the directory to monitor; fileHeader controls whether the file path is added as an event header
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/hadoop/flumespool
a1.sources.r1.fileHeader = true

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
# Create the directory that Flume will watch
mkdir /home/hadoop/flumespool
# Start the Flume agent
bin/flume-ng agent -c ./conf -f ./conf/spooldir-logger.conf -n a1 -Dflume.root.logger=INFO,console
Test: move files into /home/hadoop/flumespool (e.g. mv ./xxxFile /home/hadoop/flumespool), but do not create or write files directly inside that directory; see the sketch below.
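
A minimal sketch of a safe way to feed the watched directory: write the file somewhere else first, then move it in, so the spooldir source never sees a half-written file (the staging path and file name are just examples; mv is an atomic rename when both paths are on the same filesystem):

# hypothetical example: stage the file outside the watched directory, then move it in
echo "hello flume" > /home/hadoop/xxx5.txt
mv /home/hadoop/xxx5.txt /home/hadoop/flumespool/
# avoid re-using a file name that was already ingested (the source renames processed files to *.COMPLETED and will error on a name collision)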

Execution result:

[hadoop@centos-aaron-h1 apache-flume-1.6.0-bin]$ bin/flume-ng agent -c ./conf -f ./conf/spooldir-logger.conf -n a1 -Dflume.root.logger=INFO,console
Info: Including Hadoop libraries found via (/home/hadoop/apps/hadoop-2.9.1/bin/hadoop) for HDFS access
Info: Excluding /home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-api-1.7.25.jar from classpath
Info: Excluding /home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar from classpath
Info: Including Hive libraries found via (/home/hadoop/apps/apache-hive-1.2.2-bin) for Hive access
+ exec /usr/local/jdk1.7.0_45/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/home/hadoop/apps/apache-flume-1.6.0-bin/conf:/home/hadoop/apps/apache-flume-1.6.0-bin/lib/*:/home/hadoop/apps/hadoop-2.9.1/etc/hadoop:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/activation-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/avro-1.7.7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-cli-1.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-codec-1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-collections-3.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-compress-1.4.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-configuration-1.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-digester-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-lang-2.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-lang3-3.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-logging-1.1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-math3-3.1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/commons-net-3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/curator-client-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/curator-framework-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/gson-2.2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/guava-11.0.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/hadoop-annotations-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/hadoop-auth-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/hamcrest-core-1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/httpclient-4.5.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/httpcore-4.4.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jersey-json-1.9.jar:/home/hadoop/apps/had
oop-2.9.1/share/hadoop/common/lib/jersey-server-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jets3t-0.9.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jettison-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jetty-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jetty-sslengine-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jetty-util-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jsch-0.1.54.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/json-smart-1.3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jsp-api-2.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/jsr305-3.0.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/junit-4.11.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/mockito-all-1.8.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/netty-3.6.2.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/nimbus-jose-jwt-4.41.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/paranamer-2.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/servlet-api-2.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/snappy-java-1.0.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/stax2-api-3.1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/stax-api-1.0-2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/woodstox-core-5.0.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/xmlenc-0.52.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/xz-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/zookeeper-3.4.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/hadoop-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/hadoop-common-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/hadoop-nfs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/jdiff:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/sources:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/templates:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/hadoop-hdfs-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-annotations-2.7.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-core-2.7.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-databind-2.7.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/
hdfs/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/okhttp-2.7.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/okio-1.6.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-client-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-native-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-native-client-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-nfs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-rbf-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/hadoop-hdfs-rbf-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/jdiff:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/sources:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/templates:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/hdfs/webapps:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/activation-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/apacheds-i18n-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/api-asn1-api-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/api-util-1.0.0-M20.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/avro-1.7.7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-beanutils-1.7.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-cli-1.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-codec-1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-configuration-1.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-digester-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-lang-2.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoo
p/yarn/lib/commons-lang3-3.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-math3-3.1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/commons-net-3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/curator-client-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/curator-framework-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/curator-recipes-2.7.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/ehcache-3.3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/fst-2.50.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/geronimo-jcache_1.0_spec-1.0-alpha-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/gson-2.2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/guava-11.0.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/guice-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/HikariCP-java7-2.4.12.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/htrace-core4-4.1.0-incubating.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/httpclient-4.5.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/httpcore-4.4.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/java-util-1.9.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/javax.inject-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/java-xmlbuilder-0.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jcip-annotations-1.0-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-client-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-json-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jersey-server-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jets3t-0.9.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jettison-1.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jetty-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jetty-sslengine-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jsch-0.1.54.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/json-io-2.5.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/json-smart-1.3.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jsp-api-2.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/metrics-core-3.0.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/mssql-jdbc-6.2.1.jre7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/h
ome/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/nimbus-jose-jwt-4.41.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/paranamer-2.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/servlet-api-2.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/snappy-java-1.0.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/stax2-api-3.1.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/woodstox-core-5.0.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/xmlenc-0.52.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/xz-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-api-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-client-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-registry-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-router-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-tests-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/sources:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/test:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/yarn/timelineservice:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/avro-1.7.7.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/hadoop-annotations-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/jersey-ser
ver-1.9.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/junit-4.11.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/snappy-java-1.0.5.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.9.1-tests.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/jdiff:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib-examples:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/sources:/home/hadoop/apps/hadoop-2.9.1/contrib/capacity-scheduler/*.jar:/home/hadoop/apps/apache-hive-1.2.2-bin/lib/*' -Djava.library.path=:/home/hadoop/apps/hadoop-2.9.1/lib/native org.apache.flume.node.Application -f ./conf/spooldir-logger.conf -n a1
2019-02-13 07:40:44,890 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2019-02-13 07:40:44,897 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:./conf/spooldir-logger.conf
2019-02-13 07:40:44,901 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:931)] Added sinks: k1 Agent: a1
2019-02-13 07:40:44,902 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:k1
2019-02-13 07:40:44,902 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:k1
2019-02-13 07:40:44,912 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:141)] Post-validation flume configuration contains configuration for agents: [a1]
2019-02-13 07:40:44,913 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:145)] Creating channels
2019-02-13 07:40:44,921 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel c1 type memory
2019-02-13 07:40:44,927 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:200)] Created channel c1
2019-02-13 07:40:44,931 (conf-file-poller-0) [INFO - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:41)] Creating instance of source r1, type spooldir
2019-02-13 07:40:44,939 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: k1, type: logger
2019-02-13 07:40:44,943 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:114)] Channel c1 connected to [r1, k1]
2019-02-13 07:40:44,950 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:Spool Directory source r1: { spoolDir: /home/hadoop/flumespool } }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@70edf123 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
2019-02-13 07:40:44,957 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel c1
2019-02-13 07:40:44,988 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
2019-02-13 07:40:44,988 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: CHANNEL, name: c1 started
2019-02-13 07:40:44,989 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink k1
2019-02-13 07:40:44,990 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:184)] Starting Source r1
2019-02-13 07:40:44,990 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.source.SpoolDirectorySource.start(SpoolDirectorySource.java:78)] SpoolDirectorySource source starting with directory: /home/hadoop/flumespool
2019-02-13 07:40:45,084 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
2019-02-13 07:40:45,084 (lifecycleSupervisor-1-3) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SOURCE, name: r1 started
2019-02-13 07:43:51,716 (pool-3-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:258)] Last read took us just up to a file boundary. Rolling to the next file, if there is one.
2019-02-13 07:43:51,716 (pool-3-thread-1) [INFO - org.apache.flume.client.avro.ReliableSpoolingFileEventReader.rollCurrentFile(ReliableSpoolingFileEventReader.java:348)] Preparing to move file /home/hadoop/flumespool/xxx4.txt to /home/hadoop/flumespool/xxx4.txt.COMPLETED
2019-02-13 07:43:55,040 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 68 65 6C 6C 6F 77 20 20 68 65 6C 6C 6F 20 0D    hellow  hello . }
2019-02-13 07:43:55,040 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 68 65 6C 6C 6F 77 78 20 68 65 6C 6C 6F 77 62 62 hellowx hellowbb }
2019-02-13 07:43:55,041 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 73 68 65 6C 6C 20 62 61 6E 61 6E 65 72 20 76 69 shell bananer vi }
2019-02-13 07:43:55,041 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 62 62 73 0D                                     bbs. }
2019-02-13 07:43:55,041 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 74 74 78 20 74 74 74 74 20 79 79 79 79 20 68 68 ttx tttt yyyy hh }
2019-02-13 07:43:55,042 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 73 68 61 6D 65 20 6A 69 6D 65 20 61 70 70 6C 65 shame jime apple }
2019-02-13 07:43:55,042 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:94)] Event: { headers:{file=/home/hadoop/flumespool/xxx4.txt} body: 63 68 65 6C 6C 79                               chelly }

    Notes:

            a. My Linux machine already has JDK 1.7 installed, and Flume 1.8 and later only support JDK 1.8 or newer; since downloading Flume 1.7 was painfully slow, I used Flume 1.6 for this demo. In production, choose the version that fits your project's needs.

            b. The collection configurations shown above all bind to localhost, so telnet from other machines cannot connect. The right approach is to bind to the hostname or IP of the machine where Flume is installed (the same idea as starting a YARN cluster: start-yarn.sh must be run on the host configured as the YARN ResourceManager); see the sketch after these notes.

            c. The configurations above are only the simplest possible setups; see the official documentation for more fine-grained options. It is worth noting that Flume's official documentation is excellent and, compared with many other open-source projects, very complete.
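
            As a sketch of note b: the host name centos-aaron-h1 below is the agent host taken from the session output above; substitute your agent machine's hostname or IP (binding to 0.0.0.0 also works if you want to listen on all interfaces).

# rebind the netcat source from localhost to the agent host, then restart the agent
sed -i 's/^a1.sources.r1.bind = localhost$/a1.sources.r1.bind = centos-aaron-h1/' conf/netcat-logger.conf
bin/flume-ng agent -c conf -f conf/netcat-logger.conf -n a1 -Dflume.root.logger=INFO,console
# from another machine (or terminal) that can resolve the agent host:
telnet centos-aaron-h1 44444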

    Final words: that is all for this post. If you found it helpful, please give it a like; if you are interested in my other big data articles or in me, please follow my blog, and feel free to reach out any time.

Reposted from: https://my.oschina.net/u/2371923/blog/3009430
