From: http://www.iteblog.com/archives/911
The solution to a problem I personally ran into while running Flume — as suggested on the Cloudera mailing list, there are two probable causes of this error:
(1) HDFS safemode is turned on. Try running hadoop dfsadmin -safemode leave and see if the error goes away.
(2) The Flume and Hadoop versions are mismatched. To check this, replace the hadoop-core.jar in Flume's lib directory with the one found in your Hadoop installation folder.
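To make the first check concrete, here is a small shell sketch of probing the safemode status before retrying. The echo stands in for the real `hadoop dfsadmin -safemode get` call so the snippet runs anywhere; on a live cluster, replace it with the real command (note the subcommand is dfsadmin, not hadoop fs as sometimes quoted):

```shell
# Sketch: check whether HDFS is in safemode before letting Flume write to it.
# The echo below simulates the output of `hadoop dfsadmin -safemode get`.
status=$(echo "Safe mode is ON")   # stand-in for: hadoop dfsadmin -safemode get
case "$status" in
  *ON*)  echo "HDFS is in safemode; run: hadoop dfsadmin -safemode leave" ;;
  *OFF*) echo "HDFS safemode is off; the write path should be clear" ;;
esac
```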
Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting large volumes of log data. It supports customizing all kinds of data senders in a logging system to collect data, and it can also perform simple processing on the data and write it to various data receivers (such as text files, HDFS, HBase, etc.).
Flume has the following main kinds of components:
(1) Master: responsible for configuration and communication management; it is the controller of the cluster, and multiple master nodes are supported;
(2) Agent: collects data; the agent is where data flows originate in Flume, and it forwards the data flows it produces to a collector;
(3) Collector: aggregates data (a data gatherer), usually producing a larger data flow, which it then loads into storage.
Simply put, agents periodically send the data they collect to a collector, and the collector receives the data sent by the agents and writes it to a designated destination (such as text files, HDFS, HBase, etc.).
This article gives a brief walkthrough of deploying a distributed Flume-0.9.4 environment across three machines, whose hostnames are master, agent, and collector.
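Before diving into installation, it may help to see how these roles get wired together in practice. In Flume 0.9.x the master distributes per-node dataflow specs of the form `node : source | sink` to each logical node. A minimal sketch for the three hosts above could look like the following — the log path, port, and HDFS URL here are illustrative assumptions, not values from this article:

```
agent     : tail("/tmp/example.log") | agentSink("collector", 35853) ;
collector : collectorSource(35853) | collectorSink("hdfs://master:8020/flume/", "data") ;
```

The agent tails a local file and ships events to the collector's port; the collector listens on that port and writes aggregated output to HDFS.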
1. Download Flume-0.9.4 from the official site and extract it:
[wyp@master ~]$ wget https://…/repositories/releases/com/cloudera/flume-distribution/0.9.4-cdh4.0.0/flume-distribution-0.9.4-cdh4.0.0-bin.tar.gz
[wyp@master ~]$ tar -zxvf flume-distribution-0.9.4-cdh4.0.0-bin.tar.gz
[wyp@master ~]$ cd flume-0.9.4-cdh3u3
[wyp@master flume-0.9.4-cdh3u3]$
2. Go into the $FLUME_HOME/bin directory, copy flume-env.sh.template to flume-env.sh, and set the following variables in flume-env.sh:
[wyp@master flume-0.9.4-cdh3u3]$ cd bin
[wyp@master bin]$ cp flume-env.sh.template flume-env.sh
[wyp@master bin]$ vim flume-env.sh

export FLUME_HOME=/home/q/flume-0.9.4-cdh3u3
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-6-sun
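To confirm the variables took effect in the current shell, a quick sanity check can help; the paths below are the ones used in this article, so adjust them to your own install:

```shell
# Re-export the variables from flume-env.sh and verify them.
export FLUME_HOME=/home/q/flume-0.9.4-cdh3u3
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-6-sun

echo "conf dir: $FLUME_CONF_DIR"
# make sure Flume's bin directory really ended up on PATH
case ":$PATH:" in
  *":$FLUME_HOME/bin:"*) echo "PATH ok" ;;
  *)                     echo "PATH is missing $FLUME_HOME/bin" ;;
esac
```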
3. Go into the $FLUME_HOME/conf directory, copy flume-site.xml.template to flume-site.xml, and edit flume-site.xml. The key properties are:
[wyp@master bin]$ cd ../conf
[wyp@master conf]$ cp flume-site.xml.template flume-site.xml
[wyp@master conf]$ vim flume-site.xml

<property>
  <name>flume.master.servers</name>
  <description>This is the address for the config servers status</description>
</property>
<property>
  <name>flume.collector.output.format</name>
  <description>The output format for the data written by a Flume
    collector node. There are several formats available:
      syslog - outputs events in a syslog-like format
      log4j - outputs events in a pattern similar to Hadoop's log4j pattern
      raw - Event body only. This is most similar to copying a file but
            does not preserve any uniqifying metadata like host/timestamp/nanos.
      avro - Avro Native file format. Default currently is uncompressed.
      avrojson - this outputs data as json encoded by avro
      avrodata - this outputs data as a avro binary encoded data
      debug - used only for debugging
  </description>
</property>
<property>
  <name>flume.collector.roll.millis</name>
  <description>The time (in milliseconds) between when hdfs files are
    closed and a new file is opened
  </description>
</property>
<property>
  <name>flume.agent.logdir.maxage</name>
  <description>number of milliseconds before a local log file is
    considered closed and ready to forward.
  </description>
</property>
<property>
  <name>flume.agent.logdir.retransmit</name>
  <description>The time (in milliseconds) before a sent event is
    assumed lost and needs to be retried in end-to-end reliability
    mode again. This should be at least 2x the
    flume.collector.roll.millis.
  </description>
</property>
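As a concrete starting point, a minimal flume-site.xml for the three hosts in this article might fill these properties in as follows. The master hostname comes from this article, but the output format and roll interval are illustrative assumptions, not the article's values, so tune them for your cluster:

```
<?xml version="1.0"?>
<configuration>
  <property>
    <name>flume.master.servers</name>
    <value>master</value>   <!-- hostname of the master node in this article -->
  </property>
  <property>
    <name>flume.collector.output.format</name>
    <value>raw</value>      <!-- illustrative choice; see the format list above -->
  </property>
  <property>
    <name>flume.collector.roll.millis</name>
    <value>60000</value>    <!-- illustrative: roll HDFS files every minute -->
  </property>
</configuration>
```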
4. Package the whole configured Flume directory and ship it to the agent and collector machines:
[wyp@master ~]$ tar -zcvf flume-0.9.4-cdh3u3.tar.gz flume-0.9.4-cdh3u3
[wyp@master ~]$ scp flume-0.9.4-cdh3u3.tar.gz agent:/home/wyp
[wyp@master ~]$ scp flume-0.9.4-cdh3u3.tar.gz collector:/home/wyp
5. Extract the package above on the agent and collector machines, then start the following processes on the master, agent, and collector machines respectively:
[wyp@master ~]$ $FLUME_HOME/bin/flume master

[wyp@agent ~]$ $FLUME_HOME/bin/flume node_nowatch -n agent

[wyp@collector ~]$ $FLUME_HOME/bin/flume node_nowatch -n collector
With this, the master machine plays the master role, the agent machine plays the agent role, and the collector machine plays the collector role.
6. Open http://master:35871 in a browser; if the page loads and shows that the agent and collector nodes have started successfully, the Flume installation is complete!