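Collect logs from multiple logs servers onto an HDFS server. The Flume sink on each logs server and the Flume source on the HDFS server are both of type avro; the Flume sink on the HDFS server is of type hdfs.
Flume Deployment Guide
System requirements:
Java runtime environment
Deployment:
Download and extract the Flume package on every logs server and on the HDFS server: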
http://mirror.bit.edu.cn/apache/flume/1.4.0/apache-flume-1.4.0-bin.tar.gz
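The download-and-extract step can be sketched as the shell snippet below. It assumes wget and tar are installed and that the tarball unpacks into a directory named after the archive; the names FLUME_URL, FLUME_TARBALL, FLUME_DIR and fetch_flume are illustrative only, and the mirror may no longer serve this release.

```shell
# URL taken from the line above; everything else is derived from it.
FLUME_URL="http://mirror.bit.edu.cn/apache/flume/1.4.0/apache-flume-1.4.0-bin.tar.gz"
FLUME_TARBALL="${FLUME_URL##*/}"        # strip everything up to the last "/"
FLUME_DIR="${FLUME_TARBALL%.tar.gz}"    # assumed name of the extracted directory

# Download and unpack the archive (hypothetical helper, not invoked here).
fetch_flume() {
    wget "$FLUME_URL" && tar -xzf "$FLUME_TARBALL"
}

echo "tarball:   $FLUME_TARBALL"
echo "directory: $FLUME_DIR"
```

Run fetch_flume on each server, then cd "$FLUME_DIR" before continuing with the configuration steps.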
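Flume configuration on the logs servers
Change into the extracted Flume directory and edit the configuration files:
1. cp conf/flume-env.sh.template conf/flume-env.sh
Add the following to conf/flume-env.sh:
JAVA_HOME="<JAVA HOME DIR>"
2. Create a flume.conf file in the conf directory with the following content,
setting agenta.sources.loggerSource.command to tail the log file to be collected: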
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agenta'
agenta.sources = loggerSource
agenta.channels = memoryChannel
agenta.sinks = loggerSink
# For each one of the sources, the type is defined
agenta.sources.loggerSource.type = exec
agenta.sources.loggerSource.command = tail -F <logpath>
# The channel can be defined as follows.
agenta.sources.loggerSource.channels = memoryChannel
# Each sink's type must be defined
agenta.sinks.loggerSink.type = avro
agenta.sinks.loggerSink.hostname = <hdfs serverip>
agenta.sinks.loggerSink.port = 4141
# Specify the channel the sink should use
agenta.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agenta.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agenta.channels.memoryChannel.capacity = 1000
Start command (the value of --name must match the agent name used in flume.conf, here agenta):
./bin/flume-ng agent --conf conf/ --conf-file conf/flume.conf --name agenta
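The agent name given with --name has to match the property prefix in flume.conf (agenta in the logs-server config above); with a mismatched name the agent typically starts but finds no sources, channels or sinks to run. The sketch below guards against that before launching. The helpers agent_defined and start_agent are hypothetical names, not part of Flume; -Dflume.root.logger=INFO,console is the logging override shown in the Flume user guide and is optional.

```shell
# Succeeds if the config file defines "<agent_name>.sources".
agent_defined() {
    conf_file="$1"
    agent_name="$2"
    grep -Eq "^${agent_name}\.sources[[:space:]]*=" "$conf_file"
}

# Hypothetical wrapper: refuse to start when the name is missing from the config.
start_agent() {
    conf_file="$1"
    agent_name="$2"
    if ! agent_defined "$conf_file" "$agent_name"; then
        echo "agent '$agent_name' is not defined in $conf_file" >&2
        return 1
    fi
    # Run from the extracted Flume directory; logs go to the console.
    ./bin/flume-ng agent --conf conf/ --conf-file "$conf_file" \
        --name "$agent_name" -Dflume.root.logger=INFO,console
}
```

For the logs servers this would be called as: start_agent conf/flume.conf agenta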
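Flume configuration on the HDFS server
Change into the extracted Flume directory and edit the configuration files:
1. cp conf/flume-env.sh.template conf/flume-env.sh
Add the following to conf/flume-env.sh:
JAVA_HOME="<JAVA HOME DIR>"
HADOOP_HOME="<HADOOP HOME>"
2. Create a flume.conf file in the conf directory with the following content: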
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent'
agent.sources = loggerSource
agent.channels = memoryChannel
agent.sinks = loggerSink
# For each one of the sources, the type is defined
agent.sources.loggerSource.type = avro
agent.sources.loggerSource.bind = 0.0.0.0
agent.sources.loggerSource.port = 4141
# The channel can be defined as follows.
agent.sources.loggerSource.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.loggerSink.type = hdfs
agent.sinks.loggerSink.hdfs.path = <hdfs sink path>
agent.sinks.loggerSink.hdfs.filePrefix = csplog-
agent.sinks.loggerSink.hdfs.rollInterval = 86400
agent.sinks.loggerSink.hdfs.rollSize = 0
agent.sinks.loggerSink.hdfs.rollCount = 0
agent.sinks.loggerSink.hdfs.fileType = DataStream
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 1000
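A note on the roll settings in the config above: in the Flume HDFS sink, rollInterval is given in seconds, and a value of 0 for rollSize or rollCount turns off rolling on file size or event count, so with these values a new file is started only once per day (86400 seconds). As a small sketch, the interval can be derived from a number of hours instead of being hard-coded; hours_to_seconds and ROLL_INTERVAL are illustrative names, not Flume settings.

```shell
# Convert a number of hours to seconds for hdfs.rollInterval.
hours_to_seconds() {
    echo $(( $1 * 60 * 60 ))
}

ROLL_INTERVAL="$(hours_to_seconds 24)"

# Emit the property line to paste into flume.conf.
echo "agent.sinks.loggerSink.hdfs.rollInterval = ${ROLL_INTERVAL}"
# prints: agent.sinks.loggerSink.hdfs.rollInterval = 86400
```

Changing the argument to 1 would give an hourly roll (3600 seconds) with the other two settings left at 0.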
Start command:
bin/flume-ng agent --conf conf/ --conf-file conf/flume.conf --name agent