一、Requirements
Use Flume to collect logs (directories or files) from multiple servers, push them to Kafka for real-time consumption and display, and also store them in HDFS.
二、Preparation
2.1、Virtual machines:
At least 4 virtual machines (Hadoop needs at least 3 machines to build a cluster, plus one machine acting as the log source that Flume collects from)
hadoop1 192.168.48.144
hadoop2 192.168.48.145
hadoop3 192.168.48.146
flume1 192.168.48.147
2.2、Required files
jdk-8u101-linux-x64.gz
apache-flume-1.7.0-bin.tar.gz
zookeeper-3.4.8.tar.gz
kafka_2.11-0.10.2.0.tgz
hadoop-2.7.3.tar.gz
2.3、Creating the virtual machines
One log-source virtual machine: flume1
Three Hadoop machines: hadoop1, hadoop2, hadoop3 (you can create hadoop1 first as the namenode, finish its configuration, and then use VMware's clone feature to copy out the remaining two datanodes)
三、Hadoop setup
To keep the setup quick, I did not create a separate user or user group here and used the root user directly.
3.1、Create a virtual machine in VMware
Edit the network interface configuration:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
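A minimal static-IP configuration for hadoop1 might look like the following sketch; the IP comes from section 2.1, while the gateway and DNS values are assumptions for this example and should be adjusted to your own network:
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
# static address from section 2.1
IPADDR=192.168.48.144
NETMASK=255.255.255.0
# gateway and DNS below are assumptions for this example
GATEWAY=192.168.48.2
DNS1=192.168.48.2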
3.2、Edit the hosts file
vi /etc/hosts
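Using the addresses listed in section 2.1, the hosts file on every machine can map all four hostnames, roughly like this:
127.0.0.1        localhost
192.168.48.144   hadoop1
192.168.48.145   hadoop2
192.168.48.146   hadoop3
192.168.48.147   flume1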
3.3、Edit the network file
vi /etc/sysconfig/network
Change the hostname to hadoop1
Reboot for the change to take effect
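For reference, after the edit the file on a CentOS 6-style system would look roughly like this (hostname taken from this step):
NETWORKING=yes
HOSTNAME=hadoop1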
3.4、Disable the firewall and SELinux
Permanently disable the firewall: chkconfig --level 35 iptables off
Permanently disable SELinux:
vim /etc/selinux/config
Find the SELINUX line and change it to: SELINUX=disabled
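If you do not want to reboot right away, both can also be switched off for the current session; a minimal sketch, assuming a CentOS 6-style iptables service:
service iptables stop    # stop the running firewall immediately
setenforce 0             # put SELinux into permissive mode until the next reboot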
3.5、Install the JDK
Place the JDK package in /usr/local/xialei/java
cd into that directory and extract the archive
[root@hadoop1 java]# tar -zxvf jdk-8u101-linux-x64.gz
Configure the environment variables
[root@hadoop1 java]# vi /etc/profile
Add the following lines at the end of the file
export JAVA_HOME=/usr/local/xialei/java/jdk1.8.0_101
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Apply the changes
source /etc/profile
Check whether the configuration took effect
[root@hadoop1 java]# java -version
3.6、Clone the virtual machines
In VMware, clone hadoop1 into two more virtual machines named hadoop2 and hadoop3. On each of the two new machines, edit /etc/sysconfig/network and change the hostname to hadoop2 and hadoop3 respectively (the /etc/hosts entries from section 3.2 already cover all the machines).
3.7、Configure passwordless SSH login between the three machines
I will not go into detail here; the usual approach found online is sketched below.
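A minimal sketch, assuming root is used on all three machines and the hostnames from /etc/hosts resolve correctly:
# run on hadoop1, then repeat on hadoop2 and hadoop3
ssh-keygen -t rsa                 # accept the defaults, empty passphrase
ssh-copy-id root@hadoop1
ssh-copy-id root@hadoop2
ssh-copy-id root@hadoop3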
Verify that passwordless login works
You may be asked for the password the first time
3.8、Edit the Hadoop configuration files and configure the Hadoop environment variables
Create an hdfs directory and extract hadoop-2.7.3.tar.gz
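Assuming the same /usr/local/xialei base directory used for the JDK, this amounts to roughly:
mkdir -p /usr/local/xialei/hdfs
cd /usr/local/xialei/hdfs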
[root@hadoop1 hdfs]# tar -zxvf hadoop-2.7.3.tar.gz
vi /etc/profile
Append the following two lines at the end
export HADOOP_HOME=/usr/local/xialei/hdfs/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin
Verify the Hadoop environment variables
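For example, reloading the profile and printing the Hadoop version will confirm that the variables are picked up:
source /etc/profile
hadoop version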
Edit the configuration files: go into the etc/hadoop directory under the Hadoop installation
[root@hadoop1 hdfs]# cd hadoop-2.7.3/etc/hadoop
3.8.1、Edit core-site.xml
[root@hadoop1 hadoop]# vi core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://hadoop1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/xialei/hadoop/tmp</value>
    </property>
</configuration>
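hadoop.tmp.dir points to a local directory; if it does not exist yet it can be created up front on hadoop1 (path taken from the value above):
mkdir -p /usr/local/xialei/hadoop/tmp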
3.8.2、Edit hdfs-site.xml
[root@hadoop1 hadoop]# vi hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express