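I. Environment preparation and basic configuration
1. Prepare the environment: three CentOS 7 virtual machines at 192.168.2.150, 192.168.2.151 and 192.168.2.152
2. Create the hadoop user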
useradd -d /home/hadoop -m hadoop
3. Set the hadoop user's password
passwd hadoop
4. Set the host name (name the three machines master, slave1 and slave2 respectively, running the command on each machine with its own name)
hostnamectl set-hostname master
5. Configure the hosts file: add the following lines to /etc/hosts on every node
192.168.2.150 master
192.168.2.151 slave1
192.168.2.152 slave2
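Because the same three entries have to be added on every node, it can help to make the step repeatable. The following is a minimal sketch, not part of the original procedure: the function name add_host_entry and the HOSTS_FILE override are hypothetical and exist only so the function can be tried on a scratch file before touching /etc/hosts (which requires root).

```shell
# Append "ip name" to a hosts file only if the name is not mapped yet.
# HOSTS_FILE is a hypothetical override for testing; default is /etc/hosts.
add_host_entry() {
    ip="$1"
    name="$2"
    file="${HOSTS_FILE:-/etc/hosts}"
    # -w matches whole words, so "slave1" does not match "slave10"
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage (as root, on every node):
#   add_host_entry 192.168.2.150 master
#   add_host_entry 192.168.2.151 slave1
#   add_host_entry 192.168.2.152 slave2
```

Running it a second time leaves the file unchanged, so it is safe to repeat on a node that was already configured.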
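5. Configure passwordless SSH login so that every pair of nodes can reach each other (run both commands on each node, and repeat ssh-copy-id for every host in the cluster)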
ssh-keygen -t rsa
ssh-copy-id uname@hostname
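In the command above, uname@hostname is a placeholder. The sketch below shows one way to repeat ssh-copy-id for all three hosts. It assumes the hadoop user created earlier and the host names master, slave1 and slave2; the function name distribute_key and the DRY_RUN switch are hypothetical additions for illustration.

```shell
# Print (DRY_RUN=1) or run ssh-copy-id for every node in the cluster.
# The local node is included because the Hadoop start scripts also
# connect to the local host over ssh.
distribute_key() {
    for host in master slave1 slave2; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id hadoop@$host"
        else
            ssh-copy-id "hadoop@$host"
        fi
    done
}

# Usage (as the hadoop user, after ssh-keygen, on each of the three nodes):
#   distribute_key
```

Setting DRY_RUN=1 first prints the commands without contacting any host, which is a quick way to check the host list.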
6. Install the JDK and configure its environment variables
7. Download the Hadoop distribution and unpack it in the working directory
cd /home/hadoop/work
tar -zxvf hadoop-2.8.3.tar.gz
mv hadoop-2.8.3 hadoop
8. Create the hdfs directories under the working directory
cd /home/hadoop/work
mkdir hdfs
cd hdfs
mkdir data name tmp
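9. Add the Hadoop environment variables to the system (for example in /etc/profile or the hadoop user's ~/.bashrc, then reload the file with source)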
export HADOOP_HOME=/home/hadoop/work/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
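The original does not say which file should hold these exports. One option, shown here only as a sketch, is a dedicated profile script. The function name write_hadoop_profile and the suggested target /etc/profile.d/hadoop.sh are assumptions, not part of the original steps.

```shell
# Write the Hadoop variables to a dedicated profile script.
# The target overwrites any file of that name, so do not point it at
# ~/.bashrc; /etc/profile.d/hadoop.sh (root required) is one possibility.
write_hadoop_profile() {
    target="$1"
    cat > "$target" <<'EOF'
export HADOOP_HOME=/home/hadoop/work/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
EOF
}

# Usage:
#   write_hadoop_profile /etc/profile.d/hadoop.sh
#   . /etc/profile.d/hadoop.sh
#   hadoop version    # should now resolve through PATH
```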
10. Point Hadoop at JAVA_HOME
Append the following line to the end of /home/hadoop/work/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
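II. Cluster configuration
Five configuration files are involved: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml and slaves, one for each component. They are located in /home/hadoop/work/hadoop/etc/hadoop/ and are described below:
File | Description |
---|---|
core-site.xml | Common component |
hdfs-site.xml | HDFS component |
mapred-site.xml | MapReduce component |
yarn-site.xml | YARN component |
slaves | List of slave nodes |
1. core-site.xml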
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/work/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
2. hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/work/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/work/hdfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
3. mapred-site.xml
The MapReduce configuration ships only as a template, mapred-site.xml.template; copy it to create the file:
cp mapred-site.xml.template mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<!-- Tell the MapReduce framework to run on YARN -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
4. yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>master:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
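With four XML files edited by hand, a quick way to read a value back is useful before the files are copied to the other nodes. The helper below is a rough sketch: get_property is a hypothetical name, and it assumes one tag per line as in the listings above, so it is not a general XML parser.

```shell
# Print the <value> that follows <name>KEY</name> in a Hadoop *-site.xml file.
# Assumes <name> and <value> sit on separate lines, value after name.
get_property() {
    file="$1"
    key="$2"
    awk -v key="$key" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}

# Usage (from /home/hadoop/work/hadoop/etc/hadoop):
#   get_property hdfs-site.xml dfs.replication
#   get_property yarn-site.xml yarn.resourcemanager.webapp.address
```

Once the Hadoop commands are on PATH, hdfs getconf -confKey dfs.replication reports the value Hadoop actually resolves, which is the more authoritative check.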
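5. Edit the slaves file and add the slave nodes
Remove the default localhost entry and replace it with the lines below. The slaves file lists the worker nodes so that the start scripts bring up the whole cluster together.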
slave1
slave2
6. Copy the configured hadoop directory to the other nodes
cd /home/hadoop/work
scp -r hadoop slave1:/home/hadoop/work/
scp -r hadoop slave2:/home/hadoop/work/
scp -r hdfs slave1:/home/hadoop/work/
scp -r hdfs slave2:/home/hadoop/work/
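The four scp commands follow one pattern, so they can be driven from the slaves file. The sketch below assumes the paths used in this guide; copy_to_slaves and the DRY_RUN switch are hypothetical names introduced for illustration.

```shell
# Print (DRY_RUN=1) or run the scp commands for every host in a slaves file.
copy_to_slaves() {
    slaves_file="$1"
    # one host per line; lines starting with # are skipped
    for host in $(grep -v '^#' "$slaves_file"); do
        for dir in hadoop hdfs; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "scp -r /home/hadoop/work/$dir $host:/home/hadoop/work/"
            else
                scp -r "/home/hadoop/work/$dir" "$host:/home/hadoop/work/"
            fi
        done
    done
}

# Usage (on master, as the hadoop user):
#   copy_to_slaves /home/hadoop/work/hadoop/etc/hadoop/slaves
```

Reading the host list from the slaves file keeps the copy step in line with the cluster definition if more workers are added later.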
7. Format the NameNode
hdfs namenode -format
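Formatting is meant to happen once: running it again on a NameNode that already holds metadata discards that metadata. A small guard can make the step safe to repeat. In this sketch, format_namenode_once is a hypothetical name, the default directory is the dfs.namenode.name.dir value from hdfs-site.xml above, and hdfs namenode -format is the non-deprecated spelling of the format command in Hadoop 2.x.

```shell
# Format the NameNode only when its metadata directory is not initialised.
# A formatted directory contains current/VERSION.
format_namenode_once() {
    name_dir="${1:-/home/hadoop/work/hdfs/name}"
    if [ -f "$name_dir/current/VERSION" ]; then
        echo "NameNode already formatted in $name_dir; skipping"
        return 0
    fi
    hdfs namenode -format
}

# Usage (on master, as the hadoop user):
#   format_namenode_once
```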
8. Start the Hadoop cluster
start-all.sh
The log output looks like this:
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
18/09/09 12:26:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
master: starting namenode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-datanode-slave1.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /home/hadoop/work/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
18/09/09 12:27:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/work/hadoop/logs/yarn-hadoop-resourcemanager-master.out
slave2: starting nodemanager, logging to /home/hadoop/work/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
slave1: starting nodemanager, logging to /home/hadoop/work/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
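The first line of the log says start-all.sh is deprecated and names start-dfs.sh and start-yarn.sh as the replacements. A minimal sketch of that alternative follows; the wrapper name start_cluster is hypothetical, and it only checks that both scripts are on PATH before calling them.

```shell
# Start HDFS first, then YARN, using the two scripts named in the
# deprecation message. Fails early if HADOOP_HOME/sbin is not on PATH.
start_cluster() {
    for script in start-dfs.sh start-yarn.sh; do
        if ! command -v "$script" >/dev/null 2>&1; then
            echo "$script not found on PATH; check HADOOP_HOME/sbin" >&2
            return 1
        fi
    done
    start-dfs.sh && start-yarn.sh
}

# Usage (on master, as the hadoop user):
#   start_cluster
```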
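9. Processes on each node (as listed by jps; QuorumPeerMain is a ZooKeeper process and is not started by Hadoop)
master node: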
1664 NameNode
4657 Jps
2021 ResourceManager
1865 SecondaryNameNode
1515 QuorumPeerMain
slave nodes:
2145 Jps
1156 DataNode
1100 QuorumPeerMain
1244 NodeManager
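Comparing jps output by eye is error-prone once there are more nodes. The sketch below reads jps output on standard input and prints any expected daemon that is absent; missing_daemons is a hypothetical helper, and the daemon names are the ones shown in the listings above.

```shell
# Read jps output on stdin and print every expected daemon that is missing.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <class name>", so compare against column two;
        # -x requires an exact match (NameNode is not SecondaryNameNode)
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx -- "$daemon"; then
            echo "$daemon"
        fi
    done
}

# Usage on master:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# Usage on a slave:
#   jps | missing_daemons DataNode NodeManager
```

No output means every listed daemon is running on that node.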
10. Open the web pages
Open the master web UI; it should show the other two nodes running normally.
http://master:50070/
View the cluster nodes: http://master:18088/cluster/nodes