Hadoop 2.6.1 Cluster Setup
Environment and Versions
- CentOS 7
- Java 8
- Hadoop 2.6.1
- Cluster nodes
  - master: 192.168.27.130
  - slave1: 192.168.27.131
  - slave2: 192.168.27.132
Unless noted otherwise, all of the following operations are performed on the master node.
Remove the pre-installed Java packages
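The steps below address the nodes by hostname (master, slave1, slave2). If name resolution is not already in place, one common option is to map the hostnames in /etc/hosts on all three machines; a minimal sketch using the IPs listed above:

# /etc/hosts (add on master, slave1 and slave2)
192.168.27.130 master
192.168.27.131 slave1
192.168.27.132 slave2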
[root@master src]# echo $JAVA_HOME
[root@master src]# rpm -qa | grep jdk
java-1.8.0-openjdk-headless-1.8.0.65-3.b17.el7.x86_64
java-1.7.0-openjdk-1.7.0.91-2.6.2.3.el7.x86_64
java-1.7.0-openjdk-headless-1.7.0.91-2.6.2.3.el7.x86_64
java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64
# Uninstall method 1 (recommended)
[root@master src] yum -y remove java-1.8.0-openjdk-headless-1.8.0.65-3.b17.el7.x86_64
# Uninstall method 2: remove every listed jdk package with rpm -e --nodeps
rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.65-3.b17.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.91-2.6.2.3.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.91-2.6.2.3.el7.x86_64
rpm -e --nodeps java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64
# Verify that Java has been removed
[root@master src]# java -version
bash: java: command not found...
Disable the firewall on all three machines
# Disable the system firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
setenforce 0
vim /etc/selinux/config
SELINUX=disabled
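Optionally, a quick check that both changes took effect (firewalld should report inactive, and getenforce should report Permissive, or Disabled after a reboot):

systemctl is-active firewalld
getenforce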
Download Hadoop and Java
- Hadoop download
- Java 8 download
# Extract the archives
tar -zxvf hadoop-2.6.1.tar.gz -C /usr/local/src
tar -zxvf jdk-8u251-linux-x64.tar.gz -C /usr/local/src
Set environment variables
- Open the configuration file

# Per-user environment variables (recommended)
vim ~/.bashrc
# System-wide environment variables (not recommended)
vim /etc/profile
- Append the following environment variables to the end of the file

export JAVA_HOME=/usr/local/src/jdk1.8.0_251
export HADOOP_HOME=/usr/local/src/hadoop-2.6.1
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

- Apply the changes
source ~/.bashrc
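A quick sanity check that the variables took effect; both commands should print version information:

java -version
hadoop version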
Configure passwordless SSH login
# On master
ssh-keygen -t rsa
cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys
# Pull the slaves' keys (run the ssh-keygen and cat commands below on slave1/slave2 first)
ssh slave1 cat /root/.ssh/authorized_keys >> /root/.ssh/authorized_keys
ssh slave2 cat /root/.ssh/authorized_keys >> /root/.ssh/authorized_keys
# On slave1 and slave2
ssh-keygen -t rsa
cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys
# Then pull the combined key list back from master
ssh master cat /root/.ssh/authorized_keys >> /root/.ssh/authorized_keys
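An optional check that passwordless login works in both directions; each command should print the remote hostname without asking for a password:

# From master
ssh slave1 hostname
ssh slave2 hostname
# From slave1 / slave2
ssh master hostname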
Modify the Hadoop configuration files (all of the files below live in /usr/local/src/hadoop-2.6.1/etc/hadoop/)
- slaves
vim slaves

slave1
slave2
- core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/src/hadoop-2.6.1/tmp/</value>
    </property>
</configuration>
- hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop-2.6.1/dfs/name/</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop-2.6.1/dfs/data/</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
- mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
- yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8035</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <!-- Disable the virtual memory check -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
</configuration>
- Create the tmp directory and the HDFS directories referenced in the configuration

# Temp directory
mkdir /usr/local/src/hadoop-2.6.1/tmp
# HDFS name and data directories
mkdir -p /usr/local/src/hadoop-2.6.1/dfs/name
mkdir -p /usr/local/src/hadoop-2.6.1/dfs/data

Distribute Hadoop to slave1 and slave2
scp -r /usr/local/src/hadoop-2.6.1 root@slave1:/usr/local/src/hadoop-2.6.1
scp -r /usr/local/src/hadoop-2.6.1 root@slave2:/usr/local/src/hadoop-2.6.1
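The JDK and the environment variables configured earlier also need to exist on slave1 and slave2. If they are not there yet, one option is to copy them over the same way; a sketch assuming the paths used above (jdk1.8.0_251 is the directory produced by the 8u251 archive):

scp -r /usr/local/src/jdk1.8.0_251 root@slave1:/usr/local/src/
scp -r /usr/local/src/jdk1.8.0_251 root@slave2:/usr/local/src/
scp ~/.bashrc root@slave1:~/
scp ~/.bashrc root@slave2:~/
# Then run "source ~/.bashrc" on each slave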
Start the cluster
- Initialize the NameNode. Note: the format operation only needs to be run once; formatting repeatedly will cause clusterID mismatch problems.
hadoop namenode -format
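If you ever do need to format again (for example after the clusterID mismatch mentioned above), the usual remedy is to stop the cluster and clear the data directories on every node before reformatting; a sketch using the directories created earlier:

# Run on master, slave1 and slave2 before reformatting
rm -rf /usr/local/src/hadoop-2.6.1/tmp/* /usr/local/src/hadoop-2.6.1/dfs/name/* /usr/local/src/hadoop-2.6.1/dfs/data/*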
- Start all daemons

/usr/local/src/hadoop-2.6.1/sbin/start-all.sh
# Since $HADOOP_HOME/sbin is already on the PATH, you can also simply run:
start-all.sh
- Check that the cluster is running
- master
[root@master ~]# jps
18126 ResourceManager
17975 SecondaryNameNode
21333 Jps
17798 NameNode
- slave1
[root@slave1 ~]# jps
13538 DataNode
13988 Jps
13638 NodeManager
- slave2
[root@slave2 ~]# jps
14516 DataNode
14944 Jps
14616 NodeManager
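You can also confirm from master that both DataNodes registered with the NameNode; the report should list two live datanodes:

hdfs dfsadmin -report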
- Start the history server (on master)
sbin/mr-jobhistory-daemon.sh start historyserver
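An optional check that the history server is up; jps should now also list a JobHistoryServer process:

jps | grep JobHistoryServer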
- Web UI (if your local machine has no hostname mapping for master, open ip:8088 instead)

http://master:8088

- HDFS shell
[root@master ~]# hadoop fs -ls /
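As a final sanity check, a few more basic HDFS operations; the /test path and the local file are just examples:

hadoop fs -mkdir -p /test
hadoop fs -put /etc/hosts /test/
hadoop fs -ls /test
hadoop fs -cat /test/hosts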
At this point, congratulations: your Hadoop cluster is up and running.