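Installing Hadoop 3.1.4 on three CentOS 7 servers
Plan
1. Three servers: 192.168.11.131-133 (hostnames hadoop1, hadoop2, hadoop3)
2. HDFS: NameNode on 192.168.11.131, SecondaryNameNode on 192.168.11.133
3. YARN: ResourceManager on 192.168.11.132
Prerequisites
1. The JDK is installed
2. Passwordless SSH is configured for the hadoop user, and also for root
Passwordless login reference: https://blog.csdn.net/sndayYU/article/details/115036682
3. The firewall is disabled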
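The configuration files in this guide refer to the nodes by name (hadoop1 to hadoop3), so each server needs to resolve those names. A minimal sketch, assuming the servers use the same name-to-address mapping that is configured for Windows later in this guide; `print_cluster_hosts` is a hypothetical helper name, not part of Hadoop:

```shell
# Print the hosts entries for the three nodes (addresses taken from the plan).
print_cluster_hosts() {
  i=1
  for ip in 192.168.11.131 192.168.11.132 192.168.11.133; do
    printf '%s hadoop%d\n' "$ip" "$i"
    i=$((i + 1))
  done
}

# Show the entries; nothing is written to disk here.
print_cluster_hosts

# On CentOS 7 the firewall is usually firewalld; it can be stopped with:
#   sudo systemctl stop firewalld && sudo systemctl disable firewalld
```

To apply the entries, the output could be appended on each server with `print_cluster_hosts | sudo tee -a /etc/hosts`, after checking that the names are not already present.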
I. Steps on server 192.168.11.131
1. Upload the installation package and extract it
[hadoop@localhost .ssh]$ rz
-bash: rz: command not found
[hadoop@localhost .ssh]$ sudo yum install lrzsz -y
[hadoop@localhost root]$ cd /home/hadoop
// Upload the installation package
[hadoop@localhost ~]$ rz
[hadoop@localhost ~]$ ls
hadoop-3.1.4.tar.gz
// Extract
[hadoop@localhost ~]$ tar zxf hadoop-3.1.4.tar.gz
// The archive is no longer needed and can be deleted
[hadoop@localhost ~]$ rm -f hadoop-3.1.4.tar.gz
[hadoop@localhost ~]$ cd /home/hadoop/hadoop-3.1.4/etc/hadoop/
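2. Edit the four configuration files
Configure each file using the matching default file in the jars below as a reference
<!-- core-site.xml: see core-default.xml in this jar -->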
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.1.4</version>
</dependency>
<!-- hdfs-site.xml: see hdfs-default.xml in this jar -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>3.1.4</version>
</dependency>
<!-- yarn-site.xml: see yarn-default.xml in this jar -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
<version>3.1.4</version>
</dependency>
<!-- mapred-site.xml: see mapred-default.xml in this jar -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>3.1.4</version>
</dependency>
2.1 core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:9000</value>
</property>
<!-- Base data directory; by default the NameNode and DataNode directories are created under it -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-3.1.4/data</value>
</property>
</configuration>
2.2 hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.http-address</name>
<value>hadoop1:9870</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop3:9868</value>
</property>
</configuration>
2.3 yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop2</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<!-- Raise the virtual-to-physical memory ratio; the default is 2.1, i.e. 2.1 GB of virtual memory per 1 GB of physical memory -->
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>10</value>
</property>
</configuration>
2.4 mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
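Before copying the files to the other nodes, it can help to read a few values back and confirm they are what was intended. The sketch below is a small text-based lookup, assuming one tag per line as in the listings above; it is not a general XML parser, and `get_prop` is a hypothetical helper name:

```shell
# Print the <value> that follows <name>KEY</name> in a Hadoop *-site.xml file.
# Usage: get_prop FILE KEY
get_prop() {
  awk -v key="$2" '
    # Remember that the requested property name was seen.
    index($0, "<name>" key "</name>") { found = 1; next }
    # The next <value> line belongs to that property.
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Example (run in /home/hadoop/hadoop-3.1.4/etc/hadoop):
#   get_prop core-site.xml fs.defaultFS
#   get_prop yarn-site.xml yarn.resourcemanager.hostname
```

Once the environment variables are set up, `hdfs getconf -confKey fs.defaultFS` gives the value as Hadoop itself resolves it, which is the more authoritative check.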
3. Configure workers
[hadoop@localhost hadoop]$ vim /home/hadoop/hadoop-3.1.4/etc/hadoop/workers
hadoop1
hadoop2
hadoop3
II. Copy to the other two servers
// This takes a while
scp -r /home/hadoop/hadoop-3.1.4 hadoop@hadoop2:/home/hadoop/
scp -r /home/hadoop/hadoop-3.1.4 hadoop@hadoop3:/home/hadoop/
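The two copies can also be written as a loop, which scales if more workers are added later. A minimal sketch; `distribute` and the `COPY_CMD` override are hypothetical names introduced here so the loop can be dry-run before anything is transferred:

```shell
# Command used for copying; set COPY_CMD="echo scp" to print instead of copy.
COPY_CMD=${COPY_CMD:-scp}

# Copy a directory into /home/hadoop/ on each listed host as the hadoop user.
# Usage: distribute SRC_DIR HOST...
distribute() {
  src=$1
  shift
  for host in "$@"; do
    # COPY_CMD is left unquoted on purpose so "echo scp" splits into two words.
    $COPY_CMD -r "$src" "hadoop@${host}:/home/hadoop/" || return 1
  done
}

# Example:
#   distribute /home/hadoop/hadoop-3.1.4 hadoop2 hadoop3
```

The loop stops at the first failed copy, so a node that is unreachable is noticed immediately rather than after the cluster fails to start.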
III. Configure environment variables
sudo vim /etc/profile.d/my.sh
# Contents:
# If JAVA_HOME is set in /etc/profile instead, startup reports that JAVA_HOME cannot be found
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.201.b09-2.el7_6.x86_64
export CLASSPATH=$JAVA_HOME/lib/*
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/home/hadoop/hadoop-3.1.4
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
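// As root, copy the file to the other two servers (or create the same file there)
scp /etc/profile.d/my.sh root@hadoop2:/etc/profile.d/
scp /etc/profile.d/my.sh root@hadoop3:/etc/profile.d/
// On all three servers, load the variables and verify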
[root@localhost hadoop-3.1.4]# source /etc/profile
[root@localhost hadoop-3.1.4]# hadoop version
Hadoop 3.1.4
Source code repository https://github.com/apache/hadoop.git -r 1e877761e8dadd71effef30e592368f7fe66a61b
Compiled by gabota on 2020-07-21T08:05Z
Compiled with protoc 2.5.0
From source with checksum 38405c63945c88fdf7a6fe391494799b
This command was run using /home/hadoop/hadoop-3.1.4/share/hadoop/common/hadoop-common-3.1.4.jar
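If the daemons still report that JAVA_HOME cannot be found when they are started over SSH, a common alternative is to set it in etc/hadoop/hadoop-env.sh, which the Hadoop start scripts read. A minimal sketch, assuming the same JDK path as in my.sh:

```shell
# Line to add to /home/hadoop/hadoop-3.1.4/etc/hadoop/hadoop-env.sh on every node.
# The JDK path is the one used in my.sh; adjust it to the local installation.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.201.b09-2.el7_6.x86_64
```

This keeps the setting inside the Hadoop directory, so it travels with the installation when the directory is copied to another node.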
IV. Configure hosts on Windows 10 (C:\Windows\System32\drivers\etc\hosts)
192.168.11.131 hadoop1
192.168.11.132 hadoop2
192.168.11.133 hadoop3
V. Start the cluster
On 131
// The NameNode must be formatted before the first start
[hadoop@hadoop1 hadoop-3.1.4]$ pwd
/home/hadoop/hadoop-3.1.4
[hadoop@hadoop1 hadoop-3.1.4]$ hdfs namenode -format
Start the cluster
1) On 131
[hadoop@hadoop1 hadoop-3.1.4]$ /home/hadoop/hadoop-3.1.4/sbin/start-dfs.sh
Starting namenodes on [hadoop1]
Starting datanodes
hadoop2: WARNING: /home/hadoop/hadoop-3.1.4/logs does not exist. Creating.
hadoop3: WARNING: /home/hadoop/hadoop-3.1.4/logs does not exist. Creating.
Starting secondary namenodes [hadoop3]
[hadoop@hadoop1 hadoop-3.1.4]$ jps
9282 NameNode
9411 DataNode
9671 Jps
[hadoop@hadoop1 hadoop-3.1.4]$
2) On 132
[hadoop@hadoop2 hadoop-3.1.4]$ /home/hadoop/hadoop-3.1.4/sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers
[hadoop@hadoop2 hadoop-3.1.4]$ jps
8801 ResourceManager
8929 NodeManager
8643 DataNode
9244 Jps
[hadoop@hadoop2 hadoop-3.1.4]$
VI. Verification
Verification 1
On 133
[hadoop@hadoop3 hadoop-3.1.4]$ jps
9042 DataNode
9124 SecondaryNameNode
9225 NodeManager
9324 Jps
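Reading the jps output by eye works for three nodes, but the check can also be scripted. The sketch below tests that each expected daemon appears in jps output read from standard input; `check_daemons` is a hypothetical helper name, and the expected daemons per node follow from the plan at the top of this guide:

```shell
# Succeed only if every daemon named in the arguments appears in the
# jps output supplied on standard input ("PID Name" per line).
check_daemons() {
  out=$(cat)
  for d in "$@"; do
    printf '%s\n' "$out" |
      awk -v d="$d" '$2 == d { ok = 1 } END { exit !ok }' ||
      { echo "missing: $d"; return 1; }
  done
  echo "all present"
}

# Example, once both start scripts have finished:
#   on hadoop1: jps | check_daemons NameNode DataNode NodeManager
#   on hadoop2: jps | check_daemons ResourceManager DataNode NodeManager
#   on hadoop3: jps | check_daemons SecondaryNameNode DataNode NodeManager
```

Matching on the second column avoids false positives such as NameNode matching inside SecondaryNameNode.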
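Check in the web UI (NameNode: http://hadoop1:9870, as set in hdfs-site.xml; ResourceManager: http://hadoop2:8088 by default)
Verification 2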
[hadoop@hadoop1 hadoop-3.1.4]$ pwd
/home/hadoop/hadoop-3.1.4
[hadoop@hadoop1 hadoop-3.1.4]$ mkdir test-data
[hadoop@hadoop1 hadoop-3.1.4]$ vim test-data/word.txt
hello world
hello hadoop
hello java
hello flink
// Upload the file
[hadoop@hadoop1 hadoop-3.1.4]$ hadoop fs -mkdir /test-data
[hadoop@hadoop1 hadoop-3.1.4]$ hadoop fs -put test-data/word.txt /test-data
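Check the uploaded file (in the NameNode web UI file browser, or with hadoop fs -cat /test-data/word.txt):
It is the file we uploaded, with the expected contents
Verification 3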
[hadoop@hadoop1 hadoop-3.1.4]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.4.jar wordcount /test-data /test-output
The job runs without errors and reports successful completion.
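To confirm that the job produced the right counts and not just a clean exit, the result can be compared with a local count of the same file. A minimal sketch using standard Unix tools; `local_wordcount` is a hypothetical helper name, and it assumes words are separated by spaces or tabs as in word.txt:

```shell
# Count whitespace-separated words in a file and print "word<TAB>count",
# sorted by word, the same layout the wordcount example writes.
local_wordcount() {
  tr -s ' \t' '\n' < "$1" |
    grep -v '^$' |
    sort |
    uniq -c |
    awk '{ print $2 "\t" $1 }'
}

# Example:
#   local_wordcount test-data/word.txt
```

For the word.txt above this prints hello with a count of 4 and flink, hadoop, java and world with a count of 1 each. The job's own result can be listed with `hadoop fs -ls /test-output` and read with `hadoop fs -cat`; with a single reducer the file is typically named part-r-00000. The output directory must not exist before the job starts, so /test-output has to be removed before a rerun.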