1. Download the Hadoop tarball via Xftp
2. Unpack it in Xshell:
tar -zxf hadoop-3.1.4.tar.gz -C /opt
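A quick check that the unpack worked (the directory name is assumed to match the tarball version):
ls /opt/hadoop-3.1.4
# expect bin, etc, sbin, share among the entries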
3. Edit the configuration files
Change into the Hadoop configuration directory:
cd /opt/hadoop-3.1.4/etc/hadoop
(1) vi hadoop-env.sh
Get the JDK path with echo $JAVA_HOME, then set it in the file:
export JAVA_HOME=/usr/java/jdk1.8.0_281-amd64
Also declare the daemon users (avoids startup errors later when running everything as root):
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
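If you would rather not edit the file by hand, the same lines can be appended with a heredoc (adjust JAVA_HOME to whatever echo $JAVA_HOME printed on your machine):
cat >> /opt/hadoop-3.1.4/etc/hadoop/hadoop-env.sh <<'EOF'
export JAVA_HOME=/usr/java/jdk1.8.0_281-amd64
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
EOF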
(2) vi core-site.xml
Add:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
  <!-- temporary file storage directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
</configuration>
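Before starting anything, hdfs getconf can read a key straight from these files to confirm the edit took; run it from the install's bin directory, since PATH is not set up until step 10 below:
/opt/hadoop-3.1.4/bin/hdfs getconf -confKey fs.defaultFS
# should print hdfs://master:8020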
(3) vi hdfs-site.xml
Add:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop/datanode</value>
  </property>
  <!-- block size: 134217728 bytes = 128 MB -->
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- HDFS permission checking; false turns it off -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <!-- NameNode web UI address -->
  <property>
    <name>dfs.namenode.http-address</name>
    <value>master:50070</value>
  </property>
</configuration>
(4) vi mapred-site.xml
Add (note the paths are case-sensitive: hadoop-3.1.4, not Hadoop-3.1.4):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.1.4</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/opt/hadoop-3.1.4/share/hadoop/mapreduce/*:/opt/hadoop-3.1.4/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>
(5) vi yarn-site.xml
Add:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <!-- disable the virtual-memory check; otherwise tasks fail on low-memory VMs -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- a task on YARN asks for at least 1.5 GB of memory by default; lower this value
       if the VM does not have that much, otherwise jobs will fail -->
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
  </property>
</configuration>
(6) vi workers
Replace the contents with:
node1
node2
node3
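Equivalently, the file can be written in one step from etc/hadoop:
cat > workers <<'EOF'
node1
node2
node3
EOF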
4. Create the data and temp directories
Master node:
mkdir -p /data/hadoop/tmp
mkdir -p /data/hadoop/namenode
Other nodes:
mkdir -p /data/hadoop/tmp
mkdir -p /data/hadoop/datanode
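Assuming passwordless SSH from the master is already in place (start-dfs.sh needs it anyway), the worker-side directories can be created from the master in one loop (note the workers get a datanode directory, not a namenode one):
for n in node1 node2 node3; do
  ssh $n "mkdir -p /data/hadoop/tmp /data/hadoop/datanode"
done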
5. Copy the installation to every node (this can take a while; be patient)
scp -r /opt/hadoop-3.1.4/ node1:/opt/
scp -r /opt/hadoop-3.1.4/ node2:/opt/
scp -r /opt/hadoop-3.1.4/ node3:/opt/
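The same three copies as a loop, if that is easier to keep in a script:
for n in node1 node2 node3; do
  scp -r /opt/hadoop-3.1.4/ $n:/opt/
done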
6. Format HDFS
On the master:
cd /opt/hadoop-3.1.4/bin
./hdfs namenode -format
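The command prints a lot of log output; on a first format (before any interactive re-format prompt exists), the line that matters can be fished out like this:
./hdfs namenode -format 2>&1 | grep -i "successfully formatted"
# expect: Storage directory /data/hadoop/namenode has been successfully formatted.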
7. Start the cluster
(from the sbin directory of hadoop-3.1.4)
[root@master sbin]#
./start-dfs.sh
./start-yarn.sh
./mr-jobhistory-daemon.sh start historyserver
-> then check the running daemons with jps on every node
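With this layout, jps should show roughly the following (PIDs will differ):
On master:
NameNode
SecondaryNameNode
ResourceManager
JobHistoryServer
Jps
On node1/node2/node3:
DataNode
NodeManager
Jps
The web UIs are another quick check: http://master:50070 (HDFS, per dfs.namenode.http-address above) and http://master:8088 (YARN ResourceManager, the default port).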
8. Disable the firewall
(on all nodes; ideally done before cloning the VMs)
systemctl status firewalld.service
systemctl stop firewalld.service && systemctl disable firewalld.service
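A sketch for applying this to all four machines from the master in one loop, again assuming passwordless SSH:
for n in master node1 node2 node3; do
  ssh $n "systemctl stop firewalld.service && systemctl disable firewalld.service"
done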
9. Map the node names on the host machine
Edit the hosts file on the Windows host:
C:\Windows\System32\drivers\etc\hosts
Add:
192.168.216.130 master master.centos.com
192.168.216.131 node1 node1.centos.com
192.168.216.132 node2 node2.centos.com
192.168.216.133 node3 node3.centos.com
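Once saved, the names should resolve from the Windows host; a quick check from a command prompt:
ping master
# replies should come from 192.168.216.130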
10. Configure the Hadoop environment variables
(on all nodes)
vi /etc/profile
export HADOOP_HOME=/opt/hadoop-3.1.4
export PATH=$PATH:$HADOOP_HOME/bin
source /etc/profile
To check the resulting paths:
echo $HADOOP_HOME
echo $PATH
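If the variables took effect, the hadoop binary resolves from any directory:
hadoop version
# first line of output: Hadoop 3.1.4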
11. Stop the cluster
On the master:
cd /opt/hadoop-3.1.4/sbin
Mind the shutdown order:
./stop-dfs.sh
./stop-yarn.sh
./mr-jobhistory-daemon.sh stop historyserver
poweroff    # powers the machine off