On the host, upload the Hadoop installation package into /opt/software inside the master container.
Enter the master container and extract the Hadoop package to /opt/module.
Extract the Hadoop archive
Command:
tar -zxvf /opt/software/hadoop-2.7.7.tar.gz -C /opt/module/
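The tar flags can be tried safely on a throwaway archive first. This sketch (all paths and names under the temp directory are illustrative, not the real package) builds a dummy hadoop-2.7.7 tarball and extracts it the same way: -z gunzips, -x extracts, -v lists files, -f names the archive, and -C sets the target directory.

```shell
# Sketch with a dummy archive; everything lives in a temp directory.
work=$(mktemp -d)
mkdir -p "$work/hadoop-2.7.7"
echo demo > "$work/hadoop-2.7.7/README"
# Build a stand-in for hadoop-2.7.7.tar.gz
tar -zcf "$work/pkg.tar.gz" -C "$work" hadoop-2.7.7
mkdir -p "$work/module"
# Same flags as the step above: gunzip (-z), extract (-x), verbose (-v),
# from this file (-f), into this directory (-C)
tar -zxvf "$work/pkg.tar.gz" -C "$work/module"
ls "$work/module"        # -> hadoop-2.7.7
rm -rf "$work"
```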
Rename it to hadoop
Command:
mv /opt/module/hadoop-2.7.7 /opt/module/hadoop
Configure the Hadoop environment variables (note: edits to /etc/profile apply system-wide, to all users)
Command:
vi /etc/profile
Add the following:
export HADOOP_HOME=/opt/module/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Reload the environment variables so the changes take effect in the current shell
Command:
source /etc/profile
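As a quick illustration of what the two export lines do, this subshell sketch sets the variables and prints the two new PATH entries; nothing outside the parentheses is modified, so it is safe to run anywhere.

```shell
# Demonstrate the exports in an isolated subshell.
(
  export HADOOP_HOME=/opt/module/hadoop
  export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
  echo "HADOOP_HOME=$HADOOP_HOME"
  # The last two PATH entries are now Hadoop's bin (user commands)
  # and sbin (daemon start/stop scripts):
  echo "$PATH" | tr ':' '\n' | tail -n 2
)
```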
Check the Hadoop version
Command:
hadoop version
Configure hadoop-env.sh
Command:
vi /opt/module/hadoop/etc/hadoop/hadoop-env.sh
Add the following:
export JAVA_HOME=/opt/module/jdk1.8.0_212   # path to your JDK installation
Configure core-site.xml
Command:
vi /opt/module/hadoop/etc/hadoop/core-site.xml
Add the following:
<property>
  <!-- URL of the NameNode (required) -->
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>
<property>
  <!-- Hadoop temporary-file directory (optional) -->
  <name>hadoop.tmp.dir</name>
  <value>/opt/module/hadoop/dfs/tmp</value>
</property>
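Note that Hadoop only reads properties placed inside the file's <configuration> root element, so a complete core-site.xml would look like the sketch below; the same applies to hdfs-site.xml, mapred-site.xml, and yarn-site.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop/dfs/tmp</value>
  </property>
</configuration>
```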
Configure hdfs-site.xml
Command:
vi /opt/module/hadoop/etc/hadoop/hdfs-site.xml
Add the following:
<property>
  <!-- Number of block replicas, default 3 (required) -->
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <!-- Local filesystem directory where the NameNode stores the namespace and transaction logs (required) -->
  <name>dfs.namenode.name.dir</name>
  <value>/opt/module/hadoop/dfs/name</value>
</property>
<property>
  <!-- Local filesystem directory where the DataNode stores its blocks (required) -->
  <name>dfs.datanode.data.dir</name>
  <value>/opt/module/hadoop/dfs/data</value>
</property>
Configure mapred-site.xml
Commands:
cp /opt/module/hadoop/etc/hadoop/mapred-site.xml.template /opt/module/hadoop/etc/hadoop/mapred-site.xml
vi /opt/module/hadoop/etc/hadoop/mapred-site.xml
Add the following:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
Configure yarn-site.xml
Command:
vi /opt/module/hadoop/etc/hadoop/yarn-site.xml
Add the following:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>master:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>master:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master:8031</value>
</property>
<!-- Disable the virtual-memory check -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
Configure slaves
Command:
vi /opt/module/hadoop/etc/hadoop/slaves
Add the following (one hostname per line):
master
slave1
slave2
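If you prefer not to edit the file interactively, the same content can be written with a here-document. This sketch writes to a temporary file so it can be run anywhere; on the master the target would be /opt/module/hadoop/etc/hadoop/slaves instead.

```shell
# Demo against a temp file; on the real cluster the target would be
# /opt/module/hadoop/etc/hadoop/slaves
slaves=$(mktemp)
cat > "$slaves" <<'EOF'
master
slave1
slave2
EOF
cat "$slaves"    # one worker hostname per line
rm -f "$slaves"
```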
Distribute the files to slave1 and slave2
Commands:
scp -r /opt/module/hadoop slave1:/opt/module/
scp -r /opt/module/hadoop slave2:/opt/module/
scp /etc/profile slave1:/etc/profile
scp /etc/profile slave2:/etc/profile
(Copy the profile to /etc/profile on each slave so it takes effect there, then run source /etc/profile on each slave.)
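The copies follow the same pattern for each slave, so they can be expressed as a loop. This dry-run sketch only prints the commands it would run; remove the echo to execute them (passwordless SSH from master to the slaves is assumed, and note the profile is sent to /etc/profile so it actually takes effect on the slaves).

```shell
# Dry run: prints the distribution commands instead of executing them.
for host in slave1 slave2; do
  echo "scp -r /opt/module/hadoop $host:/opt/module/"
  echo "scp /etc/profile $host:/etc/profile"
done
```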
Format the NameNode
Command:
hdfs namenode -format
Last ten lines of output:
21/10/19 01:18:21 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1164747633-192.168.222.201-1634577501149
21/10/19 01:18:21 INFO common.Storage: Storage directory /opt/module/hadoop/dfs/name has been successfully formatted.
21/10/19 01:18:21 INFO namenode.FSImageFormatProtobuf: Saving image file /opt/module/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
21/10/19 01:18:21 INFO namenode.FSImageFormatProtobuf: Image file /usr/local/src/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 320 bytes saved in 0 seconds.
21/10/19 01:18:21 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
21/10/19 01:18:21 INFO util.ExitUtil: Exiting with status 0
21/10/19 01:18:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.222.201
************************************************************/
Start the cluster
Command:
start-all.sh