Machines:
10.211.55.67 master
10.211.55.68 slave1
10.211.55.69 slave2
Extract hadoop-3.2.1 to /home
Configure environment variables:
#hadoop
export HADOOP_HOME=/home/hadoop-3.2.1
export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
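To make these variables permanent, the same lines can be appended to the shell profile on every node. A minimal sketch (the snippet file name is illustrative; in practice append to /etc/profile or ~/.bashrc and re-login or source it):

```shell
# Write the Hadoop variables to a snippet file and load it into this shell.
cat > hadoop-profile.snippet <<'EOF'
#hadoop
export HADOOP_HOME=/home/hadoop-3.2.1
export PATH=$PATH:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin
EOF
. ./hadoop-profile.snippet
echo "$HADOOP_HOME"   # should print /home/hadoop-3.2.1
```

After sourcing, the hadoop and start-*.sh commands resolve from the new PATH.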
Go into hadoop-3.2.1/etc/hadoop
Configure core-site.xml (each <property> block below belongs inside the file's <configuration> root element):
<!-- NameNode host and HDFS port -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020</value>
</property>
<!-- Path of the tmp directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/data/tmp</value>
</property>
Configure hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>slave2:50090</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/data/dfs/datanode</value>
</property>
Configure yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>slave1</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>106800</value>
</property>
Configure mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
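Once all four files are edited, a quick well-formedness check catches stray or unclosed tags, a common cause of daemons failing to start. A sketch using xmllint from libxml2, shown on a generated sample file; point it at the real files under $HADOOP_HOME/etc/hadoop:

```shell
# Create a minimal sample core-site.xml, then validate that it is well-formed.
cat > core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
</configuration>
EOF
if command -v xmllint >/dev/null 2>&1; then
  xmllint --noout core-site.xml && echo "core-site.xml is well-formed"
else
  echo "xmllint not installed; skipping check"
fi
```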
cd /home/hadoop-3.2.1/etc/hadoop
vi workers
Set its contents to:
master
slave1
slave2
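The edit above can be done in one step — a sketch that overwrites the workers file (run from inside hadoop-3.2.1/etc/hadoop; listing master here means master also runs a DataNode and NodeManager):

```shell
# Overwrite the workers file with the three node hostnames.
cat > workers <<'EOF'
master
slave1
slave2
EOF
wc -l < workers   # should print 3
```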
Distribute master's configured hadoop-3.2.1 package to the other nodes.
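A hedged sketch of the distribution step, assuming passwordless SSH as root to each worker (hostnames from the /etc/hosts mapping above). It writes the copy commands to a plan file first so they can be reviewed before running:

```shell
# Generate one scp command per worker; review, then execute with: sh scp-plan.txt
for host in slave1 slave2; do
  echo "scp -r /home/hadoop-3.2.1 root@${host}:/home/"
done > scp-plan.txt
cat scp-plan.txt
```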
On master, run:
hdfs namenode -format
Start the cluster: cd /home/hadoop-3.2.1/sbin
./start-all.sh
Start YARN (if not using start-all.sh, which starts it already):
./start-yarn.sh
JobHistory server (for job logs):
./mr-jobhistory-daemon.sh start historyserver
(this script is deprecated in Hadoop 3.x; mapred --daemon start historyserver is the current equivalent)
HDFS web UI (NameNode):
http://10.211.55.67:9870/
YARN web UI (ResourceManager, on slave1):
http://10.211.55.68:8088/
JobHistory web UI:
http://10.211.55.67:19888/
Problems encountered
Problem 1
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [DW1]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
In /home/hadoop-3.2.1/sbin, add the following parameters at the top of both start-dfs.sh and stop-dfs.sh:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Add the following parameters at the top of start-yarn.sh and stop-yarn.sh:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
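An alternative (not from the original notes, but supported in Hadoop 3.x): the same daemon-user variables can be set once in etc/hadoop/hadoop-env.sh, which all the start/stop scripts read, instead of editing four sbin scripts. A sketch shown on a local copy of the file:

```shell
# Append the daemon-user variables to hadoop-env.sh (shown on a local copy;
# the real file is $HADOOP_HOME/etc/hadoop/hadoop-env.sh).
cat >> hadoop-env.sh <<'EOF'
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
EOF
grep -c '_USER=root' hadoop-env.sh   # should print 5
```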
Problem 2
JAVA_HOME is not set and could not be found
Set JAVA_HOME in hadoop-env.sh:
export JAVA_HOME=/home/jdk-
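If the exact JDK path is unknown, a candidate can be derived from the java binary on PATH — a small helper sketch (assumes GNU readlink on Linux; verify the result before pasting it into hadoop-env.sh):

```shell
# Resolve the java binary through its symlinks, then strip /bin/java
# to get a JAVA_HOME candidate.
if command -v java >/dev/null 2>&1; then
  java_home_candidate=$(dirname "$(dirname "$(readlink -f "$(command -v java)")")")
else
  java_home_candidate="java not found on PATH"
fi
echo "JAVA_HOME candidate: $java_home_candidate"
```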