Set up passwordless SSH between all machines (https://blog.csdn.net/FromTheWind/article/details/89887812)
Install the JDK (not covered here)
Install ZooKeeper
Download hadoop-3.1.2.tar.gz and copy it to the machine
Extract it: tar zxf hadoop-3.1.2.tar.gz
Configure the user's environment variables: sed -i '$a\\nexport HADOOP_HOME=/home/hadoop/env/hadoop-3.1.2\n\nexport PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' ~/.bash_profile;source ~/.bash_profile
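For readability, the sed one-liner above simply appends these two lines to ~/.bash_profile:

```shell
# Appended by the sed command; picked up by `source ~/.bash_profile`
export HADOOP_HOME=/home/hadoop/env/hadoop-3.1.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```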
Edit hadoop-env.sh (these variables ship commented out; uncomment and set them):
export JAVA_HOME=/opt/jdk1.8.0_171
export HADOOP_PID_DIR=/home/hadoop/data/pids
export HADOOP_LOG_DIR=/home/hadoop/data/logs/hadoop/hdfs
Edit core-site.xml: add the configuration properties (see the official documentation)
Edit hdfs-site.xml: add the configuration properties (see the official documentation)
Edit mapred-site.xml: add the configuration properties (see the official documentation)
Edit yarn-site.xml: add the configuration properties (see the official documentation)
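The notes defer to the official docs for the actual properties. As a minimal sketch for an HA layout like this one (NameNodes on hadoop1/hadoop2, ZooKeeper on hadoop3-5): the nameservice name `mycluster`, the ports, and the local paths below are assumptions, not taken from the original notes; the hostnames come from the steps that follow.

```xml
<!-- core-site.xml: sketch; "mycluster" and hadoop.tmp.dir are assumed values -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop3:2181,hadoop4:2181,hadoop5:2181</value>
  </property>
</configuration>
```

```xml
<!-- hdfs-site.xml: sketch of the HA essentials; assumes JournalNodes on hadoop3-5 -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hadoop1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hadoop2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop3:8485;hadoop4:8485;hadoop5:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
```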
Configure the DataNode hosts:
workers #equivalent to the old slaves file; starting with Hadoop 3, DataNodes are listed in the workers file, not slaves
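For the layout in these notes (DataNodes on hadoop3-5, per the process listing at the end), $HADOOP_HOME/etc/hadoop/workers would contain one hostname per line:

```
hadoop3
hadoop4
hadoop5
```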
Distribute the configured hadoop-3.1.2 tree to the other nodes:
scp -qr hadoop-3.1.2 hadoop2:/home/hadoop/env/
scp -qr hadoop-3.1.2 hadoop3:/home/hadoop/env/
scp -qr hadoop-3.1.2 hadoop4:/home/hadoop/env/
scp -qr hadoop-3.1.2 hadoop5:/home/hadoop/env/
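The four scp commands can equivalently be written as one loop; shown here as a dry run that only prints each command (drop the `echo` to actually copy):

```shell
# Dry run: print the distribution command for each target host.
for h in hadoop2 hadoop3 hadoop4 hadoop5; do
  echo scp -qr hadoop-3.1.2 "$h:/home/hadoop/env/"
done
```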
Create a symlink: cd /home/hadoop/env;ln -s hadoop-3.1.2 hadoop
Format the NameNode and sync the standby NameNode:
~/env/hadoop/sbin/hadoop-daemon.sh start journalnode #first start the journalnode process on every JournalNode host (this script is deprecated in Hadoop 3; `hdfs --daemon start journalnode` is the current form)
~/env/hadoop/bin/hdfs namenode -format #format the namenode (on the first NameNode only)
~/env/hadoop/bin/hdfs zkfc -formatZK #initialize the HA zkfc state in ZooKeeper
~/env/hadoop/bin/hdfs namenode #start this namenode in the foreground
~/env/hadoop/bin/hdfs namenode -bootstrapStandby #run on the other NameNode to sync its metadata; this could be copied manually, the command is used here
~/env/hadoop/sbin/hadoop-daemon.sh stop journalnode #Ctrl+C the foreground namenode, then stop the journalnodes
~/env/hadoop/sbin/start-dfs.sh #start HDFS: namenodes, datanodes, journalnodes, zkfc (step implied by the process listing below)
~/env/hadoop/sbin/start-yarn.sh #start YARN: resourcemanagers and nodemanagers
Check the processes running on each node:
jps;echo ''
ssh hadoop2 'source /home/hadoop/.bash_profile;jps';echo ''
ssh hadoop3 'source /home/hadoop/.bash_profile;jps';echo ''
ssh hadoop4 'source /home/hadoop/.bash_profile;jps';echo ''
ssh hadoop5 'source /home/hadoop/.bash_profile;jps';echo ''
hadoop1:
DFSZKFailoverController
NameNode
ResourceManager
hadoop2:
DFSZKFailoverController
NameNode
ResourceManager
hadoop3:
QuorumPeerMain
JournalNode
DataNode
hadoop4:
JournalNode
DataNode
QuorumPeerMain
NodeManager
hadoop5:
DataNode
QuorumPeerMain
NodeManager