Refer to the offline documentation under the share/doc directory inside the Hadoop distribution.
Note: a distributed setup needs at least two hosts; here I use three: master, slave1, and slave2.
-
Unpack, rename, and set environment variables
1. tar -zxvf <archive> (the JDK and Hadoop tarballs)
2. mv <extracted-dir> <new-name> (rename)
3. vi /etc/profile (takes effect for all users) or vi ~/.bash_profile (current user only), and add:
   export JAVA_HOME=/usr/local/src/jdk
   export HADOOP_HOME=/usr/local/src/hadoop
   export PATH=$JAVA_HOME/bin:$PATH
   export PATH=$HADOOP_HOME/bin:$PATH
   export PATH=$HADOOP_HOME/sbin:$PATH
4. Reload the environment variables: source ~/.bash_profile
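The environment-variable step above can also be done non-interactively with a heredoc instead of vi. A minimal sketch (the install paths are the ones from the steps above; a temp file stands in for ~/.bash_profile here just to keep the sketch side-effect free):

```shell
# Sketch of step 3: append the exports in one pass.
# A temp file stands in for ~/.bash_profile so running this changes nothing permanent.
profile=$(mktemp)
cat >> "$profile" <<'EOF'
export JAVA_HOME=/usr/local/src/jdk
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
EOF
# Reload ("." is the portable spelling of "source"), then confirm it took effect
. "$profile"
echo "$JAVA_HOME"
```

After reloading, `java -version` and `hadoop version` are the usual sanity checks that the PATH entries resolve.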
-
hadoop-env.sh
1. cd into hadoop/etc/hadoop
2. vi hadoop-env.sh and change the line to: export JAVA_HOME=/usr/local/src/jdk
-
core-site.xml
vi core-site.xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/src/hadoop/tmp</value>
</property>
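For reference, these <property> blocks (here and in the other *-site.xml files below) must sit inside the file's <configuration> root element. A complete core-site.xml with the two settings above would look like:

```xml
<?xml version="1.0"?>
<configuration>
    <!-- Default filesystem: the NameNode RPC address on master -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <!-- Base directory for Hadoop's temporary/runtime files -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/src/hadoop/tmp</value>
    </property>
</configuration>
```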
-
hdfs-site.xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
-
mapred-site.xml
cp mapred-site.xml.template mapred-site.xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
-
yarn-site.xml
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
-
Edit slaves
1. vi slaves, and list the worker hostnames one per line:
   master
   slave1
   slave2
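Since the slaves file is just one hostname per line, it can also be written in one shot. A sketch (a temp path stands in for hadoop/etc/hadoop/slaves so this runs anywhere):

```shell
# Sketch: generate the slaves file, one worker hostname per line.
# A temp path stands in for hadoop/etc/hadoop/slaves in this sketch.
slaves_file=$(mktemp)
cat > "$slaves_file" <<'EOF'
master
slave1
slave2
EOF
wc -l < "$slaves_file"   # 3 lines, one per node
```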
-
Distribute the configuration to the other nodes
1. scp -r /usr/local/src/hadoop slave1:/usr/local/src (repeat for slave2)
2. scp ~/.bash_profile slave1:~ (repeat for slave2)
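Copying to each slave by hand is easy to get wrong (e.g. a typo in the hostname), so a loop over the workers is safer. Shown as a dry run, with each scp prefixed by echo so nothing is actually transferred; drop the echo to copy for real (assumes passwordless ssh to slave1 and slave2 is already set up):

```shell
# Dry-run sketch: print the distribution commands for every worker node.
# Remove the leading "echo" on each scp to actually copy.
for host in slave1 slave2; do
    echo scp -r /usr/local/src/hadoop "$host":/usr/local/src
    echo scp ~/.bash_profile "$host":~
done > /tmp/dist_cmds.txt
cat /tmp/dist_cmds.txt
```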
-
Format the NameNode and start the cluster
1. hdfs namenode -format (on master only, once)
2. cd hadoop/sbin, then start-all.sh / stop-all.sh
3. Check the NameNode and ResourceManager web UIs: master-IP:50070 and master-IP:8088