Upload the Hadoop tarball
Extract it and check that it works
tar zxvf hadoop-2.6.4.tar.gz
*extract the archive*
rm hadoop-2.6.4.tar.gz
*remove the tarball*
cd /opt/hadoop-2.6.4/bin
*enter the bin directory of the extracted package*
./hadoop version
*check that this build runs*
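On 2.6.4 the output should begin with a line like the following (build details will vary):
Hadoop 2.6.4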
Add environment variables so hadoop is available everywhere
vi ~/.bashrc
*no sudo is needed for your own .bashrc; in vi, append the following variables at the bottom*
export HADOOP_HOME=/opt/hadoop-2.6.4
export PATH=$PATH:$HADOOP_HOME/bin
source ~/.bashrc
*reload so the variables take effect*
hadoop version
*test: if no error is reported, it worked*
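As an extra check that the PATH change took effect:
which hadoop
*should print /opt/hadoop-2.6.4/bin/hadoop*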
Master:
Send the extracted Hadoop directory from master to slave0 and slave1
cd /opt/
sudo scp -r hadoop-2.6.4 hadoop@slave0:/opt/
sudo scp -r hadoop-2.6.4 hadoop@slave1:/opt/
scp ~/.bashrc hadoop@slave0:~/.bashrc
scp ~/.bashrc hadoop@slave1:~/.bashrc
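These copies assume passwordless ssh from master to both slaves is already in place (it is on the checklist at the end); a quick way to confirm:
ssh hadoop@slave0 hostname
ssh hadoop@slave1 hostname
*each command should print the slave's hostname without asking for a password*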
slave0, slave1:
Apply:
source ~/.bashrc
Test:
hadoop version
Fix ownership, so the hadoop user owns /opt (if the scp above failed with permission denied, do this first on each slave and rerun it):
cd /
sudo chown hadoop:hadoop opt/
Cluster setup:
Modify core-site.xml
1. Enter the configuration directory:
cd /opt/hadoop-2.6.4/etc/hadoop
2. Edit core-site.xml
vi core-site.xml
(All the <property> blocks shown below go inside the file's <configuration> ... </configuration> root element; the same applies to hdfs-site.xml, mapred-site.xml and yarn-site.xml.)
<!--Point clients at the NameNode, the master of HDFS-->
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
3. Edit hdfs-site.xml
vi hdfs-site.xml
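<!--DataNode data directory-->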
<property>
<name>dfs.data.dir</name>
<value>/opt/dfs/data</value>
</property>
<!--NameNode metadata directory-->
<property>
<name>dfs.name.dir</name>
<value>/opt/dfs/name</value>
</property>
<!--number of block replicas HDFS keeps-->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>master:50090</value>
</property>
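The format step later creates the NameNode directory itself, but it does no harm to pre-create both storage directories on every node (assuming the same /opt/dfs layout on all three machines):
mkdir -p /opt/dfs/name /opt/dfs/data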
4. Make a copy of the mapred-site.xml.template template file:
cp mapred-site.xml.template mapred-site.xml
5. Edit mapred-site.xml
vi mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
<!--tell hadoop to run MapReduce on YARN from now on-->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
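Note that with mapreduce.jobhistory.address set, the JobHistory server is not started by start-all.sh; in Hadoop 2.6 it can be launched separately from sbin once the cluster is up:
cd /opt/hadoop-2.6.4/sbin
./mr-jobhistory-daemon.sh start historyserver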
6. Edit yarn-site.xml
vi yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<!--fetch map output through the shuffle service-->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
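With log aggregation enabled, the logs of a finished application can later be fetched with the yarn CLI (the application id comes from the web UI or from yarn application -list; shown here as a placeholder):
yarn logs -applicationId <applicationId>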
7. Configure hadoop-env.sh
vi hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_101
8. Configure mapred-env.sh
vi mapred-env.sh
export JAVA_HOME=/opt/jdk1.8.0_101
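A quick sanity check that the hard-coded JDK path really exists on this machine:
/opt/jdk1.8.0_101/bin/java -version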
9. Configure the slaves file, which sets which machines act as slaves:
vi slaves
Delete the original localhost entry
and add:
slave0
slave1
10. Configure masters
vi masters
master
11. After the configuration on master succeeds, transfer the whole
hadoop directory from master to slave0 and slave1
Master:
cd /opt
sudo scp -r hadoop-2.6.4 hadoop@slave0:/opt/
sudo scp -r hadoop-2.6.4 hadoop@slave1:/opt/
scp ~/.bashrc hadoop@slave0:~/.bashrc
scp ~/.bashrc hadoop@slave1:~/.bashrc
slave0:
source ~/.bashrc
slave1:
source ~/.bashrc
Test:
slave0:
hadoop version
slave1:
hadoop version
Seeing the version number means everything is OK.
Start the cluster:
1. Format the NameNode (this initializes the HDFS metadata; it does not actually format a disk):
hadoop namenode -format
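The exact log output differs between runs, but a successful format prints a line similar to this near the end:
INFO common.Storage: Storage directory /opt/dfs/name has been successfully formatted.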
2. Start the cluster:
cd /opt/hadoop-2.6.4/sbin
./start-all.sh
Check the processes with jps.
If there is no NameNode, the start-up failed:
./stop-all.sh
Inspection showed that core-site.xml had been written incorrectly (the stray extra </property> corrected in the listing above); after fixing it, copy the corrected file to both slaves:
cd /opt/hadoop-2.6.4/etc/hadoop
scp core-site.xml hadoop@slave0:/opt/hadoop-2.6.4/etc/hadoop/
scp core-site.xml hadoop@slave1:/opt/hadoop-2.6.4/etc/hadoop/
Test again:
hadoop namenode -format
cd /opt/hadoop-2.6.4/sbin
./start-all.sh
Master, slave0, slave1:
jps
If master has the NameNode process,
and slave0 and slave1 each have the DataNode process, the cluster started successfully.
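For reference, with this configuration a healthy cluster typically shows roughly the following jps entries (process ids omitted; JobHistoryServer appears only if it was started separately):
Master: NameNode, SecondaryNameNode, ResourceManager, Jps
slave0 / slave1: DataNode, NodeManager, Jps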
You can verify again from a browser on a client machine:
http://masterIP:50070
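The YARN ResourceManager also has a web UI, by default on port 8088:
http://masterIP:8088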
Finally, a summary of common problems:
1. Mismatched clusterIDs.
Master:
cd /opt/dfs/name/current
cat VERSION
slave0, slave1:
cd /opt/dfs/data/current
cat VERSION
Compare the clusterID on master with the ones on the slaves; matching IDs mean the nodes belong to the same cluster. If they do not match:
Master:
cd /opt/hadoop-2.6.4/sbin
./stop-dfs.sh
Master, slave0, slave1:
rm -rf /opt/dfs
Master:
hdfs namenode -format (or hadoop namenode -format)
Restart the cluster:
./start-dfs.sh
2. Check that the firewall is off on all three machines:
firewall-cmd --state
systemctl stop firewalld.service
systemctl disable firewalld.service
3. Things that must already be in place before all of the above: firewall disabled, IP addresses set, hostnames set, passwordless ssh login configured, JDK installed, Hadoop installed, and /etc/hosts filled in.
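A quick one-liner for the clusterID comparison in point 1 (adjust the path for the name vs data directory):
grep clusterID /opt/dfs/name/current/VERSION
*on master*
grep clusterID /opt/dfs/data/current/VERSION
*on slave0 and slave1*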