1 Download Hadoop 2.6.0, unpack it into the directory /app/hadoop, and rename the folder to hadoop260
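The download and rename step can be sketched as shell commands. This is a sketch, not the author's exact commands; the Apache archive URL is an assumption (any mirror carrying hadoop-2.6.0.tar.gz will do):

```shell
# Sketch of the download/unpack step; the mirror URL is an assumption.
mkdir -p /app/hadoop
cd /app/hadoop
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar -zxf hadoop-2.6.0.tar.gz
mv hadoop-2.6.0 hadoop260
```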
2 Configuration files
Edit the environment variables:
Set the values of the Hadoop-related variables:
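The article shows these values only in screenshots. A typical sketch of the additions (for example to /etc/profile) is below; the JAVA_HOME path is an assumption and must match where the JDK is actually installed, while HADOOP_HOME follows the layout chosen above:

```shell
# Sketch of environment-variable additions; the JDK path is an assumption.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/app/hadoop/hadoop260
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

After editing, run `source /etc/profile` so the current shell picks up the new values.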
Edit the slaves file
root@kaiseu-ubuntu:/app/hadoop/hadoop260/etc/hadoop# vi slaves
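The slaves file lists the worker hostnames, one per line. Given the three-node layout described at the end of the article (hadoop1 as master), a plausible content is shown below; whether hadoop1 also runs as a worker is a deployment choice, and here it is assumed not to:

```
hadoop2
hadoop3
```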
The core-site.xml file, which defines the HDFS access endpoint
root@kaiseu-ubuntu:/app/hadoop/hadoop260/etc/hadoop# vi core-site.xml
Add the following between <configuration> and </configuration>:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop1:8000</value>
</property>
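In addition to fs.defaultFS, Hadoop 2.6 setups commonly also set hadoop.tmp.dir so working data does not land in /tmp. This extra property is an assumption, not shown in the original, and the path below merely follows the layout used above:

```xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/hadoop260/tmp</value>
</property>
```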
The hdfs-site.xml file
root@kaiseu-ubuntu:/app/hadoop/hadoop260/etc/hadoop# vi hdfs-site.xml
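The original shows this file's content only in a screenshot. A minimal sketch is given below; every value is an assumption: the replication factor is chosen to match the two assumed DataNodes, and the storage paths follow the directory layout used above:

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/app/hadoop/hadoop260/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/app/hadoop/hadoop260/hdfs/data</value>
  </property>
</configuration>
```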
The mapred-site.xml file
root@kaiseu-ubuntu:/app/hadoop/hadoop260/etc/hadoop# vi mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
The yarn-site.xml file
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop1</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.https.address</name>
  <value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
The hadoop-env.sh file
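The usual change in hadoop-env.sh is replacing the `${JAVA_HOME}` reference with an explicit path, since the variable is not always visible to the daemons started over SSH. The path below is an assumption; use the JDK location on the cluster nodes:

```shell
# In hadoop-env.sh, hardcode JAVA_HOME (the path here is an assumption).
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
```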
3 Passwordless SSH login
Generate the public keys:
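Key generation can be sketched as follows, run on each of hadoop1, hadoop2, and hadoop3 (the empty passphrase lets SSH log in non-interactively):

```shell
# Generate an RSA key pair with an empty passphrase on every node.
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
```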
Copy the public keys of hadoop1, hadoop2, and hadoop3 into the authorized_keys file
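One way to do the copy is sketched below; this is not necessarily the author's exact procedure. Each node sends its key to hadoop1, and the aggregated authorized_keys file is then distributed back to the other nodes:

```shell
# Run on each node to append its key to hadoop1's authorized_keys.
ssh-copy-id root@hadoop1
# Then, from hadoop1, distribute the aggregated file to the workers.
scp ~/.ssh/authorized_keys root@hadoop2:~/.ssh/
scp ~/.ssh/authorized_keys root@hadoop3:~/.ssh/
```

Afterwards, `ssh hadoop2` from hadoop1 should log in without a password prompt.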
4 Format the NameNode
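Formatting is done once on the master node before the first start; running it again later would wipe the HDFS metadata:

```shell
# Run once on hadoop1 before the first cluster start.
hdfs namenode -format
```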
5 Start the cluster
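The start step can be sketched with the standard Hadoop 2.x start scripts, run on hadoop1; they SSH into the nodes listed in the slaves file, which is why the passwordless login above is required:

```shell
start-dfs.sh    # starts NameNode, SecondaryNameNode, and the DataNodes
start-yarn.sh   # starts ResourceManager and the NodeManagers
jps             # run on each node to verify the daemons are up
```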
Note: a few screenshots in this article were placed in the wrong spots, but their content is the same, so the results are unaffected. Screenshots with a red background were taken on the physical machine; those with a black background were taken over SSH connections to the cluster. The physical machine runs Ubuntu 14.04 and acts as the client. Virtual machines installed on it run three CentOS instances, hadoop1, hadoop2, and hadoop3, simulating a real cluster with hadoop1 as the master node.