Download hadoop-2.7.1.tar.gz from the official website and copy it into the /opt directory on each of the Linux machines.
OpenJDK 1.7
CentOS 6
Edit the Linux configuration file /etc/profile:
HADOOP_PREFIX=/opt/hadoop-2.7.1
JAVA_HOME=/usr/lib/jvm/jre-1.7.0
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_PREFIX PATH JAVA_HOME
After saving, run source /etc/profile to make the new environment variables take effect.
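The effect of these profile lines can be sanity-checked. A minimal sketch that reproduces the exports and confirms the Hadoop bin directory landed on PATH (on a real node, just run `source /etc/profile` first and skip the exports):

```shell
# Reproduce the /etc/profile additions from above.
export HADOOP_PREFIX=/opt/hadoop-2.7.1
export JAVA_HOME=/usr/lib/jvm/jre-1.7.0
export PATH="$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin"

# Confirm the Hadoop bin directory is now on PATH.
case ":$PATH:" in
  *":$HADOOP_PREFIX/bin:"*) echo "hadoop bin on PATH" ;;
  *)                        echo "hadoop bin missing" ;;
esac
```

Once this checks out, `hadoop version` should resolve from any directory.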
设置 JAVA_HOME=/usr/lib/jvm/jre-1.7.0
192.168.1.197 master
192.168.1.197 D1 [this line maps the local hostname; this machine's hostname is D1. If the machines have already been renamed to master and slaveN, this line is not needed]
192.168.1.198 slave1
192.168.1.199 slave2
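These mappings must be present on every node. A self-contained sketch that appends them to a scratch file (on a real machine, append to /etc/hosts as root):

```shell
HOSTS_FILE=$(mktemp)   # stand-in for /etc/hosts in this sketch
cat >> "$HOSTS_FILE" <<'EOF'
192.168.1.197 master
192.168.1.198 slave1
192.168.1.199 slave2
EOF
grep -c '^192\.168\.1\.' "$HOSTS_FILE"   # prints 3
```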
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.7.1/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
<configuration>
  <!--
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop-2.7.1/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop-2.7.1/data</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>100</value>
  </property>
</configuration>
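dfs.blocksize is specified in bytes. A quick arithmetic check that 268435456 is the intended 256 MB:

```shell
# 268435456 bytes / 1024 / 1024 = megabytes
echo $((268435456 / 1024 / 1024))   # prints 256
```

Both dfs.namenode.name.dir and dfs.datanode.data.dir must be writable by the user that runs the Hadoop daemons (the hadoop user created below).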
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.acl.enable</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.admin.acl</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8035</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
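Once the configuration files above are edited on master, the same configuration has to reach every slave; in a 2.7.1 tree they live under /opt/hadoop-2.7.1/etc/hadoop. A dry-run sketch of the copy step (echo prints the commands without contacting the slaves; drop the echo on a real cluster to run them over SSH):

```shell
# Build the scp command for each slave hostname mapped in /etc/hosts.
CMDS=$(for h in slave1 slave2; do
  echo scp -r /opt/hadoop-2.7.1/etc/hadoop "$h:/opt/hadoop-2.7.1/etc/"
done)
echo "$CMDS"
```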
useradd hadoop [the system automatically creates a matching hadoop group for the new hadoop user]
chown -R hadoop:hadoop hadoop-2.7.1/
chmod -R 777 hadoop-2.7.1/ [if only the hadoop group should have execute rights, use 775 instead]
passwd hadoop [enter the password twice; hadoop is used as the password here]
On the NameNode, switch to the hadoop user's home directory: cd ~
Generate the NameNode key pair: ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa, which produces id_dsa.pub
Copy id_dsa.pub with scp to /home/hadoop/.ssh/ on slave1 and slave2 [if .ssh does not exist there, run ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa in /home/hadoop on that machine; generating the private key also creates the .ssh directory]
If /home/hadoop/.ssh/authorized_keys does not yet exist on slave1 or slave2, simply copy id_dsa.pub to authorized_keys
If it does exist, append the public key to the authorized_keys file: cat id_dsa.pub >> authorized_keys
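One gotcha on the slaves: sshd ignores authorized_keys unless the permissions are strict (700 for .ssh, 600 for the file), so the passwordless login silently keeps prompting otherwise. A self-contained sketch of the append step, using a scratch directory and a placeholder key; on a real slave the directory is /home/hadoop/.ssh and the key is the real id_dsa.pub:

```shell
WORK=$(mktemp -d)   # stand-in for /home/hadoop in this sketch
printf 'ssh-dss AAAA-placeholder hadoop@master\n' > "$WORK/id_dsa.pub"  # placeholder key

mkdir -p "$WORK/.ssh"
cat "$WORK/id_dsa.pub" >> "$WORK/.ssh/authorized_keys"

# Permissions sshd requires before it will honor the key.
chmod 700 "$WORK/.ssh"
chmod 600 "$WORK/.ssh/authorized_keys"
```

After this, `ssh slave1` from the NameNode (as the hadoop user) should log in without a password.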
2. On the NameNode, run start-dfs.sh and start-yarn.sh [on a brand-new cluster, format HDFS once beforehand with hdfs namenode -format]
3. To shut the cluster down, run stop-yarn.sh and stop-dfs.sh on the NameNode