2. Installing Hadoop
2.1 Extract the installation package
Change into the ~/downloads/ directory and run:
tar -zxvf hadoop-2.2.0.tar.gz -C ~/usr
This creates the Hadoop directory under ~/usr/: ~/usr/hadoop-2.2.0
2.2 Configure .bashrc
Configure the Hadoop paths in .bashrc as follows:
# hadoop settings
export HADOOP_HOME=/myhome/hamr/usr/hadoop-2.2.0
export HADOOP_PREFIX=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop/
export YARN_CONF_DIR=$HADOOP_CONF_DIR
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Then reload the configuration file so the changes take effect.
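To apply the new settings in the current shell and confirm that the hadoop command is now on the PATH, you can run:
source ~/.bashrc
hadoop version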
2.3 Edit the configuration files
2.3.1 core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://virt0-0-7-0:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop2/hdfs/tmp-${user.name}</value>
</property>
</configuration>
Note: this sets Hadoop's tmp directory to /data/hadoop2/hdfs/tmp-${user.name}, where ${user.name} is expanded by Hadoop to the name of the user running the daemons (tmp-hamr for user hamr). The /data/hadoop2/hdfs tree must exist and be writable on every machine; the steps are given in section 2.4.
2.3.2 hadoop-env.sh
The only thing to change here is JAVA_HOME:
export JAVA_HOME=/myhome/hamr/usr/jdk1.8.0_25/
2.3.3 hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>virt0-0-7-0:9001</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/hadoop2/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hadoop2/hdfs/datanode</value>
</property>
</configuration>
Note: this sets the NameNode and DataNode directories to /data/hadoop2/hdfs/namenode and /data/hadoop2/hdfs/datanode respectively. These directories must be created on every machine; the steps are given in section 2.4.
2.3.4 mapred-site.xml
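In the Hadoop 2.2.0 distribution this file usually ships only as mapred-site.xml.template; if mapred-site.xml does not exist yet, create it from the template first:
cp $HADOOP_CONF_DIR/mapred-site.xml.template $HADOOP_CONF_DIR/mapred-site.xml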
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>virt0-0-7-0:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>virt0-0-7-0:19888</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1638M</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx3276M</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>2048</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx3276M</value>
</property>
</configuration>
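Note the pattern behind these numbers: each -Xmx heap is set to roughly 80% of the corresponding container size (1638M / 2048 MB ≈ 0.8 for map tasks; 3276M / 4096 MB ≈ 0.8 for reduce tasks and the MapReduce ApplicationMaster), leaving headroom for non-heap JVM memory so that YARN does not kill the containers for exceeding their limits.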
2.3.5 slaves
One slave hostname per line; these machines will run the DataNode and NodeManager daemons:
virt0-0-7-1
virt0-0-7-2
virt0-0-7-3
virt0-0-7-4
2.3.6 yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>virt0-0-7-0</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>virt0-0-7-0:8088</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>57344</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>20</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>57344</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>20</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/share/hadoop/common/*,
$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,
$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,
$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,
$HADOOP_YARN_HOME/share/hadoop/yarn/*,
$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*
</value>
</property>
</configuration>
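As a sanity check on these values: each NodeManager advertises 57344 MB (56 GB) and 20 vcores to the ResourceManager, so with the 2048 MB minimum allocation a node can run at most 57344 / 2048 = 28 containers, or 57344 / 4096 = 14 reduce-sized containers. If your nodes have different hardware, adjust yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores accordingly.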
2.4 Create the required directories
Switch to the root account:
mkdir -p /data/hadoop2/hdfs/tmp
chown -R hamr:hamr /data
Switch back to the hamr account:
mkdir /data/hadoop2/hdfs/namenode
mkdir /data/hadoop2/hdfs/datanode
SSH to every machine and repeat the same steps (a scripted sketch is given below).
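The hamr-side directories can also be created from the master in one loop; this is a minimal sketch, assuming the hostnames from the slaves file above and that the root mkdir/chown step has already made /data writable by hamr on every node:
for h in virt0-0-7-1 virt0-0-7-2 virt0-0-7-3 virt0-0-7-4; do
ssh $h "mkdir -p /data/hadoop2/hdfs/tmp /data/hadoop2/hdfs/namenode /data/hadoop2/hdfs/datanode"
done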
2.5 Sync the installation package and .bashrc to the other nodes
See section 1.4.
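A minimal sketch of the sync, assuming passwordless SSH to the slave nodes and that ~/usr already exists on each of them:
for h in virt0-0-7-1 virt0-0-7-2 virt0-0-7-3 virt0-0-7-4; do
rsync -a ~/usr/hadoop-2.2.0 $h:~/usr/
scp ~/.bashrc $h:~/
done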
2.6 Start and verify Hadoop
Format HDFS on the master node:
hdfs namenode -format
Start Hadoop:
start-all.sh
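Note that in Hadoop 2.x start-all.sh still works but is deprecated; the equivalent is to start HDFS and YARN separately:
start-dfs.sh
start-yarn.sh
If you also want the JobHistory server (configured at virt0-0-7-0:10020 above), it is started on its own:
mr-jobhistory-daemon.sh start historyserver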
Check the HDFS status:
hdfs dfsadmin -report
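To verify further, jps lists the Java daemons on each node; the master should show NameNode, SecondaryNameNode and ResourceManager, and each slave should show DataNode and NodeManager:
jps
The web interfaces are also available: the ResourceManager UI at http://virt0-0-7-0:8088 (as configured in yarn-site.xml) and the NameNode UI at http://virt0-0-7-0:50070 (the Hadoop 2.x default port).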