First, download and install JDK 1.7 (not covered in detail here), then download Hadoop-2.0.2 (its installation differs from earlier releases).
Next, configure SSH so that login requires no password:
# ssh-keygen -t rsa -P ''                            # press Enter to accept the default key file
# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# ssh localhost                                      # should now log in without asking for a password
Finally, install and configure Hadoop:
1. Extract Hadoop into the /usr/local/hadoop directory (a sketch of this step follows)
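A minimal sketch of the extraction, assuming the downloaded tarball is named hadoop-2.0.2-alpha.tar.gz and sits in the current directory (adjust the name to match your actual download):
# tar -xzf hadoop-2.0.2-alpha.tar.gz -C /usr/local    # unpack the release
# mv /usr/local/hadoop-2.0.2-alpha /usr/local/hadoop  # rename to the path used below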
2. Add the environment variables:
# vim /etc/profile    # append the following at the end of the file
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
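After saving, reload the profile and check that the hadoop command resolves (a quick sanity check, not part of the original steps):
# source /etc/profile
# hadoop version    # should report 2.0.2-alpha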
3. Edit Hadoop's configuration files:
hadoop-env.sh:
# vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Set JAVA_HOME here to the real JDK path; you cannot reference ${JAVA_HOME}, because the daemons are launched in fresh shells that do not inherit your environment, and startup will fail with the error "JAVA_HOME is not set".
export JAVA_HOME=/usr/java/jdk1.7    # fill in your local JDK installation directory
-----------------------------------------------------------------------------------------------------------------------------
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
    <description>The name of the default file system. Either the
      literal string "local" or a host:port for NDFS.
    </description>
    <final>true</final>
  </property>
</configuration>
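To confirm the value is picked up, Hadoop 2.x ships an hdfs getconf tool that can echo a key back (fs.default.name is deprecated in 2.x in favor of fs.defaultFS, but both still work here):
# hdfs getconf -confKey fs.default.name    # expect hdfs://localhost:8020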
----------------------------------------------------------------------------------------------------------------------------------
hdfs-site.xml
Here, /home/hduser/dfs/name and /home/hduser/dfs/data are directories on the local filesystem; create them first (see the sketch after this file).
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hduser/dfs/name</value>
    <description>Determines where on the local filesystem the DFS name node
      should store the name table. If this is a comma-delimited list
      of directories then the name table is replicated in all of the
      directories, for redundancy. </description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hduser/dfs/data</value>
    <description>Determines where on the local filesystem an DFS data node
      should store its blocks. If this is a comma-delimited
      list of directories, then data will be stored in all named
      directories, typically on different devices.
      Directories that do not exist are ignored.
    </description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
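The NameNode and DataNode directories must exist and be writable before formatting; a minimal sketch, assuming an hduser account and group matching the paths above:
# mkdir -p /home/hduser/dfs/name /home/hduser/dfs/data
# chown -R hduser:hduser /home/hduser/dfs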
-----------------------------------------------------------------------------------------------------------------------------------------------------------
mapred-site.xml
Here, /home/hduser/mapred/system and /home/hduser/mapred/local are directories on the local filesystem; create them as well (see the sketch after this file).
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/home/hduser/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/home/hduser/mapred/local</value>
    <final>true</final>
  </property>
</configuration>
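Note that in 2.x tarballs this file often ships only as mapred-site.xml.template; if so, copy it before editing, and create the two directories (a sketch under those assumptions):
# cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
# mkdir -p /home/hduser/mapred/system /home/hduser/mapred/local
# chown -R hduser:hduser /home/hduser/mapred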
------------------------------------------------------------------------------------------------------------------------------------
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
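Note: the value mapreduce.shuffle matches Hadoop-2.0.2; from Hadoop 2.2 onward the aux-service must be named mapreduce_shuffle (underscores), because dots are no longer allowed in service names.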
-----------------------------------------------------------------------------------------------------------------------------------------------------------
With the configuration in place, format the NameNode and start HDFS:
# hdfs namenode -format
Once the format succeeds, start the NameNode and DataNode with the following commands:
# hadoop-daemon.sh start namenode
# hadoop-daemon.sh start datanode
or
# start-dfs.sh    # starts both at once
Start the YARN daemons:
# yarn-daemon.sh start resourcemanager
# yarn-daemon.sh start nodemanager
or
# start-yarn.sh
Check that the daemons have started:
# jps
On my machine the output looks like this:
2048 NameNode
2322 SecondaryNameNode
3024 Jps
2439 ResourceManager
2441 NodeManager
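As a final smoke test (not part of the original write-up), you can run the bundled example job; the jar name varies with the exact release, hence the wildcard:
# hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10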
----------------------------------------------------end------------------------------------------------------------