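Configuring hadoop-site.xml
All site-specific configuration is done in this file, <HADOOP_INSTALL>/conf/hadoop-site.xml, where <HADOOP_INSTALL> is the directory into which you unpacked Hadoop; in my case it is /usr/local/hadoop. Our setup will use Hadoop's distributed file system (HDFS), even though our "cluster" currently consists of only one machine.
You can leave the contents of the file unchanged except for hadoop.tmp.dir. You must change it to a directory of your choice, for example /usr/local/hadoop-datastore/hadoop-${user.name}. Hadoop automatically expands ${user.name} to the name of the system user running Hadoop, so the path here becomes /usr/local/hadoop-datastore/hadoop-hadoop.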
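To make the ${user.name} expansion concrete, here is a minimal shell sketch that mimics the substitution and prints the resulting path. The helper expand_tmp_dir and the HADOOP_USER variable are hypothetical names used only for illustration and are not part of Hadoop. The sketch also assumes Hadoop runs as the current user unless HADOOP_USER is set.

```shell
#!/bin/sh
# expand_tmp_dir is a hypothetical helper (not part of Hadoop).
# It replaces the literal text ${user.name} in $1 with the user name in $2,
# imitating what Hadoop does when it reads hadoop.tmp.dir.
expand_tmp_dir() {
    printf '%s\n' "$1" | sed 's|\${user\.name}|'"$2"'|g'
}

# Assumption: Hadoop runs as the current user unless HADOOP_USER is set.
HADOOP_USER="${HADOOP_USER:-$(id -un)}"

# Single quotes keep the shell from touching ${user.name}.
TMP_TEMPLATE='/usr/local/hadoop-datastore/hadoop-${user.name}'
TMP_DIR="$(expand_tmp_dir "$TMP_TEMPLATE" "$HADOOP_USER")"
echo "hadoop.tmp.dir resolves to: $TMP_DIR"

# To create the directory yourself (may require root privileges):
#   mkdir -p "$TMP_DIR" && chmod 750 "$TMP_DIR"
```

Run as the user hadoop, this prints the /usr/local/hadoop-datastore/hadoop-hadoop path mentioned above.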
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<!-- Change this value to your own directory. -->
<value>/your/path/to/hadoop/tmp/dir/hadoop-${user.name}</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
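Before moving on, it can help to confirm which values the file actually contains. The following is a rough sketch, not an official Hadoop tool: get_property is a hypothetical helper, and it assumes the layout used in the listing above, with each <name> and <value> element on its own line. The HADOOP_CONF variable is likewise an assumption for illustration.

```shell
#!/bin/sh
# get_property is a hypothetical helper (not part of Hadoop).
# $1 = path to the config file, $2 = property name.
# It prints the <value> that follows the matching <name> line.
get_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && match($0, /<value>.*<\/value>/) {
            # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Assumption: run from <HADOOP_INSTALL>, or point HADOOP_CONF at the file.
CONF="${HADOOP_CONF:-conf/hadoop-site.xml}"
if [ -f "$CONF" ]; then
    for prop in hadoop.tmp.dir fs.default.name mapred.job.tracker dfs.replication; do
        echo "$prop = $(get_property "$CONF" "$prop")"
    done
fi
```

This is plain text matching rather than real XML parsing, so it only works for files formatted like the one shown here.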
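Step 4: Format the NameNode
webir@weir-desktop$ bin/hadoop namenode -format
This initializes the directory specified by the dfs.name.dir property (by default, ${hadoop.tmp.dir}/dfs/name).
Note: do not format a Hadoop file system that is already running; doing so will destroy all of your data.
The output looks roughly like this: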
07/09/21 12:00:25 INFO dfs.NameNode: STARTUP_MSG:
/***********************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = webir/127.0.0.1
STARTUP_MSG: args = [-format]
***********************************************************/
07/09/21 12:00:25 INFO dfs.Storage: Storage directory [...] has been successfully formatted.
07/09/21 12:00:25 INFO dfs.NameNode: SHUTDOWN_MSG:
/***********************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.0.1
***********************************************************/
hadoop@ubuntu:/usr/local/hadoop$
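Because formatting a running file system destroys its data, you may want to check for a live NameNode before you run the format command. The sketch below is one possible guard, assuming the JDK's jps tool is on your PATH and that the NameNode process appears in its output under the class name NameNode. The function namenode_running is a hypothetical helper, not part of Hadoop, and the script only prints advice; it never formats anything itself.

```shell
#!/bin/sh
# namenode_running is a hypothetical helper (not part of Hadoop).
# It reads jps-style output ("<pid> <MainClass>") on stdin and succeeds
# only if a process named exactly "NameNode" is listed.
namenode_running() {
    awk '$2 == "NameNode" { found = 1 } END { exit !found }'
}

if ! command -v jps >/dev/null 2>&1; then
    echo "jps not found; cannot check whether a NameNode is running." >&2
elif jps | namenode_running; then
    echo "A NameNode is running; do not format the file system." >&2
else
    echo "No running NameNode detected. To format: bin/hadoop namenode -format"
fi
```

The exact match on the second column means a SecondaryNameNode process alone does not count as a running NameNode.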