Pseudo-distributed mode
In pseudo-distributed mode, all the Hadoop daemons run on a single machine. Unlike standalone mode, each component runs as a separate process, which makes this mode well suited to debugging your programs: you can examine memory usage, HDFS input/output issues, and the interaction among the daemons.
The following is an example configuration for this mode:
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation.
</description>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
<description>The host and port that the MapReduce job tracker runs
at.</description>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>The actual number of replications can be specified when the
file is created.</description>
</property>
</configuration>
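As a quick sanity check, you can read these property values back out of the site files from the command line. The `get_prop` function below is a hypothetical helper, not part of Hadoop; it is a rough grep/sed sketch that assumes `<name>` and `<value>` sit on adjacent lines, as in the files above. Real programs should read settings through Hadoop's Configuration API instead.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop site file. Assumes <name> and <value> are on adjacent lines.
get_prop() {
  grep -A 1 "<name>$2</name>" "$1" |
    sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Demonstrate against a copy of the hdfs-site.xml fragment shown above.
site=$(mktemp)
cat > "$site" <<'EOF'
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
EOF
get_prop "$site" dfs.replication   # prints: 1
rm -f "$site"
```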
In core-site.xml and mapred-site.xml we specify the hostname and port for the NameNode and the JobTracker, respectively.
In hdfs-site.xml we specify the default replication factor for HDFS, which should only be one because we’re running on only
one node.
We also need to specify the location of the SecondaryNameNode and the slave (DataNode/TaskTracker) nodes; this is done in the masters and slaves files in the conf directory, and here both point to localhost:
[hadoop-user@master]$ cat masters
localhost
[hadoop-user@master]$ cat slaves
localhost
Although all the daemons run on the same machine, they still communicate with each other over SSH, just as they would in a fully distributed cluster.
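Because the daemons talk to each other over SSH, the account running Hadoop must be able to log in to localhost without a password. A typical key-based setup looks like the sketch below; these are common SSH conventions, not commands from the configuration above, and you can skip them if passwordless login already works.

```shell
# Generate a passphrase-less key pair if one doesn't exist yet,
# then authorize it for logins to this same machine.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa -q
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# Verify with:  ssh localhost date
# It should print the date without prompting for a password.
```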
If you're ready to start Hadoop, the first step is to format HDFS:
[hadoop-user@master]$ bin/hadoop namenode -format
Then launch the daemons with bin/start-all.sh. The jps command (a standard JDK tool, not part of Hadoop) will list the successfully started daemons:
[hadoop-user@master]$ bin/start-all.sh
[hadoop-user@master]$ jps
26893 Jps
26832 TaskTracker
26620 SecondaryNameNode
26333 NameNode
26484 DataNode
26703 JobTracker
You can, of course, also shut down all the Hadoop daemons:
[hadoop-user@master]$ bin/stop-all.sh