Environment
Ubuntu Linux 8.04
linuxsvr01.vgolive.com 192.168.1.209
linuxsvr02.vgolive.com 192.168.1.210
Hadoop 0.18.0
Before Installation
Edit the /etc/hosts file on both linuxsvr01 and linuxsvr02, adding the following entries:
192.168.1.209 linuxsvr01.vgolive.com linuxsvr01
192.168.1.210 linuxsvr02.vgolive.com linuxsvr02
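The two entries can also be appended from the shell. A minimal sketch: HOSTS_FILE defaults to a scratch file so the snippet is safe to try; on the real servers, point it at /etc/hosts and run it as root.

```shell
# Sketch: append both cluster entries to the hosts file.
# HOSTS_FILE defaults to a scratch file; set HOSTS_FILE=/etc/hosts
# (and run as root) on linuxsvr01 and linuxsvr02.
HOSTS_FILE=${HOSTS_FILE:-$(mktemp)}
cat >> "$HOSTS_FILE" <<'EOF'
192.168.1.209 linuxsvr01.vgolive.com linuxsvr01
192.168.1.210 linuxsvr02.vgolive.com linuxsvr02
EOF
grep -c 'vgolive\.com' "$HOSTS_FILE"   # 2 entries
```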
SSH Access Setup
Following the SSH configuration steps in the "Hadoop Installation and Deployment - Single Server" document, configure linuxsvr01 and linuxsvr02: merge the contents of the id_rsa.pub files generated by ssh-keygen on the two servers into one authorized_keys file, and place a copy in the /home/hadoop/.ssh directory on both linuxsvr01 and linuxsvr02, as follows:
hadoop@linuxsvr01:~$ cat ~/.ssh/authorized_keys
hadoop@linuxsvr02:~$ cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAx5SNLRAJzbx3vRjrIavHMM/7zndoSrEgDdBK094sBT5xdOrmp9vC2ljnInIbCir/b0vdCLmIfhZ4hFrGd1xscBpYw9PUNC4L2rFf43QVtw3rJDBd2arR2Tf2oaVDkUjQJ339dL53yPgvNowibV4x2JwlcIgYgKBaZnUyCqiayuTDsi4HucueP0KgZSzwRtneRCfso+K5f5JPtkEqY5aD+wuKct/UERIv9hERYdpLIAiWwA+76ytup4hfw8rfJbtD+Co0LBHKnjFis0IdIk9ICl4rk9X+8a95Xj/Nv2VpFIEDvULSmjwNe/nXgqd+gmUUd5uzJK5NAUs60MQx/NGepQ== hadoop@linuxsvr01
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEArEhYHUpd368sUUsKGkd+hQ3bkydJ7XU8w7SNyF2QWDkKul3ncUVx/KauQi6i41/vnr+ijat/ewXOo2BHieoOxxXMLy556WydZV1Ctn2/479jAFClqMGlYolVLmYxZjSrSkth3Vd6P0bpkcV0k6vC2OEPz0n8dklupDFma5XEJOBo9x/hoKmxzzi3LqgENeIop6x+UJfuUQQrZ8I0SH/aM7f9KRATac3BnJjP6wP27W1AXjRhn1ewGa8bhBAyK65+8eGgAmj9OEI753fSs57yS0VwxvumeYwG/nEKGI09u+q6gEJbSOJEMdPfvZLE4cbYLg26GUJmEzSWw7y6czsAUw== hadoop@linuxsvr02
On either linuxsvr01 or linuxsvr02, run:
ssh linuxsvr01.vgolive.com
ssh linuxsvr02.vgolive.com
If you can connect without being prompted for a password, the setup succeeded.
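The key setup above can be sketched as follows (assumes OpenSSH's ssh-keygen; SSH_DIR defaults to a scratch directory so the snippet is safe to try, and would be /home/hadoop/.ssh on the real servers):

```shell
# Sketch: generate a passphrase-less RSA key pair and build
# authorized_keys. SSH_DIR defaults to a scratch directory; use
# /home/hadoop/.ssh on linuxsvr01 and linuxsvr02.
SSH_DIR=${SSH_DIR:-$(mktemp -d)}
ssh-keygen -q -t rsa -N "" -f "$SSH_DIR/id_rsa"          # no passphrase
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"  # append local key
# Append the other server's id_rsa.pub the same way (e.g. copy it over
# with scp first), then tighten permissions so sshd accepts the file:
chmod 700 "$SSH_DIR"
chmod 600 "$SSH_DIR/authorized_keys"
```

With both public keys appended on both hosts, each server can reach the other (and itself) without a password.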
Installation
Configure linuxsvr01.vgolive.com
Configure masters
Edit /opt/hadoop/conf/masters, changing:
localhost
to:
linuxsvr01.vgolive.com
Configure slaves
Edit /opt/hadoop/conf/slaves, changing:
localhost
to:
linuxsvr01.vgolive.com
linuxsvr02.vgolive.com
To add further slave servers, simply append them to the /opt/hadoop/conf/slaves file, for example:
linuxsvr01.vgolive.com
linuxsvr02.vgolive.com
linuxsvr03.vgolive.com
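The two files can also be written in one step from the shell. A minimal sketch (HADOOP_CONF would be /opt/hadoop/conf on the real master; it defaults to a scratch directory here so the snippet is safe to try):

```shell
# Sketch: generate the masters and slaves files for this setup.
# Set HADOOP_CONF=/opt/hadoop/conf on linuxsvr01; the scratch
# default lets the snippet run anywhere without touching a real install.
HADOOP_CONF=${HADOOP_CONF:-$(mktemp -d)}
echo "linuxsvr01.vgolive.com" > "$HADOOP_CONF/masters"
printf '%s\n' \
    linuxsvr01.vgolive.com \
    linuxsvr02.vgolive.com > "$HADOOP_CONF/slaves"
cat "$HADOOP_CONF/slaves"   # one hostname per line
```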
Configure conf/hadoop-site.xml (all machines)
Note: whenever you change conf/hadoop-site.xml, the updated file must be copied to every server.
Set the fs.default.name parameter to the host address and port of the NameNode (the HDFS master server).
fs.default.name
hdfs://linuxsvr01.vgolive.com:54310
Set the mapred.job.tracker parameter to the host address and port of the JobTracker (the MapReduce master server); in this setup it runs on the same host as the NameNode.
mapred.job.tracker
linuxsvr01.vgolive.com:54311
Set the dfs.replication parameter to the default block replication factor. This setup consists of two machines, so it is set to 2.
dfs.replication
2
Other Settings
conf/hadoop-site.xml
dfs.name.dir: path where the NameNode stores the name table; separate multiple paths with commas
dfs.data.dir: path where DataNodes store blocks; separate multiple paths with commas
hadoop.tmp.dir: temporary directory
mapred.local.dir: temporary directory for MapReduce data
The conf/hadoop-site.xml for this environment is as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://linuxsvr01.vgolive.com:54310/</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>linuxsvr01.vgolive.com:54311</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/opt/hadoop/filesystem/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop/filesystem/data</value>
</property>
<property>
<name>dfs.replication</name> <!-- number of block replicas -->
<value>2</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>6</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>2</value>
</property>
</configuration>
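Since the file must be identical on every server, a small loop can push it to each node after every change. A sketch, assuming this document's hostnames and the passwordless SSH configured earlier; DRY_RUN=1 (the default here) only prints the commands instead of running them.

```shell
# Sketch: distribute conf/hadoop-site.xml to all nodes over scp.
# DRY_RUN=1 prints the commands; set DRY_RUN=0 on the real master.
DRY_RUN=${DRY_RUN:-1}
for host in linuxsvr01.vgolive.com linuxsvr02.vgolive.com; do
    cmd="scp /opt/hadoop/conf/hadoop-site.xml hadoop@$host:/opt/hadoop/conf/"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"     # show what would run
    else
        $cmd            # actually copy (needs the SSH keys from above)
    fi
done
```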
Format the NameNode
hadoop@linuxsvr01:/opt/hadoop$ bin/hadoop namenode -format
Start the HDFS daemons
hadoop@linuxsvr01:/opt/hadoop$ bin/start-dfs.sh
Stop the HDFS daemons
hadoop@linuxsvr01:/opt/hadoop$ bin/stop-dfs.sh
Start the MapReduce daemons
hadoop@linuxsvr01:/opt/hadoop$ bin/start-mapred.sh
Stop the MapReduce daemons
hadoop@linuxsvr01:/opt/hadoop$ bin/stop-mapred.sh
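After starting the daemons, a quick way to verify them is the JDK's jps tool and Hadoop's dfsadmin report. A sketch — the check helper is a hypothetical wrapper, added here only so the snippet skips commands that are not installed on the current machine:

```shell
# Sketch: health checks after start-dfs.sh / start-mapred.sh.
# check() runs a command only if its executable can be found.
check() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not found"
    fi
}
# On linuxsvr01 (master), jps should list NameNode, SecondaryNameNode,
# JobTracker, DataNode and TaskTracker; on linuxsvr02 (slave only),
# DataNode and TaskTracker.
check jps
# HDFS status: live datanodes, configured and used capacity.
check /opt/hadoop/bin/hadoop dfsadmin -report
```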