Environment
Ubuntu Linux 8.04
linuxsvr01.vgolive.com 192.168.1.209
Hadoop 0.18.0
HDFS Architecture (source: http://hadoop.apache.org/core/docs/current/hdfs_design.html)
Before installing
Sun JDK 6
Add a dedicated system user for running Hadoop
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hadoop
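When scripting the setup, the lower-level tools avoid adduser's interactive password and GECOS prompts; a minimal sketch (run as root), equivalent in effect to the two commands above:

```shell
# Create the hadoop group non-interactively; -f succeeds even when the
# group already exists.
groupadd -f hadoop
# Create the account only if it is missing; -m makes the home directory.
id -u hadoop >/dev/null 2>&1 || useradd -m -g hadoop -s /bin/bash hadoop
```

Set a password afterwards with `passwd hadoop` if interactive logins are needed.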
Configure SSH
squall@linuxsvr01:~$ su - hadoop
hadoop@linuxsvr01:~$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is: 49:60:84:66:7b:05:15:6a:8d:e8:22:d9:c6:bb:85:28 hadoop@linuxsvr01
hadoop@linuxsvr01:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
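The same key setup can be done without the prompts shown in the transcript above; a sketch, assuming the default key path:

```shell
# ~/.ssh must exist and be private before sshd will trust its contents.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# Generate an RSA key with an empty passphrase unless one already exists.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -q -t rsa -N "" -f ~/.ssh/id_rsa
# Authorize the key for logins to this host; sshd rejects the file if it
# is group- or world-readable.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

Afterwards `ssh localhost` should log in without asking for a password (the first connection still asks to confirm the host key).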
Edit the SSH server configuration file /etc/ssh/sshd_config and check that it contains the following settings:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile %h/.ssh/authorized_keys
Disable IPv6
Hadoop of this era can end up binding its sockets to IPv6 addresses and become unreachable over IPv4, so turn IPv6 off. Edit /etc/modprobe.d/blacklist and append at the end (a reboot is needed for this to take effect):
# disable IPv6
blacklist ipv6
Hadoop
Installation
With the hadoop-0.18.0.tar.gz release downloaded into /opt:
# cd /opt
# sudo tar xzf hadoop-0.18.0.tar.gz
# sudo mv hadoop-0.18.0 hadoop
# sudo chown -R hadoop:hadoop hadoop
Configure hadoop-env.sh
Edit /opt/hadoop/conf/hadoop-env.sh and change:
# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
to:
# The java implementation to use. Required.
export JAVA_HOME=/opt/jdk1.6.0_07
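Before going further it is worth confirming that the path actually points at a JDK; a small check, using the path configured above (yours may differ):

```shell
# JAVA_HOME as set in hadoop-env.sh above; adjust for your machine.
JAVA_HOME=${JAVA_HOME:-/opt/jdk1.6.0_07}
if [ -x "$JAVA_HOME/bin/java" ]; then
    "$JAVA_HOME/bin/java" -version
else
    echo "JAVA_HOME does not point at a JDK: $JAVA_HOME"
fi
```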
Configure hadoop-site.xml
Edit /opt/hadoop/conf/hadoop-site.xml as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://linuxsvr01.vgolive.com:54310/</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>linuxsvr01.vgolive.com:54311</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/opt/hadoop/filesystem/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop/filesystem/data</value>
</property>
<property>
<name>dfs.replication</name> <!-- number of replicas kept per block -->
<value>2</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>6</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>2</value>
</property>
</configuration>
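hadoop.tmp.dir, dfs.name.dir and dfs.data.dir must exist and be writable by the hadoop user before the first start; a sketch (run as root), using the paths from the configuration above:

```shell
# Paths taken from hadoop-site.xml above.
HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}
mkdir -p "$HADOOP_HOME/tmp" \
         "$HADOOP_HOME/filesystem/name" \
         "$HADOOP_HOME/filesystem/data"
# Hand the storage directories to the hadoop user (skipped here if that
# user does not exist yet).
id -u hadoop >/dev/null 2>&1 && \
    chown -R hadoop:hadoop "$HADOOP_HOME/tmp" "$HADOOP_HOME/filesystem" || true
```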
Format the name node
Do this only when setting up a new cluster; formatting erases all data in HDFS.
hadoop@linuxsvr01:~$ /opt/hadoop/bin/hadoop namenode -format
Start/restart Hadoop
start-all.sh brings up all of the HDFS and MapReduce daemons in one go.
hadoop@linuxsvr01:~$ /opt/hadoop/bin/start-all.sh
Check the ports Hadoop is listening on with netstat
hadoop@linuxsvr01:~$ sudo netstat -plten | grep java
Stop Hadoop
hadoop@linuxsvr01:~$ /opt/hadoop/bin/stop-all.sh
Check Hadoop's current status
hadoop@linuxsvr01:~$ /opt/hadoop/bin/hadoop dfsadmin -report
Total raw bytes: 8103133184 (7.55 GB)
Remaining raw bytes: 6112729452 (5.69 GB)
Used raw bytes: 24576 (24 KB)
% used: 0%
Total effective bytes: 0 (0 KB)
Effective replication multiplier: Infinity
-------------------------------------------------
Datanodes available: 1

Name: 192.168.1.209:50010
State : In Service
Total raw bytes: 8103133184 (7.55 GB)
Remaining raw bytes: 6112729452 (5.69 GB)
Used raw bytes: 24576 (24 KB)
% used: 0%
Last contact: Thu Oct 16 00:15:13 CST 2008

Create a directory in HDFS
A relative path such as testdir is created under the user's HDFS home directory, /user/hadoop:
hadoop@linuxsvr01:~$ /opt/hadoop/bin/hadoop dfs -mkdir testdir
Copy local files into HDFS
hadoop@linuxsvr01:~$ /opt/hadoop/bin/hadoop dfs -copyFromLocal /opt/hadoop/conf/*.xml /user/hadoop/testdir
hadoop@linuxsvr01:~$ /opt/hadoop/bin/hadoop dfs -ls /user/hadoop/testdir
Hadoop Web Interfaces
http://linuxsvr01.vgolive.com:50030/ - web UI for MapReduce job tracker(s)
http://linuxsvr01.vgolive.com:50060/ - web UI for task tracker(s)
http://linuxsvr01.vgolive.com:50070/ - web UI for HDFS name node(s)
References