OS: Fedora 14
1 Create the hadoop group and user (as root)
groupadd hadoop # create the hadoop group
useradd -g hadoop hadoop # create the hadoop user
Set the password for the hadoop user
passwd hadoop
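When this step is repeated on a machine that may already have the account, it can help to check first. The following is a minimal sketch; the helper name user_exists is hypothetical and not part of the original steps.

```shell
# Return success if the named account exists (id is a standard POSIX utility).
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Possible usage, as root: only create the group and user when the user is missing.
# user_exists hadoop || { groupadd hadoop && useradd -g hadoop hadoop; }
```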
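2 Install the JDK
Log in as the hadoop user. The JDK is version 1.6, jdk-6u18-linux-i586.bin
chmod +x jdk-6u18-linux-i586.bin # make the installer executable
./jdk-6u18-linux-i586.bin # installs the JDK to /home/hadoop/jdk1.6.0_18
Configure the environment variables
vi .bashrc
Add the following lines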
export JAVA_HOME=/home/hadoop/jdk1.6.0_18
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
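After reloading .bashrc, one way to confirm that the JDK's bin directory really ended up on PATH is a small helper like the one below. This is only a sketch; the function name path_contains is an assumption introduced for illustration.

```shell
# Return success if directory $1 is one of the colon-separated entries of $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage after "source ~/.bashrc":
# path_contains "$JAVA_HOME/bin" && java -version
```

Wrapping both sides in colons makes the match exact, so /home/hadoop/jdk1.6.0_18 alone does not count as a match for /home/hadoop/jdk1.6.0_18/bin.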
3 Install hadoop-0.20.2
Extract hadoop-0.20.2.tar.gz
tar -xvf hadoop-0.20.2.tar.gz
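4 Configure the environment variables
vi .bashrc
Add the following lines
export HADOOP_INSTALL=/home/hadoop/hadoop-0.20.2
export PATH=$HADOOP_INSTALL/bin:$PATH
Verify the installation (after reloading .bashrc, e.g. with source ~/.bashrc)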
[hadoop@fedora14-001 ~]$ hadoop version
Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010
Hadoop can now be configured and the bundled examples run.
5 Configure passwordless ssh login
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
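Because the cat command above appends with >>, running it more than once leaves duplicate lines in authorized_keys. A possible way to make the step repeatable is sketched below; append_key_once is a hypothetical helper name, and the sketch assumes the public key file holds a single line.

```shell
# Append the public key in file $1 to the authorized_keys file $2,
# unless an identical line is already present there.
append_key_once() {
    pub=$1
    auth=$2
    touch "$auth" || return 1
    # -F: fixed string, -x: whole-line match, -f: read the pattern from the key file
    grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
}

# Possible usage:
# append_key_once ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
```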
Restart the ssh service (as root)
service sshd restart
Verify that passwordless login works
[hadoop@fedora14-001 ~]$ ssh localhost
Last login: Fri Jul 1 10:04:30 2011 from localhost.localdomain
[hadoop@fedora14-001 ~]$
Sometimes a permission error occurs; in that case run the following command:
[hadoop@fedora14-001 ~]$ chmod -R 700 ~/.ssh
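The permission error typically comes from sshd's StrictModes check, which rejects key authentication when ~/.ssh or authorized_keys is writable by other users. As a narrower alternative to the recursive chmod, the sketch below sets only the two modes that matter; fix_ssh_perms is a hypothetical helper name.

```shell
# Restrict an ssh directory (default ~/.ssh) to its owner:
# 700 on the directory, 600 on authorized_keys if it exists.
fix_ssh_perms() {
    dir=${1:-"$HOME/.ssh"}
    chmod 700 "$dir" || return 1
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys" || return 1
    fi
}

# Possible usage:
# fix_ssh_perms
```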
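6 Hadoop's configuration files default to the local environment, so there is no need to edit
core-site.xml, hdfs-site.xml, or mapred-site.xml under $HADOOP_INSTALL/conf/.
The corresponding default settings are inside hadoop-0.20.2-core.jar.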
7 Format HDFS
[hadoop@fedora14-001 hadoop-0.20.2]$ hadoop namenode -format
11/07/01 10:34:09 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = fedora14-001/172.18.7.53
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
Re-format filesystem in /tmp/hadoop-hadoop/dfs/name ? (Y or N) Y
11/07/01 10:34:14 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
11/07/01 10:34:14 INFO namenode.FSNamesystem: supergroup=supergroup
11/07/01 10:34:14 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/07/01 10:34:14 INFO common.Storage: Image file of size 96 saved in 0 seconds.
11/07/01 10:34:14 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
11/07/01 10:34:14 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at fedora14-001/172.18.7.53
************************************************************/
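The "Re-format filesystem" prompt in the log above appears when the name directory already exists from an earlier format. A script could check for that before calling the format command. The sketch below assumes the usual 0.20 storage layout, in which a formatted name directory contains a current/VERSION file; the helper name name_dir_formatted is hypothetical.

```shell
# Return success if directory $1 looks like an already formatted
# NameNode storage directory (assumption: it contains current/VERSION).
name_dir_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Possible usage with the directory shown in the log above:
# name_dir_formatted /tmp/hadoop-hadoop/dfs/name || hadoop namenode -format
```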
8 Start the Hadoop daemons
[hadoop@fedora14-001 bin]$ ./start-all.sh
starting namenode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-namenode-fedora14-001.out
localhost: starting datanode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-datanode-fedora14-001.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-secondarynamenode-fedora14-001.out
starting jobtracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-jobtracker-fedora14-001.out
localhost: starting tasktracker, logging to /home/hadoop/hadoop-0.20.2/bin/../logs/hadoop-hadoop-tasktracker-fedora14-001.out
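The messages above only say that each daemon was launched, not that it stayed up. The JDK's jps tool lists running Java processes as "pid ClassName", so its output can be compared against the five daemon names. A minimal sketch follows; missing_daemons is a hypothetical helper name.

```shell
# Read jps-style lines ("<pid> <ClassName>") on stdin and print the name of
# every expected Hadoop 0.20 daemon that is not among them.
missing_daemons() {
    running=$(awk '{print $2}')
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode
        printf '%s\n' "$running" | grep -qx "$d" || echo "$d"
    done
}

# Possible usage:
# jps | missing_daemons
```

Empty output would mean all five daemons are running; for any name that is printed, the matching file under the logs directory is the place to look.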
9 Prepare the data for wordcount
A local data directory named input was created and a few files copied into it, as shown below
[hadoop@fedora14-001 input]$ ll
total 8
-rw-rw-r--. 1 hadoop hadoop 40 Jul 1 09:21 input1.txt
-rw-rw-r--. 1 hadoop hadoop 21 Jul 1 09:22 input2.txt
10 Start the wordcount job
[hadoop@fedora14-001 hadoop-0.20.2]$ hadoop jar hadoop-0.20.2-examples.jar wordcount /home/hadoop/input /home/hadoop/output
The source data directory is input and the output data directory is output.
11 View the results
[hadoop@fedora14-001 output]$ ll
total 4
-rwxrwxrwx. 1 hadoop hadoop 53 Jul 1 10:07 part-r-00000
[hadoop@fedora14-001 output]$ more part-r-00000
are 2
hello 2
how 2
lin 1
old 1
song 1
world 2
you 2
12 Stop the Hadoop daemons
[hadoop@fedora14-001 bin]$ stop-all.sh
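Since the job wrote its result to a local directory, the counts in part-r-00000 can be cross-checked without Hadoop. The sketch below counts whitespace-separated words with awk, which is assumed here to split the input the same way as the wordcount example; local_wordcount is a hypothetical helper name.

```shell
# Count whitespace-separated words in the given files and print
# "word<TAB>count" lines sorted by word in byte order.
local_wordcount() {
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) printf "%s\t%d\n", w, count[w] }' "$@" |
        LC_ALL=C sort
}

# Possible usage:
# local_wordcount /home/hadoop/input/*.txt | diff - /home/hadoop/output/part-r-00000
```

If diff prints nothing, the local count and the job output agree.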