1. Software
Software/OS | Version |
---|---|
Hadoop | 2.7.2 |
Ubuntu | 14.04 (32-bit) |
VirtualBox | 4.3.24 |
openjdk | 1.7.0_91 |
ssh | - |
rsync | - |
2. Download and install packages
1) Download and extract Hadoop 2.7.2
tar -xf hadoop-2.7.2.tar.gz -C ~/
2) Install from the command line
sudo apt-get install rsync
sudo apt-get install ssh
sudo apt-get install openjdk-7-jdk
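Before moving on, it can help to confirm that the three tools are actually on the PATH. The following is a minimal sketch for a POSIX shell; `missing_cmds` is a hypothetical helper name, not part of any of the packages above.

```shell
# Hypothetical helper: print each given command that is NOT found on PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Check the tools installed above.
missing=$(missing_cmds java ssh rsync)
if [ -z "$missing" ]; then
    echo "all required commands found"
else
    echo "missing commands:" $missing
fi
```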
3. Configure the environment
1) Software configuration
sudo vi /etc/profile
#JAVA
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
#HADOOP
export HADOOP_HOME=~/hadoop-2.7.2
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
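After saving the file, the exports only take effect in a new login shell or after running `. /etc/profile`. A small sketch to check that the two directory variables are set and point to real directories; `check_dir_var` is a hypothetical helper, not a Hadoop tool.

```shell
# Hypothetical helper: report whether the named variable is set
# and points to an existing directory.
check_dir_var() {
    name="$1"
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value"
        return 1
    fi
    echo "$name=$value"
}

# Check the variables exported in /etc/profile; keep going on failure.
check_dir_var JAVA_HOME || true
check_dir_var HADOOP_HOME || true
```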
2) Hadoop configuration
etc/hadoop/hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
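If the JDK directory name on your machine is unclear, one way to find it is to resolve the symlinks behind the `java` binary and strip the trailing `/jre/bin/java` or `/bin/java`. This is a sketch that assumes GNU `readlink -f` is available (as on Ubuntu); `java_home_from_bin` is a hypothetical helper name.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the full path
# of a java binary by removing "/bin/java" and an optional "/jre".
java_home_from_bin() {
    path="$1"
    path="${path%/bin/java}"
    path="${path%/jre}"
    echo "$path"
}

# Resolve the real location of java, if it is installed.
if command -v java >/dev/null 2>&1; then
    java_home_from_bin "$(readlink -f "$(command -v java)")"
fi
```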
4. Running Hadoop
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
$ cat output/*
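The grep example finds every match of the regular expression in the input files and counts how often each distinct match occurs. To sanity-check the result without Hadoop, a rough local approximation with standard tools might look like this; `count_matches` is a hypothetical helper and the sample file is made up for illustration.

```shell
# Hypothetical helper: count each distinct regex match in the given files,
# printed as "count<TAB>match" with the highest count first.
count_matches() {
    pattern="$1"
    shift
    # -o prints only the matched text, -h omits file names, -E enables '+'.
    grep -ohE "$pattern" "$@" | sort | uniq -c | sort -rn | awk '{print $1 "\t" $2}'
}

# Made-up sample input so the sketch runs without a Hadoop installation.
demo_dir=$(mktemp -d)
printf '%s\n' \
    '<name>dfs.replication</name>' \
    '<name>dfs.replication</name>' \
    '<name>dfs.namenode.name.dir</name>' > "$demo_dir/sample.xml"

count_matches 'dfs[a-z.]+' "$demo_dir"/*.xml
```

To compare against the real job, the same helper could be pointed at the `input` directory created above, for example `count_matches 'dfs[a-z.]+' input/*.xml`.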