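Hadoop has three run modes: local (standalone) mode, pseudo-distributed mode, and fully distributed mode.
Local mode: Hadoop runs on one machine as a single Java process and works against the local file system instead of HDFS, which makes this mode well suited to debugging.
Environment: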
CentOS release 5.11 (Final)
hadoop-2.5.0
jdk-8u102-linux-i586
The rest of this post walks through configuring and using local mode:
1. Check whether ssh and rsync are already installed on the server
[yh.zeng@namenode1 hadoop]$ rpm -qa | grep -i ssh
[yh.zeng@namenode1 hadoop]$ rpm -qa | grep -i rsync
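If they are not installed, install them with the following commands (run them as root or through sudo; on CentOS the SSH packages are named openssh-server and openssh-clients, there is no package called ssh)
[yh.zeng@namenode1 hadoop]$ sudo yum install -y openssh-server openssh-clients
[yh.zeng@namenode1 hadoop]$ sudo yum install -y rsync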
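The check in step 1 can also be scripted. What follows is only a minimal sketch, assuming that finding the ssh and rsync commands on the PATH is an acceptable stand-in for the rpm query; the function name missing_tools is hypothetical and not part of Hadoop or CentOS.

```shell
#!/bin/sh
# Print every command from the argument list that is NOT found on the PATH.
# missing_tools is a hypothetical helper name used only for this sketch.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Report which of the two prerequisites still need to be installed.
missing=$(missing_tools ssh rsync)
if [ -n "$missing" ]; then
    echo "not installed:"
    echo "$missing"
else
    echo "ssh and rsync are both available"
fi
```

An empty result means both tools are present and the yum commands can be skipped.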
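2. Install and configure the JDK. I assume everyone is familiar with this step, so it is skipped here.
3. Extract hadoop-2.5.0 and create a symbolic link named hadoop that points to the extracted directory. With the link in place, a later Hadoop upgrade only requires re-creating the link; the system environment variables do not have to be changed again.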
[yh.zeng@namenode1 local]$ ll
total 88
drwxr-xr-x 2 root root 4096 2011-05-11 bin
drwxr-xr-x 2 root root 4096 2011-05-11 etc
drwxr-xr-x 2 root root 4096 2011-05-11 games
lrwxrwxrwx 1 yh.zeng yh.zeng 12 08-20 17:56 hadoop -> hadoop-2.5.0
drwxrwxr-x 9 yh.zeng yh.zeng 4096 08-20 17:38 hadoop-2.5.0
drwxr-xr-x 2 root root 4096 2011-05-11 include
drwxr-xr-x 2 root root 4096 2011-05-11 lib
drwxr-xr-x 2 root root 4096 2011-05-11 lib64
drwxr-xr-x 2 root root 4096 2011-05-11 libexec
drwxr-xr-x 2 root root 4096 2011-05-11 sbin
drwxr-xr-x 4 root root 4096 07-09 20:42 share
drwxr-xr-x 2 root root 4096 2011-05-11 src
[yh.zeng@namenode1 local]$ pwd
/usr/local
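The listing above shows the result of step 3 but not the commands behind it. The sketch below illustrates the symlink technique in a throwaway directory, so that nothing under /usr/local is touched. The directory hadoop-2.6.0 is only a stand-in for some later release and is an assumption made for illustration.

```shell
#!/bin/sh
# Work in a temporary directory instead of /usr/local.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Stand-ins for two extracted releases; hadoop-2.6.0 is only an example
# of a later version, not something the original setup installs.
mkdir hadoop-2.5.0 hadoop-2.6.0

# Initial install: the stable name "hadoop" points at the current release.
ln -s hadoop-2.5.0 hadoop
before=$(readlink hadoop)

# Upgrade: repoint the link. -f replaces the existing link, and -n stops ln
# from following it and creating the new link inside hadoop-2.5.0.
ln -sfn hadoop-2.6.0 hadoop
after=$(readlink hadoop)

echo "before: $before, after: $after"
```

Any environment variable that refers to the link rather than to the versioned directory (for example a HADOOP_HOME set to /usr/local/hadoop, which is an assumed name and value) keeps working after the link is repointed.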
4. Edit etc/hadoop/hadoop-env.sh and set JAVA_HOME in that file to the JDK installation path
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_102/
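A wrong JAVA_HOME is an easy mistake to make at this step, so it can help to test the path before running Hadoop. The following is a rough sketch under the assumption that a usable JDK home is any directory containing an executable bin/java; check_java_home is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the given directory looks like a JDK/JRE home,
# i.e. it contains an executable bin/java.
# check_java_home is a hypothetical helper name used only for this sketch.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Check the path configured in hadoop-env.sh in step 4.
candidate=/usr/java/jdk1.8.0_102
if check_java_home "$candidate"; then
    echo "JAVA_HOME looks valid: $candidate"
else
    echo "no executable java under $candidate/bin"
fi
```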
5. Next, run Hadoop, using a word count as the example
- Create a word.txt file and write a few arbitrary words into it
[yh.zeng@namenode1 hadoop]$ mkdir wcinut
[yh.zeng@namenode1 hadoop]$ cd wcinut/
[yh.zeng@namenode1 wcinut]$ touch word.txt
[yh.zeng@namenode1 wcinut]$ vi word.txt
abc
java android c++ tomcat
javascript c oralce
~
~
~
- Run the word-count demo program that ships with Hadoop
[yh.zeng@namenode1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount wcinut/ wcoutput
- Check the result of the run
[yh.zeng@namenode1 hadoop]$ cd wcoutput/
[yh.zeng@namenode1 wcoutput]$ ls
part-r-00000 _SUCCESS
[yh.zeng@namenode1 wcoutput]$ cat part-r-00000
abc 1
android 1
c 1
c++ 1
java 1
javascript 1
oralce 1
tomcat 1
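As a sanity check, the same counts can be reproduced without Hadoop. The sketch below is not how the wordcount example is implemented; it is a plain awk and sort pipeline, assuming that words are split on whitespace and that keys are ordered by byte value. It recreates the sample input in a temporary directory instead of reusing the one from the session above.

```shell
#!/bin/sh
# Recreate the sample input in a throwaway directory.
work=$(mktemp -d)
cat > "$work/word.txt" <<'EOF'
abc
java android c++ tomcat
javascript c oralce
EOF

# Count whitespace-separated tokens with awk, then sort by byte order
# (LC_ALL=C) so that "c" comes before "c++" and "java" before "javascript".
result=$(awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
              END { for (w in count) printf "%s\t%d\n", w, count[w] }' \
         "$work/word.txt" | LC_ALL=C sort)

printf '%s\n' "$result"
```

For this input the pipeline prints the same eight words, each with a count of 1, in the same order as part-r-00000.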