I. Command reference
For details on the hadoop fs commands, see my other post:
Hadoop 2.4.1 hadoop dfs(fs) command reference
II. Execution walkthrough
1. Create a test directory
root@master-hadoop:/home/hadoop/hadoop# hadoop fs -mkdir input
(If this reports mkdir: `input': No such file or directory, use hadoop fs -mkdir -p input instead: the relative path resolves under your HDFS home directory, typically /user/root here, and -p creates that missing parent.)
2. Verify that the input directory was created
root@master-hadoop:/home/hadoop/hadoop# hadoop fs -ls
Found 1 items
drwxr-xr-x - root supergroup 0 2014-08-18 09:02 input
3. Create a test file
root@master-hadoop:/home/hadoop/hadoop# vi test.txt
hello hadoop
hello World
Hello Java
Hey man
i am a programmer
4. Put the test file into the test directory
root@master-hadoop:/home/hadoop/hadoop# hadoop fs -put test.txt input/
5. Verify that test.txt was uploaded to HDFS
root@master-hadoop:/home/hadoop/hadoop# hadoop fs -ls input/
Found 1 items
-rw-r--r-- 1 root supergroup 62 2014-08-18 09:03 input/test.txt
6. Run the wordcount example (note that the output/ directory must not already exist, or the job will fail with a FileAlreadyExistsException)
root@master-hadoop:/home/hadoop/hadoop# hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar wordcount input/ output/
Job output:
14/08/18 09:06:13 INFO mapreduce.Job: Running job: job_1408322491685_0001
14/08/18 09:06:20 INFO mapreduce.Job: Job job_1408322491685_0001 running in uber mode : false
14/08/18 09:06:20 INFO mapreduce.Job: map 0% reduce 0%
14/08/18 09:06:25 INFO mapreduce.Job: map 100% reduce 0%
14/08/18 09:06:30 INFO mapreduce.Job: map 100% reduce 100%
14/08/18 09:06:31 INFO mapreduce.Job: Job job_1408322491685_0001 completed successfully
14/08/18 09:06:32 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=128
        FILE: Number of bytes written=186663
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=177
        HDFS: Number of bytes written=78
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2478
        Total time spent by all reduces in occupied slots (ms)=2941
        Total time spent by all map tasks (ms)=2478
        Total time spent by all reduce tasks (ms)=2941
        Total vcore-seconds taken by all map tasks=2478
        Total vcore-seconds taken by all reduce tasks=2941
        Total megabyte-seconds taken by all map tasks=2537472
        Total megabyte-seconds taken by all reduce tasks=3011584
    Map-Reduce Framework
        Map input records=5
        Map output records=12
        Map output bytes=110
        Map output materialized bytes=128
        Input split bytes=115
        Combine input records=12
        Combine output records=11
        Reduce input groups=11
        Reduce shuffle bytes=128
        Reduce input records=11
        Reduce output records=11
        Spilled Records=22
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=45
        CPU time spent (ms)=1350
        Physical memory (bytes) snapshot=338079744
        Virtual memory (bytes) snapshot=1113423872
        Total committed heap usage (bytes)=277086208
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=62
    File Output Format Counters
        Bytes Written=78
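The counter arithmetic in the log can be reproduced with a small local simulation (a plain Python sketch of the map/combine/reduce phases, not Hadoop itself): each line of test.txt yields one (word, 1) pair per token, giving 12 map output records; the combiner merges the two "hello" pairs from the single mapper, giving 11 combine output records, which is also the number of reduce input groups and reduce output records.

```python
from collections import Counter

# The five lines written to test.txt in step 3.
lines = [
    "hello hadoop",
    "hello World",
    "Hello Java",
    "Hey man",
    "i am a programmer",
]

# Map phase: emit one (word, 1) pair per token -- case-sensitive, like WordCount.
map_output = [(word, 1) for line in lines for word in line.split()]
print("Map input records: ", len(lines))        # 5
print("Map output records:", len(map_output))   # 12

# Combine phase: merge pairs that share a key within the single mapper.
combined = Counter()
for word, one in map_output:
    combined[word] += one
print("Combine output records:", len(combined))  # 11 (the two "hello" pairs merged)

# Reduce phase: with only one mapper, the combiner already did all the summing,
# so reduce input records == reduce input groups == reduce output records == 11.
print("Reduce output records:", len(combined))   # 11
```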
7. Check for the output directory; listing again shows a new output entry
root@master-hadoop:/home/hadoop/hadoop# hadoop fs -ls
Found 2 items
drwxr-xr-x - root supergroup 0 2014-08-18 09:03 input
drwxr-xr-x - root supergroup 0 2014-08-18 09:06 output
8. List the files inside the output directory
root@master-hadoop:/home/hadoop/hadoop# hadoop fs -ls output
Found 2 items
-rw-r--r-- 1 root supergroup 0 2014-08-18 09:06 output/_SUCCESS
-rw-r--r-- 1 root supergroup 78 2014-08-18 09:06 output/part-r-00000
(The empty _SUCCESS file is a marker written when the job completes successfully; part-r-00000 holds the output of reducer 0.)
9. View the contents of the output file
root@master-hadoop:/home/hadoop/hadoop# hadoop fs -cat output/part-r-00000
Hello 1
Hey 1
Java 1
World 1
a 1
am 1
hadoop 1
hello 2
i 1
man 1
programmer 1
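This result can be reproduced locally with a short Python sketch. Two details worth noticing: the example WordCount is case-sensitive (so "hello" and "Hello" are counted separately), and the reducer emits keys in sorted order, where uppercase letters sort before lowercase ones, which is why Hello, Hey, Java, and World come before a, am, hadoop, etc.

```python
from collections import Counter

# The contents of test.txt from step 3.
text = """hello hadoop
hello World
Hello Java
Hey man
i am a programmer"""

# Tokenize on whitespace and count, case-sensitively, like the WordCount example.
counts = Counter(text.split())

# Emit keys in sorted order; uppercase letters (A-Z) have smaller code points
# than lowercase (a-z), so "Hello" precedes "a", matching part-r-00000.
for word in sorted(counts):
    print(word, counts[word])
```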