Running WordCount as a jar on CDH3u2


1. Write the Mapper, Reducer, and Driver classes

Map: hadoop.TokenizerMapper
Reduce: hadoop.IntSumReducer

Driver: hadoop.WordCount
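The post does not reproduce the source of these three classes, but their names match the standard WordCount example that ships with Hadoop. A minimal sketch along those lines, assuming the `org.apache.hadoop.mapreduce` API available in CDH3, would be (this requires the Hadoop client jars on the classpath to compile):

```java
package hadoop;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: splits each input line on whitespace and emits (word, 1).
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also usable as combiner): sums the counts for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: wires the mapper, combiner, and reducer together and
  // takes the input and output paths from the command line.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that `FileOutputFormat.setOutputPath` expects a directory that does not yet exist, which is why step 3 below skips creating `/tmp/output`.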


2. Export the jar

In Eclipse: right-click the project > Export... > Java/JAR file, name it WordCount-0.1.jar

> Next, select the resource files to include and the JAR file path > Next > set Main class: hadoop.WordCount > Finish


3. Prepare the input data

xcloud@xcloud:~/iworkspace/HelloHadoop$ sudo gedit input1.txt
[sudo] password for xcloud: 
xcloud@xcloud:~/iworkspace/HelloHadoop$ sudo gedit input2.txt
xcloud@xcloud:~/iworkspace/HelloHadoop$ hadoop fs -mkdir /tmp/input
# hadoop fs -mkdir /tmp/output    # not needed: the job creates the output directory itself
xcloud@xcloud:~/iworkspace/HelloHadoop$ hadoop fs -put input1.txt /tmp/input
xcloud@xcloud:~/iworkspace/HelloHadoop$ hadoop fs -put input2.txt /tmp/input



Contents of input1.txt:
Hello, i love china
are you ok?

Contents of input2.txt:
hello, i love word

You are ok


4. Run the job

Syntax: hadoop jar <jar-file> <main-class> <input-path> <output-path>
xcloud@xcloud:~/iworkspace/HelloHadoop$ hadoop jar WordCount-0.1.jar hadoop.WordCount /tmp/input /tmp/output
11/12/31 10:11:43 INFO input.FileInputFormat: Total input paths to process : 2
11/12/31 10:11:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/12/31 10:11:43 WARN snappy.LoadSnappy: Snappy native library not loaded
11/12/31 10:11:43 INFO mapred.JobClient: Running job: job_201112310845_0002
11/12/31 10:11:44 INFO mapred.JobClient:  map 0% reduce 0%
11/12/31 10:11:49 INFO mapred.JobClient:  map 100% reduce 0%
11/12/31 10:11:56 INFO mapred.JobClient:  map 100% reduce 33%
11/12/31 10:11:57 INFO mapred.JobClient:  map 100% reduce 100%
11/12/31 10:11:58 INFO mapred.JobClient: Job complete: job_201112310845_0002
11/12/31 10:11:58 INFO mapred.JobClient: Counters: 22
11/12/31 10:11:58 INFO mapred.JobClient:   Job Counters 
11/12/31 10:11:58 INFO mapred.JobClient:     Launched reduce tasks=1
11/12/31 10:11:58 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6175
11/12/31 10:11:58 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
11/12/31 10:11:58 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
11/12/31 10:11:58 INFO mapred.JobClient:     Launched map tasks=2
11/12/31 10:11:58 INFO mapred.JobClient:     Data-local map tasks=2
11/12/31 10:11:58 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=8146
11/12/31 10:11:58 INFO mapred.JobClient:   FileSystemCounters
11/12/31 10:11:58 INFO mapred.JobClient:     FILE_BYTES_READ=152
11/12/31 10:11:58 INFO mapred.JobClient:     HDFS_BYTES_READ=270
11/12/31 10:11:58 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=172859
11/12/31 10:11:59 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=73
11/12/31 10:11:59 INFO mapred.JobClient:   Map-Reduce Framework
11/12/31 10:11:59 INFO mapred.JobClient:     Reduce input groups=11
11/12/31 10:11:59 INFO mapred.JobClient:     Combine output records=14
11/12/31 10:11:59 INFO mapred.JobClient:     Map input records=4
11/12/31 10:11:59 INFO mapred.JobClient:     Reduce shuffle bytes=158
11/12/31 10:11:59 INFO mapred.JobClient:     Reduce output records=11
11/12/31 10:11:59 INFO mapred.JobClient:     Spilled Records=28
11/12/31 10:11:59 INFO mapred.JobClient:     Map output bytes=118
11/12/31 10:11:59 INFO mapred.JobClient:     Combine input records=14
11/12/31 10:11:59 INFO mapred.JobClient:     Map output records=14
11/12/31 10:11:59 INFO mapred.JobClient:     SPLIT_RAW_BYTES=208
11/12/31 10:11:59 INFO mapred.JobClient:     Reduce input records=14
xcloud@xcloud:~/iworkspace/HelloHadoop$ hadoop fs -ls /tmp/output/
Found 3 items
-rw-r--r--   1 xcloud supergroup          0 2011-12-31 10:11 /tmp/output/_SUCCESS
drwxr-xr-x   - xcloud supergroup          0 2011-12-31 10:11 /tmp/output/_logs
-rw-r--r--   1 xcloud supergroup         73 2011-12-31 10:11 /tmp/output/part-r-00000


5. View the results

hadoop fs -cat /tmp/output/part-r-00000 

xcloud@xcloud:~/iworkspace/HelloHadoop$ hadoop fs -cat /tmp/output/part-r-00000
Hello, 1
You 1
are 2
china 1
hello, 1
i 2
love 2
ok 1
ok? 1
word 1
you 1
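The output shows that "Hello," and "hello,", and "ok" and "ok?", are counted separately: the mapper tokenizes on whitespace only, keeping case and punctuation, and keys come out in byte order (uppercase before lowercase). The counts above can be reproduced in plain Java; the class and method names here are illustrative, not from the original post:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountCheck {

    // Count tokens the way TokenizerMapper does: split on whitespace only,
    // preserving case and punctuation, then sum per token like IntSumReducer.
    // TreeMap sorts keys, mirroring the sorted reducer output.
    static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                String token = itr.nextToken();
                Integer n = counts.get(token);
                counts.put(token, n == null ? 1 : n + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(
            "Hello, i love china",  // input1.txt
            "are you ok?",
            "hello, i love word",   // input2.txt
            "You are ok");
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

This yields 11 distinct tokens from 14 total, matching the job counters above (Map output records=14, Reduce output records=11).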




Reference: http://trac.nchc.org.tw/cloud/wiki/waue/2009/0617