1. Create a WCMapper class that extends Mapper

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Receive the input data (V1): one line of text
        String line = value.toString();
        // Split the line into words
        String[] wordsStrings = line.split(" ");
        // Loop over the words
        for (String w : wordsStrings) {
            // Each occurrence counts as one: emit <word, 1>
            context.write(new Text(w), new LongWritable(1));
        }
    }
}
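To see what the split-and-emit logic produces, here is a minimal plain-Java sketch of the same logic outside Hadoop (the sample line "hello world hello" is hypothetical, not from the original):

public class SplitDemo {
    public static void main(String[] args) {
        String line = "hello world hello";
        // Mirrors the map() body: split on spaces, emit <word, 1> per occurrence
        for (String w : line.split(" ")) {
            // Stands in for context.write(new Text(w), new LongWritable(1))
            System.out.println(w + "\t1");
        }
    }
}

Running it prints hello, world, and hello again, each paired with 1; this is exactly the <K2, V2> stream the framework shuffles to the reducer.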
2. Create a WCReducer class that extends Reducer

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> v2s, Context context)
            throws IOException, InterruptedException {
        // Define a counter
        long counter = 0;
        // Loop over v2s and sum the counts for this key
        for (LongWritable i : v2s) {
            counter += i.get();
        }
        // Emit <word, total count>
        context.write(key, new LongWritable(counter));
    }
}
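After the shuffle, all values for the same key arrive grouped, so reduce() only has to sum them. A minimal plain-Java sketch of that summation, using hypothetical shuffled data:

import java.util.Arrays;
import java.util.List;

public class ReduceDemo {
    public static void main(String[] args) {
        // Hypothetical shuffle output for one key: <hello, [1, 1]>
        List<Long> v2s = Arrays.asList(1L, 1L);
        long counter = 0;
        for (long i : v2s) {
            counter += i;  // same summation as in WCReducer.reduce()
        }
        System.out.println("hello\t" + counter);  // prints "hello\t2"
    }
}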
3. Implement the main method in a WordCount class

/*
 * 1. Analyze the concrete business logic and determine the input/output data format
 * 2. Define a class that extends org.apache.hadoop.mapreduce.Mapper,
 *    override map() to implement the business logic, and emit the new key/value pairs
 * 3. Define a class that extends org.apache.hadoop.mapreduce.Reducer
 *    and override reduce() to implement the business logic
 * 4. Wire the custom mapper and reducer together through a Job object
 */

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static void main(String[] args) throws Exception {
        // Build the Job object
        Job job = Job.getInstance(new Configuration());
        // Note: the class that contains the main method
        job.setJarByClass(WordCount.class);
        // Configure the Mapper
        job.setMapperClass(WCMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        FileInputFormat.setInputPaths(job, new Path("/words.txt"));
        // Configure the Reducer
        job.setReducerClass(WCReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileOutputFormat.setOutputPath(job, new Path("/wcount619"));
        // Submit the job and wait for it to finish
        job.waitForCompletion(true);
    }
}
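waitForCompletion(true) returns a boolean indicating whether the job succeeded, so a common variation (an optional tweak, not part of the original) is to propagate it as the process exit code:

        System.exit(job.waitForCompletion(true) ? 0 : 1);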
4. Package the classes as wc.jar, upload it to the Linux machine, and run it under Hadoop

hadoop jar /root/wc.jar
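If the jar's manifest does not specify a Main-Class, pass the driver class name after the jar path. Assuming the paths used in the code above (adjust them to your cluster), a full round trip looks like this; note that Hadoop fails the job if the output directory /wcount619 already exists:

hadoop fs -put words.txt /words.txt
hadoop jar /root/wc.jar WordCount
hadoop fs -cat /wcount619/part-r-00000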