Implementing WordCount with Hadoop MapReduce

1. Create a WCMapper class that extends Mapper

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // receive the input value (V1): one line of text
        String line = value.toString();
        // split the line into words
        String[] words = line.split(" ");
        // loop over the words
        for (String w : words) {
            // each occurrence counts as one; emit <word, 1>
            context.write(new Text(w), new LongWritable(1));
        }
    }
}
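
To sanity-check the mapper in isolation before submitting a job, a small unit test can drive it locally. This is only a sketch, not part of the original post: it assumes the Apache MRUnit test library and JUnit are on the test classpath, and the sample input line is made up.

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class WCMapperTest {

    @Test
    public void mapEmitsOnePerWord() throws Exception {
        // feed one input line and assert the <word, 1> pairs the mapper should emit
        // (assumes MRUnit is available; purely illustrative)
        MapDriver.<LongWritable, Text, Text, LongWritable>newMapDriver(new WCMapper())
                .withInput(new LongWritable(0), new Text("hello hadoop hello"))
                .withOutput(new Text("hello"), new LongWritable(1))
                .withOutput(new Text("hadoop"), new LongWritable(1))
                .withOutput(new Text("hello"), new LongWritable(1))
                .runTest();
    }
}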
 
2. Create a WCReducer class that extends Reducer

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> v2s, Context context)
            throws IOException, InterruptedException {
        // define a counter
        long counter = 0;
        // loop over v2s and accumulate the counts
        for (LongWritable i : v2s) {
            counter += i.get();
        }
        // emit <word, total count>
        context.write(key, new LongWritable(counter));
    }
}
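
The reducer can be checked the same way: for a key whose values are [1, 1, 1] it should emit a total of 3. Again only a sketch, under the same assumption that MRUnit and JUnit are available.

import java.util.Arrays;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Test;

public class WCReducerTest {

    @Test
    public void reduceSumsTheCounts() throws Exception {
        // key "hello" with values [1, 1, 1] should produce <hello, 3>
        // (assumes MRUnit is available; purely illustrative)
        ReduceDriver.<Text, LongWritable, Text, LongWritable>newReduceDriver(new WCReducer())
                .withInput(new Text("hello"),
                        Arrays.asList(new LongWritable(1), new LongWritable(1), new LongWritable(1)))
                .withOutput(new Text("hello"), new LongWritable(3))
                .runTest();
    }
}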
3. Implement the main method in a WordCount class

/*
 * 1. Analyze the concrete business logic and decide on the input/output data formats.
 * 2. Define a class that extends org.apache.hadoop.mapreduce.Mapper and
 *    override map() to implement the business logic, emitting the new key/value pairs.
 * 3. Define a class that extends org.apache.hadoop.mapreduce.Reducer and
 *    override reduce() to implement the business logic.
 * 4. Wire the custom Mapper and Reducer together through a Job object.
 */
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static void main(String[] args) throws Exception {
        // build the Job object
        Job job = Job.getInstance(new Configuration());

        // note: the class that contains the main method
        job.setJarByClass(WordCount.class);

        // configure the Mapper
        job.setMapperClass(WCMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        FileInputFormat.setInputPaths(job, new Path("/words.txt"));

        // configure the Reducer
        job.setReducerClass(WCReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileOutputFormat.setOutputPath(job, new Path("/wcount619"));

        // submit the job and wait for it to finish
        job.waitForCompletion(true);
    }
}
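
Because summing counts is associative and commutative, the same WCReducer can also be reused as a combiner to cut down the data shuffled between the map and reduce phases. The lines below are an optional variant of the driver, not part of the original post; they would replace the final job.waitForCompletion(true) call in main().

        // optional: reuse the reducer as a combiner (safe because counting is associative and commutative)
        job.setCombinerClass(WCReducer.class);

        // optional: propagate the job result as the process exit code
        System.exit(job.waitForCompletion(true) ? 0 : 1);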
 
 
4. Package the classes as wc.jar, upload it to the Linux machine, and run it under Hadoop:
     hadoop jar /root/wc.jar
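
A typical end-to-end run might look like the following. These commands are illustrative: the input file name matches the path hardcoded in the driver, and if wc.jar's manifest does not declare a Main-Class, the driver class name (WordCount) has to be appended after the jar path.

     # upload the input file to HDFS (the driver reads /words.txt)
     hadoop fs -put words.txt /words.txt

     # run the job (append the driver class name if the jar manifest has no Main-Class)
     hadoop jar /root/wc.jar

     # inspect the result written to the output directory
     hadoop fs -cat /wcount619/part-r-00000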

Reprinted from: https://www.cnblogs.com/dulixiaoqiao/p/6985237.html
