MapReduce Example: Finding the Missing Poker Cards

Problem:

A deck of poker cards has a few cards missing. Given a file that lists the remaining cards, one card per line in the form suit-number, find the suits that are missing a card with a number greater than 10 (a face card: J=11, Q=12, K=13).

Solution:

The job runs in two phases. The Map phase discards every card whose number is 10 or less and emits the remaining face cards keyed by suit. The Reduce phase counts the pairs received for each suit; since a complete suit has exactly three cards above 10, any suit with fewer than three is missing a card and is written to the output.
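To make the input format concrete, a fragment of the card file might look like this (the suit names here are assumptions; the code only relies on the suit-number layout):

    Heart-9
    Heart-11
    Heart-13
    Spade-11
    Spade-12
    Spade-13

Each line is one card, with numbers running from 1 (ace) to 13 (king).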

 

1. Code

1) Mapper code

@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    // Each input line has the form "suit-number", e.g. "Heart-11" (suit name assumed)
    String line = value.toString();
    String[] strs = line.split("-");
    if (strs.length == 2) {
        int number = Integer.valueOf(strs[1]);
        // Keep only face cards (J=11, Q=12, K=13); emit the suit as key, the whole record as value
        if (number > 10) {
            context.write(new Text(strs[0]), value);
        }
    }
}
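With the sample lines above, this mapper drops Heart-9 (9 ≤ 10) and emits one pair per face card, keyed by suit:

    (Heart, Heart-11)
    (Heart, Heart-13)
    (Spade, Spade-11)
    (Spade, Spade-12)
    (Spade, Spade-13)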


2) Reducer code

@Override
protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    // Count how many face cards arrived for this suit
    Iterator<Text> iter = values.iterator();
    int count = 0;
    while (iter.hasNext()) {
        iter.next();
        count++;
    }
    // A complete suit has exactly 3 cards above 10 (J, Q, K)
    if (count < 3) {
        context.write(key, NullWritable.get());
    }
}
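Continuing the example, the shuffle groups the mapper's pairs by suit, so this reducer sees (Heart, [Heart-11, Heart-13]) and (Spade, [Spade-11, Spade-12, Spade-13]). Heart counts only 2 face cards and is written out as missing a card; Spade counts 3 and is skipped.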


3) Runner code

Configuration conf = new Configuration();
Job job = Job.getInstance(conf);
job.setJobName("poker mr");
job.setJarByClass(pokerRunner.class);

job.setMapperClass(pakerMapper.class);
job.setReducerClass(pakerRedue.class);

// Map output types: suit -> card record
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);

// Final output types: the suit only, with no value
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

job.waitForCompletion(true);
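For completeness, the following is a minimal sketch of how the three snippets above fit together in one source file. The input key type (LongWritable), the nested-class layout, and the exit code in main are assumptions not shown in the original snippets; the class names match the runner above.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class pokerRunner {

        public static class pakerMapper extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] strs = value.toString().split("-");
                // Keep only face cards (number > 10), keyed by suit
                if (strs.length == 2 && Integer.parseInt(strs[1]) > 10) {
                    context.write(new Text(strs[0]), value);
                }
            }
        }

        public static class pakerRedue extends Reducer<Text, Text, Text, NullWritable> {
            @Override
            protected void reduce(Text key, Iterable<Text> values, Context context)
                    throws IOException, InterruptedException {
                int count = 0;
                for (Text v : values) {  // count the face cards seen for this suit
                    count++;
                }
                if (count < 3) {  // a full suit has J, Q, K above 10
                    context.write(key, NullWritable.get());
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf);
            job.setJobName("poker mr");
            job.setJarByClass(pokerRunner.class);
            job.setMapperClass(pakerMapper.class);
            job.setReducerClass(pakerRedue.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }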

 

2. Run results

The counters below confirm the logic: Map input records=49 (three cards missing from a full 52-card deck), Map output records=9 (nine of the twelve face cards survived the filter), Reduce input groups=4 (every suit still has at least one face card), and Reduce output records=3 (three suits are each missing a face card).

File System Counters

      FILE: Number of bytes read=87

      FILE: Number of bytes written=211167

      FILE: Number of read operations=0

      FILE: Number of large read operations=0

      FILE: Number of write operations=0

      HDFS: Number of bytes read=366

      HDFS: Number of bytes written=6

      HDFS: Number of read operations=6

      HDFS: Number of large read operations=0

      HDFS: Number of write operations=2

   Job Counters

      Launched map tasks=1

      Launched reduce tasks=1

      Data-local map tasks=1

      Total time spent by all maps in occupied slots (ms)=109577

      Total time spent by all reduces in occupied slots (ms)=42668

      Total time spent by all map tasks (ms)=109577

      Total time spent by all reduce tasks (ms)=42668

      Total vcore-seconds taken by all map tasks=109577

      Total vcore-seconds taken by all reduce tasks=42668

      Total megabyte-seconds taken by all map tasks=112206848

      Total megabyte-seconds taken by all reduce tasks=43692032

   Map-Reduce Framework

      Map input records=49

      Map output records=9

      Map output bytes=63

      Map output materialized bytes=87

      Input split bytes=110

      Combine input records=0

      Combine output records=0

      Reduce input groups=4

      Reduce shuffle bytes=87

      Reduce input records=9

      Reduce output records=3

      Spilled Records=18

      Shuffled Maps =1

      Failed Shuffles=0

      Merged Map outputs=1

      GC time elapsed (ms)=992

      CPU time spent (ms)=3150

      Physical memory (bytes) snapshot=210063360

      Virtual memory (bytes) snapshot=652480512

      Total committed heap usage (bytes)=129871872

   Shuffle Errors

      BAD_ID=0

      CONNECTION=0

      IO_ERROR=0

      WRONG_LENGTH=0

      WRONG_MAP=0

      WRONG_REDUCE=0

   File Input Format Counters

      Bytes Read=256

   File Output Format Counters

      Bytes Written=6

3. How to run

Build the jar in Eclipse, upload it to the Linux machine, and run it on the cluster.

Command: bin/hadoop jar **.jar <fully qualified main class> <input path> <output path>

For example: bin/hadoop jar **.jar com.test.mr / /poker-out (the runner reads the input path from args[0] and the output path from args[1]; /poker-out is only an example)
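After the job finishes, the result (one suit per line) can be inspected with the HDFS shell; assuming the example output path above:

bin/hadoop fs -cat /poker-out/part-r-00000

part-r-00000 is the file written by this job's single reducer.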

 

Reprinted from: https://www.cnblogs.com/langgj/p/6612566.html
