MapReduce单词统计,单项求和

MapReduce:单词统计,单项求和前奏:pom.xml <dependencies> <dependency> <groupId>junit</groupId> <artifactId>junit</artifactId> <version>4.11</v...
摘要由CSDN通过智能技术生成

MapReduce:单词统计,单项求和

前奏:pom.xml

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>2.7.3</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.7.3</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-mapreduce-client-core -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-core</artifactId>
      <version>2.7.3</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-mapreduce-client-jobclient -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
      <version>2.7.3</version>
      <scope>provided</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/log4j/log4j -->
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-mapreduce-client-common -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-common</artifactId>
      <version>2.7.3</version>
    </dependency>
  </dependencies>

一.单词统计

package com.nike.mapred;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class WcMap extends Mapper<LongWritable, Text,Text, IntWritable> {
   
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
   
        Text text = new Text();
        String[] vals = value.toString().split(" ");
        IntWritable in = new IntWritable();
        for (String val : vals) {
   
            text.set(val);
            in.set(1);
            context.write(text,in);
        }
    }
}
package com.nike.mapred;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java
  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
mapreduce单词统计的流程包括以下几个步骤: 1. 准备数据:从本地文件系统或者分布式文件系统(HDFS)中获取需要统计的文本数据。 2. 编程规范:按照MapReduce编程模型的规范,编写Mapper和Reducer的核心处理逻辑。 3. Map阶段:在Mapper中,对输入的文本数据进行切分和处理,将每个单词作为键,出现的次数作为值进行输出。 4. Reduce阶段:在Reducer中,对Mapper输出的键值对进行聚合和计算,将相同的单词进行合并,并计算出总的出现次数。 5. 组合Job:将编写好的Mapper和Reducer进行组合,形成一个完整的Job,用于提交到MapReduce框架中进行执行。 6. 设置和运行Job:对Job进行一些必要的设置,如指定输入路径、输出路径、Mapper和Reducer的类等。然后运行Job,让MapReduce框架执行整个统计任务。 综上所述,mapreduce单词统计的流程包括准备数据、编程规范、Map阶段、Reduce阶段、组合Job和设置与运行Job这几个步骤。<span class="em">1</span><span class="em">2</span><span class="em">3</span> #### 引用[.reference_title] - *1* *3* [MapReduce统计单词数目详细说明](https://blog.csdn.net/ygp12345/article/details/109035195)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v93^chatsearchT3_2"}}] [.reference_item style="max-width: 50%"] - *2* [Hadoop实战大数据大作业](https://download.csdn.net/download/qq_50807624/85580175)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v93^chatsearchT3_2"}}] [.reference_item style="max-width: 50%"] [ .reference_list ]

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值