MapReduce Example: WordCount

我是小小白！

Categories: hadoop java maven    Tags: hadoop mapreduce

Article link: https://blog.csdn.net/weixin_44362861/article/details/115351398

1. Requirements analysis
Count the total number of occurrences of each word in a given text file and output the result.

1.1 Put the following data into hello.txt

你好 beautiful nice hey ad
hahaha
test
test
0319
0326
0326
0326

1.2 Expected output

0319	1
0326	3
ad	1
beautiful	1
hahaha	1
hey	1
nice	1
test	2
你好	1
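
The requirement can be checked locally before writing any Hadoop code. The following is a minimal plain-Java sketch of the same map, group, and reduce logic. It uses no Hadoop classes; the class name `WordCountSketch` and its methods are hypothetical names chosen for illustration, and the `TreeMap` only imitates the sorted-by-key ordering of the job output.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the WordCount job (assumption: illustration only, no Hadoop involved).
class WordCountSketch {

    // "Map" step: split one line on whitespace and emit a <word, 1> pair per word.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Reduce" step: sum all the values collected for a single word.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Group the pairs by key (TreeMap keeps keys sorted), then reduce each group.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            totals.put(e.getKey(), reduce(e.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        // Same content as hello.txt in section 1.1.
        List<String> lines = Arrays.asList(
                "你好 beautiful nice hey ad", "hahaha", "test", "test",
                "0319", "0326", "0326", "0326");
        // Print each word and its count, separated by a tab.
        count(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running `main` on the sample data should give the same word/count pairs as listed in section 1.2.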

2. Development steps
Following the MapReduce programming conventions, write a Mapper, a Reducer, and a Driver.

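(1) Mapper

1.1 Convert the line of text passed in by the MapTask to a String
1.2 Split the line into words on spaces
1.3 Emit each word as <word, 1>
(2) Reducer
2.1 Sum the counts for each key
2.2 Output the total count for that key
(3) Driver
3.1 Get the configuration and obtain a Job instance
3.2 Specify the path of the jar that contains this program
3.3 Associate the Mapper and Reducer classes
3.4 Specify the key/value types of the Mapper output
3.5 Specify the key/value types of the final output
3.6 Specify the directory that holds the job's input text
3.7 Specify the directory for the job's output
3.8 Submit the job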
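
The steps above can be sketched with the standard `org.apache.hadoop.mapreduce` API roughly as follows. This is a minimal sketch, not necessarily the code the original post goes on to show: the class names `WordCountDriver`, `WordCountMapper`, and `WordCountReducer` are assumptions, the three classes are nested in one file only for brevity, and the input and output directories are assumed to arrive as `args[0]` and `args[1]`. The comments refer to the step numbers in the list.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical class names; requires the Hadoop dependencies on the classpath.
public class WordCountDriver {

    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final Text outKey = new Text();
        private final IntWritable one = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();            // 1.1 Text -> String
            for (String word : line.split(" ")) {      // 1.2 split on spaces
                if (word.isEmpty()) {
                    continue;                          // skip blanks from repeated spaces
                }
                outKey.set(word);
                context.write(outKey, one);            // 1.3 emit <word, 1>
            }
        }
    }

    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable total = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {             // 2.1 sum the counts for this key
                sum += v.get();
            }
            total.set(sum);
            context.write(key, total);                 // 2.2 emit <word, total>
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();      // 3.1 configuration and Job instance
        Job job = Job.getInstance(conf);

        job.setJarByClass(WordCountDriver.class);      // 3.2 locate the jar via the driver class

        job.setMapperClass(WordCountMapper.class);     // 3.3 Mapper and Reducer classes
        job.setReducerClass(WordCountReducer.class);

        job.setMapOutputKeyClass(Text.class);          // 3.4 Mapper output kv types
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);             // 3.5 final output kv types
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));    // 3.6 input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));   // 3.7 output directory

        boolean ok = job.waitForCompletion(true);      // 3.8 submit and wait
        System.exit(ok ? 0 : 1);
    }
}
```

Note that the output directory must not already exist when the job is submitted; otherwise the job fails during output validation.
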
3. Project setup
(1) Configure the dependencies in the Maven project's pom.xml

<dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.7.2</version>
        </dependency>
        <dependency>
