Hadoop Day04 ~ Counting words with wordcount in a virtual machine

buzhidaoyaa

Article link: https://blog.csdn.net/buzhidaoyaa/article/details/100568501

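This article walks through the four steps of running a WordCount job with Hadoop in a virtual machine: creating the input file, uploading it to HDFS, running the MapReduce example, and viewing the results. Together, these steps show the basic flow of processing text data with Hadoop.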

Counting words with wordcount

step1:

Create the file wordcount.txt in your home directory with the following content:
hello tom
hello rose
hello jerry
hello TBL
hello tom
hello kitty
hello rose
hello TBL
hello ZDP
hello ZDP
hello TBL
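
One way to create this file from the command line is a here-document. This is a minimal sketch that assumes $HOME is the home directory meant above; any text editor works just as well.

```shell
# Write the 11 sample lines to wordcount.txt in the home directory.
# The quoted 'EOF' keeps the shell from expanding anything in the body.
cat > "$HOME/wordcount.txt" <<'EOF'
hello tom
hello rose
hello jerry
hello TBL
hello tom
hello kitty
hello rose
hello TBL
hello ZDP
hello ZDP
hello TBL
EOF

# Quick check: show the number of lines written.
wc -l < "$HOME/wordcount.txt"
```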

step2:

Create the directory /wc/input/ in HDFS to hold wordcount.txt.
Upload the wordcount.txt file you just created to /wc/input/ in HDFS.
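
The post does not show the commands for this step. A minimal sketch using the standard `hdfs dfs` file system shell might look like the following. The function name `upload_to_hdfs` and the `HDFS_CMD` override are hypothetical helpers introduced here for illustration; the sketch assumes the `hdfs` client is on the PATH and that HDFS is running.

```shell
# Hypothetical helper: create an HDFS directory and upload one local file.
#   $1 = local file, $2 = target HDFS directory
# HDFS_CMD is an assumption of this sketch; it lets you substitute another
# command (for example `echo`) to preview what would be run.
upload_to_hdfs() {
  local_file="$1"
  hdfs_dir="$2"
  hdfs_cmd="${HDFS_CMD:-hdfs}"

  # -p creates parent directories and does not fail if the directory exists.
  "$hdfs_cmd" dfs -mkdir -p "$hdfs_dir" &&
  # Copy the local file into the HDFS directory.
  "$hdfs_cmd" dfs -put "$local_file" "$hdfs_dir" &&
  # List the directory to confirm the upload.
  "$hdfs_cmd" dfs -ls "$hdfs_dir"
}
```

For this walkthrough the call would be `upload_to_hdfs "$HOME/wordcount.txt" /wc/input/`, which runs `hdfs dfs -mkdir -p /wc/input/` followed by `hdfs dfs -put` and `hdfs dfs -ls`.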

step3:

Run the wordcount example from the MapReduce examples that ship with Hadoop:
hadoop jar hadoop-mapreduce-examples-2.8.0.jar wordcount /wc/input/wordcount.txt /wc/output/
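Explanation of the command:
hadoop jar: runs a jar file through Hadoop
hadoop-mapreduce-examples-2.8.0.jar: the jar file to run
wordcount: the name of the example program inside the jar (it maps to the WordCount class)
/wc/input/wordcount.txt: the first argument to wordcount, the input path in HDFS (a single file here; a directory also works)
/wc/output/: the second argument to wordcount, the output directory in HDFS; it must not already exist, or the job fails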
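
To know what counts to expect from the job, the same tally can be computed locally with standard Unix tools. This is only a sanity check and does not use Hadoop; the function name `local_wordcount` is hypothetical. It splits lines on whitespace and sorts by byte order, which should line up with what the example job writes, assuming the default single reducer.

```shell
# Hypothetical local cross-check, not part of Hadoop:
# count whitespace-separated words read from stdin and print "word<TAB>count".
local_wordcount() {
  awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
       END { for (w in count) print w "\t" count[w] }' |
    LC_ALL=C sort   # byte-order sort: uppercase words come before lowercase
}

# Feed it the same 11 lines used as input in step1.
local_wordcount <<'EOF'
hello tom
hello rose
hello jerry
hello TBL
hello tom
hello kitty
hello rose
hello TBL
hello ZDP
hello ZDP
hello TBL
EOF
# Prints (tab-separated):
# TBL     3
# ZDP     2
# hello   11
# jerry   1
# kitty   1
# rose    2
# tom     2
```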