MapReduce Programming: Word Count


Preface

This post is a set of study notes on MapReduce programming, recording what I learned along the way.
Experiment environment:
1. Linux Ubuntu 16.04

2. Hadoop 3.0.0

3. Eclipse 4.5.1


I. Starting Hadoop

  1. Change to the Hadoop startup directory: cd /apps/hadoop/sbin
  2. Start Hadoop: ./start-all.sh
  3. Run jps; after startup, the Hadoop daemon processes should be listed (screenshot omitted).
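For reference only: on a single-node Hadoop 3 setup, start-all.sh launches both HDFS and YARN, so jps typically reports daemons along these lines (process IDs will vary):
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps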

II. Environment Setup

  1. In Eclipse, open Window -> Preferences;

  2. Select Hadoop Map/Reduce, set the Hadoop installation root directory to /apps/hadoop, click Apply, then OK;

  3. Click Window -> Show View -> Other -> MapReduce Tools -> Map/Reduce Locations; the corresponding tab appears (screenshot omitted);

  4. Click icon 1 in the view from step 3, enter myhadoop as the Location name, enter 8020 as the Port under DFS Master (this must match the NameNode address; see the core-site.xml sketch after this list), and click Finish; the right-hand panel from step 3 appears;

  5. Click icon 2 in the view from step 3 and select the items shown (screenshot omitted); the left-hand content from step 3 appears.

This completes the environment configuration.
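The DFS Master port (8020) must agree with the NameNode address configured in Hadoop's core-site.xml. As a minimal sketch, assuming the single-node setup used here, that entry looks roughly like:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>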

III. Word Count

  1. Create a new project named test with a package named word;
  2. Create a class Wordcount (Wordcount.java), then write and save the following code:
package word;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class Wordcount{
	// Mapper: for each input line, emit (word, 1) for every space-separated word.
	public static class WordcountMapper extends Mapper<LongWritable,Text,Text,IntWritable>{
		@Override
		protected void map(LongWritable key, Text value, Context context) throws IOException,InterruptedException{
			String line = value.toString();
			String[] words = line.split(" ");
			for(String word:words){
				context.write(new Text(word),new IntWritable(1));
			}
		}
	}
	// Reducer: sum the 1s grouped under each word to get its total count.
	public static class WordcountReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
		@Override
		protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
			int count = 0;
			for(IntWritable value:values){
				count+=value.get();
			}
			context.write(key,new IntWritable(count));
		}
	}
	public static void main(String[] args) throws IOException,InterruptedException,ClassNotFoundException {
		String dir_in = "hdfs://localhost:8020/word/input";
		String dir_out = "hdfs://localhost:8020/word/output";
		Path in = new Path(dir_in);
		Path out = new Path(dir_out);
		
		Configuration conf = new Configuration();
		// Delete any previous output directory; the job fails if it already exists.
		out.getFileSystem(conf).delete(out, true);
		
		Job job = Job.getInstance(conf);
		job.setJarByClass(Wordcount.class);
		
		job.setMapperClass(WordcountMapper.class);
		job.setReducerClass(WordcountReducer.class);
		
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(IntWritable.class);
		
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);
		
		// A single reducer produces one globally sorted output file.
		job.setNumReduceTasks(1);
		
		FileInputFormat.addInputPath(job,in);
		FileOutputFormat.setOutputPath(job,out);
		
		// Submit the job and exit with 0 on success, 1 on failure.
		System.exit(job.waitForCompletion(true)?0:1);
	}
}
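As an illustration of the data flow on the sample text.txt used below: for the line "Hello Hadoop", the mapper emits (Hello, 1) and (Hadoop, 1). During the shuffle, values are grouped by key, so the reducer is handed (Hadoop, [1, 1, 1]) from the three input lines and writes (Hadoop, 3) to the output.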
  3. Run cp /apps/hadoop/etc/hadoop/{core-site.xml,hdfs-site.xml,log4j.properties} /home/dolphin/workspace/test/src to copy the Hadoop configuration files into the project's src folder;
  4. Create the input directory in HDFS:
hadoop fs -mkdir /word
hadoop fs -mkdir /word/input
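Equivalently, both levels can be created in one step with the -p flag, which creates parent directories as needed:
hadoop fs -mkdir -p /word/input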
  5. Upload the data file to HDFS with hadoop fs -put /home/dolphin/Desktop/text.txt /word/input; text.txt contains the following:
Hello Hadoop
Welcome to Hadoop
I love Hadoop
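To verify the upload, the file can be read back out of HDFS:
hadoop fs -cat /word/input/text.txt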
  6. Run Wordcount.java; the word counts are written to the output directory, as shown below.
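The result can be viewed with hadoop fs -cat /word/output/part-r-00000. Given the sample input above, it should contain the following (the single reducer yields one file with keys in byte order, so capitalized words sort first):
Hadoop	3
Hello	1
I	1
Welcome	1
love	1
to	1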


Summary

This experiment used Hadoop MapReduce to count the occurrences of each word in a text file.
