The previous article covered how to run an HDFS program from IDEA. Quite a few errors came up along the way, but after a lot of searching online and some trial and error it finally ran normally.
In this article we move on to running and debugging a MapReduce program.
Preparation:
Download Hadoop to the local Windows machine.
Download URL: https://archive.apache.org/dist/hadoop/core/stable/hadoop-2.7.3.tar.gz
After extracting the archive, set the environment variables:
HADOOP_HOME = D:\git-mobile-workspace\hadoop-2.7.3
Path (append) = %HADOOP_HOME%\bin;%HADOOP_HOME%\sbin
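If IDEA was started before HADOOP_HOME took effect, the Hadoop client may still fail to find the local installation (on Windows it looks under %HADOOP_HOME%\bin for winutils.exe). Here is a minimal sketch of a programmatic fallback, assuming the same install path as above; hadoop.home.dir is the system property Hadoop's shell utilities check before the environment variable:

// Hedged sketch: call this once before any Hadoop class is touched,
// e.g. as the first line of main() in the driver class below.
public class HadoopEnv {
    public static void init() {
        if (System.getenv("HADOOP_HOME") == null) {
            // Path assumed from the environment-variable setup above.
            System.setProperty("hadoop.home.dir", "D:\\git-mobile-workspace\\hadoop-2.7.3");
        }
    }
}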
1. Directory structure
Building on the Maven project from the previous article, we add the MapReduce code to it.
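For reference, the layout assumed in the rest of this article is roughly the following (package declarations are omitted because the source does not show them):

src/main/java
    CountMain.java
    CountMapper.java
    CountReduce.java
    Tools.java
pom.xml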
2. pom.xml
Same as in the previous article, so it is not repeated here.
3. The CountMain class
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class CountMain {

    public static void main(String[] args) {
        try {
            // Build the job via the Tools helper, then wire up the mapper,
            // the reducer and the HDFS input/output paths.
            Job job = Tools.getJob();
            job.setJarByClass(CountMain.class);
            Tools.setMapper(job, CountMapper.class, Text.class, LongWritable.class);
            Tools.setReduce(job, CountReduce.class, Text.class, LongWritable.class);
            Tools.setInput(job, "/hadoop/mapred_input");
            Tools.setOutPut(job, "/hadoop/mapred_output");
            // Submit the job and wait for it to finish.
            job.waitForCompletion(true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
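One detail that trips people up when re-running the job from IDEA: waitForCompletion() fails with a FileAlreadyExistsException if the output directory already exists on HDFS. Below is a minimal sketch of clearing it beforehand (an assumed helper, called from main() before the job is submitted, using the same /hadoop/mapred_output path):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OutputCleaner {

    // Delete a stale output directory, e.g.
    // OutputCleaner.clear(job.getConfiguration(), "/hadoop/mapred_output");
    public static void clear(Configuration conf, String dir) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path output = new Path(dir);
        if (fs.exists(output)) {
            fs.delete(output, true); // true = delete recursively
        }
    }
}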
4. The CountMapper class
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        if (value != null) {
            // Emit the whole input line as the key with a count of 1.
            String realValue = String.valueOf(value);
            context.write(new Text(realValue), new LongWritable(1));
        }
    }
}
5. The CountReduce class
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class CountReduce extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum up the 1s emitted by the mapper for this key.
        long total = 0;
        for (LongWritable value : values) {
            total += value.get();
        }
        context.write(key, new LongWritable(total));
    }
}
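Taken together, the mapper emits each input line as a key with the count 1 and the reducer sums those counts, so the job counts how many times every distinct line occurs under /hadoop/mapred_input. As a purely illustrative example: an input file with the three lines "hadoop", "spark", "hadoop" would produce an output file containing "hadoop 2" and "spark 1".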
6. The Tools class
public class Tools {