Create a new project in IDEA.
After importing the jar packages into the "lib" folder, right-click it and choose "Add as Library" → "OK".
Create a new package "com.mapreduce" under "src".
"WordCountMapper.java" is as follows:
package com.mapreduce;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Each input value is one line of text; split it on spaces
        String line = value.toString();
        String[] words = StringUtils.split(line, " ");
        // Emit <word, 1> for every word in the line
        for (String word : words) {
            context.write(new Text(word), new LongWritable(1));
        }
    }
}
"WordCountReducer.java" is as follows:
package com.mapreduce;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
        // Sum the counts emitted by the mappers for this word
        long count = 0;
        for (LongWritable value : values) {
            count = count + value.get();
        }
        context.write(key, new LongWritable(count));
    }
}
"WordCountRunner.java" is as follows:
package com.mapreduce;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class WordCountRunner {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration configuration = new Configuration();
        Job wcjob = Job.getInstance(configuration);
        // Set the mapper and reducer classes for the job
        wcjob.setMapperClass(WordCountMapper.class);
        wcjob.setReducerClass(WordCountReducer.class);
        // Set the runner class so Hadoop can locate the jar
        wcjob.setJarByClass(WordCountRunner.class);
        // Mapper output key/value types
        wcjob.setMapOutputKeyClass(Text.class);
        wcjob.setMapOutputValueClass(LongWritable.class);
        // Reducer (final) output key/value types
        wcjob.setOutputKeyClass(Text.class);
        wcjob.setOutputValueClass(LongWritable.class);
        // Input directory; the output directory must not exist before the job runs
        FileInputFormat.setInputPaths(wcjob, new Path("hdfs://192.168.2.100:9000/wc/"));
        FileOutputFormat.setOutputPath(wcjob, new Path("hdfs://192.168.2.100:9000/out/"));
        // Submit the job and exit with a status reflecting success or failure
        System.exit(wcjob.waitForCompletion(true) ? 0 : 1);
    }
}
Click as shown in the figure below.
The "mapreduce.jar" file can then be found under "E:\mapreduce\out\artifacts\mapreduce_jar".
Double-click to open it.
Start it.
Create a new file "words.log" and write the following content into it.
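The exact contents of "words.log" are shown only in the original screenshot; as an illustration, a space-delimited sample file (hypothetical content) can be created like this:

```shell
# Create a sample words.log -- these three lines are placeholder
# content, not the actual data from the tutorial's screenshot
cat > words.log <<'EOF'
hello hadoop
hello mapreduce
hello world
EOF
cat words.log
```

Any plain-text file with space-separated words works, since the mapper splits each line on spaces.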
In the web UI, create a "wc" directory and upload the "words.log" file into "wc".
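Equivalently, the directory can be created and the file uploaded from the command line on a cluster node (this is a sketch; it assumes a running HDFS and the same paths as above):

```shell
# Create the /wc input directory in HDFS and upload the input file
hadoop fs -mkdir -p /wc
hadoop fs -put words.log /wc/
# Verify the upload
hadoop fs -ls /wc
```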
Press "Alt+P" to open the SFTP file-transfer window, then drag "mapreduce.jar" from "E:\mapreduce\out\artifacts\mapreduce_jar" into it.
The input splits:
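The job itself is launched from the node where the jar was uploaded; a sketch, assuming the jar sits in the current directory (if the IDEA artifact's manifest already names a main class, the class argument can be omitted):

```shell
# Submit the job; Hadoop prints the number of input splits,
# map/reduce progress, and counters to the console
hadoop jar mapreduce.jar com.mapreduce.WordCountRunner
```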
Open "192.168.2.100:50070" in a browser and navigate to the "/out" path to see the result.
Alternatively, run "hadoop fs -ls /out" to list it directly.
Use "hadoop fs -cat /out/part-r-00000" to view the contents of "/out/part-r-00000".