1. First Step
Copy the configuration files out of the cluster, including core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
Note: edit the configuration files! Replace each hostname with its corresponding IP address (or, on Windows, map hostnames to IPs in the hosts file).
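If you take the hosts route, each mapping is one line. A minimal sketch of a Windows hosts entry (C:\Windows\System32\drivers\etc\hosts); the IP and hostname below are hypothetical, so replace them with your cluster's actual values:

192.168.56.101   master    # hypothetical: replace with your cluster's real IP and hostname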
Once copied, place them under the src directory of the project containing the class you want to run.
Since I am running the WordCount class, I placed these four configuration files under that project's src directory.
2. Second Step
In the main method of the class you wrote (WordCount.java), add the content described below.
In the main method, the lines:
Path pathInput = new Path("/betty");
Path pathOutput = new Path("/betty/output");
can be replaced with:
Path pathInput = new Path(args[0]);
Path pathOutput = new Path(args[1]);
but then you must enter the paths under Run As -> Run Configurations -> Arguments, separating the paths with a space.
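For example, with the paths used in this article, the Arguments box would contain:

/betty /betty/output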
Note: if Run Configurations has no entry for your program yet, run the program once via Run As first and the entry will appear. Also check on the Main tab that the project and main class information belongs to the project you want to run; if not, change it.
Also, first check whether the name of the class you want to run is listed under Java Application in the left-hand panel; if it is not, try clicking the first icon above that panel (it creates a new launch configuration).
What was actually added are the following two lines:
1. conf.set("mapreduce.app-submission.cross-platform", "true");
Note!! This line must come before Job job = Job.getInstance(conf, "word count"); (see the sketch below); otherwise the job fails with the following error:

Exception message: /bin/bash: line 0: fg: no job control
Stack trace: ExitCodeException exitCode=1: /bin/bash: line 0: fg: no job control
This is because Job.getInstance(conf, "word count") takes its settings from the Configuration at the moment it is called, so conf must be fully set up before the Job is created; a conf.set("mapreduce.app-submission.cross-platform", "true") made afterwards leaves the Job with exactly the same configuration as if the call had never happened.
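A minimal sketch of the ordering issue (the property name is the real one; only the ordering is the point):

// Correct: finish modifying conf first, then create the Job.
Configuration conf = new Configuration();
conf.set("mapreduce.app-submission.cross-platform", "true");
Job job = Job.getInstance(conf, "word count"); // the Job snapshots conf here

// Wrong: by this point the Job already holds its own copy of conf,
// so a later set() never reaches the submitted job:
// Job job = Job.getInstance(conf, "word count");
// conf.set("mapreduce.app-submission.cross-platform", "true");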
Alternatively, you can set the mapreduce.app-submission.cross-platform property to true directly in mapred-site.xml, as follows:

<property>
    <description>If enabled, user can submit an application cross-platform,
    i.e. submit an application from a Windows client to a Linux/Unix server,
    or vice versa.</description>
    <name>mapreduce.app-submission.cross-platform</name>
    <value>true</value>
</property>
2. job.setJar("mc.jar");
Note!!! The string inside the parentheses is tied to the Third Step: it must match the name of the jar you export there.
3. Third Step
Package the mrmission project into a jar and place it under the project root (note: the project root, i.e. directly inside mrmission itself)!
Packaging steps: select the project to package -> right-click -> Export -> Java -> JAR File -> Next -> JAR file (choose where the jar is generated) -> Next -> Next -> Main class (pick the class you want to run; mine is WordCount) -> Finish. The resulting layout is sketched below.
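To make the link with job.setJar("mc.jar") concrete, the project should end up looking roughly like this (the jar name mc.jar and the four config files come from the steps above; the rest follows from the package name):

mrmission/
    mc.jar                      <-- exported jar; name must match job.setJar("mc.jar")
    src/
        core-site.xml
        hdfs-site.xml
        mapred-site.xml
        yarn-site.xml
        com/hyxy/mrmission/WordCount.java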
4. Click Run As -> select Run On Hadoop -> the job executes successfully.
5. Complete Code
package com.hyxy.mrmission;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: split each input line into tokens and emit (word, 1).
    public static class MyMapper extends Mapper<Object, Text, Text, IntWritable> {
        Text word = new Text();
        final IntWritable one = new IntWritable(1);

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sum the counts for each word and emit (word, total).
    public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Must be set BEFORE Job.getInstance(), which copies conf (see Second Step).
        conf.set("mapreduce.app-submission.cross-platform", "true");
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        // The jar exported in the Third Step, sitting at the project root.
        job.setJar("mc.jar");
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/betty"));
        FileOutputFormat.setOutputPath(job, new Path("/betty/output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
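Once the job finishes, you can check the result from the cluster's shell. A hedged example using the paths above (by default the single reducer writes its output to part-r-00000 in the output directory):

hdfs dfs -cat /betty/output/part-r-00000

Also note that FileOutputFormat requires the output directory not to exist yet, so before re-running the job, remove it first:

hdfs dfs -rm -r /betty/output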