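Resolution: the jar I had built with Maven was faulty. Repackaging it with IntelliJ IDEA, following the steps below, fixed the problem (I had tripped myself up). The exact steps are shown in the screenshots below; I am recording them here for future reference.
Steps:
Start with the logs: in the Hadoop web UI, find the application_xxxx you ran and click through to its log. The most frequent message was still that my own xxxmapper class could not be found. Before this I had gone through many articles found via Baidu and Bing. They suggested the following fixes, and I tried each of them several times.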
- In the code, call job.setJarByClass(WordCountDriver.class); or job.setJar("/xxx/job/WordCountDriver.jar"), where /xxx/job/ is a directory created on the server to hold the jar
- In the code, set it on the configuration: configuration.set("mapred.jar", System.getProperty("user.dir") + "/WordCountDriver.jar") or configuration.set("mapreduce.application.classpath", System.getProperty("user.dir"))
- Edit Hadoop's mapred-site.xml and yarn-site.xml and add property entries
My situation was different, so none of these helped, which made me suspect the way I had packaged the jar.
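Before repackaging, I located the hadoop-mapreduce-examples-2.7.5.jar that ships with Hadoop, under $HADOOP_HOME/share/hadoop/mapreduce (that is, inside your own Hadoop installation directory), and ran the command below, with xxxx replaced by your own path. It ran successfully, so the environment itself was fine.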
hadoop jar /xxxx/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar wordcount /input/c.txt /out
Next, rebuild the jar step by step.
The finished jar contains a META-INF folder.
Inside META-INF is the file MANIFEST.MF, which shows the entry class you specified.
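The same check can be done without opening the jar by hand. The following is a minimal sketch using only the JDK's java.util.jar classes; JarInspector and its two methods are hypothetical names introduced for illustration, not part of the original post or of Hadoop. It reads the Main-Class attribute from META-INF/MANIFEST.MF and tests whether a given class file is actually inside the jar.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical helper (not from the original post) for inspecting a job jar. */
final class JarInspector {

    /** Returns the Main-Class declared in META-INF/MANIFEST.MF, or null if there is none. */
    static String mainClass(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    /** Returns true if the jar holds a .class entry for the given binary class name. */
    static boolean containsClass(String jarPath, String binaryName) throws IOException {
        // A class a.b.C is stored in the jar as the entry a/b/C.class
        String entry = binaryName.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entry) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar, args[1]: binary class name to look for
        System.out.println("Main-Class: " + mainClass(args[0]));
        System.out.println(args[1] + " present: " + containsClass(args[0], args[1]));
    }
}
```

If the second line prints false for the mapper class, the jar was packaged without it, which would be consistent with a "class not found" failure at run time.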
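Upload the jar to the server and run the command again: hadoop jar /xxx/WordCountDriver.jar /input/xxx.txt /out. This time the job produced its output successfully.
Check the result in the out output directory:
Source file:
Result: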
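--------------------------------------------- Divider: problem description -----------------------------------------------
Environment: Linux, hadoop-2.7.5, single node (for my own practice). It starts normally, and creating, deleting and listing files on HDFS all work without any problem.
Main program: adapted from code written by other bloggers.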
package com.guo.self.dubai.hadoop.mapreduce;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.io.Text;
import java.io.IOException;
/**
 * @date 2023/8/25
 * @description Driver class for the MapReduce job
 */
public class WordCountDriver extends Configured implements Tool {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        System.out.println("Entering WordCountDriver main");
        Configuration configuration = new Configuration();
        try {
            System.out.println("[WordCountDriver] input path: " + args[0]);
            System.out.println("[WordCountDriver] output path: " + args[1]);
            // Delete the output directory if it already exists
            Path fileOutPath = new Path(args[1]);
            FileSystem fileSystem = FileSystem.get(configuration);
            if (fileSystem.exists(fileOutPath)) {
                fileSystem.delete(fileOutPath, true);
            }
            int status = ToolRunner.run(configuration, new WordCountDriver(), args);
            System.exit(status);
        } catch (Exception e) {
            e.printStackTrace();
            // Report the failure to the shell instead of exiting with status 0
            System.exit(1);
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        // 1. Create the job from the configuration
        Job job = Job.getInstance(this.getConf(), "myHadoopJob");
        // 2. Locate the job jar through the driver class
        job.setJarByClass(WordCountDriver.class);
        // 3. Set the mapper and reducer classes
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        System.out.println("[WordCountDriver] WordCountMapper: " + WordCountMapper.class);
        System.out.println("[WordCountDriver] WordCountReducer: " + WordCountReducer.class);
        // 4. Set the mapper output key/value types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // 5. Set the final output key/value types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // 6. Set the input and output paths
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // 7. Submit the job and wait for it to finish
        boolean result = job.waitForCompletion(true);
        System.out.println("[WordCountDriver] job finished");
        return result ? 0 : 1;
    }

    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        // Reused across calls to avoid allocating a new object per record
        Text k = new Text();
        IntWritable v = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // 1. Read one line
            String line = value.toString();
            // 2. Split it on single spaces
            String[] words = line.split(" ");
            // 3. Emit (word, 1) for every token
            for (String word : words) {
                k.set(word);
                context.write(k, v);
            }
        }
    }

    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        int sum;
        IntWritable v = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            // 1. Sum the counts for this key
            sum = 0;
            for (IntWritable count : values) {
                sum += count.get();
            }
            // 2. Emit (word, total)
            v.set(sum);
            context.write(key, v);
        }
    }
}
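To sanity-check the counting logic without a cluster, the mapper and reducer steps can be imitated in plain Java. This is only a rough sketch under stated assumptions: LocalWordCount and count are hypothetical names that do not appear in the job above, and no Hadoop classes are involved. It splits each line on a single space as the mapper does, then sums a 1 per occurrence as the reducer does, keeping the keys sorted in a TreeMap.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical local stand-in for the word count logic; it does not use Hadoop. */
final class LocalWordCount {

    /** Counts tokens across all lines, splitting on a single space like the mapper. */
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by key
        for (String line : lines) {
            // "Map" step: one (word, 1) pair per token
            for (String word : line.split(" ")) {
                // "Reduce" step: add the 1 to the running total for this word
                totals.merge(word, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(count(Arrays.asList("hello hadoop", "hello world")));
    }
}
```

Because split(" ") is used, two consecutive spaces in a line produce an empty-string token that gets counted like any other word; the mapper above behaves the same way.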
Packaging and execution: put the jar in the chosen directory on the server under the name WordCountDriver.jar, then run: hadoop jar WordCountDriver.jar /input/c.txt /out
Error screenshot: Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.xxx.hadoop.mapreduce.WordCountDriver$WordCountMapper not found
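The name in the error, WordCountDriver$WordCountMapper, is the JVM's binary name for the static nested class WordCountMapper. javac writes every nested class to its own .class file, and that file has to be present in the jar alongside the driver's. A tiny sketch with hypothetical names (BinaryNameDemo, Inner and entryPath are illustrative only) shows the naming:

```java
/** Hypothetical demo of how a nested class is named on disk and inside a jar. */
final class BinaryNameDemo {

    /** Stands in for a nested mapper class. */
    static class Inner {
    }

    /** Returns the path a class file would have inside a jar. */
    static String entryPath(Class<?> type) {
        // getName() yields the binary name, with '$' between outer and nested class
        return type.getName().replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(Inner.class.getName());
        System.out.println(entryPath(Inner.class));
    }
}
```

For the job in this post, the entry to look for in the jar would therefore end in WordCountDriver$WordCountMapper.class, under the directory that matches the package.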