Hadoop Notes for Beginners (3): Using Eclipse to package WordCount into a jar runnable on Hadoop

Hadoop version: hadoop-1.2.1

Eclipse version: eclipse-standard-kepler-SR2-win32-x86_64

WordCount.java is taken from hadoop-1.2.1\src\examples\org\apache\hadoop\examples\WordCount.java
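
As an aside, the examples jar that ships with the release can already run this class, which is a handy sanity check on the cluster before building your own jar (wordcount is the lowercase driver name registered in the examples jar; input and output are HDFS paths you supply):

hadoop jar hadoop-examples-1.2.1.jar wordcount input output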

/**
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 *  Unless required by applicable law or agreed to in writing, software
 *  distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
 */

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper 
       extends Mapper<Object, Text, Text, IntWritable>{
    
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
      
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }
  
  public static class IntSumReducer 
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

In Eclipse, create a new Java project named WordCount.

In the project, create a new class, also named WordCount.

Overwrite the generated WordCount.java with the code above,

and change the package declaration at the top to wordcount. The modified source is as follows:

package wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper 
       extends Mapper<Object, Text, Text, IntWritable>{
    
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
      
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }
  
  public static class IntSumReducer 
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

 

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

As you can see, the source imports quite a few Hadoop classes that are not part of the JDK. These dependencies have to be added to the Eclipse build path, otherwise the compiler has no way of finding the classes; it must be told explicitly where they live.

If you compile and run at this point, you will hit errors like the following:

Exception in thread "main" java.lang.Error: Unresolved compilation problems: 
    The import org.apache.commons cannot be resolved
    The import org.apache.commons cannot be resolved
    The import org.codehaus cannot be resolved
    The import org.codehaus cannot be resolved
    Log cannot be resolved to a type
    LogFactory cannot be resolved
    ... ("Log cannot be resolved to a type" repeated many more times) ...
    JsonFactory cannot be resolved to a type
    JsonFactory cannot be resolved to a type
    JsonGenerator cannot be resolved to a type

    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:60)
    at wordcount.WordCount.main(WordCount.java:52)

The cause is missing dependency jars; adding them to the build path fixes it.

Use Add External JARs to add every jar file under hadoop-1.2.1\lib (and, if it is not on the build path already, hadoop-core-1.2.1.jar from the Hadoop root directory, which contains the org.apache.hadoop classes themselves).
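
For reference, the same dependency resolution can be done outside Eclipse by putting the jars on javac's classpath. A minimal sketch, run on the Hadoop machine, assuming the source was saved as src/wordcount/WordCount.java and HADOOP_HOME points at the hadoop-1.2.1 install (both paths are assumptions about your layout):

mkdir -p classes
javac -classpath $HADOOP_HOME/hadoop-core-1.2.1.jar -d classes src/wordcount/WordCount.java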

Compile and run again; this time it succeeds.

Finally, package it into a jar file:

File -> Export -> Java -> JAR file

The jar does not have to be named after the class; WordCount.jar could just as well be CountWord.jar, it makes little difference. Pick a name and click Finish.
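
If you prefer the command line, the packaging step is a single jar command. A sketch, assuming the compiled .class files landed in the classes directory used above:

jar -cvf WordCount.jar -C classes/ .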

The jar can now be run on Hadoop. For a detailed walkthrough of running WordCount, see Hadoop集群(第6期)_WordCount运行详解:

hadoop jar WordCount.jar wordcount.WordCount input output

Note that the fully qualified name wordcount.WordCount is required here. The code above no longer uses

package org.apache.hadoop.examples;

but it does declare package wordcount;, and as soon as a class lives in a package the jar contains a matching directory hierarchy, so a bare hadoop jar WordCount.jar WordCount input output will not find the main class; the main class has to be spelled out in full. Had we kept the original package, the command would instead be

hadoop jar WordCount.jar org.apache.hadoop.examples.WordCount input output
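
One more option: hadoop jar honors the Main-Class attribute in the jar's manifest, so if you select the main class on the last page of the Eclipse export wizard (or write it into the manifest yourself), the class name can be dropped from the command entirely:

hadoop jar WordCount.jar input output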

The packaging steps here are also covered in [hadoop]命令行编译并运行hadoop例子WordCount.
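
Putting it all together, an end-to-end run might look like this; a sketch assuming two local text files file1.txt and file2.txt as input (the HDFS paths are relative to the user's home directory, and part-r-00000 is the reducer's output file):

hadoop fs -mkdir input
hadoop fs -put file1.txt file2.txt input
hadoop jar WordCount.jar wordcount.WordCount input output
hadoop fs -cat output/part-r-00000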



Reposted from: 林羽飞扬
