Configuring a Hadoop Development Environment in MyEclipse

3. Install the Hadoop development plugin:
Download hadoop-1.0.3-eclipse-plugin.jar (http://ishare.iask.sina.com.cn/f/24669534.html) and copy it into the /dropins directory under the MyEclipse installation root:

4. Start Eclipse and open the perspective:
【Window】->【Open Perspective】->【Other...】->【Map/Reduce】->【OK】

5. Open a view:
【Window】->【Show View】->【Other...】->【MapReduce Tools】->【Map/Reduce Locations】->【OK】

6. Add a Hadoop location:
In Advanced parameters, change:
hadoop.tmp.dir=/home/xsj/hadoop/hadoop-xsj
mapred.child.java.opts=-Xmx512m
Note: changing hadoop.tmp.dir also affects mapred.local.dir, mapred.system.dir, mapred.temp.dir, fs.s3.buffer.dir, fs.checkpoint.dir, fs.checkpoint.edits.dir, dfs.name.dir, dfs.name.edits.dir, dfs.data.dir and related properties, so restart Eclipse after making the change.
The mapred.child.java.opts entry may not exist at first; you can add it later. The 512 here matches the memory of my Ubuntu virtual machine, so adjust it to your own setup.
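To confirm that an -Xmx setting like the one above actually took effect, you can ask the JVM for its maximum heap size. A minimal sketch in plain Java (no Hadoop dependency; the class name is my own):

```java
public class HeapCheck {
    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to use. Launched with
        // -Xmx512m, this prints a value close to 512 (the exact
        // number varies slightly by JVM implementation).
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("max heap (MB): " + maxBytes / (1024 * 1024));
    }
}
```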
After the change you should see the following HDFS view:

7. Add text files:
$ ./hadoop fs -mkdir /user/xsj/input
$ ./hadoop fs -put ./*.txt /user/xsj/input

8. Create a Map/Reduce project:
【File】->【New】->【Project...】->【Map/Reduce】->【Map/Reduce Project】->【Project name: WordCount】->【Configure Hadoop install directory...】->【Hadoop installation directory: /home/xsj/hadoop/hadoop-0.20.2】->【Apply】->【OK】->【Next】->【Allow output folders for source folders】->【Finish】

9. Create the WordCount class:
【WordCount】->【src】->【New】->【Class】->【Package: org.apache.hadoop.examples】->【Name: WordCount】->【Finish】
Add/write the source code, taken from:
/home/xsj/hadoop/hadoop-0.20.2/src/examples/org/apache/hadoop/examples/WordCount.java

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

   public static class TokenizerMapper 
           extends Mapper<Object, Text, Text, IntWritable>{
      
      private final static IntWritable one = new IntWritable(1);
      private Text word = new Text();
         
      public void map(Object key, Text value, Context context
                              ) throws IOException, InterruptedException {
         StringTokenizer itr = new StringTokenizer(value.toString());
         while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
         }
      }
   }
   
   public static class IntSumReducer 
           extends Reducer<Text,IntWritable,Text,IntWritable> {
      private IntWritable result = new IntWritable();

      public void reduce(Text key, Iterable<IntWritable> values, 
                                   Context context
                                   ) throws IOException, InterruptedException {
         int sum = 0;
         for (IntWritable val : values) {
            sum += val.get();
         }
         result.set(sum);
         context.write(key, result);
      }
   }

   public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
      if (otherArgs.length != 2) {
         System.err.println("Usage: wordcount <in> <out>");
         System.exit(2);
      }
      Job job = new Job(conf, "word count");
      job.setJarByClass(WordCount.class);
      job.setMapperClass(TokenizerMapper.class);
      job.setCombinerClass(IntSumReducer.class);
      job.setReducerClass(IntSumReducer.class);
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);
      FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
      FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
   }
}
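Stripped of the Hadoop plumbing, the map/combine/reduce logic above amounts to tokenizing each line and summing a count of 1 per word occurrence. A self-contained sketch of that core logic (class and method names are my own, for illustration only):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class WordCountSketch {
    // Mirrors TokenizerMapper + IntSumReducer for in-memory input:
    // the mapper emits (word, 1) per token, the reducer sums per key.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                String word = itr.nextToken();        // map: emit (word, 1)
                counts.merge(word, 1, Integer::sum);  // reduce: sum the 1s
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = count(new String[] {"hello world", "hello hadoop"});
        System.out.println(c.get("hello")); // 2
        System.out.println(c.get("world")); // 1
    }
}
```

In the real job the grouping by key between map and reduce is done by Hadoop's shuffle/sort phase; the HashMap stands in for that here.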

10. Configure the run parameters:
【Run】->【Run Configurations】->【Java Application】->【Word Count】->【Arguments】->【Program arguments: hdfs://localhost:9000/user/xsj/input/* hdfs://localhost:9000/user/xsj/output】->【VM arguments: -Xms512m -Xmx512m】->【Apply】->【Close】->【Run】->【Run As】->【Run On Hadoop】

11. Run:
A normal run produces console output like the following:
12/06/01 10:23:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/06/01 10:23:33 WARN mapred.JobClient: No job jar file set.   User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
12/06/01 10:23:36 INFO input.FileInputFormat: Total input paths to process : 2
12/06/01 10:23:37 INFO mapred.JobClient: Running job: job_local_0001
12/06/01 10:23:37 INFO input.FileInputFormat: Total input paths to process : 2
12/06/01 10:23:37 INFO mapred.MapTask: io.sort.mb = 100
12/06/01 10:23:40 INFO mapred.MapTask: data buffer = 79691776/99614720
12/06/01 10:23:40 INFO mapred.MapTask: record buffer = 262144/327680
12/06/01 10:23:44 INFO mapred.JobClient:   map 0% reduce 0%
12/06/01 10:23:52 INFO mapred.MapTask: Starting flush of map output
12/06/01 10:23:59 INFO mapred.LocalJobRunner: 
12/06/01 10:23:59 INFO mapred.MapTask: Finished spill 0
12/06/01 10:24:00 INFO mapred.JobClient:   map 100% reduce 0%
12/06/01 10:24:00 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/06/01 10:24:03 INFO mapred.LocalJobRunner: 
12/06/01 10:24:03 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
12/06/01 10:24:04 INFO mapred.MapTask: io.sort.mb = 100
12/06/01 10:24:08 INFO mapred.MapTask: data buffer = 79691776/99614720
12/06/01 10:24:08 INFO mapred.MapTask: record buffer = 262144/327680
12/06/01 10:24:10 INFO mapred.MapTask: Starting flush of map output
12/06/01 10:24:10 INFO mapred.MapTask: Finished spill 0
12/06/01 10:24:11 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
12/06/01 10:24:12 INFO mapred.LocalJobRunner: 
12/06/01 10:24:12 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.
12/06/01 10:24:14 INFO mapred.LocalJobRunner: 
12/06/01 10:24:16 INFO mapred.Merger: Merging 2 sorted segments
12/06/01 10:24:17 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 77 bytes
12/06/01 10:24:17 INFO mapred.LocalJobRunner: 
12/06/01 10:24:20 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/06/01 10:24:20 INFO mapred.LocalJobRunner: 
12/06/01 10:24:20 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/06/01 10:24:21 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://localhost:9000/user/xsj/output
12/06/01 10:24:21 INFO mapred.LocalJobRunner: reduce > reduce
12/06/01 10:24:21 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
12/06/01 10:24:22 INFO mapred.JobClient:   map 100% reduce 100%
12/06/01 10:24:22 INFO mapred.JobClient: Job complete: job_local_0001
12/06/01 10:24:22 INFO mapred.JobClient: Counters: 14
12/06/01 10:24:22 INFO mapred.JobClient:    FileSystemCounters
12/06/01 10:24:22 INFO mapred.JobClient:       FILE_BYTES_READ=50488
12/06/01 10:24:22 INFO mapred.JobClient:       HDFS_BYTES_READ=120
12/06/01 10:24:22 INFO mapred.JobClient:       FILE_BYTES_WRITTEN=102748
12/06/01 10:24:22 INFO mapred.JobClient:       HDFS_BYTES_WRITTEN=41
12/06/01 10:24:22 INFO mapred.JobClient:    Map-Reduce Framework
12/06/01 10:24:22 INFO mapred.JobClient:       Reduce input groups=5
12/06/01 10:24:22 INFO mapred.JobClient:       Combine output records=6
12/06/01 10:24:22 INFO mapred.JobClient:       Map input records=4
12/06/01 10:24:22 INFO mapred.JobClient:       Reduce shuffle bytes=0
12/06/01 10:24:22 INFO mapred.JobClient:       Reduce output records=5
12/06/01 10:24:22 INFO mapred.JobClient:       Spilled Records=12
12/06/01 10:24:22 INFO mapred.JobClient:       Map output bytes=81
12/06/01 10:24:22 INFO mapred.JobClient:       Combine input records=8
12/06/01 10:24:22 INFO mapred.JobClient:       Map output records=8
12/06/01 10:24:22 INFO mapred.JobClient:       Reduce input records=6

12. View the results:
The output is written to the file /user/xsj/output/part-r-00000: