MapReduce Programming

There are two ways to write MapReduce programs: one uses the Hadoop plugin for Eclipse, the other imports the required jar packages directly. This post covers the second approach, programming by importing the jars.

I. Open Eclipse and import the jars needed for MapReduce development:

1. All jars under hadoop/share/hadoop/mapreduce; the jars in its subdirectories are not needed.

2. hadoop-common-2.7.1.jar under hadoop/share/hadoop/common

3. commons-cli-1.2.jar under hadoop/share/hadoop/common/lib

II. Write the program. A simple example is used here to illustrate.

1. Program structure

The project consists of three source files: MaxTemperatureDriver.java (configures and submits the job), MaxTemperatureMapper.java, and MaxTemperatureReducer.java.

2. Program code:

MaxTemperatureDriver.java

package pers.peng.maxtemperature;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MaxTemperatureDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.printf("Usage: %s <input> <output>%n", getClass().getSimpleName());
            ToolRunner.printGenericCommandUsage(System.err);
            return -1;
        }

        // Job.getInstance() replaces the deprecated new Job(Configuration) constructor
        Configuration conf = getConf();
        Job job = Job.getInstance(conf, "Max Temperature");
        job.setJarByClass(getClass());

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new MaxTemperatureDriver(), args);
        System.exit(exitCode);
    }
}
MaxTemperatureMapper.java

package pers.peng.maxtemperature;

import java.io.IOException; 
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Counter;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        try {
            // Input lines look like "2016 25": a 4-digit year, a space, then a two-digit temperature
            String year = line.substring(0, 4);
            int airTemperature = Integer.parseInt(line.substring(5, 7));
            context.write(new Text(year), new IntWritable(airTemperature));

            // Debug counters: their totals show up in the job's counter output
            Counter countPrint0 = context.getCounter("Test", "空");
            countPrint0.increment(1L);
            Counter countPrint = context.getCounter("Map1111", line.substring(5, 7));
            countPrint.increment(1L);
        } catch (Exception e) {
            // Skip malformed lines instead of failing the whole task
            System.err.println("Error in line: " + line);
        }
    }
}
MaxTemperatureReducer.java

package pers.peng.maxtemperature;

import java.io.IOException; 
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // For each year, keep the largest temperature seen across all map outputs
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}
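
Because the mapper parses each line by fixed character positions, it can be worth sanity-checking those substring indices on a sample record before packaging the job. The class below is a minimal, Hadoop-free sketch of such a check (not part of the original project); the sample line matches the test data format described in step 4.

ParseCheck.java

package pers.peng.maxtemperature;

public class ParseCheck {
    public static void main(String[] args) {
        String line = "2016 25";                                    // same format as the test data in step 4
        String year = line.substring(0, 4);                         // characters 0-3: the year
        int temperature = Integer.parseInt(line.substring(5, 7));   // characters 5-6: the temperature
        System.out.println(year + " -> " + temperature);            // expected output: 2016 -> 25
    }
}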

3. Export as a jar

"Right-click the project" -> "Export" -> "Java" -> "JAR file" -> "Next"

Choose a location for the jar and give it a name; mine is /usr/local/hadoop/MaxTemperature.jar. Leave the other options at their defaults.

"Next"->"Next"->"此时在最下面要选择一个主类作为程序的入口,我选择MaxTemperatureDriver"->"Finish"

4. Write test data

Each record has the form: 2016 25

That is, a year first, then a space, then the temperature. How many records to write is up to you; they can be split across three or four .txt files.
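
If you do not want to type these files by hand, a small generator like the one below can produce them. This is only an illustrative sketch, not part of the original project: the file names, record count, and value ranges are arbitrary choices, and the temperatures are kept to two digits because the mapper reads the value from a fixed two-character position.

TestDataGenerator.java

package pers.peng.maxtemperature;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.Random;

public class TestDataGenerator {
    public static void main(String[] args) throws IOException {
        Random random = new Random();
        // Write three files, each with 100 lines of "year temperature"
        for (int fileNo = 1; fileNo <= 3; fileNo++) {
            try (PrintWriter out = new PrintWriter("temperature" + fileNo + ".txt")) {
                for (int i = 0; i < 100; i++) {
                    int year = 2010 + random.nextInt(7);        // years 2010-2016
                    int temperature = 10 + random.nextInt(30);  // two-digit temperatures, 10-39
                    out.println(year + " " + temperature);
                }
            }
        }
    }
}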

5. Start Hadoop and put the test data on HDFS

Start HDFS and YARN (for example with start-dfs.sh and start-yarn.sh), then copy the test data files into an input directory on HDFS.
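
The upload is usually done with the HDFS shell (hdfs dfs -put). If you would rather stay in Java, the sketch below copies the files with the FileSystem API. The address hdfs://localhost:9000 and the directory /user/hadoop/input are assumptions for a typical single-node setup, not values from this tutorial; adjust them to your cluster.

UploadTestData.java

package pers.peng.maxtemperature;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadTestData {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed single-node HDFS address; use your cluster's fs.defaultFS value instead
        conf.set("fs.defaultFS", "hdfs://localhost:9000");

        FileSystem fs = FileSystem.get(conf);
        Path inputDir = new Path("/user/hadoop/input");   // hypothetical HDFS input directory
        fs.mkdirs(inputDir);
        for (int fileNo = 1; fileNo <= 3; fileNo++) {
            fs.copyFromLocalFile(new Path("temperature" + fileNo + ".txt"), inputDir);
        }
        fs.close();
    }
}

Whichever way you upload the data, the HDFS input directory chosen here is what you pass as the first argument to MaxTemperatureDriver; the second argument is an output directory that must not already exist.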
