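Patterns for combining multiple MapReduce jobs

1. Job chaining

Several MapReduce jobs can be created and then run one after another, each job starting only after the previous one has finished.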

	Old API:
	     // Create a new JobConf
	     JobConf job = new JobConf(new Configuration(), MyJob.class);
	     
	     // Specify various job-specific parameters     
	     job.setJobName("myjob");
	     
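	     // With the old API, paths are set through the org.apache.hadoop.mapred helpers
	     FileInputFormat.setInputPaths(job, new Path("in"));
	     FileOutputFormat.setOutputPath(job, new Path("out"));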
	     
	     job.setMapperClass(MyJob.MyMapper.class);
	     job.setReducerClass(MyJob.MyReducer.class);
	
	     // Submit the job, then poll for progress until the job is complete
	     JobClient.runJob(job);
	
	New API:
	     // Create a new Job
	     // Job.getInstance replaces the deprecated Job constructors
	     Job job = Job.getInstance(new Configuration());
	     job.setJarByClass(MyJob.class);
	     
	     // Specify various job-specific parameters     
	     job.setJobName("myjob");
	     
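	     // With the new API, paths are set through the org.apache.hadoop.mapreduce.lib helpers
	     FileInputFormat.addInputPath(job, new Path("in"));
	     FileOutputFormat.setOutputPath(job, new Path("out"));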
	     
	     job.setMapperClass(MyJob.MyMapper.class);
	     job.setReducerClass(MyJob.MyReducer.class);
	
	     // Submit the job, then poll for progress until the job is complete
	     job.waitForCompletion(true);
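
The control flow of a job chain can be sketched without Hadoop at all. The following is a minimal plain-Java model, not Hadoop code: each step is a hypothetical `BooleanSupplier` standing in for a call such as `job.waitForCompletion(true)`, and the class name `JobChainSketch` and method `runInOrder` are made up for illustration. It shows the usual driver rule that a later job should only start if the earlier one succeeded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Plain-Java model of a job chain; no Hadoop classes are involved. */
final class JobChainSketch {

    /**
     * Runs the steps in insertion order and stops at the first failure.
     * Returns the names of the steps that were actually started.
     */
    static List<String> runInOrder(Map<String, BooleanSupplier> steps) {
        List<String> started = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> step : steps.entrySet()) {
            started.add(step.getKey());
            // Stand-in for: if (!job.waitForCompletion(true)) stop the chain
            if (!step.getValue().getAsBoolean()) {
                break;
            }
        }
        return started;
    }

    public static void main(String[] args) {
        Map<String, BooleanSupplier> steps = new LinkedHashMap<>();
        steps.put("job1", () -> true);
        steps.put("job2", () -> false); // simulated failure
        steps.put("job3", () -> true);
        // job3 is never started because job2 failed
        System.out.println(runInOrder(steps)); // prints [job1, job2]
    }
}
```

In a real driver the same shape applies: check the boolean returned by each job before submitting the next one, and point the next job's input path at the previous job's output path.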

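2. Job graph


A job graph handles dependencies between jobs. A job may depend on several other jobs, so together the jobs form a directed acyclic graph (DAG).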

	Old API:
		Job job1 = new Job(new JobConf());
		Job job2 = new Job(new JobConf());
		Job job3 = new Job(new JobConf());
		
		job3.addDependingJob(job1);
		job3.addDependingJob(job2);
		
		JobControl jobControl = new JobControl("controlgroupname");
		jobControl.addJob(job1);
		jobControl.addJob(job2);
		jobControl.addJob(job3);
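		// JobControl is a Runnable whose run() loops until stop() is called,
		// so run it in its own thread and poll until all jobs have finished
		new Thread(jobControl).start();
		while (!jobControl.allFinished()) {
			Thread.sleep(500);
		}
		jobControl.stop();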
		
	New API:
		// Assume job 3 depends on job 1 and job 2
		Configuration jobconf1 = new Configuration();
		/*
		 * jobconf1 settings
		 */
		Configuration jobconf2 = new Configuration();
		/*
		 * jobconf2 settings
		 */
		Configuration jobconf3 = new Configuration();
		/*
		 * jobconf3 settings
		 */
		ControlledJob cjob1 = new ControlledJob(jobconf1);
		ControlledJob cjob2 = new ControlledJob(jobconf2);
		ControlledJob cjob3 = new ControlledJob(jobconf3);

		cjob3.addDependingJob(cjob1);
		cjob3.addDependingJob(cjob2);

		JobControl jobControl = new JobControl("controlgroupname");
		jobControl.addJob(cjob1);
		jobControl.addJob(cjob2);
		jobControl.addJob(cjob3);
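		// JobControl is a Runnable whose run() loops until stop() is called,
		// so run it in its own thread and poll until all jobs have finished
		new Thread(jobControl).start();
		while (!jobControl.allFinished()) {
			Thread.sleep(500);
		}
		jobControl.stop();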
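
To make the dependency rule concrete, here is a small plain-Java sketch of how a DAG of jobs can be resolved into launch "waves": a job becomes ready once every job it depends on has finished. This is only an illustrative model and not how JobControl is implemented internally; the class `JobGraphSketch` and method `launchWaves` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of dependency resolution in a job graph. */
final class JobGraphSketch {

    /**
     * deps maps each job name to the set of jobs it depends on.
     * Returns the jobs grouped into waves; every job in a wave has all of
     * its dependencies in earlier waves.
     */
    static List<List<String>> launchWaves(Map<String, Set<String>> deps) {
        Set<String> done = new HashSet<>();
        List<List<String>> waves = new ArrayList<>();
        while (done.size() < deps.size()) {
            List<String> ready = new ArrayList<>();
            for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    ready.add(e.getKey());
                }
            }
            if (ready.isEmpty()) {
                // Nothing can make progress: the graph is not a valid DAG
                throw new IllegalStateException("cycle or unknown dependency");
            }
            done.addAll(ready);
            waves.add(ready);
        }
        return waves;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = new LinkedHashMap<>();
        deps.put("job1", Collections.<String>emptySet());
        deps.put("job2", Collections.<String>emptySet());
        deps.put("job3", new HashSet<>(Arrays.asList("job1", "job2")));
        System.out.println(launchWaves(deps)); // prints [[job1, job2], [job3]]
    }
}
```

In this model job1 and job2 have no dependencies and can run side by side, while job3 waits for both, which mirrors the `addDependingJob` calls in the examples above.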

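3. Map/reduce chain

ChainMapper and ChainReducer let a single job run several mappers around one reducer, in the pattern [MAP+ / REDUCE MAP*].

Old API: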
	public class ChainTest {
	
		public void chainTest() throws IOException {
			//
			JobConf job = new JobConf();
			// Configuration passed to the chain elements; false = do not load default resources
			JobConf jobconf1 = new JobConf(false);
			
			ChainMapper.addMapper(job, Mapper1.class, Object.class, Text.class
				, Text.class, IntWritable.class, true, jobconf1);
			ChainReducer.setReducer(job, Reducer1.class, Text.class, IntWritable.class
				, Text.class, IntWritable.class, true, jobconf1);
			// Mappers that run after the reducer are added through ChainReducer, not ChainMapper
			ChainReducer.addMapper(job, Mapper2.class, Text.class, IntWritable.class
				, Text.class, IntWritable.class, true, jobconf1);
			//
		}
		// Bodies omitted: in the old API Mapper and Reducer are interfaces,
		// so real classes must implement map()/reduce() (typically by also extending MapReduceBase)
		static class Mapper1 implements Mapper<Object, Text, Text, IntWritable>{ }
		static class Mapper2 implements Mapper<Text, IntWritable, Text, IntWritable>{ }
		static class Reducer1 implements Reducer<Text, IntWritable,Text, IntWritable>{ }
	}
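New API:
	public class ChainTest {
	
		public void chainTest() throws IOException {
			//
			Job job = Job.getInstance();
			// Configuration passed to the chain elements (it was undefined in the original snippet)
			Configuration jobconf1 = new Configuration(false);
			// The new-API chain methods take no byValue flag
			ChainMapper.addMapper(job, Mapper1.class, Object.class, Text.class
				, Text.class, IntWritable.class, jobconf1);
			ChainReducer.setReducer(job, Reducer1.class, Text.class, IntWritable.class
				, Text.class, IntWritable.class, jobconf1);
			// Mappers that run after the reducer are added through ChainReducer, not ChainMapper
			ChainReducer.addMapper(job, Mapper2.class, Text.class, IntWritable.class
				, Text.class, IntWritable.class, jobconf1);
			//
		}
		// Bodies omitted; static so that Hadoop can instantiate the classes by reflection
		static class Mapper1 extends Mapper<Object, Text, Text, IntWritable>{ }
		static class Mapper2 extends Mapper<Text, IntWritable, Text, IntWritable>{ }
		static class Reducer1 extends Reducer<Text, IntWritable,Text, IntWritable>{ }
	}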
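
The data flow through such a chain (one or more mappers, a single reducer, then optional post-reduce mappers) can be illustrated with a plain-Java sketch. This is a simplified in-memory model with no Hadoop classes; the word-count logic and the names `ChainSketch`, `mapper1`, `reducer1`, `mapper2` and `runChain` are assumptions chosen for illustration and do not come from the examples above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of a map -> reduce -> map chain inside one job. */
final class ChainSketch {

    // Stand-in for Mapper1: split a line into words
    static List<String> mapper1(String line) {
        List<String> words = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Stand-in for Reducer1: count the occurrences of each word
    static Map<String, Integer> reducer1(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // Stand-in for Mapper2 (runs after the reducer): keep words seen more than once
    static Map<String, Integer> mapper2(Map<String, Integer> counts) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    // The chain: every record passes through mapper1, reducer1, then mapper2
    static Map<String, Integer> runChain(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            words.addAll(mapper1(line));
        }
        return mapper2(reducer1(words));
    }

    public static void main(String[] args) {
        System.out.println(runChain(Arrays.asList("a b a", "b c"))); // prints {a=2, b=2}
    }
}
```

The point of chaining inside one job is that the output of each stage is handed straight to the next stage, rather than being written out and read back by a separate job.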


4. Complex workflows may call for an external MapReduce workflow tool such as Oozie.

