1. Job chaining
MapReduce jobs can be created up front and then run one after another.
Old API:
// Create a new JobConf
JobConf job = new JobConf(new Configuration(), MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
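FileInputFormat.setInputPaths(job, new Path("in"));   // org.apache.hadoop.mapred.FileInputFormat
FileOutputFormat.setOutputPath(job, new Path("out")); // org.apache.hadoop.mapred.FileOutputFormat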
job.setMapperClass(MyJob.MyMapper.class);
job.setReducerClass(MyJob.MyReducer.class);
// Submit the job, then poll for progress until the job is complete
JobClient.runJob(job);
New API:
// Create a new Job
Job job = new Job(new Configuration()); // deprecated in Hadoop 2.x; Job.getInstance(conf) is preferred there
job.setJarByClass(MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
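FileInputFormat.addInputPath(job, new Path("in"));    // org.apache.hadoop.mapreduce.lib.input.FileInputFormat
FileOutputFormat.setOutputPath(job, new Path("out")); // org.apache.hadoop.mapreduce.lib.output.FileOutputFormat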
job.setMapperClass(MyJob.MyMapper.class);
job.setReducerClass(MyJob.MyReducer.class);
// Submit the job, then poll for progress until the job is complete
job.waitForCompletion(true);
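The control flow behind a job chain is: run one job, check whether it succeeded, and only then start the next. The sketch below models that flow in plain Java with no Hadoop dependency. SequentialChain, runInOrder, and the step lambdas are hypothetical names used only for illustration; in a real driver each step would wrap a call such as job.waitForCompletion(true), which returns true when the job succeeds (and throws checked exceptions that the wrapper would have to handle).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of a job chain; not part of the Hadoop API.
class SequentialChain {

    // Runs the steps in order and stops at the first one that reports failure.
    // Returns how many steps completed successfully.
    static int runInOrder(List<BooleanSupplier> steps) {
        int succeeded = 0;
        for (BooleanSupplier step : steps) {
            if (!step.getAsBoolean()) {
                break; // a failed job must not feed the next one
            }
            succeeded++;
        }
        return succeeded;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<BooleanSupplier> steps = new ArrayList<>();
        // Each lambda stands in for one configured job.
        steps.add(() -> { log.add("job1"); return true; });
        steps.add(() -> { log.add("job2"); return false; }); // simulated failure
        steps.add(() -> { log.add("job3"); return true; });

        int ok = runInOrder(steps);
        System.out.println(ok + " succeeded, ran " + log);
    }
}
```

In this model job3 never starts, because job2 reported failure. A driver built this way would usually also point each job's input path at the previous job's output path.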
2. Job graph
This handles dependencies between jobs. A job may depend on several others, so the jobs form a directed acyclic graph (DAG).
Old API:
// Job here is org.apache.hadoop.mapred.jobcontrol.Job, which wraps a JobConf.
Job job1 = new Job(new JobConf());
Job job2 = new Job(new JobConf());
Job job3 = new Job(new JobConf());
job3.addDependingJob(job1);
job3.addDependingJob(job2);
JobControl jobControl = new JobControl("controlgroupname");
jobControl.addJob(job1);
jobControl.addJob(job2);
jobControl.addJob(job3);
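// JobControl is a Runnable whose run() loops until stop() is called,
// so run it in its own thread and stop it once every job has finished.
Thread controller = new Thread(jobControl);
controller.start();
while (!jobControl.allFinished()) {
    Thread.sleep(500); // throws InterruptedException
}
jobControl.stop();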
New API:
// Assume job 3 depends on job 1 and job 2.
Configuration jobconf1 = new Configuration();
/*
 * jobconf1 settings
 */
Configuration jobconf2 = new Configuration();
/*
 * jobconf2 settings
 */
Configuration jobconf3 = new Configuration();
/*
 * jobconf3 settings
 */
ControlledJob cjob1 = new ControlledJob(jobconf1);
ControlledJob cjob2 = new ControlledJob(jobconf2);
ControlledJob cjob3 = new ControlledJob(jobconf3);
cjob3.addDependingJob(cjob1);
cjob3.addDependingJob(cjob2);
JobControl jobControl = new JobControl("controlgroupname");
jobControl.addJob(cjob1);
jobControl.addJob(cjob2);
jobControl.addJob(cjob3);
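// JobControl is a Runnable whose run() loops until stop() is called,
// so run it in its own thread and stop it once every job has finished.
Thread controller = new Thread(jobControl);
controller.start();
while (!jobControl.allFinished()) {
    Thread.sleep(500); // throws InterruptedException
}
jobControl.stop();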
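The rule a job graph enforces is that a job becomes ready only once every job it depends on has finished. The sketch below models that rule in plain Java so it can be run without a cluster. It is not how Hadoop's JobControl is implemented; JobGraph, addDependency, and executionOrder are hypothetical names, and jobs are represented by strings only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of dependency-ordered execution; not Hadoop's JobControl.
class JobGraph {
    // job name -> names of the jobs it depends on
    private final Map<String, Set<String>> deps = new LinkedHashMap<>();

    void addJob(String name) {
        deps.putIfAbsent(name, new LinkedHashSet<>());
    }

    void addDependency(String job, String dependsOn) {
        addJob(job);
        addJob(dependsOn);
        deps.get(job).add(dependsOn);
    }

    // Returns the jobs in an order where every job follows all of its dependencies.
    List<String> executionOrder() {
        List<String> order = new ArrayList<>();
        Set<String> done = new LinkedHashSet<>();
        while (done.size() < deps.size()) {
            boolean progressed = false;
            for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
                // A job is ready once all of its dependencies have finished.
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    done.add(e.getKey());
                    order.add(e.getKey());
                    progressed = true;
                }
            }
            if (!progressed) {
                // No job could start, so the graph is not acyclic.
                throw new IllegalStateException("dependency cycle detected");
            }
        }
        return order;
    }

    public static void main(String[] args) {
        JobGraph g = new JobGraph();
        g.addDependency("job3", "job1");
        g.addDependency("job3", "job2");
        System.out.println(g.executionOrder()); // prints [job1, job2, job3]
    }
}
```

The cycle check is the reason the graph has to be acyclic: if two jobs wait on each other, neither can ever become ready.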
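3. Map/Reduce chaining
ChainMapper and ChainReducer let a single job run mappers and a reducer in the pattern [MAP+ / REDUCE MAP*]: one or more mappers, then one reducer, then zero or more mappers.
Old API: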
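public class ChainTest {
    public void chainTest() throws IOException {
        // ...
        JobConf job = new JobConf();
        JobConf jobconf1 = new JobConf(false);
        ChainMapper.addMapper(job, Mapper1.class, Object.class, Text.class,
                Text.class, IntWritable.class, true, jobconf1);
        ChainReducer.setReducer(job, Reducer1.class, Text.class, IntWritable.class,
                Text.class, IntWritable.class, true, jobconf1);
        // ChainMapper.addMapper always appends to the map phase, so Mapper2 runs
        // before Reducer1 despite the call order. To run a mapper after the
        // reducer, add it with ChainReducer.addMapper instead.
        ChainMapper.addMapper(job, Mapper2.class, Text.class, IntWritable.class,
                Text.class, IntWritable.class, true, jobconf1);
        // ...
    }
    // Nested classes must be static so Hadoop can instantiate them by reflection.
    // MapReduceBase supplies configure() and close(); map() and reduce() are omitted here.
    static class Mapper1 extends MapReduceBase implements Mapper<Object, Text, Text, IntWritable> { /* map() omitted */ }
    static class Mapper2 extends MapReduceBase implements Mapper<Text, IntWritable, Text, IntWritable> { /* map() omitted */ }
    static class Reducer1 extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> { /* reduce() omitted */ }
}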
New API:
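public class ChainTest {
    public void chainTest() throws IOException {
        // ...
        Job job = Job.getInstance();
        Configuration jobconf1 = new Configuration(false);
        // The new-API chain methods (org.apache.hadoop.mapreduce.lib.chain) take no byValue flag.
        ChainMapper.addMapper(job, Mapper1.class, Object.class, Text.class,
                Text.class, IntWritable.class, jobconf1);
        ChainReducer.setReducer(job, Reducer1.class, Text.class, IntWritable.class,
                Text.class, IntWritable.class, jobconf1);
        // ChainMapper.addMapper always appends to the map phase, so Mapper2 runs
        // before Reducer1 despite the call order. To run a mapper after the
        // reducer, add it with ChainReducer.addMapper instead.
        ChainMapper.addMapper(job, Mapper2.class, Text.class, IntWritable.class,
                Text.class, IntWritable.class, jobconf1);
        // ...
    }
    // Nested classes must be static so Hadoop can instantiate them by reflection.
    // Override map() and reduce() in real code; the bodies are omitted here.
    static class Mapper1 extends Mapper<Object, Text, Text, IntWritable> { }
    static class Mapper2 extends Mapper<Text, IntWritable, Text, IntWritable> { }
    static class Reducer1 extends Reducer<Text, IntWritable, Text, IntWritable> { }
}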
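To make the data flow of a map/reduce chain concrete, the following is a small in-memory model of a mapper, a reducer, and a second mapper applied to the reducer's output. It uses only java.util and none of the Hadoop classes. ChainModel, mapper1, reducer1, mapper2, and runChain are hypothetical names, and the word-count logic and the "count at least 2" filter are assumptions chosen for illustration, not taken from the snippets above.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of a map -> reduce -> map chain; not the Hadoop API.
class ChainModel {

    // Mapper1: splits each line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> mapper1(List<String> lines) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(new AbstractMap.SimpleEntry<>(word, 1));
                }
            }
        }
        return out;
    }

    // Reducer1: groups the pairs by key and sums the values.
    static Map<String, Integer> reducer1(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            sums.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return sums;
    }

    // Mapper2: post-processes each reducer record; here it keeps counts of 2 or more.
    static Map<String, Integer> mapper2(Map<String, Integer> reduced) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> e : reduced.entrySet()) {
            if (e.getValue() >= 2) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    // The whole chain: each stage consumes the previous stage's output.
    static Map<String, Integer> runChain(List<String> lines) {
        return mapper2(reducer1(mapper1(lines)));
    }

    public static void main(String[] args) {
        System.out.println(runChain(Arrays.asList("a b a", "b c"))); // prints {a=2, b=2}
    }
}
```

The point of the model is the type matching between stages: each stage's output key and value types must be the next stage's input types, which is what the key and value class arguments to addMapper and setReducer describe.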