1. job.waitForCompletion(true); submits the job from the Driver
1) submit() submits the job
(1) connect():
<1> return new Cluster(getConfiguration());
① initialize(jobTrackAddr, conf);
Via YarnClientProtocolProvider | LocalClientProtocolProvider, reads the configuration
parameters to decide whether the job runs locally or on Yarn.
Here: LocalClientProtocolProvider ==> LocalJobRunner
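The provider choice above is driven by the `mapreduce.framework.name` configuration key ("yarn" vs. "local"). A minimal pure-Java sketch of that decision, not the Hadoop source itself (the real lookup iterates ClientProtocolProvider implementations via a ServiceLoader; the class and method names here are illustrative):

```java
import java.util.Map;

public class ProviderSelectionSketch {
    // Illustrative stand-in for Cluster.initialize()'s provider loop.
    static String selectRunner(Map<String, String> conf) {
        String framework = conf.getOrDefault("mapreduce.framework.name", "local");
        if ("yarn".equals(framework)) {
            return "YARNRunner";      // what YarnClientProtocolProvider creates
        } else if ("local".equals(framework)) {
            return "LocalJobRunner";  // what LocalClientProtocolProvider creates
        }
        throw new IllegalArgumentException("No provider for: " + framework);
    }

    public static void main(String[] args) {
        // With no framework configured, the job runs locally, as in these notes.
        System.out.println(selectRunner(Map.of()));
        System.out.println(selectRunner(Map.of("mapreduce.framework.name", "yarn")));
    }
}
```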
(2) return submitter.submitJobInternal(Job.this, cluster); submits the job
<1> . checkSpecs(job); validates the job's output specification (e.g. the output path must not already exist).
<2> . Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
Creates the temporary staging directory for job submission, e.g.:
D:\tmp\hadoop\mapred\staging\Administrator1777320722\.staging
<3> . JobID jobId = submitClient.getNewJobID(); generates an ID for the current job
<4> . Path submitJobDir = new Path(jobStagingArea, jobId.toString()); the job's submission path
d:/tmp/hadoop/mapred/staging/Administrator1777320722/.staging/job_local1777320722_0001
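The `job_local1777320722_0001` segment of that path follows the JobID naming convention: `job_` + the runner's identifier + a zero-padded 4-digit sequence number. A small sketch of that formatting (the helper name is ours; the real logic lives in JobID.toString()):

```java
public class JobIdSketch {
    // Illustrative helper reproducing the "job_<identifier>_<NNNN>" convention.
    static String formatJobId(String jtIdentifier, int id) {
        return String.format("job_%s_%04d", jtIdentifier, id);
    }

    public static void main(String[] args) {
        // "local1777320722" mirrors LocalJobRunner's "local" + random-int identifier.
        System.out.println(formatJobId("local1777320722", 1));  // job_local1777320722_0001
    }
}
```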
<5> . copyAndConfigureFiles(job, submitJobDir);
① rUploader.uploadResources(job, jobSubmitDir);
[1] uploadResourcesInternal(job, submitJobDir);
{1}.submitJobDir = jtFs.makeQualified(submitJobDir);
mkdirs(jtFs, submitJobDir, mapredSysPerms);
creates the job submission directory
<6> . int maps = writeSplits(job, submitJobDir); // generates the split info and returns the number of splits
<7> . conf.setInt(MRJobConfig.NUM_MAPS, maps); // sets the number of MapTasks from the number of splits
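For file inputs, the split size that writeSplits() works with follows FileInputFormat's rule max(minSize, min(maxSize, blockSize)); the number of splits, and therefore MapTasks, is roughly the file size divided by that split size. A self-contained sketch of the formula (example sizes are ours, not from the notes):

```java
public class SplitSizeSketch {
    // The split-size rule used by FileInputFormat.computeSplitSize().
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // a 128 MB HDFS block
        // With the defaults (minSize = 1, maxSize = Long.MAX_VALUE) the split size
        // equals the block size, so e.g. a 300 MB file would yield 3 splits / 3 MapTasks.
        System.out.println(computeSplitSize(blockSize, 1L, Long.MAX_VALUE));
    }
}
```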
<8> . writeConf(conf, submitJobFile); // writes the current job's configuration to the job submission path
Files under that path: job.split, job.splitmetainfo, job.xml, xxx.jar
<9> .status = submitClient.submitJob(
jobId, submitJobDir.toString(), job.getCredentials());
// the actual job submission
<10> . jtFs.delete(submitJobDir, true); // after the job completes, deletes the contents of the job's temporary staging directory