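Tracing the Source Code of MapReduce Client-Side Job Submission
Taking WordCount as the example:
The driver first loads the configuration needed to connect to the Hadoop cluster, then sets the Job's class information, and so on…
1. Click into Job:
The Job class extends a class (JobContextImpl) and implements the JobContext interface. Click into JobContext.
JobContext in turn extends MRJobConfig, which is an interface rather than a class. As the name suggests, it holds the configuration parameters used while a MapReduce program runs. Click into it:
It defines many parameter names and their defaults, for example: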
public static final String MAP_MEMORY_MB = "mapreduce.map.memory.mb";
public static final int DEFAULT_MAP_MEMORY_MB = 1024;
public static final String REDUCE_MEMORY_MB = "mapreduce.reduce.memory.mb";
public static final int DEFAULT_REDUCE_MEMORY_MB = 1024;
These are the configuration keys and default memory sizes (1024 MB each) for map and reduce tasks.
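To make the key/default pairing concrete, here is a minimal sketch of how such a lookup behaves. It uses java.util.Properties as a stand-in for Hadoop's Configuration, and the class MemoryConfigSketch and its getInt helper are hypothetical names introduced only for this illustration.

```java
import java.util.Properties;

// Simplified stand-in for a Hadoop Configuration lookup; not the real class.
class MemoryConfigSketch {
    static final String MAP_MEMORY_MB = "mapreduce.map.memory.mb";
    static final int DEFAULT_MAP_MEMORY_MB = 1024;

    // Returns the configured value, or the default when the key is absent.
    static int getInt(Properties conf, String key, int defaultValue) {
        String raw = conf.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing configured yet, so the default is used.
        System.out.println(getInt(conf, MAP_MEMORY_MB, DEFAULT_MAP_MEMORY_MB));
        // An explicit setting overrides the default.
        conf.setProperty(MAP_MEMORY_MB, "2048");
        System.out.println(getInt(conf, MAP_MEMORY_MB, DEFAULT_MAP_MEMORY_MB));
    }
}
```

The point is only that the DEFAULT_* constant applies when the job's configuration does not set the key.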
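2. Click into the waitForCompletion() method.
If the job's state shows that it can still be submitted (DEFINE), waitForCompletion() calls submit(). monitorAndPrintJob() then keeps polling the job's progress and prints it. When the boolean parameter verbose is true, progress is printed while the job runs; when it is false, the method only waits for the result and prints no progress.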
public boolean waitForCompletion(boolean verbose
) throws IOException, InterruptedException,
ClassNotFoundException {
//submit the job only while it is still in the DEFINE state
if (state == JobState.DEFINE) {
submit();
}
//if verbose is true, monitor the job and print its progress
if (verbose) {
monitorAndPrintJob();
} else {
// get the completion poll interval from the client.
int completionPollIntervalMillis =
Job.getCompletionPollInterval(cluster.getConf());
while (!isComplete()) {
try {
Thread.sleep(completionPollIntervalMillis);
} catch (InterruptedException ie) {
}
}
}
//return whether the job finished successfully (not merely whether it was submitted)
return isSuccessful();
}
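The non-verbose branch above is a plain submit-once-then-poll loop. The following toy model sketches that control flow under stated assumptions: WaitSketch, pollsUntilDone, and the two counters are invented for illustration, and the fake job simply completes after a fixed number of polls.

```java
// Toy model of the waitForCompletion() control flow. Not Hadoop code:
// WaitSketch, pollsUntilDone and the counters are made up for illustration.
class WaitSketch {
    enum JobState { DEFINE, RUNNING }

    private JobState state = JobState.DEFINE;
    private final int pollsUntilDone; // polls needed before the fake job completes
    int submitCount = 0;
    int pollCount = 0;

    WaitSketch(int pollsUntilDone) {
        this.pollsUntilDone = pollsUntilDone;
    }

    void submit() {
        submitCount++;
        state = JobState.RUNNING;
    }

    boolean isComplete() {
        return pollCount >= pollsUntilDone;
    }

    // Non-verbose branch only: submit once, then sleep and poll until done.
    boolean waitForCompletion(long pollIntervalMillis) {
        if (state == JobState.DEFINE) {
            submit();
        }
        while (!isComplete()) {
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break; // the model gives up here; the real method keeps looping
            }
            pollCount++;
        }
        return isComplete();
    }

    public static void main(String[] args) {
        WaitSketch job = new WaitSketch(3);
        System.out.println(job.waitForCompletion(10));
        System.out.println(job.submitCount);
    }
}
```

Calling waitForCompletion() a second time on the same object does not submit again, because the state is no longer DEFINE.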
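3. Click into submit()
submit() first makes sure the Job is in the DEFINE state; otherwise the job is not submitted. It then switches on the new API. The connect() method produces a client instance used to communicate with the ResourceManager. The two key steps in submit() are calling connect() and obtaining a JobSubmitter instance, whose submitJobInternal() method performs the actual submission.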
public void submit()
throws IOException, InterruptedException, ClassNotFoundException {
//check the job state again
ensureState(JobState.DEFINE);
//use the new API; this only adjusts a few configuration settings
setUseNewAPI();
connect();
//connect() above assigns cluster; its client is the submitter, either a local one or a YARN one, as decided by the configuration
final JobSubmitter submitter =
getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
public JobStatus run() throws IOException, InterruptedException,
ClassNotFoundException {
return submitter.submitJobInternal(Job.this, cluster);
}
});
state = JobState.RUNNING;
LOG.info("The url to track the job: " + getTrackingURL());
}
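The ensureState(JobState.DEFINE) guard is what prevents a job from being submitted twice. Here is a minimal sketch of that guard, assuming a failed check surfaces as an IllegalStateException; StateGuardSketch and its members are hypothetical names, and the real submit() does far more between the check and the state change.

```java
// Toy model of the state guard in submit(); not Hadoop code.
class StateGuardSketch {
    enum JobState { DEFINE, RUNNING }

    private JobState state = JobState.DEFINE;

    // Fails fast when the job is not in the expected state.
    private void ensureState(JobState expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + expected);
        }
    }

    void submit() {
        ensureState(JobState.DEFINE); // a job may be submitted only once
        // ... connect() and submitJobInternal() would happen here ...
        state = JobState.RUNNING;
    }

    JobState getState() {
        return state;
    }

    public static void main(String[] args) {
        StateGuardSketch job = new StateGuardSketch();
        job.submit();
        System.out.println(job.getState());
        try {
            job.submit(); // second attempt is rejected by the guard
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```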
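4. Click into connect()
When a MapReduce job is submitted, the connection to the cluster is made by the Job class's connect() method, which simply constructs a Cluster instance, cluster.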
private synchronized void connect()
throws IOException, InterruptedException, ClassNotFoundException {
if (cluster == null) {
cluster =
ugi.doAs(new PrivilegedExceptionAction<Cluster>() {
public Cluster run()
throws IOException, InterruptedException,
ClassNotFoundException {
//return a Cluster instance
return new Cluster(getConfiguration());
}
});
}
}
Click into the constructor to see how the object is built:
public Cluster(Configuration conf) throws IOException {
//delegate to the two-argument constructor
this(null, conf);
}
public Cluster(InetSocketAddress jobTrackAddr, Configuration conf)
throws IOException {
this.conf = conf;
this.ugi = UserGroupInformation.getCurrentUser();
//initialize the Cluster object
initialize(jobTrackAddr, conf);
}
Click into initialize():
private void initialize(InetSocketAddress jobTrackAddr, Configuration conf)
throws IOException {
synchronized (frameworkLoader) {
//take each ClientProtocolProvider in turn and build a ClientProtocol instance through its create() method
for (ClientProtocolProvider provider : frameworkLoader) {
LOG.debug("Trying ClientProtocolProvider : "
+ provider.getClass().getName());
ClientProtocol clientProtocol = null;
try {
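//if the configuration does not select YARN (see MRConfig.FRAMEWORK_NAME), a LocalJobRunner is built and the MR job runs locally
//if the configuration selects YARN, a YARNRunner is built and the MR job runs on the YARN cluster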
if (jobTrackAddr == null) {
clientProtocol = provider.create(conf);
} else {
clientProtocol = provider.create(jobTrackAddr, conf);
}
if (clientProtocol != null) {
clientProtocolProvider = provider;
client = clientProtocol;
LOG.debug("Picked " + provider.getClass().getName()
+ " as the ClientProtocolProvider");
break;
}
else {
LOG.debug("Cannot pick " + provider.getClass().getName()
+ " as the ClientProtocolProvider - returned null protocol");
}
}
catch (Exception e) {
LOG.info("Failed to use " + provider.getClass().getName()
+ " due to error: " + e.getMessage());
}
}
}
if (null == clientProtocolProvider || null == client) {
throw new IOException(
"Cannot initialize Cluster. Please check your configuration for "
+ MRConfig.FRAMEWORK_NAME
+ " and the correspond server addresses.");
}
}
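Click into create() and you will see two implementations: one for local mode and one for YARN.
This tells us something important: the ClientProtocol instance held by Cluster is either a YARNRunner in YARN mode or a LocalJobRunner in local mode.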
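The selection loop in initialize() can be pictured as "ask each provider, take the first one that does not return null". The sketch below models that with plain strings instead of real runner objects. It assumes the local provider answers when mapreduce.framework.name is local or unset and the YARN provider answers when it is yarn; the class ProviderSelectionSketch, the Provider interface, and the use of IllegalStateException (the real code throws IOException) are all choices made for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of provider selection in Cluster.initialize(); not Hadoop code.
class ProviderSelectionSketch {
    static final String FRAMEWORK_NAME = "mapreduce.framework.name";

    // Returns the name of the runner it would build, or null to decline.
    interface Provider {
        String create(Map<String, String> conf);
    }

    // Assumed behavior: local mode is the fallback when nothing is configured.
    static final Provider LOCAL = conf ->
        "local".equals(conf.getOrDefault(FRAMEWORK_NAME, "local")) ? "LocalJobRunner" : null;

    // Assumed behavior: YARN mode must be selected explicitly.
    static final Provider YARN = conf ->
        "yarn".equals(conf.get(FRAMEWORK_NAME)) ? "YARNRunner" : null;

    // The first provider that returns a non-null client wins.
    static String pick(List<Provider> providers, Map<String, String> conf) {
        for (Provider provider : providers) {
            String client = provider.create(conf);
            if (client != null) {
                return client;
            }
        }
        throw new IllegalStateException(
            "Cannot initialize Cluster. Please check your configuration for " + FRAMEWORK_NAME);
    }

    public static void main(String[] args) {
        List<Provider> providers = Arrays.asList(LOCAL, YARN);
        Map<String, String> conf = new HashMap<>();
        System.out.println(pick(providers, conf));
        conf.put(FRAMEWORK_NAME, "yarn");
        System.out.println(pick(providers, conf));
    }
}
```

A framework name that no provider recognizes leaves both candidates null, which corresponds to the "Cannot initialize Cluster" error at the end of initialize().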
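5. YARNRunner
We follow YARN mode to analyze how the client connects to the cluster, so let's look at the YARNRunner implementation.
Its most important field is resMgrDelegate, a ResourceMgrDelegate that acts as a proxy for the ResourceManager. In YARN mode the whole MapReduce client relies on it to communicate with the YARN cluster: submitting jobs, querying job status, and fetching cluster information. Internally it holds a YarnClient instance that does the actual communication with YARN, plus application-specific members such as ApplicationId and ApplicationSubmissionContext. Another important field is clientCache, a ClientCache instance.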
@SuppressWarnings("unchecked")
public class YARNRunner implements ClientProtocol {
private static final Log LOG = LogFactory.getLog(YARNRunner.class);
//RecordFactory instance
private final RecordFactory recordFactory = RecordFactoryProvider.getRecordFactory(null);
//proxy for the ResourceManager
private ResourceMgrDelegate resMgrDelegate;
//client cache
private ClientCache clientCache;
//configuration
private Configuration conf;
//default FileContext instance
private final FileContext defaultFileContext;
/**
* Yarn runner incapsulates the client interface of
* yarn
* @param conf the configuration object for the client
*/
public YARNRunner(Configuration conf) {
this(conf, new ResourceMgrDelegate(new YarnConfiguration(conf)));
}
/**
* Similar to {@link #YARNRunner(Configuration)} but allowing injecting
* {@link ResourceMgrDelegate}. Enables mocking and testing.
* @param conf the configuration object for the client
* @param resMgrDelegate the resourcemanager client handle.
*/
public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate) {
this(conf, resMgrDelegate, new ClientCache(conf, resMgrDelegate));
}
/**
* Similar to {@link YARNRunner#YARNRunner(Configuration, ResourceMgrDelegate)}
* but allowing injecting {@link ClientCache}. Enable mocking and testing.
* @param conf the configuration object
* @param resMgrDelegate the resource manager delegate
* @param clientCache the client cache object.
*/
public YARNRunner(Configuration conf, ResourceMgrDelegate resMgrDelegate,
ClientCache clientCache) {
this.conf = conf;
try {
this.resMgrDelegate = resMgrDelegate;
this.clientCache = clientCache;
this.defaultFileContext = FileContext.getFileContext(this.conf);
} catch (UnsupportedFileSystemException ufe) {
throw new RuntimeException("Error in instantiating YarnClient", ufe);
}
}
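6. Summary of connect()
When a MapReduce job is submitted, the cluster connection is made by the Job's connect() method, which constructs the Cluster instance cluster. Cluster is a utility for connecting to a MapReduce cluster and provides the methods for fetching cluster information. Inside Cluster there is a ClientProtocol instance, client, that communicates with the cluster. Hadoop 2.0 provides two kinds of ClientProtocol: one for YARN mode and one for local mode. In YARN mode the ClientProtocol instance is a YARNRunner, which holds a ResourceManager proxy object; that proxy handles all communication between the MapReduce client and the YARN cluster, including job submission and job status queries.
7. The submitJobInternal() method
Back to step 3: connect() has been covered above, so we now turn to the other important method, submitJobInternal().
This method belongs to the JobSubmitter class, which, as its name suggests, is the job submitter in MapReduce. Apart from its constructor, submitJobInternal() is the only non-private member that JobSubmitter exposes. It is the internal method for submitting a Job and implements the entire submission logic.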
@InterfaceAudience.Private
@InterfaceStability.Unstable
class JobSubmitter {
protected static final Log LOG = LogFactory.getLog(JobSubmitter.class);
private static final String SHUFFLE_KEYGEN_ALGORITHM = "HmacSHA1";
private static final int SHUFFLE_KEY_LENGTH = 64;
private FileSystem jtFs; //FileSystem instance
private ClientProtocol submitClient; //client protocol instance
private String submitHostName; //host name of the submitting machine
private String submitHostAddress; //IP address of the submitting machine
.............
submitJobInternal() is the one core method JobSubmitter exposes; it submits the job to the cluster:
JobStatus submitJobInternal(Job job, Cluster cluster)
throws ClassNotFoundException, InterruptedException, IOException {
//validate the jobs output specs
//check that the output path is configured and does not exist yet; the job is valid only if it is configured and absent
//the output path is set by mapreduce.output.fileoutputformat.outputdir
checkSpecs(job);
Configuration conf = job.getConfiguration();
//add the MapReduce framework path to the distributed cache
addMRFrameworkToDistributedCache(conf);
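//the static method getStagingDir() returns the directory that holds the job's resources while it runs
//when not configured, it defaults to /tmp/hadoop-yarn/staging/<submitting user>/.staging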
Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
//configure the command line options correctly on the submitting dfs
InetAddress ip = InetAddress.getLocalHost(); //address of the local host
if (ip != null) {
//record the submitting host's IP address and host name, and store both in conf
submitHostAddress = ip.getHostAddress();
submitHostName = ip.getHostName();
conf.set(MRJobConfig.JOB_SUBMITHOST,submitHostName);
conf.set(MRJobConfig.JOB_SUBMITHOSTADDR,submitHostAddress);
}
JobID jobId = submitClient.getNewJobID(); //obtain a new job ID
job.setJobID(jobId); //store the job ID in the job
//build the submit directory: jobStagingArea followed by /jobID
Path submitJobDir = new Path(jobStagingArea, jobId.toString());
JobStatus status = null;
try {
//set a few job parameters
conf.set(MRJobConfig.USER_NAME,
UserGroupInformation.getCurrentUser().getShortUserName());
conf.set("hadoop.http.filter.initializers",
"org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
LOG.debug("Configuring job " + jobId + " with " + submitJobDir
+ " as the submit dir");
// get delegation token for the dir
TokenCache.obtainTokensForNamenodes(job.getCredentials(),
new Path[] { submitJobDir }, conf);
//obtain secret keys and tokens and store them in the TokenCache
populateTokenCache(conf, job.getCredentials());
// generate a secret to authenticate shuffle transfers
if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
KeyGenerator keyGen;
try {
int keyLen = CryptoUtils.isShuffleEncrypted(conf)
? conf.getInt(MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS,
MRJobConfig.DEFAULT_MR_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS)
: SHUFFLE_KEY_LENGTH;
keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
keyGen.init(keyLen);
} catch (NoSuchAlgorithmException e) {
throw new IOException("Error generating shuffle secret key", e);
}
SecretKey shuffleKey = keyGen.generateKey();
TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
job.getCredentials());
}
//copy and configure the job's files
copyAndConfigureFiles(job, submitJobDir);
//path of the job configuration file
Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);
// Create the splits for the job
LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
//writeSplits() writes the split data file job.split and the split metadata file job.splitmetainfo, and returns the number of map tasks
int maps = writeSplits(job, submitJobDir);
conf.setInt(MRJobConfig.NUM_MAPS, maps);
LOG.info("number of splits:" + maps);
// write "queue admins of the queue to which job is being submitted"
// to job file.
// read the queue name from mapreduce.job.queuename; the default value is default
String queue = conf.get(MRJobConfig.QUEUE_NAME,
JobConf.DEFAULT_QUEUE_NAME);
AccessControlList acl = submitClient.getQueueAdmins(queue);
conf.set(toFullPropertyName(queue,
QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());
// removing jobtoken referrals before copying the jobconf to HDFS
// as the tasks don't need this setting, actually they may break
// because of it if present as the referral will point to a
// different job.
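TokenCache.cleanUpTokenReferral(conf); //remove the job token referral settings from conf
//decide from the configuration whether token tracking IDs should be recorded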
if (conf.getBoolean(
MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
// Add HDFS tracking ids
ArrayList<String> trackingIds = new ArrayList<String>();
for (Token<? extends TokenIdentifier> t :
job.getCredentials().getAllTokens()) {
trackingIds.add(t.decodeIdentifier().getTrackingId());
}
conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
trackingIds.toArray(new String[trackingIds.size()]));
}
// Set reservation info if it exists
ReservationId reservationId = job.getReservationId();
if (reservationId != null) {
conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
}
// Write job file to submit dir
writeConf(conf, submitJobFile);
//
// Now, actually submit the job (using the submit name)
printTokens(jobId, job.getCredentials());
//submit the job through submitJob() on submitClient, the ClientProtocol instance,
//and receive the job status. As established earlier, submitClient is either a YARNRunner or a LocalJobRunner
status = submitClient.submitJob(
jobId, submitJobDir.toString(), job.getCredentials());
if (status != null) {
return status;
} else {
throw new IOException("Could not launch job");
}
} finally {
if (status == null) {
LOG.info("Cleaning up the staging area " + submitJobDir);
if (jtFs != null && submitJobDir != null)
jtFs.delete(submitJobDir, true);
}
}
}
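The shuffle-secret step in the middle of submitJobInternal() relies only on the JDK's javax.crypto API, so it can be tried in isolation. This is a minimal sketch of the unencrypted-shuffle case, where the key length is SHUFFLE_KEY_LENGTH (64 bits); ShuffleKeySketch and generateShuffleKey are hypothetical names, and the sketch wraps the checked exception in an IllegalStateException instead of the IOException used above.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Standalone sketch of the shuffle secret generation step; not Hadoop code.
class ShuffleKeySketch {
    static final String SHUFFLE_KEYGEN_ALGORITHM = "HmacSHA1";
    static final int SHUFFLE_KEY_LENGTH = 64; // key size in bits

    // Generates a fresh random key and returns its raw bytes.
    static byte[] generateShuffleKey() {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
            keyGen.init(SHUFFLE_KEY_LENGTH);
            SecretKey shuffleKey = keyGen.generateKey();
            return shuffleKey.getEncoded();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Error generating shuffle secret key", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = generateShuffleKey();
        // 64 bits correspond to 8 bytes of key material.
        System.out.println(key.length);
    }
}
```

In the real method the encoded key is stored in the job's credentials through TokenCache.setShuffleSecretKey() and is generated only when no shuffle key is present yet.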
8. Click into writeSplits()
private int writeSplits(org.apache.hadoop.mapreduce.JobContext job,
Path jobSubmitDir) throws IOException,
InterruptedException, ClassNotFoundException {
JobConf jConf = (JobConf)job.getConfiguration();
int maps;
//the new API is in use
if (jConf.getUseNewMapper()) {
maps = writeNewSplits(job, jobSubmitDir);
} else {
maps = writeOldSplits(jConf, jobSubmitDir);
}
return maps;
}
Click into writeNewSplits():
@SuppressWarnings("unchecked")
private <T extends InputSplit>
int writeNewSplits(JobContext job, Path jobSubmitDir) throws IOException,
InterruptedException, ClassNotFoundException {
Configuration conf = job.getConfiguration();
InputFormat<?, ?> input =
ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
List<InputSplit> splits = input.getSplits(job);
T[] array = (T[]) splits.toArray(new InputSplit[splits.size()]);
// sort the splits into order based on size, so that the biggest
// go first
Arrays.sort(array, new SplitComparator());
JobSplitWriter.createSplitFiles(jobSubmitDir, conf,
jobSubmitDir.getFileSystem(conf), array);
return array.length;
}
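The comment in writeNewSplits() says the splits are sorted so that the biggest go first. SplitComparator itself is not shown here, so the sketch below only illustrates that ordering on bare split lengths; SplitSortSketch and largestFirst are hypothetical names, and real InputSplit objects carry more than a length.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrates "biggest split first" ordering on plain lengths; not Hadoop code.
class SplitSortSketch {
    // Returns a copy of the lengths sorted in descending order.
    static long[] largestFirst(long[] splitLengths) {
        Long[] boxed = new Long[splitLengths.length];
        for (int i = 0; i < splitLengths.length; i++) {
            boxed[i] = splitLengths[i];
        }
        Arrays.sort(boxed, Comparator.reverseOrder());
        long[] sorted = new long[boxed.length];
        for (int i = 0; i < boxed.length; i++) {
            sorted[i] = boxed[i];
        }
        return sorted;
    }

    public static void main(String[] args) {
        long[] sorted = largestFirst(new long[] {64, 128, 32});
        System.out.println(Arrays.toString(sorted));
    }
}
```

Placing the largest splits first lets the longest map tasks be scheduled early, which is the usual motivation for this ordering.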
Click into job.getInputFormatClass():
@SuppressWarnings("unchecked")
public Class<? extends InputFormat<?,?>> getInputFormatClass()
throws ClassNotFoundException {
return (Class<? extends InputFormat<?,?>>)
//the input format; when none is specified it defaults to TextInputFormat
conf.getClass(INPUT_FORMAT_CLASS_ATTR, TextInputFormat.class);
}
Click into input.getSplits():
public List<InputSplit> getSplits(JobContext job) throws IOException {
Stopwatch sw = new Stopwatch().start();
//take the larger of the format's minimum and the configured minimum; the default is 1 byte
long minSize = Math.max(getFormatMinSplitSize(), getMinSplitSize(job));
//read the configured maximum split size; the default is Long.MAX_VALUE
long maxSize = getMaxSplitSize(job);
// generate splits
List<InputSplit> splits = new ArrayList<InputSplit>();
//list the input files
List<FileStatus> files = listStatus(job);
//loop over the files and look up their block locations
for (FileStatus file: files) {
Path path = file.getPath();
long length = file.getLen();
if (length != 0) {
BlockLocation[] blkLocations;
if (file instanceof LocatedFileStatus) {
blkLocations = ((LocatedFileStatus) file).getBlockLocations();
} else {
FileSystem fs = path.getFileSystem(job.getConfiguration());
blkLocations = fs.getFileBlockLocations(file, 0, length);
}
if (isSplitable(job, path)) {
long blockSize = file.getBlockSize();
//take the smaller of blockSize and maxSize, then the larger of that result and minSize
long splitSize = computeSplitSize(blockSize, minSize, maxSize);
long bytesRemaining = length;
//keep cutting splits while the remaining length is more than 1.1 (SPLIT_SLOP) times splitSize
//so the last split can be anywhere above 0 and up to 1.1*splitSize
while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
int blkIndex = getBlockIndex(blkLocations, length-bytesRemaining);
splits.add(makeSplit(path, length-bytesRemaining, splitSize,
blkLocations[blkIndex].getHosts(),
blkLocations[blkIndex].getCachedHosts()));
bytesRemaining -= splitSize;
}
if (bytesRemaining != 0) {
int blkIndex = getBlockIndex(blkLocations, length-bytesRemaining);
splits.add(makeSplit(path, length-bytesRemaining, bytesRemaining,
blkLocations[blkIndex].getHosts(),
blkLocations[blkIndex].getCachedHosts()));
}
} else { // not splitable
splits.add(makeSplit(path, 0, length, blkLocations[0].getHosts(),
blkLocations[0].getCachedHosts()));
}
} else {
//Create empty hosts array for zero length files
splits.add(makeSplit(path, 0, length, new String[0]));
}
}
// Save the number of input files for metrics/loadgen
job.getConfiguration().setLong(NUM_INPUT_FILES, files.size());
sw.stop();
if (LOG.isDebugEnabled()) {
LOG.debug("Total # of splits generated by getSplits: " + splits.size()
+ ", TimeTaken: " + sw.elapsedMillis());
}
return splits;
}
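The effect of SPLIT_SLOP is easiest to see with numbers. The sketch below copies only the arithmetic of the loop above and returns the resulting split lengths; SplitSlopSketch and splitLengths are hypothetical names, sizes are given in MB for readability, and the separate handling of zero-length and non-splittable files is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Arithmetic-only model of the split loop in getSplits(); not Hadoop code.
class SplitSlopSketch {
    static final double SPLIT_SLOP = 1.1;

    // Returns the length of each split for one splittable, non-empty file.
    static List<Long> splitLengths(long fileLength, long splitSize) {
        List<Long> lengths = new ArrayList<>();
        long bytesRemaining = fileLength;
        // Cut full-size splits while more than 1.1 * splitSize remains.
        while (((double) bytesRemaining) / splitSize > SPLIT_SLOP) {
            lengths.add(splitSize);
            bytesRemaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (bytesRemaining != 0) {
            lengths.add(bytesRemaining);
        }
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(splitLengths(260, 128)); // prints [128, 132]
        System.out.println(splitLengths(300, 128)); // prints [128, 128, 44]
        System.out.println(splitLengths(140, 128)); // prints [140]
    }
}
```

A 260 MB file therefore yields two splits rather than three: after the first 128 MB, the remaining 132 MB is within 1.1 times the split size and is kept as one split.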
Source of computeSplitSize():
protected long computeSplitSize(long blockSize, long minSize,
long maxSize) {
return Math.max(minSize, Math.min(maxSize, blockSize));
}
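A few worked values show how the three inputs interact. The sketch repeats the one-line formula in a standalone class (SplitSizeSketch is a hypothetical name) and assumes a 128 MB block size purely as an example.

```java
// Worked examples for the split size formula; the formula is copied from above.
class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 128 * MB; // example block size, an assumption

        // Defaults (min 1, max Long.MAX_VALUE): the split equals the block size.
        System.out.println(computeSplitSize(blockSize, 1L, Long.MAX_VALUE) / MB); // prints 128

        // A maximum below the block size shrinks the split.
        System.out.println(computeSplitSize(blockSize, 1L, 64 * MB) / MB); // prints 64

        // A minimum above the block size enlarges the split.
        System.out.println(computeSplitSize(blockSize, 256 * MB, Long.MAX_VALUE) / MB); // prints 256
    }
}
```

In other words, lowering the maximum produces more map tasks, while raising the minimum produces fewer.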
Who actually submits the job?
Click into submitClient.submitJob().
The source shown here is the YARN-mode implementation.
@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
throws IOException, InterruptedException {
addHistoryToken(ts);
// Construct necessary information to start the MR AM
ApplicationSubmissionContext appContext =
createApplicationSubmissionContext(conf, jobSubmitDir, ts);
// Submit to ResourceManager
//with this call the MapReduce client's part of job submission is essentially done
try {
ApplicationId applicationId =
resMgrDelegate.submitApplication(appContext);
ApplicationReport appMaster = resMgrDelegate
.getApplicationReport(applicationId);
String diagnostics =
(appMaster == null ?
"application report is null" : appMaster.getDiagnostics());
if (appMaster == null
|| appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
|| appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
throw new IOException("Failed to run job : " +
diagnostics);
}
return clientCache.getClient(jobId).getJobStatus(jobId);
} catch (YarnException e) {
throw new IOException(e);
}
}
This completes the walkthrough of the overall MapReduce Job submission process.