hadoop yarn

最新推荐文章于 2022-08-11 10:49:10 发布

master-dragon

最新推荐文章于 2022-08-11 10:49:10 发布

阅读量412

点赞数

分类专栏： hadoop/hive/hbase

本文链接：https://blog.csdn.net/qq_26437925/article/details/106087593

版权

hadoop/hive/hbase 专栏收录该内容

15 篇文章 0 订阅

订阅专栏

前言

核心思想：The fundamental idea of YARN is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons.

hadoop 1.0 JobTracker

hadoop 1.0，图书《Hadoop技术内幕：深入理解MapReduce架构设计与实现原理》

在这里插入图片描述

计算模型理解

JobTracker

资源管理
任务调度

TaskTracker

任务管理
资源汇报（心跳 JobTracker）

客户端（前期准备工作&计算的发起）
(MapReduce, jar)

根据计算数据，返回 split 清单（咨询Name Node 获取元数据），即得到了MapTask的数量，计算向数据移动可以支持了
生成计算程序未来运行时的一些配置文件
客户端把准备的东西传到hdfs（这种数据移动是可靠的），未来TaskTracker去hdfs拉取要计算的数据
客户端调用JobTracker, 发起计算程序，并且准备的东西也告知了JobTracker

JobTracker & TaskTracker（对一个任务的运行）

JobTracker收到启动程序之后： a. 获取到split 清单；b. 确定split计算分配给哪些个TaskTracker；c: 未来，心跳TaskTracker获取任务运行信息相关
TaskTracker 心跳获取任务：从hdfs中获取配置文件到本机，然后启动MapTask 或者 ReduceTask (逻辑代码就在jar中)

多个任务发到JobTracker ?（单点）

资源管理
任务调度问题

缺点：

JobTracker 存在单点故障；能力有限，扩容有限
JobTracker 耦合了【资源管理和任务调度】，如果一个新的计算框架过来，则不能利用原来的资源管理了，不利于扩展
TaskTracker端，以Map/Reduce task的数目作为资源的表示过于简单，没有考虑CPU，内存（即真正的计算资源）
只能运行MapReduce；TaskTracker端, 把资源强制划分为了Map task slot 和 Reduce task slot

yarn( Yet Another Resource Negotiator)

https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html

https://www.edureka.co/blog/hadoop-yarn-tutorial/

YARN enabled the users to perform operations as per requirement by using a variety of tools like Spark for real-time processing, Hive for SQL, HBase for NoSQL and others.
(yarn 能够让 spark, hive, hbase等数据处理任务调度运行起来，不单单只是MapReduce任务)
架构图

Resource Manager: Runs on a master daemon and manages the resource allocation in the cluster.
（处理客户端请求，负责集群资源的管理和调度）
Node Manager: They run on the slave daemons and are responsible for the execution of a task on every single Data Node.
（负责自己所在节点的应用资源的使用情况，并向RM汇报。接收并处理来自RM,AM的各种命令）
Application Master: Manages the user job lifecycle and resource needs of individual applications. It works along with the Node Manager and monitors the execution of tasks.
（每个application都对应一个AM. 主要负责应用程序的管理，向RM申请资源，并分配给task。AM与NM通信，来启动或停止task）
Container: Package of resources including RAM, CPU, Network, HDD etc on a single node.
(封装了CPU,内存等资源的一个容器，相当于一个任务运行环境的抽象)

对比 hadoop 1.0 架构

客户端变化不大, 做的事情相同；不过原来客户端联系JobTracker, 现在联系Resource Manager (独立的资源管理)

RM 知晓所有 NodeManager 的资源使用情况, 要执行计算的时候，RM会选择一个NodeManager来启动一个Application Master （即为客户端启动一个计算程序的主节点）
AM(类似原来的TaskTracker，但没有资源管理)来启动任务：仍然是根据客户端上传到HFDF的split清单，配置等；且必须向 RM 申请资源，去哪些节点跑任务
AM 要处理任务，由RM来决策；RM 通信 NM 来计算资源，分配Container(计算资源的抽象); Container反向注册给AM，AM就知道有多少个container归自己使用
接着AM来分配MapTask, ReduceTask 到 container，即一个应用任务跑起来了

知识

container 容器

资源抽象： cpu，内存，磁盘等的一个量，归属于某个Node Manager
物理上：一个JVM进程（a. NodeManager用线程监控container资源情况,比如资源使用超过，container 被 kill 掉；重试; b. 内核cgroup技术：在启动jvm进程，有kernel约束死）

新的框架

ResourceManager 负责整体资源的管理
NodeManager 心跳 RM，提交自己的资源使用情况

MapReduce 在 yarn 中运行？

客户端提交任务，仍然是jar,配置等上传到HDFS，要访问RM，去申请AppMaster
RM会选择一个不忙的NodeManager启动一个container，在里面反射一个MRAppMaster
启动MRAppMaster，从HDFS下载splits相关，向RM申请资源
RM来确定资源的使用，通知NM去启动container
container启动后会反向注册到已经启动的MRAppMaster进程
MRAppMaster会将任务发送给container（启动Map任务,Reduce任务）
container会反射相应的Task为对象，调用执行，让业务代码跑起来
Task失败有重试

类比图理解架构

在这里插入图片描述

客户端请求看成一个外界请求，比如需要造一个购物APP微信小程序
这些需求由Boss(RM角色)来接，决定怎么做，且Boss是拥有各种资源的
Boss 发现某几个部门配合就能完成，找到部门负责人；部门负责人（NM角色）要负责自己部门的资源使用情况，同时要汇报Boss
部门负责人下面有项目经理（AM角色），来负责各种任务；为了完成任务，需要的各种资源，向RM申请就好了，当然也要通知NM
资源（可以任务是各种人员，设备之类的）即完成某个任务必备的（Container角色）

源码走读

当使用yarn的时候，我们在mapred-site.xml中配置了如下的实践版本：hadoop 2.7.2，当然还有别的配置）

<property>
    <name>mapreduce.framework.name</name>
	<value>yarn</value>
</property>

整个就是沿着job.waitForCompletion(true);这么简单的一句走下去

任务提交/准备的流程图

在这里插入图片描述

client 提交任务
rm会生成jobId，返回给client
任务配置(job.split, xml, jar等)上传hdfs
任务配置上传完成，通知rm
rm将job添加到队列上进行调度，启动am
am找到nm去负责该job
nm会启动容器生成 Mapreduce AppMaster
AppMaster 获取 hdfs上到任务配置，然后开始内部操作

资源调度，rm给AppMaster分配nm，nm启动container跑任务，随时汇报给appMaster，最终通知rm：任务结束，资源释放

job.waitForCompletion(true) -> (Job)this.submit(); -> submitter.submitJobInternal(Job.this, Job.this.cluster)

submitter.submitJobInternal(Job.this, Job.this.cluster)

public void submit() throws IOException, InterruptedException, ClassNotFoundException {
   this.ensureState(Job.JobState.DEFINE);
    this.setUseNewAPI();
    this.connect();
    final JobSubmitter submitter = this.getJobSubmitter(this.cluster.getFileSystem(), this.cluster.getClient());
    this.status = (JobStatus)this.ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
        public JobStatus run() throws IOException, InterruptedException, ClassNotFoundException {
            return submitter.submitJobInternal(Job.this, Job.this.cluster);
        }
    });
    this.state = Job.JobState.RUNNING;
    LOG.info("The url to track the job: " + this.getTrackingURL());
}

cluster new出来的时候执行了this.initialize(jobTrackAddr, conf);,其中ClientProtocol使用的YarnClientProtocolProvider的create方法

@Override
public ClientProtocol create(Configuration conf) throws IOException {
  if (MRConfig.YARN_FRAMEWORK_NAME.equals(conf.get(MRConfig.FRAMEWORK_NAME))) {
    return new YARNRunner(conf);
  }
  return null;
}

submitJobInternal 大概做的几件事情

/**
   * Internal method for submitting jobs to the system.
   * 
   * <p>The job submission process involves:
   * <ol>
   *   <li>
   *   Checking the input and output specifications of the job. (检查任务的输入输出)
   *   </li>
   *   <li>
   *   Computing the {@link InputSplit}s for the job.（计算任务的splits）
   *   </li>
   *   <li>
   *   Setup the requisite accounting information for the 
   *   {@link DistributedCache} of the job, if necessary.
   *   </li>
   *   <li>
   *   Copying the job's jar and configuration to the map-reduce system (把任务的jar，配置等拷贝到hdfs中)
   *   directory on the distributed file-system.
   *   </li>
   *   <li>
   *   Submitting the job to the <code>JobTracker</code> and optionally (任务提交到JobTracker,并监控)
   *   monitoring it's status.
   *   </li>
   * </ol></p>
   * @param job the configuration to submit
   * @param cluster the handle to the Cluster
   * @throws ClassNotFoundException
   * @throws InterruptedException
   * @throws IOException
   */
  JobStatus submitJobInternal(Job job, Cluster cluster) 
  throws ClassNotFoundException, InterruptedException, IOException {

整个源码方法

 JobStatus submitJobInternal(Job job, Cluster cluster) 
  throws ClassNotFoundException, InterruptedException, IOException {

    //validate the jobs output specs 
    checkSpecs(job);

    Configuration conf = job.getConfiguration();
    addMRFrameworkToDistributedCache(conf);

    Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
    //configure the command line options correctly on the submitting dfs
    InetAddress ip = InetAddress.getLocalHost();
    if (ip != null) {
      submitHostAddress = ip.getHostAddress();
      submitHostName = ip.getHostName();
      conf.set(MRJobConfig.JOB_SUBMITHOST,submitHostName);
      conf.set(MRJobConfig.JOB_SUBMITHOSTADDR,submitHostAddress);
    }
    // 生成JobId, 类似: `application_1517538889175_2550` 这种
    JobID jobId = submitClient.getNewJobID();
    job.setJobID(jobId);
    Path submitJobDir = new Path(jobStagingArea, jobId.toString());
    JobStatus status = null;
    try {
      conf.set(MRJobConfig.USER_NAME,
          UserGroupInformation.getCurrentUser().getShortUserName());
      conf.set("hadoop.http.filter.initializers", 
          "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
      conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
      LOG.debug("Configuring job " + jobId + " with " + submitJobDir 
          + " as the submit dir");
      // get delegation token for the dir
      TokenCache.obtainTokensForNamenodes(job.getCredentials(),
          new Path[] { submitJobDir }, conf);
      
      populateTokenCache(conf, job.getCredentials());

      // generate a secret to authenticate shuffle transfers
      if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
        KeyGenerator keyGen;
        try {
          keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
          keyGen.init(SHUFFLE_KEY_LENGTH);
        } catch (NoSuchAlgorithmException e) {
          throw new IOException("Error generating shuffle secret key", e);
        }
        SecretKey shuffleKey = keyGen.generateKey();
        TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
            job.getCredentials());
      }
      if (CryptoUtils.isEncryptedSpillEnabled(conf)) {
        conf.setInt(MRJobConfig.MR_AM_MAX_ATTEMPTS, 1);
        LOG.warn("Max job attempts set to 1 since encrypted intermediate" +
                "data spill is enabled");
      }

      // 把任务相关的文件，配置，jars上传
      copyAndConfigureFiles(job, submitJobDir);

      // 获取配置文件job.xml的路径
      Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);

      // 输入文件的splits,配置信息写入job信息中, map数量也即splits数量
      // Create the splits for the job
      LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
      int maps = writeSplits(job, submitJobDir);
      conf.setInt(MRJobConfig.NUM_MAPS, maps);
      LOG.info("number of splits:" + maps);

      int maxMaps = conf.getInt(MRJobConfig.JOB_MAX_MAP,
          MRJobConfig.DEFAULT_JOB_MAX_MAP);
      if (maxMaps >= 0 && maxMaps < maps) {
        throw new IllegalArgumentException("The number of map tasks " + maps +
            " exceeded limit " + maxMaps);
      }

      // 设置job使用的资源队列
      // write "queue admins of the queue to which job is being submitted"
      // to job file.
      String queue = conf.get(MRJobConfig.QUEUE_NAME,
          JobConf.DEFAULT_QUEUE_NAME);
      AccessControlList acl = submitClient.getQueueAdmins(queue);
      conf.set(toFullPropertyName(queue,
          QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

      // removing jobtoken referrals before copying the jobconf to HDFS
      // as the tasks don't need this setting, actually they may break
      // because of it if present as the referral will point to a
      // different job.
      TokenCache.cleanUpTokenReferral(conf);

      if (conf.getBoolean(
          MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
          MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
        // Add HDFS tracking ids
        ArrayList<String> trackingIds = new ArrayList<String>();
        for (Token<? extends TokenIdentifier> t :
            job.getCredentials().getAllTokens()) {
          trackingIds.add(t.decodeIdentifier().getTrackingId());
        }
        conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
            trackingIds.toArray(new String[trackingIds.size()]));
      }

      // Set reservation info if it exists
      ReservationId reservationId = job.getReservationId();
      if (reservationId != null) {
        conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
      }

      // Write job file to submit dir
      writeConf(conf, submitJobFile);

      // submitClient.submitJob 真正提交作业
      //
      // Now, actually submit the job (using the submit name)
      //
      printTokens(jobId, job.getCredentials());
      status = submitClient.submitJob(
          jobId, submitJobDir.toString(), job.getCredentials());
      if (status != null) {
        return status;
      } else {
        throw new IOException("Could not launch job");
      }
    } finally {
      if (status == null) {
        LOG.info("Cleaning up the staging area " + submitJobDir);
        if (jtFs != null && submitJobDir != null)
          jtFs.delete(submitJobDir, true);

      }
    }
  }

YARNRunner submitJob

在这里插入图片描述

@Override
public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
throws IOException, InterruptedException {
  
  addHistoryToken(ts);

  ApplicationSubmissionContext appContext =
    createApplicationSubmissionContext(conf, jobSubmitDir, ts);

  // Submit to ResourceManager
  try {
    ApplicationId applicationId =
        resMgrDelegate.submitApplication(appContext);

    ApplicationReport appMaster = resMgrDelegate
        .getApplicationReport(applicationId);
    String diagnostics =
        (appMaster == null ?
            "application report is null" : appMaster.getDiagnostics());
    if (appMaster == null
        || appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
        || appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
      throw new IOException("Failed to run job : " +
          diagnostics);
    }
    return clientCache.getClient(jobId).getJobStatus(jobId);
  } catch (YarnException e) {
    throw new IOException(e);
  }
}

/**
 * {@code ApplicationSubmissionContext} represents all of the
 * information needed by the {@code ResourceManager} to launch
 * the {@code ApplicationMaster} for an application.
 * <p>
 * It includes details such as:
 * <ul>
 *   <li>{@link ApplicationId} of the application.</li>
 *   <li>Application user.</li>
 *   <li>Application name.</li>
 *   <li>{@link Priority} of the application.</li>
 *   <li>
 *     {@link ContainerLaunchContext} of the container in which the
 *     <code>ApplicationMaster</code> is executed.
 *   </li>
 *   <li>
 *     maxAppAttempts. The maximum number of application attempts.
 *     It should be no larger than the global number of max attempts in the
 *     YARN configuration.
 *   </li>
 *   <li>
 *     attemptFailuresValidityInterval. The default value is -1.
 *     when attemptFailuresValidityInterval in milliseconds is set to
 *     {@literal >} 0, the failure number will no take failures which happen
 *     out of the validityInterval into failure count. If failure count
 *     reaches to maxAppAttempts, the application will be failed.
 *   </li>
 *   <li>Optional, application-specific {@link LogAggregationContext}</li>
 * </ul>
 * 
 * @see ContainerLaunchContext
 * @see ApplicationClientProtocol#submitApplication(org.apache.hadoop.yarn.api.protocolrecords.SubmitApplicationRequest)
 */
@Public
@Stable
public abstract class ApplicationSubmissionContext {