Slot Compute-Resource Management in the Flink ResourceManager

Slot Management in the ResourceManager

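The ResourceManager extends FencedRpcEndpoint and thereby exposes an RPC service. Its main internal components are:

  1. SlotManager, which manages the slot resources reported by all TaskExecutors and handles slot requests
  2. JobLeaderIdService, which tracks the current (high-availability) leader JobMaster of every registered job, together with the leader election service leaderElectionService
  3. The heartbeat managers taskManagerHeartbeatManager and jobManagerHeartbeatManager (HeartbeatManagerSenderImpl)
  4. The metrics service MetricRegistry
  5. The set of all registered TaskExecutors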

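       The ResourceManager is then started through ResourceManager#start(). In its startup callback it obtains the leader election service from HighAvailabilityServices and joins the election. It also starts the JobLeaderIdService, which manages the leader IDs of the jobs registered with this ResourceManager.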

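        The previous section briefly covered slot management inside a TaskExecutor, which only concerns the slots of a single TaskExecutor. The ResourceManager, in contrast, has to manage the slots of all registered TaskExecutors as a whole. Every JobManager requests resources from the ResourceManager, and the ResourceManager, based on the current resource usage of the cluster (ResourceManager, TaskExecutor, JobManager), "forwards" each request to a TaskExecutor to claim a slot.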

SlotManager

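        The SlotManager keeps a global view of how many TaskManagers exist, how many free slots each TaskManager has, and how the slots are being used. When a Flink job is scheduled, its tasks are placed according to the slot allocation strategy. The main responsibilities of the SlotManager are: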

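  1. For TaskManagers: registration, unregistration, and idle release. Registration adds slots to the cluster; unregistration and idle release free resources and return them to the underlying resource-management cluster.
  2. For Flink jobs: accepting slot requests, slot releases, and resource reports. When resources are insufficient, the SlotManager parks the request in a waiting queue and notifies the ResourceManager to acquire more resources by starting new TaskManagers. Once a new TaskManager has registered with the SlotManager, its slots become available and are assigned to the waiting requests in turn.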

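       The ResourceManager delegates slot management to its internal SlotManager. The SlotManager tracks the state and allocation of every slot on every registered TaskExecutor, as well as all slot requests that are still waiting (pendingSlotRequests). Whenever a new slot is registered or an allocated slot is released, the SlotManager tries to satisfy a waiting slot request. If the available slots are insufficient, the SlotManager notifies the ResourceManager through ResourceActions#allocateResource(ResourceProfile), and the ResourceManager may then try to start new TaskExecutors (for example by requesting additional containers from Yarn). In addition, TaskExecutors that stay idle for too long and pending slot requests that remain unsatisfied for too long are handled by timeout mechanisms.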
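The interplay described above (serve a request from the free slots, otherwise queue it and ask for more resources, then drain the queue when a slot shows up) can be sketched with a toy model. Everything in it is a hypothetical simplification: MiniSlotManager, offerSlot, and the string IDs are not Flink classes, requests are matched first-in-first-out without comparing resource profiles, and a plain Runnable stands in for ResourceActions#allocateResource.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of the SlotManager's bookkeeping; all names are hypothetical. */
class MiniSlotManager {
    private final Deque<String> freeSlots = new ArrayDeque<>();       // slot ids that are currently free
    private final Deque<String> pendingRequests = new ArrayDeque<>(); // request ids waiting for a slot
    private final List<String> assignments = new ArrayList<>();       // "request->slot" records, for inspection
    private final Runnable allocateResource;                          // stand-in for ResourceActions#allocateResource

    MiniSlotManager(Runnable allocateResource) {
        this.allocateResource = allocateResource;
    }

    /** A job asks for a slot: serve it from the free pool, or queue it and ask for more resources. */
    void requestSlot(String requestId) {
        String slotId = freeSlots.poll();
        if (slotId != null) {
            assignments.add(requestId + "->" + slotId);
        } else {
            pendingRequests.add(requestId);
            allocateResource.run();
        }
    }

    /** A slot was registered or released: hand it to the oldest waiting request, if there is one. */
    void offerSlot(String slotId) {
        String requestId = pendingRequests.poll();
        if (requestId != null) {
            assignments.add(requestId + "->" + slotId);
        } else {
            freeSlots.add(slotId);
        }
    }

    List<String> assignments() { return assignments; }
    int pendingCount() { return pendingRequests.size(); }
    int freeCount() { return freeSlots.size(); }
}
```

With one free slot and two requests, the first request is served immediately, the second triggers the resource callback once and waits, and it is served as soon as another slot is offered.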

       Some of the more important member fields of the SlotManager are listed below; they mainly hold the state of the slots and the pending slot requests:

class SlotManager {
	/** Map for all registered slots. */
	private final HashMap<SlotID, TaskManagerSlot> slots; // every slot known to the SlotManager

	/** Index of all currently free slots. */
	private final LinkedHashMap<SlotID, TaskManagerSlot> freeSlots;

	/** All currently registered task managers. */
	private final HashMap<InstanceID, TaskManagerRegistration> taskManagerRegistrations; // every registered TaskManager

	/** Map of fulfilled and active allocations for request deduplication purposes. */
	private final HashMap<AllocationID, SlotID> fulfilledSlotRequests;

	/** Map of pending/unfulfilled slot allocation requests. */
	private final HashMap<AllocationID, PendingSlotRequest> pendingSlotRequests;

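	// When the registered slots are insufficient, ResourceActions#allocateResource(ResourceProfile) is used to request new resources (from Yarn in yarn-cluster mode).
	// The ResourceManager may start a new TaskManager in response, or it may do nothing.
	// Slots requested this way are tracked as PendingTaskManagerSlot.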
	private final HashMap<TaskManagerSlotId, PendingTaskManagerSlot> pendingSlots;

	/** ResourceManager's id. */
	private ResourceManagerId resourceManagerId;

	/** Callbacks for resource (de-)allocations. */
	private ResourceActions resourceActions;
}

Slot Registration and Periodic Heartbeat Reports

        When a new TaskManager registers with the RM, it calls ResourceManager#registerTaskExecutor() over RPC. The RM stores the registration information (gateway, connection information, hardware description, and so on) in its Map<ResourceID, WorkerRegistration<WorkerType>> taskExecutors field and replies with a TaskExecutorRegistrationSuccess object (registrationId, resourceManagerResourceId, clusterInformation). When the TaskManager receives this response, it invokes its registration-success callback, TaskManager#TaskExecutorToResourceManagerConnection#ResourceManagerRegistrationListener#onRegistrationSuccess(), which establishes the TM-to-RM connection and reports the TaskManager's slots to the RM through the RPC call resourceManagerGateway.sendSlotReport().

        When the ResourceManager receives the sendSlotReport RPC from a TaskExecutor, it delegates to the SlotManager, which registers that TaskExecutor's slots.

        In addition, each TaskExecutor periodically reports the state of its slots to the ResourceManager through heartbeats; the reportSlotStatus method then updates the slot states.

//SlotManager#registerTaskManager
public void registerTaskManager(final TaskExecutorConnection taskExecutorConnection, SlotReport initialSlotReport) {
   checkInit();
   LOG.debug("Registering TaskManager {} under {} at the SlotManager.", taskExecutorConnection.getResourceID(), taskExecutorConnection.getInstanceID());

   // we identify task managers by their instance id
   if (taskManagerRegistrations.containsKey(taskExecutorConnection.getInstanceID())) {
      reportSlotStatus(taskExecutorConnection.getInstanceID(), initialSlotReport);
   } else {
      // first register the TaskManager
      ArrayList<SlotID> reportedSlots = new ArrayList<>();
      for (SlotStatus slotStatus : initialSlotReport) {
         reportedSlots.add(slotStatus.getSlotID());
      }
      // record the instance ID together with its TaskManagerRegistration (connection and reported slots)
      TaskManagerRegistration taskManagerRegistration = new TaskManagerRegistration(
         taskExecutorConnection,
         reportedSlots);
      taskManagerRegistrations.put(taskExecutorConnection.getInstanceID(), taskManagerRegistration);

      // next register the new slots
      // register every reported slot in turn
      for (SlotStatus slotStatus : initialSlotReport) {
         registerSlot(
            slotStatus.getSlotID(),
            slotStatus.getAllocationID(),
            slotStatus.getJobID(),
            slotStatus.getResourceProfile(),
            taskExecutorConnection);
      }
   }
}

// ............
private void registerSlot( // registration flow for a single slot
      SlotID slotId,
      AllocationID allocationId,
      JobID jobId,
      ResourceProfile resourceProfile,
      TaskExecutorConnection taskManagerConnection) {

   if (slots.containsKey(slotId)) {
      // remove the old slot first
      removeSlot(slotId);
   }
   // create a TaskManagerSlot object and add it to slots
   final TaskManagerSlot slot = createAndRegisterTaskManagerSlot(slotId, resourceProfile, taskManagerConnection);

   final PendingTaskManagerSlot pendingTaskManagerSlot;
   if (allocationId == null) {
      // the slot is not allocated yet: look for a PendingTaskManagerSlot whose resource profile matches this slot exactly
      pendingTaskManagerSlot = findExactlyMatchingPendingTaskManagerSlot(resourceProfile);
   } else {
      // the slot is already allocated
      pendingTaskManagerSlot = null;
   }

   if (pendingTaskManagerSlot == null) {
      // two cases: 1. the slot is already allocated  2. there is no matching PendingTaskManagerSlot
      updateSlot(slotId, allocationId, jobId);
   } else {
      // the newly registered slot fulfils the PendingTaskManagerSlot; pass it on to the slot request assigned to that pending slot, if any
      pendingSlots.remove(pendingTaskManagerSlot.getTaskManagerSlotId());
      final PendingSlotRequest assignedPendingSlotRequest = pendingTaskManagerSlot.getAssignedPendingSlotRequest();
      // the PendingTaskManagerSlot may have a PendingSlotRequest assigned to it
      if (assignedPendingSlotRequest == null) {
         handleFreeSlot(slot);  // no assigned PendingSlotRequest: look for any matching pending request, otherwise keep the slot as FREE
      } else {
         assignedPendingSlotRequest.unassignPendingTaskManagerSlot();
         allocateSlot(slot, assignedPendingSlotRequest); // an assigned PendingSlotRequest exists and can now be satisfied: allocate the slot
      }
   }
}

// ............
private void handleFreeSlot(TaskManagerSlot freeSlot) {
   Preconditions.checkState(freeSlot.getState() == TaskManagerSlot.State.FREE);
   // first look for a PendingSlotRequest that this slot can satisfy
   PendingSlotRequest pendingSlotRequest = findMatchingRequest(freeSlot.getResourceProfile());

   if (null != pendingSlotRequest) {
      allocateSlot(freeSlot, pendingSlotRequest); // a matching PendingSlotRequest exists: allocate the slot
   } else {
      freeSlots.put(freeSlot.getSlotId(), freeSlot);
   }
}
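The registration path above distinguishes a first-time registration from a repeated report by the same instance ID. A minimal sketch of that idea follows, under simplifying assumptions: SlotRegistrySketch and its methods are hypothetical, IDs are plain strings, a slot report is a map from slot ID to allocation ID (null meaning free), and matching against pending requests is left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: the first report registers a TaskManager, later reports only update slot state. */
class SlotRegistrySketch {
    private final Map<String, Set<String>> slotsByTaskManager = new HashMap<>(); // instance id -> slot ids
    private final Map<String, String> allocationBySlot = new HashMap<>();        // slot id -> allocation id (null = free)

    /** Returns true if the TaskManager was new, false if the report was treated as a status update. */
    boolean registerTaskManager(String instanceId, Map<String, String> slotReport) {
        boolean isNew = !slotsByTaskManager.containsKey(instanceId);
        if (isNew) {
            slotsByTaskManager.put(instanceId, new HashSet<>(slotReport.keySet()));
        }
        // In both cases the reported allocation state replaces what was recorded before.
        for (Map.Entry<String, String> entry : slotReport.entrySet()) {
            allocationBySlot.put(entry.getKey(), entry.getValue());
        }
        return isNew;
    }

    /** A slot is free if it is known and has no allocation id attached. */
    boolean isFree(String slotId) {
        return allocationBySlot.containsKey(slotId) && allocationBySlot.get(slotId) == null;
    }

    int slotCount(String instanceId) {
        Set<String> slots = slotsByTaskManager.get(instanceId);
        return slots == null ? 0 : slots.size();
    }
}
```

Calling registerTaskManager twice with the same instance ID therefore never creates a second registration; the second call only refreshes the recorded slot state, which is what the heartbeat reports rely on.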

Requesting a Slot

        ResourceManager#requestSlot delegates to SlotManager#registerSlotRequest(SlotRequest slotRequest). A SlotRequest carries the requesting JobID (the job that the slot will be assigned to), an AllocationID, and a ResourceProfile describing the requested resources. The SlotManager wraps the request in a PendingSlotRequest, which marks it as a pending request that has not been satisfied yet.

// ResourceManager#requestSlot()
public CompletableFuture<Acknowledge> requestSlot(
      JobMasterId jobMasterId,
      SlotRequest slotRequest,
      final Time timeout) {

   JobID jobId = slotRequest.getJobId();
   JobManagerRegistration jobManagerRegistration = jobManagerRegistrations.get(jobId); // look up the registration (which holds the JobMaster's RPC gateway) for this job ID

   if (null != jobManagerRegistration) {
      if (Objects.equals(jobMasterId, jobManagerRegistration.getJobMasterId())) {
         log.info("Request slot with profile {} for job {} with allocation id {}.",
            slotRequest.getResourceProfile(),
            slotRequest.getJobId(),
            slotRequest.getAllocationId());

         try {
            slotManager.registerSlotRequest(slotRequest); // delegate the slot request to the internal SlotManager
         } catch (SlotManagerException e) {
            return FutureUtils.completedExceptionally(e);
         }
         return CompletableFuture.completedFuture(Acknowledge.get());
      } else {
         return FutureUtils.completedExceptionally(new ResourceManagerException("The job leader's id " +
            jobManagerRegistration.getJobMasterId() + " does not match the received id " + jobMasterId + '.'));
      }
   } else {
      return FutureUtils.completedExceptionally(new ResourceManagerException("Could not find registered job manager for job " + jobId + '.'));
   }
}

// SlotManager#registerSlotRequest(slotRequest)
public boolean registerSlotRequest(SlotRequest slotRequest) throws SlotManagerException {
   checkInit();
   if (checkDuplicateRequest(slotRequest.getAllocationId())) {
      LOG.debug("Ignoring a duplicate slot request with allocation id {}.", slotRequest.getAllocationId());
      return false;
   } else {
      PendingSlotRequest pendingSlotRequest = new PendingSlotRequest(slotRequest); // wrap the request in a PendingSlotRequest
      pendingSlotRequests.put(slotRequest.getAllocationId(), pendingSlotRequest);
      try {
         internalRequestSlot(pendingSlotRequest); // run the actual slot allocation logic
      } catch (ResourceManagerException e) {
         // requesting the slot failed --> remove pending slot request
         pendingSlotRequests.remove(slotRequest.getAllocationId());
         throw new SlotManagerException("Could not fulfill slot request " + slotRequest.getAllocationId() + '.', e);
      }
      return true;
   }
}

private void internalRequestSlot(PendingSlotRequest pendingSlotRequest) throws ResourceManagerException {
   final ResourceProfile resourceProfile = pendingSlotRequest.getResourceProfile();
   TaskManagerSlot taskManagerSlot = findMatchingSlot(resourceProfile); // first look among the registered FREE slots for one whose resources (CPU; heap, direct, native, and network memory) cover the request

   if (taskManagerSlot != null) {
      allocateSlot(taskManagerSlot, pendingSlotRequest); // a suitable slot was found: try to allocate it to this pendingSlotRequest
   } else {
      // otherwise look for a matching PendingTaskManagerSlot;
      // if there is none either, ask the ResourceManager for more resources
      // through the ResourceActions#allocateResource(ResourceProfile) callback
      Optional<PendingTaskManagerSlot> pendingTaskManagerSlotOptional = findFreeMatchingPendingTaskManagerSlot(resourceProfile);
      if (!pendingTaskManagerSlotOptional.isPresent()) {
         // request more resources (in Yarn mode: another container, in which a Flink TaskManager is started with a java command)
         pendingTaskManagerSlotOptional = allocateResource(resourceProfile);
      }
      // assign the PendingTaskManagerSlot to the PendingSlotRequest
      pendingTaskManagerSlotOptional.ifPresent(pendingTaskManagerSlot -> assignPendingTaskManagerSlot(pendingSlotRequest, pendingTaskManagerSlot));
   }
}
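The lookup order in internalRequestSlot (free slot first, then a slot that has been requested but not yet registered, then a new resource request) can be illustrated with a small self-contained sketch. The names SlotLookupSketch, Profile, and Outcome are hypothetical, the profile has only two made-up dimensions, and the sketch assumes that every new worker contributes a fixed number of identical slots that satisfy any request.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of the three-step lookup; these are not Flink's classes. */
class SlotLookupSketch {

    /** Simplified resource profile with two dimensions (assumed for illustration). */
    static final class Profile {
        final double cpuCores;
        final int memoryMb;

        Profile(double cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }

        /** True if this profile offers at least what the required profile asks for. */
        boolean isMatching(Profile required) {
            return cpuCores >= required.cpuCores && memoryMb >= required.memoryMb;
        }
    }

    enum Outcome { FREE_SLOT, PENDING_SLOT, NEW_RESOURCE }

    private final List<Profile> freeSlots = new ArrayList<>();
    private final List<Profile> pendingSlots = new ArrayList<>(); // requested from the cluster, not registered yet
    private final Profile workerSlotProfile;
    private final int slotsPerWorker;
    int workersRequested = 0;

    SlotLookupSketch(Profile workerSlotProfile, int slotsPerWorker) {
        this.workerSlotProfile = workerSlotProfile;
        this.slotsPerWorker = slotsPerWorker;
    }

    void addFreeSlot(Profile profile) {
        freeSlots.add(profile);
    }

    Outcome request(Profile required) {
        if (takeFirstMatch(freeSlots, required)) {
            return Outcome.FREE_SLOT;
        }
        if (takeFirstMatch(pendingSlots, required)) {
            return Outcome.PENDING_SLOT;
        }
        // Ask for a new worker: one of its future slots serves this request,
        // the remaining ones stay available as unassigned pending slots.
        workersRequested++;
        for (int i = 1; i < slotsPerWorker; i++) {
            pendingSlots.add(workerSlotProfile);
        }
        return Outcome.NEW_RESOURCE;
    }

    private static boolean takeFirstMatch(List<Profile> candidates, Profile required) {
        for (Iterator<Profile> it = candidates.iterator(); it.hasNext();) {
            if (it.next().isMatching(required)) {
                it.remove();
                return true;
            }
        }
        return false;
    }
}
```

With two slots per worker, a second request that arrives after a new worker was requested is served from the leftover pending slot instead of triggering another worker, which is the reason the pending slots are consulted before allocateResource is called.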

       The actual allocation logic in the SlotManager is allocateSlot(taskManagerSlot, pendingSlotRequest). It uses the TaskExecutorGateway RPC proxy to call gateway.requestSlot() on the target TaskExecutor, which asks that TaskExecutor to hand out the slot. The source is as follows:

// SlotManager#allocateSlot
private void allocateSlot(TaskManagerSlot taskManagerSlot, PendingSlotRequest pendingSlotRequest) {
   Preconditions.checkState(taskManagerSlot.getState() == TaskManagerSlot.State.FREE);
   // obtain the TaskExecutorGateway RPC proxy
   TaskExecutorConnection taskExecutorConnection = taskManagerSlot.getTaskManagerConnection();
   TaskExecutorGateway gateway = taskExecutorConnection.getTaskExecutorGateway();

   final CompletableFuture<Acknowledge> completableFuture = new CompletableFuture<>();
   final AllocationID allocationId = pendingSlotRequest.getAllocationId();
   final SlotID slotId = taskManagerSlot.getSlotId();
   final InstanceID instanceID = taskManagerSlot.getInstanceId();
   // the taskManagerSlot state becomes PENDING
   taskManagerSlot.assignPendingSlotRequest(pendingSlotRequest);
   pendingSlotRequest.setRequestFuture(completableFuture);

   // if a PendingTaskManagerSlot is assigned to this pendingSlotRequest, release that assignment first
   returnPendingTaskManagerSlotIfAssigned(pendingSlotRequest);

   TaskManagerRegistration taskManagerRegistration = taskManagerRegistrations.get(instanceID);
   if (taskManagerRegistration == null) {
      throw new IllegalStateException("Could not find a registered task manager for instance id " + instanceID + '.');
   }
   taskManagerRegistration.markUsed();

   // RPC call to the task manager
   // request the slot from the TaskExecutor over RPC
   CompletableFuture<Acknowledge> requestFuture = gateway.requestSlot(
      slotId,
      pendingSlotRequest.getJobId(),
      allocationId,
      pendingSlotRequest.getTargetAddress(),
      resourceManagerId,
      taskManagerRequestTimeout);

   requestFuture.whenComplete( 	// the RPC call has completed
      (Acknowledge acknowledge, Throwable throwable) -> {
         if (acknowledge != null) {
            completableFuture.complete(acknowledge);
         } else {
            completableFuture.completeExceptionally(throwable);
         }
      });

   // callback for completion of the PendingSlotRequest (either because the RPC above completed or because the PendingSlotRequest was cancelled)
   completableFuture.whenCompleteAsync(
      (Acknowledge acknowledge, Throwable throwable) -> {
         try {
            if (acknowledge != null) {  // success: cancel the now-fulfilled pendingSlotRequest and move the slot from PENDING to ALLOCATED
               updateSlot(slotId, allocationId, pendingSlotRequest.getJobId());
            } else {
               if (throwable instanceof SlotOccupiedException) { // the slot is already occupied: update its state
                  SlotOccupiedException exception = (SlotOccupiedException) throwable;
                  updateSlot(slotId, exception.getAllocationId(), exception.getJobId());
               } else {
                  removeSlotRequestFromSlot(slotId, allocationId); // the request failed: remove the pendingSlotRequest from the TaskManagerSlot
               }

               if (!(throwable instanceof CancellationException)) {
                  handleFailedSlotRequest(slotId, allocationId, throwable); // the slot request failed and will be retried
               } else {
                  LOG.debug("Slot allocation request {} has been cancelled.", allocationId, throwable); // cancelled on purpose
               }
            }
         } catch (Exception e) {
            LOG.error("Error while completing the slot allocation.", e);
         }
      },
      mainThreadExecutor);
}
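The reason allocateSlot uses two futures is that the request future can be cancelled independently of the in-flight RPC. The following sketch isolates that pattern using only java.util.concurrent.CompletableFuture. AllocationFutureSketch and the event strings are hypothetical, and the callbacks run synchronously here rather than on a main-thread executor as in the listing above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of the two-future structure used by allocateSlot. */
class AllocationFutureSketch {
    final List<String> events = new ArrayList<>();

    /**
     * Wires an RPC future to a separate request future. The caller keeps the
     * returned future and may cancel it while the RPC is still running.
     */
    CompletableFuture<String> allocate(CompletableFuture<String> rpcFuture) {
        CompletableFuture<String> requestFuture = new CompletableFuture<>();

        // Forward the RPC outcome; this has no effect if requestFuture is already completed or cancelled.
        rpcFuture.whenComplete((ack, failure) -> {
            if (ack != null) {
                requestFuture.complete(ack);
            } else {
                requestFuture.completeExceptionally(failure);
            }
        });

        // React to the final outcome of the request, whichever side completed it.
        requestFuture.whenComplete((ack, failure) -> {
            if (ack != null) {
                events.add("allocated");
            } else if (failure instanceof CancellationException) {
                events.add("cancelled");
            } else {
                events.add("failed-retry");
            }
        });
        return requestFuture;
    }
}
```

Cancelling the returned future records a single "cancelled" event, and an acknowledgement that arrives from the RPC afterwards is ignored because the request future is already completed.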

Cancelling a Slot Request

       A slot request can be cancelled with ResourceManager#cancelSlotRequest(allocationID), which delegates to SlotManager#unregisterSlotRequest(allocationId):

// SlotManager#unregisterSlotRequest
public boolean unregisterSlotRequest(AllocationID allocationId) {
   checkInit();
   PendingSlotRequest pendingSlotRequest = pendingSlotRequests.remove(allocationId); // remove it from pendingSlotRequests
   if (null != pendingSlotRequest) {
      LOG.debug("Cancel slot request {}.", allocationId);
      cancelPendingSlotRequest(pendingSlotRequest); // cancel the request
      return true;
   } else {
      LOG.debug("No pending slot request with allocation id {} found. Ignoring unregistration request.", allocationId);
      return false;
   }
}
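Both registerSlotRequest and unregisterSlotRequest report through their boolean return value whether they actually changed anything. A minimal sketch of that bookkeeping is given below; RequestBookSketch is hypothetical, IDs are plain strings, and it assumes that the duplicate check consults both the pending and the fulfilled requests, as the field comments on pendingSlotRequests and fulfilledSlotRequests suggest.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of request deduplication and cancellation by allocation id. */
class RequestBookSketch {
    private final Map<String, String> pending = new HashMap<>(); // allocation id -> job id
    private final Set<String> fulfilled = new HashSet<>();       // allocation ids already backed by a slot

    /** Returns false for a duplicate allocation id, true if the request was recorded. */
    boolean register(String allocationId, String jobId) {
        if (pending.containsKey(allocationId) || fulfilled.contains(allocationId)) {
            return false;
        }
        pending.put(allocationId, jobId);
        return true;
    }

    /** Moves a pending request to the fulfilled set once a slot has been allocated for it. */
    void markFulfilled(String allocationId) {
        if (pending.remove(allocationId) != null) {
            fulfilled.add(allocationId);
        }
    }

    /** Returns true only if a pending request with this id existed and was removed. */
    boolean unregister(String allocationId) {
        return pending.remove(allocationId) != null;
    }
}
```

In this model, cancelling an unknown or already fulfilled request is a harmless no-op that returns false, mirroring the "Ignoring unregistration request" branch in the listing above.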

Timeouts

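        When the ResourceManager starts, it calls leaderElectionService.start(this). Once a leader has been elected, the service notifies the LeaderContender implementation (here this, the ResourceManager itself) by calling grantLeadership(), which tries to accept leadership through tryAcceptLeadership(). That method starts the SlotManager, and the SlotManager in turn starts two periodic timeout checks: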

  1. one that detects TaskManagers that have been idle for too long;
  2. one that detects slot requests that have timed out.

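When a TaskExecutor has been idle for too long, its resources are released through the ResourceActions#releaseResource() callback. When a slot request times out, its PendingSlotRequest is cancelled and the ResourceManager is informed through ResourceActions#notifyAllocationFailure().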

// SlotManager#start
public void start(ResourceManagerId newResourceManagerId, Executor newMainThreadExecutor, ResourceActions newResourceActions) {
   LOG.info("Starting the SlotManager.");
   this.resourceManagerId = Preconditions.checkNotNull(newResourceManagerId);
   mainThreadExecutor = Preconditions.checkNotNull(newMainThreadExecutor);
   resourceActions = Preconditions.checkNotNull(newResourceActions);
   started = true;
   // check whether TaskExecutors have been idle for too long
   taskManagerTimeoutCheck = scheduledExecutor.scheduleWithFixedDelay(
      () -> mainThreadExecutor.execute(
         () -> checkTaskManagerTimeouts()),  // timeout check
      0L,
      taskManagerTimeout.toMilliseconds(),
      TimeUnit.MILLISECONDS);
   // check whether slot requests have timed out
   slotRequestTimeoutCheck = scheduledExecutor.scheduleWithFixedDelay(
      () -> mainThreadExecutor.execute(
         () -> checkSlotRequestTimeouts()),  // timeout check
      0L,
      slotRequestTimeout.toMilliseconds(),
      TimeUnit.MILLISECONDS);
}

void checkTaskManagerTimeouts() {
   if (!taskManagerRegistrations.isEmpty()) {
      long currentTime = System.currentTimeMillis();
      ArrayList<TaskManagerRegistration> timedOutTaskManagers = new ArrayList<>(taskManagerRegistrations.size());
      // first retrieve the timed out TaskManagers
      for (TaskManagerRegistration taskManagerRegistration : taskManagerRegistrations.values()) {
         if (currentTime - taskManagerRegistration.getIdleSince() >= taskManagerTimeout.toMilliseconds()) {
            timedOutTaskManagers.add(taskManagerRegistration);
         }
      }

      // second we trigger the release resource callback which can decide upon the resource release
      for (TaskManagerRegistration taskManagerRegistration : timedOutTaskManagers) {
         if (waitResultConsumedBeforeRelease) {
            releaseTaskExecutorIfPossible(taskManagerRegistration);
         } else {
            releaseTaskExecutor(taskManagerRegistration.getInstanceId()); // release the resource on timeout through the ResourceActions#releaseResource() callback
         }
      }
   }
}

private void checkSlotRequestTimeouts() {
   if (!pendingSlotRequests.isEmpty()) {
      long currentTime = System.currentTimeMillis();
      Iterator<Map.Entry<AllocationID, PendingSlotRequest>> slotRequestIterator = pendingSlotRequests.entrySet().iterator();

      while (slotRequestIterator.hasNext()) {
         PendingSlotRequest slotRequest = slotRequestIterator.next().getValue();
         if (currentTime - slotRequest.getCreationTimestamp() >= slotRequestTimeout.toMilliseconds()) {
            slotRequestIterator.remove();
            if (slotRequest.isAssigned()) {
               cancelPendingSlotRequest(slotRequest); // cancel
            }

            resourceActions.notifyAllocationFailure( // inform the ResourceManager
               slotRequest.getJobId(),
               slotRequest.getAllocationId(),
               new TimeoutException("The allocation could not be fulfilled in time."));
         }
      }
   }
}
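Both checks follow the same pattern: compare the current time with a start timestamp and expire every entry whose age has reached the timeout. The sketch below reproduces that pattern with the clock passed in as a parameter so that it can be exercised deterministically. TimeoutCheckSketch and its methods are hypothetical, and modelling a busy TaskManager by removing its idle timestamp is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the two periodic timeout checks, with an injectable clock. */
class TimeoutCheckSketch {
    private final long taskManagerTimeoutMs;
    private final long slotRequestTimeoutMs;
    private final Map<String, Long> idleSince = new LinkedHashMap<>();        // task manager id -> time it became idle
    private final Map<String, Long> requestCreatedAt = new LinkedHashMap<>(); // allocation id -> creation time

    TimeoutCheckSketch(long taskManagerTimeoutMs, long slotRequestTimeoutMs) {
        this.taskManagerTimeoutMs = taskManagerTimeoutMs;
        this.slotRequestTimeoutMs = slotRequestTimeoutMs;
    }

    void markIdle(String taskManagerId, long now) { idleSince.put(taskManagerId, now); }
    void markUsed(String taskManagerId) { idleSince.remove(taskManagerId); }
    void addRequest(String allocationId, long now) { requestCreatedAt.put(allocationId, now); }

    /** Returns (and forgets) the task managers that have been idle for at least the timeout. */
    List<String> checkTaskManagerTimeouts(long now) {
        return removeExpired(idleSince, taskManagerTimeoutMs, now);
    }

    /** Returns (and forgets) the slot requests that have been pending for at least the timeout. */
    List<String> checkSlotRequestTimeouts(long now) {
        return removeExpired(requestCreatedAt, slotRequestTimeoutMs, now);
    }

    private static List<String> removeExpired(Map<String, Long> startedAt, long timeoutMs, long now) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = startedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() >= timeoutMs) {
                expired.add(entry.getKey());
                it.remove(); // removing through the iterator avoids a ConcurrentModificationException
            }
        }
        return expired;
    }
}
```

Because the check interval in start() equals the timeout itself, an entry that becomes idle just after one check is only caught by the check after the next one, so the effective delay can approach twice the configured timeout.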

ResourceManager

        The ResourceManager is the central allocator of slot resources, but the actual bookkeeping for all slots reported by TaskExecutors is delegated to its internal SlotManager. The ResourceManager itself handles the interaction with external components: it exposes the slot-related operations as RPC methods to JobMasters and TaskExecutors.

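        RPC interface: the slot-related RPC methods offered by the ResourceManager are listed below. requestSlot and cancelSlotRequest are called by the JobMaster, while sendSlotReport and notifySlotAvailable are called by the TaskExecutor. After receiving one of these calls, the ResourceManager delegates the actual work to the SlotManager.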

interface ResourceManagerGateway {
	CompletableFuture<Acknowledge> requestSlot(          // Sent by the JobMaster to request a slot from the ResourceManager.
		JobMasterId jobMasterId,
		SlotRequest slotRequest,
		@RpcTimeout Time timeout);

	void cancelSlotRequest(AllocationID allocationID);   // Sent by the JobMaster to cancel a slot allocation request at the ResourceManager.

	CompletableFuture<Acknowledge> sendSlotReport(       // Sent by the TaskExecutor to deliver the given {@link SlotReport} to the ResourceManager.
		ResourceID taskManagerResourceId,
		InstanceID taskManagerRegistrationId,
		SlotReport slotReport,
		@RpcTimeout Time timeout);

	void notifySlotAvailable(                            // Sent by the TaskExecutor to notify the ResourceManager that a slot has become available.
		InstanceID instanceId,
		SlotID slotID,
		AllocationID oldAllocationId);
}
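Before the ResourceManager delegates a requestSlot call to the SlotManager, the handler shown earlier checks that the caller is the JobMaster currently registered for that job and otherwise returns an exceptionally completed future. A minimal sketch of this validate-then-delegate step follows. FencedRequestSketch is hypothetical, IDs are plain strings, IllegalStateException stands in for ResourceManagerException, and a counter stands in for the call into the SlotManager.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of the leader-id check performed before a request is delegated. */
class FencedRequestSketch {
    private final Map<String, String> leaderByJob = new HashMap<>(); // job id -> registered JobMaster id
    int delegated = 0; // number of requests that reached the (imaginary) SlotManager

    void registerJobMaster(String jobId, String jobMasterId) {
        leaderByJob.put(jobId, jobMasterId);
    }

    CompletableFuture<String> requestSlot(String jobId, String jobMasterId) {
        String registered = leaderByJob.get(jobId);
        if (registered == null) {
            return failed(new IllegalStateException(
                "Could not find registered job manager for job " + jobId + '.'));
        }
        if (!registered.equals(jobMasterId)) {
            return failed(new IllegalStateException(
                "The job leader's id " + registered + " does not match the received id " + jobMasterId + '.'));
        }
        delegated++; // here the real code calls slotManager.registerSlotRequest(slotRequest)
        return CompletableFuture.completedFuture("ACK");
    }

    /** Builds an already-failed future (written out by hand to stay compatible with Java 8). */
    private static <T> CompletableFuture<T> failed(Throwable cause) {
        CompletableFuture<T> future = new CompletableFuture<>();
        future.completeExceptionally(cause);
        return future;
    }
}
```

A request from a JobMaster that has lost leadership thus never reaches the slot bookkeeping; the caller only sees a failed future.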

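Dynamic resource management: the ResourceManager can manage TaskExecutor compute resources dynamically, which lets it integrate with frameworks such as Yarn, Mesos, and Kubernetes. As mentioned in the discussion of slot requests in the SlotManager: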

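  1. If the registered slots cannot satisfy a slot request, the SlotManager notifies Flink's ResourceManager through the ResourceActions#allocateResource callback, and the ResourceManager requests compute resources (containers) from the external resource-management framework (Yarn and the like).
  2. When the SlotManager detects that a TaskExecutor has been idle for too long, it notifies Flink's ResourceManager through the ResourceActions#releaseResource callback, and the ResourceManager returns the compute resources (containers) to the external resource-management framework.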

With these two ResourceActions callbacks, allocateResource and releaseResource, the ResourceManager can acquire and release resources dynamically:

// ResourceManager#ResourceActionsImpl
private class ResourceActionsImpl implements ResourceActions {
   @Override   // release a resource
   public void releaseResource(InstanceID instanceId, Exception cause) {
      validateRunsInMainThread();
      ResourceManager.this.releaseResource(instanceId, cause); // call releaseResource on the concrete ResourceManager
   }

   @Override   // request new resources; the behaviour depends on the concrete ResourceManager. The returned collection is in effect a promise of slots that will become available (in Yarn mode this is requestYarnContainer: request a container and start a TaskManager in it)
   public Collection<ResourceProfile> allocateResource(ResourceProfile resourceProfile) {
      validateRunsInMainThread();
      return startNewWorker(resourceProfile);
   }

   @Override
   public void notifyAllocationFailure(JobID jobId, AllocationID allocationId, Exception cause) {
      validateRunsInMainThread();
      JobManagerRegistration jobManagerRegistration = jobManagerRegistrations.get(jobId);
      if (jobManagerRegistration != null) {
         jobManagerRegistration.getJobManagerGateway().notifyAllocationFailure(allocationId, cause);
      }
   }
}

// ResourceManager#releaseResource()
protected void releaseResource(InstanceID instanceId, Exception cause) {
   WorkerType worker = null;
   // TODO: Improve performance by having an index on the instanceId
   for (Map.Entry<ResourceID, WorkerRegistration<WorkerType>> entry : taskExecutors.entrySet()) {
      if (entry.getValue().getInstanceID().equals(instanceId)) {
         worker = entry.getValue().getWorker();
         break;
      }
   }

   if (worker != null) {
      if (stopWorker(worker)) { // stop and release the worker, then close the connection to its TaskManager; what stopWorker(worker) does depends on the concrete ResourceManager
         closeTaskManagerConnection(worker.getResourceID(), cause);
      } else {
         log.debug("Worker {} could not be stopped.", worker.getResourceID());
      }
   } else {
      // unregister in order to clean up potential left over state
      slotManager.unregisterTaskManager(instanceId);
   }
}
// abstract methods of ResourceManager; the concrete subclass decides how a worker is started and stopped
public abstract Collection<ResourceProfile> startNewWorker(ResourceProfile resourceProfile);
public abstract boolean stopWorker(WorkerType worker);

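        The two abstract methods startNewWorker(ResourceProfile resourceProfile) and stopWorker(WorkerType worker) are the key to acquiring and releasing resources dynamically. In Standalone mode the set of TaskExecutors is fixed, so dynamic start and release are not supported. For Flink on Yarn, the implementations in YarnResourceManager start new containers and release containers that were requested earlier: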

// YarnResourceManager
public Collection<ResourceProfile> startNewWorker(ResourceProfile resourceProfile) {
   Preconditions.checkArgument(ResourceProfile.UNKNOWN.equals(resourceProfile), "The YarnResourceManager does not support custom ResourceProfiles yet. It assumes that all containers have the same resources.");
   // request a container from Yarn; on success the asynchronous onContainersAllocated() callback builds the
   // ContainerLaunchContext (taskExecutorLaunchContext), including the java command that starts the TaskExecutor,
   // and hands it to nodeManagerClient.startContainer() to launch the TaskExecutor process in that container
   requestYarnContainer();
   return slotsPerWorker;
}

public boolean stopWorker(final YarnWorkerNode workerNode) {
   final Container container = workerNode.getContainer();
   log.info("Stopping container {}.", container.getId());
   try {
      nodeManagerClient.stopContainer(container.getId(), container.getNodeId()); // stop and release the container
   } catch (final Exception e) {
      log.warn("Error while calling YARN Node Manager to stop container", e);
   }
   resourceManagerClient.releaseAssignedContainer(container.getId());
   workerNodeMap.remove(workerNode.getResourceID());
   return true;
}
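The split between a generic base class and deployment-specific hooks is a template-method arrangement, which the following sketch shows in miniature. All class names are hypothetical and workers and slots are plain strings. The standalone variant returning an empty collection and refusing to stop workers is an assumption that follows from the statement that Standalone mode has a fixed set of TaskExecutors.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical base class: generic logic calls hooks that each deployment mode implements. */
abstract class ResourceManagerSketch {
    /** Returns the slots promised by a newly requested worker; empty means no worker can be started. */
    abstract List<String> startNewWorker(String requestedProfile);

    /** Returns true if the worker was stopped and its resources were handed back. */
    abstract boolean stopWorker(String workerId);

    /** Mirrors the allocateResource callback: it only forwards to the deployment-specific hook. */
    final List<String> allocateResource(String requestedProfile) {
        return startNewWorker(requestedProfile);
    }
}

/** Fixed set of workers: nothing can be started or stopped on demand. */
class StandaloneSketch extends ResourceManagerSketch {
    @Override
    List<String> startNewWorker(String requestedProfile) {
        return Collections.emptyList();
    }

    @Override
    boolean stopWorker(String workerId) {
        return false;
    }
}

/** Container-based deployment: every new worker is a container offering a fixed number of slots. */
class ContainerSketch extends ResourceManagerSketch {
    private final int slotsPerWorker;
    private int nextContainerNumber = 1;
    final List<String> containers = new ArrayList<>();

    ContainerSketch(int slotsPerWorker) {
        this.slotsPerWorker = slotsPerWorker;
    }

    @Override
    List<String> startNewWorker(String requestedProfile) {
        String containerId = "container-" + nextContainerNumber++;
        containers.add(containerId);
        List<String> promisedSlots = new ArrayList<>();
        for (int i = 0; i < slotsPerWorker; i++) {
            promisedSlots.add(containerId + "/slot-" + i);
        }
        return promisedSlots;
    }

    @Override
    boolean stopWorker(String workerId) {
        return containers.remove(workerId);
    }
}
```

The caller of allocateResource does not need to know which variant it talks to: an empty result simply means that the pending request has to wait for an existing slot to become free.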
