Azkaban Source Code Analysis (1): Executor Selection

进击的大波

Article link: https://blog.csdn.net/weixin_43666570/article/details/106630800

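This article analyzes the Multi Executor mode of Azkaban, based on version 3.79.0 of the code.
When azkaban.poll.model is set to false (the default), executor management and flow dispatching are handled by the ExecutorManager class; when it is set to true, the ExecutionController class is used instead. ExecutorManager is currently the more mature choice for production use, so the rest of this article analyzes how ExecutorManager selects an executor.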

1. The selectExecutorAndDispatchFlow method

In ExecutorManager, executor selection and flow dispatching are implemented mainly by the private void selectExecutorAndDispatchFlow(final ExecutionReference reference, final ExecutableFlow exflow) method.

    /* process flow with a snapshot of available Executors */
    private void selectExecutorAndDispatchFlow(final ExecutionReference reference,
        final ExecutableFlow exflow)
        throws ExecutorManagerException {
      final Set<Executor> remainingExecutors = new HashSet<>(
          ExecutorManager.this.activeExecutors.getAll());
      Throwable lastError;
      synchronized (exflow) {
        do {
          final Executor selectedExecutor = selectExecutor(exflow, remainingExecutors);
          if (selectedExecutor == null) {
            ExecutorManager.this.commonMetrics.markDispatchFail();
            handleNoExecutorSelectedCase(reference, exflow);
            // RE-QUEUED - exit
            return;
          } else {
            try {
              dispatch(reference, exflow, selectedExecutor);
              ExecutorManager.this.commonMetrics.markDispatchSuccess();
              // SUCCESS - exit
              return;
            } catch (final ExecutorManagerException e) {
              lastError = e;
              logFailedDispatchAttempt(reference, exflow, selectedExecutor, e);
              ExecutorManager.this.commonMetrics.markDispatchFail();
              reference.setNumErrors(reference.getNumErrors() + 1);
              // FAILED ATTEMPT - try other executors except selectedExecutor
              updateRemainingExecutorsAndSleep(remainingExecutors, selectedExecutor);
            }
          }
        } while (reference.getNumErrors() < this.maxDispatchingErrors);
        // GAVE UP DISPATCHING
        final String message = "Failed to dispatch queued execution " + exflow.getId() + " because "
            + "reached " + ConfigurationKeys.MAX_DISPATCHING_ERRORS_PERMITTED
            + " (tried " + reference.getNumErrors() + " executors)";
        ExecutorManager.logger.error(message);
        ExecutorManager.this.executionFinalizer.finalizeFlow(exflow, message, lastError);
      }
    }

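selectExecutorAndDispatchFlow does the following:

It calls ExecutorManager.this.activeExecutors.getAll() to take a snapshot of all active executors currently held in memory;
It locks the flow being dispatched (synchronized (exflow)) so that no other thread can operate on it at the same time;
It calls selectExecutor to choose the most suitable executor for the flow from the remaining candidates;
If no executor is chosen (selectedExecutor == null), it marks the dispatch as failed and calls handleNoExecutorSelectedCase, which puts the flow back into the waiting queue so that it can be dispatched later, once an active executor is available;
If a suitable executor is found, it calls dispatch to send the flow to that executor and marks the dispatch as successful;
If dispatch throws an ExecutorManagerException, numErrors is incremented by 1, the failed executor is removed from the remaining executors, and the next attempt picks from what is left. If no executors remain, the set is refilled by calling ExecutorManager.this.activeExecutors.getAll() again (the same in-memory view of active executors as in the first step, not a database query), and the thread sleeps for a while before retrying;
Once the flow's error count reaches the configured threshold (maxDispatchingErrors), dispatching is abandoned and the flow is finalized as failed.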
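
The retry behaviour described above can be sketched in isolation. This is a minimal simulation, not Azkaban code: DispatchLoopSketch, dispatchWithRetry, and the use of plain integer ids in place of Executor objects are assumptions made for illustration, and "take the first remaining candidate" stands in for the real selectExecutor call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical stand-in for the dispatch loop; none of these names exist in Azkaban.
class DispatchLoopSketch {

  /**
   * Tries executors one at a time until one accepts the flow or maxErrors is reached.
   * Returns the id of the executor that accepted the flow, or -1 if there was no
   * candidate or the error limit was hit. Every attempted id is appended to attempts.
   */
  static int dispatchWithRetry(List<Integer> activeExecutors,
      Predicate<Integer> dispatchSucceeds, int maxErrors, List<Integer> attempts) {
    // Snapshot of the active executors, as in the first step of the real method.
    Set<Integer> remaining = new LinkedHashSet<>(activeExecutors);
    int numErrors = 0;
    do {
      // Stand-in for selectExecutor: take the first remaining candidate.
      Integer selected = remaining.isEmpty() ? null : remaining.iterator().next();
      if (selected == null) {
        return -1; // the real code re-queues the flow at this point
      }
      attempts.add(selected);
      if (dispatchSucceeds.test(selected)) {
        return selected; // SUCCESS
      }
      // FAILED ATTEMPT: count it and exclude this executor from the next try.
      numErrors++;
      remaining.remove(selected);
      if (remaining.isEmpty()) {
        remaining.addAll(activeExecutors); // refill; the real code also sleeps here
      }
    } while (numErrors < maxErrors);
    return -1; // GAVE UP
  }

  public static void main(String[] args) {
    List<Integer> attempts = new ArrayList<>();
    int chosen = dispatchWithRetry(Arrays.asList(1, 2, 3), id -> id == 3, 5, attempts);
    System.out.println("chosen=" + chosen + ", attempts=" + attempts);
    // prints: chosen=3, attempts=[1, 2, 3]
  }
}
```

Note that the loop condition is checked after the counter is incremented, so with a limit of N the flow is attempted at most N times in one call.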

2. The selectExecutor method

Next, we look at selectExecutor, which selectExecutorAndDispatchFlow calls.

    /* Choose Executor for exflow among the available executors */
    private Executor selectExecutor(final ExecutableFlow exflow,
        final Set<Executor> availableExecutors) {
      Executor choosenExecutor =
          getUserSpecifiedExecutor(exflow.getExecutionOptions(),
              exflow.getExecutionId());

      // If no executor was specified by admin
      if (choosenExecutor == null) {
        ExecutorManager.logger.info("Using dispatcher for execution id :"
            + exflow.getExecutionId());
        final ExecutorSelector selector = new ExecutorSelector(ExecutorManager.this.filterList,
            ExecutorManager.this.comparatorWeightsMap);
        choosenExecutor = selector.getBest(availableExecutors, exflow);
      }
      return choosenExecutor;
    }

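selectExecutor implements the following logic:

It first calls getUserSpecifiedExecutor to check whether the user has pinned the flow to a particular executor through the useExecutor parameter;
If the user did not specify an executor, or the specified executor id can be found neither among the active executors in memory nor in the database, it builds an executor selector and picks the best executor based on each executor's current resource usage and the configured comparator.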
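
A minimal sketch of this "override first, otherwise rank" decision is shown here. SelectExecutorSketch and its select method are hypothetical names, and strings stand in for Executor objects. One detail worth noticing in the listing is that a user-specified executor is returned as-is, without being checked against availableExecutors; the sketch mirrors that.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of the override-or-rank decision; not Azkaban code.
class SelectExecutorSketch {

  /**
   * userSpecified is null when no usable override exists.
   * Returns null when there is neither an override nor any available candidate.
   */
  static String select(String userSpecified, List<String> available,
      Comparator<String> ranking) {
    if (userSpecified != null) {
      return userSpecified; // the override wins; no ranking is performed
    }
    if (available.isEmpty()) {
      return null;
    }
    // Otherwise the highest-ranked available candidate is chosen.
    return Collections.max(available, ranking);
  }

  public static void main(String[] args) {
    Comparator<String> ranking = Comparator.naturalOrder();
    System.out.println(select("exec-9", Arrays.asList("exec-1", "exec-2"), ranking));
    System.out.println(select(null, Arrays.asList("exec-1", "exec-2"), ranking));
  }
}
```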

3. The getUserSpecifiedExecutor method

    /* Helper method to fetch  overriding Executor, if a valid user has specifed otherwise return null */
    private Executor getUserSpecifiedExecutor(final ExecutionOptions options,
        final int executionId) {
      Executor executor = null;
      if (options != null
          && options.getFlowParameters() != null
          && options.getFlowParameters().containsKey(
          ExecutionOptions.USE_EXECUTOR)) {
        try {
          final int executorId =
              Integer.valueOf(options.getFlowParameters().get(
                  ExecutionOptions.USE_EXECUTOR));
          executor = fetchExecutor(executorId);

          if (executor == null) {
            ExecutorManager.logger
                .warn(String
                    .format(
                        "User specified executor id: %d for execution id: %d is not active, Looking up db.",
                        executorId, executionId));
            executor = ExecutorManager.this.executorLoader.fetchExecutor(executorId);
            if (executor == null) {
              ExecutorManager.logger
                  .warn(String
                      .format(
                          "User specified executor id: %d for execution id: %d is missing from db. Defaulting to availableExecutors",
                          executorId, executionId));
            }
          }
        } catch (final ExecutorManagerException ex) {
          ExecutorManager.logger.error("Failed to fetch user specified executor for exec_id = "
              + executionId, ex);
        }
      }
      return executor;
    }

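getUserSpecifiedExecutor implements the logic that lets a user pin a flow to a specific executor:
It first checks whether the flow parameters supplied by the user contain the useExecutor parameter. If they do, it looks up that executor id in the list of executors held in memory and, if the executor is found, uses it to run the flow. If the executor is not in memory, it queries the executor table in the database and, if the executor exists there, uses it to run the flow. If the executor still cannot be found, the method returns null.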
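
The two-level lookup can be sketched with plain maps. In this sketch, UserSpecifiedExecutorSketch, lookup, and the two Map arguments (standing in for the in-memory executor list and the database table) are assumptions for illustration; only the useExecutor parameter name comes from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the memory-then-database lookup; not Azkaban code.
class UserSpecifiedExecutorSketch {
  static final String USE_EXECUTOR = "useExecutor";

  /** Returns the pinned executor, or null if none was requested or none was found. */
  static String lookup(Map<String, String> flowParams,
      Map<Integer, String> activeInMemory, Map<Integer, String> database) {
    if (flowParams == null || !flowParams.containsKey(USE_EXECUTOR)) {
      return null; // no override requested
    }
    // Throws NumberFormatException if the parameter value is not an integer.
    int executorId = Integer.parseInt(flowParams.get(USE_EXECUTOR));
    String executor = activeInMemory.get(executorId); // first: in-memory executors
    if (executor == null) {
      executor = database.get(executorId); // second: fall back to the database
    }
    return executor; // still null if the id is unknown in both places
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    params.put(USE_EXECUTOR, "2");
    Map<Integer, String> memory = new HashMap<>();
    memory.put(1, "executor-1 (active)");
    Map<Integer, String> db = new HashMap<>();
    db.put(2, "executor-2 (db only)");
    System.out.println(lookup(params, memory, db)); // prints: executor-2 (db only)
  }
}
```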

4. The getBest method

  public K getBest(final Collection<K> candidateList, final V dispatchingObject) {

    // shortcut if the candidateList is empty.
    if (null == candidateList || candidateList.size() == 0) {
      logger.error("failed to getNext candidate as the passed candidateList is null or empty.");
      return null;
    }

    logger.debug("start candidate selection logic.");
    logger.debug(String.format("candidate count before filtering: %s", candidateList.size()));

    // to keep the input untouched, we will form up a new list based off the filtering result.
    Collection<K> filteredList = new ArrayList<>();

    if (null != this.filter) {
      for (final K candidateInfo : candidateList) {
        if (this.filter.filterTarget(candidateInfo, dispatchingObject)) {
          filteredList.add(candidateInfo);
        }
      }
    } else {
      filteredList = candidateList;
      logger.debug("skipping the candidate filtering as the filter object is not specifed.");
    }

    logger.debug(String.format("candidate count after filtering: %s", filteredList.size()));
    if (filteredList.size() == 0) {
      logger.debug("failed to select candidate as the filtered candidate list is empty.");
      return null;
    }

    if (null == this.comparator) {
      logger.debug(
          "candidate comparator is not specified, default hash code comparator class will be used.");
    }

    // final work - find the best candidate from the filtered list.
    final K executor = Collections.max(filteredList, this.comparator);
    logger.debug(String.format("candidate selected %s",
        null == executor ? "(null)" : executor.toString()));
    return executor;
  }

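getBest first filters the executor list and then uses the configured comparator, which compares the current load of each executor, to select the best executor to run the flow. We will not go into further detail here.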
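
The filter-then-max pattern used by getBest can be tried out on its own. The sketch below is an assumption-laden simplification: GetBestSketch, Candidate, and the freeMemoryMb field are invented for the example, a Predicate replaces the filter object, and a single-field comparator replaces Azkaban's weighted comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of filter-then-max selection; not Azkaban code.
class GetBestSketch {

  static final class Candidate {
    final String name;
    final int freeMemoryMb;

    Candidate(String name, int freeMemoryMb) {
      this.name = name;
      this.freeMemoryMb = freeMemoryMb;
    }
  }

  /** Returns the maximum of the candidates that pass the filter, or null if none do. */
  static <K> K getBest(Collection<K> candidates, Predicate<K> filter,
      Comparator<K> comparator) {
    if (candidates == null || candidates.isEmpty()) {
      return null;
    }
    // Build a new list so that the input collection is left untouched.
    List<K> filtered = new ArrayList<>();
    for (K candidate : candidates) {
      if (filter == null || filter.test(candidate)) {
        filtered.add(candidate);
      }
    }
    if (filtered.isEmpty()) {
      return null;
    }
    return Collections.max(filtered, comparator);
  }

  public static void main(String[] args) {
    List<Candidate> candidates = Arrays.asList(
        new Candidate("exec-a", 4096),
        new Candidate("exec-b", 8192),
        new Candidate("exec-c", 512));
    Predicate<Candidate> enoughMemory = c -> c.freeMemoryMb >= 1024;
    Comparator<Candidate> byFreeMemory = Comparator.comparingInt(c -> c.freeMemoryMb);
    Candidate best = getBest(candidates, enoughMemory, byFreeMemory);
    System.out.println(best.name); // prints: exec-b
  }
}
```

Filtering before ranking means the comparator only ever sees candidates that are allowed to take the flow, so a heavily loaded but otherwise top-ranked executor can never be chosen.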
