DolphinScheduler 1.3.4: Implementation and Analysis of the Master-Server's Three Task Dispatch Strategies

Introduction

A DolphinScheduler 1.3.4 cluster provides three dispatch strategies for distributing tasks across the machines of a worker group: random, round robin, and machine resource weight. After the master-server picks up a task, it fetches the list of machines in the task's worker group from ZooKeeper, selects one machine according to the configured strategy, and sends the task to it over Netty; the worker service on that machine then executes the task.

Random algorithm

Principle

DS's random algorithm is very simple: after fetching all machines of the task's worker group from ZooKeeper, if there is exactly one machine it is returned directly; otherwise one machine is chosen at random as the task's executor.

Code:

    private final Random random = new Random();

    public T select(final Collection<T> source) {
        if (source == null || source.size() == 0) {
            throw new IllegalArgumentException("Empty source.");
        }
        /**
         * if only one candidate, return it directly
         */
        if (source.size() == 1) {
            return (T) source.toArray()[0];
        }
        int size = source.size();
        /**
         * otherwise pick a random index
         */
        int randomIndex = random.nextInt(size);
        return (T) source.toArray()[randomIndex];
    }
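The same logic can be exercised with a small standalone sketch; the class and host names below are illustrative, not the actual DS types:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Random;

// Standalone sketch of the random selection logic above.
public class RandomSelectDemo {
    private static final Random RANDOM = new Random();

    static <T> T select(Collection<T> source) {
        if (source == null || source.isEmpty()) {
            throw new IllegalArgumentException("Empty source.");
        }
        Object[] items = source.toArray();
        // a single candidate is returned directly; otherwise draw a random index
        int index = items.length == 1 ? 0 : RANDOM.nextInt(items.length);
        @SuppressWarnings("unchecked")
        T chosen = (T) items[index];
        return chosen;
    }

    public static void main(String[] args) {
        List<String> workers = Arrays.asList("192.168.0.1:1234", "192.168.0.2:1234");
        // whichever index is drawn, the result is always one of the candidates
        System.out.println(workers.contains(select(workers))); // prints "true"
    }
}
```

Random dispatch needs no shared state between calls, which is why the selector keeps nothing but a Random instance.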

Round-robin algorithm

Principle

The round-robin implementation is built on the atomic class AtomicInteger: the counter is incremented on each call, and taking it modulo the number of machines yields the position in the machine array, so tasks are executed on the worker machines in rotation.

Code:

private final AtomicInteger index = new AtomicInteger(0);
    @Override
    public T select(Collection<T> source) {
        if (source == null || source.size() == 0) {
            throw new IllegalArgumentException("Empty source.");
        }
        /**
         * if only one , return directly
         */
        if (source.size() == 1) {
            return (T)source.toArray()[0];
        }
        int size = source.size();
        /**
         * round robin
         */
        return (T) source.toArray()[index.getAndIncrement() % size];
    }
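The rotation can be seen in a minimal standalone sketch (illustrative names; a List guarantees a stable candidate order):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Standalone sketch of the AtomicInteger round-robin shown above.
public class RoundRobinDemo {
    private final AtomicInteger index = new AtomicInteger(0);

    <T> T select(List<T> source) {
        // getAndIncrement() % size walks the list 0, 1, 2, 0, 1, 2, ...
        return source.get(index.getAndIncrement() % source.size());
    }

    public static void main(String[] args) {
        RoundRobinDemo rr = new RoundRobinDemo();
        List<String> workers = Arrays.asList("w1", "w2", "w3");
        for (int i = 0; i < 6; i++) {
            System.out.print(rr.select(workers) + " ");
        }
        // prints "w1 w2 w3 w1 w2 w3 "
    }
}
```

One caveat of the modulo approach as written: if the counter ever wraps past Integer.MAX_VALUE, getAndIncrement() returns a negative value and the index lookup fails; the sketch reproduces that property rather than fixing it.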

Machine resource weight algorithm

Principle

The machine resource weight algorithm focuses on resources: the master-server reads each machine's resource information from its heartbeat in ZooKeeper, computes a weight from those resources, and selects the machine with the smallest weight. The weight is computed as follows:

private int calculateWeight(double cpu, double memory, double loadAverage){
        return (int)(cpu * CPU_FACTOR + memory * MEMORY_FACTOR + loadAverage * LOAD_AVERAGE_FACTOR);
    }

CPU_FACTOR: the CPU's share of the weight, 10
MEMORY_FACTOR: the memory's share, 20
LOAD_AVERAGE_FACTOR: the load average's share, 70
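A quick worked example of the formula, with made-up cpu/memory/load sample values:

```java
// Worked example of the weight formula above; the cpu/memory/load
// figures are invented for illustration only.
public class WeightDemo {
    static final int CPU_FACTOR = 10;
    static final int MEMORY_FACTOR = 20;
    static final int LOAD_AVERAGE_FACTOR = 70;

    static int calculateWeight(double cpu, double memory, double loadAverage) {
        return (int) (cpu * CPU_FACTOR + memory * MEMORY_FACTOR + loadAverage * LOAD_AVERAGE_FACTOR);
    }

    public static void main(String[] args) {
        // host A, the busier machine: 0.5*10 + 0.3*20 + 0.2*70 = 25
        int weightA = calculateWeight(0.5, 0.3, 0.2);
        // host B, the idler machine: 0.2*10 + 0.2*20 + 0.1*70 = 13
        int weightB = calculateWeight(0.2, 0.2, 0.1);
        System.out.println(weightA + " " + weightB); // prints "25 13"
        // host B has the lower weight and would be preferred
    }
}
```

Because LOAD_AVERAGE_FACTOR is by far the largest factor, the load average dominates the comparison between hosts.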

Code:

public class LowerWeightHostManager extends CommonHostManager {

    private final Logger logger = LoggerFactory.getLogger(LowerWeightHostManager.class);

    /**
     * zookeeper registry center
     */
    @Autowired
    private ZookeeperRegistryCenter registryCenter;

    /**
     * round robin host manager
     */
    private RoundRobinHostManager roundRobinHostManager;

    /**
     * selector
     */
    private LowerWeightRoundRobin selector;

    /**
     * worker host weights
     */
    private ConcurrentHashMap<String, Set<HostWeight>> workerHostWeightsMap;

    /**
     * worker group host lock
     */
    private Lock lock;

    /**
     * executor service
     */
    private ScheduledExecutorService executorService;

    @PostConstruct
    public void init(){
        this.selector = new LowerWeightRoundRobin();
        this.workerHostWeightsMap = new ConcurrentHashMap<>();
        this.lock = new ReentrantLock();
        this.executorService = Executors.newSingleThreadScheduledExecutor(new NamedThreadFactory("LowerWeightHostManagerExecutor"));
        this.executorService.scheduleWithFixedDelay(new RefreshResourceTask(),0, 5, TimeUnit.SECONDS);
        this.roundRobinHostManager = new RoundRobinHostManager();
        this.roundRobinHostManager.setZookeeperNodeManager(getZookeeperNodeManager());
    }

    @PreDestroy
    public void close(){
        this.executorService.shutdownNow();
    }

    /**
     * select host
     * @param context context
     * @return host
     */
    @Override
    public Host select(ExecutionContext context){
        Set<HostWeight> workerHostWeights = getWorkerHostWeights(context.getWorkerGroup());
        if(CollectionUtils.isNotEmpty(workerHostWeights)){
            return selector.select(workerHostWeights).getHost();
        }
        return new Host();
    }

    @Override
    public Host select(Collection<Host> nodes) {
        throw new UnsupportedOperationException("not support");
    }

    private void syncWorkerHostWeight(Map<String, Set<HostWeight>> workerHostWeights){
        lock.lock();
        try {
            workerHostWeightsMap.clear();
            workerHostWeightsMap.putAll(workerHostWeights);
        } finally {
            lock.unlock();
        }
    }

    private Set<HostWeight> getWorkerHostWeights(String workerGroup){
        lock.lock();
        try {
            return workerHostWeightsMap.get(workerGroup);
        } finally {
            lock.unlock();
        }
    }

    class RefreshResourceTask implements Runnable{

        @Override
        public void run() {
            try {
                Map<String, Set<String>> workerGroupNodes = zookeeperNodeManager.getWorkerGroupNodes();
                Set<Map.Entry<String, Set<String>>> entries = workerGroupNodes.entrySet();
                Map<String, Set<HostWeight>> workerHostWeights = new HashMap<>();
                for(Map.Entry<String, Set<String>> entry : entries){
                    String workerGroup = entry.getKey();
                    Set<String> nodes = entry.getValue();
                    String workerGroupPath = registryCenter.getWorkerGroupPath(workerGroup);
                    Set<HostWeight> hostWeights = new HashSet<>(nodes.size());
                    for(String node : nodes){
                        String heartbeat = registryCenter.getZookeeperCachedOperator().get(workerGroupPath + "/" + node);
                        if(StringUtils.isNotEmpty(heartbeat)
                                && heartbeat.split(COMMA).length == Constants.HEARTBEAT_FOR_ZOOKEEPER_INFO_LENGTH){
                            String[] parts = heartbeat.split(COMMA);

                            int status = Integer.parseInt(parts[8]);
                            if (status == Constants.ABNORMAL_NODE_STATUS){
                                logger.warn("load is too high or availablePhysicalMemorySize(G) is too low, it's availablePhysicalMemorySize(G):{},loadAvg:{}",
                                        Double.parseDouble(parts[3]) , Double.parseDouble(parts[2]));
                                continue;
                            }

                            double cpu = Double.parseDouble(parts[0]);
                            double memory = Double.parseDouble(parts[1]);
                            double loadAverage = Double.parseDouble(parts[2]);
                            HostWeight hostWeight = new HostWeight(Host.of(node), cpu, memory, loadAverage);
                            hostWeights.add(hostWeight);
                        }
                    }
                    workerHostWeights.put(workerGroup, hostWeights);
                }
                syncWorkerHostWeight(workerHostWeights);
            } catch (Throwable ex){
                logger.error("RefreshResourceTask error", ex);
            }
        }
    }

}
The LowerWeightRoundRobin selector walks all candidates, advances each host's current weight by its fixed weight, and returns the host whose current weight is lowest; the chosen host is then penalized by the total weight so the selection rotates smoothly:

public HostWeight select(Collection<HostWeight> sources){
        int totalWeight = 0;
        int lowWeight = 0;
        HostWeight lowerNode = null;
        for (HostWeight hostWeight : sources) {
            totalWeight += hostWeight.getWeight();
            hostWeight.setCurrentWeight(hostWeight.getCurrentWeight() + hostWeight.getWeight());
            if (lowerNode == null || lowWeight > hostWeight.getCurrentWeight()) {
                lowerNode = hostWeight;
                lowWeight = hostWeight.getCurrentWeight();
            }
        }
        lowerNode.setCurrentWeight(lowerNode.getCurrentWeight() + totalWeight);
        return lowerNode;
    }
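To see how this rotation behaves, here is a self-contained simulation of the same selection rule. The Node class is an illustrative stand-in for DS's HostWeight, and a List replaces the Set so that iteration order, and therefore the output, is deterministic:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Standalone simulation of the lower-weight round-robin selection rule.
public class LowerWeightDemo {
    static class Node {
        final String host;
        final int weight;        // fixed resource weight (lower = less loaded)
        int currentWeight = 0;   // running counter used by the selector

        Node(String host, int weight) { this.host = host; this.weight = weight; }
    }

    static Node select(List<Node> nodes) {
        int totalWeight = 0;
        int lowWeight = 0;
        Node lowerNode = null;
        for (Node n : nodes) {
            totalWeight += n.weight;
            n.currentWeight += n.weight;
            if (lowerNode == null || lowWeight > n.currentWeight) {
                lowerNode = n;
                lowWeight = n.currentWeight;
            }
        }
        // penalize the chosen node so it is not picked again immediately
        lowerNode.currentWeight += totalWeight;
        return lowerNode;
    }

    public static void main(String[] args) {
        // A carries weight 1 (idle), B carries weight 3 (busier)
        List<Node> nodes = new ArrayList<>(Arrays.asList(new Node("A", 1), new Node("B", 3)));
        StringBuilder picks = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            picks.append(select(nodes).host);
        }
        System.out.println(picks); // prints "AABA"
    }
}
```

In this two-host example the long-run pick ratio settles at exactly 3:1 in favor of host A, which is the intent of the strategy: the less-loaded machine receives more tasks, yet the busier machine is still visited periodically rather than starved.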

Here a reentrant lock (ReentrantLock) guarantees that each thread reading workerHostWeightsMap sees a complete view of the worker-group information while RefreshResourceTask swaps in a fresh copy.

Design pattern

To obtain the concrete dispatch strategy, DS uses the abstract factory pattern. Below is the (slightly simplified) code:

HostManager, the abstract product interface, which abstracts the host-selection behavior:
/**
 *  host manager
 */
public interface HostManager {

    /**
     *  select host
     * @param context context
     * @return host
     */
    Host select(ExecutionContext context);

}
CommonHostManager, an abstract product class that fetches the worker group's machine nodes from ZooKeeper:
public abstract class CommonHostManager implements HostManager {

    private final Logger logger = LoggerFactory.getLogger(CommonHostManager.class);

    /**
     * zookeeperNodeManager
     */
    @Autowired
    protected ZookeeperNodeManager zookeeperNodeManager;

    /**
     * select host
     * @param context context
     * @return host
     */
    @Override
    public Host select(ExecutionContext context){
        Host host = new Host();
        Collection<String> nodes = null;
        /**
         * executor type
         */
        ExecutorType executorType = context.getExecutorType();
        switch (executorType){
            case WORKER:
                nodes = zookeeperNodeManager.getWorkerGroupNodes(context.getWorkerGroup());
                break;
            case CLIENT:
                break;
            default:
                throw new IllegalArgumentException("invalid executorType : " + executorType);

        }
        if(CollectionUtils.isEmpty(nodes)){
            return host;
        }
        List<Host> candidateHosts = new ArrayList<>(nodes.size());
        nodes.stream().forEach(node -> candidateHosts.add(Host.of(node)));

        return select(candidateHosts);
    }

    protected abstract Host select(Collection<Host> nodes);

    public void setZookeeperNodeManager(ZookeeperNodeManager zookeeperNodeManager) {
        this.zookeeperNodeManager = zookeeperNodeManager;
    }

    public ZookeeperNodeManager getZookeeperNodeManager() {
        return zookeeperNodeManager;
    }
}
RandomHostManager, the concrete product class for the random strategy:
/**
 *  random host manager
 */
public class RandomHostManager extends CommonHostManager {

    /**
     * selector
     */
    private final Selector<Host> selector;

    /**
     * set random selector
     */
    public RandomHostManager(){
        this.selector = new RandomSelector<>();
    }

    @Override
    public Host select(Collection<Host> nodes) {
        return selector.select(nodes);
    }
}
RoundRobinHostManager, the concrete product class for the round-robin strategy:
/**
 *  round robin host manager
 */
public class RoundRobinHostManager extends CommonHostManager {

    /**
     * selector
     */
    private final Selector<Host> selector;

    /**
     * set round robin
     */
    public RoundRobinHostManager(){
        this.selector = new RoundRobinSelector<>();
    }

    @Override
    public Host select(Collection<Host> nodes) {
        return selector.select(nodes);
    }

}
LowerWeightHostManager, the concrete product class for the resource-weight strategy (shown in full above).
HostManagerConfig, the factory class that returns the concrete product based on configuration:
@Configuration
public class HostManagerConfig {

    private AutowireCapableBeanFactory beanFactory;

    @Autowired
    private MasterConfig masterConfig;

    @Autowired
    public HostManagerConfig(AutowireCapableBeanFactory beanFactory) {
        this.beanFactory = beanFactory;
    }

    @Bean
    public HostManager hostManager() {
        String hostSelector = masterConfig.getHostSelector();
        HostSelector selector = HostSelector.of(hostSelector);
        HostManager hostManager;
        switch (selector){
            case RANDOM:
                hostManager = new RandomHostManager();
                break;
            case ROUNDROBIN:
                hostManager = new RoundRobinHostManager();
                break;
            case LOWERWEIGHT:
                hostManager = new LowerWeightHostManager();
                break;
            default:
                throw new IllegalArgumentException("unSupport selector " + hostSelector);
        }
        beanFactory.autowireBean(hostManager);
        return hostManager;
    }
}
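Which product the factory builds is driven by MasterConfig.getHostSelector(). To the best of my recollection, in the 1.3.x release line this maps to the master.host.selector entry in the master's configuration; verify the key and default against your own master.properties before relying on it:

```properties
# dispatch strategy used by the master: Random, RoundRobin or LowerWeight
master.host.selector=LowerWeight
```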