Dissecting the WorkflowSim Source Code (Part 1): The Initialization Phase

我是嘉心糖

Published 2023-05-04 19:36:17

Tags: java, c++, algorithms

Article link: https://blog.csdn.net/qq_43445553/article/details/130493711

I. Initializing Parameters

Parameters is an important parameter class in WorkflowSim. All of its fields are static, so its data is shared globally. Let's start with the class's init method.

/**
     * A static function so that you can specify them in any place
     *
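     * @param vm, the number of vms
     * @param dax, the DAX path
     * @param runtime, optional, the runtime file path (rarely needed). According
     * to the documentation, this is the physical path of a runtime file that
     * uses the format ID1 1.0 ID2 2.0 …; it is optional and not needed if the
     * task runtimes are already specified in the DAX
     * @param datasize, optional, the datasize file path (also rarely needed)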
     * @param op, overhead parameters 
     * @param cp, clustering parameters
     * @param scheduler, scheduling mode
     * @param planner, planning mode
     * @param rMethod , reducer mode
     * @param dl, deadline
     */
    public static void init(
            int vm, String dax, String runtime, String datasize,
            OverheadParameters op, ClusteringParameters cp,
            SchedulingAlgorithm scheduler, PlanningAlgorithm planner, String rMethod,
            long dl) {

        cParams = cp;
        vmNum = vm;
        daxPath = dax;
        runtimePath = runtime;
        datasizePath = datasize;

        oParams = op;
        schedulingAlgorithm = scheduler;
        planningAlgorithm = planner;
        reduceMethod = rMethod;
        deadline = dl;
        maxDepth = 0;
    }

1.OverheadParameters

This class models delays (overheads). Looking at the source, it contains four important maps:

Map<Integer, DistributionGenerator> wed_delay,
Map<Integer, DistributionGenerator> queue_delay,
Map<Integer, DistributionGenerator> post_delay,
Map<Integer, DistributionGenerator> cluster_delay,

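First, the two type parameters of these maps. The Integer key is the job's depth (level) in the workflow, and the value is a random-number generator. WorkflowSim provides four distribution generators: LOGNORMAL, GAMMA, WEIBULL, and NORMAL, i.e. the log-normal, gamma, Weibull, and normal distributions. The purpose of this design is that jobs at each depth can have a different delay.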
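To make the depth-keyed idea concrete, here is a minimal self-contained sketch. It does not use WorkflowSim's classes: DoubleSupplier stands in for DistributionGenerator, and DepthDelaySketch and delayFor are hypothetical names made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.function.DoubleSupplier;

// Hypothetical stand-in for a map such as queue_delay: depth -> delay generator.
// DoubleSupplier replaces WorkflowSim's DistributionGenerator for illustration only.
class DepthDelaySketch {

    /** Returns a delay sample for the given depth, or 0.0 if none is configured. */
    static double delayFor(Map<Integer, DoubleSupplier> delays, int depth) {
        if (delays == null) {
            return 0.0; // no overhead map configured at all
        }
        DoubleSupplier generator = delays.get(depth);
        return generator == null ? 0.0 : generator.getAsDouble();
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        Map<Integer, DoubleSupplier> queueDelay = new HashMap<>();
        // Depth 1: a constant delay.
        queueDelay.put(1, () -> 5.0);
        // Depth 2: a normally distributed delay, clamped so it is never negative.
        queueDelay.put(2, () -> Math.max(0.0, 10.0 + 2.0 * random.nextGaussian()));

        System.out.println("depth 1: " + delayFor(queueDelay, 1));
        System.out.println("depth 2: " + delayFor(queueDelay, 2));
        System.out.println("depth 3: " + delayFor(queueDelay, 3)); // not configured
    }
}
```

Passing null for a map, as the example at the end of this article does, simply means that kind of delay is not modeled.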

1. What is wed_delay? The source comment explains:

workflow engine delay for a particular job based on the depth(level)

In other words, wed_delay is the delay that the workflow engine imposes on a particular job.

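This delay is applied in the workflow engine's submitJob step and also during reclustering. What submitJob does exactly is covered in a later article; not understanding it yet does not affect the rest of this walkthrough.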

2.queue_delay

queue_delay is used in WorkflowScheduler, inside the processCloudletUpdate() function. The workflow engine and the reclustering code also use it, but the scheduler is where it matters most.

protected void processCloudletUpdate(SimEvent ev) {

        BaseSchedulingAlgorithm scheduler = getScheduler(Parameters.getSchedulingAlgorithm());
        scheduler.setCloudletList(getCloudletList());
        scheduler.setVmList(getVmsCreatedList());

        try {
            scheduler.run();
        } catch (Exception e) {
            Log.printLine("Error in configuring scheduler_method");
            e.printStackTrace();
        }

        List<Cloudlet> scheduledList = scheduler.getScheduledList();
        for (Cloudlet cloudlet : scheduledList) {
            int vmId = cloudlet.getVmId();
            double delay = 0.0;
            if (Parameters.getOverheadParams().getQueueDelay() != null) {
                delay = Parameters.getOverheadParams().getQueueDelay(cloudlet);
            }
            // the key line
            schedule(getVmsToDatacentersMap().get(vmId), delay, CloudSimTags.CLOUDLET_SUBMIT, cloudlet);
        }
        getCloudletList().removeAll(scheduledList);
        getCloudletSubmittedList().addAll(scheduledList);
        cloudletsSubmitted += scheduledList.size();
    }

The scheduler.run() call above performs the scheduling, i.e. the mapping from cloudlets to VMs. In this step every cloudlet is assigned a vmId, which identifies the VM the cloudlet will run on.

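The delay finally takes effect in the code below (CloudSim.java), which is part of the interaction between entities. You can skip it for now, but keep in mind that the delay's key role is to serve this one line after scheduling (run) has finished: schedule(getVmsToDatacentersMap().get(vmId), delay, CloudSimTags.CLOUDLET_SUBMIT, cloudlet);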

public static void send(int src, int dest, double delay, int tag, Object data) {
		if (delay < 0) {
			throw new IllegalArgumentException("Send delay can't be negative.");
		}

		SimEvent e = new SimEvent(SimEvent.SEND, clock + delay, src, dest, tag, data);
		future.addEvent(e);
	}
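The effect of clock + delay is easiest to see in a toy event loop. The sketch below is not CloudSim code: DelayedEventSketch, Event, and runAll are hypothetical names, and a PriorityQueue is assumed as the future-event queue purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy discrete-event loop; all names here are invented, not CloudSim's classes.
class DelayedEventSketch {

    static final class Event {
        final double time;     // simulation time at which the event fires
        final String payload;  // stands in for the cloudlet being submitted

        Event(double time, String payload) {
            this.time = time;
            this.payload = payload;
        }
    }

    // Future events ordered by firing time (earliest first).
    private final PriorityQueue<Event> future =
            new PriorityQueue<>((a, b) -> Double.compare(a.time, b.time));
    private double clock = 0.0;

    /** Queues an event that fires `delay` time units after the current clock. */
    void send(double delay, String payload) {
        if (delay < 0) {
            throw new IllegalArgumentException("Send delay can't be negative.");
        }
        future.add(new Event(clock + delay, payload));
    }

    /** Processes all queued events in time order and records when each one fired. */
    List<String> runAll() {
        List<String> order = new ArrayList<>();
        while (!future.isEmpty()) {
            Event e = future.poll();
            clock = e.time; // the clock jumps to the event's firing time
            order.add(e.payload + "@" + clock);
        }
        return order;
    }
}
```

A job sent with a larger queue delay is therefore delivered later in simulated time, even if it was sent first.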

3.post_delay

This one is very similar to the one above; it is applied inside the processCloudletReturn() function.

4.cluster_delay

This is exactly what the name says: the delay introduced by clustering. This article does not go into it.

2.ClusteringParameters

ClusteringParameters holds the parameters that control clustering.

public ClusteringParameters(int cNum, int cSize, ClusteringMethod method, String code) {
        this.clusters_num = cNum;
        this.clusters_size = cSize;
        this.method = method;
        this.code = code;
    }

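The first parameter, cNum, is the number of clusters; the second, cSize, is the cluster size; the third, ClusteringMethod, is the clustering method; and the fourth, code, selects the balancing strategy. WorkflowSim provides six strategies in total.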
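The difference between a cluster count and a cluster size can be shown with a small grouping sketch. This is one plausible reading of the two parameters for tasks on a single workflow level, not WorkflowSim's clustering code; HorizontalGroupingSketch, byClusterCount, and byClusterSize are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: groups the task IDs of one workflow level in two ways.
class HorizontalGroupingSketch {

    /** Distributes tasks over at most cNum groups in round-robin order. */
    static List<List<Integer>> byClusterCount(List<Integer> tasks, int cNum) {
        if (cNum <= 0) {
            throw new IllegalArgumentException("cNum must be positive");
        }
        List<List<Integer>> groups = new ArrayList<>();
        int n = Math.min(cNum, tasks.size());
        for (int i = 0; i < n; i++) {
            groups.add(new ArrayList<>());
        }
        for (int i = 0; i < tasks.size(); i++) {
            groups.get(i % n).add(tasks.get(i));
        }
        return groups;
    }

    /** Cuts the task list into consecutive groups of at most cSize tasks. */
    static List<List<Integer>> byClusterSize(List<Integer> tasks, int cSize) {
        if (cSize <= 0) {
            throw new IllegalArgumentException("cSize must be positive");
        }
        List<List<Integer>> groups = new ArrayList<>();
        for (int start = 0; start < tasks.size(); start += cSize) {
            int end = Math.min(start + cSize, tasks.size());
            groups.add(new ArrayList<>(tasks.subList(start, end)));
        }
        return groups;
    }
}
```

With five tasks, a count of 2 fixes the number of groups, while a size of 2 fixes how many tasks each group may hold and lets the number of groups follow from that.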

3.SchedulingAlgorithm

SchedulingAlgorithm selects the scheduling algorithm. Its values are:

MAXMIN
MINMIN
MCT
DATA
STATIC
FCFS
ROUNDROBIN
INVALID

private BaseSchedulingAlgorithm getScheduler(SchedulingAlgorithm name) {
        BaseSchedulingAlgorithm algorithm;

        // choose which algorithm to use. Make sure you have add related enum in
        //Parameters.java
        switch (name) {
            //by default it is Static
            case FCFS:
                algorithm = new FCFSSchedulingAlgorithm();
                break;
            case MINMIN:
                algorithm = new MinMinSchedulingAlgorithm();
                break;
            case MAXMIN:
                algorithm = new MaxMinSchedulingAlgorithm();
                break;
            case MCT:
                algorithm = new MCTSchedulingAlgorithm();
                break;
            case DATA:
                algorithm = new DataAwareSchedulingAlgorithm();
                break;
            case STATIC:
                algorithm = new StaticSchedulingAlgorithm();
                break;
            case ROUNDROBIN:
                algorithm = new RoundRobinSchedulingAlgorithm();
                break;
            default:
                algorithm = new StaticSchedulingAlgorithm();
                break;

        }
        return algorithm;
    }

Which algorithm is used depends on the SchedulingAlgorithm enum value that is passed in.

All of these algorithms decide how cloudlets are bound to VMs.
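As an illustration of what such a binding decision looks like, here is a sketch of the classic Min-Min heuristic on plain arrays. It is the textbook version under simplifying assumptions (no data transfer, one task per VM at a time), not the code of WorkflowSim's MinMinSchedulingAlgorithm; MinMinSketch and schedule are hypothetical names.

```java
import java.util.Arrays;

// Textbook Min-Min on plain arrays; not WorkflowSim's implementation.
class MinMinSketch {

    /**
     * @param lengths task lengths (for example, in million instructions)
     * @param mips    VM speeds; must contain at least one VM
     * @return assignment[i] = index of the VM chosen for task i
     */
    static int[] schedule(double[] lengths, double[] mips) {
        int[] assignment = new int[lengths.length];
        Arrays.fill(assignment, -1);               // -1 marks "not yet scheduled"
        double[] vmReady = new double[mips.length]; // time at which each VM becomes free

        for (int round = 0; round < lengths.length; round++) {
            int bestTask = -1;
            int bestVm = -1;
            double bestFinish = Double.MAX_VALUE;
            // Among all unscheduled tasks, find the (task, VM) pair that finishes earliest.
            for (int t = 0; t < lengths.length; t++) {
                if (assignment[t] != -1) {
                    continue;
                }
                for (int v = 0; v < mips.length; v++) {
                    double finish = vmReady[v] + lengths[t] / mips[v];
                    if (finish < bestFinish) {
                        bestFinish = finish;
                        bestTask = t;
                        bestVm = v;
                    }
                }
            }
            assignment[bestTask] = bestVm;
            vmReady[bestVm] = bestFinish; // the VM is busy until this task completes
        }
        return assignment;
    }
}
```

The result plays the same role as the vmId that each cloudlet carries after scheduler.run().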

4.PlanningAlgorithm

This is an enum that decides which planning algorithm (PlanningAlgorithm) is called.

private BasePlanningAlgorithm getPlanningAlgorithm(PlanningAlgorithm name) {
        BasePlanningAlgorithm planner;

        // choose which scheduler to use. Make sure you have add related enum in
        //Parameters.java
        switch (name) {
            //by default it is FCFS_SCH
            case INVALID:
                planner = null;
                break;
            case RANDOM:
                planner = new RandomPlanningAlgorithm();
                break;
            case HEFT:
                planner = new HEFTPlanningAlgorithm();
                break;
            case DHEFT:
                planner = new DHEFTPlanningAlgorithm();
                break;
            default:
                planner = null;
                break;
        }
        return planner;
    }

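These are the most important parameters: they decide which planning algorithm is called, and this is exactly the algorithm that most WorkflowSim users want to write themselves. To add your own algorithm, add it here (in WorkflowPlanner). The job of a planning algorithm is to bind tasks to VMs. As a classic example, in HEFTPlanningAlgorithm.java the following method decides which VM each task is scheduled on.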

private void allocateTask(Task task) {
        CondorVM chosenVM = null;
        double earliestFinishTime = Double.MAX_VALUE;
        double bestReadyTime = 0.0;
        double finishTime;

        for (Object vmObject : getVmList()) {
            CondorVM vm = (CondorVM) vmObject;
            double minReadyTime = 0.0;

            for (Task parent : task.getParentList()) {
                double readyTime = earliestFinishTimes.get(parent);
                if (parent.getVmId() != vm.getId()) {
                    readyTime += transferCosts.get(parent).get(task);
                }
                minReadyTime = Math.max(minReadyTime, readyTime);
            }

            finishTime = findFinishTime(task, vm, minReadyTime, false);

            if (finishTime < earliestFinishTime) {
                bestReadyTime = minReadyTime;
                earliestFinishTime = finishTime;
                chosenVM = vm;
            }
        }

        findFinishTime(task, chosenVM, bestReadyTime, true);
        earliestFinishTimes.put(task, earliestFinishTime);

        task.setVmId(chosenVM.getId());
    }
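The enum-plus-switch pattern used by getPlanningAlgorithm can be tried out in miniature before touching WorkflowPlanner. Everything in the sketch below is invented for illustration: MY_PLANNER, the Planner interface, and MyPlanner are not part of WorkflowSim, and a real planner would extend BasePlanningAlgorithm instead.

```java
// Miniature of the enum-plus-switch dispatch; all names are hypothetical.
class PlannerDispatchSketch {

    enum PlanningAlgorithm { INVALID, MY_PLANNER }

    interface Planner {
        /** Returns vmIds[i] = index of the VM chosen for task i. */
        int[] plan(int taskCount, int vmCount);
    }

    /** Invented example planner: spreads tasks over the VMs in cyclic order. */
    static final class MyPlanner implements Planner {
        @Override
        public int[] plan(int taskCount, int vmCount) {
            int[] vmIds = new int[taskCount];
            for (int i = 0; i < taskCount; i++) {
                vmIds[i] = i % vmCount;
            }
            return vmIds;
        }
    }

    /** Step 1: add the enum value. Step 2: add a matching case here. */
    static Planner getPlanningAlgorithm(PlanningAlgorithm name) {
        switch (name) {
            case MY_PLANNER:
                return new MyPlanner();
            case INVALID:
            default:
                return null; // no planner selected
        }
    }
}
```

The two places to touch are the same as in the real code: the enum that names the algorithm and the switch that instantiates it.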

…

5.rMethod

rMethod (the "reducer mode" in the Javadoc) specifies how a workflow is consolidated in order to reduce computation time.

6.dl

This is simply the deadline.

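7. Parameters summary

Initializing Parameters sets the number of VMs, the workflow DAX path, the runtime file path, the datasize file path, op (the overhead distributions), the ClusteringParameters, the scheduler, the planner, rMethod (the reducer method), and dl (the deadline). These are only parameters, though: they configure the workflow side and are not yet connected to CloudSim, so the CloudSim-related objects still have to be created afterwards.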

8. Code

/**
             * However, the exact number of vms may not necessarily be vmNum If
             * the data center or the host doesn't have sufficient resources the
             * exact vmNum would be smaller than that. Take care.
             */
            int vmNum = 20;//number of vms;
            /**
             * Should change this based on real physical path
             */
            String daxPath = "config/dax/Montage_100.xml";
            File daxFile = new File(daxPath);
            if (!daxFile.exists()) {
                Log.printLine("Warning: Please replace daxPath with the physical path in your working environment!");
                return;
            }

            /**
             * Since we are using MINMIN scheduling algorithm, the planning
             * algorithm should be INVALID such that the planner would not
             * override the result of the scheduler
             */
            Parameters.SchedulingAlgorithm sch_method = Parameters.SchedulingAlgorithm.MINMIN;
            // INVALID, as the comment above requires, so the planner does not override MINMIN
            Parameters.PlanningAlgorithm pln_method = Parameters.PlanningAlgorithm.INVALID;
            ReplicaCatalog.FileSystem file_system = ReplicaCatalog.FileSystem.SHARED;

            /**
             * No overheads
             */
            OverheadParameters op = new OverheadParameters(0, null, null, null, null, 0);

            /**
             * No Clustering
             */
            ClusteringParameters.ClusteringMethod method = ClusteringParameters.ClusteringMethod.NONE;
            ClusteringParameters cp = new ClusteringParameters(0, 0, method, null);

            /**
             * Initialize static parameters
             */
            Parameters.init(vmNum, daxPath, null,
                    null, op, cp, sch_method, pln_method,
                    null, 0); // initialization; the DAX file path is passed in here
            ReplicaCatalog.init(file_system);
