Scheduling Hadoop Jobs to Meet Deadlines

A paper on a scheduling algorithm for jobs with deadline (real-time) requirements.


In this paper, we lay the foundation for dealing with deadline requirements in Hadoop-based data processing by 

In this paper, the authors lay out the foundations for meeting deadline requirements when processing data on the Hadoop platform, including:

(1) proposing a job execution cost model that accounts for the various parameters that affect Hadoop job completion time, such as map and reduce runtimes, map and reduce input data sizes, and data distribution.

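That is, a job execution cost model that describes the parameters affecting Hadoop job completion time, for example map/reduce task runtimes, map/reduce input data sizes, and data distribution.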

(2) presenting the design of a Constraint-Based Hadoop Scheduler that takes user deadlines as part of its input and determines the schedulability of a job based on the proposed job execution cost model, independent of how many jobs are currently running in the cluster.

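That is, a design for a deadline-based Hadoop scheduler. The design takes the user's expected completion time as part of the job submission and, building on the job execution cost model, decides whether the job is schedulable, regardless of how many jobs are currently running concurrently.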


We focus on deadline constraints when MapReduce runtime parameter values are known and leave the issue of estimating job parameter values as future work.

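For now the focus is on the case where the MapReduce cost model parameters are already known; estimating those parameters is left as future work.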

The rest of the paper is organized as follows: in section II, we discuss the scheduling aspects of the problem and derive expressions for the minimum map/reduce task allocation required to meet deadlines. In section III, we present the design and implementation of the Constraint Scheduler, and in section IV, we present results for task allocation for different deadlines.

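Paper organization: Section II discusses the scheduling side of the problem and derives expressions for the minimum resources needed to meet task deadlines. Section III describes the design and implementation of the constraint scheduler. Finally, the paper presents how resource requirements change with different deadlines.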


II. FOUNDATIONS

A. Problem Definition:

Problem Statement: Can a given query q that translates to a MapReduce job J and has to process data of size s be completed within a deadline D, when run in a MapReduce cluster having N nodes with Nm map task slots, Nr reduce task slots and possibly k jobs executing at the time?

Problem description: a cluster of N nodes has Nm map slots and Nr reduce slots, with k jobs running concurrently. Can a job that processes s bytes of data finish within time D?

After a job is submitted, the scheduler first needs to determine whether the job can be completed within the specified deadline or not using a schedulability test.

After a job is submitted, the scheduler first runs a schedulability test to check whether the job can finish within the specified time.
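
To make the idea of a schedulability test concrete, here is a minimal Python sketch. It does not reproduce the paper's cost model, which these notes do not include. It assumes a deliberately simplified model in which work divides evenly across slots, and the names `Job`, `map_cost_per_byte`, `reduce_cost_per_byte`, and `map_output_ratio` are hypothetical parameters introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    input_size: float            # s: bytes of input data
    deadline: float              # D: seconds allowed after submission
    map_cost_per_byte: float     # assumed: seconds of map work per input byte
    reduce_cost_per_byte: float  # assumed: seconds of reduce work per byte of map output
    map_output_ratio: float      # assumed: map output size / map input size


def estimated_completion_time(job: Job, map_slots: int, reduce_slots: int) -> float:
    """Estimate runtime under the simplifying assumption that work splits evenly over slots."""
    if map_slots <= 0 or reduce_slots <= 0:
        raise ValueError("need at least one map slot and one reduce slot")
    map_time = job.input_size * job.map_cost_per_byte / map_slots
    reduce_input = job.input_size * job.map_output_ratio
    reduce_time = reduce_input * job.reduce_cost_per_byte / reduce_slots
    return map_time + reduce_time


def is_schedulable(job: Job, map_slots: int, reduce_slots: int) -> bool:
    """Schedulability test: the estimated runtime must fit within the deadline."""
    return estimated_completion_time(job, map_slots, reduce_slots) <= job.deadline
```

For example, `is_schedulable(Job(1000.0, 5.0, 0.01, 0.02, 0.5), 10, 5)` asks whether a job with those illustrative parameters can meet a 5-second deadline given 10 map slots and 5 reduce slots.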

Rather than making the schedulability determination based on all the jobs running in the system, we focus on the availability of free slots at the given time or in the future.

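Rather than deciding schedulability from all the jobs running in the system, the scheduler looks at the slots that are free now or will become free later.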
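
A test driven by free slots can be sketched as follows. This is an illustration under stated assumptions, not the expression derived in the paper: it assumes the deadline has already been split into a map budget and a reduce budget, and that the total work of each phase (in slot-seconds) is known. The function names are hypothetical.

```python
import math


def min_slots_needed(work_seconds: float, time_budget_seconds: float) -> int:
    """Smallest number of slots that finishes `work_seconds` of evenly divisible work in the budget."""
    if time_budget_seconds <= 0:
        raise ValueError("time budget must be positive")
    return max(1, math.ceil(work_seconds / time_budget_seconds))


def fits_in_free_slots(map_work: float, reduce_work: float,
                       map_budget: float, reduce_budget: float,
                       free_map_slots: int, free_reduce_slots: int) -> bool:
    """Admit the job only if the currently free slots cover its minimum requirement."""
    return (min_slots_needed(map_work, map_budget) <= free_map_slots
            and min_slots_needed(reduce_work, reduce_budget) <= free_reduce_slots)
```

The check only compares the job's own minimum requirement with the free slot counts, so it never needs to inspect the other jobs running in the cluster.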
