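Introduction to the guarantee policy
Sometimes we need to reserve resources for certain important jobs so that, once they are submitted to the LSF cluster, they obtain compute resources and start right away instead of waiting in the queue like ordinary jobs until enough resources become free. LSF's guarantee policy does exactly this: it sets aside a portion of resources so that important jobs can be scheduled and run first. Besides protecting important jobs, the policy can also lend the reserved resources out temporarily through the LOAN_POLICIES parameter (for example, to a short-job queue or to jobs expected to finish within a given time), so that the resources stay well utilized. The rest of this article uses "guarantee policy" for the policy itself and "guaranteed resources" for the compute resources it covers.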
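LSF supports four types of guaranteed resources:
- slots: reserve slots for important jobs
- hosts: reserve compute nodes for important jobs
- package: reserve a combination of slots and memory, called a package, for important jobs
- resource: reserve License Scheduler resources for important jobs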
Using the LSF guarantee policy requires configuring a guaranteed resource pool and a guaranteed service class (also called a guaranteed SLA); see the figure below:
Configuring the guarantee policy
1) Configure the guaranteed service class
Add the following to the lsb.serviceclasses file:
Begin ServiceClass
NAME = mysla
GOALS = [GUARANTEE]
ACCESS_CONTROL = QUEUES[priority]
DESCRIPTION = enable guarantees for priority jobs
End ServiceClass
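In this configuration, GOALS must be set to GUARANTEE. The ACCESS_CONTROL parameter allows jobs in the priority queue to use the reserved compute resources; no other queue is listed here, so jobs in other queues cannot use them.
2) Configure the guaranteed resource pool
Add the following to the lsb.resources file: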
Begin GuaranteedResourcePool
NAME = slotPool
HOSTS = host2
TYPE = slots
DISTRIBUTION = [mysla,2]
DESCRIPTION = guaranteed slot pool for mysla
End GuaranteedResourcePool
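This configuration sets host2 as the reserved compute node and slots as the reserved resource type. DISTRIBUTION = [mysla,2] reserves 2 slots of this guaranteed resource pool for the service class mysla, that is, for jobs in the priority queue.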
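To make the shape of a DISTRIBUTION value concrete, here is a minimal Python sketch that splits it into (service class, amount) pairs. The function name parse_distribution is hypothetical and not part of LSF, and the multi-entry form with a percentage used in the second call is an assumption about the syntax rather than something taken from the configuration above.

```python
import re

# One "[service_class, amount]" or "[service_class, amount%]" entry.
_ENTRY = re.compile(r"\[\s*([^\s,\[\]]+)\s*,\s*(\d+)\s*(%?)\s*\]")

def parse_distribution(value):
    """Return (service_class, amount, is_percent) tuples from a DISTRIBUTION value."""
    return [(name, int(amount), pct == "%")
            for name, amount, pct in _ENTRY.findall(value)]

# The value used in this article: 2 slots for mysla.
print(parse_distribution("[mysla,2]"))

# Assumed multi-entry form; slaA and slaB are made-up names.
print(parse_distribution("([slaA, 25%] [slaB, 10])"))
```

The sketch only reads the text of the value; it does not check that the service class exists or that the amounts fit into the pool.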
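Once the configuration is complete, run badmin reconfig to apply it. The bsla and bresources commands show the current usage of the guarantee policy and of the guaranteed resource pool.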
$ bsla mysla
SERVICE CLASS NAME: mysla
-- enable guarantees for priority jobs
ACCESS CONTROL: QUEUES[priority]
AUTO ATTACH: N
GOAL: GUARANTEE
GUARANTEE GUARANTEE TOTAL
POOL NAME TYPE CONFIG USED USED
slotPool slots 2 0 0
$ bresources -g -l slotPool
GUARANTEED RESOURCE POOL: slotPool
guaranteed slot pool for mysla
TYPE: slots
DISTRIBUTION: [mysla, 2]
HOSTS: host2
STATUS: ok
RESOURCE SUMMARY:
TOTAL 5
FREE 2
GUARANTEE CONFIGURED 2
GUARANTEE USED 0
GUARANTEE GUARANTEE TOTAL
CONSUMERS CONFIGURED USED USED
mysla 2 0 0
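If you want to track these numbers from a script, the pool table printed by bsla can be read with a few lines of Python. This is only a sketch under the assumption that the output is whitespace-separated exactly as shown in this article; the column layout may differ between LSF versions, and parse_bsla_pools is a hypothetical helper name.

```python
def parse_bsla_pools(text):
    """Extract the pool rows that follow the 'POOL NAME' header of bsla output."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["POOL", "NAME"]:
            in_table = True  # data rows start on the next line
            continue
        if in_table:
            # Stop at the first line that is not "name type int int int".
            if len(fields) != 5 or not all(f.isdigit() for f in fields[2:]):
                break
            name, rtype, config, used, total_used = fields
            rows.append({
                "pool": name,
                "type": rtype,
                "config": int(config),
                "guarantee_used": int(used),
                "total_used": int(total_used),
            })
    return rows

# Sample copied from the bsla output shown in this article.
sample = """GOAL: GUARANTEE
GUARANTEE GUARANTEE TOTAL
POOL NAME TYPE CONFIG USED USED
slotPool slots 2 0 0
"""
print(parse_bsla_pools(sample))
```

In practice the text would come from running bsla, for example through subprocess.run; the sample string here is hard-coded so that the sketch runs without an LSF cluster.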
Using the guarantee policy
1. Assume that host2 and host3 currently have no running jobs and that each node has 5 slots.
$ bhosts
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
host1 closed - 10 0 0 0 0 0
host2 ok - 5 0 0 0 0 0
host3 ok - 5 0 0 0 0 0
host4 closed - 10 0 0 0 0 0
2. Submit 10 ordinary jobs that do not use the reserved resources:
$ for i in `seq 1 10`; do bsub sleep 10000; done
Job <10043> is submitted to queue <normal>.
Job <10044> is submitted to queue <normal>.
Job <10045> is submitted to queue <normal>.
Job <10046> is submitted to queue <normal>.
Job <10047> is submitted to queue <normal>.
Job <10048> is submitted to queue <normal>.
Job <10049> is submitted to queue <normal>.
Job <10050> is submitted to queue <normal>.
Job <10051> is submitted to queue <normal>.
Job <10052> is submitted to queue <normal>.
3. Check the jobs: only 8 of the 10 run and 2 stay pending, because 2 slots on host2 are reserved for jobs that use the guarantee policy.
$ bhosts
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
host1 closed - 10 0 0 0 0 0
host2 ok - 5 3 3 0 0 0
host3 closed - 5 5 5 0 0 0
host4 closed - 10 0 0 0 0 0
$ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
10043 user RUN normal host1 host2 *eep 10000 Dec 31 17:16
10045 user RUN normal host1 host2 *eep 10000 Dec 31 17:16
10047 user RUN normal host1 host2 *eep 10000 Dec 31 17:16
10044 user RUN normal host1 host3 *eep 10000 Dec 31 17:16
10046 user RUN normal host1 host3 *eep 10000 Dec 31 17:16
10048 user RUN normal host1 host3 *eep 10000 Dec 31 17:16
10049 user RUN normal host1 host3 *eep 10000 Dec 31 17:16
10050 user RUN normal host1 host3 *eep 10000 Dec 31 17:16
10051 user PEND normal host1 *eep 10000 Dec 31 17:16
10052 user PEND normal host1 *eep 10000 Dec 31 17:16
$ bjobs -p
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
10051 user PEND normal host1 *eep 10000 Dec 31 17:16
Slots are reserved for guarantees: 1 host;
10052 user PEND normal host1 *eep 10000 Dec 31 17:16
Slots are reserved for guarantees: 1 host;
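The count of 8 running and 2 pending jobs can be reproduced with simple arithmetic. The following Python sketch is a simplified model, not LSF's scheduling logic: it assumes that no loaning is configured and that the pool covers a single host, and ordinary_capacity is a hypothetical helper name.

```python
def ordinary_capacity(max_slots, running, guarantee_configured=0, guarantee_used=0):
    """Slots an ordinary (non-SLA) job could still take on one host, without loaning."""
    free = max_slots - running
    # Unused guaranteed slots are held back for SLA jobs.
    held_back = max(guarantee_configured - guarantee_used, 0)
    return max(free - held_back, 0)

# Before any job starts: host2 is in slotPool (2 slots guaranteed), host3 is not.
host2 = ordinary_capacity(5, 0, guarantee_configured=2)
host3 = ordinary_capacity(5, 0)
runnable = host2 + host3
pending = 10 - runnable
print(host2, host3, runnable, pending)  # 3 5 8 2
```

The result matches the bhosts output above: 3 jobs on host2, 5 on host3, and 2 jobs left pending with the reason "Slots are reserved for guarantees".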
4. Submit an important job to the priority queue and use the -sla option to access the reserved resources; this job runs:
$ bsub -sla mysla -q priority sleep 1000
Job <10053> is submitted to queue <priority>.
$ bjobs 10053
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
10053 user RUN priority host1 host2 sleep 1000 Dec 31 17:19
5. Check the usage of the reserved resources with bsla and bresources: 1 of the 2 slots reserved on host2 is now in use:
$ bsla mysla
SERVICE CLASS NAME: mysla
-- enable guarantees for priority jobs
ACCESS CONTROL: QUEUES[priority]
AUTO ATTACH: N
GOAL: GUARANTEE
GUARANTEE GUARANTEE TOTAL
POOL NAME TYPE CONFIG USED USED
slotPool slots 2 1 1
$ bresources -g -l slotPool
GUARANTEED RESOURCE POOL: slotPool
guaranteed slot pool for mysla
TYPE: slots
DISTRIBUTION: [mysla, 2]
HOSTS: host2
STATUS: ok
RESOURCE SUMMARY:
TOTAL 5
FREE 1
GUARANTEE CONFIGURED 2
GUARANTEE USED 1
GUARANTEE GUARANTEE TOTAL
CONSUMERS CONFIGURED USED USED
mysla 2 1 1
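Additional notes
This article covers only the basic use of the guarantee policy. The policy also has a number of more advanced features:
1. The reserved compute nodes can be specified by host name, by host group, or with RES_SELECT (a selection expression that picks the nodes meeting given conditions);
2. The amount of reserved resources can be given as a number or as a percentage;
3. Reserved resources can also be lent to ordinary jobs, and lending can stop immediately once an important job requests the reserved resources;
4. Besides reserving resources for a queue, ACCESS_CONTROL can be set to any combination of QUEUES, USERS, APPS, PROJECTS, FAIRSHARE_GROUPS, and LIC_PROJECTS to implement different access controls. For example, the setting
ACCESS_CONTROL = QUEUES[priority] USERS[david] PROJECTS[vip]
allows only jobs that user david submits to the priority queue and that belong to the vip project to use the reserved resources.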
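The "every restriction must match" behaviour described in item 4 can be sketched in Python as follows. This is an illustration of the rule as stated above, not LSF code: parse_access_control and may_use_guarantee are hypothetical names, a job is represented by a plain dictionary keyed by the same keywords, and user groups or special values such as all are not handled.

```python
import re

def parse_access_control(value):
    """Map each keyword (QUEUES, USERS, ...) to the set of names it allows."""
    return {kw: set(names.split())
            for kw, names in re.findall(r"(\w+)\[([^\]]*)\]", value)}

def may_use_guarantee(rules, job):
    """A job qualifies only if it satisfies every configured restriction."""
    return all(job.get(kw) in allowed for kw, allowed in rules.items())

rules = parse_access_control("QUEUES[priority] USERS[david] PROJECTS[vip]")

# david's vip job in the priority queue qualifies.
print(may_use_guarantee(rules, {"QUEUES": "priority", "USERS": "david", "PROJECTS": "vip"}))  # True
# Same queue and project, but a different user: rejected.
print(may_use_guarantee(rules, {"QUEUES": "priority", "USERS": "alice", "PROJECTS": "vip"}))  # False
```

A job that leaves one of the restricted attributes unset (for example, no project) is also rejected in this sketch, because the missing value cannot match any allowed name.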
For a detailed description of the guarantee policy, see the official IBM documentation: