Slurm Introduction


 http://www.schedmd.com/#index


See also: submitting jobs using SLURM


Slurm Architecture



A slurmd daemon runs on each compute node, and a central slurmctld daemon runs on a management node.
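A quick way to confirm that both daemons are reachable, using standard Slurm commands:

scontrol ping     # reports whether the slurmctld controller is UP or DOWN
sinfo -N -l       # lists each node and the state the controller holds for it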




SLURM ENTITIES


node, a compute resource in Slurm


partition, a logical (possibly overlapping) set of nodes


job, an allocation of resources assigned to a user for a specified amount of time


job step, a set of (possibly parallel) tasks within a job
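To make the job vs. job step distinction concrete, here is a minimal batch-script sketch (script name and program paths are hypothetical) in which one job runs two job steps:

#!/bin/bash
#SBATCH --ntasks=4       # the job: a 4-task resource allocation
srun ./preprocess        # job step 1: a set of tasks inside the allocation
srun ./solve             # job step 2: runs after the first step finishes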




SLURM COMMANDS


Every command supports --help; man pages are also available (e.g., man sbatch, man slurm.conf).


sacct, reports job/job step accounting information about active or completed jobs


salloc, allocates resources for a job in real time (typically used to allocate resources and spawn a shell, from which srun is then executed to launch parallel tasks)


sattach, attaches standard input, output, and error plus signal capabilities to a currently running job or job step; one can attach and detach multiple times


sbatch, submits a job script for later execution; the script typically contains one or more srun commands to launch parallel tasks


sbcast, transfers a file from local disk to local disk on the nodes allocated to a job; useful on clusters without a shared file system, or for better performance than a shared file system


scancel, cancels a pending or running job or job step


scontrol, an administrative tool to view and/or modify Slurm state; most modifications require root privileges


sinfo, reports the state of partitions and nodes managed by Slurm, with a variety of filtering, sorting, and formatting options


smap, graphically reports state information for jobs, partitions, and nodes managed by Slurm


squeue, reports the state of jobs or job steps (analogous to sinfo, but for jobs), with running jobs listed in priority order


srun, submits a job for execution or initiates job steps in real time; options include max/min node count, processor count, specific nodes to use or avoid, and required node characteristics (memory, disk space, certain required features)


strigger, sets, gets, or views event triggers (e.g., actions to run when a node goes down or a job approaches its time limit)


sview, a GUI to get and update state information for jobs, partitions, and nodes managed by Slurm
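A typical workflow tying several of these commands together (the script name job.sh and job ID 1234 are hypothetical):

sbatch job.sh        # submit; prints "Submitted batch job 1234"
squeue -j 1234       # check the job's state while pending or running
sacct -j 1234        # accounting information during and after execution
scancel 1234         # cancel the job if needed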


SRUN 


a simple way: a single command line, e.g.  srun -N3 -l ../mpi/example
/* run example on three nodes, labeling each line of output with its task number (-l) */


a common way: submit a script:


/*
 * Options in the script can be supplied with a "#SBATCH" prefix followed by the option, placed at the beginning of the script (before any commands to be executed). Options supplied on the command line override any options specified within the script.
 */
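A minimal sketch (script name job.sh is hypothetical) showing both the #SBATCH prefix and the override rule:

#!/bin/bash
#SBATCH --ntasks=2          # default requested inside the script
#SBATCH --time=00:05:00
srun hostname

# sbatch job.sh              -> runs with 2 tasks
# sbatch --ntasks=4 job.sh   -> command line wins: runs with 4 tasks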


another way:  create a resource allocation and spawn job steps within that allocation


/*
 * salloc is used to create a resource allocation and typically starts a shell within that allocation; one or more job steps can then execute inside it.
 * Slurm does not automatically migrate executable or data files to the nodes allocated to a job: the files must either exist on local disk or reside in some global file system (e.g., NFS). sbcast can be used to transfer files to local storage on allocated nodes using Slurm's hierarchical communications.
 */
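A sketch of that workflow (the binary path ./a.out is hypothetical):

salloc -N2 bash             # allocate two nodes and spawn a shell in the allocation
sbcast ./a.out /tmp/a.out   # copy the binary to local disk on every allocated node
srun /tmp/a.out             # launch a job step within the allocation
exit                        # leave the shell and release the allocation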



CPU Management Steps (important)


step 1: selection of nodes. Relevant slurm.conf parameters (an example fragment follows this list):


NodeName
PartitionName
FastSchedule
SelectType   select/linear | select/cons_res
SelectTypeParameters  CR_CPU | CR_CPU_Memory | CR_Core | CR_Core_Memory | CR_Socket | CR_Socket_Memory
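A slurm.conf fragment illustrating these parameters (node names, partition name, and hardware counts are made up for illustration):

# slurm.conf (fragment)
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
NodeName=node[01-04] Sockets=2 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=64000
PartitionName=debug Nodes=node[01-04] Default=YES MaxTime=60 State=UP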


srun/salloc/sbatch command-line options (a combined usage sketch follows this list):
-B, --extra-node-info <sockets[:cores[:threads]]>, restricts node selection to nodes with the specified layout of sockets, cores, and threads


-C, --constraint <list>, restricts node selection to nodes with the specified attributes


--contiguous N/A, restricts node selection to contiguous nodes


--cores-per-socket <cores>, restricts node selection to nodes with at least the specified number of cores per socket


-c, --cpus-per-task <ncpus>, controls the number of CPUs allocated per task


--exclusive N/A, prevents sharing of allocated nodes with other jobs; suballocates CPUs to job steps


-F, --nodefile <node file>, file containing a list of specific nodes to be selected for the job (salloc and sbatch only)


--hint  compute_bound | memory_bound | [no]multithread, additional controls on allocating CPU resources


--mincpus <n>, controls the minimum number of CPUs allocated per node


-N, --nodes <minnodes[-maxnodes]>, minimum (and optionally maximum) number of nodes allocated to the job


-n, --ntasks <number>, number of tasks to be created for the job


--ntasks-per-core <number>, maximum number of tasks per allocated core


--ntasks-per-socket <number>, maximum number of tasks per allocated socket


--ntasks-per-node <number>, maximum number of tasks per allocated node


-O, --overcommit N/A,  allows fewer CPUs to be allocated than the number of tasks


-p, --partition  <partition_names>, which partition is used for the job


-s, --share  N/A, allow sharing of allocated nodes with other jobs


--sockets-per-node <sockets>, restricts node selection to nodes with at least the specified number of sockets


--threads-per-core <threads>, restricts node selection to nodes with at least the specified number of threads per core


-w, --nodelist <host1,host2, ... or filename>, list of specific nodes to be allocated to the job


-x, --exclude <host1,host2,... or filename>, list of specific nodes to be excluded from allocation to the job


-Z, --no-allocate N/A, runs tasks on a set of nodes without creating a Slurm allocation (privileged users only; srun only)
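A combined node-selection sketch on one command line (node names, counts, and the program are hypothetical; the CPU-L5520 feature reappears in the example script below):

srun -p debug -N2 -n8 --ntasks-per-node=4 \
     -C CPU-L5520 --exclude=node03 ./a.out
# two nodes from partition "debug", 8 tasks total (at most 4 per node),
# only nodes advertising feature CPU-L5520, never node03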




step 2: allocation of CPUs from the selected nodes; controlled by the same slurm.conf parameters and command-line options as in step 1


step 3: distribution of tasks to the selected nodes


Each task is distributed to only one node, but more than one task may be distributed to each node. Unless CPU resources are abundant, the number of tasks distributed to a node is determined by the total number of CPUs allocated on that node and the number of CPUs each task requires; for example, a node with 8 allocated CPUs and --cpus-per-task=2 can receive at most 4 tasks.


In slurm.conf: MaxTasksPerNode=<number>, the maximum number of tasks that a job step can spawn on a single node


srun/salloc/sbatch options relevant to this step (a distribution sketch follows this list):


-m, --distribution  block | cyclic | arbitrary | plane=<options> [:block|cyclic], controls how tasks are distributed across nodes (and sockets/cores)


--ntasks-per-core <number> 


--ntasks-per-socket <number> 


--ntasks-per-node <number>


-r, --relative <n>, controls which node within the allocation is used for a job step (srun only)
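A sketch contrasting block and cyclic distribution for 2 nodes and 4 tasks (program hypothetical):

srun -N2 -n4 -m block ./a.out    # tasks 0,1 -> first node; tasks 2,3 -> second
srun -N2 -n4 -m cyclic ./a.out   # tasks 0,2 -> first node; tasks 1,3 -> second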


step 4: optional distribution and binding of tasks to CPUs
Slurm distributes and binds each task to a specified subset of the allocated CPUs on the node.
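A binding sketch using srun's --cpu_bind option (spelled --cpu-bind in newer Slurm releases; program hypothetical):

srun -n4 --cpu_bind=cores ./a.out            # bind each task to a dedicated core
srun -n4 --cpu_bind=verbose,cores ./a.out    # same, but also report the chosen binding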



Example:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --constraint=CPU-L5520 
#SBATCH --partition=debug
#SBATCH --time=00:00:10
#SBATCH --mail-type=END
#SBATCH --mail-user=xx@gmail.com
#SBATCH --output=core8.out
#

echo "SLUM_JOBID="$SLURM_JOBID
echo "SLURM_JOB_CPUS_PER_NODE="$SLURM_JOB_CPUS_PER_NODE
echo "SLURM_CPUS_PER_TASK="$SLURM_CPUS_PER_TASK
echo "SLURM_TASKS_PER_NODE="$SLURM_TASKS_PER_NODE
echo "SLURM_NTASKS="$SLURM_NTASKS

