LSF Study Note

Dependencies

NIS/LDAP, NFS

Install

Add LSF Administrator

# useradd -u 50001 lsfadmin

Configure install.config

# vim /tools/tmp/lsf/install.config
LSF_TOP="/tools/env/lsf"                       # Install directory
LSF_ADMINS="lsfadmin"                          # LSF administrator
LSF_CLUSTER_NAME="Platform"                    # Cluster name
LSF_MASTER_LIST="lsf01 lsf02"                  # lsf01: master; lsf02: candidate master. If the master goes down, the candidate takes over the cluster after about 2-5 minutes.
LSF_ENTITLEMENT_FILE="/tools/tmp/lsf/xxx.dat"  # License (entitlement) file
LSF_TARDIR="/tools/tmp/lsf/"                   # Directory containing the installation packages
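
Before running the installer, it can help to confirm that none of the keys above was left empty or commented out. The following is only a minimal sketch: check_install_config is a hypothetical helper written for this note, not an LSF tool, and the key list is simply the six keys used here.

```shell
# Hypothetical helper (not part of LSF): report keys from this note that are
# unset, empty, or commented out in an install.config file.
check_install_config() {
    config_file=$1
    missing=0
    for key in LSF_TOP LSF_ADMINS LSF_CLUSTER_NAME LSF_MASTER_LIST \
               LSF_ENTITLEMENT_FILE LSF_TARDIR; do
        # A key counts as set when an uncommented KEY=value line has a non-empty value.
        if ! grep -Eq "^[[:space:]]*${key}=\"?[^\"[:space:]]" "$config_file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: check_install_config /tools/tmp/lsf/install.config prints one "missing: KEY" line per problem and returns non-zero if any key is missing.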

Install

# ./lsfinstall -f install.config

Start the daemons and enable automatic startup

# source /tools/env/lsf/conf/profile.sh
# lsadmin limstartup; lsadmin resstartup; badmin hstartup
# /tools/env/lsf/10.1/install/hostsetup --top="/tools/env/lsf" --boot="y"

Manage Cluster

Define client host and hostgroup

Define client host in lsf.cluster.platform & lsb.hosts

# vim /tools/env/lsf/conf/lsf.cluster.platform
...
Begin   Host
HOSTNAME  model    type        server r1m  mem  swp  RESOURCES    #Keywords
#apple    Sparc5S  SUNSOL       1     3.5  1    2   (sparc bsd)   #Example
#peach    DEC3100  DigitalUNIX  1     3.5  1    2   (alpha osf1)
#banana   HP9K778  HPPA         1     3.5  1    2   (hp68k hpux)
#mango    HP735    HPPA         1     3.5  1    2   (hpux cs)
#grape    SGI4D35  SGI5         1     3.5  1    2   (irix)
#lemon    PC200    LINUX        1     3.5  1    2   (linux)
#pear     IBM350   IBMAIX4      1     3.5  1    2   (aix cs)
#plum     PENT_100 NTX86        1     3.5  1    2   (nt)
#berry    DEC3100  !            1     3.5  1    2   (ultrix fs bsd mips dec)
#orange   !        SUNSOL       1     3.5  1    2   (sparc bsd)   #Example
#prune    !        !            1     3.5  1    2   (convex)
lsf01     !        !            1     3.5  ()   ()  (mg)
lsf02     !        !            1     3.5  ()   ()  (mg)
server1   !        !            1     3.5  ()   ()  (mg)
...

# vim /tools/env/lsf/conf/lsbatch/platform/configdir/lsb.hosts
...
Begin Host
HOST_NAME MXJ   r1m     pg    ls    tmp  DISPATCH_WINDOW  AFFINITY  # Keywords
#hostA     () 3.5/4.5   15/   12/15  0      ()            (Y)  # Example
#hostB     !    3.5    15/18  12/    0/  (5:19:00-1:8:30 20:00-8:30)  (Y)
#hostC     1    3.5/5   18    15     ()     ()            (Y)   # Example
#hostD     !    ()      ()    ()     ()     ()            (Y)   # Example
#hostE     4    ()      ()    ()     ()     ()            (Y)   # Example
#SPARCIPC  () 4.0/5.0   18    16     ()     ()            (Y)   # Example
default    !    ()      ()    ()     ()     ()            (Y)   # Example
lsf01      0    ()      ()    ()     ()     ()            (Y)
lsf02      0    ()      ()    ()     ()     ()            (Y)
server1    96   ()      ()    ()     ()     ()            (Y)
...

Note: "MXJ" is the maximum number of job slots the host accepts; it is usually set to the number of CPU cores. Setting it to 0, as for lsf01 and lsf02 above, keeps jobs off those hosts.
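
To pick a starting value for MXJ, one option is to read the CPU count on the host itself. This is a small sketch; cpu_slots is a hypothetical name, and whether one slot per processing unit is appropriate depends on the workload.

```shell
# Print the number of processing units available on this host,
# a common starting value for MXJ.
cpu_slots() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        # Fallback for systems without nproc.
        getconf _NPROCESSORS_ONLN
    fi
}
```

Run cpu_slots on each execution host and put the result in the MXJ column of that host's line in lsb.hosts.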


Define hostgroup in lsb.hosts

# vim /tools/env/lsf/conf/lsbatch/platform/configdir/lsb.hosts
...
# This example is commented out
Begin HostGroup
GROUP_NAME    GROUP_MEMBER      # Key words
#hgroup1      (hostA hostD )    # Define a host group
test_group    (server1)
...
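
To double-check what a host group contains without starting the cluster, the HostGroup section can be read directly from the file. The sketch below assumes the simple layout shown above (one group per line, members in a single pair of parentheses); hostgroup_members is a hypothetical helper, not an LSF command.

```shell
# Hypothetical helper: print the members of one host group, one per line.
# Usage: hostgroup_members <lsb.hosts path> <group name>
hostgroup_members() {
    awk -v group="$2" '
        /^[ \t]*Begin[ \t]+HostGroup/ { in_section = 1; next }
        /^[ \t]*End[ \t]+HostGroup/   { in_section = 0 }
        in_section && $1 == group {
            line = $0
            sub(/#.*/, "", line)              # drop any trailing comment
            if (match(line, /\(.*\)/)) {
                # keep the text between the parentheses
                members = substr(line, RSTART + 1, RLENGTH - 2)
                n = split(members, list, " ")
                for (i = 1; i <= n; i++) print list[i]
            }
        }
    ' "$1"
}
```

Commented-out groups such as #hgroup1 are skipped because their first field starts with "#".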

Define usergroup

Define usergroup in lsb.users

# vim /tools/env/lsf/conf/lsbatch/platform/configdir/lsb.users
...
Begin UserGroup
GROUP_NAME       GROUP_MEMBER              USER_SHARES            #GROUP_ADMIN
test_user        (user01 user02)           ()
#ugroup1         (user1 user2 user3 user4) ([user1, 4] [others, 10])   #(user1 user2[full])
...

Define queue

Define queue in lsb.queues

Note: set INTERACTIVE = YES (instead of NO) if you want to submit interactive jobs such as xterm.

# vim /tools/env/lsf/conf/lsbatch/platform/configdir/lsb.queues
...
Begin Queue
QUEUE_NAME   = test_queue
PRIORITY     = 30
INTERACTIVE  = YES
FAIRSHARE    = USER_SHARES[[default,1]]
#RUN_WINDOW   = 5:19:00-1:8:30 20:00-8:30
#r1m         = 0.7/2.0        # loadSched/loadStop
#r15m         = 1.0/2.5
#pg           = 4.0/8
#ut           = 0.2
#io           = 50/240
#CPULIMIT     = 180/hostA      # 3 hours of host hostA
#FILELIMIT    = 20000
#DATALIMIT    = 20000          # jobs data segment limit
#CORELIMIT    = 20000
#TASKLIMIT    = 5              # job task limit
USERS         = test_user           # users who can submit jobs to this queue
HOSTS         = test_group     # hosts on which jobs in this queue can run
#PRE_EXEC     = /usr/local/lsf/misc/testq_pre >> /tmp/pre.out
#POST_EXEC    = /usr/local/lsf/misc/testq_post |grep -v "Hey"
#REQUEUE_EXIT_VALUES = 55 34 78
#APS_PRIORITY = WEIGHT[[RSRC, 10.0] [MEM, 20.0] [PROC, 2.5] [QPRIORITY, 2.0]] \
#    LIMIT[[RSRC, 3.5] [QPRIORITY, 5.5]] \
#    GRACE_PERIOD[[QPRIORITY, 200s] [MEM, 10m] [PROC, 2h]]
DESCRIPTION  = For normal low priority jobs, running only if hosts are \
lightly loaded.
End Queue
...

Change default queue

The default queues are "normal" and "interactive"; you can redefine them in lsb.params.

# vim /tools/env/lsf/conf/lsbatch/platform/configdir/lsb.params
...
Begin Parameters
DEFAULT_QUEUE  = test_queue  # Default job queue names
...
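
A DEFAULT_QUEUE entry that does not match any QUEUE_NAME in lsb.queues is easy to miss. The sketch below compares the two files; undefined_default_queues is a hypothetical helper, and it assumes each setting sits on a single line in the form shown in this note.

```shell
# Hypothetical helper: print each DEFAULT_QUEUE entry with no matching QUEUE_NAME.
# Usage: undefined_default_queues <lsb.params path> <lsb.queues path>
undefined_default_queues() {
    params_file=$1
    queues_file=$2
    # Value of the first uncommented DEFAULT_QUEUE line, without its trailing comment.
    defaults=$(sed -n 's/^[[:space:]]*DEFAULT_QUEUE[[:space:]]*=[[:space:]]*\([^#]*\).*/\1/p' \
        "$params_file" | head -n 1)
    for queue in $defaults; do
        # The name must be followed by whitespace, a comment, or end of line.
        if ! grep -Eq "^[[:space:]]*QUEUE_NAME[[:space:]]*=[[:space:]]*${queue}([[:space:]]|#|\$)" \
            "$queues_file"; then
            echo "$queue"
        fi
    done
}
```

An empty result means every default queue name is defined.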

Reconfigure

# lsadmin reconfig
If something is wrong in the config file "lsf.cluster.platform", the command reports the error and asks whether you still want to restart.

# badmin reconfig
Run "badmin reconfig" after "lsadmin reconfig" succeeds. Likewise, if something is wrong in an "lsb.xxx" config file, the command reports the error and asks whether you still want to proceed.

# badmin mbdrestart
Restarts mbatchd; the "lsb.xxx" config files are checked for syntax errors first.
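
The order above (LIM first, then mbatchd) can be wrapped in one function that also runs the ckconfig checks beforehand. This is a sketch under two assumptions: the LSF environment (profile.sh) has already been sourced, and each command returns a non-zero exit status when it finds an error. lsf_reconfigure is a hypothetical name.

```shell
# Sketch: validate first, then reconfigure LIM before mbatchd.
# Stops at the first step that fails.
lsf_reconfigure() {
    lsadmin ckconfig -v || return 1   # check the LIM configuration files
    badmin ckconfig -v  || return 1   # check the lsb.* files
    lsadmin reconfig    || return 1   # may prompt for confirmation
    badmin reconfig
}
```

Run it as the LSF administrator on the master host.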

Join a client host to the cluster

# source /tools/env/lsf/conf/profile.sh
# lsadmin limstartup; lsadmin resstartup; badmin hstartup
# /tools/env/lsf/10.1/install/hostsetup --top="/tools/env/lsf" --boot="y"
# bhosts
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV 
lsf01              closed          -      0      0      0      0      0      0
lsf02              closed          -      0      0      0      0      0      0
server1            ok              -     96      0      0      0      0      0
...
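
In the output above, lsf01 and lsf02 are closed with MAX 0, which matches the MXJ value of 0 set for them in lsb.hosts. To spot hosts that are closed or unavailable in a longer listing, the output can be filtered. A minimal sketch, where hosts_not_ok is a hypothetical helper that only looks at the first two columns:

```shell
# Read bhosts output on stdin and print every host whose STATUS is not "ok".
hosts_not_ok() {
    # NR > 1 skips the header line; $1 is HOST_NAME, $2 is STATUS.
    awk 'NR > 1 && $2 != "ok" { print $1 ": " $2 }'
}
```

Usage: bhosts | hosts_not_ok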

End
