IBM Spectrum LSF 9.1.3: An Introduction to Basic Concepts

Personal study notes, for reference only. If anything here is described incorrectly, discussion and corrections are welcome!

1. LSF Hosts

Hosts in your cluster perform different functions.
Master host
LSF server host that acts as the overall coordinator for the cluster, doing all job scheduling and dispatch.
Server host
A host that submits and runs jobs.
Client host
A host that only submits jobs and tasks.
Execution host
A host that runs jobs and tasks.
Submission host
A host from which jobs and tasks are submitted.
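
The roles above overlap: a server host can both submit and run jobs, while a client host can only submit them. The following Python sketch expresses that relationship as sets of capabilities. The `Role` flags, the `HOST_KINDS` table, and the helper functions are names invented for this note and are not LSF identifiers.

```python
from enum import Flag, auto


class Role(Flag):
    """Capabilities a host can have (hypothetical names, not part of LSF)."""
    SUBMIT = auto()      # jobs and tasks can be submitted from this host
    EXECUTE = auto()     # jobs and tasks can run on this host
    COORDINATE = auto()  # host schedules and dispatches jobs for the cluster


# Host kinds from the list above, written as combinations of capabilities.
HOST_KINDS = {
    # The master host is a server host that also coordinates the cluster.
    "master": Role.SUBMIT | Role.EXECUTE | Role.COORDINATE,
    "server": Role.SUBMIT | Role.EXECUTE,
    "client": Role.SUBMIT,
}


def can_be_submission_host(kind: str) -> bool:
    """True if a host of this kind can submit jobs."""
    return Role.SUBMIT in HOST_KINDS[kind]


def can_be_execution_host(kind: str) -> bool:
    """True if a host of this kind can run jobs."""
    return Role.EXECUTE in HOST_KINDS[kind]
```

Under this reading, "submission host" and "execution host" describe what a host does for a particular job, not a separate kind of machine.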

2. LSF Cluster

badmin controls the operation of mbatchd and sbatchd.
lsadmin controls the operation of lim and res.
[Figure: image not available in the source]
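
As a quick reference, the split between the two administration commands can be written as a lookup table. This is only a sketch of the two statements above; the dictionary and function names are made up for illustration.

```python
# Which administration command controls which daemon, per the notes above.
ADMIN_COMMAND_FOR_DAEMON = {
    "mbatchd": "badmin",
    "sbatchd": "badmin",
    "lim": "lsadmin",
    "res": "lsadmin",
}


def admin_command(daemon: str) -> str:
    """Return the admin command that controls the given daemon."""
    try:
        return ADMIN_COMMAND_FOR_DAEMON[daemon]
    except KeyError:
        # Daemons not mentioned in the notes are deliberately not guessed.
        raise ValueError(f"no admin command recorded for daemon {daemon!r}") from None
```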

3. LSF Jobs

[Figure: image not available in the source]

4. Job States

LSF jobs have the following states:
PEND — Waiting in a queue for scheduling and dispatch
RUN — Dispatched to a host and running
DONE — Finished normally with zero exit value
EXIT — Finished with non-zero exit value
PSUSP — Suspended while pending
USUSP — Suspended by user
SSUSP — Suspended by the LSF system
POST_DONE — Post-processing completed without errors
POST_ERR — Post-processing completed with errors
WAIT — Members of a chunk job that are waiting to run
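
When scripting around job status it can help to group these states. The sketch below stores the descriptions from the list above in a Python `Enum` and sorts each state into one of four buckets. The bucket names ("waiting", "running", "suspended", "finished") and the choice to count POST_DONE and POST_ERR as finished are my own assumptions, not LSF terminology.

```python
from enum import Enum


class JobState(Enum):
    """LSF job states, with the descriptions from the list above."""
    PEND = "Waiting in a queue for scheduling and dispatch"
    RUN = "Dispatched to a host and running"
    DONE = "Finished normally with zero exit value"
    EXIT = "Finished with non-zero exit value"
    PSUSP = "Suspended while pending"
    USUSP = "Suspended by user"
    SSUSP = "Suspended by the LSF system"
    POST_DONE = "Post-processing completed without errors"
    POST_ERR = "Post-processing completed with errors"
    WAIT = "Members of a chunk job that are waiting to run"


# Grouping is an assumption made for this sketch.
SUSPENDED = {JobState.PSUSP, JobState.USUSP, JobState.SSUSP}
FINISHED = {JobState.DONE, JobState.EXIT, JobState.POST_DONE, JobState.POST_ERR}


def classify(state_name: str) -> str:
    """Map a state name such as 'PEND' to a coarse bucket."""
    state = JobState[state_name]  # raises KeyError for unknown names
    if state in SUSPENDED:
        return "suspended"
    if state in FINISHED:
        return "finished"
    if state is JobState.RUN:
        return "running"
    return "waiting"  # PEND and WAIT
```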

5. LSF Directories

The following directories are owned by the primary LSF administrator (lsfadmin) and are readable by all cluster users:
• LSF_CONFDIR
LSF configuration directory: /usr/share/lsf/conf
• LSB_CONFDIR
LSF Batch configuration directory: /usr/share/lsf/conf/lsbatch
• LSB_SHAREDIR
LSF Batch job history directory: /usr/share/lsf/work
• LSF_LOGDIR
Server daemon error logs, one for each LSF daemon: /usr/share/lsf/log

The following directories are owned by root and are readable by all cluster users:
• LSF_BINDIR
LSF user commands, shared by all hosts of the same type. For example:
/usr/share/lsf/9.1/sparc-sol10/bin/
• LSF_INCLUDEDIR
Header files lsf/lsf.h and lsf/lsbatch.h:
/usr/share/lsf/9.1/include
• LSF_LIBDIR
LSF libraries, shared by all hosts of the same type. For example:
/usr/share/lsf/9.1/sparc-sol10/lib/
• LSF_MANDIR
LSF man pages: /usr/share/lsf/9.1/man
• LSF_MISC
Examples and other miscellaneous files: /usr/share/lsf/9.1/misc
• LSF_SERVERDIR
Server daemon binaries, scripts and other utilities, shared by all hosts of the same type. For example:
/usr/share/lsf/9.1/sparc-sol10/etc/
• LSF_TOP
Top-level installation directory: /usr/share/lsf
Other configuration directories are specified in /usr/share/lsf/conf/lsf.conf.
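
To look up these directories from a script, one option is to read them from lsf.conf. The sketch below assumes the file consists of `NAME=value` lines with `#` comments and optionally double-quoted values; that format is an assumption here, and the sample text is a hypothetical excerpt built from the paths listed above, not a copy of a real file.

```python
def parse_lsf_conf(text: str) -> dict:
    """Parse NAME=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a NAME=value line, so ignore it
        # Drop surrounding whitespace and optional double quotes.
        settings[name.strip()] = value.strip().strip('"')
    return settings


# Hypothetical excerpt; values are the directories listed above.
SAMPLE = """\
# directories owned by the primary LSF administrator
LSF_CONFDIR=/usr/share/lsf/conf
LSB_CONFDIR=/usr/share/lsf/conf/lsbatch
LSB_SHAREDIR=/usr/share/lsf/work
LSF_LOGDIR=/usr/share/lsf/log
"""

conf = parse_lsf_conf(SAMPLE)
print(conf["LSF_LOGDIR"])
```

On a real cluster, the same function could be fed the contents of /usr/share/lsf/conf/lsf.conf instead of `SAMPLE`.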

6. LSF Cluster Configuration Files

The following files are owned by the primary LSF administrator (lsfadmin) and are readable by all cluster users:

• LSF global configuration files describing the configuration and operation of the cluster saozi:
/usr/share/lsf/conf/ego/saozi/kernel/ego.conf
/usr/share/lsf/conf/lsf.conf

• LSF keyword definition file shared by all clusters. Defines cluster name, host types, host models, and site-specific resources:
/usr/share/lsf/conf/lsf.shared

• LSF cluster configuration file that defines hosts, administrators, and location of site-defined shared resources:
/usr/share/lsf/conf/lsf.cluster.saozi

• LSF mapping files for task names and their default resource requirements:
/usr/share/lsf/conf/lsf.task
/usr/share/lsf/conf/lsf.task.saozi

7. LSF Batch Configuration Files

The following files are owned by the primary LSF administrator (lsfadmin) and are readable by all cluster users:

• LSF server hosts and their attributes, such as scheduling load thresholds, dispatch windows, and job slot limits:
/usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.hosts
If no hosts are defined in this file, then all LSF server hosts listed in /usr/share/lsf/conf/lsf.cluster.saozi are assumed to be LSF Batch server hosts.

• LSF scheduler and resource broker plugin modules. If no scheduler or resource broker modules are configured, LSF uses the default scheduler plugin module named schmod_default:
/usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.modules

• LSF Batch system parameter file:
/usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.params

• LSF job queue definitions:
/usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.queues

• Resource allocation limits, exports, and resource usage limits:
/usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.resources

• LSF user groups, hierarchical fairshare for users and user groups, and job slot limits for users and user groups:
/usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.users
Also used to configure account mappings in a MultiCluster environment.

• Application profiles: common parameters for jobs of the same type, including the execution requirements of the applications, the resources they require, and how they should be run and managed:
/usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.applications
This file is optional. Use the DEFAULT_APPLICATION parameter in lsb.params to specify a default application profile for all jobs. LSF does not automatically assign a default application profile.
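
All of the batch configuration files above follow one path pattern: the LSB_CONFDIR directory, then the cluster name, then `configdir`, then the file name. A small Python sketch of that pattern follows. The function name `batch_config_path` is invented for this note, and the base directory is the example path used in this section, which may differ on another installation.

```python
from pathlib import PurePosixPath

# Example LSB_CONFDIR from this section; other installations may differ.
LSB_CONFDIR = PurePosixPath("/usr/share/lsf/conf/lsbatch")

# Batch configuration files listed above.
BATCH_CONFIG_FILES = (
    "lsb.hosts",
    "lsb.modules",
    "lsb.params",
    "lsb.queues",
    "lsb.resources",
    "lsb.users",
    "lsb.applications",
)


def batch_config_path(cluster: str, filename: str) -> PurePosixPath:
    """Build LSB_CONFDIR/<cluster>/configdir/<filename>."""
    if filename not in BATCH_CONFIG_FILES:
        raise ValueError(f"not a known batch configuration file: {filename!r}")
    return LSB_CONFDIR / cluster / "configdir" / filename


print(batch_config_path("saozi", "lsb.queues"))
# /usr/share/lsf/conf/lsbatch/saozi/configdir/lsb.queues
```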

8. LSF Batch Log Files

• LSF Batch events log:
/usr/share/lsf/work/saozi/logdir/lsb.events

• LSF Batch accounting log:
/usr/share/lsf/work/saozi/logdir/lsb.acct

9. LSF Daemon Log Files

LSF server daemon log files are stored in /usr/share/lsf/log:
• Load Information Manager (lim)
/usr/share/lsf/log/lim.log.host_name
• Remote Execution Server (res)
/usr/share/lsf/log/res.log.host_name
• Master Batch Daemon (mbatchd)
/usr/share/lsf/log/mbatchd.log.mgmt1
• Master Scheduler Daemon (mbschd)
/usr/share/lsf/log/mbschd.log.mgmt1
• Slave Batch Daemon (sbatchd)
/usr/share/lsf/log/sbatchd.log.host_name
• Process Information Manager (pim)
/usr/share/lsf/log/pim.log.host_name
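
The log file names above share the pattern `<daemon>.log.<host_name>` under LSF_LOGDIR. The following Python sketch builds such a path for a given daemon and host. The function name `daemon_log_path` is invented for this note, and the log directory is the example path from this section.

```python
from pathlib import PurePosixPath

# Example LSF_LOGDIR from this section; other installations may differ.
LSF_LOGDIR = PurePosixPath("/usr/share/lsf/log")

# Daemons listed above.
DAEMONS = ("lim", "res", "mbatchd", "mbschd", "sbatchd", "pim")


def daemon_log_path(daemon: str, host_name: str) -> PurePosixPath:
    """Build LSF_LOGDIR/<daemon>.log.<host_name>."""
    if daemon not in DAEMONS:
        raise ValueError(f"unknown daemon: {daemon!r}")
    return LSF_LOGDIR / f"{daemon}.log.{host_name}"


print(daemon_log_path("mbatchd", "mgmt1"))
# /usr/share/lsf/log/mbatchd.log.mgmt1
```

Here `mgmt1` is the host name that appears in the mbatchd and mbschd entries above.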

10. LSF Re-Configure

[Figure: image not available in the source]
