Installing Slurm on CentOS
Control node: node16
Compute nodes: node16, node18
Remove a failed Slurm installation
yum remove slurm -y
cat /etc/passwd | grep slurm
userdel -r slurm
Create the slurm user
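The cleanup above can be made safe to rerun by checking whether the account exists before deleting it. This is only a minimal sketch: the function name remove_slurm_user is made up for illustration, and it assumes getent is available (it is part of glibc on CentOS).

```shell
# Hypothetical helper: delete the slurm account only if it exists.
remove_slurm_user() {
    if getent passwd slurm >/dev/null 2>&1; then
        # -r also removes the home directory and mail spool
        userdel -r slurm
    else
        echo "no slurm user found, nothing to remove"
    fi
}

# Run as root on each node where the earlier install failed:
# remove_slurm_user
```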
export SLURMUSER=412
groupadd -g $SLURMUSER slurm
useradd -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUSER -g slurm -s /bin/bash slurm
Check the slurm user's UID and GID; they must be identical on the control node and on every compute node.
id slurm
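Comparing the output of id by eye gets tedious as the node count grows, so the check can be scripted. The following is a rough sketch, not part of the original procedure: the function name ids_match is hypothetical, and the commented collection loop assumes passwordless ssh from the current host to node16 and node18.

```shell
# Hypothetical helper: reads lines of the form "<node> <uid> <gid>" on stdin.
# Returns 0 if every node reports the same non-empty uid and gid, 1 otherwise.
ids_match() {
    awk '
        NR == 1 { uid = $2; gid = $3 }
        $2 == "" || $3 == "" || $2 != uid || $3 != gid {
            bad = 1
            print "mismatch on " $1 ": uid=" $2 " gid=" $3
        }
        END { exit bad + 0 }
    '
}

# Example collection step (assumes passwordless ssh to each node):
# for n in node16 node18; do
#     echo "$n $(ssh "$n" id -u slurm) $(ssh "$n" id -g slurm)"
# done | ids_match && echo "slurm UID/GID consistent"
```

A node where the slurm user is missing produces an empty uid and gid, which the function also reports as a mismatch.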
Install Slurm
First install the EPEL repository:
yum install epel-release
Install Slurm's dependencies:
yum install openssl openssl-devel pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad -y
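If yum reports a package conflict (here with ibacm and libipathverbs), remove the conflicting packages and rerun the command above: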
yum -y remove ibacm-1.2.0-1.el7.x86_64
yum -y remove libipathverbs-1.3-2.el7.x86_64
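After the dependency install has been rerun, it can help to confirm that nothing was silently skipped. The sketch below is an illustration only: missing_pkgs is a hypothetical helper name, and it relies on rpm -q returning a non-zero status for packages that are not installed.

```shell
# Hypothetical helper: print every package name given as an argument
# that `rpm -q` reports as not installed.
missing_pkgs() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Example with a few of the packages from the install command above:
# missing_pkgs openssl-devel pam-devel numactl-devel hwloc-devel lua-devel
```

Empty output means every listed package is present; any name printed should be installed again with yum.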
Install rpm-build:
yum install rpm-build
Download Slurm:
wget https://www.schedmd.com/archives.php/downloads/archive/slurm-17.02.4