编辑
我真正想要的是一种模仿SLURM的方法,一种可以安装的交互式和合理用户友好的方式.
原帖
我想用SLURM测试一些最小的例子,我试图用Ubuntu 16.04在本地机器上安装它.我跟随the most recent slurm install guide I could find,然后我开始使用sudo /etc/init.d/slurmd start开始slurmd.
[....] Starting slurmd (via systemctl): slurmd.serviceJob for slurmd.service failed because the control process exited with error code. See "systemctl status slurmd.service" and "journalctl -xe" for details.
failed!
我不知道如何解释systemctl日志:
● slurmd.service - Slurm node daemon
Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2017-10-26 22:49:27 EDT; 12s ago
Process: 5951 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=1/FAILURE)
Oct 26 22:49:27 Haggunenon systemd[1]: Starting Slurm node daemon...
Oct 26 22:49:27 Haggunenon systemd[1]: slurmd.service: Control process exited, code=exited status=1
Oct 26 22:49:27 Haggunenon systemd[1]: Failed to start Slurm node daemon.
Oct 26 22:49:27 Haggunenon systemd[1]: slurmd.service: Unit entered failed state.
Oct 26 22:49:27 Haggunenon systemd[1]: slurmd.service: Failed with result 'exit-code'.
lsb_release -a给出以下内容. (是的,我知道,严格来说,KDE Neon并不完全是Ubuntu.)
o LSB modules are available.
Distributor ID: neon
Description: KDE neon User Edition 5.11
Release: 16.04
Codename: xenial
与导游说的不同,我使用了自己的用户名wlandau,我确保将/ var / lib / slurm-llnl和/ var / run / slurm-llnl chown给我.这是我的/etc/slurm-llnl/slurm.conf.
# slurm.conf file generated by configurator.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=Linux0
#ControlAddr=
#BackupController=
#BackupAddr=
#
AuthType=auth/munge
CacheGroups=0
#CheckpointType=checkpoint/none
CryptoType=crypto/munge
#DisableRootJobs=NO
#EnforcePartLimits=NO
#Epilog=
#EpilogSlurmctld=
#FirstJobId=1
#MaxJobId=999999
#GresTypes=
#GroupUpdateForce=0
#GroupUpdateTime=600
#JobCheckpointDir=/var/lib/slurm-llnl/checkpoint
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
#JobFileAppend=0
#JobRequeue=1
#JobSubmitPlugins=1
#KillOnBadExit=0
#LaunchType=launch/slurm
#Licenses=foo*4,bar
#MailProg=/usr/bin/mail
#MaxJobCount=5000
#MaxStepCount=40000
#MaxTasksPerNode=128
MpiDefault=none
#MpiParams=ports