Deploying Slurm
The following content was generated by gpt-4o.
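Configuring MUNGE
- Install MUNGE:
    sudo apt install munge libmunge-dev libmunge2
- Generate the MUNGE key:
    sudo -u munge /usr/sbin/mungekey --verbose
- Copy the MUNGE key (by default /etc/munge/munge.key) to every node. All nodes must hold an identical key, owned by the munge user and not readable by other users.
- Start the MUNGE service:
    sudo systemctl enable munge
    sudo systemctl start munge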
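The key-copy step above is stated without commands, so here is a minimal sketch of one way to do it. It only prints the commands for review and does not run them. Everything specific is an assumption rather than part of the original: the node names are hypothetical, the key is taken to be at the default /etc/munge/munge.key, and the SSH user is assumed to have passwordless sudo on each node. The helper names munge_key_plan and munge_key_plan_all are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the commands that would install the MUNGE key on each node.
# Assumptions: default key path, hypothetical node names, passwordless sudo
# for the SSH user on every node.

MUNGE_KEY="${MUNGE_KEY:-/etc/munge/munge.key}"

# Print, one per line, the commands for a single node:
# copy the key, fix ownership and mode, restart munge, test a credential.
munge_key_plan() {
    local node="$1"
    printf '%s\n' \
        "sudo cat $MUNGE_KEY | ssh $node 'sudo tee $MUNGE_KEY >/dev/null'" \
        "ssh $node 'sudo chown munge:munge $MUNGE_KEY && sudo chmod 600 $MUNGE_KEY'" \
        "ssh $node 'sudo systemctl restart munge'" \
        "munge -n | ssh $node unmunge"
}

# Print the plan for every node name passed as an argument.
munge_key_plan_all() {
    local node
    for node in "$@"; do
        echo "# --- $node ---"
        munge_key_plan "$node"
    done
}
```

Calling `munge_key_plan_all node01 node02` lists the commands for two hypothetical nodes. The last command in each group encodes a credential locally and decodes it on the remote node, which only succeeds when both sides share the same key.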
See also: Installation Guide · dun/munge Wiki
Configuring the MariaDB Database
- Install the dependencies:
    sudo apt install mariadb-client mariadb-server libmariadb-dev-compat libmariadb-dev libmariadb3
- Start MariaDB and set the root password:
    sudo systemctl enable mariadb
    sudo systemctl start mariadb
    sudo mysql_secure_installation
- Create the Slurm database and user (replace your_password with a real password). Open a MariaDB session:
    sudo mysql -u root -p
  Then run the following statements at the MariaDB prompt:
    CREATE DATABASE slurm_acct_db;
    CREATE USER 'slurm'@'localhost' IDENTIFIED BY 'your_password';
    GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
    FLUSH PRIVILEGES;
    EXIT;
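The database and user creation can also be scripted rather than typed at the prompt. The following is a minimal sketch, not a definitive procedure: the function name slurm_db_sql is hypothetical, the defaults mirror the names used above, and the password is assumed to contain no single quotes. It adds IF NOT EXISTS so that a second run does not fail on objects that already exist.

```shell
#!/bin/sh
# Sketch: emit the SQL for the Slurm accounting database on stdout.
# Arguments (all optional): database name, user name, password.
# Assumption: the password contains no single quotes.
slurm_db_sql() {
    local db="${1:-slurm_acct_db}"
    local user="${2:-slurm}"
    local pass="${3:-your_password}"
    cat <<SQL
CREATE DATABASE IF NOT EXISTS ${db};
CREATE USER IF NOT EXISTS '${user}'@'localhost' IDENTIFIED BY '${pass}';
GRANT ALL ON ${db}.* TO '${user}'@'localhost';
FLUSH PRIVILEGES;
SQL
}
```

The output can be reviewed first and then piped into the client, for example `slurm_db_sql slurm_acct_db slurm 'your_password' | sudo mysql -u root -p`.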
Installing Slurm
- Install the build dependencies:
    sudo apt install build-essential bzip2
- Download the Slurm source and build it:
    wget https://download.schedmd.com/slurm/slurm-25.05.2.tar.bz2
    tar -xjf slurm-*.tar.bz2 && rm -rf slurm-*.tar.bz2
    cd slurm-*
    ./configure
    make -j$(nproc)
    sudo make install
    cd .. && rm -rf slurm-*
- Edit the Slurm configuration file:
    sudoedit /usr/local/etc/slurm.conf
  and give it the following contents:
    ClusterName=cluster
    SlurmctldHost=controlhost
    SlurmUser=slurm
    SlurmdPort=6818
    SlurmctldPort=6817
    AuthType=auth/munge
    StateSaveLocation=/var/spool/slurm/state
    SlurmdSpoolDir=/var/spool/slurmd
- Create the state save directory (the slurm user named in SlurmUser must already exist on the system):
    sudo mkdir -p /var/spool/slurm/state
    sudo chown slurm: /var/spool/slurm/state
- Start the Slurm services:
    # Slurm controller
    sudo systemctl enable slurmctld
    sudo systemctl start slurmctld
    # Slurm compute node daemon
    sudo systemctl enable slurmd
    sudo systemctl start slurmd
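The configuration shown above defines no compute nodes or partitions, which a working cluster also needs. As a hedged illustration, the sketch below writes the same settings to a file and appends one node line and one partition line. The node names, CPU count, partition name, and the function name write_slurm_conf are all hypothetical placeholders, not values from the original. It uses SlurmctldHost, the current name for the controller-host setting that replaces the older ControlMachine.

```shell
#!/bin/sh
# Sketch: write a minimal slurm.conf to a path chosen by the caller.
# Arguments: output path, controller host name (default: controlhost).
write_slurm_conf() {
    local conf="$1"
    local controller="${2:-controlhost}"
    cat > "$conf" <<EOF
ClusterName=cluster
SlurmctldHost=${controller}
SlurmUser=slurm
SlurmdPort=6818
SlurmctldPort=6817
AuthType=auth/munge
StateSaveLocation=/var/spool/slurm/state
SlurmdSpoolDir=/var/spool/slurmd
# Hypothetical node and partition definitions; adjust to the real hardware.
NodeName=node[01-02] CPUs=4 State=UNKNOWN
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
EOF
}
```

One way to use it is to run `write_slurm_conf ./slurm.conf controlhost`, review the file, and then install it with `sudo install -m 644 ./slurm.conf /usr/local/etc/slurm.conf`. Once the services are running, `sinfo` and `scontrol show nodes` report whether the controller sees the nodes.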
