Submitting a job
Slurm
We use Slurm for cluster/resource management and job scheduling. Slurm is responsible for allocating resources to users, providing a framework for starting, executing, and monitoring work on the allocated resources, and scheduling work for future execution.
Slurm Commands
Command | Description |
---|---|
sinfo | View information about Slurm nodes and partitions |
squeue | Display information about jobs in the queue |
sbatch <script> | Submit a batch job script to Slurm |
scancel <job_id> | Cancel a specific job by its job ID |
srun <command> | Run a command interactively or within a job |
salloc | Allocate resources for an interactive job |
scontrol | View or modify Slurm configuration and state |
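For example, a typical submit-and-monitor workflow with these commands might look like the following (a minimal sketch; the script name myjob.slurm and the job ID are illustrative):
# list partitions and node availability
$ sinfo
# submit a job script; Slurm prints the assigned job ID
$ sbatch myjob.slurm
# list your own pending and running jobs
$ squeue -u $USER
# cancel a job by its ID if it is no longer needed
$ scancel 12345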
Slurm Partitions
Shown below are the Slurm partitions where you can run your jobs. Use cpu-queue
or cpu-mem-queue
if you want to run a CPU-only job, and use gpu-queue
if you want to run on GPU nodes.
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
cpu-queue* up infinite 19 idle~ cpu-queue-dy-c5x4-[1,3-20]
cpu-queue* up infinite 1 alloc cpu-queue-dy-c5x4-2
cpu-mem-queue up infinite 20 idle~ cpu-mem-queue-dy-r6x4-[1-20]
gpu-queue up infinite 22 idle~ gpu-queue-dy-g4x4-[1-20],gpu-queue-dy-g6x12-[1-2]
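You can also limit the listing to a single partition by passing -p to sinfo, for example the gpu-queue partition shown above:
# show only the GPU partition
$ sinfo -p gpu-queue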
To learn more about our HPC system, see here.
Shown below are some simple examples to get started.
$ sbatch -N1 -p cpu-queue --wrap="hostname"
Submitted batch job 68
You can check the status of the job using scontrol show job <JOBID>
. If it shows JobState=CONFIGURING
, Slurm is waiting for a compute node to be provisioned before the job can run, so some delay before the job is dispatched is expected.
$ scontrol show job 68
JobId=68 JobName=wrap
UserId=shahzebsiddiqui93358008(1170) GroupId=shahzebsiddiqui93358008(1170) MCS_label=N/A
Priority=1 Nice=0 Account=(null) QOS=normal
JobState=CONFIGURING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:01:48 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2025-07-17T02:45:27 EligibleTime=2025-07-17T02:45:27
AccrueTime=2025-07-17T02:45:27
StartTime=2025-07-17T02:45:27 EndTime=Unknown Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2025-07-17T02:45:27 Scheduler=Main
Partition=cpu-queue AllocNode:Sid=ip-10-188-48-105:1814577
ReqNodeList=(null) ExcNodeList=(null)
NodeList=cpu-queue-dy-c5x4-1
BatchHost=cpu-queue-dy-c5x4-1
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
ReqTRES=cpu=1,mem=31129M,node=1,billing=1
AllocTRES=cpu=1,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=(null)
WorkDir=/camber/home/shahzebsiddiqui93358008
StdErr=/camber/home/shahzebsiddiqui93358008/slurm-68.out
StdIn=/dev/null
StdOut=/camber/home/shahzebsiddiqui93358008/slurm-68.out
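While the job is in the CONFIGURING or RUNNING state, you can poll just the state line rather than the full record (using job ID 68 from this example):
# print only the JobState line for job 68
$ scontrol show job 68 | grep JobState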
Once the job is complete, you will see the result:
$ cat slurm-68.out
cpu-queue-dy-c5x4-1
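The same one-line job can also be written as a batch script instead of using --wrap. A minimal sketch is shown below; the filename hostname.slurm and the job name are illustrative:
#!/bin/bash
#SBATCH --job-name=hostname
#SBATCH --partition=cpu-queue
#SBATCH --nodes=1
#SBATCH --output=hostname_%j.out

# print the hostname of the allocated compute node
hostname
Submit it with sbatch hostname.slurm; the %j in the output file name is replaced with the job ID.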
Job Script Example
Shown below is an example MPI hello world program, hello.c
, and a Slurm job script, hello.slurm
.
Hello World MPI Example
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
// Initialize the MPI environment
MPI_Init(&argc, &argv);
// Get the rank of the process
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
// Get the total number of processes
int size;
MPI_Comm_size(MPI_COMM_WORLD, &size);
// Print message from each process
printf("Hello from process %d of %d\n", rank, size);
// Finalize the MPI environment
MPI_Finalize();
return 0;
}
#!/bin/bash
#SBATCH --job-name=mpi_hello
#SBATCH --output=mpi_hello_%j.out
#SBATCH --error=mpi_hello_%j.out
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=2
#SBATCH --nodes=2
#SBATCH --time=00:05:00
module load openmpi
mpicc -o mpi_hello hello.c
srun --mpi=pmix ./mpi_hello
The Slurm script runs 4 MPI tasks (--ntasks=4
) with 2 tasks per node (--ntasks-per-node=2
) across 2 nodes (--nodes=2
). We load the openmpi
module so the MPI compiler wrapper is available, compile the source code hello.c
into the executable mpi_hello
using mpicc
, and then launch the executable via srun
.
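The same compile-and-run steps can also be tested interactively inside an allocation obtained with salloc (a sketch using the same resource requests as the script above):
# request an interactive allocation matching the job script
$ salloc --nodes=2 --ntasks=4 --ntasks-per-node=2 --time=00:05:00
# inside the allocation: load MPI, compile, and launch
$ module load openmpi
$ mpicc -o mpi_hello hello.c
$ srun --mpi=pmix ./mpi_hello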
Let’s submit the job via sbatch
and keep track of the job ID; in this example the job ID is 323:
$ sbatch hello.slurm
Submitted batch job 323
Once the job is finished, you can inspect it via scontrol show job <JOBID>
. This
Slurm job writes its output to mpi_hello_323.out
, whose contents are shown below. We see 4 processes, each MPI task printing its own Hello message:
$ cat mpi_hello_323.out
Hello from process 2 of 4
Hello from process 3 of 4
Hello from process 0 of 4
Hello from process 1 of 4
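If job accounting is enabled on the cluster, you can also confirm that the job completed successfully with sacct (using job ID 323 from this example):
# show the final state and exit code of the completed job
$ sacct -j 323 --format=JobID,JobName,State,ExitCode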
Application-specific instructions are available in the Application Support and Build Guidance.