Scheduler Overview (Slurm)
The resource allocation and scheduling software on Anunna is Slurm (originally an acronym for Simple Linux Utility for Resource Management). This page is the entry point: most topics have their own page, and below is a short summary with links.
What's on which page
- Partitions / Queues: list of partitions (main, gpu, gpu_amd) and how to choose one.
- Choosing a node (constraints): defaults, hardware constraints, GPU selection.
- Batch Jobs: writing sbatch scripts and submitting them, including multi-job submissions and dependencies.
- Interactive Jobs: sinteractive and salloc for live shell sessions on a compute node.
- Array jobs: running the same script many times with a varying parameter.
- Monitoring Jobs: squeue, scontrol, sstat, sacct, node_usage_graph.
- Cancelling Jobs: scancel.
- Reservations: booking nodes in advance for events.
Quality of Service
When submitting a job, you may optionally assign a Quality of Service (QoS) to it with the --qos option, for example as a directive in a batch script:
#SBATCH --qos=std
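To show where the directive sits in practice, here is a minimal sketch of a complete batch script. The job name, time limit, and task count are illustrative assumptions, not site recommendations; main is one of the partitions named on this page. The script reads SLURM_JOB_QOS, which Slurm sets inside a job, and falls back to a placeholder so it can also be run with plain bash as a quick check.

```shell
#!/bin/bash
#SBATCH --job-name=qos-demo     # hypothetical job name
#SBATCH --partition=main        # one of the partitions named on this page
#SBATCH --qos=std               # the default QoS, stated explicitly
#SBATCH --time=00:10:00         # requested wall time (HH:MM:SS), illustrative
#SBATCH --ntasks=1              # a single task, illustrative

# Inside a job Slurm exports SLURM_JOB_QOS. The fallback value keeps the
# script harmless when it is run outside Slurm.
job_qos="${SLURM_JOB_QOS:-not-in-a-job}"
echo "QoS for this job: ${job_qos}"
```

Saved as, say, qos-demo.sh (a hypothetical file name), this would be submitted with "sbatch qos-demo.sh". Options given on the sbatch command line take precedence over #SBATCH directives, so "sbatch --qos=low qos-demo.sh" would submit the same script at a different QoS without editing it.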
The QoS values configured on Anunna:
- std (priority 10): the default. Use this unless you have a specific reason to pick another.
- low (priority 1): reduced priority, but limited to 8 hours per job so that a flood of low-priority jobs cannot lock up the cluster.
- high (priority 20): higher priority than std. More expensive; see Tariffs.
- interactive (priority 100): the highest priority, reserved for interactive jobs that need to start immediately. You may not submit many or large jobs at this QoS.
Jobs can in principle be restarted and rescheduled if a higher-priority job needs cluster resources, but at the time of writing this preemption is not actually configured.
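The 8-hour limit on the low QoS is easy to overlook when changing a job's QoS at submission time. The following is a minimal sketch of a pre-submission check; qos_time_ok is a hypothetical helper, not part of Slurm or of the Anunna tooling, and it encodes only the one limit stated on this page. It takes whole hours as its second argument.

```shell
#!/bin/bash
# qos_time_ok QOS HOURS: hypothetical pre-submission check.
# Returns 0 if the request respects the 8-hour limit of the low QoS,
# 1 otherwise. Limits of other QoS values are not checked here.
qos_time_ok() {
    local qos="$1" hours="$2"
    if [ "$qos" = "low" ] && [ "$hours" -gt 8 ]; then
        echo "low QoS is limited to 8 hours per job; requested ${hours}h" >&2
        return 1
    fi
    return 0
}

if qos_time_ok low 6; then
    echo "ok: low, 6h"          # prints "ok: low, 6h"
fi
if ! qos_time_ok low 12 2>/dev/null; then
    echo "rejected: low, 12h"   # prints "rejected: low, 12h"
fi
```

A wrapper script could call such a check before invoking sbatch. The scheduler enforces its configured limits regardless, so this only gives earlier feedback.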
Running MPI jobs
For multi-node MPI workloads see MPI on Anunna.
See also
- Partitions / Queues
- Choosing a node (constraints)
- Batch Jobs
- Interactive Jobs
- Array jobs
- Monitoring Jobs
- Cancelling Jobs
- Reservations
- Costs associated with resource usage