Scheduler Overview (Slurm): Difference between revisions

Latest revision as of 09:48, 16 June 2026

The resource allocation and scheduling software on Anunna is SLURM: Simple Linux Utility for Resource Management. This page is the entry point — most topics have their own page; below is a short summary plus links.

What's on which page

Partitions / Queues — list of partitions (main, gpu, gpu_amd) and how to choose one.
Choosing a node (constraints) — defaults, hardware constraints, GPU selection.
Batch Jobs — writing sbatch scripts and submitting them, including multi-job submissions and dependencies.
Interactive Jobs — sinteractive and salloc for live shell sessions on a compute node.
Array jobs — running the same script many times with a varying parameter.
Monitoring Jobs — squeue, scontrol, sstat, sacct, node_usage_graph.
Cancelling Jobs — scancel.
Reservations — booking nodes in advance for events.

Quality of Service

When submitting a job, you can set its Quality of Service (QoS) explicitly with the --qos option. If you omit it, the job gets the default, std:

#SBATCH --qos=std

The QoS values configured on Anunna:

std (priority 10) — the default. Use this unless you have a specific reason to pick another.
low (priority 1) — reduced priority, but limited to 8 hours per job so a flood of low-priority jobs cannot lock up the cluster.
high (priority 20) — higher priority than std. More expensive — see Tariffs.
interactive (priority 100) — the highest priority, exclusively for immediate-running interactive jobs. You may not submit many or large jobs at this QoS.

In principle, a running job can be preempted (restarted and rescheduled) when a higher-priority job needs its resources, but at the time of writing preemption is not configured on Anunna.
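
As a minimal sketch of how the QoS choice fits into a batch script, the example below requests the low QoS and keeps the requested wall time within its 8-hour limit. The job name, output file name, and the report_qos helper are hypothetical and only for illustration. SLURM_JOB_QOS is the environment variable Slurm sets inside a batch job; the fallback value is there only so the script also runs outside Slurm.

```bash
#!/bin/bash
# Hypothetical job name, for illustration only.
#SBATCH --job-name=qos_demo
# Reduced priority; jobs at this QoS are limited to 8 hours.
#SBATCH --qos=low
# Request no more wall time than the low QoS allows.
#SBATCH --time=08:00:00
#SBATCH --ntasks=1
# %j expands to the job ID.
#SBATCH --output=output_%j.txt

# Hypothetical helper: print the QoS the job is running under.
# Falls back to "unset" when the script is run outside Slurm.
report_qos() {
    echo "QoS: ${SLURM_JOB_QOS:-unset}"
}

report_qos
```

Saved as, say, qos_demo.sh (a hypothetical file name), this would be submitted with sbatch qos_demo.sh. Changing the --qos line to std, or removing it, runs the same job at the default priority.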

Running MPI jobs

For multi-node MPI workloads see MPI on Anunna.

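External links

Slurm official documentation: http://slurm.schedmd.com
Slurm on Wikipedia: http://en.wikipedia.org/wiki/Simple_Linux_Utility_for_Resource_Management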

@@ Line 1: / Line 1: @@
-== submitting jobs: sbatch ==
+The resource allocation and scheduling software on Anunna is [http://en.wikipedia.org/wiki/Simple_Linux_Utility_for_Resource_Management SLURM]: '''S'''imple '''L'''inux '''U'''tility for '''R'''esource '''M'''anagement. This page is the entry point — most topics have their own page; below is a short summary plus links.
-from decimal import *
+== What's on which page ==
-D=Decimal
-getcontext().prec=10000000
-p=sum(D(1)/16**k*(D(4)/(8*k+1)-D(2)/(8*k+4)-D(1)/(8*k+5)-D(1)/(8*k+6))for k in range(411))
-print(str(p)[:10000002])
-<source lang='python'>
+* [[Partitions / Queues]] — list of partitions (<code>main</code>, <code>gpu</code>, <code>gpu_amd</code>) and how to choose one.
-from decimal import *
+* [[Choosing a node (constraints)]] — defaults, hardware constraints, GPU selection.
-D=Decimal
+* [[Batch Jobs]] — writing sbatch scripts and submitting them, including multi-job submissions and dependencies.
-getcontext().prec=10000000
+* [[Interactive Jobs]] — <code>sinteractive</code> and <code>salloc</code> for live shell sessions on a compute node.
-p=sum(D(1)/16**k*(D(4)/(8*k+1)-D(2)/(8*k+4)-D(1)/(8*k+5)-D(1)/(8*k+6))for k in range(411))
+* [[Array jobs]] — running the same script many times with a varying parameter.
-print(str(p)[:10000002])
+* [[Monitoring Jobs]] — <code>squeue</code>, <code>scontrol</code>, <code>sstat</code>, <code>sacct</code>, <code>node_usage_graph</code>.
-</source>
+* [[Cancelling Jobs]] — <code>scancel</code>.
+* [[Reservations]] — booking nodes in advance for events.
-<source lang='bash'>
+== Quality of Service ==
-#!/bin/bash
-# #SBATCH --time=100
-#SBATCH --ntasks=1
-#SBATCH --output=output_%j.txt
-#SBATCH --error=error_output_%j.txt
-#SBATCH --job-name=calc_pi.py
-#SBATCH --partition=research
-time python3 calc_pi.py
+When submitting a job, you may optionally assign a different Quality of Service (QoS) to it:
-</source>
-  JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
+<syntaxhighlight lang="bash">
-  research calc_pi. megen002   R       0:03      1 node049
+#SBATCH --qos=std
+</syntaxhighlight>
-== allocating resources interactively: sallocate ==
+The QoS values configured on Anunna:
-== running MPI jobs on B4F cluster ==
+* '''std''' (priority 10) — the default. Use this unless you have a specific reason to pick another.
+* '''low''' (priority 1) — reduced priority, but limited to 8 hours per job so a flood of low-priority jobs cannot lock up the cluster.
+* '''high''' (priority 20) — higher priority than <code>std</code>. More expensive — see [[Tariffs]].
+* '''interactive''' (priority 100) — the highest priority, exclusively for immediate-running interactive jobs. You may not submit many or large jobs at this QoS.
-== monitoring submitted jobs: squeue ==
+Jobs can in principle be restarted and rescheduled if a higher-priority job needs cluster resources, but at the time of writing this preemption is not actually configured.
-== removing jobs from a list: scancel ==
+== Running MPI jobs ==
-== other ==
+For multi-node MPI workloads see [[MPI on B4F cluster | MPI on Anunna]].
-== external links ==
+== See also ==
+* [[Partitions / Queues]]
+* [[Choosing a node (constraints)]]
+* [[Batch Jobs]]
+* [[Interactive Jobs]]
+* [[Array jobs]]
+* [[Monitoring Jobs]]
+* [[Cancelling Jobs]]
+* [[Reservations]]
+* [[Tariffs | Costs associated with resource usage]]
+== External links ==
+* [http://slurm.schedmd.com Slurm official documentation]
+* [http://en.wikipedia.org/wiki/Simple_Linux_Utility_for_Resource_Management Slurm on Wikipedia]
