Scheduler Overview (Slurm): Difference between revisions

Latest revision as of 09:48, 16 June 2026

The resource allocation and scheduling software on Anunna is SLURM: Simple Linux Utility for Resource Management. This page is the entry point — most topics have their own page; below is a short summary plus links.

What's on which page

Partitions / Queues — list of partitions (main, gpu, gpu_amd) and how to choose one.
Choosing a node (constraints) — defaults, hardware constraints, GPU selection.
Batch Jobs — writing sbatch scripts and submitting them, including multi-job submissions and dependencies.
Interactive Jobs — sinteractive and salloc for live shell sessions on a compute node.
Array jobs — running the same script many times with a varying parameter.
Monitoring Jobs — squeue, scontrol, sstat, sacct, node_usage_graph.
Cancelling Jobs — scancel.
Reservations — booking nodes in advance for events.

Quality of Service

When submitting a job, you may optionally set its Quality of Service (QoS) with the --qos option, for example in a batch script:

#SBATCH --qos=std

The QoS values configured on Anunna:

std (priority 10) — the default. Use this unless you have a specific reason to pick another.
low (priority 1) — reduced priority, but limited to 8 hours per job so a flood of low-priority jobs cannot lock up the cluster.
high (priority 20) — higher priority than std. More expensive — see Tariffs.
interactive (priority 100) — the highest priority, exclusively for immediate-running interactive jobs. You may not submit many or large jobs at this QoS.
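
As an illustration of the list above, a batch script that requests the low QoS might look like the following minimal sketch. The job name, output file name, and the 4-hour time limit are assumptions chosen for the example; only the QoS name and its 8-hour limit come from this page.

```shell
#!/bin/bash
# Hypothetical job name, used only for this example.
#SBATCH --job-name=qos_low_demo
# QoS name taken from the list above.
#SBATCH --qos=low
# Requested wall time; chosen to stay within the 8-hour limit of the low QoS.
#SBATCH --time=04:00:00
#SBATCH --ntasks=1
# %j is replaced by the job ID.
#SBATCH --output=output_%j.txt

# SLURM_JOB_ID is set by Slurm inside a job; fall back to a placeholder elsewhere.
msg="Job ${SLURM_JOB_ID:-unknown} requested QoS low"
echo "$msg"
```

Saved as, say, qos_low_demo.sh (a hypothetical file name), the script would be submitted with `sbatch qos_low_demo.sh`. The same option can also be passed on the command line, as in `sbatch --qos=low qos_low_demo.sh`.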

In principle, a running job can be stopped and rescheduled when a higher-priority job needs cluster resources (preemption), but at the time of writing this is not configured on Anunna.
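
To check how the QoS levels are currently configured, including priorities, wall-time limits, and any preemption settings, the Slurm accounting tool can be queried. The following is a sketch that assumes `sacctmgr` is available on the login node and that ordinary users are allowed to read the QoS table; the output depends entirely on the site configuration.

```shell
#!/bin/bash
# List the configured QoS levels if the Slurm accounting tool is present.
if command -v sacctmgr >/dev/null 2>&1; then
    # Name, Priority, MaxWall, Preempt and PreemptMode are standard sacctmgr QoS fields.
    sacctmgr show qos format=Name,Priority,MaxWall,Preempt,PreemptMode
    qos_status="listed"
else
    echo "sacctmgr not found; run this on a cluster login node" >&2
    qos_status="unavailable"
fi
```

An empty Preempt column for every QoS would be consistent with preemption not being configured.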

Running MPI jobs

For multi-node MPI workloads see MPI on Anunna.

External links

@@ Line 1: / Line 1: @@
-The resource allocation / scheduling software on the B4F Cluster is [http://en.wikipedia.org/wiki/Simple_Linux_Utility_for_Resource_Management SLURM]: '''S'''imple '''L'''inux '''U'''tility for '''R'''esource '''M'''anagement.
+The resource allocation and scheduling software on Anunna is [http://en.wikipedia.org/wiki/Simple_Linux_Utility_for_Resource_Management SLURM]: '''S'''imple '''L'''inux '''U'''tility for '''R'''esource '''M'''anagement. This page is the entry point — most topics have their own page; below is a short summary plus links.
-== Submitting jobs: sbatch ==
+== What's on which page ==
-=== Example ===
+* [[Partitions / Queues]] — list of partitions (<code>main</code>, <code>gpu</code>, <code>gpu_amd</code>) and how to choose one.
-Consider this simple python3 script that should calculate Pi to 1 million digits:
+* [[Choosing a node (constraints)]] — defaults, hardware constraints, GPU selection.
-<source lang='python'>
+* [[Batch Jobs]] — writing sbatch scripts and submitting them, including multi-job submissions and dependencies.
-from decimal import *
+* [[Interactive Jobs]] — <code>sinteractive</code> and <code>salloc</code> for live shell sessions on a compute node.
-D=Decimal
+* [[Array jobs]] — running the same script many times with a varying parameter.
-getcontext().prec=10000000
+* [[Monitoring Jobs]] — <code>squeue</code>, <code>scontrol</code>, <code>sstat</code>, <code>sacct</code>, <code>node_usage_graph</code>.
-p=sum(D(1)/16**k*(D(4)/(8*k+1)-D(2)/(8*k+4)-D(1)/(8*k+5)-D(1)/(8*k+6))for k in range(411))
+* [[Cancelling Jobs]] — <code>scancel</code>.
-print(str(p)[:10000002])
+* [[Reservations]] — booking nodes in advance for events.
-</source>
-=== Loading modules ===
+== Quality of Service ==
-In order for this script to run, the first thing that is needed is that Python3, which is not the default Python version on the cluster, is load into your environment. Availability of (different versions of) software can be checked by the following command:
-  module avail
-In the list you should note that python3 is indeed available to be loaded, which then can be loaded with the following command:
+When submitting a job, you may optionally assign a different Quality of Service (QoS) to it:
-  module load python/3.3.3
-=== Batch script ===
+<syntaxhighlight lang="bash">
-The following shell/slurm script can then be used to schedule the job using the sbatch command:
+#SBATCH --qos=std
-<source lang='bash'>
+</syntaxhighlight>
-#!/bin/bash
-#SBATCH --time=1200
-#SBATCH --ntasks=1
-#SBATCH --output=output_%j.txt
-#SBATCH --error=error_output_%j.txt
-#SBATCH --job-name=calc_pi.py
-#SBATCH --partition=research
-time python3 calc_pi.py
+The QoS values configured on Anunna:
-</source>
-=== Submitting ===
+* '''std''' (priority 10) — the default. Use this unless you have a specific reason to pick another.
-The script, assuming it was named 'run_calc_pi.sh', can then be posted using the following command:
+* '''low''' (priority 1) — reduced priority, but limited to 8 hours per job so a flood of low-priority jobs cannot lock up the cluster.
-<source lang='bash'>
+* '''high''' (priority 20) — higher priority than <code>std</code>. More expensive — see [[Tariffs]].
-sbatch run_calc_pi.sh
+* '''interactive''' (priority 100) — the highest priority, exclusively for immediate-running interactive jobs. You may not submit many or large jobs at this QoS.
-</source>
-=== Submitting multiple jobs ===
+Jobs can in principle be restarted and rescheduled if a higher-priority job needs cluster resources, but at the time of writing this preemption is not actually configured.
-Assuming there are 10 job scripts, name runscript_1.sh through runscript_10.sh, all these scripts can be submitted using the following line of shell code:
-<source lang='bash'>for i in `seq 1 10`; do echo $i; sbatch runscript_$i.sh;done
-</source>
-== monitoring submitted jobs: squeue ==
+== Running MPI jobs ==
-Once a job is submitted, the status can be monitored using the <code>squeue</code> command. The <code>squeue</code> command has a number of parameters for monitoring specific properties of the jobs such as time limit.
-=== Generic monitoring of all running jobs ===
+For multi-node MPI workloads see [[MPI on B4F cluster | MPI on Anunna]].
-<source lang='bash'>
-  squeue
-</source>
-You should then get a list of jobs that are running at that time on the cluster, for the example on how to submit using the 'sbatch' command, it may look like so:
+== See also ==
-    JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
-      ABGC BOV-WUR- megen002   R      27:26      1 node004
-      ABGC BOV-WUR- megen002   R      27:26      1 node005
-      ABGC BOV-WUR- megen002   R      27:26      1 node006
-      ABGC BOV-WUR- megen002   R      27:26      1 node007
-      ABGC BOV-WUR- megen002   R      27:26      1 node008
-      ABGC BOV-WUR- megen002   R      27:26      1 node009
-  research BOV-WUR- megen002   R      44:38      1 node049
-  research BOV-WUR- megen002   R      44:38      1 node050
-  research BOV-WUR- megen002   R      44:38      1 node051
-  research BOV-WUR- megen002   R      44:38      1 node052
-  research BOV-WUR- megen002   R      44:38      1 node053
-  research BOV-WUR- megen002   R      44:38      1 node054
-  research BOV-WUR- megen002   R      44:38      3 node[049-051]
-  research BOV-WUR- megen002   R      44:38      3 node[052-054]
-  research BOV-WUR- megen002   R      44:38      1 node001
-  research BOV-WUR- megen002   R      44:38      1 node002
-  research BOV-WUR- megen002   R      44:38      1 node003
-=== Monitoring time limit set for a specific job ===
+* [[Partitions / Queues]]
-The default time limit is set at one hour. Estimated run times need to be specified when running jobs. To see what the time limit is that is set for a certain job, this can be done using the <code>squeue</code> command.
+* [[Choosing a node (constraints)]]
-<source lang='bash'>
+* [[Batch Jobs]]
-squeue -l -j 3532
+* [[Interactive Jobs]]
-</source>
+* [[Array jobs]]
-Information similar to the following should appear:
+* [[Monitoring Jobs]]
-  Fri Nov 29 15:41:00 2013
+* [[Cancelling Jobs]]
-   JOBID PARTITION     NAME     USER    STATE       TIME TIMELIMIT  NODES NODELIST(REASON)
+* [[Reservations]]
-      ABGC BOV-WUR- megen002  RUNNING    2:47:03 3-08:00:00      1 node054
+* [[Tariffs | Costs associated with resource usage]]
-== removing jobs from a list: scancel ==
+== External links ==
-If for some reason you want to delete a job that is either in the queue or already running, you can remove it using the 'scancel' command. The 'scancel' command takes the jobid as a parameter. For the example above, this would be done using the following code:
-<source lang='bash'>
-scancel 3401
-</source>
-== allocating resources interactively: sallocate ==
-== Get overview of past and current jobs: sacct ==
-To do some accounting on past and present jobs, and to see whether they ran to completion, you can do:
-<source lang='bash'>
-sacct
-</source>
-This should provide information similar to the following:
-         JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
-  ------------ ---------- ---------- ---------- ---------- ---------- --------
-         BOV-WUR-58   research                    12  COMPLETED      0:0
-.batch        batch                                1  COMPLETED      0:0
-         BOV-WUR-59   research                    12 CANCELLED+      0:0
-.batch        batch                                1  CANCELLED     0:15
-         BOV-WUR-59       ABGC                    16    RUNNING      0:0
-         BOV-WUR-60       ABGC                    16    RUNNING      0:0
-== running MPI jobs on B4F cluster ==
-== Understanding which resources are available to you: sinfo ==
-By using the 'sinfo' command you can retrieve information on which 'Partitions' are available to you. A 'Partition' using SLURM is similar to the 'queue' when submitting using the Sun Grid Engine ('qsub'). The different Partitions grant different levels of resource allocation. Not all defined Partitions will be available to any given person. E.g., Master students will only have the Student Partition available, researchers at the ABGC will have 'student', 'research', and 'ABGC' partitions available. The higher the level of  resource allocation, though, the higher the cost per compute-hour. The default Partition is the 'student' partition. A full list of Partitions can be found from the Bright Cluster Manager webpage.
-<source lang='bash'>
-sinfo
-</source>
-  PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
-  student*     up   infinite     12  down* node[043-048,055-060]
-  student*     up   infinite     50   idle fat[001-002],node[001-042,049-054]
-  research     up   infinite     12  down* node[043-048,055-060]
-  research     up   infinite     50   idle fat[001-002],node[001-042,049-054]
-  ABGC         up   infinite     12  down* node[043-048,055-060]
-  ABGC         up   infinite     50   idle fat[001-002],node[001-042,049-054]
-== See also ==
-* [[B4F_cluster | B4F Cluster]]
-* [[BCM_on_B4F_cluster | BCM on B4F cluster]]
-* [[Setting_up_Python_virtualenv | Setting up and using a virtual environment for Python3 ]]
-== External links ==
 * [http://slurm.schedmd.com Slurm official documentation]
 * [http://en.wikipedia.org/wiki/Simple_Linux_Utility_for_Resource_Management Slurm on Wikipedia]
-* [http://www.youtube.com/watch?v=axWffyrk3aY Slurm Tutorial on Youtube]
