Creating sbatch script
Latest revision as of 09:07, 24 April 2024
A skeleton Slurm script
#-----------------------------Mail address-----------------------------
#SBATCH --mail-user=
#SBATCH --mail-type=ALL
#-----------------------------Output files-----------------------------
#SBATCH --output=output_%j.txt
#SBATCH --error=error_output_%j.txt
#-----------------------------Other information------------------------
#SBATCH --comment=
#SBATCH --qos=
#-----------------------------Required resources-----------------------
#SBATCH --time=0-0:0:0
#SBATCH --ntasks=
#SBATCH --cpus-per-task=
#SBATCH --mem-per-cpu=
#-----------------------------Environment, Operations and Job steps----
#load modules
#export variables
#your job
Explanation of used SBATCH parameters
partition for resource allocation
#SBATCH --partition=gpu
Request a specific partition for the resource allocation.
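The partitions available on the cluster, together with their state and node counts, can be listed with the standard Slurm sinfo command (run this on a login node of the cluster):

```shell
# Summarised overview of all partitions: state, time limit, node count.
sinfo -s
```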
Adding accounting information or project number
#SBATCH --comment=773320000
Add accounting information to the job. The comment is an arbitrary string and may be changed after job submission using the scontrol command. For WUR users, a project number or KTP number is advisable.
time limit
#SBATCH --time=1200
A time limit of zero requests that no time limit be imposed. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds". So in this example the job will run for a maximum of 1200 minutes.
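For instance, the same 1200-minute limit can be written in any of the accepted formats:

```shell
#SBATCH --time=1200          # minutes
#SBATCH --time=20:00:00      # hours:minutes:seconds
#SBATCH --time=0-20          # days-hours
```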
memory limit
#SBATCH --mem=2048
SLURM imposes a memory limit on each job. By default it is deliberately small: 100 MB per node. If your job uses more than that, it will fail with the error "Exceeded job memory limit". To set a larger limit, add to your job submission:
#SBATCH --mem=X
where X is the maximum amount of memory your job will use per node, in MB. The larger your working data set, the larger this needs to be, but the smaller the number, the easier it is for the scheduler to find a place to run your job. To determine an appropriate value, start relatively large (job slots on average have about 4000 MB per core, which is much more than most jobs need) and then use sacct to look at how much your job actually used:
$ sacct -o MaxRSS -j JOBID
where JOBID is the one you’re interested in. The number is in KB, so divide by 1024 to get a rough idea of what to use with --mem (set it to something a little larger than that, since you’re defining a hard upper limit). If your job completed long in the past you may have to tell sacct to look further back in time by adding a start time with -S YYYY-MM-DD. Note that for parallel jobs spanning multiple nodes, this is the maximum memory used on any one node; if you’re not setting an even distribution of tasks per node (e.g. with --ntasks-per-node), the same job could have very different values when run at different times.
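The conversion from sacct's KB figure to an --mem value can be scripted. A minimal sketch, in which the MaxRSS value is a stand-in for real sacct output and the 20% headroom is an arbitrary choice:

```shell
#!/bin/sh
# Normally you would read the peak memory of a finished job, e.g.:
#   maxrss_kb=$(sacct -n -o MaxRSS -j "$JOBID" | tr -dc '0-9\n' | sort -n | tail -1)
# Here a stand-in value (1.5 GB expressed in KB) replaces the real sacct output.
maxrss_kb=1572864
# Convert KB to MB, then add roughly 20% headroom, since --mem is a hard limit.
mem_mb=$(( maxrss_kb / 1024 ))
mem_mb=$(( mem_mb + mem_mb / 5 ))
echo "#SBATCH --mem=${mem_mb}"
```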
number of tasks
#SBATCH --ntasks=1
sbatch does not launch tasks; it requests an allocation of resources and submits a batch script. This option advises the SLURM controller that job steps run within the allocation will launch at most this number of tasks, and to provide sufficient resources. The default is one task per node, but note that the --cpus-per-task option will change this default.
When requesting multiple tasks, you may or may not want the job to be partitioned among multiple nodes. You can specify the number of nodes using the -N or --nodes flag. If you provide only one number, it is treated as both the minimum and the maximum. For instance:
#SBATCH --nodes=1
This should force your job to be scheduled to a single node.
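Combining the two options, a request for several tasks confined to one node might look like this (the task count of 4 is illustrative):

```shell
# Request 4 tasks, all placed on a single node.
#SBATCH --ntasks=4
#SBATCH --nodes=1
```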
Because the cluster has a hybrid configuration, i.e. normal and fat nodes, it may be prudent to schedule your job specifically for one or the other node type, depending for instance on memory requirements. This can be done by using the -C or --constraint flag.
constraints: selecting by feature
#SBATCH --constraint=4gpercpu
The HPC nodes have features associated with them, such as Intel CPUs or the amount of memory per CPU. If you know that your job requires a specific architecture or memory size, you can constrain your job to nodes with those features.
The example above will result in jobs being scheduled to compute nodes with 4 GB of memory per CPU. By using 12gpercpu as the option, the job will specifically be scheduled to one of the larger nodes with 12 GB per CPU.
All features can be seen using:
scontrol show nodes | grep ActiveFeatures | sort | uniq
requesting specific resources
#SBATCH --gres=gpu:1
To use specific hardware, you need to request a Generic Resource (GRES). One of the matching resources will then be allocated to your job when it becomes available. In the above example, one GPU is requested.
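Requesting a GPU typically goes together with the gpu partition mentioned earlier. A sketch, using the partition and GRES names from this page:

```shell
# Request one GPU on the gpu partition.
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
```

Slurm will then normally export CUDA_VISIBLE_DEVICES inside the job so that your program only sees the GPU that was allocated to it.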
output (stderr,stdout) directed to file
#SBATCH --output=output_%j.txt
Instruct SLURM to connect the batch script's standard output directly to the file name specified in the "filename pattern". By default both standard output and standard error are directed to a file of the name "slurm-%j.out", where the "%j" is replaced with the job allocation number. See the --input option for filename specification options.
#SBATCH --error=error_output_%j.txt
Instruct SLURM to connect the batch script's standard error directly to the file name specified in the "filename pattern". By default both standard output and standard error are directed to a file of the name "slurm-%j.out", where the "%j" is replaced with the job allocation number. See the --input option for filename specification options.
adding a job name
#SBATCH --job-name=calc_pi.py
Specify a name for the job allocation. The specified name will appear along with the job id number when querying running jobs on the system. The default is the name of the batch script, or just "sbatch" if the script is read on sbatch's standard input.
receiving mailed updates
#SBATCH --mail-type=ALL
Notify user by email when certain event types occur. Valid type values are BEGIN, END, FAIL, REQUEUE, and ALL (any state change). The user to be notified is indicated with --mail-user.
#SBATCH --mail-user=yourname001@wur.nl
The email address to send the notifications to.

See also

Anunna
Submitting jobs to Slurm
Array job hints