== submitting jobs: sbatch ==
Consider this simple Python3 script, which calculates digits of Pi with the Decimal module (the precision set in the script controls how many digits are printed):
<source lang='python'>
from decimal import *
D=Decimal
# Working precision in decimal digits
getcontext().prec=10000000
# Bailey-Borwein-Plouffe series for Pi; each term contributes roughly
# 1.2 decimal digits, so the range() bound limits the accurate digits
p=sum(D(1)/16**k*(D(4)/(8*k+1)-D(2)/(8*k+4)-D(1)/(8*k+5)-D(1)/(8*k+6))for k in range(411))
print(str(p)[:10000002])
</source>
In order for this script to run, Python3, which is not the default Python version on the cluster, first has to be loaded into your environment. The availability of (different versions of) software can be checked with the following command:
<source lang='bash'>
module avail
</source>
The list should show that python3 is indeed available; it can then be loaded with the following command:
<source lang='bash'>
module load python/3.3.3
</source>
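To check that the module took effect, you can list the loaded modules and query the interpreter version (a quick sanity check, assuming the standard Environment Modules commands):
<source lang='bash'>
# Show the modules currently loaded in this shell
module list
# Confirm the Python version the job will use
python3 --version
</source>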
The following shell/slurm script can then be used to schedule the job using the sbatch command. Here '--time' is the wall-clock limit (in minutes when given as a plain number), '--ntasks' the number of tasks, '%j' expands to the job id in the output and error file names, and '--partition' selects the partition to run on:
<source lang='bash'>
#!/bin/bash
#SBATCH --time=100
#SBATCH --ntasks=1
#SBATCH --output=output_%j.txt
#SBATCH --error=error_output_%j.txt
#SBATCH --job-name=calc_pi.py
#SBATCH --partition=research
time python3 calc_pi.py
</source>
The script, assuming it was named 'run_calc_pi.sh', can then be submitted using the following command:
<source lang='bash'>
sbatch run_calc_pi.sh
</source>
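On success, sbatch prints the id assigned to the new job; for the example below that would look like:
 Submitted batch job 3347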
== monitoring submitted jobs: squeue ==
Once a job is submitted, the status can be monitored using the 'squeue' command:
<source lang='bash'>
squeue
</source>
You should then get a list of the jobs running on the cluster at that time. For the job submitted in the 'sbatch' example above, it may look like so:
 JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  3347  research calc_pi. megen002  R       0:03      1 node049
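On a busy cluster the full list can get long; squeue can be restricted to your own jobs or to a single job id (standard squeue options):
<source lang='bash'>
# Show only jobs belonging to a given user
squeue -u megen002
# Show only one specific job
squeue -j 3347
</source>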
== removing jobs from a list: scancel ==
If for some reason you want to delete a job that is either in the queue or already running, you can remove it using the 'scancel' command, which takes the jobid as a parameter. For the example above, this would be done as follows:
<source lang='bash'>
scancel 3347
</source>
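scancel can also select jobs by attributes instead of by id, for instance all of your own jobs, or all jobs with a given name (standard scancel options):
<source lang='bash'>
# Cancel every job belonging to the current user
scancel -u $USER
# Cancel all jobs with a matching job name
scancel --name=calc_pi.py
</source>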
== allocating resources interactively: salloc ==
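A minimal sketch of an interactive session (the resource values here are illustrative, not cluster defaults):
<source lang='bash'>
# Request one task for 10 minutes on the research partition;
# salloc opens a shell that holds the allocation
salloc --ntasks=1 --time=10 --partition=research
# Inside that shell, run commands on the allocated node(s)
srun python3 calc_pi.py
# Leaving the shell releases the allocation
exit
</source>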
== running MPI jobs on B4F cluster ==
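A minimal sketch of an MPI batch script; the program name 'my_mpi_program' and the task count are placeholders, an appropriate MPI module would have to be loaded first, and whether srun or mpirun is used depends on the MPI installation:
<source lang='bash'>
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --partition=research
# srun launches one copy of the (placeholder) program per allocated task
srun ./my_mpi_program
</source>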
== other ==
Information on the available partitions and the state of their nodes can be obtained with the 'sinfo' command:
 PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
 student*     up   infinite     12  down* node[043-048,055-060]
 student*     up   infinite     50   idle fat[001-002],node[001-042,049-054]
 research     up   infinite     12  down* node[043-048,055-060]
 research     up   infinite     50   idle fat[001-002],node[001-042,049-054]
 ABGC         up   infinite     12  down* node[043-048,055-060]
 ABGC         up   infinite     50   idle fat[001-002],node[001-042,049-054]
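sinfo takes options to narrow this down, for example (standard sinfo options):
<source lang='bash'>
# One line per node, for a single partition
sinfo --partition=research -N
</source>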