Using Slurm

From HPCwiki

Revision as of 16:56, 29 November 2013

The resource allocation / scheduling software on the B4F Cluster is SLURM: Simple Linux Utility for Resource Management.

Submitting jobs: sbatch

Example

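Consider this simple Python 3 script, which is intended to calculate Pi to 1 million digits: <source lang='python'>
from decimal import Decimal, getcontext

D = Decimal
# Working precision in significant digits (set to 10 million here).
getcontext().prec = 10000000
# Bailey-Borwein-Plouffe series: each term adds about 1.2 decimal digits,
# so 411 terms give only about the first 490 digits correctly.
# Increase the range to obtain more digits.
p = sum(D(1)/16**k * (D(4)/(8*k+1) - D(2)/(8*k+4) - D(1)/(8*k+5) - D(1)/(8*k+6)) for k in range(411))
print(str(p)[:10000002])
</source>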

Loading modules

In order for this script to run, Python 3, which is not the default Python version on the cluster, must first be loaded into your environment. The availability of software, including different versions, can be checked with the following command:

 module avail

The list should show that python3 is available. It can then be loaded with the following command:

 module load python/3.3.3
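
Inside a batch script it can help to verify that the module load actually put python3 on the PATH before the real work starts. The following is a minimal sketch; the helper name require_cmd is hypothetical and not part of Slurm or the module system.

```bash
#!/bin/bash
# require_cmd is a hypothetical helper: it succeeds if the given command
# can be found on the PATH and prints a hint on stderr otherwise.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found; did you run 'module load'?" >&2
        return 1
    fi
}

# Typical use in a job script, after 'module load python/3.3.3':
#   require_cmd python3 || exit 1
```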

Batch script

The following shell/Slurm script can then be used to schedule the job using the sbatch command: <source lang='bash'>
#!/bin/bash
#SBATCH --time=1200
#SBATCH --ntasks=1
#SBATCH --output=output_%j.txt
#SBATCH --error=error_output_%j.txt
#SBATCH --job-name=calc_pi.py
#SBATCH --partition=research

time python3 calc_pi.py
</source>
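
A bare number given to --time is interpreted by Slurm as minutes, so --time=1200 requests 20 hours. Slurm also accepts the days-hours:minutes:seconds notation, which is the form squeue reports. The sketch below converts one into the other; the function name minutes_to_slurm_time is hypothetical.

```bash
#!/bin/bash
# Convert a plain number of minutes (as in --time=1200) into
# Slurm's days-hours:minutes:seconds notation.
minutes_to_slurm_time() {
    local total="$1"
    local days=$(( total / 1440 ))
    local hours=$(( (total % 1440) / 60 ))
    local mins=$(( total % 60 ))
    printf '%d-%02d:%02d:00\n' "$days" "$hours" "$mins"
}

minutes_to_slurm_time 1200   # prints 0-20:00:00
```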

Submitting

The script, assuming it was named 'run_calc_pi.sh', can then be submitted using the following command: <source lang='bash'> sbatch run_calc_pi.sh </source>

Submitting multiple jobs

Assuming there are 10 job scripts, named runscript_1.sh through runscript_10.sh, all these scripts can be submitted using the following line of shell code: <source lang='bash'>for i in $(seq 1 10); do echo "$i"; sbatch "runscript_$i.sh"; done </source>
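
The loop above can be wrapped in a small function that skips missing scripts and allows a dry run. This is only a sketch: the function name submit_range and the SUBMIT variable are hypothetical conveniences, and the runscript_N.sh naming is taken from the example above.

```bash
#!/bin/bash
# SUBMIT is a hypothetical override; it defaults to sbatch and can be
# set to "echo" to see what would be submitted without submitting.
SUBMIT=${SUBMIT:-sbatch}

# Submit runscript_<first>.sh through runscript_<last>.sh from the
# current directory, skipping any script that does not exist.
submit_range() {
    local first="$1"
    local last="$2"
    local i script
    for i in $(seq "$first" "$last"); do
        script="runscript_${i}.sh"
        if [ -f "$script" ]; then
            $SUBMIT "$script"
        else
            echo "skipping missing $script" >&2
        fi
    done
}
```

Calling `submit_range 1 10` would submit all ten scripts, while `SUBMIT=echo submit_range 1 10` only lists the scripts that exist.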

Monitoring submitted jobs: squeue

Once a job is submitted, the status can be monitored using the squeue command. The squeue command has a number of parameters for monitoring specific properties of the jobs such as time limit.

Generic monitoring of all running jobs

<source lang='bash'>

 squeue

</source>

You should then get a list of the jobs that are running on the cluster at that time. For jobs submitted with the 'sbatch' command as in the example above, it may look like this:

   JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
  3396      ABGC BOV-WUR- megen002   R      27:26      1 node004
  3397      ABGC BOV-WUR- megen002   R      27:26      1 node005
  3398      ABGC BOV-WUR- megen002   R      27:26      1 node006
  3399      ABGC BOV-WUR- megen002   R      27:26      1 node007
  3400      ABGC BOV-WUR- megen002   R      27:26      1 node008
  3401      ABGC BOV-WUR- megen002   R      27:26      1 node009
  3385  research BOV-WUR- megen002   R      44:38      1 node049
  3386  research BOV-WUR- megen002   R      44:38      1 node050
  3387  research BOV-WUR- megen002   R      44:38      1 node051
  3388  research BOV-WUR- megen002   R      44:38      1 node052
  3389  research BOV-WUR- megen002   R      44:38      1 node053
  3390  research BOV-WUR- megen002   R      44:38      1 node054
  3391  research BOV-WUR- megen002   R      44:38      3 node[049-051]
  3392  research BOV-WUR- megen002   R      44:38      3 node[052-054]
  3393  research BOV-WUR- megen002   R      44:38      1 node001
  3394  research BOV-WUR- megen002   R      44:38      1 node002
  3395  research BOV-WUR- megen002   R      44:38      1 node003
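
With many jobs in the list, a per-partition tally is often easier to read than the raw output. The sketch below assumes the default squeue column layout shown above, with PARTITION as the second column; the function name count_by_partition is hypothetical.

```bash
#!/bin/bash
# Read squeue output on stdin, skip the header line and count the
# number of jobs in each partition (second column).
count_by_partition() {
    awk 'NR > 1 && NF > 1 { n[$2]++ } END { for (p in n) print p, n[p] }' | LC_ALL=C sort
}

# Typical use on the cluster:
#   squeue | count_by_partition
```

For the listing above this would report 6 jobs in ABGC and 11 in research. To restrict the listing itself, squeue also accepts `-u` for a user name and `-p` for a partition.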

Monitoring time limit set for a specific job

The default time limit is one hour. An estimated run time needs to be specified when submitting a job. To see the time limit that is set for a specific job, use the squeue command: <source lang='bash'> squeue -l -j 3532 </source> Information similar to the following should appear:

 Fri Nov 29 15:41:00 2013
  JOBID PARTITION     NAME     USER    STATE       TIME TIMELIMIT  NODES NODELIST(REASON)
  3532      ABGC BOV-WUR- megen002  RUNNING    2:47:03 3-08:00:00      1 node054
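
The TIME and TIMELIMIT columns can be compared to see how much time a job has left. The sketch below handles only the forms that appear in the listings on this page (M:SS, H:MM:SS and D-HH:MM:SS), not values such as 'infinite'; the function name slurm_time_to_seconds is hypothetical.

```bash
#!/bin/bash
# Convert a squeue time value (M:SS, H:MM:SS or D-HH:MM:SS) to seconds.
slurm_time_to_seconds() {
    echo "$1" | awk '{
        days = 0; t = $0
        i = index(t, "-")
        if (i > 0) { days = substr(t, 1, i - 1); t = substr(t, i + 1) }
        n = split(t, f, ":")
        if (n == 3) secs = f[1] * 3600 + f[2] * 60 + f[3]
        else        secs = f[1] * 60 + f[2]
        print days * 86400 + secs
    }'
}

# Remaining time for job 3532 in the listing above.
limit=$(slurm_time_to_seconds 3-08:00:00)
used=$(slurm_time_to_seconds 2:47:03)
echo $(( limit - used ))   # prints 277977
```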

Removing jobs from a list: scancel

If for some reason you want to delete a job that is either in the queue or already running, you can remove it using the 'scancel' command. The 'scancel' command takes the jobid as a parameter. For the example above, this would be done using the following code: <source lang='bash'> scancel 3401 </source>
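
When several jobs need to be removed, the job IDs can be pulled out of the squeue listing first. The sketch below assumes the default squeue column layout, with JOBID first and PARTITION second; the function name job_ids_in_partition is hypothetical.

```bash
#!/bin/bash
# Read squeue output on stdin and print the job IDs (first column)
# of all jobs in the partition given as the first argument.
job_ids_in_partition() {
    awk -v part="$1" 'NR > 1 && $2 == part { print $1 }'
}

# Dry run on the cluster, printing the scancel command instead of running it:
#   squeue -u "$USER" | job_ids_in_partition ABGC | xargs echo scancel
```

Removing the `echo` from the last command would actually cancel the jobs, so it is worth checking the dry-run output first.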

Allocating resources interactively: salloc

Get overview of past and current jobs: sacct

To do some accounting on past and present jobs, and to see whether they ran to completion, you can do: <source lang='bash'> sacct </source> This should provide information similar to the following:

        JobID    JobName  Partition    Account  AllocCPUS      State ExitCode 
 ------------ ---------- ---------- ---------- ---------- ---------- -------- 
 3385         BOV-WUR-58   research                    12  COMPLETED      0:0 
 3385.batch        batch                                1  COMPLETED      0:0 
 3386         BOV-WUR-59   research                    12 CANCELLED+      0:0 
 3386.batch        batch                                1  CANCELLED     0:15 
 3528         BOV-WUR-59       ABGC                    16    RUNNING      0:0 
 3529         BOV-WUR-60       ABGC                    16    RUNNING      0:0
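
To pick out only the jobs that did not finish normally, the State column can be filtered. The sketch below assumes the default sacct layout shown above, where State and ExitCode are the last two columns; the function name problem_jobs is hypothetical.

```bash
#!/bin/bash
# Read sacct output on stdin, skip the two header lines and print
# JobID, State and ExitCode for every entry that is neither COMPLETED
# nor RUNNING. State is taken as the second-to-last field because the
# Partition and Account columns may be blank.
problem_jobs() {
    awk 'NR > 2 && NF >= 3 && $(NF-1) !~ /^(COMPLETED|RUNNING)/ { print $1, $(NF-1), $NF }'
}

# Typical use on the cluster:
#   sacct | problem_jobs
```

For the listing above this would print the two entries for job 3386.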

Running MPI jobs on B4F cluster

Understanding which resources are available to you: sinfo

By using the 'sinfo' command you can retrieve information on which 'Partitions' are available to you. A 'Partition' in SLURM is similar to a 'queue' when submitting with the Sun Grid Engine ('qsub'). The different Partitions grant different levels of resource allocation, and not all defined Partitions will be available to any given person. For example, Master students will only have the 'student' Partition available, whereas researchers at the ABGC will have the 'student', 'research', and 'ABGC' Partitions available. The higher the level of resource allocation, though, the higher the cost per compute-hour. The default Partition is the 'student' Partition. A full list of Partitions can be found on the Bright Cluster Manager webpage.

<source lang='bash'> sinfo </source>

 PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
 student*     up   infinite     12  down* node[043-048,055-060]
 student*     up   infinite     50   idle fat[001-002],node[001-042,049-054]
 research     up   infinite     12  down* node[043-048,055-060]
 research     up   infinite     50   idle fat[001-002],node[001-042,049-054]
 ABGC         up   infinite     12  down* node[043-048,055-060]
 ABGC         up   infinite     50   idle fat[001-002],node[001-042,049-054]
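
In the listing above the default Partition is marked with an asterisk, and the STATE column shows which nodes are free. The sketch below extracts both, assuming the default sinfo column layout shown here; the function names idle_nodes and default_partition are hypothetical.

```bash
#!/bin/bash
# Read sinfo output on stdin and print the partition name and node
# count for every line whose STATE (fifth column) is exactly "idle".
idle_nodes() {
    awk 'NR > 1 && $5 == "idle" { sub(/\*$/, "", $1); print $1, $4 }'
}

# Read sinfo output on stdin and print the name of the first partition
# marked with a trailing asterisk, which denotes the default partition.
default_partition() {
    awk 'NR > 1 && $1 ~ /\*$/ { sub(/\*$/, "", $1); print $1; exit }'
}

# Typical use on the cluster:
#   sinfo | idle_nodes
#   sinfo | default_partition
```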

See also

External links