Performance Optimization/Multiple CPUs
Many programs can use more than one CPU core at once — through threads, OpenMP, or built-in multiprocessing. To use several cores on a single node, request them from the scheduler and tell your program how many to use. To scale beyond one node, see MPI or job arrays.
Requesting cores
For a threaded program, ask SLURM for cores on one node with --cpus-per-task — for example, 8 cores:
#!/bin/bash
#SBATCH --job-name=threads
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=01:00:00
# many tools read this to decide how many threads to use
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_threaded_program
Requesting more cores than your program can actually use wastes the allocation and makes your job wait longer in the queue, so match the request to what the program will use.
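One rough way to check whether a request matched actual use is to compare the CPU time a finished job consumed with the wall time multiplied by the cores allocated. The sketch below shows only the arithmetic. The function name cpu_efficiency and the numbers are made up for illustration; on a SLURM cluster the real values would typically come from the accounting tools (for example the TotalCPU, Elapsed, and AllocCPUS fields reported by sacct).

```shell
# Rough CPU efficiency in percent:
#   CPU seconds used / (wall-clock seconds * cores allocated)
# Arguments: $1 = CPU seconds, $2 = wall-clock seconds, $3 = cores.
# Hypothetical helper, not part of SLURM; uses integer arithmetic.
cpu_efficiency() {
    echo $(( 100 * $1 / ($2 * $3) ))
}

# Illustrative numbers only: a 1-hour job on 8 cores that used
# 2 hours of CPU time in total kept about 2 of its 8 cores busy.
cpu_efficiency 7200 3600 8   # prints 25
```

A result well below 100 suggests the program used fewer cores than were requested, so a smaller --cpus-per-task may be enough.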
Telling your program how many cores to use
Most programs need to be told how many cores to use; they do not pick it up automatically. Common ways:
- OpenMP programs read the OMP_NUM_THREADS environment variable; set it from $SLURM_CPUS_PER_TASK, as above.
- Many tools take a flag, for example -t, --threads, or -p; check the program's documentation.
- In Python, R, and similar, use the language's multiprocessing or parallel facilities and pass the core count explicitly.
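As a minimal sketch of the first two approaches, the snippet below reads the core count from the scheduler once and reuses it for both the OpenMP variable and a command-line flag. The variable name NCORES, the tool name my_tool, its --threads flag, and input.dat are placeholders for illustration; the fallback to 1 core when the script runs outside a job is also an assumption, not something the scheduler requires.

```shell
#!/bin/bash
# Use the allocation from SLURM if it is set; otherwise fall back
# to 1 core (for example when testing outside a job).
NCORES="${SLURM_CPUS_PER_TASK:-1}"

# OpenMP programs read this variable at startup.
export OMP_NUM_THREADS="$NCORES"

# Hypothetical tool that takes its thread count as a flag.
# Replace the echo with the real command and its documented flag.
echo "my_tool --threads $NCORES input.dat"
```

Keeping the count in one variable means that changing --cpus-per-task in the job script is enough; the program's thread count follows the allocation.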