MPI on B4F cluster
A simple 'Hello World' example
Consider the following simple MPI version, in C, of the 'Hello World' example:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char ** argv) {
    int size, rank, namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    /* Start the MPI runtime before any other MPI call. */
    MPI_Init(&argc, &argv);

    /* Rank of this process and total number of processes. */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Name of the node this process runs on. */
    MPI_Get_processor_name(processor_name, &namelen);

    printf("Hello MPI! Process %d of %d on %s\n", rank, size, processor_name);

    MPI_Finalize();
    return 0;
}
Before compiling, make sure that the required compilers are available by listing the currently loaded modules:
module list
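As a complement to `module list`, a small shell sketch like the following can report whether the tools used on this page are actually reachable on the `PATH`. The helper name `check_tool` is hypothetical and not part of the module system; the list of tools is an assumption based on the commands used further down.

```shell
#!/bin/sh
# Report whether a command is reachable on the current PATH.
# check_tool is a hypothetical helper name, not part of the module system.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Tools used later on this page; a "missing" line suggests that the
# corresponding module still has to be loaded.
for tool in gcc mpicc srun; do
    check_tool "$tool" || true
done
```

A "missing" result for `mpicc` or `srun` would typically mean that the openmpi or slurm module is not loaded in the current shell.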
To avoid conflicts between libraries, the safest approach is to purge all modules first:
module purge
Then load both gcc and openmpi libraries. If modules were purged, then slurm needs to be reloaded too.
module load gcc openmpi/gcc slurm
Compile the hello_mpi.c code:
mpicc hello_mpi.c -o test_hello_world
If desired, a list of libraries compiled into the executable can be viewed:
ldd test_hello_world
linux-vdso.so.1 (0x00007ffc6fb18000)
libmpi.so.40 => /usr/lib/x86_64-linux-gnu/libmpi.so.40 (0x000014d19dfb2000)
libpthread.so.0 => /usr/lib/x86_64-linux-gnu/libpthread.so.0 (0x000014d19df8f000)
libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x000014d19dd9d000)
libopen-rte.so.40 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.40 (0x000014d19dce3000)
libopen-pal.so.40 => /usr/lib/x86_64-linux-gnu/libopen-pal.so.40 (0x000014d19dc33000)
libm.so.6 => /usr/lib/x86_64-linux-gnu/libm.so.6 (0x000014d19dae4000)
libhwloc.so.15 => /usr/lib/x86_64-linux-gnu/libhwloc.so.15 (0x000014d19da93000)
/lib64/ld-linux-x86-64.so.2 (0x000014d19e0d9000)
libz.so.1 => /usr/lib/x86_64-linux-gnu/libz.so.1 (0x000014d19da77000)
libevent-2.1.so.7 => /usr/lib/x86_64-linux-gnu/libevent-2.1.so.7 (0x000014d19da21000)
libdl.so.2 => /usr/lib/x86_64-linux-gnu/libdl.so.2 (0x000014d19da1b000)
libutil.so.1 => /usr/lib/x86_64-linux-gnu/libutil.so.1 (0x000014d19da14000)
libevent_pthreads-2.1.so.7 => /usr/lib/x86_64-linux-gnu/libevent_pthreads-2.1.so.7 (0x000014d19da0f000)
libudev.so.1 => /usr/lib/x86_64-linux-gnu/libudev.so.1 (0x000014d19d9e3000)
libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x000014d19d9d8000)
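Most of the listing above consists of system libraries. To see quickly which MPI implementation the executable is linked against, the listing can be filtered, as in this minimal sketch. The helper name `mpi_libs` is hypothetical, and the pattern only covers the Open MPI library names that appear in the listing above; other MPI implementations would need a different pattern.

```shell
#!/bin/sh
# Keep only the MPI-related lines of an ldd listing read from stdin.
# mpi_libs is a hypothetical helper name; the pattern matches the Open MPI
# library names that appear in the listing on this page.
mpi_libs() {
    grep -E 'libmpi|libopen-(rte|pal)'
}

# Real use would be:  ldd test_hello_world | mpi_libs
# Here three lines copied from the listing stand in for the ldd output.
mpi_libs <<'EOF'
libmpi.so.40 => /usr/lib/x86_64-linux-gnu/libmpi.so.40 (0x000014d19dfb2000)
libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x000014d19dd9d000)
libopen-rte.so.40 => /usr/lib/x86_64-linux-gnu/libopen-rte.so.40 (0x000014d19dce3000)
EOF
```

The paths in the filtered lines show which installation the libraries are taken from, which helps to spot a mismatch between the loaded module and the linked library.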
Running the executable on two nodes, with four tasks per node, can be done like this:
srun --nodes=2 --ntasks-per-node=4 --mpi=openmpi ./test_hello_world
This will result in the following output:
Hello MPI! Process 4 of 8 on node011
Hello MPI! Process 1 of 8 on node010
Hello MPI! Process 7 of 8 on node011
Hello MPI! Process 6 of 8 on node011
Hello MPI! Process 5 of 8 on node011
Hello MPI! Process 2 of 8 on node010
Hello MPI! Process 0 of 8 on node010
Hello MPI! Process 3 of 8 on node010
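The lines appear in no particular order because the eight processes run concurrently and each prints on its own. For checking a run, it can help to sort the output by rank and to count the ranks per node, as in the following sketch. Both helper names (`sort_by_rank`, `ranks_per_node`) are hypothetical, and the field positions assume the exact message format printed by the example program.

```shell
#!/bin/sh
# Helpers for inspecting "Hello MPI! Process R of N on HOST" lines.
# Both function names are hypothetical.

# Order the lines by rank R, which is the fourth whitespace-separated field.
sort_by_rank() {
    sort -n -k4,4
}

# Count how many ranks reported from each host (the last field).
ranks_per_node() {
    awk '{ count[$NF]++ } END { for (h in count) print h, count[h] }' | sort
}

# Real use would pipe the srun command shown earlier into one of the helpers.
# Here the sample output from this page stands in for it.
sample='Hello MPI! Process 4 of 8 on node011
Hello MPI! Process 1 of 8 on node010
Hello MPI! Process 7 of 8 on node011
Hello MPI! Process 6 of 8 on node011
Hello MPI! Process 5 of 8 on node011
Hello MPI! Process 2 of 8 on node010
Hello MPI! Process 0 of 8 on node010
Hello MPI! Process 3 of 8 on node010'

printf '%s\n' "$sample" | sort_by_rank
printf '%s\n' "$sample" | ranks_per_node
```

With two nodes and four tasks per node, the per-node count should show four ranks on each of the two hosts.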
A mvapich2 sbatch example
The following example runs an MPI job with mvapich2 on 32 cores, using the normal compute nodes and the fast InfiniBand interconnect for RDMA traffic.
<source lang='bash'>
$ module load mvapich2/gcc
$ vim batch.sh
#!/bin/sh
#SBATCH --comment=projectx
#SBATCH --time=30-0
#SBATCH -n 32
#SBATCH --constraint=4gpercpu
#SBATCH --output=output_%j.txt
#SBATCH --error=error_output_%j.txt
#SBATCH --job-name=MPItest
#SBATCH --mail-type=ALL
#SBATCH --mail-user=user@wur.nl

echo "Starting at `date`"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running on $SLURM_NPROCS processors."
echo "Current working directory is `pwd`"

# echo "Env var MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE is $MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE"
# export MPIR_CVAR_NEMESIS_TCP_NETWORK_IFACE=ib0
mpirun -iface ib0 -np 32 ./tmf_par.out -NX 480 -NY 240 -alpha 11 -chi 1.3 -psi_b 5e-2 -beta 0.0 -zeta 3.5 -kT 0.10
echo "Program finished with exit code $? at: `date`"
$ sbatch batch.sh
</source>
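The batch script above states the task count twice, once as `#SBATCH -n 32` and once as `mpirun -np 32`. One possible way to keep the two in sync, sketched here and not taken from the original script, is to read the count from the `SLURM_NTASKS` environment variable that Slurm sets inside a job when a task count was requested. The helper name `task_count` and the fallback value of 1 are assumptions made so that the sketch also runs outside a job.

```shell
#!/bin/sh
# Print the number of tasks of the current Slurm job.
# task_count is a hypothetical helper name; the fallback of 1 is an
# assumption for running this sketch outside a Slurm job.
task_count() {
    echo "${SLURM_NTASKS:-1}"
}

# Only print the command here instead of launching it, so the sketch is
# safe to run anywhere. The program arguments are left out for brevity.
np="$(task_count)"
echo "Would launch: mpirun -iface ib0 -np $np ./tmf_par.out"
```

In the batch script itself, the `mpirun` line would then use `-np "$SLURM_NTASKS"` in place of the hard-coded 32, so that changing `#SBATCH -n` is enough to resize the job.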