CCF (Core Computational Facility) @ UQ run by ITS / SMP



Howto - Run MPI Calculations

This page is not a guide to why or how you should parallelise your code; there are many reasons to do so. MPI (Message Passing Interface) is a protocol, as simple or as complicated as you care to make it, that can be added to your programs so that the multiple processes of a calculation can talk to each other and share data.
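
As a minimal illustration only (the file name hello_mpi.c is a placeholder, not an SMP example), here is a small MPI program in C: every process reports its rank, and rank 1 sends an integer to rank 0.

/* hello_mpi.c - minimal MPI sketch: each process prints its rank,
 * and rank 1 sends one integer to rank 0. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes?   */

    printf("Hello from rank %d of %d\n", rank, size);

    if (size > 1) {
        int value = 0;
        if (rank == 1) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0) {
            MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Rank 0 received %d from rank 1\n", value);
        }
    }

    MPI_Finalize();                         /* shut the MPI runtime down */
    return 0;
}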

The easiest way to set (and change) which MPI-enabled compiler/launcher you are using on linux machines is to use environment-modules. Most *nix/HPC systems use environment-modules these days.


MPI Setup - quick method

The first thing you need to do on the SMP systems is to find an MPI-capable C, C++ or fortran compiler wrapper, generally called mpicc, mpicxx or mpif90.

module list
module avail
locate mpicc
locate mpirun

With modules, the way to compile MPI code is to load the compiler and then the MPI implementation, e.g. to use the default system GCC compiler with Open MPI:

module purge
module load mpi/openmpi-x86_64
which mpicc
mpicc -v
which mpirun
mpirun --version
ompi_info

which will show you which compiler and Open MPI version have been loaded.

Then, to compile your code:

mpif90 mycode.f90
mpicc mycode.c
mpicxx mycode.cpp

You will need to add the same module load commands to your Slurm batch file. After compiling, you then use the related launcher, mpirun (or mpiexec).
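
For example, to build the hello_mpi.c sketch from the top of this page (the output name is just illustrative):

mpicc hello_mpi.c -o hello_mpi

and then launch the resulting hello_mpi with mpirun as described in the following sections.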


Running MPI code - interactively

So you've been able to write and compile your code with mpicc etc. In general, once you've loaded the correct MPI module, the launcher mpirun will already be in your path (mpirun, mpiexec and orterun are all synonyms for each other in Open MPI).

For example, to debug a calculation with 8 processes in an interactive session on a single node:
srun --nodes=1 --ntasks=8 --ntasks-per-node=8 --cpus-per-task=1 --mem=4GB --pty bash
module load mpi/openmpi-x86_64
mpirun ./yourmpiprogram.exe
exit

Note that explicitly passing the task count, as in
mpirun -n ${SLURM_NPROCS} ./yourmpiprogram.exe
might confuse Slurm and result in the wrong number of tasks being allocated on a node (or nodes). (Although circa 2020 one form works on some nodes and the other does not! - MB. Why???)


Running MPI code - batch

Running on getafix or dogmatix requires you to submit a Slurm batch job.

The following is a sample Slurm script (goslurm.sh) for 4 MPI processes (with OpenMP turned off):
#!/bin/bash
#SBATCH --partition=smp
#SBATCH --job-name=go1x4x4x1
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

echo "This is job '$SLURM_JOB_NAME' (id: $SLURM_JOB_ID) running on the following nodes:"
echo $SLURM_NODELIST
echo "running with OMP_NUM_THREADS= $OMP_NUM_THREADS "
echo "running with SLURM_TASKS_PER_NODE= $SLURM_TASKS_PER_NODE "
echo "running with SLURM_NPROCS= $SLURM_NPROCS "

module load mpi/openmpi-x86_64
time mpirun ./yourmpiprogram.exe
### the following might fix OR break the slurm scheduler...
## time mpirun -n ${SLURM_NPROCS} ./yourmpiprogram.exe
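
Assuming the script above is saved as goslurm.sh in your working directory, you would then queue and monitor the job with the usual Slurm commands, for example:

sbatch goslurm.sh
squeue -u $USER
cat slurm-<jobid>.out     # Slurm writes the job output here by default

(replace <jobid> with the job id that sbatch reports).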

Word of warning: mixing threads

Note that you could potentially link your mpicc, mpicxx or mpif90 compiled code against the Intel MKL library, or your code might have threading/OpenMP enabled. In these cases running with MPI is dangerous: when each MPI process may spawn multiple threads, you must request enough Slurm resources (cores) to cover every MPI process and all of the threads it spawns.
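
As a sketch only (the 4 tasks x 2 threads split below is illustrative, not a recommendation for any particular code), a hybrid MPI + threads request would look something like:

#SBATCH --nodes=1
#SBATCH --ntasks=4                # number of MPI processes
#SBATCH --cpus-per-task=2         # threads per MPI process

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export MKL_NUM_THREADS=${SLURM_CPUS_PER_TASK}   # only relevant if linked against MKL

module load mpi/openmpi-x86_64
time mpirun ./yourmpiprogram.exe

i.e. ntasks x cpus-per-task cores in total (here 4 x 2 = 8), rather than just ntasks.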


MPI Versions - open source

Both getafix and dogmatix are set up for MPI-based calculations, and both connect to the same ethernet network switches (minimum 10 Gb/s, with 40 Gb/s interconnects).

There are multiple open-source implementations of the MPI protocol on the SMP systems (MPI-3.1 is the latest standard), including:

Open MPI (module mpi/openmpi-x86_64)
MPICH (module mpi/mpich-x86_64)


MPI Versions - closed source

There are also closed-source implementations of the MPI protocol, such as:

Intel MPI


Installing the MPI compilers and environment-modules - other linux systems

On your own linux system/s you can install MPI implementations, e.g. on a Fedora-style system:
sudo dnf install environment-modules openmpi openmpi-devel blacs-openmpi scalapack-openmpi mpich mpich-devel mpich-doc blacs-mpich scalapack-mpich
Note that Open MPI is daemon-free and just requires passwordless ssh between machines. Older MPD-based MPICH (mpich2) installations need a daemon started on each machine via:
mpd &
(newer MPICH releases use the Hydra launcher, which, like Open MPI, only needs ssh).

There used to be a distinct package called mpich2 (etc.), which still exists in some older repositories. The latest mpich packages implement up to MPI-3 and are now the default.

Compiling MPI code - other linux systems

To see which module to use:
[user@linux ~]$ module list
No Modulefiles Currently Loaded.
[user@linux ~]$ module avail

--------------------- /etc/modulefiles ---------------------------
mpi/mpich-x86_64
---------------------------- /usr/share/modulefiles --------------
mpi/openmpi-x86_64

and then load one of them
[user@linux ~]$ module load mpi/openmpi-x86_64
[user@linux ~]$ module load mpi/mpich-x86_64
Note that this can also be added to your ~/.bashrc to automatically load at login.

Now you should be able to compile your MPI-capable C or fortran code.

For example, to run a calculation with 8 processes on a standalone linux machine:
mpirun -np 8 ./a.out
If you use the host/machinefile options, e.g. mpirun -np 8 -npernode 1 --host ... , you can run these calculations across multiple linux machines; see the sketch below. [Note that the -np option works for standalone systems, but might break the Slurm MPI scheduling if used inside Slurm scripts.]
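
As a sketch (hosts.txt and the node names below are made up for illustration), an Open MPI hostfile run across two machines could look like this, and requires the passwordless ssh mentioned above:

cat hosts.txt
node1 slots=4
node2 slots=4

mpirun -np 8 --hostfile hosts.txt ./a.out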


This page last updated 3rd July 2020.