Howto - Run MPI Calculations
This is not a guide on why (or how) to parallelise your code; there are many reasons you might want to do so. This page covers how to compile and run MPI calculations.
MPI (Message Passing Interface) is a protocol, as simple or as involved as you care to make it, that can be added to computer programs so that multiple processes can communicate and share data with each other.
The easiest way to set (and change) which MPI-enabled compiler/launcher you are using on Linux machines is to use environment-modules; most *nix/HPC systems use environment-modules these days.
MPI Setup - quick method
The first thing you need to do on the SMP systems is to find an MPI-capable C, C++, or Fortran compiler, generally called mpicc, mpicxx, or mpif90. To see what is available, try:
@@
module list
module avail
locate mpicc
locate mpirun
@@
With modules, the way to compile MPI code is to load the compiler and then the MPI implementation. For example, to use the default system GCC compiler with Open MPI:
@@
module purge
module load mpi/openmpi-x86_64
which mpicc
mpicc -v
which mpirun
mpirun --version
ompi_info
@@
which will show you what compiler and Open MPI version have been loaded.
Then, to compile your code:
@@
mpif90 mycode.f90
mpicc mycode.c
mpicxx mycode.cpp
@@
You will need to add the same module load commands to your Slurm batch file.
After compiling, you then use the related launcher mpirun (or mpiexec).
Running MPI code - interactively
So you've been able to write and compile your codes with mpicc etc. In general, once you've loaded the correct MPI module, the MPI launcher mpirun will already be in your path (mpirun, mpiexec, and orterun are all synonyms for each other in Open MPI).
For example, to debug a calculation interactively with 8 processes on a single node:
@@
srun --nodes=1 --ntasks=8 --ntasks-per-node=8 --cpus-per-task=1 --mem=4GB --pty bash
module load mpi/openmpi-x86_64
mpirun ./yourmpiprogram.exe
exit
@@
Note that doing the following might confuse Slurm
@@
mpirun -n ${SLURM_NPROCS} ./yourmpiprogram.exe
@@
and result in the wrong number of tasks being allocated on a node (although circa 2020 this form works on some nodes and the other does not! - MB. Why???)
Running MPI code - batch
To run on <FONT FACE=Courier>getafix</FONT> or <FONT FACE=Courier>dogmatix</FONT> requires you to submit a Slurm job such as the following.
The following is a sample Slurm script that you might use for 4 MPI processes (with OpenMP turned off):
@@
#!/bin/bash
#SBATCH --partition=smp
#SBATCH --job-name=go1x4x4x1
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
echo "This is job '$SLURM_JOB_NAME' (id: $SLURM_JOB_ID) running on the following nodes:"
echo $SLURM_NODELIST
echo "running with OMP_NUM_THREADS= $OMP_NUM_THREADS "
echo "running with SLURM_TASKS_PER_NODE= $SLURM_TASKS_PER_NODE "
echo "running with SLURM_NPROCS= $SLURM_NPROCS "
module load mpi/openmpi-x86_64
time mpirun ./yourmpiprogram.exe
### the following might fix OR break the slurm scheduler...
## time mpirun -n ${SLURM_NPROCS} ./yourmpiprogram.exe
@@
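The script above is submitted to the queue with sbatch. A hedged sketch (myjob.sh is a placeholder filename; the command -v guard just makes the snippet a no-op on machines without Slurm installed):

```shell
# Sketch: submit and monitor the batch script above.
# "myjob.sh" is a placeholder name for the script file.
if command -v sbatch >/dev/null; then
    sbatch myjob.sh          # Slurm replies "Submitted batch job <id>"
    squeue -u "$USER"        # list your queued and running jobs
else
    echo "Slurm not installed here"
fi
```

If you need to cancel a job, scancel followed by the job id that sbatch reported will do it.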
Word of warning: mixing threads
Note that you could potentially link your mpicc-, mpic++-, or mpif90-compiled code against the Intel MKL library, or your code might have threading/OpenMP enabled. In these cases running with MPI is dangerous, in that you must request as many Slurm resources (cores) as your whole MPI job will actually use, remembering that each MPI process may spawn multiple threads.
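To make the core accounting concrete, here is a hedged sketch of a hybrid MPI+OpenMP request (the 4x4 task/thread split is a made-up example, not a recommendation; outside Slurm the variable is defaulted for illustration):

```shell
#!/bin/bash
# Sketch: 4 MPI tasks, each allowed to spawn 4 OpenMP threads.
# Outside a Slurm job the #SBATCH lines are plain comments, so we
# default SLURM_CPUS_PER_TASK to the requested value for illustration.
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4
SLURM_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK:-4}
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# Total cores the job consumes = MPI tasks x threads per task:
echo $((4 * OMP_NUM_THREADS))
```

Here 16 cores must be requested in total; asking for 4 tasks with 1 CPU each while MKL/OpenMP spawns 4 threads per rank would oversubscribe the node.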
MPI Versions - open source
Both <FONT FACE=Courier>getafix</FONT> and <FONT FACE=Courier>dogmatix</FONT>
are set up for MPI-based calculations, and both connect to the same Ethernet network switches (minimum 10 Gb/s, with 40 Gb/s interconnects).
There are multiple open-source implementations of the MPI protocol
on the SMP systems (MPI-3.1 is the latest standard). These include:
* Open MPI, which has incongruous version numbering; eg. even v1.X implements the full MPI-3 standard. To load the CentOS default:
module load mpi/openmpi-x86_64
We also have modules for Open MPI v3.X built against a newer GNU compiler, available via
module load gnu openmpi3_eth
and, to use the Intel compiler with Open MPI v3.X,
module load intel openmpi3_eth
You can also load older versions via openmpi2_eth (for v2.X) or openmpi_eth (for v1.X).
Note that to compile with either compiler you just use mpicc, mpicxx, mpif90, etc., which will automatically be in your path. (On other systems, eg. <FONT FACE=Courier>goliath</FONT>, you can use mpigcc or mpigxx for the GNU compilers, or mpiicc or mpiicpc for Intel.)
* MVAPICH, which is also on our systems, is supposed to be optimised for faster networks, and possibly for running MPI on Intel Xeon Phi co-processors. Note that MVAPICH2 implements the full MPI-3 standard.
module load gnu mvapich2_eth
or, with the Intel compiler,
module load intel mvapich2_eth
Do not load one of the following:
module load mpi/mvapich2-x86_64
module load mpi/mvapich2-2.0-x86_64
These are the same module... they enable
/usr/lib64/mvapich2/bin/mpirun
but no mpicc.
* MPICH (v3.0 installed). Load one of the following:
module load gnu mpi/mpich-x86_64
module load gnu mpi/mpich-3.0-x86_64
These are the same module... they enable the CentOS defaults
/usr/lib64/mpich/bin/mpirun
and /usr/lib64/mpich/bin/mpicc.
Note that on our systems there are also an /opt/mpich3/gnu/bin/mpicc compiler and an /opt/mpich3/gnu/bin/mpirun ... how to load them with modules though???
* MPICH during v2.X was called MPICH2; ie. some systems may have references to both (older systems may also have implementations of MPICH v1.X). This is no longer on <FONT FACE=Courier>getafix</FONT>, although <FONT FACE=Courier>dogmatix</FONT> has /opt/mpich2/gnu/bin/mpirun ... how to load that with current modules though???
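Until a modulefile exists for those /opt builds, a hedged workaround is to put the relevant bin directory on your PATH by hand (the path is taken from the note above; whether that build works with your code is untested):

```shell
# Sketch: use the /opt MPICH3 build directly, without a module,
# by prepending its bin directory to the search path.
export PATH=/opt/mpich3/gnu/bin:$PATH
# mpicc and mpirun should now resolve from that directory first:
echo "$PATH" | cut -d: -f1
```

This does not set any library paths or manpaths the way a proper modulefile would, so treat it as a stopgap.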
----
MPI Versions - closed source
There are also closed-source implementations of the MPI protocol:
* The Intel MPI library and the Intel compiler 18.0.1.163 are installed:
module load intel intelmpi
To also build with the Intel Math Kernel Library, which has built-in OpenMP parallelism:
module load intel mkl intelmpi
(See https://software.intel.com/en-us/intel-parallel-studio-xe for a comparison of Intel products.) Note that to compile you just use mpicc, mpicxx, mpif90, etc., which will automatically be in your path. (On other systems, eg. <FONT FACE=Courier>goliath</FONT>, you can use mpigcc or mpigxx for the GNU compilers, or mpiicc or mpiicpc for Intel.)
* On <FONT FACE=Courier>dogmatix</FONT> we also have the Oracle Message Passing Toolkit (formerly known as Sun HPC ClusterTools). http://www.oracle.com/technetwork/documentation/hpc-clustertools-193010.html This is installed in the /opt/SUNWhpc/HPC8.2.1/ directory... (Need to add some reasons to use this...) (Need to add some instructions for this below...)
----
Installing the MPI compilers and environment-modules - other Linux systems
On your own Linux system/s you can install MPI implementations, eg.
@@
sudo dnf install environment-modules openmpi openmpi-devel blacs-openmpi scalapack-openmpi mpich mpich-devel mpich-doc blacs-mpich scalapack-mpich
@@
Note that Open MPI is daemon-free and just requires passwordless ssh between the machines. Running the older MPICH2 on your own machines needs a daemon, started via:
mpd &
There used to be a distinct package called mpich2, which still exists in some older repositories. The latest mpich packages actually implement up to MPI-3 and so are the default now.
Compiling MPI code - other linux systems
To see which module to use:
@@
[user@linux ~]$ module list
No Modulefiles Currently Loaded.
[user@linux ~]$ module avail
--------------------- /etc/modulefiles ---------------------------
mpi/mpich-x86_64
---------------------------- /usr/share/modulefiles --------------
mpi/openmpi-x86_64
@@
and then load one of them
@@
[user@linux ~]$ module load mpi/openmpi-x86_64
[user@linux ~]$ module load mpi/mpich-x86_64
@@
Note that this can also be added to your ~/.bashrc to load automatically at login.
Now you should be able to compile your MPI-capable C or Fortran code.
For example, to run the calculation with 8 processes on a standalone Linux machine:
@@
mpirun -np 8 ./a.out
@@
If you use the machinefile options, eg. mpirun -np 8 -npernode 1 --host ..., you could run these calculations across multiple Linux machines. [Note that adding the -np option works for standalone systems, but might break the Slurm MPI scheduling if used in Slurm scripts.]
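For the multi-machine case, a hedged sketch of an Open MPI hostfile (node01 and node02 are placeholder hostnames for your own machines; passwordless ssh between them is assumed, as noted earlier):

```shell
# Sketch: write a hostfile giving 4 slots on each of two machines.
cat > hosts.txt <<'EOF'
node01 slots=4
node02 slots=4
EOF
# Then launch 8 ranks, 4 per machine, with something like:
#   mpirun -np 8 --hostfile hosts.txt ./a.out
wc -l < hosts.txt
```

The slots= value caps how many ranks Open MPI will place on each host before moving to the next one listed.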