A forwarding MPI implementation that can use any other MPI implementation via an MPI ABI

Overview

MPItrampoline

  • MPI wrapper library: GitHub CI
  • MPI trampoline library: GitHub CI
  • MPI integration tests: GitHub CI

MPI is the de-facto standard for inter-node communication on HPC systems, and has been for the past 25 years. While highly successful, MPI is a standard for source code (it defines an API), and is not a standard defining binary compatibility (it does not define an ABI). This means that applications running on HPC systems need to be compiled anew on every system. This is tedious, since the software that is available on every HPC system is slightly different.

This project attempts to remedy this. It defines an ABI for MPI, and provides an MPI implementation based on this ABI. That is, MPItrampoline does not implement any MPI functions itself, it only forwards them to a "real" implementation via this ABI. The advantage is that one can produce "portable" applications that can use any given MPI implementation. For example, this will make it possible to build external packages for Julia via Yggdrasil that run efficiently on almost any HPC system.

A small and simple MPIwrapper library is used to provide this ABI for any given MPI installation. MPIwrapper needs to be compiled for each MPI installation that is to be used with MPItrampoline, but this is quick and easy.

Successfully Tested

  • Debian 11.0 via Docker (MPICH; arm32v5, arm32v7, arm64v8, mips64le, ppc64le, riscv64; C/C++ only)
  • Debian 11.0 via Docker (MPICH; i386, x86-64)
  • macOS laptop (MPICH, OpenMPI; x86-64)
  • macOS via Github Actions (OpenMPI; x86-64)
  • Ubuntu 20.04 via Docker (MPICH; x86-64)
  • Ubuntu 20.04 via Github Actions (MPICH, OpenMPI; x86-64)
  • Blue Waters, HPC system at the NCSA (Cray MPICH; x86-64)
  • Graham, HPC system at Compute Canada (Intel MPI; x86-64)
  • Marconi A3, HPC system at Cineca (Intel MPI; x86-64)
  • Niagara, HPC system at Compute Canada (OpenMPI; x86-64)
  • Summit, HPC system at ORNL (Spectrum MPI; IBM POWER 9)
  • Symmetry, in-house HPC system at the Perimeter Institute (MPICH, OpenMPI; x86-64)

Workflow

Preparing an HPC system

Install MPIwrapper, wrapping the MPI installation you want to use there. You can install MPIwrapper multiple times if you want to wrap more than one MPI implementation.

This is possibly as simple as

cmake -S . -B build -DMPIEXEC_EXECUTABLE=mpiexec -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX=$HOME/mpiwrapper
cmake --build build
cmake --install build

but nothing is ever simple on an HPC system. It might be necessary to load certain modules, or to specify more cmake MPI configuration options.

The MPIwrapper libraries remain on the HPC system, they are installed independently of any application.

Building an application

Build your application as usual, using MPItrampline as MPI library.

Running an application

At startup time, MPItrampoline needs to be told which MPIwrapper library to use. This is done via the environment variable MPITRAMPOLINE_LIB. You also need to point MPItrampoline's mpiexec to a respective wrapper created by MPIwrapper, using the environment variable MPITRAMPOLINE_MPIEXEC.

For example:

env MPITRAMPOLINE_MPIEXEC=$HOME/mpiwrapper/bin/mpiwrapper-mpiexec MPITRAMPOLINE_LIB=$HOME/mpiwrapper/lib/libmpiwrapper.so mpiexec -n 4 ./your-application

The mpiexec you run here needs to be the one provided by MPItrampoline.

Current state

MPItrampoline uses the C preprocessor to create wrapper functions for each MPI function. This is how MPI_Send is wrapped:

FUNCTION(int, Send,
         (const void *buf, int count, MT(Datatype) datatype, int dest, int tag,
          MT(Comm) comm),
         (buf, count, (MP(Datatype))datatype, dest, tag, (MP(Comm))comm))

Unfortunately, MPItrampoline does not yet wrap the Fortran API. Your help is welcome.

Certain MPI types, constants, and functions are difficult to wrap. Theoretically, there could be MPI libraries where it is not possible to implement the current MPI ABI. If you encounter this, please let me know -- maybe there is a work-around.

Comments
  • Add support for MPI profiling interface

    Add support for MPI profiling interface

    Each standard MPI function can be called with an MPI_ or PMPI_ prefix (quoting from https://www.open-mpi.org/faq/?category=perftools#PMPI), I think MPItrampoline doesn't currently support the PMPI_ calls.

    opened by ocaisa 6
  • Supporting `MPIX_Query_cuda_support()`

    Supporting `MPIX_Query_cuda_support()`

    While not part of the standard, MPIX_Query_cuda_support() is available in a number of MPI implementations (see https://github.com/pmodels/mpich/pull/4741). It's also being used by a number of applications (and hopefully that will grow, see my issue at https://github.com/lammps/lammps/issues/3140 which links back to the support in GROMACS). This would be a valuable inclusion in MPItrampoline since it would then be able to handle the runtime detection of CUDA support in the MPI implementation.

    opened by ocaisa 6
  • Allow overriding default compilation options

    Allow overriding default compilation options

    Currently default compilation options are set during the build of MPItrampoline, it would probably be useful to be able to fully override these, e.g.,

    exec ${MPITRAMPOLINE_CC:[email protected]_C_COMPILER@} ${CFLAGS:-"@CMAKE_C_FLAGS@"} [email protected]_INSTALL_PREFIX@/@CMAKE_INSTALL_INCLUDEDIR@ @LINK_FLAGS@ [email protected]_INSTALL_PREFIX@/@CMAKE_INSTALL_LIBDIR@ -Wl,-rpath,@CMAKE_INSTALL_PREFIX@/@CMAKE_INSTALL_LIBDIR@ "$@" -lmpi -ldl
    
    opened by ocaisa 5
  • Allow use of a fallback/default value for `MPITRAMPOLINE_MPIEXEC`

    Allow use of a fallback/default value for `MPITRAMPOLINE_MPIEXEC`

    It's possible to configure a default MPI library with -DMPITRAMPOLINE_DEFAULT_LIB=XXX, it would be good to be able to also configure a default mpiexec (-DMPITRAMPOLINE_DEFAULT_MPIEXEC=XXX) so that one can have a fully functional fallback in place at build time.

    opened by ocaisa 3
  • Incomplete installation

    Incomplete installation

    Hello ! I am looking at installing MPItrampoline, and following the steps outlined in the README.md, I cannot access files later referenced.

    For instance, on Summit at ORNL, using the following script:

    INSTALLDIR=$HOME/mpiwrapper
    
    module load cmake/3.18
    module load gcc
    
    cmake -S . -B build -DMPIEXEC_EXECUTABLE=mpiexec \
                                   -DCMAKE_BUILD_TYPE=RelWithDebInfo \
                                   -DCMAKE_INSTALL_PREFIX=$INSTALLDIR
    cmake --build build
    cmake --install build
    

    The following installation is generated:

    $ tree mpiwrapper/
    mpiwrapper/
    |-- bin
    |   |-- mpicc
    |   |-- mpicxx
    |   |-- mpiexec
    |   |-- mpifc
    |   `-- mpifort
    |-- include
    |   |-- mpi.h
    |   |-- mpi.mod
    |   |-- mpi_declarations.h
    |   |-- mpi_declarations_fortran.h
    |   |-- mpi_declarations_fortran90.h
    |   |-- mpi_defaults.h
    |   |-- mpi_f08.mod
    |   |-- mpi_version.h
    |   |-- mpiabi.h
    |   |-- mpiabif.h
    |   |-- mpif.h
    |   `-- mpio.h
    |-- lib
    |   |-- cmake
    |   |   `-- MPItrampoline
    |   |       |-- MPItrampolineConfig.cmake
    |   |       |-- MPItrampolineConfigVersion.cmake
    |   |       |-- MPItrampolineTargets-relwithdebinfo.cmake
    |   |       `-- MPItrampolineTargets.cmake
    |   `-- pkgconfig
    |       `-- MPItrampoline.pc
    `-- lib64
        |-- libmpi.a
        `-- libmpifort.a
    
    7 directories, 24 files
    

    The README.md dictates to use the wrapped libraries, but no shared libraries are available here. I looked through the options of the CMakeLists.txt but the only one defined in the project is the fortran flag.

    Am I missing something ?

    opened by spoutn1k 2
  •  Issues building Global Arrays

    Issues building Global Arrays

    I'm trying to build the Global Arrays library with MPItrampoline and have run into some issues. This package is a dependency for Molpro and several other computational chemistry applications. Any assistance that you can provide to get it working would be greatly appreciated.

    The compilation fails at https://github.com/GlobalArrays/ga/blob/f4016b869dfd1a2b2856f74a73dd9452dbbc8ae4/comex/src-mpi-pr/comex.c#L4968 with the error message:

    libtool: compile:  mpicc -DHAVE_CONFIG_H -I. -I./src-common -I./src-mpi-pr -g -O2 -MT src-mpi-pr/comex.lo -MD -MP -MF src-mpi-pr/.deps/comex.Tpo -c src-mpi-pr/comex.c -o src-mpi-pr/comex.o
    src-mpi-pr/comex.c: In function ‘str_mpi_retval’:
    src-mpi-pr/comex.c:4932:9: error: case label does not reduce to an integer constant
             case MPI_SUCCESS       : msg = "MPI_SUCCESS"; break;
             ^
    

    Steps to reproduce:

    git clone -b develop https://github.com/GlobalArrays/ga
    cd ga
    ./autogen.sh
    ./configure --with-mpi-pr --with-blas=no --with-lapack=no --with-scalapack=no --disable-f77
    make
    

    I've tried both the develop and master branches.

    There are lots of configurations which can be used, we're most interested in the recommended port which uses MPI-1 with progress ranks (--with-mpi-pr) but I tried several other configurations without success.

    I'm not sure whether it's an MPItrampoline issue or whether it's non-standard use of MPI within Global Arrays.

    I've attached the full output from build build-ga.txt

    I've tried on

    • RHEL 7.9 with gcc 4.8.5
    • Ubuntu 22.04 with gcc 11.3.0
    opened by nick-wilson 9
  • Issues building CP2K

    Issues building CP2K

    I thought I would give this a full test with Fortran, and CP2K is a good benchmark for that. The build (v8.2) is failing with:

    /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exts/dbcsr/src/mpi/dbcsr_mpiwrap.F:1669:21:
    
     1669 |       CALL mpi_bcast(msg, msglen, MPI_LOGICAL, source, gid, ierr)
          |                     1
    ......
     3160 |       CALL mpi_bcast(msg, msglen, ${mpi_type1}$, source, gid, ierr)
          |                     2
    Error: Type mismatch between actual argument at (1) and actual argument at (2) (LOGICAL(4)/COMPLEX(4)).
    
    opened by ocaisa 24
  • Building shared and static libraries at once

    Building shared and static libraries at once

    Currently the default behaviour is to only build static libraries. It might be good build both static and shared libraries since then if libmpi.so is in the default search path it is not selected over the library from MPItrampoline (MPItrampoline would shadow libmpi.so and libmpi.a).

    opened by ocaisa 4
Releases(v5.2.0)
Owner
Erik Schnetter
Erik Schnetter
Duke Machine Learning Winter School: Computer Vision 2022

mlwscv2002 Welcome to the Duke Machine Learning Winter School: Computer Vision 2022! The MLWS-CV includes 3 hands-on training sessions on implementing

Duke + Data Science (+DS) 9 May 25, 2022
Open-source implementation of Google Vizier for hyper parameters tuning

Advisor Introduction Advisor is the hyper parameters tuning system for black box optimization. It is the open-source implementation of Google Vizier w

tobe 1.5k Jan 04, 2023
Linear algebra python - Number of operations and problems in Linear Algebra and Numerical Linear Algebra

Linear algebra in python Number of operations and problems in Linear Algebra and

Alireza 5 Oct 09, 2022
Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes, ICCV 2017

AdaptationSeg This is the Python reference implementation of AdaptionSeg proposed in "Curriculum Domain Adaptation for Semantic Segmentation of Urban

Yang Zhang 128 Oct 19, 2022
Efficient 3D Backbone Network for Temporal Modeling

VoV3D is an efficient and effective 3D backbone network for temporal modeling implemented on top of PySlowFast. Diverse Temporal Aggregation and

102 Dec 06, 2022
Deep Learning Algorithms for Hedging with Frictions

Deep Learning Algorithms for Hedging with Frictions This repository contains the Forward-Backward Stochastic Differential Equation (FBSDE) solver and

Xiaofei Shi 3 Dec 22, 2022
Tiny Kinetics-400 for test

Kinetics-400迷你数据集 English | 简体中文 该数据集旨在解决的问题:参照Kinetics-400数据格式,训练基于自己数据的视频理解模型。 数据集介绍 Kinetics-400是视频领域benchmark常用数据集,详细介绍可以参考其官方网站Kinetics。整个数据集包含40

38 Jan 06, 2023
Yolo ros - YOLO-ROS for HUAWEI ATLAS200

YOLO-ROS YOLO-ROS for NVIDIA YOLO-ROS for HUAWEI ATLAS200, please checkout for b

ChrisLiu 5 Oct 18, 2022
A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.

Karoo GP Karoo GP is an evolutionary algorithm, a genetic programming application suite written in Python which supports both symbolic regression and

Kai Staats 149 Jan 09, 2023
Prototype-based Incremental Few-Shot Semantic Segmentation

Prototype-based Incremental Few-Shot Semantic Segmentation Fabio Cermelli, Massimiliano Mancini, Yongqin Xian, Zeynep Akata, Barbara Caputo -- BMVC 20

Fabio Cermelli 21 Dec 29, 2022
This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

[CVPRW 2021] - Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation

Anirudh S Chakravarthy 6 May 03, 2022
Paper Code:A Self-adaptive Weighted Differential Evolution Approach for Large-scale Feature Selection

1. SaWDE.m is the main function 2. DataPartition.m is used to randomly partition the original data into training sets and test sets with a ratio of 7

wangxb 14 Dec 08, 2022
Video-face-extractor - Video face extractor with Python

Python face extractor Setup Create the srcvideos and faces directories Put your

2 Feb 03, 2022
GndNet: Fast ground plane estimation and point cloud segmentation for autonomous vehicles using deep neural networks.

GndNet: Fast Ground plane Estimation and Point Cloud Segmentation for Autonomous Vehicles. Authors: Anshul Paigwar, Ozgur Erkent, David Sierra Gonzale

Anshul Paigwar 114 Dec 29, 2022
A new test set for ImageNet

ImageNetV2 The ImageNetV2 dataset contains new test data for the ImageNet benchmark. This repository provides associated code for assembling and worki

186 Dec 18, 2022
PointCloud Annotation Tools, support to label object bound box, ground, lane and kerb

PointCloud Annotation Tools, support to label object bound box, ground, lane and kerb

halo 368 Dec 06, 2022
🛰️ Awesome Satellite Imagery Datasets

Awesome Satellite Imagery Datasets List of aerial and satellite imagery datasets with annotations for computer vision and deep learning. Newest datase

Christoph Rieke 3k Jan 03, 2023
Semantic Image Synthesis with SPADE

Semantic Image Synthesis with SPADE New implementation available at imaginaire repository We have a reimplementation of the SPADE method that is more

NVIDIA Research Projects 7.3k Jan 07, 2023
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning And private Server services

Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning

MaCan 4.2k Dec 29, 2022
A custom DeepStack model that has been trained detecting ONLY the USPS logo

This repository provides a custom DeepStack model that has been trained detecting ONLY the USPS logo. This was created after I discovered that the Deepstack OpenLogo custom model I was using did not

Stephen Stratoti 9 Dec 27, 2022