Getting Started with MPI - Message Passing Interface
MPI, or Message Passing Interface, is a standardized message passing system that was developed in the early 1990s by a broad coalition of engineers and computer science academics. It provides a standard interface through which programs running on separate processors or computer systems can interact with each other by passing messages back and forth. The general idea of message passing is also familiar from Object Oriented Programming (OOP).
MPI is the dominant standard for message passing in a parallel computing environment. In order to understand MPI, it is helpful to have an understanding of:
- message passing in general
- parallel computing
Message passing is a way for a program to invoke a behavior, or run a program. It differs from the more conventional method of calling a program directly: message passing is based on the object model, which separates the general functional need from the specific implementation. The program that needs the functionality sends a message to an object, and that object runs the program.
The primary benefit of this technique is related to the OOP concept of Encapsulation. The logic of determining which specific implementation to use is left up to the object, rather than to the invoking program, encapsulating the many disparate aspects of the feature into a single object.
For example: A computer system might have a Print Manager object, and several individual Printers. Each of the programs that might want to use a printer does not need to have its own implementation of each printer, along with complex logic determining which printer to use in what situation. Any program that needs to print something can simply send a message to the Print Manager, which decides which printer to use.
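The Print Manager idea can be sketched in a few lines of Python. This is only an illustration of the encapsulation point: the names `PrintMessage`, `Printer`, `PrintManager`, and the color-based selection rule are all hypothetical, and nothing here is part of MPI or of any real printing API.

```python
from dataclasses import dataclass


@dataclass
class PrintMessage:
    """A request sent by any program that wants something printed."""
    document: str
    color: bool = False


class Printer:
    """Hypothetical printer with a single capability flag."""

    def __init__(self, name, supports_color):
        self.name = name
        self.supports_color = supports_color

    def print_document(self, document):
        return f"{self.name} printed {document!r}"


class PrintManager:
    """Receives print messages and decides which printer handles each one."""

    def __init__(self, printers):
        self.printers = printers

    def send(self, message):
        # The selection logic lives here, not in the calling program.
        for printer in self.printers:
            if printer.supports_color or not message.color:
                return printer.print_document(message.document)
        raise RuntimeError("no suitable printer available")


def demo():
    manager = PrintManager([
        Printer("mono-laser", supports_color=False),
        Printer("color-inkjet", supports_color=True),
    ])
    # The caller only sends messages; it never picks a printer itself.
    return [
        manager.send(PrintMessage("report.txt")),
        manager.send(PrintMessage("photo.png", color=True)),
    ]


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The calling code never names a printer. Adding or removing a printer changes only the list handed to the manager, which is the benefit the encapsulation argument describes.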
Modern computers use this form of inter-system message passing for (almost) every aspect of computing. A couple of examples of how this impacts your daily experience of computing:
- you see nearly the same User Interface (UI) every time you open or save a file, regardless of the application you are accessing the file from, because all the different applications are passing file-access messages to the same File Manager
- when you add a new piece of hardware (a scanner, a mouse, etc.), every application that can make use of it has access to it immediately; you don’t have to add drivers for the hardware to each individual application that might use it, because each program can simply pass messages to the independent controller
These are high-level examples of message passing. MPI works on a lower level, enabling message passing between diverse systems in a parallel computing environment.
Parallel computing is a computing paradigm where tasks (calculations, processes, etc.) are divided up into smaller tasks which can be accomplished in parallel (at the same time), rather than serially (one after another).
Generally, a single processing core works through one stream of instructions at a time. Contemporary CPUs operate so quickly that this one-step-at-a-time method of computation can achieve a relatively high level of performance, but the core is still running through every tiny calculation in serial, one after the other. And there is a limit to how fast this can go: a practical limit based on today’s technology, and an absolute theoretical limit based on the laws of physics.
Parallel processing and parallel computing were invented to increase the speed of a computer system beyond this limit. They break up serially computed tasks and allow them to be completed in parallel, at the same time, by separate processors.
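As a rough illustration of breaking one serial task into smaller ones, the sketch below splits a sum of squares into chunks, hands each chunk to a separate worker, and combines the partial results. The function names are made up for this example, and it uses Python's standard `concurrent.futures` thread pool rather than MPI. In standard CPython, threads take turns on CPU-bound work, so this shows the decomposition pattern rather than a real speedup; with MPI, each chunk would typically be handled by a separate process, possibly on a separate machine.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def serial_sum_of_squares(data):
    """The whole task done one element after another."""
    return sum(x * x for x in data)


def parallel_sum_of_squares(data, workers=4):
    """The same task divided into chunks that workers handle independently."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(serial_sum_of_squares, chunks))
    # Combining the partial results is the only step that must wait for all workers.
    return sum(partials)


if __name__ == "__main__":
    numbers = list(range(1000))
    print(serial_sum_of_squares(numbers) == parallel_sum_of_squares(numbers))
```

The key property is that the chunks do not depend on each other, so the order in which workers finish does not affect the final answer.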
MPI — Message Passing in Parallel Computing
In order for parallel computing to work, the various computers need to be able to communicate with each other — to pass messages back and forth. MPI — Message Passing Interface — was created to facilitate this communication.
MPI is a language-independent specification that defines an API (Application Programming Interface) for passing messages between processes, which may run on separate processors or other hardware (real or virtual). MPI is a standard rather than a piece of software, so it has to be implemented, typically as a library supplied by a hardware vendor or an open-source project, and any system with an MPI implementation installed can exchange messages with connected systems through that API.
MPI provides two modes of communication:
- point-to-point — one system passing messages directly to another
- collective, or broadcast — one system passing messages to a group
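The two modes can be imitated with a toy model in which each "system" is a thread with its own inbox. This is not MPI: the `Comm` class and its `send`, `recv`, and `bcast` methods are hypothetical stand-ins built from Python's standard `queue` and `threading` modules, and the numbering of participants as ranks 0 to 3 is borrowed loosely from MPI terminology. One simplification to keep in mind is that in real MPI every process in the group calls the collective operation, whereas here only the root does.

```python
import queue
import threading


class Comm:
    """Toy communicator: one inbox queue per rank. Hypothetical, not MPI."""

    def __init__(self, size):
        self.size = size
        self.inboxes = [queue.Queue() for _ in range(size)]

    def send(self, value, dest):
        # Point-to-point: deliver to exactly one rank.
        self.inboxes[dest].put(value)

    def recv(self, rank):
        # Block until a message addressed to `rank` arrives.
        return self.inboxes[rank].get()

    def bcast(self, value, root):
        # Collective (simplified): the root delivers the same value to every other rank.
        for dest in range(self.size):
            if dest != root:
                self.send(value, dest)


def worker(comm, rank, results):
    if rank == 0:
        comm.send("hello rank 1", dest=1)       # one sender, one receiver
        comm.bcast("config for all", root=0)    # one sender, whole group
        results[rank] = ["sent"]
    else:
        received = []
        if rank == 1:
            received.append(comm.recv(rank))    # the direct message
        received.append(comm.recv(rank))        # the broadcast
        results[rank] = received


def run(size=4):
    comm = Comm(size)
    results = [None] * size
    threads = [
        threading.Thread(target=worker, args=(comm, rank, results))
        for rank in range(size)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    for rank, messages in enumerate(run()):
        print(rank, messages)
```

Only rank 1 sees the direct message, while every rank other than the root sees the broadcast. In Python, real MPI programs are usually written with a binding such as mpi4py on top of an installed MPI implementation.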
Online MPI Resources
These introductory tutorials will help you learn to use MPI for parallel computing.
- MPI Tutorial by Wes Kendall — This is a very thorough introduction to MPI, one of the best available online.
- Tutorial on MPI: The Message Passing Interface — Another very thorough resource, by William Gropp, from the Mathematics and Computer Science Division of Argonne National Laboratory.
- William Gropp also prepared this PDF presentation on MPI, which covers largely the same material.
- An introduction to the Message Passing Interface (MPI) using C — Language-specific MPI tutorial.
- MPI Tutorial by Blaise Barney — from the Lawrence Livermore National Laboratory
MPI is a standard, not a specific technology. It relies on implementations from various vendors. These are a few of the more frequently used MPI implementations (there are many more).
Community and Discussion
One of the best ways to get started with MPI, and to solve problems once you’re up and running, is to talk to experts and other MPI programmers.
- MPI Newsgroup Forum on Google Groups — Usenet newsgroup devoted to discussions about MPI
- MPI Forum Mailing Lists — A number of different mailing lists from the official MPI Forum
- Open MPI Mailing List — The mailing list for the Open MPI implementation.
- MPI Forum Meetings — Information on when the MPI Forum meets.
- Open MP Forum
- MPI at StackOverflow — MPI Questions and Answers.
MPI is a frequent topic of lectures and professional development talks, so there are plenty of videos exploring various aspects of MPI.
- Introduction to parallel Programming – Message Passing Interface (MPI)
- Introduction to MPI Programming
- High-Performance Computing — Introducing MPI
- Open MPI Channel on YouTube — Lots of great MPI-related videos here.
A few key MPI-related reference pages to bookmark and return to again and again.
- MPI: A Message-Passing Interface Standard — This is the original 1994 Technical Report of the Message Passing Interface Forum.
- MPI Documents — The current MPI standard, as well as all previous versions of the standards document.
- Open MPI Documentation — Some of this material is specific to the Open MPI implementation, but most of it is general to the MPI standard.
Books about MPI
Because MPI is a bit advanced, most of the really detailed information is easier to find in printed books than in online tutorials. Here are a few of the best tutorial and reference books on MPI.
- Beginning MPI (An Introduction in C) — By Wes Kendall, who also wrote our #1 recommended tutorial on MPI.
- Parallel Programming with MPI — Another good introductory text.
- Using MPI - 2nd Edition: Portable Parallel Programming with the Message Passing Interface (Scientific and Engineering Computation) — Also somewhat introductory, but with an emphasis on using MPI in Science and Math analysis.
- Parallel Programming in C with MPI and OpenMP — A programming tutorial for a specific language, covering both MPI and OpenMP (a separate API for shared-memory parallelism)
- MPI: The Complete Reference — An essential MPI desk reference for serious parallel programmers.
What is MPI?
MPI stands for Message Passing Interface. It is a communications standard that enables computer systems to talk to each other in a parallel computing environment.
Who uses MPI?
MPI is used by nearly anyone writing applications that will take advantage of a parallel or clustered computing system.
Who manages the MPI standard?
The MPI standard is published by the Message Passing Interface Forum, an open and ever-evolving group of engineers and computer science academics.
Do I need to learn MPI?
That depends on what kind of development work you do, and what your goals are.
If you are writing primarily web applications in high-level scripting languages like Ruby, Python, or PHP (and you primarily want to keep doing that), then MPI is not an important standard to learn.
If you want to get more involved with foundational systems development, especially in a clustered or parallel computing environment (like cloud computing, supercomputers, or big data), MPI is an important thing to know.