Derived Data Types with MPI


1. Introduction

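The basic MPI communication mechanisms can be used to send or receive a sequence of identical elements that are contiguous in memory. It is often desirable to send data that is not homogeneous or that is not contiguous in memory in a single message, because doing so amortizes the fixed overhead of sending and receiving a message over the transmittal of many elements. MPI provides two mechanisms to achieve this: derived datatypes, which describe the layout of the data to MPI, and explicit packing and unpacking of the data into and out of a contiguous buffer (MPI_PACK and MPI_UNPACK).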

This tutorial focuses on the construction and use of derived datatypes, why, how, and when to use them.


2. Why Use Derived Datatypes?

2.1 Basic MPI Datatypes

MPI Basic predefined Datatypes for C

MPI datatype          C datatype
MPI_CHAR              char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED_LONG     unsigned long int
MPI_UNSIGNED          unsigned int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE              (no corresponding C type)
MPI_PACKED            (no corresponding C type)

MPI Basic predefined Datatypes for FORTRAN

MPI datatype            FORTRAN datatype
MPI_INTEGER             INTEGER
MPI_REAL                REAL
MPI_REAL8               REAL*8
MPI_DOUBLE_PRECISION    DOUBLE PRECISION
MPI_COMPLEX             COMPLEX
MPI_LOGICAL             LOGICAL
MPI_CHARACTER           CHARACTER
MPI_BYTE                (no corresponding FORTRAN type)
MPI_PACKED              (no corresponding FORTRAN type)

Given these datatypes and a count, you can handle messages of contiguous data of the same type.

2.2 Motivation

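What if you want to send data that is non-contiguous, of mixed types, or both?

A few possible solutions are that you could make a separate MPI call for each contiguous piece of data, copy the data into a contiguous buffer before sending it, send the data as raw bytes with MPI_BYTE, or pack it and send it as MPI_PACKED.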

Generally, however, these solutions are slow, clumsy, and wasteful of memory. Using MPI_BYTE or MPI_PACKED might also result in a program that isn't portable to a heterogeneous system of machines.

The idea of MPI derived datatypes is to provide a portable and efficient way of communicating non-contiguous or mixed types in a message. Compared with the workarounds above, they are also simpler and cleaner to use.


3. What are Derived Datatypes?

Derived datatypes are datatypes that are built from the basic MPI datatypes. To better understand what is needed to construct such a datatype, you need to understand the general concept of an MPI datatype and something called a typemap.

Formally, the MPI Standard defines a general datatype as an object that specifies two things: a sequence of basic datatypes and a sequence of integer (byte) displacements.

An easy way to represent such an object is as a sequence of pairs of basic datatypes and displacements. MPI calls this sequence a typemap.
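
To make the idea of a typemap concrete, the sketch below writes one out for a small C record. It uses no MPI calls. The record struct particle, the struct typemap_entry, and both function names are hypothetical and exist only for illustration. The displacements are taken from the compiler with offsetof, so the actual numbers depend on the platform's alignment rules.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record used only for illustration. */
struct particle {
    int    id;
    double pos[3];
    char   tag;
};

/* One typemap entry: a basic type (named here by its MPI handle, as a
 * string) paired with a byte displacement from the start of the record. */
struct typemap_entry {
    const char *basic_type;
    size_t      displacement;
};

/* Fills `map` with the typemap of struct particle and returns the number
 * of entries (5: one int, three doubles, one char). */
int particle_typemap(struct typemap_entry map[5])
{
    int n = 0;

    map[n++] = (struct typemap_entry){ "MPI_INT",
                                       offsetof(struct particle, id) };
    for (size_t i = 0; i < 3; i++)
        map[n++] = (struct typemap_entry){ "MPI_DOUBLE",
                       offsetof(struct particle, pos) + i * sizeof(double) };
    map[n++] = (struct typemap_entry){ "MPI_CHAR",
                                       offsetof(struct particle, tag) };
    return n;
}

/* Prints the typemap as a sequence of (type, displacement) pairs. */
void print_typemap(const struct typemap_entry *map, int n)
{
    for (int i = 0; i < n; i++)
        printf("(%s, %zu)\n", map[i].basic_type, map[i].displacement);
}
```

Each printed pair corresponds to one entry of the typemap. A derived datatype for this record would carry the same information in a form MPI can use.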


4. When and How Do I Use Derived Datatypes?

4.1 When to Use

In C or FORTRAN, you create a datatype by declaring it before any executable statements. The compiler reads your declarations and sets up storage for the datatype. In contrast, MPI derived datatypes are created at run time through calls to MPI library routines. Since MPI derived datatypes are often used to send or receive C or FORTRAN datatypes, the typical scenario is to declare your C or FORTRAN datatypes first and then, in the executable part of your program between the calls to MPI_INIT and MPI_FINALIZE, create and use your MPI derived datatypes.

4.2 How to Use

Before you can use a derived datatype, you must create it. Here are the steps you take:

  1. Construct the datatype.
  2. Allocate the datatype.
  3. Use the datatype.
  4. Deallocate the datatype.
You must construct and allocate a datatype before using it. On the other hand, once you have it constructed and allocated, you are not required to use or deallocate it.

4.2.1 Construct the datatype

MPI_Type_contiguous
The simplest constructor. Produces a new datatype by making count copies of an existing data type.

MPI_Type_vector
MPI_Type_hvector
Similar to contiguous, but allows for regular gaps (stride) in the displacements.
MPI_Type_hvector is identical to MPI_Type_vector except that the stride is specified in bytes.

MPI_Type_indexed
MPI_Type_hindexed
An array of displacements of the input data type is provided as the map for the new data type.
MPI_Type_hindexed is identical to MPI_Type_indexed except that offsets are specified in bytes.

MPI_Type_struct
The most general of all derived datatypes. The new data type is formed according to a completely defined map of the component data types.

4.2.2 Allocate the datatype

A constructed datatype must be committed to the system before it can be used in a communication. The constructed datatype is committed with a call to MPI_TYPE_COMMIT. (There is no need to commit basic datatypes; they are pre-committed.) It can then be used in any number of communications. The syntax of MPI_TYPE_COMMIT is:

C:       int MPI_Type_commit(MPI_Datatype *datatype)
FORTRAN: CALL MPI_TYPE_COMMIT(DATATYPE, IERROR)

4.2.3 Use the datatype

Derived datatypes can be used in all send and receive operations. You simply use the handle to the derived datatype as an argument in a send or receive operation instead of a basic datatype argument. Here is a sample C code segment:

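MPI_Datatype newtype;                            /* handle for the derived datatype */
MPI_Type_vector(count, blocklength, stride, oldtype, &newtype);
MPI_Type_commit(&newtype);                       /* required before communication */
MPI_Send(buffer, 1, newtype, dest, tag, comm);   /* one element of newtype */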

4.2.4 Deallocate the datatype

Finally, there is a complementary routine to MPI_TYPE_COMMIT, namely, MPI_TYPE_FREE, which marks a datatype for deallocation. The syntax of MPI_TYPE_FREE is:

C:       int MPI_Type_free(MPI_Datatype *datatype)
FORTRAN: CALL MPI_TYPE_FREE(DATATYPE, IERROR)


5. DERIVED DATATYPES

This section presents the MPI functions for constructing derived datatypes.

5.1 CONTIGUOUS

Example



5.2 VECTOR & HVECTOR


The Vector constructor is similar to contiguous, but allows for regular gaps or overlaps (stride) in the displacements.

Hvector: MPI_Type_hvector (in C) and MPI_TYPE_HVECTOR (in FORTRAN) take the same arguments as MPI_TYPE_VECTOR, except that the stride is specified in bytes rather than in multiples of the extent of the old datatype.

Example



5.3 INDEXED & HINDEXED

This constructor replicates a datatype, taking blocks at different offsets. It allows one to specify a noncontiguous data layout where displacements between successive blocks need not be equal.

Hindexed: MPI_Type_hindexed (in C) and MPI_TYPE_HINDEXED (in FORTRAN) take the same arguments as MPI_TYPE_INDEXED, except that the entries of the offsets array are specified in bytes.

Example



5.4 STRUCT

The struct constructor gathers a mix of different datatypes, scattered at many locations in memory, into one datatype that can be used for communication.

Example



6. Conclusions


7. References