Posted by & filed under Beginner MPI.

Sending and receiving are the two foundational concepts of MPI. Almost every single function in MPI can be implemented with basic send and receive calls. In this lesson, I will discuss how to use MPI’s blocking sending and receiving functions, and I will also overview other basic concepts associated with transmitting data using MPI. The code for this tutorial is available here or can be viewed/cloned on GitHub.

Overview of Sending and Receiving with MPI

MPI’s send and receive calls operate in the following manner. First, process A decides it needs to send a message to process B. Process A then packs all of the necessary data into a buffer for process B. These buffers are often referred to as envelopes since the data is packed into a single message before transmission (similar to how letters are packed into envelopes before being dropped off at the post office). After the data is packed into a buffer, the communication device (which is often a network) is responsible for routing the message to the proper location. The destination of the message is identified by the rank of the receiving process.

Even though the message is routed to B, process B still has to acknowledge that it wants to receive A’s data. Once it does this, the data has been transmitted. Process A is then notified that the data has been transmitted and may go back to work.

There are often cases when A might have to send many different types of messages to B. Instead of forcing B to go through extra measures to differentiate all these messages, MPI allows senders and receivers to also specify message IDs with the message (known as tags). When process B only requests a message with a certain tag number, messages with different tags will be buffered by the network until B is ready for them.

With these concepts in mind, let’s look at the prototypes for the MPI sending and receiving functions.

MPI_Send(void* data, int count, MPI_Datatype datatype, int destination,
         int tag, MPI_Comm communicator)
MPI_Recv(void* data, int count, MPI_Datatype datatype, int source,
         int tag, MPI_Comm communicator, MPI_Status* status)

Although this might seem like a mouthful when reading all of the arguments, they become easier to remember since almost every MPI call uses similar syntax. The first argument is the data buffer. The second and third arguments describe the count and type of elements that reside in the buffer. MPI_Send sends the exact count of elements, and MPI_Recv will receive at most the count of elements (more on this in the next lesson). The fourth and fifth arguments specify the rank of the sending/receiving process and the tag of the message. The sixth argument specifies the communicator and the last argument (for MPI_Recv only) provides information about the received message.

Elementary MPI Datatypes

The MPI_Send and MPI_Recv functions utilize MPI Datatypes as a means to specify the structure of a message at a higher level. For example, if the process wishes to send one integer to another, it would use a count of one and a datatype of MPI_INT. The other elementary MPI datatypes are listed below with their equivalent C datatypes.

MPI datatype              C equivalent
MPI_CHAR                  char
MPI_SHORT                 short int
MPI_INT                   int
MPI_LONG                  long int
MPI_LONG_LONG             long long int
MPI_UNSIGNED_CHAR         unsigned char
MPI_UNSIGNED_SHORT        unsigned short int
MPI_UNSIGNED              unsigned int
MPI_UNSIGNED_LONG         unsigned long int
MPI_UNSIGNED_LONG_LONG    unsigned long long int
MPI_FLOAT                 float
MPI_DOUBLE                double
MPI_LONG_DOUBLE           long double
MPI_BYTE                  char

For now, we will only make use of these datatypes in the beginner MPI tutorial. Once we have covered enough basics, you will learn how to create your own MPI datatypes for characterizing more complex types of messages.

MPI Send / Recv Program

The code for this tutorial is available here as a tgz file or can be viewed/cloned on GitHub. Go ahead and download and extract the code. I refer the reader back to the MPI Hello World Lesson for instructions on how to use my code packages.

The first example is in send_recv.c. Some of the major parts of the program are shown below.

  // Find out rank, size
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
 
  int number;
  if (world_rank == 0) {
    number = -1;
    MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
  } else if (world_rank == 1) {
    MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    printf("Process 1 received number %d from process 0\n",
            number);
  }

MPI_Comm_rank and MPI_Comm_size are first used to determine the rank of the process along with the world size. Then process zero initializes a number to the value of negative one and sends this value to process one. As you can see in the else if statement, process one calls MPI_Recv to receive the number and prints out the received value.

Since we are sending and receiving exactly one integer, each process requests that one MPI_INT be sent/received. Each process also uses a tag number of zero to identify the message. The receiving process could have also used the predefined constant MPI_ANY_TAG for the tag number (wildcard tags are valid only in MPI_Recv; a send must always specify a concrete tag) since only one type of message was being transmitted.

Running the example program looks like this.

>>> tar -xzf mpi_send_recv.tgz
>>> cd mpi_send_recv
>>> make
mpicc -o send_recv send_recv.c
mpicc -o ping_pong ping_pong.c
mpicc -o ring ring.c
>>> ./run.perl send_recv
mpirun -n 2 ./send_recv
Process 1 received number -1 from process 0

As expected, process one receives negative one from process zero.

MPI Ping Pong Program

The next example is a ping pong program. In this example, processes use MPI_Send and MPI_Recv to continually bounce messages off of each other until they decide to stop. Take a look at ping_pong.c in the example code download. The major portions of the code look like this.

  int ping_pong_count = 0;
  int partner_rank = (world_rank + 1) % 2;
  while (ping_pong_count < PING_PONG_LIMIT) {
    if (world_rank == ping_pong_count % 2) {
      // Increment the ping pong count before you send it
      ping_pong_count++;
      MPI_Send(&ping_pong_count, 1, MPI_INT, partner_rank, 0,
               MPI_COMM_WORLD);
      printf("%d sent and incremented ping_pong_count "
             "%d to %d\n", world_rank, ping_pong_count,
             partner_rank);
    } else {
      MPI_Recv(&ping_pong_count, 1, MPI_INT, partner_rank, 0,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      printf("%d received ping_pong_count %d from %d\n",
             world_rank, ping_pong_count, partner_rank);
    }
  }

This example is meant to be executed with only two processes. The processes first determine their partner with some simple arithmetic. A ping_pong_count is initialized to zero, and it is incremented at each ping pong step by the sending process. As the ping_pong_count is incremented, the processes take turns being the sender and the receiver. Finally, after the limit is reached (ten in my code), the processes stop sending and receiving. The output of the example code will look something like this.

>>> ./run.perl ping_pong
0 sent and incremented ping_pong_count 1 to 1
0 received ping_pong_count 2 from 1
0 sent and incremented ping_pong_count 3 to 1
0 received ping_pong_count 4 from 1
0 sent and incremented ping_pong_count 5 to 1
0 received ping_pong_count 6 from 1
0 sent and incremented ping_pong_count 7 to 1
0 received ping_pong_count 8 from 1
0 sent and incremented ping_pong_count 9 to 1
0 received ping_pong_count 10 from 1
1 sent and incremented ping_pong_count 1 to 0
1 received ping_pong_count 2 from 0
1 sent and incremented ping_pong_count 3 to 0
1 received ping_pong_count 4 from 0
1 sent and incremented ping_pong_count 5 to 0
1 received ping_pong_count 6 from 0
1 sent and incremented ping_pong_count 7 to 0
1 received ping_pong_count 8 from 0
1 sent and incremented ping_pong_count 9 to 0
1 received ping_pong_count 10 from 0

Your output will likely be interleaved differently from run to run, since the order in which the two processes flush their output is not deterministic. However, as you can see, processes zero and one are taking turns sending and receiving the ping pong counter.
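
The turn-taking rule can be checked without MPI at all. Here is a small sketch (the helper name is made up for illustration) of the alternation logic used in the loop:

```c
// Hypothetical helper (not in the tutorial's code) mirroring the loop
// condition above: the rank whose turn it is to send equals the parity of
// the current ping_pong_count, so the two ranks alternate roles.
static int sender_for(int ping_pong_count) { return ping_pong_count % 2; }

// For example, at count 0 rank 0 sends; after it increments the count to 1,
// rank 1 takes the next turn, and so on until the limit is reached.
```
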

Ring Program

I have included one more example of MPI_Send and MPI_Recv using more than two processes. In this example, a value is passed around by all processes in a ring-like fashion. Take a look at ring.c in the example code download. The major portion of the code looks like this.

  int token;
  if (world_rank != 0) {
    MPI_Recv(&token, 1, MPI_INT, world_rank - 1, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Process %d received token %d from process %d\n",
           world_rank, token, world_rank - 1);
  } else {
    // Set the token's value if you are process 0
    token = -1;
  }
  MPI_Send(&token, 1, MPI_INT, (world_rank + 1) % world_size,
           0, MPI_COMM_WORLD);
  // Now process 0 can receive from the last process.
  if (world_rank == 0) {
    MPI_Recv(&token, 1, MPI_INT, world_size - 1, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Process %d received token %d from process %d\n",
           world_rank, token, world_size - 1);
  }

The ring program initializes a value from process zero, and the value is passed around every single process. The program terminates when process zero receives the value from the last process. As you can see from the program, extra care is taken to ensure that it doesn’t deadlock. In other words, process zero makes sure that it has completed its first send before it tries to receive the value from the last process. All of the other processes simply call MPI_Recv (receiving from their neighboring lower process) and then MPI_Send (sending the value to their neighboring higher process) to pass the value along the ring.

MPI_Recv will block until the message has arrived (and MPI_Send will block at least until the send buffer is safe to reuse). Because of this, the printfs should occur in the order in which the value is passed. Using five processes, the output should look like this.

>>> ./run.perl ring
Process 1 received token -1 from process 0
Process 2 received token -1 from process 1
Process 3 received token -1 from process 2
Process 4 received token -1 from process 3
Process 0 received token -1 from process 4

As we can see, process zero first sends a value of negative one to process one. This value is passed around the ring until it gets back to process zero.

Up Next

Now that you have a basic understanding of MPI_Send and MPI_Recv, it is time to go a little bit deeper into these functions. In the next lesson, I cover how to probe and dynamically receive messages. Feel free to also examine the beginner MPI tutorial for a complete reference of all of the beginning MPI lessons.

Having trouble? Confused? Feel free to leave a comment below and perhaps I or another reader can be of help.

Recommended books

View all recommended books here.

15 Responses to “MPI Send and Receive”

  1. mcat

    Hi,
Can the ring-like fashion example have any problem if the token goes around all the processes many times?

    thanks!

    • Wes

      Hello, the code currently does not send the token around the ring multiple times, but it should work in practice if you wanted to modify the code. For an MPI example that is more similar to your question, I invite you to check out the random walk MPI application.

  2. Sotiris

    Hello, this is a great tutorial thanks,

    Just a notice, I think that you should change

    int destination -> int source

    in the MPI_Recv function declaration. I was a little confused when I saw it :P

  3. clark wu

    I tried to follow the example of send_recv, but kept getting the following error; your hints are greatly welcome.
    ……
    processor name HP-xw6600 Process id 0
    processor name robot-desk Process id 1
    Fatal error in MPI_Send: Other MPI error, error stack:
    MPI_Send(173)…………..: MPI_Send(buf=0xbfddada8, count=1, MPI_INT, dest=1, tag=0, MPI_COMM_WORLD) failed
    MPID_nem_tcp_connpoll(1826): Communication error with rank 1: Connection refused
    ……

    • Wes

Hello Clark, have you successfully executed any MPI programs yet? It appears that your MPI installation, runtime, or hardware is not configured properly.

  4. Joy Prakash Sharma

    How come the control jumps from MPI_Send() in if to MPI_Recv() in else if.
    As far as decision making conditions are concerned it should only execute a single block of code.

    • Wes

      Hello, the reason for the “if” and “else if” is because this code is being executed on two different processes at the same time. The first process has a world_rank of 0 and ends up calling the MPI_Send function (sending the number -1 to the receiving process). The second process has a world_rank of 1 and ends up calling MPI_Recv to receive the number sent from process 0.

    • Krish

If you imagine all the processes running simultaneously: when process 1 calls the receive function, it waits there until process 0 sends the information… Seriously, I spent hours staring at this code. I learned so many things.

  5. Hoagha

    Thank you.
    Your website is the best MPI tutorial that I’ve ever seen.
    I like your method, because you make it simple and provide helpful examples. I wish you can make it more complete

    • Wes

      Thank you so much for the feedback. I have been trying to start a business lately and unfortunately have not had time to add to this site. I’m going to start adding more content every night and try to produce some other tutorials. Thanks again!

  6. hyuga

Nice blog. I’m really a beginner in parallel computing.

Need your help, thanks.

I have several data files that need to be sent separately, in parallel, to every node. But when I run it in parallel, all the data just gets sent to a single node; actually, the whole package is sent to a single node in my cluster.

    case 1:
    fin.open(“kddcup.data.corrected.001″);

    break;

    case 2:
    fin.open(“kddcup.data.corrected.002″);

    break;// fin.open(“example.txt”);
    case 3:
    fin.open(“kddcup.data.corrected.003″);
    break;
    case 4:
    fin.open(“kddcup.data.corrected.004″);
    break;

How can I send the first data file to the first node and the second to the second node?
Thanks, I appreciate your help. If you need the whole code, just mail me.

