We propose an implementation of the MPI send functions that may not match any existing MPI implementation, but that is simple, conforms to the MPI standard and, most importantly, yields a simulation model that accurately predicts real behavior.
The MPI standard requires that messages sent in buffered mode use the area specified by the function call MPI_Buffer_attach for message buffering. Hence, a process can statically or dynamically allocate memory and hand it to MPI for message buffering using MPI_Buffer_attach. Messages on all communicators share the same buffer. Our implementation assumes sender buffering, meaning that message data is not sent to the receiver until the receiver has posted a matching receive. Consequently, when a message is sent using MPI_Ssend, the data itself is not sent at first; only a request-to-send is sent, consisting of the source, destination, tag and communicator of the message. When requests-to-send arrive at the receiver, they are not stored in the attached buffer area, but in a separate unbounded area. We assume this area can never overflow, which is realistic given the small size of a request-to-send. When a matching receive is posted, a ready-to-receive reply is returned to the sender, after which the message data is transferred.
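The handshake described above can be illustrated with a small sketch. This is not MPI code but a minimal Python model written for illustration; the class names Sender and Receiver, their methods, and the integer request ids are all hypothetical. It also assumes exact matching on (source, tag, communicator) and ignores wildcard receives.

```python
from collections import namedtuple

# A request-to-send carries only this envelope, never the payload.
Envelope = namedtuple("Envelope", ["source", "dest", "tag", "comm"])


class Sender:
    """Hypothetical sender: keeps the payload until a ready-to-receive arrives."""

    def __init__(self, rank):
        self.rank = rank
        self.held = {}      # request id -> payload still held at the sender
        self.next_id = 0

    def ssend(self, data, receiver, tag, comm=0):
        req = self.next_id
        self.next_id += 1
        self.held[req] = data
        env = Envelope(self.rank, receiver.rank, tag, comm)
        receiver.on_request_to_send(env, self, req)   # envelope only
        return req

    def on_ready_to_receive(self, req):
        # The reply from the receiver triggers the actual data transfer.
        return self.held.pop(req)

    def is_complete(self, req):
        return req not in self.held


class Receiver:
    """Hypothetical receiver: stores unmatched envelopes in an unbounded list."""

    def __init__(self, rank):
        self.rank = rank
        self.pending = []    # unmatched requests-to-send (assumed never to overflow)
        self.posted = []     # receives posted before a matching request arrived
        self.delivered = []  # payloads, in delivery order

    def on_request_to_send(self, env, sender, req):
        key = (env.source, env.tag, env.comm)
        if key in self.posted:
            self.posted.remove(key)
            self.delivered.append(sender.on_ready_to_receive(req))
        else:
            self.pending.append((key, sender, req))

    def post_recv(self, source, tag, comm=0):
        key = (source, tag, comm)
        for i, (pending_key, sender, req) in enumerate(self.pending):
            if pending_key == key:
                del self.pending[i]
                self.delivered.append(sender.on_ready_to_receive(req))
                return
        self.posted.append(key)
```

In this model, calling ssend before the receiver posts a receive leaves the payload at the sender and only an envelope at the receiver; the payload moves once post_recv finds the matching envelope.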
MPI_Bsend copies the message to be sent into the area declared by MPI_Buffer_attach, provided that this area has not been exhausted by other messages sent using MPI_Bsend; if it has, the program is in error. The message is then sent to the receiver as with MPI_Ssend, using a pointer to the copy held in the buffer.
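The buffer accounting implied by this description might be modeled as follows. This is a minimal sketch under stated assumptions: the names AttachedBuffer, bsend_copy, release and BufferExhausted are hypothetical, space is counted in payload bytes only, and the per-message bookkeeping overhead that the MPI standard accounts for (MPI_BSEND_OVERHEAD) is ignored.

```python
class BufferExhausted(RuntimeError):
    """Models the program error that occurs when the attached buffer is full."""


class AttachedBuffer:
    """Hypothetical model of the area handed to MPI via MPI_Buffer_attach."""

    def __init__(self, size):
        self.size = size     # total bytes attached
        self.used = 0        # bytes currently occupied by buffered messages
        self.slots = {}      # slot id -> buffered copy of a message
        self.next_slot = 0

    def bsend_copy(self, data):
        # Copy the message into the buffer, or fail if it does not fit.
        if self.used + len(data) > self.size:
            raise BufferExhausted("attached buffer exhausted")
        slot = self.next_slot
        self.next_slot += 1
        self.slots[slot] = bytes(data)
        self.used += len(data)
        return slot          # stands in for the pointer passed to the send

    def release(self, slot):
        # Called once the buffered message has been transferred to the receiver.
        data = self.slots.pop(slot)
        self.used -= len(data)
        return data
```

A simulation could call bsend_copy when a buffered send is issued and release when the corresponding transfer completes, so that buffer exhaustion shows up as an exception rather than as silent blocking.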
As mentioned earlier, we assume that MPI_Rsend is implemented as an MPI_Ssend, and that MPI_Send actually uses either MPI_Ssend or MPI_Bsend. All sends can therefore be implemented using only MPI_Ssend and MPI_Bsend as the real underlying message passing calls, and even these two are very similar. Since each blocking send is functionally equivalent to its non-blocking counterpart followed by MPI_Wait, a simulation model built from MPI_Issend, MPI_Ibsend and MPI_Wait should be sufficient to cover all types of sends.
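The reduction described above can be written down as a small lookup. The function lower_send and its send_is_buffered flag are hypothetical names introduced here; in particular, the flag merely stands in for whatever rule decides whether MPI_Send behaves like MPI_Ssend or MPI_Bsend, which this sketch does not try to reproduce.

```python
def lower_send(call, send_is_buffered=False):
    """Map an MPI send call to the primitives used by the simulation model.

    Returns a list drawn from MPI_Issend, MPI_Ibsend and MPI_Wait.
    `send_is_buffered` is an assumed input that says which way a
    standard-mode send (MPI_Send / MPI_Isend) is resolved.
    """
    nonblocking = call.startswith("MPI_I")
    prefix = "MPI_I" if nonblocking else "MPI_"
    mode = call[len(prefix):].lower()   # "send", "ssend", "rsend" or "bsend"

    if mode in ("ssend", "rsend"):      # ready mode is treated as synchronous
        primitive = "MPI_Issend"
    elif mode == "bsend":
        primitive = "MPI_Ibsend"
    elif mode == "send":                # standard mode: one of the two above
        primitive = "MPI_Ibsend" if send_is_buffered else "MPI_Issend"
    else:
        raise ValueError("not an MPI send call: " + call)

    # A blocking send is its non-blocking counterpart followed by MPI_Wait.
    return [primitive] if nonblocking else [primitive, "MPI_Wait"]
```

For example, lower_send("MPI_Rsend") yields the pair MPI_Issend, MPI_Wait, while a non-blocking call such as MPI_Ibsend maps to itself and leaves the MPI_Wait to the program.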