
Parallel Execution of the Simulation Model

  We now describe how the above simulation model is executed using the parallel simulation protocol described in the previous chapter. LPs have two attributes associated with them at all times:

An LP executes without synchronizing with other LPs until it blocks on a specific request. A blocked LP in deterministic mode matches its request list against its message queues until the request it is waiting on is satisfied. Matching respects simulation-time order in both directions: a message cannot satisfy a request while an earlier matching request is still unsatisfied, and a request cannot claim a message while an earlier matching message is still unclaimed. While matching, the LP does not need to synchronize with other LPs to ensure that a match is safe, i.e. that no matching message with an earlier timestamp can still arrive (see Section 4.5 for a discussion of safe selections). This is because deterministic code does not require a simulation protocol (see Section 4.5.2.1).
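The two ordering rules above can be illustrated with a minimal Python sketch. It is not the simulator's implementation: the `Message` and `Request` classes, the `post_time` field, and the exact (source, tag) matching criterion are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    send_time: float      # send timestamp in simulation time
    source: int           # rank of the sending LP
    tag: int
    claimed: bool = False


@dataclass
class Request:
    post_time: float      # simulation time at which the request was issued (assumed field)
    source: int
    tag: int
    matched: Optional[Message] = None


def matches(req: Request, msg: Message) -> bool:
    # Assumed matching criterion: exact source and tag, no wildcards.
    return req.source == msg.source and req.tag == msg.tag


def match_requests(requests: List[Request], messages: List[Message]) -> None:
    """Match requests to messages in simulation-time order.

    Requests are visited earliest first, so a later request never takes a
    message that an earlier unsatisfied request could have used. Each request
    takes the earliest unclaimed matching message, so no earlier matching
    message is left unclaimed.
    """
    by_send_time = sorted(messages, key=lambda m: m.send_time)
    for req in sorted(requests, key=lambda r: r.post_time):
        if req.matched is not None:
            continue
        for msg in by_send_time:
            if not msg.claimed and matches(req, msg):
                msg.claimed = True
                req.matched = msg
                break
```

With two requests for the same source and tag, the earlier request receives the message with the earlier send timestamp, regardless of the order in which the messages sit in the queue. A request with no matching message stays unsatisfied.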

A blocked LP in non-deterministic mode uses a combination of the conditional event and null message protocols to determine which messages in its message queues are safe. Safe messages are matched with the LP's request list in exactly the same way as for deterministic LPs, until the request the LP is waiting on is satisfied.

In order to explain in detail the computations performed by the null message and conditional event protocols, we define the following terms:

  1. Message EOT, $EOT^{m}_{P,C}$: A lower bound on the send timestamp of the next message LP P will send on communicator C.
  2. Acknowledgement EOT, $EOT^{a}_{P,C}$: A lower bound on the send timestamp of the next acknowledgement LP P will send on communicator C.
  3. ECOT, $ECOT^{r}_{P}$: In round r of the conditional event protocol, a lower bound on the send timestamp of the next message or acknowledgement that LP P will send, provided it does not get any additional messages or acknowledgements (see Section 4.5.2.2).
  4. Effective ECOT, $EECOT^{r}_{P}$: The minimum of $ECOT^{r}_{P}$ and the send timestamp of the earliest message sent by P in round r.
  5. Message EIT, $EIT^{m}_{P,C}$: A lower bound on the send timestamp of the next message LP P will get on communicator C, as calculated by the null message protocol.
  6. Acknowledgement EIT, $EIT^{a}_{P,C}$: A lower bound on the send timestamp of the next acknowledgement LP P will get on communicator C, as calculated by the null message protocol.
  7. Conditional event EIT, $EIT^{c}_{P}$: A lower bound on the send timestamp of the next message or acknowledgement LP P will get, as calculated by the conditional event protocol.
  8. $\widehat{EIT}^{m}_{P,C}$: Defined as $\max(EIT^{m}_{P,C}, EIT^{c}_{P})$.
  9. $\widehat{EIT}^{a}_{P,C}$: Defined as $\max(EIT^{a}_{P,C}, EIT^{c}_{P})$.
In all the above definitions, we use the send timestamps of messages and acknowledgements rather than their receive timestamps, since in the MPI simulation model messages are accepted in order of their send timestamp, not their receive timestamp (see Section 5.2.1.2).

The null message protocol uses the communication topology specified by the target program, since it computes EIT on a per-communicator basis. At LP P, the message EIT, $EIT^{m}_{P,C}$, is calculated as $\min_{Q \in C} EOT^{m}_{Q,C}$, i.e. the minimum of the message EOTs of all LPs in communicator C. The acknowledgement EIT, $EIT^{a}_{P,C}$, is computed in a similar way from the acknowledgement EOTs. The conditional event protocol cannot use the communication topology, so it computes a single EIT for all communicators. Consequently, for LP P, the conditional event EIT, $EIT^{c}_{P}$, is calculated as the minimum of the ECOTs of all LPs in the simulation, taken over the latest round r for which P has received all effective ECOTs.

At LP P, a message on communicator C is deemed safe if its send timestamp is less than $\max(EIT^{m}_{P,C}, EIT^{c}_{P})$. In other words, the better (i.e. larger) of the EIT estimates of the null message and conditional event protocols is used in deciding whether a message is safe to process. As described in the previous chapter, we use a demand-driven implementation of the null message and conditional event protocols, and hence both protocols are automatically switched off when all LPs are in deterministic mode.
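The safety test described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the simulator's code: the function names are hypothetical, EOT and ECOT values are assumed to be available as dictionaries keyed by LP rank, and the values passed to `conditional_event_eit` are assumed to be those exchanged for the latest complete round.

```python
from typing import Dict


def null_message_eit(eots_in_communicator: Dict[int, float]) -> float:
    """EIT on one communicator: the minimum EOT over the LPs in it.

    Pass message EOTs to get the message EIT, or acknowledgement EOTs
    to get the acknowledgement EIT.
    """
    return min(eots_in_communicator.values())


def conditional_event_eit(ecots_for_round: Dict[int, float]) -> float:
    """Single EIT for all communicators: the minimum over every LP in the
    simulation, for the latest round whose values have all been received."""
    return min(ecots_for_round.values())


def is_safe(send_time: float, null_eit: float, cond_eit: float) -> bool:
    """A message is safe if its send timestamp is strictly less than the
    better (larger) of the two lower bounds."""
    return send_time < max(null_eit, cond_eit)


# Hypothetical values: three LPs in communicator C, four LPs in the simulation.
message_eots_in_c = {0: 12.0, 1: 8.0, 2: 15.0}
ecots_latest_round = {0: 10.0, 1: 11.0, 2: 9.5, 3: 20.0}

eit_null = null_message_eit(message_eots_in_c)        # 8.0
eit_cond = conditional_event_eit(ecots_latest_round)  # 9.5
```

In this example the conditional event protocol gives the tighter bound, so a message with send timestamp 9.0 is safe even though the null message bound alone (8.0) would not allow it. A message with send timestamp 9.5 is not safe, because the comparison is strict.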

We now describe how an LP computes its message and acknowledgement EOTs (i.e. $EOT^{m}_{P,C}$ and $EOT^{a}_{P,C}$ respectively) and its ECOT (i.e. $ECOT^{r}_{P}$). An LP must be able to compute these values irrespective of its execution or simulation status. In all the calculations presented subsequently, let the current simulation time of LP P be $T_{P}$. We assume that the simulation time is updated whenever EITs change, so that the following inequalities always hold: (a) if P is blocked waiting for an acknowledgement on communicator C, $T_{P} \ge \max(EIT^{a}_{P,C}, EIT^{c}_{P}) + L$, and (b) if P is blocked waiting for a message on communicator C, $T_{P} \ge \max(EIT^{m}_{P,C}, EIT^{c}_{P}) + L$. Here $EIT^{m}_{P,C}$ and $EIT^{a}_{P,C}$ are the message and acknowledgement EITs computed by the null message protocol, $EIT^{c}_{P}$ is the EIT computed by the conditional event protocol, and L is the minimum message latency of the target machine; consequently, the receive timestamp of any message or acknowledgement must be at least L more than its send timestamp. Additionally, we assume that if P has terminated, $T_{P}$ is $\infty$.








Andy Kahn
Wed Jun 25 20:28:02 PDT 1997