Motivation
If a target program executes receive statements infrequently, the null message protocol implementation outlined above is fast, except in situations similar to the one shown in Figure 4.1.
Figure 4.1: Null Message Protocol inadequacy
Let
,
and
be the LPs in the
simulation of the illustrated system. Assume the
source set of each LP is all other LPs.
sends messages to
and
timestamped
1050 and 1000 respectively.
and
, upon
reaching their respective receive statements at simulation time
50, request each other's EOTs and the EOT of
.
Both
and
compute their EOTs to be 50+L, where
L is the minimum communication latency of the target machine.
Consequently, since
's EOT must be significantly larger,
both LPs compute their EITs to be 50+L. A blocked
LP can advance its simulation time to the maximum of
its EIT and its current simulation time. Consequently,
and
advance their simulation times to 50+L. Assuming
L is fairly small, both
and
must go through many such rounds
of null messages before they are able to advance their simulation
times and EITs enough to accept the message from
.
This problem can be alleviated within the null message protocol if, when at a receive statement, an LP can be given a better lookahead than simply L, the minimum communication latency of the target machine. The only way to have better lookahead is for an LP to predict how long it would take to send a message if it were resumed instantly. The predicted value can be added to L to produce a significantly better lookahead. This method requires static analysis of the target program, in order to get a lower bound on the execution time of each local code block. This method is neither portable, since an analysis of each target instruction set needs to be performed, and does not always guarantee success, since there may be a send statement immediately after the receive statement.
Protocol
The conditional event protocol offers a simpler solution to
the above problem. The ECOT[Jha93] (for Earliest
Conditional Output Time) of an LP is a lower bound on the
timestamp of the next message it will send, provided
it does not get any messages from other LPs.
Assuming the messages sent by all LPs have been received
before the ECOT computation at any LP is performed
(i.e. no messages are in transit), it is easy to see
that a lower bound on the receive timestamp of the next message to arrive at
any LP is the minimum ECOT of all LPs.
In other words, the minimum ECOT of all LPs is the EIT
for each LP in the simulation.
For a running LP,
ECOT is the same as the EOT (current
simulation time plus L). For a blocked LP, ECOT is
the receive timestamp of the earliest matching message currently in
the LP's queue (since that is the earliest time the LP
can get resumed if it did not get any messages) plus L.
The utility of the conditional event protocol is evident in
the situation shown in Figure 4.1:
when at their respective receive statements,
and
would calculate their ECOTs to be 1050 and
1000 respectively.
would
calculate its ECOT to be at least 1000, and consequently in a single
round of the conditional event protocol,
and
would
advance their EITs to 1000.
A message in transit might reduce an LP's ECOT when it arrives.
In the worst case, a message may cause an LP to send a
message as soon as it arrives. Consequently, in order to
be able to compute EIT at any point in the simulation
(i.e. even when messages are in transit), it is sufficient
to calculate the EIT of any LP as the minimum of the
minimum ECOT of all LPs, and the minimum receive timestamp of
all messages in transit. Since EIT computation using the conditional
event protocol is a global computation (i.e. over all LPs), we can
associate a common global round number with the ECOT computation at each LP.
An LP keeps track of the messages in transit
in the following way:
an LP computes its effective ECOT as
, where E is its ECOT, and M is the receive timestamp
of the earliest message it sent in the current round. It
broadcasts its effective ECOT to all LPs and immediately starts
a new round. Upon receiving
the effective ECOTs of all LPs in the previous round, it computes its EIT
as the minimum of all effective ECOTs (the received
effective ECOTs and its own effective ECOT from the previous round).
All LPs are always within
one round of each other, and consequently one bit in the message
containing an LP's ECOT is sufficient to distinguish its round number.
The conditional event protocol can be made demand driven by an LP sending its effective ECOTs to other LPs only if it reaches a receive statement and needs to compute its EIT, or if it gets an effective ECOT from another LP. In this way, the protocol gets switched on only if at least one LP is at a receive statement, and automatically gets switched off when all LPs complete their receives.