Our results are based on a simulation engine written for a
compiler of the data-parallel language UC [Bag] on the
IBM-SP2. The compiler translates UC code into Maisie [Bag94] with
MPL calls for fast communication and synchronization over the
HPS (High Performance Switch). The resulting program executes as one
single-threaded process per processor.
The simulation engine is simply a modified version of the same compiler, which:
- Also produces Maisie code. In the simulator,
Maisie additionally provides
lightweight threads (in order to put many
logical simulation processes on one processor) and
point-to-point communication
between threads (whether they are on the same or different processors).
- Inserts code at the beginning and end of each local code block to
measure the block's execution time and to update the simulation time.
- Replaces the MPL communication routines (broadcasts,
reductions, barriers and point-to-point communication)
with the corresponding communication simulation
routines. For example, we identified the tree algorithm used on the
IBM-SP2 for broadcasts, barriers and reductions. The simulator uses the
same algorithm, built from point-to-point communications, to simulate each of those routines.
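
As an illustration of simulating a collective routine with point-to-point communications, the following minimal sketch propagates a broadcast down a tree and tracks each process's simulated clock. It rests on several assumptions that do not come from the text: the tree is taken to be a binomial tree rooted at process 0, each send is assumed to occupy the sender for one full message delay, and the names `simulate_tree_broadcast`, `start_times` and `p2p_delay` are hypothetical. The actual tree shape used on the IBM-SP2 is not specified here.

```python
def simulate_tree_broadcast(num_procs, start_times, p2p_delay):
    """Return the simulated time at which each process finishes the broadcast.

    Assumptions (illustrative only): binomial tree rooted at process 0,
    a constant point-to-point delay, and blocking sends.
    start_times[i] is the simulated time at which process i enters the call.
    """
    ready = [None] * num_procs
    ready[0] = start_times[0]          # the root already holds the data
    step = 1
    while step < num_procs:
        # In this round, every process that already holds the data
        # (ranks 0 .. step-1) sends to the rank `step` above it.
        for src in range(step):
            dst = src + step
            if dst >= num_procs:
                continue
            arrival = ready[src] + p2p_delay
            # A receiver cannot finish before it has entered the call itself.
            ready[dst] = max(arrival, start_times[dst])
            # Blocking-send assumption: the sender is busy until arrival.
            ready[src] = arrival
        step *= 2
    return ready


if __name__ == "__main__":
    print(simulate_tree_broadcast(4, [0.0, 0.0, 0.0, 0.0], 1.0))
```

Under these assumptions the broadcast completes in a number of rounds logarithmic in the number of processes, and a process that enters the call late delays only the subtree below it.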
The simulation optimizations discussed in the previous section
are inserted at all of these points.
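
The timing code that the modified compiler places around each local code block can be pictured with the sketch below. It is not the generated Maisie code; the class `SimClock` and its methods are hypothetical names, and the use of a wall-clock timer (`time.perf_counter`) is an assumption made only to keep the example self-contained.

```python
import time


class SimClock:
    """Per-thread simulation clock advanced by measured local execution time."""

    def __init__(self, timer=time.perf_counter):
        self.timer = timer        # source of real (measured) time
        self.sim_time = 0.0       # simulated time of this logical process
        self._block_start = None

    def begin_local_block(self):
        # Call inserted at the beginning of a local code block.
        self._block_start = self.timer()

    def end_local_block(self):
        # Call inserted at the end of a local code block: add the measured
        # execution time of the block to the simulation time.
        elapsed = self.timer() - self._block_start
        self._block_start = None
        self.sim_time += elapsed
        return elapsed


if __name__ == "__main__":
    clock = SimClock()
    clock.begin_local_block()
    total = sum(i * i for i in range(100_000))   # stand-in for local code
    clock.end_local_block()
    print(f"simulated time after block: {clock.sim_time:.6f} s")
```

Passing the timer in as a parameter keeps the measurement source replaceable, for example by a deterministic counter when testing.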
Point-to-point message delays are predicted using an analytic model based on
linear interpolation and extrapolation of measurements of delays
for selected message sizes.
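
A minimal sketch of such a delay model follows. The function name `predict_delay` and the sample measurements in the usage part are hypothetical and are not data from our experiments. Sizes inside the measured range are interpolated between the two neighbouring measurements; sizes outside it are extrapolated along the first or last segment.

```python
from bisect import bisect_left


def predict_delay(size, measurements):
    """Predict the delay for a message of `size` bytes.

    `measurements` is a list of (message_size, delay) pairs, sorted by
    message size, with at least two entries and distinct sizes.
    """
    sizes = [s for s, _ in measurements]
    # Index of the right endpoint of the segment to use, clamped so that
    # sizes below or above the measured range reuse the first or last
    # segment (linear extrapolation).
    hi = min(max(bisect_left(sizes, size), 1), len(sizes) - 1)
    lo = hi - 1
    (s0, d0), (s1, d1) = measurements[lo], measurements[hi]
    slope = (d1 - d0) / (s1 - s0)
    return d0 + slope * (size - s0)


if __name__ == "__main__":
    # Made-up (size in bytes, delay in microseconds) pairs, for illustration.
    samples = [(0, 10.0), (100, 20.0), (1000, 65.0)]
    for n in (50, 100, 2000):
        print(n, predict_delay(n, samples))
```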
The resulting simulation program executes as one multi-threaded process
per processor.
Andy Kahn
Wed Jun 25 20:28:02 PDT 1997