Developed at the University of California, Los Angeles, Parallel Computing Lab
Developers: Sundeep Prakash, Andy Kahn, Stephen Docy
Advisor: Rajive Bagrodia (rajive@cs.ucla.edu)
Direct execution and parallel discrete-event simulation make it practical to simulate large parallel programs accurately and to predict application performance as architectural and algorithmic features vary. We have developed COMPASS, a COMponent-Based Parallel System Simulator, to provide direct execution-driven, parallel simulation for performance prediction of computation-, communication-, and I/O-intensive parallel programs written using the MPI message-passing library. In particular, simulation components have been developed to predict the behavior of applications as a function of communication latency, the number of available processors on the architecture of interest, different caching strategies for parallel I/O, parallel file system characteristics, and alternative implementations of collective communication and I/O commands.
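To make the idea of predicting run time as a function of communication latency concrete, the following is a minimal sketch in Python. It is not COMPASS code and does not reflect its actual design: the function `predict_runtime`, the trace format, and the timing values are all hypothetical. In a direct execution-driven simulator the compute times would be obtained by running the application code itself; here they are simply given as numbers, and only point-to-point messages with a fixed latency are modeled.

```python
from collections import deque


def predict_runtime(traces, latency):
    """Predict completion time for a set of simulated processes.

    traces  -- one list of operations per process (hypothetical format):
               ("compute", seconds), ("send", dest_rank), ("recv", src_rank)
    latency -- assumed fixed delivery delay for every message, in seconds
    """
    n = len(traces)
    clock = [0.0] * n   # per-process simulated clock
    pc = [0] * n        # index of the next operation for each process
    mailbox = {}        # (src, dest) -> queue of message arrival times

    progressed = True
    while progressed:
        progressed = False
        for rank in range(n):
            while pc[rank] < len(traces[rank]):
                op, arg = traces[rank][pc[rank]]
                if op == "compute":
                    # Stand-in for time measured by direct execution.
                    clock[rank] += arg
                elif op == "send":
                    arrival = clock[rank] + latency
                    mailbox.setdefault((rank, arg), deque()).append(arrival)
                elif op == "recv":
                    queue = mailbox.get((arg, rank))
                    if not queue:
                        break  # blocked until the sender has progressed
                    # The receiver cannot continue before the message arrives.
                    clock[rank] = max(clock[rank], queue.popleft())
                else:
                    raise ValueError(f"unknown operation: {op}")
                pc[rank] += 1
                progressed = True

    if any(pc[r] < len(traces[r]) for r in range(n)):
        raise RuntimeError("deadlock: a receive has no matching send")
    return max(clock)


# Hypothetical two-process ping-pong workload.
ping_pong = [
    [("compute", 1.0), ("send", 1), ("recv", 1)],
    [("recv", 0), ("compute", 2.0), ("send", 0)],
]

for latency in (0.0, 0.5, 1.0):
    print(latency, predict_runtime(ping_pong, latency))
```

Sweeping `latency` as in the loop above mirrors, in a very reduced form, the kind of sensitivity study described here: the workload stays fixed while one architectural parameter is varied.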
The simulator has been used to predict the performance of applications on both distributed-memory machines, such as the IBM SP, and shared-memory machines, such as the SGI Origin 2000. Furthermore, we have illustrated the usefulness of COMPASS as a versatile performance prediction tool by analyzing both real-world scientific programs and synthetic benchmarks to study application scalability, sensitivity to interconnection latency, and the interplay between factors such as communication pattern and parallel file system caching. We have also shown that the simulator is accurate in its predictions and that it is itself scalable: it uses parallel execution to reduce its own running time, in some cases achieving near-linear speedup.
COMPASS is being used for detailed program simulations within the POEMS (Performance Oriented End-to-end Modeling System) project. POEMS is a collaborative, multi-institute project whose goal is to create and experimentally evaluate a problem-solving environment for end-to-end performance modeling of complex parallel and distributed systems.
MPISIM software and benchmarks