In the simulation, an LP is implemented as a thread.
Each thread executes the target MPI program, which has previously
been linked with MPISIM, so that during the simulation, instead
of calling the actual MPI function,
the thread calls the corresponding function of the simulation model.
Non-MPI code is directly executed, as dictated by the simulation model.
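
The thread-per-LP structure described above can be illustrated with a minimal sketch. It is not MPISIM code: the names `SimComm`, `target_program`, and `run_simulation` are hypothetical, and the per-rank message queues stand in for whatever the actual simulation model does. The sketch only shows one thread per target process running the same program body, with communication routed through a simulation-side object while all other code runs directly.

```python
import queue
import threading


class SimComm:
    """Hypothetical stand-in for the simulation model's communication layer."""

    def __init__(self, size):
        self.size = size
        # One mailbox per target process (assumption: simple FIFO delivery).
        self.mailboxes = [queue.Queue() for _ in range(size)]

    def send(self, dest, message):
        # Called in place of a real MPI send.
        self.mailboxes[dest].put(message)

    def recv(self, rank):
        # Called in place of a real MPI receive; blocks until a message arrives.
        return self.mailboxes[rank].get()


def target_program(rank, comm, results):
    """Toy target program: pass the own rank to the next process in a ring."""
    neighbour = (rank + 1) % comm.size   # non-communication code runs directly
    comm.send(neighbour, rank)           # communication goes through the model
    results[rank] = comm.recv(rank)


def run_simulation(num_target_processes):
    """Start one thread (LP) per target process and wait for all to finish."""
    comm = SimComm(num_target_processes)
    results = [None] * num_target_processes
    threads = [
        threading.Thread(target=target_program, args=(rank, comm, results))
        for rank in range(num_target_processes)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With four target processes, `run_simulation(4)` returns `[3, 0, 1, 2]`, since each LP receives the rank of its predecessor in the ring.
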
There
are two inputs to the simulation: the number of processes
in the target program execution, which determines
the number of threads to start, and the number of host
processes.
The degree of parallelism
of the simulation is equal to the number of host processes onto
which the threads are distributed. Threads are mapped onto host
processes using a simple block mapping scheme, such that
if $n_k$ and $n_l$ are the numbers of threads on any two host
processes $k$ and $l$, then $|n_k - n_l| \le 1$.
For example, if there are 10 target processes and 4 host processes,
then 10 threads need to be started. Host process 0 is assigned
threads 0-2, host process 1 threads 3-5, host process
2 threads 6-7 and host process 3 threads 8-9.
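
The block mapping can be sketched as follows. The text only states that thread counts on any two host processes differ by at most one; the rule that the lower-numbered host processes receive the extra threads is an assumption inferred from the example above, and the function name `block_mapping` is hypothetical.

```python
def block_mapping(num_threads, num_hosts):
    """Assign consecutive thread ids to host processes in contiguous blocks.

    Returns a list with one entry per host process, each entry being the
    list of thread ids assigned to that host process.
    """
    if num_hosts <= 0:
        raise ValueError("num_hosts must be positive")
    if num_threads < 0:
        raise ValueError("num_threads must be non-negative")

    # Every host gets `base` threads; the first `extra` hosts get one more
    # (assumption: extras go to the lowest-numbered hosts, as in the example).
    base, extra = divmod(num_threads, num_hosts)

    mapping = []
    next_thread = 0
    for host in range(num_hosts):
        count = base + (1 if host < extra else 0)
        mapping.append(list(range(next_thread, next_thread + count)))
        next_thread += count
    return mapping
```

For 10 target processes and 4 host processes, `block_mapping(10, 4)` yields `[[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]`, which matches the assignment given in the example.
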