Threads are started using a simple, streamlined
threading scheme [Kof95], which keeps context switching fast.
A host process that needs to start p threads, each with a stack
of b bytes, grows its stack by b bytes by recursively
calling a dummy function. When the
stack has grown by the requisite amount, it calls a function that contains (a) a
call to setjmp (part of the standard C library), which saves the
current processor state in a thread-specific
data structure as the beginning of the stack
of the current thread, and (b) a call to a function that invokes the target program.
It repeats this procedure p times to initiate p threads, each
executing a copy of the target program.
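
The stack-growing step can be sketched as follows. This is a minimal illustration, not the scheme's actual code: the names grow_stack, reserve_stack, note_start and the constant CHUNK are hypothetical, and the callback only stands in for the routine that would contain the setjmp call and the call to the target program.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 256            /* assumed stack bytes pinned per dummy frame */

static size_t frames_pushed; /* dummy frames pushed by the latest reservation */
int starts_seen;             /* how many times the start routine has run */

/* Hypothetical stand-in for the routine holding setjmp and the target call. */
void note_start(void)
{
    starts_seen++;
}

/* Dummy function: each call pins at least CHUNK bytes of stack, then either
 * recurses or, once about b bytes are reserved, calls the start routine. */
static void grow_stack(size_t remaining, void (*start_thread)(void))
{
    volatile char pad[CHUNK];   /* volatile so the compiler keeps the space */

    assert(start_thread != NULL);
    pad[0] = 0;
    frames_pushed++;
    if (remaining > CHUNK)
        grow_stack(remaining - CHUNK, start_thread);
    else
        start_thread();
    pad[CHUNK - 1] = pad[0];    /* use pad after the call: prevents a tail call */
}

/* Reserve about b bytes, run start_thread at the bottom, report frames used. */
size_t reserve_stack(size_t b, void (*start_thread)(void))
{
    frames_pushed = 0;
    grow_stack(b, start_thread);
    return frames_pushed;
}
```

With CHUNK set to 256, reserve_stack(4096, note_start) pushes 16 dummy frames and then calls note_start once. In this sketch the frames unwind as soon as the start routine returns, whereas in the scheme described above the thread keeps running on the region it has reserved.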
The thread scheduler switches to any thread it chooses by calling
longjmp, which restores the processor state saved by a given
setjmp. During the simulation, whenever a thread is
ready to switch out, it saves
its current state using setjmp, so that it can later be switched back in at the
same point, and then calls the scheduler.
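
The control transfer underlying this switch can be sketched as follows. The sketch covers only the direction that ISO C defines, a longjmp from a thread back to a scheduler whose frame is still active. Switching a suspended thread back in means jumping to a frame that was abandoned by an earlier longjmp, which the scheme relies on in practice but which the C standard leaves undefined. The names run_scheduler, target_program, sched_ctx, trace and NTHREADS are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

#define NTHREADS 3           /* assumed thread count p */

static jmp_buf sched_ctx;    /* scheduler state saved by setjmp */
static int trace[NTHREADS];  /* ids of threads in the order they ran */
static int ntrace;           /* number of entries recorded in trace */

/* Stand-in for the target program: do some work, then switch out. */
static void target_program(int id)
{
    assert(id >= 0 && id < NTHREADS);
    trace[ntrace++] = id;
    longjmp(sched_ctx, id + 1);  /* restore the scheduler's state; never returns */
}

/* Run each thread once, regaining control through longjmp every time. */
int run_scheduler(void)
{
    volatile int next = 0;   /* volatile: changed between setjmp and longjmp */

    ntrace = 0;
    if (setjmp(sched_ctx) != 0) {
        /* Reached through longjmp: the running thread has switched out. */
        next++;
    }
    if (next < NTHREADS)
        target_program(next);    /* comes back only via the setjmp above */
    return ntrace;
}
```

Calling run_scheduler() runs threads 0, 1 and 2 in that order and returns 3. Each thread hands control back with a single longjmp, with no call into the operating system.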