Two important paradigms for programming multicomputers are data parallelism and task parallelism. Data parallel programs are easy to write and debug, and well suited to many scientific applications. However, multicomputers are directly programmed using task parallel programs. The initial motivation for this dissertation was to devise techniques to automatically translate data parallel programs into task parallel programs while still achieving performance comparable to what would have been achieved if the task parallel program had been hand coded. In order to estimate the improvements due to these techniques on various architectures, parallel program simulators are essential. Most parallel architecture simulators are sequential and slow. This motivated the remaining work of this dissertation: the design of a fast parallel simulator for parallel programs, which would use application characteristics to optimize its simulation protocol.