A par statement is implemented on a multicomputer by distributing arrays and replicating scalars over the processors, and partitioning the range of the par statement over the processors. We assume a simple block mapping of arrays. Partitioning the range is performed according to the owner computes rule whenever possible. This is discussed further in Section 2.2.4. Henceforth, we use the term ``thread" in the context of a par statement to mean a single element in its range. A naive implementation of a par statement results in a global synchronization by all threads after the evaluation of each expression. This is expensive and is not really necessary. The only synchronization necessary is that which enforces a data dependence across threads of the par statement. Hence a data dependence graph for a par statement encapsulates the information needed to deduce the minimal synchronization needed to implement the statement. A synchronization graph is essentially such a data dependence graph. As shown subsequently, minor differences arise between the two mainly because data dependence graphs are used for reordering statements, whereas synchronization graphs are used only for synchronization.
This section outlines the use of
standard dependence analysis tools
[AK87,Fer87,BEN93,Wol95]
to construct synchronization graphs.
The example shown in Figure 2.4 shows the three well
known kinds of data dependences in the context of par statements
(the notation
denotes the instance of assignment
for an element i of the range I):
Figure 2.4: UC code showing dependences
is used by the instance
hence
s2 is flow dependent on s1, or
.
, and another value for this variable is
later computed by instance
.
s4 is anti-dependent on s3, or
.
and the
instance
both store a value in b[
],
hence s3
is output dependent on s2 or
.
In general, techniques for data dependence graph construction for sequential do loops can be easily adapted to predict the dependences in a par statement. Array dependence analysis, which is what is used to predict dependences, essentially predicts conflicts between two data references over a given iteration space. For example, consider the following statements:
... = ... a[f(i)] ...; /* s1 */The two references to array a area[g(i)] = ...; /* s2 */
(a use reference) and
(a define reference). Assume that
. This means
that each array variable gets defined in the iteration after which
it is used. If the iteration space is mapped sequentially to a processor
as in a do loop,
this translates to an anti dependence. If
, then on
sequential mapping of the iteration space, the dependence is a flow
dependence. However, if the iteration space is mapped in parallel
as in the range of a par statement, both the preceding cases
translate to anti dependences.
Consequently, given the semantics of a par statement (and hence
the HPF FORALL),
the following simple rules determine the type of data dependence
between two references:
In this way, a data dependence graph can be constructed for a par statement by constructing a dependence graph for a do loop with the same index set and the same body (using existing tools), and renaming the dependences according the rules listed above. In concurrentizing a do-loop, we distinguish between inter-iteration and intra-iteration dependences. In the context of par statements, we similarly distinguish between inter-thread and intra-thread dependences. When the data dependence graph for a par statement is used as a synchronization graph, intra thread dependences may be deleted from the graph. This is because the synchronization implied by an intra thread dependence is automatically enforced.
The synchronization graph for the par statement shown in the
beginning of the section appears in
figure 2.5. Each dependence arc in the
synchronization graph is labeled with the type of dependence
(flow, anti or output),
and the references that cause the dependence.
Hence an arc labeled
means that
there is a flow dependence and that thread i writes
in
the source statement and reads
in the sink statement.
Figure 2.5: Synchronization Graphs