next up previous
Next: Detection of minimal Up: Barrier Minimization Previous: Communication model

Implementation of dependence arcs

  Flow dependence arcs of the synchronization graph represent all and only the set of remote data accesses (reads or writes) of a thread. Hence, a simple implementation of the synchronization graph involves embedding a remote data read or remote data write call at the sink statement of each flow dependence arc. Figure 2.7 illustrates this idea.

 
Figure 2.7:   Synchronization Graph Implementation using Barriers

The synchronization graph for the code in Figure 2.7(a) is shown in Figure 2.7(b). There is one flow dependence due to the reference in statement s2. The code shown in Figure 2.7(c) shows the SPMD program for each thread. Each thread sends a request to its right neighbor for its value of a. While the thread is waiting for a reply, it must reply to incoming requests, otherwise deadlock may occur. Hence in the while loop it accepts messages of type rply and rqst. Not all requests may get serviced at this point in the code, since a thread might receive its reply before getting a request. Ideally, an interrupt driven service mechanism (e.g. using active messages [Eic92]) is required, to provide immediate response to incoming requests, irrespective of what point in the program a thread is at. Such a mechanism can be approximated by polling for arrived requests at regular intervals in the program. The function shown in Figure 2.8 can be used for this purpose.

 
Figure 2.8:   Polling Function

Clearly, if this function is inserted at regular intervals in the SPMD code shown in Figure 2.7(c), all requests would get serviced. However, if requests get serviced before statement s1 or after statement s3 the replies would be incorrect, since a would not have the desired value. Requests arriving before s1 are called early requests and those after s3, late requests. In order to prevent early requests a barrier must be placed at some point after statement s1. To prevent late requests a barrier must be placed before statement s3. Hence the barriers in Figure 2.7(c).

In essence, the barriers enforce the dependences shown in the synchronization graph. The first barrier enforces the flow dependence and the second barrier the anti dependence. Notice too that the barriers must be responsive, otherwise deadlock may occur. A responsive barrier is one in which a thread can respond to requests while waiting at the barriergif. Figure 2.7(d) shows an implementation of a responsive barrier. The underlying barrier algorithm is that of a simple centralized barrier, in which every thread sends a barrier message to a central thread, which waits until all barrier messages have arrived, and then sends a barrier release message to each waiting thread. This algorithm is shown only for simplicity in presentation; in practice more balanced algorithms are used.

Figure 2.9 shows another example of synchronization graph implementation using barriers. The flow dependence arc from statement s2 to itself in Figure 2.9(b) leads to write requests rather than read requests. Only in programs in which the owner computes rule cannot be applied, will such flow dependence arcs occur. In the corresponding SPMD program in Figure 2.9(c), thread i, after calculating the LHS of statement s2 and assigning to a temporary variable, sends a write request to thread . While waiting for the acknowledgement, it services incoming requests. Only one barrier is required to enforce the two flow dependences of the synchronization graph, since the dependence arcs overlap.

 
Figure 2.9:  Synchronization Graph with Write Flow Dependences



next up previous
Next: Detection of minimal Up: Barrier Minimization Previous: Communication model



Andy Kahn
Wed Jun 25 20:28:02 PDT 1997