The main objective of this study is to devise techniques that will alleviate the problem of coherence-induced misses. Our approach is to find the communication pattern of the application at the compiler level and add all the necessary hardware support to exploit it.
First, we propose
a set of primitives targeted to hide
the latency of coherence misses. To hide a coherence miss, one of
the two processors involved, the consumer or the producer, must initiate a
non-blocking memory transfer. Our set of proposed instructions includes
both consumer-initiated (prefetching) and producer-initiated primitives
(write through, write update, broadcast and write forwarding, shown in the picture).
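As a rough illustration of the two kinds of primitive, the following C sketch
contrasts a consumer-initiated prefetch with a producer-initiated write
forward. It is not the instruction set proposed here: `__builtin_prefetch` is
the existing GCC/Clang builtin, while `write_forward`, `LINE_DOUBLES` and
`PREFETCH_AHEAD` are hypothetical names introduced only for this example, and
`write_forward` is a software stub that merely counts how often it is called.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_DOUBLES   8   /* assumption: 64-byte cache line, 8-byte double */
#define PREFETCH_AHEAD 16  /* assumption: prefetch two lines ahead */

/* Hypothetical producer-initiated primitive (NOT a real intrinsic).
   On hardware with such support it would push the cache line holding
   `addr` to processor `consumer` without blocking the producer.
   Here it is a stub that only counts the requests issued. */
static size_t forwards_issued = 0;

static void write_forward(const void *addr, int consumer)
{
    (void)addr;
    (void)consumer;
    forwards_issued++;
}

/* Producer side: fill the buffer and forward each cache line as soon
   as its last element has been written (or the buffer ends). */
void produce(double *buf, size_t n, int consumer)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        buf[i] = (double)i;
        if ((i + 1) % LINE_DOUBLES == 0 || i + 1 == n)
            write_forward(&buf[i], consumer);
    }
}

/* Consumer side: once per cache line, request a line that will be
   needed later, so the transfer overlaps with the current work. */
double consume(const double *buf, size_t n)
{
    assert(buf != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i % LINE_DOUBLES == 0 && i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&buf[i + PREFETCH_AHEAD], 0, 3);
        sum += buf[i];
    }
    return sum;
}
```

In both cases the transfer is started before the consumer actually needs the
data, which is what allows the miss latency to be overlapped with computation.
The difference is which side must know the communication pattern: the consumer
for prefetching, the producer for forwarding.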
Second, we propose
a compiler algorithm
targeted to support all these primitives. Since coherence misses will
probably limit the performance of COMA architectures,
we chose COMA as our baseline architecture. Note, however, that the
ideas presented here are also applicable to non-COMA architectures.