Evaluation of Mechanisms for Fine-Grained Parallel Programs in the J-Machine and the CM-5
Journal contribution posted on 2005-11-01, 00:00, authored by Ellen Spertus, Seth C. Goldstein, Klaus Erik Schauser, Thorsten von Eicken, David E. Culler, William J. Dally
This paper uses an abstract machine approach to compare the mechanisms of two parallel machines: the J-Machine and the CM-5. High-level parallel programs are translated by a single optimizing compiler to a fine-grained abstract parallel machine, TAM. A final compilation step is unique to each machine and optimizes for specifics of the architecture. By determining the cost of the primitives and weighting them by their dynamic frequency in parallel programs, we quantify the effectiveness of the following mechanisms individually and in combination. Efficient processor/network coupling proves valuable. Message dispatch is found to be less valuable without atomic operations that allow the scheduling levels to cooperate. Multiple hardware contexts are of small value when the contexts cooperate and the compiler can partition the register set. Tagged memory provides little gain. Finally, the performance of the overall system is strongly influenced by the performance of the memory system and the frequency of control operations.
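The weighting step described in the abstract (per-primitive cost multiplied by dynamic frequency) can be sketched as a simple weighted sum. This is only an illustration of the arithmetic, not the authors' tooling: the function name `weighted_cost`, the primitive names, and every number below are hypothetical placeholders, not measurements from the paper.

```python
def weighted_cost(cycles_per_primitive, dynamic_counts):
    """Return total cycles: sum of (cost of a primitive x how often it executes)."""
    missing = set(dynamic_counts) - set(cycles_per_primitive)
    if missing:
        raise KeyError(f"no cost given for primitives: {sorted(missing)}")
    return sum(cycles_per_primitive[p] * n for p, n in dynamic_counts.items())


# Hypothetical dynamic counts of abstract-machine primitives in one program run.
counts = {"message_send": 100, "thread_fork": 200, "memory_access": 50}

# Hypothetical per-primitive costs (in cycles) on two target machines.
machine_a = {"message_send": 10, "thread_fork": 2, "memory_access": 4}
machine_b = {"message_send": 30, "thread_fork": 1, "memory_access": 4}

total_a = weighted_cost(machine_a, counts)
total_b = weighted_cost(machine_b, counts)
print(total_a, total_b)
```

Under this kind of model, swapping in the cost of a single primitive from one machine into the other's cost table, and recomputing the total, gives a rough estimate of how much one mechanism contributes on its own.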