This paper advocates a architecture-independent, hierarchical approach to algorithm design and analysis for distributed-memory architectures, in contrast to the current trend of tailoring algorithms towards specific architectures. We show that, rather surprisingly, this new approach can achieve uniformity without sacrificing optimality. In our particular framework there are three levels of algorithm design: design of a network-independent algorithm in a network-independent programming environment, design of virtual architectures for the algorithm, and design of emulations of the virtual architectures on physical architectures. We propose and substantiate through a complete complexity analysis of the example of ordinary matrix multiplication, the following thesis: architecture-independent optimality can lead to portable optimality. Namely, a single network-independent algorithm, when optimized network-independently, with the support of properly chosen virtual architectures, can be implemented on a wide spectrum of networks to achieve optimality on each of them with respect to both computation and communication. Besides its implications to the methodology of parallel algorithm design, our theory also suggests new questions for theoretical research in parallel computation on interconnection networks.
If you have any questions or comments regarding this page please send mail to email@example.com.