Performance Monitoring in Multi-transputer Networks

ID
TR-90-32
Authors
Jie Cheng Jiang
Publishing date
October 1990
Abstract

Parallel architectures, like the transputer-based multicomputer network, offer potentially enormous computational power at modest cost. However, writing programs on a multicomputer to exploit parallelism is very difficult due to the lack of tools to help users understand the run-time behavior of the parallel system and detect performance bottlenecks in their programs. This thesis examines the performance characteristics of parallel programs in a multicomputer network, and describes the design and implementation of a real-time performance monitoring tool on transputers.

We started with a simple graph theoretical model in which a parallel computation is represented as a weighted directed acyclic graph, called the execution graph. This model allows us to easily derive a variety of performance metrics for parallel programs, such as program execution time, speedup, efficiency, etc. From this model, we also developed a new analysis method called weighted critical path analysis (WCPA), which incorporates the notion of parallelism into critical path analysis and helps users identify the program activities which have the most impact on performance. Based on these ideas, the design of a real-time performance monitoring tool was proposed and implemented on a 74-node transputer-based multicomputer. Major problems in parallel and distributed monitoring addressed in this thesis are: global state and global clock, minimization of monitoring overhead, and the presentation of meaningful data. New techniques and novel approaches to these problems have been investigated and implemented in our tool. Lastly, benchmarks are used to measure the accuracy and the overhead of our monitoring tool. We also demonstrate how this tool was used to improve the performance of an actual parallel application by more than 50%.