The performance of parallel matrix algorithms on a broadcast‐based architecture
Journal article   Peer reviewed

Constantine Katsinis, Diana Hecht, Ming Zhu and Harsha Narravula
Concurrency and computation, v 18(3)
Mar 2006

Abstract

Keywords: multiprocessors; broadcast architectures; numerical algorithms
Due to advances in fiber‐optics and very large scale integration (VLSI) technology, interconnection networks which allow multiple simultaneous broadcasts are becoming feasible. This paper summarizes one such multiprocessor architecture called the Simultaneous Optical Multiprocessor Exchange Bus (SOME‐Bus). It also presents enhancements to the network interface and the cache and directory controllers which support cache block combining, capture and prefetch, and allow complete overlap of processing time with the communication time due to compulsory misses. The paper uses two fundamental matrix algorithms to characterize the impact of each enhancement on performance. Cache miss analysis and results from the execution of these programs on a SOME‐Bus simulator show that block capture and prefetch combined with an effective block replacement policy succeed in significantly reducing the miss rate due to compulsory misses as the cache size increases, while a similar increase of cache size in traditional architectures leaves the miss rate due to compulsory misses unaffected. Copyright © 2005 John Wiley & Sons, Ltd.
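
The claim about traditional architectures can be illustrated with a small sketch. It is not the SOME-Bus simulator and does not model block capture or prefetch; it assumes a single processor, a fully associative LRU cache, and a naive matrix multiplication as the workload (the abstract does not name the two algorithms, so this choice is an assumption). The names `matmul_trace` and `count_misses` are hypothetical. A miss counts as compulsory when the block has never been referenced before.

```python
from collections import OrderedDict


def matmul_trace(n, block_words=4):
    """Yield block addresses touched by a naive n x n multiply C = A * B.

    Assumed layout: A, B and C stored contiguously in row-major order,
    with block_words matrix elements per cache block.
    """
    base_a, base_b, base_c = 0, n * n, 2 * n * n
    for i in range(n):
        for j in range(n):
            for k in range(n):
                yield (base_a + i * n + k) // block_words
                yield (base_b + k * n + j) // block_words
            yield (base_c + i * n + j) // block_words


def count_misses(trace, cache_blocks):
    """Return (compulsory, other) miss counts for a fully associative LRU cache."""
    cache = OrderedDict()  # block address -> True, ordered oldest to newest
    seen = set()           # every block referenced at least once
    compulsory = other = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
            continue
        if block in seen:
            other += 1                # block was evicted earlier
        else:
            compulsory += 1           # first reference to this block
            seen.add(block)
        cache[block] = True
        if len(cache) > cache_blocks:
            cache.popitem(last=False)  # evict the least recently used block
    return compulsory, other


if __name__ == "__main__":
    for size in (4, 16, 64):
        print(size, count_misses(matmul_trace(8), size))
```

In this model the compulsory count equals the number of distinct blocks in the three matrices for every cache size; only the other misses shrink as the cache grows. That is the baseline behavior the abstract contrasts with block capture and prefetch.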

Details

Web of Science research areas
Computer Science, Software Engineering
Computer Science, Theory & Methods