Computing Reviews, the leading online review service for computing literature.

The effect of sharing on the cache and bus performance of parallel programs
Eggers S., Katz R. ACM SIGARCH Computer Architecture News 17(2): 257-270, 1989. Type: Article

Date Reviewed: Jul 1 1990

How does the sharing that results from writing an application program as a set of parallel processes affect cache performance? The authors investigate this question for single-bus, shared-memory multiprocessors. They use trace-driven simulation to examine the performance of four applications written explicitly for parallel execution. The parallel programming model is single-program-multiple-data: N processes each execute identical instructions on their own part of the shared data. This corresponds to many real-world applications written for a small number of processors, with each process dedicated to its own processor. The applications are actual CAD programs written for N = 5, 11, 12, and 12 processors. The simulated hardware is RISC-like.

The unsurprising answer is an unequivocal “it depends”: specifically, on the kind of sharing the application does. Applications whose processes exhibit locality (multiple consecutive writes to shared data within a cache block) behave much like nonparallel programs. Applications with fine-grain sharing (where multiple processes contend for shared data within cache blocks) do not. In either case, cache miss ratios and bus utilization are higher than in nonparallel programs because of the extra misses caused by the invalidations needed to maintain cache consistency. For programs with locality, this shows up as a smaller improvement in the miss ratio as cache block size or total cache size increases. For programs with fine-grain sharing, the extra misses can be enough to increase the miss ratio at large block or cache sizes. The results for bus utilization are similar.

The paper is competently organized and presented. The usual caveats apply: the model and applications, while representative, are limited, and the traces include only application references. It would have been interesting to see how the metrics varied with the number of processes.

The results will be of interest to cache designers of shared-memory multiprocessors and to programmers interested enough in performance to reorganize applications to take cache parameters into account.
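
The contrast between the two sharing patterns can be illustrated with a toy trace-driven simulation. The sketch below is not the authors' simulator and does not use their CAD traces. It assumes a write-invalidate protocol, infinite per-processor caches (so every non-cold miss is an invalidation miss), and write-only references. The function names and the two synthetic traces are hypothetical, chosen only to mimic "locality" (each process makes a run of consecutive writes before another process touches the data) and "fine-grain sharing" (neighboring words are written by different processes).

```python
def miss_ratio(trace, n_procs, block_size):
    """Replay (proc, word_address) writes under a write-invalidate policy.

    Assumption: caches are infinite, so misses are either cold misses
    or misses caused by another processor's invalidation.
    """
    caches = [set() for _ in range(n_procs)]  # valid blocks per processor
    misses = 0
    for proc, addr in trace:
        block = addr // block_size
        if block not in caches[proc]:
            misses += 1
            caches[proc].add(block)
        # A write invalidates every other processor's copy of the block.
        for other in range(n_procs):
            if other != proc:
                caches[other].discard(block)
    return misses / len(trace)


def locality_trace(n_procs, n_words, rounds):
    """Each process in turn writes the whole shared array sequentially."""
    return [(p, w)
            for _ in range(rounds)
            for p in range(n_procs)
            for w in range(n_words)]


def fine_grain_trace(n_procs, n_words, rounds):
    """Word w is always written by process w % n_procs, so neighboring
    words in one block belong to different processes."""
    return [(w % n_procs, w)
            for _ in range(rounds)
            for w in range(n_words)]


if __name__ == "__main__":
    n_procs, n_words, rounds = 4, 64, 4
    for block_size in (1, 2, 4, 8, 16):
        loc = miss_ratio(locality_trace(n_procs, n_words, rounds),
                         n_procs, block_size)
        fine = miss_ratio(fine_grain_trace(n_procs, n_words, rounds),
                          n_procs, block_size)
        print(f"block={block_size:2d}  locality={loc:.3f}  fine-grain={fine:.3f}")
```

In this toy setup the locality trace's miss ratio falls as the block grows, much as it would for a sequential program, while the fine-grain trace's miss ratio rises once a block spans words owned by different processes. Real traces mix both behaviors and add capacity and conflict misses, so the numbers here are illustrative only.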

Reviewer: Andrew R. Huber	Review #: CR114129

Performance of Systems (C.4)

Cache Memories (B.3.2 ...)

Simulation (B.3.3 ...)

Single-Instruction-Stream, Multiple-Data-Stream Processors (SIMD) (C.1.2 ...)

Design Styles (B.3.2)

Performance Analysis And Design Aids (B.3.3)
