Computing Reviews
The effect of sharing on the cache and bus performance of parallel programs
Eggers S., Katz R. ACM SIGARCH Computer Architecture News 17(2): 257-270, 1989. Type: Article
Date Reviewed: Jul 1 1990

How does the sharing resulting from writing an application program as a set of parallel processes affect cache performance? The authors investigate this question for shared-memory multiprocessors with a single bus. They use trace-driven simulation to examine the performance of four applications written explicitly for parallel execution. The parallel programming model used is single-program-multiple-data: N processes each execute identical instructions on their own part of the shared data. This corresponds to many real-world applications written for some small number of processors, with each process dedicated to its own processor. The applications are actual CAD programs written for N = 5, 11, 12, and 12 processors. The hardware simulated is RISC-like.

The unsurprising answer is an unequivocal “it depends”: it depends on the sharing the application does. Applications whose processes exhibit locality (multiple consecutive writes to shared data within a cache block) behave much like nonparallel programs. Applications with fine-grain sharing (where multiple processes contend for shared data within cache blocks) do not. In either case, cache miss ratios and bus utilization are higher than in nonparallel programs because of the extra misses caused by the cache invalidations needed to maintain cache consistency. For programs with locality, this shows up as a smaller improvement in the miss ratio as cache block size or total cache size increases. For programs with fine-grain sharing, the extra misses can be enough to increase the miss ratio at large block or cache sizes. The results for bus utilization are similar.
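The contrast between the two sharing patterns can be illustrated with a toy trace-driven simulation. The sketch below is not the authors' simulator: it assumes infinite per-processor caches and a simple write-invalidate protocol, and the function names (miss_ratio, locality_trace, fine_grain_trace) and both synthetic traces are hypothetical, invented for illustration. Because the caches never evict, it shows only the direction of the block-size effect, not the magnitudes reported in the paper.

```python
def miss_ratio(trace, block_size, n_procs):
    """Replay (proc, op, addr) references through infinite write-invalidate caches."""
    caches = [set() for _ in range(n_procs)]  # valid block numbers held by each processor
    misses = 0
    for proc, op, addr in trace:
        block = addr // block_size
        if block not in caches[proc]:
            misses += 1                # cold miss or invalidation miss
            caches[proc].add(block)
        if op == "w":
            # A write invalidates every other processor's copy of the block.
            for other in range(n_procs):
                if other != proc:
                    caches[other].discard(block)
    return misses / len(trace)


def locality_trace(n_procs, phases, run_len):
    """Assumed pattern: each process writes a long consecutive run in one shared
    region per phase, and the regions rotate among processes between phases."""
    trace = []
    for phase in range(phases):
        for proc in range(n_procs):
            base = ((proc + phase) % n_procs) * run_len
            trace.extend((proc, "w", base + i) for i in range(run_len))
    return trace


def fine_grain_trace(n_procs, rounds):
    """Assumed pattern: processes take turns writing adjacent words, so they
    contend for the same block once a block spans more than one word."""
    return [(proc, "w", proc) for _ in range(rounds) for proc in range(n_procs)]


loc = locality_trace(n_procs=4, phases=8, run_len=64)
fine = fine_grain_trace(n_procs=4, rounds=64)
for size in (1, 4, 16):  # block size in words
    print(size, miss_ratio(loc, size, 4), miss_ratio(fine, size, 4))
```

In this sketch the miss ratio of the locality trace falls as the block size grows, while the miss ratio of the fine-grain trace rises once a block covers words written by more than one process.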

The paper is competently organized and presented. The usual caveats apply: the model and applications used, while representative, are limited, and the traces include only application references. It would have been interesting to see how the metrics varied with the number of processes. The results will be of interest to designers of caches for shared-memory multiprocessors and to programmers interested enough in performance to reorganize applications to take cache parameters into account.

Reviewer: Andrew R. Huber
Review #: CR114129
Performance of Systems (C.4)
Cache Memories (B.3.2 ...)
Simulation (B.3.3 ...)
Single-Instruction-Stream, Multiple-Data-Stream Processors (SIMD) (C.1.2 ...)
Design Styles (B.3.2)
Performance Analysis And Design Aids (B.3.3)
