Computing Reviews
Simulation as a tool for optimizing memory accesses on NUMA machines
Tao J., Schulz M., Karl W. Performance Evaluation 60(1-4): 31-50, 2005. Type: Article
Date Reviewed: Oct 19 2005

Non-uniform memory access (NUMA) machines pose difficult memory performance challenges. Simulation provides one means of measuring and optimizing the performance of NUMA architectures. This work describes the use of the SIMT simulation tool to reduce cache misses and cache invalidations, and to improve page placement and movement strategies.

Based on the Augmint multiprocessor simulator toolkit for the Intel x86 architecture, SIMT is designed for measuring memory performance, and models caches, distributed shared memory, and data transfer between processors. By presenting detailed data about cache misses, cache invalidations, and remote and local memory accesses, SIMT provides information that can be used to improve performance by changing program memory organization and access patterns.

Parameters that can be specified and varied with SIMT include the number, organization, and size of the caches; the cache coherency protocol; and the local and remote memory access latencies. Five cache coherency protocols, seven data allocation policies, and three data migration policies can be modeled.
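To make the kind of parameter sweep described above concrete, the following is a minimal sketch of a trace-driven model of one direct-mapped cache in front of local and remote memory. It is not SIMT's interface: the names `CacheConfig` and `simulate`, the default sizes, and the latency values are all assumptions chosen for illustration, and coherency protocols are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class CacheConfig:
    """Hypothetical parameter set; field names are illustrative, not SIMT's."""
    num_lines: int = 4          # number of lines in the direct-mapped cache
    line_size: int = 64         # bytes per cache line
    hit_latency: int = 1        # cycles charged for a cache hit (assumed)
    local_latency: int = 100    # cycles for a miss served locally (assumed)
    remote_latency: int = 400   # cycles for a miss served remotely (assumed)


def simulate(trace, cfg, is_remote):
    """Replay a list of byte addresses through one direct-mapped cache.

    is_remote(addr) reports whether the address lives on another node.
    Returns (hits, misses, total_cycles).
    """
    tags = [None] * cfg.num_lines   # block number currently held by each line
    hits = misses = cycles = 0
    for addr in trace:
        block = addr // cfg.line_size
        index = block % cfg.num_lines
        if tags[index] == block:
            hits += 1
            cycles += cfg.hit_latency
        else:
            misses += 1
            tags[index] = block     # fill the line, evicting the old block
            if is_remote(addr):
                cycles += cfg.remote_latency
            else:
                cycles += cfg.local_latency
    return hits, misses, cycles


if __name__ == "__main__":
    trace = [0, 8, 64, 0, 256, 0]
    # Assumption for the demo: addresses at or above 256 live on a remote node.
    print(simulate(trace, CacheConfig(), lambda addr: addr >= 256))
```

Rerunning the same trace with a different `num_lines`, `line_size`, or latency pair shows how miss counts and total cycles respond to each parameter, which is the style of experiment the review attributes to the tool.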

The use of SIMT is illustrated using three programs from the standard SPLASH shared-memory benchmark suite. These examples demonstrate how to reduce cache misses, how to find an optimal cache coherency policy, and how to improve initial data placement and dynamic data migration.
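The effect of initial data placement can be sketched with a small model that counts remote accesses under two widely used NUMA policies, first-touch and round-robin. The review does not name the seven allocation policies SIMT supports, so treating these two as representative is an assumption, and the function `count_remote` and its input format are hypothetical.

```python
def count_remote(accesses, num_nodes, policy):
    """Count remote accesses for a given page placement policy.

    accesses: list of (node, page) pairs in program order.
    policy:   "first_touch" places a page on the node that touches it first;
              "round_robin" places page p on node p % num_nodes.
    Both policy names are illustrative and not taken from SIMT.
    """
    home = {}       # page -> node that holds it
    remote = 0
    for node, page in accesses:
        if page not in home:
            if policy == "first_touch":
                home[page] = node
            elif policy == "round_robin":
                home[page] = page % num_nodes
            else:
                raise ValueError(f"unknown policy: {policy}")
        if home[page] != node:
            remote += 1
    return remote


if __name__ == "__main__":
    # Two nodes; pages 1 and 2 are private to one node, page 3 is shared.
    accesses = [(0, 1), (0, 1), (1, 2), (1, 2), (0, 3), (1, 3)]
    for policy in ("first_touch", "round_robin"):
        print(policy, count_remote(accesses, num_nodes=2, policy=policy))
```

In this toy trace, first-touch keeps the private pages local and only the shared page generates remote traffic, while round-robin scatters pages without regard to who uses them. A migration policy would additionally move a page when its access pattern changes, which this sketch does not model.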

The authors provide no verification of their simulation results with actual measurements from existing systems, but do reference other work that they claim shows the accuracy of the simulator. The work is well organized and clearly presented, and should be of interest to those who work with NUMA machines and architectures, especially those interested in performance.

Reviewer: Andrew R. Huber
Review #: CR131885 (0604-0399)
Simulation (D.4.8 ... )
Main Memory (D.4.2 ... )
Modeling And Prediction (D.4.8 ... )
Performance (D.4.8)
Storage Management (D.4.2)
Other reviews under "Simulation": Date
Use of application characteristics and limited preemption for run-to-completion parallel processor scheduling policies
Chiang S., Mansharamani R., Vernon M. ACM SIGMETRICS Performance Evaluation Review 22(1): 33-44, 1994. Type: Article
Dec 1 1995
Using the SimOS machine simulator to study complex computer systems
Rosenblum M., Bugnion E., Devine S., Herrod S. ACM Transactions on Modeling and Computer Simulation 7(1): 78-103, 1997. Type: Article
Apr 1 1998
A Hierarchical Task Queue Organization for Shared-Memory Multiprocessor Systems
Dandamudi S., Cheng P. IEEE Transactions on Parallel and Distributed Systems 6(1): 1-16, 1995. Type: Article
May 1 1996

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®