Computing Reviews

GPU acceleration of data assembly in finite element methods and its energy implications
Hu X., Hsieh G., Tang L., Hammond S., Chen D., Niemier M., Barrett R. ASAP 2013 (Proceedings of the 24th IEEE International Conference on Application-Specific Systems, Architectures and Processors, Washington, DC, Jun 5-7, 2013), 321-328, 2013. Type: Proceedings
Date Reviewed: 12/05/13

Finite element methods are a standard tool for the numerical simulation of many phenomena in physics, engineering, and other areas. As the processes to be simulated become more and more complex, and the accuracy requirements become higher, the computational cost for such simulations increases as well. This paper is devoted to a study of certain measures counteracting the adverse effects of the increasing complexity. The two computationally most expensive parts of a finite element algorithm are the data assembly phase, where a high-dimensional system of equations is constructed, and the solution phase, where this system is actually solved. Since the latter part is quite well understood with respect to the questions of interest here, the authors concentrate on the former.
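To make the data assembly phase concrete, here is a minimal sketch of it for a one-dimensional Poisson problem with linear elements on a uniform mesh. It is an illustration only and not the code studied in the paper; the function name `assemble_stiffness_1d`, the dense list-of-lists storage, and the model problem are all assumptions chosen for brevity.

```python
def assemble_stiffness_1d(num_elements, length=1.0):
    """Assemble the global stiffness matrix for -u'' = f on [0, length]
    with linear elements on a uniform mesh (no boundary conditions applied).

    Hypothetical helper for illustration; not taken from the reviewed paper.
    """
    num_nodes = num_elements + 1
    h = length / num_elements

    # Local 2x2 stiffness matrix of a single linear element of size h.
    k_local = [[1.0 / h, -1.0 / h],
               [-1.0 / h, 1.0 / h]]

    # Dense global matrix for clarity; production codes use sparse storage.
    K = [[0.0] * num_nodes for _ in range(num_nodes)]

    for e in range(num_elements):
        nodes = (e, e + 1)  # global node indices of element e
        for a in range(2):
            for b in range(2):
                # Scatter-add of the local contribution. Neighbouring
                # elements update the same global entry, so a parallel
                # assembly has to coordinate these writes.
                K[nodes[a]][nodes[b]] += k_local[a][b]
    return K


K = assemble_stiffness_1d(4)
for row in K:
    print(row)
```

The loop over elements is the part that parallel implementations distribute across CPU cores or GPU threads. The scatter-add into shared entries is one reason why different parallelization strategies can behave quite differently.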

A standard approach to reducing the runtime of such programs despite the increasing complexity is massive parallelism. In recent years, this has included pure central processing unit (CPU)-based as well as hybrid CPU/graphics processing unit (GPU)-based concepts. Hu et al. address the implications of both techniques, discussing the resulting speedup, which is a classical question, and the currently highly important aspect of energy requirements.

Contrary to the common belief that an algorithm that runs fast and scales well must automatically be energy efficient too, the key finding of this paper is that there need not be such a direct connection between the two design goals. Rather, by comparing numerous different approaches, the authors demonstrate that certain performance-tuning actions can be very successful with respect to runtime but not helpful at all with respect to energy, while other ideas can improve the code in both respects simultaneously. The key problem is that the quality of a tuning action in either respect can strongly depend on the hardware platform in use. Thus, it appears to be impossible to provide general advice.
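A simple relation shows why runtime and energy can diverge. The notation is an assumption made here for illustration and does not come from the paper: $T_i$ is the runtime of code variant $i$, $\bar{P}_i$ its average power draw, and $E_i$ the energy it consumes.

```latex
\begin{align*}
  E_i &= \bar{P}_i \, T_i, \qquad i = 1, 2, \\
  S &= \frac{T_1}{T_2} \quad \text{(speedup of variant 2 over variant 1)}, \\
  \frac{E_2}{E_1} &= \frac{\bar{P}_2 \, T_2}{\bar{P}_1 \, T_1}
                   = \frac{1}{S} \cdot \frac{\bar{P}_2}{\bar{P}_1}.
\end{align*}
```

Variant 2 therefore saves energy only if $\bar{P}_2 / \bar{P}_1 < S$. A tuning action that speeds up the code but raises the average power draw by at least the same factor yields no energy benefit, and the power ratio generally depends on the hardware platform.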

Reviewer:  Kai Diethelm Review #: CR141785 (1402-0146)
