Computing Reviews
Algorithm 980: sparse QR factorization on the GPU
Yeralan S., Davis T., Sid-Lakhdar W., Ranka S. ACM Transactions on Mathematical Software 44(2): 1-29, 2017. Type: Article
Date Reviewed: Mar 14 2018

Many large-scale scientific and engineering problems lead, after discretization, to huge systems of linear algebraic equations and/or linear least squares problems with many hundreds of millions of equations. Solving them requires properly exploiting the available advanced supercomputer systems “with multiple general-purpose cores in the [central processing unit (CPU)], coupled with one or more general-purpose graphics processing units (GPGPUs), each [of them] with hundreds or thousands of simple [but] fast computational cores.”

This is generally not an easy task, and the authors demonstrate how it can be accomplished when the selected algorithm is the well-known QR factorization. First, they explain in detail how multifrontal techniques are applied during the factorization and the subsequent solution phase. Then, they discuss the problem of achieving highly parallel computations on GPUs. Dense frontal matrices arise during sparse multifrontal QR factorization and have to be treated efficiently; the algorithm used to handle these dense blocks is also thoroughly explained. Many figures, code fragments, and explanations of the loops involved support the presentation. The efficiency of the algorithm is demonstrated by many numerical tests at the end of the paper, and plans for future research are briefly outlined.
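For readers unfamiliar with how a QR factorization is used to solve a least squares problem, the following is a minimal sketch on a small dense matrix. It is not the authors' algorithm: it is sequential, runs on the CPU, ignores sparsity, and uses plain Householder reflections as an assumed, standard way of computing QR. The function name `householder_qr_solve` is hypothetical and chosen only for this illustration.

```python
import math


def householder_qr_solve(A, b):
    """Solve min ||A x - b||_2 for a dense m x n matrix A (m >= n, full column rank).

    Illustrative only: A is reduced to upper triangular R by Householder
    reflections, the same reflections are applied to b, and R x = (Q^T b)[:n]
    is then solved by back substitution.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]  # working copy, overwritten by R
    y = b[:]                   # working copy, overwritten by Q^T b

    for k in range(n):
        # Norm of the part of column k on and below the diagonal.
        norm_x = math.sqrt(sum(R[i][k] ** 2 for i in range(k, m)))
        if norm_x == 0.0:
            raise ValueError("matrix is rank deficient")

        # Householder vector v; the sign choice avoids cancellation.
        alpha = -math.copysign(norm_x, R[k][k])
        v = [R[i][k] for i in range(k, m)]
        v[0] -= alpha
        vtv = sum(vi * vi for vi in v)

        # Apply H = I - 2 v v^T / (v^T v) to the remaining columns of R.
        for j in range(k, n):
            s = 2.0 * sum(v[i - k] * R[i][j] for i in range(k, m)) / vtv
            for i in range(k, m):
                R[i][j] -= s * v[i - k]

        # Apply the same reflection to the right-hand side.
        s = 2.0 * sum(v[i - k] * y[i] for i in range(k, m)) / vtv
        for i in range(k, m):
            y[i] -= s * v[i - k]

    # Back substitution on the leading n x n upper triangular block.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / R[i][i]
    return x


# Usage: fit y = c0 + c1 * t to four points that lie on a straight line.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
coeffs = householder_qr_solve(A, b)
```

In a multifrontal method, dense operations of this kind are applied to many small frontal matrices rather than to the whole sparse matrix, which is what makes the dense kernels worth offloading to a GPU.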

The presented algorithm might be useful for many researchers working with very large scientific and engineering models.

Reviewer:  Z. Zlatev Review #: CR145912 (1806-0308)
Parallel Processors (C.1.2 ... )
Graphics Processors (I.3.1 ... )
Parallel Algorithms (G.1.0 ... )
Parallel Programming (D.1.3 ... )
Parallel Architectures (C.1.4 )
 
