Computing Reviews, the leading online review service for computing literature.

Algorithm 980: sparse QR factorization on the GPU
Yeralan S., Davis T., Sid-Lakhdar W., Ranka S. ACM Transactions on Mathematical Software 44(2): 1-29, 2017. Type: Article

Date Reviewed: Mar 14 2018

Many large-scale scientific and engineering problems lead, after discretization, to huge systems of linear algebraic equations and/or linear least squares problems, sometimes containing many hundreds of millions of equations. Solving them requires proper exploitation of advanced supercomputer systems “with multiple general-purpose cores in the [central processing unit (CPU)], coupled with one or more general-purpose graphics processing units (GPGPUs), each [of them] with hundreds or thousands of simple [but] fast computational cores.” This is generally not an easy task, and the authors demonstrate how it can be accomplished when the selected algorithm is the well-known QR factorization. First, they explain in detail how multifrontal techniques are applied during the factorization and the subsequent solve phase. Then, they discuss the problem of achieving highly parallel computations on GPUs. Dense frontal matrices arise during sparse multifrontal QR factorization and have to be treated efficiently; the algorithm used to handle these dense blocks is also thoroughly explained. Many figures, code fragments, and explanations of the loops involved make the presentation easier to follow. Numerous numerical tests at the end of the paper demonstrate the efficiency of the algorithm, and plans for future research are briefly outlined. The algorithm may be useful to many researchers working with very large scientific and engineering models.
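For readers unfamiliar with how a QR factorization solves a least squares problem, the following is a minimal sketch in plain Python. It is not the authors' GPU algorithm or their multifrontal code: it applies textbook Householder reflections to one small dense matrix, which is the kind of operation performed on each dense frontal matrix. The function name `householder_qr_solve` is hypothetical, and the sketch assumes a dense matrix with at least as many rows as columns and full column rank.

```python
import math


def householder_qr_solve(A, b):
    """Solve min ||A x - b||_2 for a dense m-by-n matrix A given as a list of rows.

    Illustrative only: assumes m >= n and full column rank.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]  # working copy; its upper triangle becomes R
    y = b[:]                   # working copy; becomes Q^T b

    for k in range(n):
        # Norm of the part of column k on and below the diagonal.
        norm_x = math.sqrt(sum(R[i][k] ** 2 for i in range(k, m)))
        if norm_x == 0.0:
            raise ValueError("matrix is rank deficient")

        # Householder vector v = x - alpha * e1, with the sign of alpha
        # chosen to avoid cancellation in v[0].
        alpha = -math.copysign(norm_x, R[k][k])
        v = [R[i][k] for i in range(k, m)]
        v[0] -= alpha
        vtv = sum(vi * vi for vi in v)

        # Apply H = I - 2 v v^T / (v^T v) to the remaining columns of R.
        for j in range(k, n):
            s = 2.0 * sum(v[i] * R[k + i][j] for i in range(len(v))) / vtv
            for i in range(len(v)):
                R[k + i][j] -= s * v[i]

        # Apply the same reflection to the right-hand side.
        s = 2.0 * sum(v[i] * y[k + i] for i in range(len(v))) / vtv
        for i in range(len(v)):
            y[k + i] -= s * v[i]

    # Back substitution on the leading n-by-n upper triangle: R x = (Q^T b)[:n].
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / R[i][i]
    return x


# Fit a straight line c0 + c1 * t to three points (t = 0, 1, 2).
coeffs = householder_qr_solve([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 2.0, 4.0])
```

In a sparse multifrontal method, many such dense factorizations are organized along a tree and can proceed in parallel, which is what makes the approach a candidate for GPU execution.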

Reviewer: Z. Zlatev	Review #: CR145912 (1806-0308)

Parallel Processors (C.1.2 ... )

Graphics Processors (I.3.1 ... )

Parallel Algorithms (G.1.0 ... )

Parallel Programming (D.1.3 ... )

Parallel Architectures (C.1.4 )

Other reviews under "Parallel Processors":	Date

Spending your free time Gelernter D. (ed), Philbin J. BYTE 15(5): 213-ff, 1990. Type: Article	Apr 1 1992

Higher speed transputer communication using shared memory Boianov L., Knowles A. Microprocessors & Microsystems 15(2): 67-72, 1991. Type: Article	Jun 1 1992

On stability and performance of parallel processing systems Bambos N., Walrand J. (ed) Journal of the ACM 38(2): 429-452, 1991. Type: Article	Sep 1 1992


Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®