Computing Reviews

Tiled QR decomposition and its optimization on CPU and GPU computing system
Kim D., Park K. ICPP 2013 (Proceedings of the 2013 42nd International Conference on Parallel Processing, Oct 1-4, 2013), 744-753, 2013. Type: Proceedings
Date Reviewed: 05/20/14

Single-node heterogeneous computing systems composed of multicore central processing units (CPUs) and accelerators such as graphics processing units (GPUs) are becoming the norm in high-performance computing (HPC) environments. Each computing device has strengths and weaknesses for a given application, and identifying which computations should be performed on which device is an area of open research. Kim and Park present an algorithm that automatically distributes data and computation across an optimized number of devices in a single heterogeneous system. The application they target is tiled QR decomposition.

There are three primary contributions in their work. First, they break the QR decomposition of a single tile into multiple tasks, and distribute small or serial tasks to the CPU and large or highly parallel tasks to one or more GPUs. Second, they automatically optimize the number of devices utilized based on the properties of the devices in the system. Third, they use a distribution guide array to record which tiles each active device operates on. The authors evaluate their algorithm on matrices of up to 4000 elements on a side with randomly generated values.
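To make the idea of a distribution guide array concrete, here is a minimal sketch. It is not Kim and Park's construction, which the review does not describe. It assumes that tile columns are the unit of distribution and that each device has a known relative throughput weight. The names `build_guide_array` and `columns_for_device` are hypothetical.

```python
def build_guide_array(num_tile_cols, device_weights):
    """Return a list whose entry j is the index of the device owning tile column j.

    Assumption (not from the paper): columns are handed out by a smooth
    weighted round-robin, so a device with weight w receives roughly
    w / sum(weights) of the columns, interleaved rather than in one block.
    """
    total = float(sum(device_weights))
    credit = [0.0] * len(device_weights)
    guide = []
    for _ in range(num_tile_cols):
        # Every device earns credit in proportion to its weight.
        for d, w in enumerate(device_weights):
            credit[d] += w / total
        # The device with the most credit takes the next column and pays for it.
        owner = max(range(len(credit)), key=lambda d: credit[d])
        credit[owner] -= 1.0
        guide.append(owner)
    return guide


def columns_for_device(guide, device):
    """List the tile columns assigned to one device."""
    return [j for j, owner in enumerate(guide) if owner == device]


# Hypothetical system: device 0 is three times as fast as device 1.
guide = build_guide_array(8, [3, 1])
print(guide)                         # [0, 0, 1, 0, 0, 0, 1, 0]
print(columns_for_device(guide, 1))  # [2, 6]
```

Interleaving the columns, rather than giving each device one contiguous block, keeps every device busy as the factorization advances and the active part of the matrix shrinks. That is a design choice of this sketch, not a claim about the paper.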

This paper is not for those looking for a new parallel algorithm for QR decomposition, as Kim and Park base their implementation on the Householder reflections method. What they really present is a load-balancing algorithm integrated into tiled QR decomposition. Some of the optimal computing configurations are not intuitive, which lends value to their automated technique. While not always the easiest paper to read, Kim and Park’s work should be of value to those trying to optimize a parallel algorithm across a number of heterogeneous computing devices in a single system.
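For readers unfamiliar with the Householder reflections method mentioned above, the following sketch shows the basic building block: a reflection that zeroes every entry of a vector below the first. It is a textbook illustration in plain Python, not the authors' tiled implementation, and the function names are hypothetical.

```python
import math


def householder_vector(x):
    """Return (v, alpha): unit vector v such that (I - 2 v v^T) x = [alpha, 0, ..., 0].

    Assumes x is a non-zero list of floats.
    """
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    # Choose the sign opposite to x[0] to avoid cancellation.
    alpha = -math.copysign(norm_x, x[0])
    v = list(x)
    v[0] -= alpha
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm_v for vi in v], alpha


def apply_reflector(v, x):
    """Compute (I - 2 v v^T) x for a unit vector v."""
    dot = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - 2.0 * dot * vi for vi, xi in zip(v, x)]


v, alpha = householder_vector([3.0, 4.0])
reflected = apply_reflector(v, [3.0, 4.0])
# reflected is approximately [-5.0, 0.0], and alpha is -5.0
```

A QR factorization applies one such reflection per column. A tiled variant applies them tile by tile, which is what exposes the independent tasks that can be scheduled across devices.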

Reviewer: Chris Lupo
Review #: CR142298 (1408-0645)
