Compression is important. Estimates put the amount of digital data in the world today at hundreds or thousands of exabytes, numbers big enough to be almost incomprehensible, and the total grows faster every year. Compression helps to reduce storage requirements and to speed up network transmission. It is everywhere, from portable music players to image collections to video.
Of course, not all data can be compressed and, more importantly, the same methods do not work for all data. Claims of universal compression do appear from time to time, but these are usually snake oil or delusion. Indeed, a compression method that works well on one kind of data will often make other data larger, or even make it unusable. In many areas of computing, it is important to understand which types of compression work and which do not, and an understanding of the fundamental algorithms is helpful.
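The point about data getting larger is easy to check. The following is a minimal sketch, assuming Python's standard zlib module (a Deflate implementation); the helper name compressed_size and both sample inputs are illustrative assumptions, not taken from the book.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of data after zlib (Deflate) compression."""
    return len(zlib.compress(data, 9))


# Highly repetitive input: a dictionary method finds long repeated matches.
repetitive = b"abcabcabc" * 1000

# Pseudo-random input (fixed seed so runs are repeatable): there is
# essentially no redundancy for the compressor to exploit.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(repetitive)))

print("repetitive:", len(repetitive), "->", compressed_size(repetitive))
print("noise:     ", len(noise), "->", compressed_size(noise))
```

The repetitive input shrinks dramatically, while the noise comes out slightly larger than it went in, because the compressor still has to add its own framing bytes.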
This book provides an overview of compression methods in general, and goes into a fair amount of detail for some specific algorithms in use. There are two main divisions in the book. The first contains an overview of useful concepts, such as entropy, Huffman coding, and dictionary methods (such as Lempel-Ziv and deflate). The second contains chapters on arithmetic coding, image compression (JPEG and a brief introduction to wavelets), and audio compression, as well as a short chapter on a couple of other methods.
The discussions of specific algorithms are concise, but usually contain a fair amount of detail, including specific examples. There are also many figures and graphs illustrating the algorithms, as well as a number of exercises with answers. There is a nice bibliography and a reasonable glossary.
Most of the algorithms are described in words rather than in pseudocode or an implementation in a common language, although a few do have more formal descriptions or MATLAB code. For readers familiar with MATLAB, this is undoubtedly useful; for other readers, the MATLAB listings are sometimes a bit cryptic. In contrast, many of the graphs are annotated with the MATLAB code used to generate them. It is not clear how this adds to the information presented; in many cases, the graph-drawing commands swamp any computation done, and the practice seems strange given the lack of code for the compression algorithms themselves. It would also probably be useful to sketch out more completely what the file formats look like for each algorithm.
Another odd choice is the selection of audio compression algorithms. While mu-law and A-law compression are certainly among the simplest, the Moving Picture Experts Group (MPEG) family of algorithms is more widely used.
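To give a sense of how simple mu-law is, here is a minimal sketch of the continuous companding formula, F(x) = sgn(x) ln(1 + mu|x|) / ln(1 + mu), with mu = 255. It assumes samples normalized to the range -1 to 1, and the function names are illustrative. It is not the segmented 8-bit encoding used in telephony standards, and it is not taken from the book.

```python
import math

MU = 255          # the usual mu value for 8-bit mu-law
LEVELS = 127      # positive quantization levels for a signed 8-bit code


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Map a sample in [-1, 1] to [-1, 1], stretching quiet values."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def encode(x: float) -> int:
    """Compand, then quantize uniformly to a signed 8-bit range."""
    return round(mu_law_compress(x) * LEVELS)


def decode(code: int) -> float:
    """Undo the quantization scaling, then expand."""
    return mu_law_expand(code / LEVELS)


quiet = 0.01
linear = round(quiet * LEVELS) / LEVELS   # plain uniform 8-bit quantization
print("mu-law error:", abs(decode(encode(quiet)) - quiet))
print("linear error:", abs(linear - quiet))
```

For a quiet sample, the companded code lands much closer to the original value than plain uniform quantization at the same bit depth, which is the whole idea behind the method.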
In a few places, the results given by the book and the results I computed differed. I am not sure whether the error is in the description, in the results presented, or in my computations, and there seems to be no easy way to check some of these.
On the whole, this could be a useful book for a short course on compression at the upper undergraduate level, and would give students a taste of the material. In a few places (for audio and image methods), the mathematics might look a bit daunting, but the code given should more than compensate for that.