The dimensions of data cubes--the domains of online analytical processing (OLAP) queries--are hierarchical, and the important ones are ordered. In practice, queries that refer to ranges are common, and it is important that basic or aggregated data can be extracted at different levels of granularity. For this reason, hierarchical range queries are a relevant topic of investigation. Data cubes contain a great deal of redundant data; because of this redundancy, compressing large cubes can make both their storage and retrieval much more efficient.
In this paper on cube compression, Cuzzocrea gives a solution to a complicated but very important real-life problem: how to find a compression technique that supports not just one class of queries, but a large family of essentially different given queries, in a balanced way. Cuzzocrea's solution uses a hierarchical multidimensional histogram to compress the cube. After presenting the problem and discussing some related work, he describes the implemented algorithm. Then, he provides a theoretical complexity analysis of the algorithm.
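To make the general idea of histogram-based compression concrete, here is a minimal sketch of a one-dimensional, equal-width histogram that answers range-sum queries approximately. It is not Cuzzocrea's algorithm, which is hierarchical and multidimensional; the names `build_histogram` and `approx_range_sum`, the bucket width, and the sample cells are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# A bucket keeps only its boundaries and the sum of the cells it covers.
Bucket = Tuple[int, int, float]  # (start, end_exclusive, bucket_sum)


def build_histogram(cells: List[float], bucket_width: int) -> List[Bucket]:
    """Compress a 1-D cube into equal-width buckets (hypothetical helper)."""
    return [
        (start,
         min(start + bucket_width, len(cells)),
         sum(cells[start:start + bucket_width]))
        for start in range(0, len(cells), bucket_width)
    ]


def approx_range_sum(hist: List[Bucket], lo: int, hi: int) -> float:
    """Estimate sum(cells[lo:hi]), assuming values are spread uniformly
    inside each bucket (the usual histogram assumption)."""
    total = 0.0
    for start, end, bucket_sum in hist:
        overlap = min(hi, end) - max(lo, start)
        if overlap > 0:
            total += bucket_sum * overlap / (end - start)
    return total


# Illustrative data only: eight cells compressed into two buckets.
cells = [4.0, 6.0, 5.0, 5.0, 1.0, 9.0, 2.0, 8.0]
hist = build_histogram(cells, bucket_width=4)

exact = sum(cells[1:5])                # 17.0
approx = approx_range_sum(hist, 1, 5)  # 20.0
print(exact, approx)
```

Queries aligned with bucket boundaries are answered exactly, while queries that cut through a bucket carry an error. Balancing that error across many different query classes is the kind of trade-off the paper addresses.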
The author compares his implementation to other state-of-the-art histogram-based data cube compression techniques. While the sophisticated method described in the paper performs better than the other methods in terms of quality and scalability, it is slower than the others on the sample cubes.
I recommend this comprehensive paper, and I hope that the results will be used in commercial tools.