Computing Reviews

Methods for finding frequent items in data streams
Cormode G., Hadjieleftheriou M. The VLDB Journal: The International Journal on Very Large Data Bases 19(1): 3-20, 2010. Type: Article
Date Reviewed: 08/10/10

A large class of real-world applications, closely tied to large-scale industrial systems, requires the processing of data that arrives in the form of streams. Data stream mining is a challenging research problem because of the high volume of the data and the enormous rate at which data generation processes produce it, and because processing has to be done on the fly (as the data arrives) to allow for a timely analysis of the observed trends. Several research directions within this problem have been studied over the years.

Cormode and Hadjieleftheriou provide an interesting survey of the methods that have been proposed since the 1980s for mining frequent items in data streams. Simply stated, given a sequence of items, the problem of frequent item mining is to identify those items that occur most frequently in the sequence. Several variations of this problem have been studied that assume positive or negative importance weights on the items. In this work, the authors partition the existing methodologies for finding frequent items into three broad classes: counter-based algorithms, quantile algorithms, and sketch algorithms. For each class, they produce baseline implementations of the most important algorithms and perform a thorough experimental evaluation to test the performance of the algorithms on different datasets.
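
To make the counter-based class concrete, the following is a minimal Python sketch of the Misra-Gries summary, a well-known counter-based method. It is offered only as an illustration and is not taken from the authors' baseline implementations; the function name `misra_gries` and the parameter `k` are assumptions introduced here. With at most k - 1 counters, every item that occurs more than n/k times in a stream of length n is kept, and each stored count underestimates the true count by at most n/k.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Illustrative Misra-Gries summary (hypothetical helper, not the paper's code).

    Keeps at most k - 1 counters. Any item occurring more than n/k times
    in a stream of n items is guaranteed to be among the returned keys.
    """
    if k < 2:
        raise ValueError("k must be at least 2")

    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: count this occurrence.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Example: with k = 3, only items occurring more than 12/3 = 4 times are guaranteed to survive.
summary = misra_gries("aaaaaabbbcde", k=3)
```

The stored counts are lower bounds, so a second pass over the data (when one is possible) is the usual way to recover exact frequencies for the surviving candidates.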

The paper is well written, and leads to several important observations about the quality of the different algorithms that have been proposed for mining frequent items. It is a must-read for anyone interested in data stream mining.

Reviewer:  Aris Gkoulalas-Divanis Review #: CR138236 (1012-1274)
