Sorting and clustering are considered very important operations. As a consequence, the existence of efficient algorithms for these operations is of vital importance in several scientific fields, such as database systems and data mining. The authors investigate ways in which an algorithm can improve its performance by using an automatic tuning approach. In other words, they show how an algorithm can improve itself.
For the sorting operation, a learn-and-sort algorithm is proposed. A learn-and-cluster algorithm is proposed for the clustering operation. This latter operation is made up of two phases: the learning phase and the normal (clustering) phase. During the learning phase, the algorithm tunes itself with respect to the input data distribution, using a logarithmic number of steps (rounds). During the normal phase, an optimized version of the algorithm is applied. The normal method is based on the k-median clustering algorithm.
The work is based on a solid theoretical study, which analyzes both the time and space requirements of the proposed schemes. The research results of this work can be applied toward the development of more efficient algorithms. Database systems can benefit from the learn-and-sort approach, since sorting is one of the fundamental database operations. Moreover, knowledge discovery systems can use the learn-and-cluster method to elicit more efficient clustering (finding groups in data). It would be interesting to investigate other fundamental operations under this framework (for example, learn-and-search). Finally, a comparison with other approaches, such as evolutionary methods, would be very interesting.