The primary problem with online incremental machine learning methods is the tradeoff between predictive accuracy and the growing computational cost of retaining ever-larger amounts of training data.
When should training data be forgotten? This paper argues that the "support vector" philosophy of keeping only the extreme examples, those near the boundaries of a concept, is a good answer to this question. The paper presents two new incremental learning algorithms, AQ11-PM and GEM-PM, obtained by adding partial memory to the AQ11 algorithm and by restricting GEM's memory to just the extreme examples. These four algorithms (the two new partial-memory learners and their full-memory counterparts) were tested on synthetic data and on two well-chosen real-world applications: computer intrusion detection and blasting cap detection in X-ray images.
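To make the partial-memory idea concrete, the following is a minimal sketch, not the authors' AQ11-PM or GEM-PM code: a wrapper that retrains any batch learner on the retained examples plus each new batch, then shrinks its memory back to examples lying on each class's bounding-box boundary, a crude proxy for "extreme" examples near the concept boundary. The names `PartialMemoryLearner`, `select_extreme_examples`, and `base_fit` are illustrative assumptions.

```python
import numpy as np

def select_extreme_examples(X, y):
    """Keep only examples on the boundary of each class's axis-aligned
    bounding box -- a simple stand-in for the paper's notion of
    'extreme' examples near the concept boundary."""
    keep = np.zeros(len(X), dtype=bool)
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        Xc = X[idx]
        lo, hi = Xc.min(axis=0), Xc.max(axis=0)
        # An example is extreme if any attribute touches the class min or max.
        extreme = np.any((Xc == lo) | (Xc == hi), axis=1)
        keep[idx[extreme]] = True
    return keep

class PartialMemoryLearner:
    """Partial-memory wrapper: train on (retained examples + new batch),
    then forget everything except the extreme examples."""
    def __init__(self, base_fit):
        self.base_fit = base_fit   # callable: (X, y) -> fitted model
        self.model = None
        self.mem_X = None
        self.mem_y = None

    def partial_fit(self, X_new, y_new):
        if self.mem_X is None:
            X, y = X_new, y_new
        else:
            X = np.vstack([self.mem_X, X_new])
            y = np.concatenate([self.mem_y, y_new])
        self.model = self.base_fit(X, y)          # retrain on memory + new data
        keep = select_extreme_examples(X, y)      # retain only boundary examples
        self.mem_X, self.mem_y = X[keep], y[keep]
        return self
```

The point of the sketch is only the memory policy: accuracy depends on the base learner, while memory stays bounded by the number of boundary examples rather than the full training history.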
The paper also shows that the two new algorithms produce state-of-the-art results on drifting concepts. It would be interesting and important to know whether the advantages of "extreme example" memory hold for all incremental machine learning methods, not just the covering rule induction method underlying the four algorithms studied in the paper.