Friday, December 4, 2009

Research Notes 7: Ensemble Learning for Data Stream Classification

Ensemble learning attempts to produce a strong learner by combining many weak learners. It can build a stronger classification model on top of multiple simpler models, such as decision trees, provided there is significant diversity among the models.
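To see why diversity matters, here is a small back-of-the-envelope sketch. It assumes a binary problem, an odd number of voters, and errors that are fully independent across models, which is an idealization real ensembles only approximate. The function name `majority_vote_accuracy` is made up for this note.

```python
from math import comb


def majority_vote_accuracy(p, n):
    """Chance that a majority of n voters is correct.

    Assumes (idealized): binary labels, n is odd, and every voter is
    correct with probability p independently of the others.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    need = n // 2 + 1  # smallest number of correct votes that wins
    # Sum the binomial probabilities of k correct votes for k >= need.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))
```

Under these assumptions, three voters that are each right 60% of the time already do better together than alone, and the gain grows with more voters. If the models all make the same mistakes, the independence assumption fails and the gain disappears.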

The SEA paper (2001) suggests that even simple ensembles can produce results comparable to those of individual classifiers. SEA is an ensemble learning scheme proposed for data stream classification. In SEA, a new classifier is trained whenever a new chunk of data arrives. New classifiers are added to the existing ones to form an ensemble until the memory is full. Once the memory is full, each new classifier is evaluated to decide whether it should replace one of the existing classifiers. This replacement can be seen as a way to handle concept drift. To decide which classifiers to retain, SEA evaluates them with a novel heuristic that favors classifiers that correctly classify instances on which the ensemble is nearly undecided. In the paper, SEA is tested with C4.5 decision trees as the base classifiers.
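The chunk-by-chunk loop described above can be sketched roughly as follows. This is my own simplified reading, not the paper's code: the class name `SeaLikeEnsemble`, the toy base learner `train_majority`, and the scoring rule in `_quality` are all assumptions. In particular, `_quality` is only a stand-in for SEA's heuristic: it weights each instance by how close the vote was, rewards a correct classifier, and penalizes a wrong one. The sketch also scores a candidate on the chunk that arrives after the one it was trained on, so it is never judged on its own training data.

```python
from collections import Counter


def train_majority(chunk):
    """Toy base learner (hypothetical): always predicts the chunk's majority label."""
    label = Counter(y for _, y in chunk).most_common(1)[0][0]
    return lambda x: label


class SeaLikeEnsemble:
    """Fixed-size ensemble updated one data chunk at a time (simplified sketch)."""

    def __init__(self, train_fn, max_size):
        self.train_fn = train_fn  # maps a chunk [(x, y), ...] to a callable x -> label
        self.max_size = max_size  # the "memory" limit on ensemble members
        self.members = []
        self.pending = None       # trained on the previous chunk, not yet scored

    def predict(self, x):
        if not self.members:
            raise ValueError("ensemble is empty")
        return Counter(clf(x) for clf in self.members).most_common(1)[0][0]

    def _quality(self, clf, voters, chunk):
        # Stand-in heuristic (assumption): instances where the vote is close
        # count more; correct predictions add weight, wrong ones subtract it.
        score = 0.0
        for x, y in chunk:
            counts = Counter(v(x) for v in voters).most_common(2)
            second = counts[1][1] if len(counts) > 1 else 0
            margin = (counts[0][1] - second) / len(voters)
            weight = 1.0 - margin
            score += weight if clf(x) == y else -weight
        return score

    def update(self, chunk):
        candidate, self.pending = self.pending, self.train_fn(chunk)
        if candidate is None:
            return  # first chunk: nothing to score yet
        if len(self.members) < self.max_size:
            self.members.append(candidate)  # memory not full: just add
            return
        # Memory full: score members and candidate on the new chunk, with the
        # candidate included among the voters, then replace the weakest
        # member if the candidate scores strictly higher.
        voters = self.members + [candidate]
        scores = [self._quality(m, voters, chunk) for m in self.members]
        worst = min(range(len(scores)), key=scores.__getitem__)
        if self._quality(candidate, voters, chunk) > scores[worst]:
            self.members[worst] = candidate
```

To try it, feed chunks as lists of `(x, y)` pairs to `update` and call `predict` in between. Swapping `train_majority` for a real learner such as a decision tree only requires a `train_fn` that returns a callable from an instance to a label.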
