Online learning has been shown to be very useful for a large number of applications in which data arrive continuously and a timely response is required. In many online settings, the data stream can have a very skewed class distribution, known as class imbalance, as in fault diagnosis of real-time control monitoring systems and intrusion detection in computer networks. Classifying imbalanced data streams poses new challenges that have attracted very little attention so far. As the first work that formally addresses this problem, this paper looks into the underlying issues, clarifies the research questions, and proposes a framework for online class imbalance learning that decomposes the learning task into three modules. Within the framework, we use a time decay function to capture the imbalance rate dynamically. We then propose a class imbalance detection method to determine the current imbalance status of the data stream. Based on this information, two resampling-based online learning algorithms are developed to tackle class imbalance in data streams. Three basic types of class imbalance change are discussed in our studies. The results suggest the usefulness of the learning framework. The proposed methods are shown to be effective in terms of both minority-class accuracy and overall performance in all three cases we considered. © 2013 IEEE.
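The idea of tracking the imbalance rate with a time decay function can be illustrated with a minimal Python sketch. The abstract does not state the formula, so the exponential update used here, the class name `DecayedClassRate`, and the `decay` parameter are assumptions made for illustration rather than the authors' exact method.

```python
class DecayedClassRate:
    """Exponentially time-decayed estimate of each class's share of a stream.

    Hypothetical helper: the update rule is an assumed form of a
    time decay function, not taken from the paper itself.
    """

    def __init__(self, classes, decay=0.9):
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must lie strictly between 0 and 1")
        self.decay = decay
        # Every class starts at 0; estimates warm up as examples arrive.
        self.rate = {c: 0.0 for c in classes}

    def update(self, label):
        """Fold one newly arrived label into the per-class estimates."""
        for c in self.rate:
            hit = 1.0 if c == label else 0.0
            # Older examples are down-weighted by `decay` at each step.
            self.rate[c] = self.decay * self.rate[c] + (1.0 - self.decay) * hit
        return self.rate

    def minority(self):
        """Return the class with the smallest current estimate."""
        return min(self.rate, key=self.rate.get)


if __name__ == "__main__":
    tracker = DecayedClassRate(classes=[0, 1], decay=0.9)
    # Class 1 is rare at first, then becomes the majority later on.
    stream = [0] * 50 + [1] * 5 + [0] * 50 + [1] * 100
    for y in stream:
        tracker.update(y)
    print(tracker.rate, "minority:", tracker.minority())
```

Because recent labels carry more weight than old ones, the estimate follows a change in class proportions instead of averaging over the whole history. A smaller `decay` reacts faster but gives a noisier estimate.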
|Title of host publication|Proceedings of the 2013 IEEE Symposium on Computational Intelligence and Ensemble Learning, CIEL 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013|
|Published - Apr 2013