Abstract
AdaBoost approaches have been used for multi-class imbalance classification with an imbalance ratio measured on class sizes. However, such a ratio assigns every training sample of the same class the same weight and therefore fails to reflect the data distribution within a class. We propose to incorporate the density information of training samples into the class imbalance ratio so that samples of the same class can receive different weights. Because the imbalance and density factors can be computed in advance from the entire training set, the weight a training sample derives from these two factors remains static throughout the training epochs. Static weights, however, cannot reflect the up-to-date training status of the base learners. To address this, we design an adaptive weighting mechanism that exploits up-to-date training status to further alleviate the multi-class imbalance issue. Ultimately, we combine the class imbalance ratio, the density-based factor, and the adaptive weighting mechanism into a single variable, from which the adaptive weights of all training samples are computed. Experimental studies investigate the effectiveness of the proposed approach, and of each of its three components, in dealing with multi-class imbalance classification problems.
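The abstract gives no formulas, so the following is only a minimal sketch of the static part of the idea: per-class imbalance ratios combined with a within-class density factor to produce per-sample initial weights. The inverse-frequency imbalance ratio, the k-nearest-neighbour density estimate, and the function name `initial_sample_weights` are all illustrative assumptions, not the paper's actual definitions; the paper's adaptive component would then update these weights each boosting round using the base learners' training status.

```python
import numpy as np

def initial_sample_weights(X, y, k=5):
    """Illustrative static weights: imbalance ratio x within-class density factor."""
    classes, counts = np.unique(y, return_counts=True)
    # Assumed imbalance ratio: smaller classes get proportionally larger weight.
    imb = {c: counts.max() / n for c, n in zip(classes, counts)}

    w = np.zeros(len(y))
    for c in classes:
        idx = np.where(y == c)[0]
        Xc = X[idx]
        # Assumed density factor: mean distance to the k nearest same-class
        # neighbours, so sparse (low-density) samples get larger weights.
        d = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=-1)
        d.sort(axis=1)                      # column 0 is the self-distance (0)
        kk = min(k, len(idx) - 1)
        if kk >= 1:
            dens = d[:, 1:kk + 1].mean(axis=1)
            dens = dens / (dens.mean() + 1e-12)  # normalise within the class
        else:
            dens = np.ones(len(idx))
        w[idx] = imb[c] * dens
    return w / w.sum()                      # normalised initial boosting weights
```

In this sketch the two static factors simply multiply; a boosting loop would start from these weights instead of the uniform 1/N initialisation of standard AdaBoost and then re-weight per round.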
| Original language | English |
| --- | --- |
| Pages (from-to) | 5265-5279 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Knowledge and Data Engineering |
| Volume | 36 |
| Issue number | 10 |
| Early online date | 4 Apr 2024 |
| DOIs | |
| Publication status | Published - 2024 |
Bibliographical note
Publisher Copyright: IEEE
Keywords
- AdaBoost
- Classification algorithms
- Computer science
- Costs
- Ensemble learning
- Learning systems
- Linear programming
- Training
- adaptive weight
- data density
- ensembles
- multi-class imbalance classification