Multi-Class Imbalance Classification Based on Data Distribution and Adaptive Weights

Shuxian LI, Liyan SONG, Xiaoyu WU, Zheng HU, Yiu-Ming CHEUNG, Xin YAO

Research output: Journal Publications › Journal Article (refereed) › peer-review

5 Citations (Scopus)

Abstract

AdaBoost approaches have been used for multi-class imbalance classification with an imbalance ratio measured on class sizes. However, such a ratio assigns every training sample of a class the same weight and thus fails to reflect the data distribution within the class. We propose to incorporate the density information of training samples into the class imbalance ratio so that samples of the same class can have different weights. Because the imbalance and density factors can be calculated from the entire training set, the weight of a training sample resulting from these two factors remains static throughout the training epochs. However, static weights cannot reflect the up-to-date training status of the base learners. To deal with this, we design an adaptive weighting mechanism that uses the up-to-date training status to further alleviate the multi-class imbalance issue. Ultimately, we incorporate the class imbalance ratio, the density-based factor, and the adaptive weighting mechanism into a single variable, from which the adaptive weights of all training samples are computed. Experimental studies are carried out to investigate the effectiveness of the proposed approach, and of each of its three components, in dealing with multi-class imbalance classification problems.
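The three ingredients named in the abstract can be illustrated with a small sketch. It is not the authors' formulation: the abstract gives no formulas, so every concrete choice here is an assumption made for illustration. The imbalance factor is taken as the largest class size divided by the sample's own class size, the density factor as the normalised mean distance to the k nearest same-class neighbours (so sparser samples weigh more), and the adaptive step is the standard SAMME multi-class AdaBoost update swapped in as a stand-in for the paper's mechanism. The function names `static_weights` and `adaptive_update` are hypothetical.

```python
import numpy as np


def static_weights(X, y, k=3):
    """Assumed static weight: class-size ratio times a within-class density factor."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    # Imbalance factor (assumed form): largest class size / own class size.
    ratio = dict(zip(classes, counts.max() / counts))
    w = np.empty(len(y))
    for c in classes:
        idx = np.flatnonzero(y == c)
        pts = X[idx]
        # Pairwise Euclidean distances between samples of this class.
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        kk = min(k, len(idx) - 1)
        density = np.ones(len(idx))
        if kk > 0:
            # Mean distance to the kk nearest same-class neighbours
            # (column 0 of the sorted distances is the sample itself).
            sparsity = np.sort(d, axis=1)[:, 1:kk + 1].mean(axis=1)
            if sparsity.mean() > 0:
                density = sparsity / sparsity.mean()
        w[idx] = ratio[c] * density
    return w / w.sum()


def adaptive_update(w, y_true, y_pred, n_classes):
    """One SAMME-style reweighting step (a stand-in, not the paper's mechanism)."""
    miss = np.asarray(y_true) != np.asarray(y_pred)
    # Weighted error of the latest base learner, clipped away from 0 and 1.
    err = np.clip(np.sum(w[miss]), 1e-12, 1 - 1e-12)
    alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
    # Misclassified samples are up-weighted; the rest keep their weight.
    w_new = w * np.exp(alpha * miss)
    return w_new / w_new.sum()


if __name__ == "__main__":
    X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6]]
    y = [0, 0, 0, 0, 1, 1]
    w0 = static_weights(X, y)
    w1 = adaptive_update(w0, y, [0, 0, 0, 0, 1, 0], n_classes=2)
    print(w0)
    print(w1)
```

In this sketch the static weights are computed once from the whole training set, while the adaptive step is reapplied after each base learner is trained, which mirrors the static versus up-to-date distinction drawn in the abstract.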
Original language: English
Pages (from-to): 5265-5279
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 36
Issue number: 10
Early online date: 4 Apr 2024
Publication status: Published - 2024

Bibliographical note

Publisher Copyright: IEEE

Funding

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 62002148 and Grant 62250710682, in part by Guangdong Provincial Key Laboratory under Grant 2020B121201001, in part by the Program for Guangdong Introducing Innovative and Entrepreneurial Teams under Grant 2017ZT07X386, in part by the Research Institute of Trustworthy Autonomous Systems (RITAS), in part by the NSFC/Research Grants Council (RGC) Joint Research Scheme under Grant N_HKBU214/21, in part by the General Research Fund of RGC under Grant 12201321, Grant 12202622, and Grant 12201323, and in part by the RGC Senior Research Fellow Scheme under Grant SRFS2324-2S02.

Keywords

  • AdaBoost
  • Classification algorithms
  • Computer science
  • Costs
  • Ensemble learning
  • Learning systems
  • Linear programming
  • Training
  • adaptive weight
  • data density
  • ensembles
  • multi-class imbalance classification
