Ensemble of classifiers based on multiobjective genetic sampling for imbalanced data

Everlandio R.Q. FERNANDES, Andre C.P.L.F. DE CARVALHO, Xin YAO

Research output: Journal Publications, Journal Article (refereed), peer-review

58 Citations (Scopus)

Abstract

Imbalanced datasets may negatively impact the predictive performance of most classical classification algorithms. This problem, commonly found in real-world applications, is known in the machine learning domain as imbalanced learning. Most techniques for dealing with imbalanced learning have been proposed for, and applied only to, binary classification. When applied to multiclass tasks, their efficiency usually decreases and negative side effects may appear. This paper addresses these limitations by presenting a novel adaptive approach, E-MOSAIC (Ensemble of Classifiers based on MultiObjective Genetic Sampling for Imbalanced Classification). E-MOSAIC evolves a selection of samples extracted from the training dataset, which are treated as individuals of a multiobjective evolutionary algorithm (MOEA). The multiobjective process looks for the best combinations of instances capable of producing classifiers with high predictive accuracy in all classes. E-MOSAIC also incorporates two mechanisms to promote the diversity of these classifiers, which are combined into an ensemble specifically designed for imbalanced learning. Experiments were carried out using twenty imbalanced multiclass datasets. In these experiments, the predictive performance of E-MOSAIC is compared with that of state-of-the-art methods, including methods based on presampling, active learning, cost-sensitive learning, and boosting. According to the experimental results, the proposed method obtained the best predictive performance for the multiclass accuracy measures mAUC and G-mean. © 1989-2012 IEEE.
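
As a rough illustration of two ideas the abstract mentions, the Python sketch below computes G-mean and compares two candidate samples by Pareto dominance. It is not the authors' implementation. It assumes the common multiclass definition of G-mean (the geometric mean of the per-class recalls) and assumes that each candidate is scored by a vector of per-class accuracies to be maximized; the function names `per_class_recall`, `g_mean`, and `dominates` are hypothetical.

```python
import math
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that occurs in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls (assumed definition).

    The value is 0 whenever one class is missed entirely, which is why
    this measure penalizes classifiers that ignore minority classes.
    """
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


def dominates(a, b):
    """Pareto dominance for maximization (assumed objective vectors).

    a dominates b if it is at least as good on every objective and
    strictly better on at least one.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


# Toy imbalanced example: class 0 has four instances, classes 1 and 2 fewer.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 2]
print(per_class_recall(y_true, y_pred))
print(g_mean(y_true, y_pred))
print(dominates((0.9, 0.8), (0.9, 0.7)))
```

In this toy example six of the seven predictions are correct, yet one of the two class 1 instances is misclassified, so the recall of class 1 is only one half and G-mean falls well below the overall accuracy.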
Original language: English
Article number: 8640265
Pages (from-to): 1104-1115
Number of pages: 12
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 32
Issue number: 6
Early online date: 12 Feb 2019
Publication status: Published - 1 Jun 2020
Externally published: Yes

Bibliographical note

The authors would like to thank FAPESP, CNPq, CAPES and Intel for their financial support.

Keywords

  • Ensemble of classifiers
  • Evolutionary algorithm
  • Imbalanced datasets
