Efficient Cluster-Based Boosting for Semisupervised Classification

Rodrigo G.F. SOARES, Huanhuan CHEN, Xin YAO

Research output: Journal Publications, Journal Article (refereed), peer-review

6 Citations (Scopus)

Abstract

Semisupervised classification (SSC) consists of using both labeled and unlabeled data to classify unseen instances. Due to the large amount of unlabeled data typically available, SSC algorithms must be able to handle large-scale data sets. Recently, various ensemble algorithms have been introduced with improved generalization performance when compared to single classifiers. However, existing ensemble methods are not able to handle typical large-scale data sets. We propose efficient cluster-based boosting (ECB), a multiclass SSC algorithm with cluster-based regularization that avoids generating decision boundaries in high-density regions. A semisupervised selection procedure reduces time and space complexities by selecting only the most informative unlabeled instances for the training of each base learner. We provide evidence to demonstrate that ECB is able to achieve good performance with small amounts of selected data and a relatively small number of base learners. Our experiments confirmed that ECB scales to large data sets while delivering generalization comparable to state-of-the-art methods. © 2012 IEEE.
Original language: English
Article number: 8320854
Pages (from-to): 5667-5680
Number of pages: 14
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 29
Issue number: 11
Early online date: 21 Mar 2018
Publication status: Published - Nov 2018
Externally published: Yes

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2016YFB1000905 and in part by the National Natural Science Foundation of China under Grant 91546116 and Grant 91746209.

Keywords

  • Cluster-based regularization
  • ensemble learning
  • multiclass classification
  • semisupervised classification
