Abstract
Semisupervised classification (SSC) algorithms use labeled and unlabeled data to predict labels of unseen instances. Classifier ensembles have been successfully studied and employed as an SSC approach. However, the generalization of existing semisupervised ensembles can be strongly affected by incorrect label estimates produced by ensemble algorithms in order to train supervised base learners. These ensembles do not optimize the objective function present in their base learners, which causes their supervised base classifiers to be sensitive to incorrect labeling and to reinforce errors during training. We propose cluster-based boosting (CBoost), a multiclass classification algorithm with cluster regularization. In contrast to existing algorithms, CBoost and its base learners jointly perform a cluster-based semisupervised optimization, which allows base classifiers to overcome potential incorrect label estimates for unlabeled data. CBoost is effective and stable in the presence of overlapping classes and scarce labeled points in dense regions. Experiments on artificial and real-world datasets confirmed the effectiveness of our approach.
Original language | English |
---|---|
Article number | 8116700 |
Pages (from-to) | 408-420 |
Number of pages | 13 |
Journal | IEEE Transactions on Emerging Topics in Computational Intelligence |
Volume | 1 |
Issue number | 6 |
Early online date | 21 Nov 2017 |
Publication status | Published - Dec 2017 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Funding
This work was supported in part by the National Key Research and Development Program of China under Grant 2016YFB1000905, in part by the National Natural Science Foundation of China under Grants 91546116 and 61673363, and in part by the CAPES Foundation, Ministry of Education of Brazil. The work of X. Yao was supported by a Royal Society Wolfson Research Merit Award.
Keywords
- Boosting
- ClusterReg
- ensemble learning
- semi-supervised learning