Abstract
Semisupervised classification (SSC) algorithms use labeled and unlabeled data to predict labels of unseen instances. Classifier ensembles have been successfully studied and employed as an SSC approach. However, the generalization of existing semisupervised ensembles can be strongly affected by the incorrect label estimates that ensemble algorithms produce in order to train supervised base learners. These ensembles do not optimize the objective function present in their base learners, which causes their supervised base classifiers to be sensitive to incorrect labeling and to reinforce errors during training. We propose cluster-based boosting (CBoost), a multiclass classification algorithm with cluster regularization. In contrast to existing algorithms, CBoost and its base learners jointly perform a cluster-based semisupervised optimization, which allows base classifiers to overcome potential incorrect label estimates for unlabeled data. CBoost is effective and stable in the presence of overlapping classes and scarce labeled points in dense regions. Experiments on artificial and real-world datasets confirmed the effectiveness of our approach. © 2017 IEEE.
Original language | English |
---|---|
Article number | 8116700 |
Pages (from-to) | 408-420 |
Number of pages | 13 |
Journal | IEEE Transactions on Emerging Topics in Computational Intelligence |
Volume | 1 |
Issue number | 6 |
Early online date | 21 Nov 2017 |
Publication status | Published - Dec 2017 |
Externally published | Yes |
Keywords
- Boosting
- ClusterReg
- ensemble learning
- semi-supervised learning