Abstract
Motivated by the idea of cross-validation, this paper proposes a novel instance selection algorithm. Its novelties are that (1) it cross-selects the important instances from the original data set with a committee, and (2) it can handle instance selection from large data sets. We experimentally compared our algorithm with five state-of-the-art approaches (CNN, ENN, RNN, MCS, and ICF) on 3 artificial data sets and 6 UCI data sets, including 4 large data sets ranging from 130K to 4898K in size. The experimental results show that the proposed algorithm is very efficient and effective, especially on large data sets.
Original language | English |
---|---|
Pages (from-to) | 717-728 |
Number of pages | 12 |
Journal | Journal of Intelligent and Fuzzy Systems |
Volume | 30 |
Issue number | 2 |
Publication status | Published - 9 Feb 2016 |
Externally published | Yes |
Bibliographical note
This research is supported by the National Natural Science Foundation of China (61170040, 71371063), by the Natural Science Foundation of Hebei Province (F2013201110, F2013201220), by the Key Scientific Research Foundation of Education Department of Hebei Province (ZD20131028) and by the Opening Fund of Zhejiang Provincial Top Key Discipline of Computer Science and Technology at Zhejiang Normal University, China.
Keywords
- extreme learning machine
- Instances selection
- K-L divergence
- large data sets