Abstract
Incremental feature selection can improve learning from accumulated data. We focus on incremental feature selection based on rough sets, which, along with their generalizations (e.g., fuzzy rough sets), reduce dimensionality without requiring domain knowledge such as data distributions. By analyzing the basic concepts of fuzzy rough sets on incremental datasets, we propose incremental update mechanisms for an information measure. Moreover, we introduce a key instance set containing representative instances to select supplementary features when new instances arrive. As the key instance set is much smaller than the whole dataset, the proposed incremental feature selection largely avoids redundant computation. We experimentally compare the proposed method with various non-incremental and two state-of-the-art incremental methods on a variety of datasets. The comparison results demonstrate that the proposed method achieves compact results with reduced computation time, especially on high-dimensional datasets.
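The paper's incremental mechanisms and key-instance-set construction are not reproduced here. As a rough, non-incremental illustration of the kind of fuzzy-rough dependency measure such feature selection builds on, the sketch below scores a feature subset by the mean lower-approximation membership of each instance to its own class and greedily adds features while the score improves. All function names, the min-max-scaled similarity relation, and the min t-norm are assumptions for illustration, not the authors' definitions:

```python
import numpy as np

def fuzzy_similarity(col):
    """Pairwise fuzzy similarity 1 - |a_i - a_j| on a min-max scaled feature column."""
    rng = col.max() - col.min()
    scaled = (col - col.min()) / rng if rng > 0 else np.zeros_like(col)
    return 1.0 - np.abs(scaled[:, None] - scaled[None, :])

def dependency(X, y, feats):
    """Fuzzy-rough dependency of the labels on a feature subset: mean
    lower-approximation membership of each instance to its own class."""
    # Relation for a subset: element-wise min (t-norm) over per-feature relations.
    R = np.min([fuzzy_similarity(X[:, f]) for f in feats], axis=0)
    same = (y[:, None] == y[None, :])
    # Lower approximation: for each instance, the infimum over all instances of
    # max(1 - R, class-indicator); same-class pairs contribute 1.
    low = np.where(same, 1.0, 1.0 - R).min(axis=1)
    return low.mean()

def greedy_reduct(X, y, eps=1e-6):
    """Greedy forward selection: keep adding the feature with the largest
    dependency gain until the gain falls below eps."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        g, f = max((dependency(X, y, selected + [f]), f) for f in remaining)
        if g - best <= eps:
            break
        selected.append(f)
        remaining.remove(f)
        best = g
    return selected, best
```

On a toy dataset where the first feature separates the two classes, the first feature chosen is that discriminative one; an incremental variant in the paper's spirit would re-evaluate such gains only against a small key instance set rather than the full data.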
Original language | English |
---|---|
Pages (from-to) | 185-204 |
Number of pages | 20 |
Journal | Information Sciences |
Volume | 536 |
Early online date | 18 May 2020 |
DOIs | |
Publication status | Published - Oct 2020 |
Externally published | Yes |
Bibliographical note
This work is supported by the National Key Research & Development Plan (2018YFB1004401, 2017YFB1400700, 2016YFB1000702), NSFC (Nos. 61702522, 61772536, 61772537, 61732006, 61532021), NSSFC (No. 12&ZD220), the National Basic Research Program of China (973) (No. 2014CB340402), the National High-Technology Research and Development Program of China (863) (No. 2014AA015204), the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China (15XNLQ06). This study was partially done when the authors worked in the SA Center for Big Data Research at RUC. This Center is funded by a Chinese National 111 Project Attracting.
Keywords
- Feature selection
- Fuzzy rough set
- Incremental learning
- Information measure