Abstract
Tolerance Rough Set (TRS) theory is commonly employed for feature selection with incomplete data. However, TRS has limitations such as ignoring uncertainty, which often leads to the inclusion of redundant features and diminished classification accuracy. To address these limitations, we propose an extension called Subrelation Tolerance Class (STC). STC decomposes the tolerance relation into two subrelations, enabling a two-stage certainty measurement. This approach progressively filters out certain regions, thereby reducing computational space requirements, and introduces a new significance measure that considers both certain and uncertain information. Leveraging STC and our proposed measure, we develop an incremental feature selection algorithm capable of handling incomplete streaming data. We conduct experiments on real-world datasets and compare the performance with existing algorithms to validate the superiority of our method. The experimental results show that our algorithm reduces the execution time by over 89.78% compared to the baselines while maintaining the classification accuracy.
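As background for the abstract above, the following is a minimal sketch of the classical tolerance relation that TRS builds on for incomplete data: two objects are tolerant when, on every selected attribute, their values agree or at least one value is missing. It does not reproduce the paper's STC decomposition or its significance measure, whose definitions are not given here. The names `MISSING`, `tolerant`, `tolerance_classes`, and `positive_region`, and the toy data, are hypothetical and chosen only for illustration.

```python
from typing import Hashable, List, Optional, Sequence, Set

MISSING = None  # assumption: a missing attribute value is encoded as None

Row = Sequence[Optional[Hashable]]


def tolerant(x: Row, y: Row, attrs: Sequence[int]) -> bool:
    """True if x and y agree on every attribute in attrs, treating a
    missing value as compatible with anything."""
    return all(
        x[a] is MISSING or y[a] is MISSING or x[a] == y[a]
        for a in attrs
    )


def tolerance_classes(data: Sequence[Row], attrs: Sequence[int]) -> List[Set[int]]:
    """For each object, the set of object indices tolerant with it."""
    return [
        {j for j, y in enumerate(data) if tolerant(x, y, attrs)}
        for x in data
    ]


def positive_region(
    data: Sequence[Row], labels: Sequence[Hashable], attrs: Sequence[int]
) -> Set[int]:
    """Objects whose whole tolerance class carries a single decision label,
    i.e. the objects that can be classified with certainty."""
    classes = tolerance_classes(data, attrs)
    return {
        i
        for i, cls in enumerate(classes)
        if len({labels[j] for j in cls}) == 1
    }


# Toy incomplete decision table: two condition attributes, one label each.
data = [
    ("a", "x"),
    ("a", MISSING),
    ("b", "x"),
    (MISSING, "y"),
]
labels = ["p", "p", "q", "q"]

classes = tolerance_classes(data, attrs=[0, 1])
certain = positive_region(data, labels, attrs=[0, 1])
```

In this toy table, objects 1 and 3 fall into tolerance classes that mix both labels, so only objects 0 and 2 are certain. Objects like 1 and 3 make up the uncertain information that, according to the abstract, the proposed measure takes into account alongside the certain region.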
Original language | English |
---|---|
Article number | 110125 |
Journal | Pattern Recognition |
Volume | 148 |
Early online date | 19 Nov 2023 |
Publication status | Published - Apr 2024 |
Bibliographical note
Publisher Copyright: © 2023 Elsevier Ltd
Funding
This work was supported by the National Natural Science Foundation of China [grant numbers 72271063, 71871069, 61962038, 62262045] and the Open Research Fund of Guangxi Key Lab of Human-machine Interaction and Intelligent Decision, China [No. GXHIID2203].
Keywords
- Incremental feature selection
- Significance measure
- Sub-tolerance relation
- Tolerance rough set