Abstract
An improved proximity weighted synthetic oversampling technique (IProWSyn) is proposed to address two shortcomings of the proximity weighted synthetic oversampling technique (ProWSyn): it does not remove noise samples when synthesizing samples, and when the smoothing factor takes values between 0 and 1, the weight ratio cannot readily cover the entire search space. The weight calculation strategy is changed by introducing an exponential function whose base ranges from 0 to 1; varying the base dynamically lets the weights cover a larger portion of the search space, so that better weights can be found. The IProWSyn, ASNSMOTE, and ProWSyn oversampling methods are applied to six imbalanced datasets: ada, ecoli1, glass1, haberman, Pima, and yeast1. The effectiveness of the method is verified using k-nearest neighbor (kNN) and neural network classifiers. The experimental results show that the F1 score, geometric mean (G-mean), and area under the curve (AUC) of IProWSyn are higher than those of the other oversampling methods on most datasets, indicating that IProWSyn has better overall classification performance and better generalization performance on these datasets.
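The point about weight coverage can be illustrated with a small sketch. The abstract gives no formulas, so both weight forms below are assumptions: the ProWSyn-style weight is taken to be `exp(-theta * k)` for proximity level `k = 0, 1, ...`, and the IProWSyn-style weight is taken to be `base ** k` with `0 < base < 1`. The function names are hypothetical and are not taken from the paper. Under these assumptions, the ratio between adjacent level weights is `exp(-theta)` in the first case, which stays at or above roughly 0.368 for `theta` in [0, 1], whereas in the second case it equals `base` and can take any value in (0, 1).

```python
import math


def exp_decay_weights(theta, levels):
    """Assumed ProWSyn-style weights: exp(-theta * k), normalized to sum to 1."""
    raw = [math.exp(-theta * k) for k in range(levels)]
    total = sum(raw)
    return [w / total for w in raw]


def base_power_weights(base, levels):
    """Assumed IProWSyn-style weights: base ** k with 0 < base < 1, normalized."""
    if not 0.0 < base < 1.0:
        raise ValueError("base must lie strictly between 0 and 1")
    raw = [base ** k for k in range(levels)]
    total = sum(raw)
    return [w / total for w in raw]


def adjacent_ratio(weights):
    """Ratio of the second-level weight to the first-level weight."""
    return weights[1] / weights[0]


if __name__ == "__main__":
    # With theta limited to [0, 1], the ratio cannot fall below exp(-1).
    for theta in (0.0, 0.5, 1.0):
        print("theta", theta, "ratio", adjacent_ratio(exp_decay_weights(theta, 5)))
    # With a base in (0, 1), the ratio can be made as small as desired.
    for base in (0.05, 0.3, 0.9):
        print("base", base, "ratio", adjacent_ratio(base_power_weights(base, 5)))
```

Normalizing the weights does not change the adjacent ratio, so the comparison depends only on the decay parameter and not on the number of proximity levels.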
Translated title of the contribution | Improved proximity weighted synthetic oversampling technique |
---|---|
Original language | Chinese (Simplified) |
Pages (from-to) | 748-755 |
Number of pages | 8 |
Journal | Shenzhen Daxue Xuebao (Ligong Ban)/Journal of Shenzhen University Science and Engineering |
Volume | 41 |
Issue number | 6 |
Early online date | 23 Aug 2024 |
DOIs | |
Publication status | Published - Nov 2024 |
Bibliographical note
Publisher Copyright: © 2024 Editorial Office of Journal of Shenzhen University. All rights reserved.
Funding
Foundation: Science and Technology Research Project of Colleges and Universities in Hebei Province (ZC2022071); Research Fund of Cangzhou Normal University (XNJJl1904); Natural Science Foundation of Guangdong Province (2023A1515011667); Guangdong Basic and Applied Basic Research Foundation (2023B1515120020)
Corresponding author: Associate Professor WANG Xiaolan ([email protected])
Citation: XING Sheng, WANG Xiaolan, SHEN Jiaxing, et al. Improved proximity weighted synthetic oversampling technique [J]. Journal of Shenzhen University Science and Engineering, 2024, 41(6): 748-755. (in Chinese)
Keywords
- artificial intelligence
- imbalanced data
- k-nearest neighbor classifier
- neural network
- oversampling method
- proximity weighted synthetic oversampling technique