Abstract
Classification of imbalanced datasets is one of the main problems for machine learning techniques. The support vector machine (SVM) is biased toward majority-class samples, and minority-class samples may be incorrectly treated as noise. As a result, SVM has poor predictive accuracy on imbalanced datasets and produces inaccurate classification models. Existing class imbalance learning (CIL) techniques can make SVM less sensitive to class imbalance, but these methods remain vulnerable to noise and outliers. Moreover, despite its solid theoretical basis and good classification performance, SVM is not well suited to large-scale datasets because its training complexity is closely tied to the dataset size. To overcome these difficulties, we propose class imbalance learning using fuzzy adaptive resonance theory (Fuzzy ART) and the intuitionistic fuzzy twin SVM (CIL-FART-IFTSVM), which addresses class imbalance in the presence of noise and outliers and on large-scale datasets. In this method, we first modify the distribution of the dataset using Fuzzy ART as a clustering method to overcome the imbalance problem. Then, after data reduction, IFTSVM is used to find non-parallel hyperplanes for the generated data points. Finally, a coordinate descent scheme with active-set shrinking is applied to reduce the computational complexity. Forty-five imbalanced datasets are used to validate the performance of the proposed CIL-FART-IFTSVM method. The Friedman test and the bootstrap technique with 95% confidence intervals are applied to assess the results statistically. The experimental results indicate that the proposed method performs better than the compared methods, and that its training time on large-scale datasets is significantly shorter than that of the other classifiers.
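The abstract does not spell out the clustering step, so the following is only a minimal sketch of textbook Fuzzy ART (complement coding, category choice, vigilance test, weight update), assuming features already scaled to [0, 1]. The names `fuzzy_art` and `cluster_prototypes`, the parameter values, and the idea of replacing majority-class samples with per-cluster means are illustrative assumptions and not the authors' CIL-FART-IFTSVM procedure.

```python
def complement_code(x):
    """Complement coding: [x, 1 - x], so every coded input has the same L1 norm."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_art(samples, rho=0.75, alpha=0.001, beta=1.0):
    """Basic Fuzzy ART clustering (illustrative sketch).

    rho   : vigilance in [0, 1]; higher values give more, tighter clusters
    alpha : small positive choice parameter
    beta  : learning rate (1.0 = fast learning)
    Returns (labels, weights).
    """
    weights, labels = [], []
    for sample in samples:
        x = complement_code(sample)
        norm_x = sum(x)
        # Score every existing category: (choice value, match value, index).
        scored = []
        for j, w in enumerate(weights):
            overlap = sum(min(a, b) for a, b in zip(x, w))  # |x AND w_j|
            scored.append((overlap / (alpha + sum(w)), overlap / norm_x, j))
        scored.sort(key=lambda t: -t[0])  # highest choice value first
        # Accept the best-ranked category that passes the vigilance test.
        winner = next((j for _, match, j in scored if match >= rho), None)
        if winner is None:
            weights.append(x)  # no resonance: create a new category
            labels.append(len(weights) - 1)
        else:
            w = weights[winner]
            weights[winner] = [beta * min(a, b) + (1 - beta) * b
                               for a, b in zip(x, w)]
            labels.append(winner)
    return labels, weights


def cluster_prototypes(samples, labels):
    """Assumed reduction step: replace each cluster by the mean of its members."""
    groups = {}
    for s, k in zip(samples, labels):
        groups.setdefault(k, []).append(s)
    return [[sum(col) / len(rows) for col in zip(*rows)]
            for _, rows in sorted(groups.items())]


# Hypothetical majority-class samples forming two tight groups.
majority = [[0.10, 0.10], [0.12, 0.10], [0.90, 0.90], [0.88, 0.92]]
labels, _ = fuzzy_art(majority, rho=0.75)
prototypes = cluster_prototypes(majority, labels)
```

In this toy run the four majority samples collapse to two prototypes, which illustrates how clustering can shrink the majority class before a classifier is trained. The vigilance `rho` controls how aggressive the reduction is.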
Original language | English |
---|---|
Pages (from-to) | 659-682 |
Number of pages | 24 |
Journal | Information Sciences |
Volume | 578 |
Early online date | 6 Jul 2021 |
Publication status | Published - Nov 2021 |
Externally published | Yes |
Bibliographical note
This work was supported in part by the National Natural Science Foundation of China (Grants 61772344, 71371063, and 61732011) and in part by Basic Research Project of Knowledge Innovation Program in ShenZhen (JCYJ20180305125850156).
Keywords
- Class imbalance learning
- Coordinate descent
- Fuzzy ART
- Intuitionistic fuzzy number
- Twin support vector machine