Abstract
The classification of patterns into naturally ordered labels is referred to as ordinal regression or ordinal classification. This classification setting is usually highly imbalanced by nature, because some classes in the problem are a priori more probable than others. Although standard over-sampling methods can improve the classification of minority classes in ordinal classification, they tend to introduce severe errors in terms of the ordinal label scale, because they do not take the ordering into account. This paper develops, for the first time, a specific ordinal over-sampling method to improve the performance of machine learning classifiers. The proposed method includes ordinal information by approaching over-sampling from a graph-based perspective. The results presented in this paper show the good synergy of a popular ordinal regression method (a reformulation of support vector machines) with the proposed graph-based algorithms, and the possibility of improving both the classification and the ordering of minority classes. A cost-sensitive version of the ordinal regression method is also introduced and compared with the over-sampling proposals, with the cost-sensitive version generally performing worse on minority classes. © 2014 IEEE.
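To make the idea of ordering-aware over-sampling concrete, the following is a minimal illustrative sketch and not the graph-based algorithm of the paper, whose details the abstract does not give. It assumes integer class ranks, a SMOTE-style interpolation, and a hypothetical rule that synthetic patterns may move toward neighbours in adjacent ranks only, by less than half the distance, so that they stay closer to the minority class. The function name `ordinal_oversample`, the parameter `k`, and the 0.5 cap are all assumptions made for illustration.

```python
import numpy as np

def ordinal_oversample(X, y, target_class, n_new, k=3, seed=0):
    """Hypothetical sketch: create n_new synthetic patterns for target_class.

    Each synthetic pattern is an interpolation between a minority pattern and
    one of its k nearest neighbours, taken either from the same class or from
    a class whose rank differs by exactly one (never from distant ranks).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    minority = X[y == target_class]
    adjacent = X[np.abs(y - target_class) == 1]
    if len(minority) < 2:
        raise ValueError("need at least two patterns in the target class")

    synthetic = np.empty((n_new, X.shape[1]))
    for i in range(n_new):
        start = minority[rng.integers(len(minority))]
        # Half of the time (an arbitrary choice) interpolate toward an adjacent rank.
        use_adjacent = len(adjacent) > 0 and rng.random() < 0.5
        pool = adjacent if use_adjacent else minority
        order = np.argsort(np.linalg.norm(pool - start, axis=1))
        if not use_adjacent:
            order = order[1:]  # drop the starting pattern itself (distance 0)
        neighbour = pool[rng.choice(order[:k])]
        # Cap the step toward another class so the new pattern stays nearer
        # to the minority pattern than to the adjacent-class neighbour.
        max_step = 0.5 if use_adjacent else 1.0
        synthetic[i] = start + rng.random() * max_step * (neighbour - start)
    return synthetic, np.full(n_new, target_class)

# Toy data with three ordered classes; class 1 is the minority.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [1.0, 1.0], [1.1, 0.9],
     [2.0, 2.0], [2.1, 2.2], [2.2, 2.1], [1.9, 2.0]]
y = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
X_new, y_new = ordinal_oversample(X, y, target_class=1, n_new=5)
```

Restricting the neighbour pool to the same or adjacent ranks is one simple way to keep synthetic patterns consistent with the label ordering; a standard over-sampler would make no such distinction between classes.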
Original language | English |
---|---|
Article number | 6940273 |
Pages (from-to) | 1233-1245 |
Number of pages | 13 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 27 |
Issue number | 5 |
Early online date | 30 Oct 2014 |
Publication status | Published - 1 May 2015 |
Externally published | Yes |
Funding
This work has been subsidized by the TIN2011-22794 project of the Spanish Ministerial Commission of Science and Technology (MICYT), FEDER funds and the P11-TIC-7508 project of the Junta de Andalucia (Spain). Xin Yao's work was supported by an EPSRC grant (EP/J017515/1) and a Royal Society Wolfson Research Merit Award.
Keywords
- imbalanced classification
- ordinal classification
- ordinal regression
- over-sampling