Feature selection using localized generalization error for supervised classification problems using RBFNN

Wing W. Y. NG, Daniel S. YEUNG, Michael FIRTH, Eric C. C. TSANG, Xi Zhao WANG

Research output: Journal Publications; Journal Article (refereed); Research; peer-review

71 Citations (Scopus)

Abstract

A pattern classification problem usually involves using high-dimensional features that make the classifier very complex and difficult to train. With no feature reduction, both training accuracy and generalization capability will suffer. This paper proposes a novel hybrid filter-wrapper-type feature subset selection methodology using a localized generalization error model. The localized generalization error model for a radial basis function neural network bounds from above the generalization error for unseen samples located within a neighborhood of the training samples. Iteratively, the feature making the smallest contribution to the generalization error bound is removed. Moreover, the novel feature selection method is independent of the sample size and is computationally fast. The experimental results show that the proposed method consistently removes large percentages of features with statistically insignificant loss of testing accuracy for unseen samples. In the experiments for two of the datasets, the classifiers built using feature subsets with 90% of features removed by our proposed approach yield average testing accuracies higher than those trained using the full set of features. Finally, we corroborate the efficacy of the model by using it to predict corporate bankruptcies in the US.
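The iterative removal step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the bound's formula, so `bound_proxy` below is a hypothetical stand-in that combines the training mean squared error with a Monte Carlo estimate of how much the network output changes when inputs are perturbed within a neighborhood of half-width `q`. The tiny RBF network (random centers, one shared width, least-squares output weights), the greedy rule of dropping the feature whose removal leaves the lowest proxy, and all function and parameter names are likewise assumptions made for illustration.

```python
import numpy as np

def hidden(X, centers, width):
    """Gaussian hidden-layer activations for every sample/center pair."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbfnn(X, y, n_centers=5, seed=0):
    """Fit a tiny RBF network: random centers, shared width, least-squares weights."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(n_centers, len(X)), replace=False)
    centers = X[idx]
    # Shared width: mean pairwise center distance (floored to avoid dividing by zero).
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    width = max(dist.mean(), 1e-6)
    w, *_ = np.linalg.lstsq(hidden(X, centers, width), y, rcond=None)
    return centers, width, w

def predict(model, X):
    centers, width, w = model
    return hidden(X, centers, width) @ w

def bound_proxy(X, y, q=0.1, n_draws=20, seed=0):
    """Hypothetical stand-in for the error bound (an assumption, not the paper's formula)."""
    model = fit_rbfnn(X, y, seed=seed)
    rng = np.random.default_rng(seed)
    base = predict(model, X)
    r_emp = np.mean((base - y) ** 2)  # training mean squared error
    # Mean squared output change under uniform perturbations in [-q, q] per feature.
    changes = [
        (predict(model, X + rng.uniform(-q, q, size=X.shape)) - base) ** 2
        for _ in range(n_draws)
    ]
    sensitivity = np.mean(changes)
    return (np.sqrt(r_emp) + np.sqrt(sensitivity)) ** 2

def backward_select(X, y, n_keep, q=0.1):
    """Greedily drop one feature per pass until n_keep features remain."""
    kept = list(range(X.shape[1]))
    while len(kept) > n_keep:
        scores = []
        for f in kept:
            cols = [c for c in kept if c != f]
            scores.append(bound_proxy(X[:, cols], y, q=q))
        # Remove the feature whose absence leaves the lowest proxy value.
        kept.pop(int(np.argmin(scores)))
    return kept
```

Calling `backward_select(X, y, n_keep=2)` on a samples-by-features array `X` with numeric labels `y` returns the column indices that survive. Each pass retrains one small network per candidate feature, so this sketch trades speed for simplicity and makes no claim to reproduce the computational cost reported in the paper.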
Original language: English
Pages (from-to): 3706-3719
Number of pages: 14
Journal: Pattern Recognition
Volume: 41
Issue number: 12
DOI: 10.1016/j.patcog.2008.05.004
Publication status: Published - 1 Dec 2008

Fingerprint

Feature extraction
Classifiers
Testing
Pattern recognition
Neural networks
Experiments

Keywords

  • Feature selection
  • Generalization error
  • Neural network
  • RBFNN

Cite this

NG, Wing W. Y.; YEUNG, Daniel S.; FIRTH, Michael; TSANG, Eric C. C.; WANG, Xi Zhao. Feature selection using localized generalization error for supervised classification problems using RBFNN. In: Pattern Recognition. 2008; Vol. 41, No. 12. pp. 3706-3719.
@article{dca0c388e28b4198a85a8735f82aeb0c,
title = "Feature selection using localized generalization error for supervised classification problems using RBFNN",
abstract = "A pattern classification problem usually involves using high-dimensional features that make the classifier very complex and difficult to train. With no feature reduction, both training accuracy and generalization capability will suffer. This paper proposes a novel hybrid filter-wrapper-type feature subset selection methodology using a localized generalization error model. The localized generalization error model for a radial basis function neural network bounds from above the generalization error for unseen samples located within a neighborhood of the training samples. Iteratively, the feature making the smallest contribution to the generalization error bound is removed. Moreover, the novel feature selection method is independent of the sample size and is computationally fast. The experimental results show that the proposed method consistently removes large percentages of features with statistically insignificant loss of testing accuracy for unseen samples. In the experiments for two of the datasets, the classifiers built using feature subsets with 90{\%} of features removed by our proposed approach yield average testing accuracies higher than those trained using the full set of features. Finally, we corroborate the efficacy of the model by using it to predict corporate bankruptcies in the US.",
keywords = "Feature selection, Generalization error, Neural network, RBFNN",
author = "NG, {Wing W. Y.} and YEUNG, {Daniel S.} and FIRTH, Michael and TSANG, {Eric C. C.} and WANG, {Xi Zhao}",
year = "2008",
month = "12",
day = "1",
doi = "10.1016/j.patcog.2008.05.004",
language = "English",
volume = "41",
pages = "3706--3719",
journal = "Pattern Recognition",
issn = "0031-3203",
publisher = "Elsevier Ltd",
number = "12"
}

TY  - JOUR
T1  - Feature selection using localized generalization error for supervised classification problems using RBFNN
AU  - NG, Wing W. Y.
AU  - YEUNG, Daniel S.
AU  - FIRTH, Michael
AU  - TSANG, Eric C. C.
AU  - WANG, Xi Zhao
PY  - 2008/12/1
Y1  - 2008/12/1
AB  - A pattern classification problem usually involves using high-dimensional features that make the classifier very complex and difficult to train. With no feature reduction, both training accuracy and generalization capability will suffer. This paper proposes a novel hybrid filter-wrapper-type feature subset selection methodology using a localized generalization error model. The localized generalization error model for a radial basis function neural network bounds from above the generalization error for unseen samples located within a neighborhood of the training samples. Iteratively, the feature making the smallest contribution to the generalization error bound is removed. Moreover, the novel feature selection method is independent of the sample size and is computationally fast. The experimental results show that the proposed method consistently removes large percentages of features with statistically insignificant loss of testing accuracy for unseen samples. In the experiments for two of the datasets, the classifiers built using feature subsets with 90% of features removed by our proposed approach yield average testing accuracies higher than those trained using the full set of features. Finally, we corroborate the efficacy of the model by using it to predict corporate bankruptcies in the US.
KW  - Feature selection
KW  - Generalization error
KW  - Neural network
KW  - RBFNN
UR  - http://commons.ln.edu.hk/sw_master/7012
U2  - 10.1016/j.patcog.2008.05.004
DO  - 10.1016/j.patcog.2008.05.004
M3  - Journal Article (refereed)
VL  - 41
SP  - 3706
EP  - 3719
JO  - Pattern Recognition
JF  - Pattern Recognition
SN  - 0031-3203
IS  - 12
ER  -