TY - GEN
T1 - A comparative study among different kernel functions in flexible naïve Bayesian classification
AU - LIU, James N.K.
AU - HE, Yu-Lin
AU - WANG, Xi-Zhao
AU - HU, Yan-Xing
N1 - This paper is supported by GRF grant (5237/08E) and CRG grant (G-U756) of the Hong Kong Polytechnic University.
PY - 2011
Y1 - 2011
N2 - When a naïve Bayesian classifier is used to determine the class of an unknown example, the class-conditional probabilities of the continuous attributes must be estimated. In the flexible Bayesian classifier, the Gaussian kernel function is frequently used for this task under the framework of the Parzen window method. In this paper, six other kernel functions (uniform, triangular, Epanechnikov, biweight, triweight and cosine) are introduced into the flexible naïve Bayesian classifier. The performance of the seven kernels is compared on 30 UCI datasets. The experimental comparisons cover three aspects: classification accuracy, ranking performance and class probability estimation. The latter two are measured by the area under the ROC curve (AUC) and the conditional log likelihood (CLL), respectively. The kernels are compared via a two-tailed t-test at the 95 percent confidence level and Friedman's test at the 0.05 significance level. The experimental results show that the most commonly used Gaussian kernel does not achieve the best classification accuracy or AUC. However, on CLL, the Gaussian kernel is statistically significantly better than the other six kernels. Finally, analyses based on the experimental results are given.
AB - When a naïve Bayesian classifier is used to determine the class of an unknown example, the class-conditional probabilities of the continuous attributes must be estimated. In the flexible Bayesian classifier, the Gaussian kernel function is frequently used for this task under the framework of the Parzen window method. In this paper, six other kernel functions (uniform, triangular, Epanechnikov, biweight, triweight and cosine) are introduced into the flexible naïve Bayesian classifier. The performance of the seven kernels is compared on 30 UCI datasets. The experimental comparisons cover three aspects: classification accuracy, ranking performance and class probability estimation. The latter two are measured by the area under the ROC curve (AUC) and the conditional log likelihood (CLL), respectively. The kernels are compared via a two-tailed t-test at the 95 percent confidence level and Friedman's test at the 0.05 significance level. The experimental results show that the most commonly used Gaussian kernel does not achieve the best classification accuracy or AUC. However, on CLL, the Gaussian kernel is statistically significantly better than the other six kernels. Finally, analyses based on the experimental results are given.
KW - AUC
KW - biweight
KW - CLL
KW - cosine
KW - density estimation
KW - Epanechnikov
KW - Gaussian
KW - Naïve Bayesian classifier
KW - triangular
KW - triweight
KW - uniform
UR - http://www.scopus.com/inward/record.url?scp=80155175961&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2011.6016813
DO - 10.1109/ICMLC.2011.6016813
M3 - Conference paper (refereed)
AN - SCOPUS:80155175961
SN - 9781457703058
VL - 4
T3 - International Conference on Machine Learning and Cybernetics
SP - 638
EP - 643
BT - Proceedings of 2011 International Conference on Machine Learning and Cybernetics, ICMLC 2011
PB - IEEE
T2 - 2011 International Conference on Machine Learning and Cybernetics, ICMLC 2011
Y2 - 10 July 2011 through 13 July 2011
ER -