Comparative Analysis on Kernel Based Probability Density Estimation

Abstract:

In this paper, we compare the estimation performance of seven kernels (Uniform, Triangular, Epanechnikov, Biweight, Triweight, Cosine, and Gaussian) for probability density estimation with the Parzen window method. We first analyze the efficiencies of the seven kernels and then compare their estimation errors, measured by mean squared error (MSE). The theoretical analysis and the experimental comparisons show that the most commonly used Gaussian kernel is not the best choice for probability density estimation: its efficiency is low and its estimation error is high. These conclusions offer guidelines for selecting a kernel in practical applications of probability density estimation.
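
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the kernel formulas are the standard textbook forms, the efficiency is computed relative to the Epanechnikov kernel from sqrt(mu2(K)) * R(K) (one common definition, which may differ from the one used in the paper), and the sample size, bandwidth `h`, test density, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def _on_unit(u, values):
    # Kernels with compact support are zero outside |u| <= 1.
    return np.where(np.abs(u) <= 1.0, values, 0.0)

# Standard textbook forms of the seven kernels (assumed, not taken from the paper).
KERNELS = {
    "uniform": lambda u: _on_unit(u, 0.5 * np.ones_like(u)),
    "triangular": lambda u: _on_unit(u, 1.0 - np.abs(u)),
    "epanechnikov": lambda u: _on_unit(u, 0.75 * (1.0 - u**2)),
    "biweight": lambda u: _on_unit(u, 15.0 / 16.0 * (1.0 - u**2) ** 2),
    "triweight": lambda u: _on_unit(u, 35.0 / 32.0 * (1.0 - u**2) ** 3),
    "cosine": lambda u: _on_unit(u, np.pi / 4.0 * np.cos(np.pi * u / 2.0)),
    "gaussian": lambda u: np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi),
}

def parzen_density(x_eval, samples, h, kernel):
    """Parzen window estimate: f_hat(x) = 1/(n*h) * sum_i K((x - x_i) / h)."""
    u = (x_eval[:, None] - samples[None, :]) / h
    return kernel(u).sum(axis=1) / (samples.size * h)

def kernel_efficiency(kernel):
    """Efficiency relative to Epanechnikov via sqrt(mu2(K)) * R(K) (assumed definition)."""
    grid = np.linspace(-10.0, 10.0, 200001)
    dx = grid[1] - grid[0]
    k = kernel(grid)
    mu2 = np.sum(grid**2 * k) * dx       # second moment of the kernel
    roughness = np.sum(k**2) * dx        # R(K) = integral of K(u)^2
    epanechnikov_const = np.sqrt(1.0 / 5.0) * 3.0 / 5.0
    return epanechnikov_const / (np.sqrt(mu2) * roughness)

def mse(estimate, truth):
    """Mean squared error between estimated and true density on a grid."""
    return float(np.mean((estimate - truth) ** 2))

# Hypothetical experiment: 500 draws from a standard normal, fixed bandwidth.
rng = np.random.default_rng(0)
samples = rng.standard_normal(500)
x_grid = np.linspace(-6.0, 6.0, 1201)
true_pdf = np.exp(-0.5 * x_grid**2) / np.sqrt(2.0 * np.pi)
h = 0.5  # assumed bandwidth, shared by all kernels for simplicity

results = {
    name: (kernel_efficiency(k), mse(parzen_density(x_grid, samples, h, k), true_pdf))
    for name, k in KERNELS.items()
}
for name, (eff, err) in results.items():
    print(f"{name:13s} efficiency={eff:.3f}  MSE={err:.2e}")
```

Sharing a single bandwidth across kernels keeps the sketch short, but it favors some kernels over others because their scales differ; a fairer comparison would tune `h` per kernel, for example with one of the bandwidth selectors surveyed in the literature.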

Info:

Pages:

1655-1658

Online since:

March 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
