Investigating the Performance of Cosine Value and Jensen-Shannon Divergence in the kNN Algorithm

Abstract:

K Nearest Neighbor (kNN) is a commonly used text categorization algorithm. Previous studies have mainly focused on improving the algorithm by modifying feature selection and the selection of the k value. This research investigates the possibility of using Jensen-Shannon Divergence as the similarity measure in the kNN classifier and compares its classification accuracy with that of the Cosine value. The experiments indicate that the kNN algorithm based on Jensen-Shannon Divergence outperforms the one based on the Cosine value, although performance also depends largely on the number of categories and the number of documents in each category.
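
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, feature weighting, or how the divergence is turned into a similarity score, so everything below is an assumption made for illustration: documents are plain term-count vectors, the base-2 Jensen-Shannon divergence is converted to a similarity as 1 - JSD, neighbors vote with equal weight, and the names cosine_similarity, js_divergence, and knn_classify are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(p, q):
    """Cosine of the angle between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_p * norm_q)


def js_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two term-count vectors.

    Each vector is first normalized to a probability distribution.
    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions that share no terms.
    """
    sum_p, sum_q = sum(p), sum(q)
    if sum_p == 0 or sum_q == 0:
        raise ValueError("cannot normalize an all-zero vector")
    p = [a / sum_p for a in p]
    q = [b / sum_q for b in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i = 0 contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def knn_classify(query, training, k, measure="cosine"):
    """Label `query` by majority vote among its k most similar documents.

    `training` is a list of (vector, label) pairs.
    `measure` is "cosine" or "jsd"; for "jsd" the similarity is taken
    to be 1 - JSD (an assumption, not taken from the paper).
    """
    if measure == "cosine":
        scored = [(cosine_similarity(query, v), label) for v, label in training]
    elif measure == "jsd":
        scored = [(1.0 - js_divergence(query, v), label) for v, label in training]
    else:
        raise ValueError("measure must be 'cosine' or 'jsd'")
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy corpus: term counts over a four-word vocabulary (made-up data).
training = [
    ([3, 0, 1, 0], "sports"),
    ([2, 1, 0, 0], "sports"),
    ([0, 0, 2, 3], "finance"),
    ([0, 1, 1, 4], "finance"),
]
query = [1, 0, 0, 0]
label_cosine = knn_classify(query, training, k=3, measure="cosine")
label_jsd = knn_classify(query, training, k=3, measure="jsd")
```

On this toy corpus both measures assign the query to the same category. The differences reported in the abstract would only show up on a real corpus, where the number of categories and the number of documents per category vary.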

Info:

Periodical:

Advanced Materials Research (Volumes 532-533)

Pages:

1455-1459

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.532-533.1455

Online since:

June 2012

Authors:

Xiang Dong Li, Han Jia, Li Huang

Keywords:

Jensen-Shannon Divergence, KNN, Performance, Text Categorization

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
