A K-Nearest-Neighbour Based Classifier for Securities Text Categorization

Abstract:

Event-driven investment has gained great importance and popularity. Because successful investment depends on timely and effective information, the automated categorization of documents into predefined labels has received increasing attention in recent years. This paper implements a new text document classifier by integrating the K-nearest neighbour (KNN) classification approach with the vector space model (VSM). By screening feature items and weighting key items, the proposed classifier turns financial information text into an N-dimensional vector and identifies positive and negative information, thereby optimizing the classification. In addition, the classification model constructed by the proposed algorithm can be updated incrementally, and it scales well for investors engaged in event-driven securities investment.
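The general idea of combining a vector space model with KNN can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature screening or weighting scheme, so the sketch assumes whitespace tokenization, smoothed TF-IDF weights, cosine similarity, and a similarity-weighted vote. The class name `KnnTextClassifier`, its methods, and the sample headlines are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for the
    # paper's (unspecified) feature screening step.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class KnnTextClassifier:
    """Hypothetical VSM + KNN classifier that supports incremental updates."""

    def __init__(self, k=3):
        self.k = k
        self.docs = []       # (term-frequency Counter, label) per training document
        self.df = Counter()  # document frequency of each term

    def add(self, text, label):
        # Incremental update: store the new document and refresh document
        # frequencies; nothing is retrained.
        tf = Counter(tokenize(text))
        self.docs.append((tf, label))
        self.df.update(tf.keys())

    def _vector(self, tf):
        # Assumption: smoothed TF-IDF weighting (the paper's scheme may differ).
        n = len(self.docs)
        return {t: c * (math.log((1 + n) / (1 + self.df[t])) + 1.0)
                for t, c in tf.items()}

    def predict(self, text):
        query = self._vector(Counter(tokenize(text)))
        scored = sorted(
            ((cosine(query, self._vector(tf)), label) for tf, label in self.docs),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Similarity-weighted vote among the k nearest neighbours.
        votes = Counter()
        for sim, label in scored[: self.k]:
            votes[label] += sim
        return votes.most_common(1)[0][0]


clf = KnnTextClassifier(k=3)
clf.add("profit rises on strong sales", "positive")
clf.add("company wins major contract", "positive")
clf.add("record profit and dividend increase", "positive")
clf.add("profit warning after weak sales", "negative")
clf.add("regulator fines company for fraud", "negative")
clf.add("shares fall on heavy loss", "negative")

print(clf.predict("strong profit and dividend"))
print(clf.predict("heavy loss and fraud fines"))
```

Because KNN keeps the training documents themselves rather than a fitted model, calling `add` with a newly labelled document is all that an incremental update requires in this sketch. The TF-IDF vectors are recomputed at prediction time, which keeps the code short at the cost of speed.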

Info:

Periodical:

Advanced Materials Research (Volumes 989-994)

Pages:

1541-1546

Online since:

July 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
