Researching in Web Technology Classification Based on Improved Support Vector Machine

Abstract:

As the main platform for producing, publishing, processing, and exchanging information, the Web has given rise to massive, heterogeneous, dynamic, semi-structured or unstructured information resources [1], so extracting useful information from this mass of Web resources has become a problem that needs to be solved. This paper introduces the background and the state of research at home and abroad, expounds the related theory and technology of text classification, constructs a Web text classification system model, and gives a Web text collection algorithm and collection system. The proposed method is verified by experiment, and the results show that information extraction and Web classification are more accurate.

Pages:

563-566

Online since:

January 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved

References:

[1] Vapnik V.N., The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1999.

[2] Choon Y, Classification of World Web Documents, Pittsburgh, (2000).

[3] Min-Yen Kan, Web Page Categorization without the Web Page, (2004) 17-22.

[4] Maron M.E. and Kuhns J.L., Probabilistic Indexing and Information Retrieval, J. Journal of the ACM, (1960) 216-244.

DOI: 10.1145/321033.321035

[5] Robertson S.E. and Sparck Jones K., Relevance Weighting of Search Terms, J. Journal of the American Society for Information Science, (1976) 129-146.

[6] Salton G, Yang C and Yu, A Theory of Term Importance in Automatic Text Analysis, J. Journal of the American Society for Information Science, (1975) 33-44.

[7] S. Soderland, Learning Information Extraction Rules for Semi-structured and Free Text, J. Machine Learning, (2001)125-127.

[8] Milos Kovacevic, Michelangelo Diligenti and Marco Gori, Recognition of Common Areas in a Web Page Using a Visualization Approach, Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications (AIMSA), (2002).

DOI: 10.1007/3-540-46148-5_21