The Algorithm of Abstract Extraction Based on Semantics

Abstract:

When data mining involves document processing, extracting an abstract from the document content is an essential step. Abstract extraction algorithms in the tradition of Luhn select abstract sentences solely from those sentences that contain words occurring frequently in the text. Because they rely on surface word counts, these algorithms do not capture the deeper semantics of the full text, which limits their accuracy. To improve accuracy, we propose a method that improves candidate keyword extraction by taking substitute words into account and by considering the semantic meanings of the candidate keywords.
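
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and it simplifies Luhn's original scoring: sentences are ranked only by how many frequent words they contain. The stop-word list, the `SUBSTITUTES` map, and the function names are hypothetical, chosen for illustration; a real system would presumably draw substitute words from a lexical resource such as WordNet.

```python
import re
from collections import Counter

# Hypothetical stop-word list and substitute-word map, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "are", "to", "it", "was"}
SUBSTITUTES = {"automobile": "car", "vehicle": "car"}  # substitute word -> canonical word


def tokenize(text, substitutes=None):
    """Lowercase, split into words, drop stop words, map substitutes to a canonical form."""
    substitutes = substitutes or {}
    words = re.findall(r"[a-z]+", text.lower())
    return [substitutes.get(w, w) for w in words if w not in STOP_WORDS]


def extract_abstract(text, top_words=3, num_sentences=1, substitutes=None):
    """Return the sentences that contain the most occurrences of frequent words."""
    # Naive sentence splitting on terminal punctuation (an assumption of this sketch).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    counts = Counter(tokenize(text, substitutes))
    frequent = {w for w, _ in counts.most_common(top_words)}
    # Score each sentence by the number of frequent-word occurrences it contains.
    scored = [
        (sum(1 for w in tokenize(s, substitutes) if w in frequent), i)
        for i, s in enumerate(sentences)
    ]
    # Highest score first; ties go to the earlier sentence.
    best = sorted(scored, key=lambda t: (-t[0], t[1]))[:num_sentences]
    # Return the selected sentences in their original document order.
    return [sentences[i] for _, i in sorted(best, key=lambda t: t[1])]


text = "The car was red. An automobile is a vehicle. Dogs chase cats and dogs bark."
baseline = extract_abstract(text, top_words=1)
with_substitutes = extract_abstract(text, top_words=1, substitutes=SUBSTITUTES)
```

In this toy input, pure frequency counting treats "car", "automobile", and "vehicle" as unrelated words, so the baseline selects the sentence about dogs. Once the substitute words are merged into one canonical form, the shared concept becomes the most frequent term and a sentence about it is selected instead.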

Info:

Pages:

1506-1509

Online since:

February 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved

References:

[1] Hidalgo J M G. Evaluating cost-sensitive unsolicited bulk email categorization[C]. Proceedings of ACM Symposium on Applied Computing, 2002: 615-620.

DOI: 10.1145/508791.508911

[2] Hobbs J R. Information extraction from biomedical text[J]. Journal of Biomedical Informatics, 2002, 35(4): 260-264.

DOI: 10.1016/s1532-0464(03)00015-7

[3] Luhn H P. The automatic creation of literature abstracts[J]. IBM Journal of Research and Development, 1958, 2(2): 159-165.

[4] Girvan M, Newman M E J. Community structure in social and biological networks[J]. Proceedings of the National Academy of Sciences, 2002, 99(12): 7821-7826.

DOI: 10.1073/pnas.122653799

[5] Pedersen T, Banerjee S, Patwardhan S. Maximizing semantic relatedness to perform word sense disambiguation. Supercomputing Institute Research Report UMSI 2005/25, University of Minnesota, 2005.

[6] Fellbaum C. WordNet: An Electronic Lexical Database. Cambridge: MIT Press, 1998.

[7] Sebastiani F. Machine learning in automated text categorization[J]. ACM Computing Surveys, 2002, 34(1): 1-47.

DOI: 10.1145/505282.505283

[8] Alani H, Kim S, Millard D, et al. Automatic ontology-based knowledge extraction from web documents[J]. IEEE Intelligent Systems, 2003, 18(1): 14-21.

DOI: 10.1109/mis.2003.1179189
