A Document Feature Extraction Method Based on Concept-Word List

Abstract:

When a document is described in the Vector Space Model (VSM), it is usually assumed that words have no semantic relationship to one another, that is, that they are mutually orthogonal. To correct the inaccurate document description that results, this paper proposes a new description method that introduces concept-words, with the semantic similarity between words calculated from the HowNet ontology database. Comparative experiments show that the new method not only effectively improves document feature description in VSM but also significantly reduces the dimension of the document vector. The research is useful for document clustering, query word expansion in Web information retrieval, and personalized services in e-business applications.
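
The abstract does not give the paper's algorithm, so the following is only a minimal illustrative sketch of the general idea: words whose semantic similarity exceeds a threshold are merged into one concept-word, which shortens the document vector. The similarity table, the threshold of 0.75, and the names `similarity`, `build_concept_words`, and `concept_vector` are all assumptions made for illustration; a real system would obtain similarity scores from HowNet rather than from a hand-written table.

```python
from collections import Counter

# Hypothetical similarity scores standing in for HowNet-based values.
TOY_SIMILARITY = {
    frozenset(("car", "automobile")): 0.95,
    frozenset(("car", "truck")): 0.80,
    frozenset(("automobile", "truck")): 0.78,
    frozenset(("bank", "finance")): 0.70,
}


def similarity(a, b):
    """Return the toy similarity of two words (1.0 if identical, 0.0 if unknown)."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


def build_concept_words(vocabulary, threshold=0.75):
    """Greedily map each word to the first representative it is similar enough to."""
    concept_of = {}
    representatives = []
    for word in vocabulary:
        for rep in representatives:
            if similarity(word, rep) >= threshold:
                concept_of[word] = rep
                break
        else:
            # No existing concept is close enough, so the word starts a new one.
            representatives.append(word)
            concept_of[word] = word
    return concept_of


def concept_vector(tokens, concept_of):
    """Count term frequencies over concept-words instead of raw words."""
    return Counter(concept_of.get(t, t) for t in tokens)


tokens = ["car", "automobile", "truck", "bank", "finance", "car"]
vocabulary = sorted(set(tokens))
concept_of = build_concept_words(vocabulary)

word_vector = Counter(tokens)                     # plain VSM term counts
doc_vector = concept_vector(tokens, concept_of)   # concept-word counts

print(len(word_vector), "word dimensions ->", len(doc_vector), "concept dimensions")
```

In this toy run, "car", "automobile", and "truck" collapse into a single concept-word, while "bank" and "finance" stay separate because their assumed similarity falls below the threshold. The greedy grouping is one simple choice among many and is not claimed to be the authors' procedure.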

Info:

Periodical:

Advanced Materials Research, Vol. 267

Edited by:

Yanwen Wu

Pages:

386-392

DOI:

10.4028/www.scientific.net/AMR.267.386

Citation:

Z. Y. Zhu et al., "A Document Feature Extraction Method Based on Concept-Word List", Advanced Materials Research, Vol. 267, pp. 386-392, 2011

Online since:

June 2011
