Automatic Extraction of Domain-Specific Terms

Abstract:

Automatic extraction of domain-specific terms improves the efficiency of management staff. This paper proposes a variance-based method for extracting candidate domain-specific terms that considers three factors: the distribution of a term between domains, its distribution within a given domain, and the imbalance of the corpus. Candidate terms that are not meaningful on their own are then completed using the combinational degree between two words, computed from the original text, which yields meaningful terms. Experiments on a corpus of urban management complaints show that the approach extracts domain-specific terms effectively. We evaluate the results manually and compare them against TFIDF: Accuracy_lenient and Accuracy_strict are 13 percent and 7 percent higher than TFIDF, respectively.
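
As a rough illustration of the between-domain idea only, the following Python sketch scores each term by the variance of its relative frequency across domains. It is not the formula from the paper, which the abstract does not give. The function names, the normalisation by domain size (one simple way to reduce the effect of an unbalanced corpus), and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter
from statistics import pvariance


def relative_frequencies(domain_tokens):
    """Map each domain to {term: count / number of tokens in that domain}."""
    rel = {}
    for domain, tokens in domain_tokens.items():
        counts = Counter(tokens)
        total = len(tokens)
        rel[domain] = {t: c / total for t, c in counts.items()} if total else {}
    return rel


def between_domain_variance(domain_tokens):
    """Score each term by the population variance of its relative frequency
    across domains (hypothetical scoring, not the paper's exact method).

    Dividing by domain size means a large domain does not dominate
    simply because it contains more text.
    """
    rel = relative_frequencies(domain_tokens)
    vocab = set().union(*(freqs.keys() for freqs in rel.values()))
    return {
        term: pvariance([rel[d].get(term, 0.0) for d in rel])
        for term in vocab
    }


if __name__ == "__main__":
    # Toy, already-segmented corpus (illustrative data, not from the paper).
    corpus = {
        "urban": ["garbage", "street", "garbage", "the"],
        "finance": ["loan", "bank", "loan", "the"],
    }
    scores = between_domain_variance(corpus)
    # Terms concentrated in one domain rank first; "the" ranks last.
    for term, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        print(f"{term}\t{score:.6f}")
```

In this sketch a word such as "the", which is equally frequent in every domain, receives a score of zero, while a word concentrated in one domain receives a higher score. A full system would also need the within-domain factor and the word-combination step described in the abstract, which are not modelled here.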

Info:

Periodical:

Advanced Materials Research (Volumes 1044-1045)

Pages:

1088-1093

Online since:

October 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
