Research on Chinese Named Entity Recognition Based on Ontology

Abstract:

Chinese Named Entity Recognition (NER) plays a critical role in many Natural Language Processing (NLP) applications, such as Information Extraction and Machine Translation, yet it remains a challenging task because of its characteristics. This paper proposes a Chinese NER method that combines a Conditional Random Fields (CRFs) model with a domain ontology, which serves as a semantic feature alongside word and part-of-speech features. Experiments compared the two kinds of feature templates, and with the proposed approach the precision and recall of Chinese NER rose to 90.86% and 88.23%, respectively. Combining the ontology with the CRFs method thus effectively increased both the precision and the recall of Chinese NER.
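The feature design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the feature names (`w0`, `p0`, `onto0`), the one-token context window, and the reading of the "two kinds of feature templates" as a template with and without the ontology feature are all assumptions made for illustration. The function builds one feature dictionary per token, a format many CRF toolkits accept.

```python
from typing import Dict, List, Tuple

# Hypothetical toy ontology mapping a word to a domain concept class.
# The paper's actual ontology is not shown in this preview.
TOY_ONTOLOGY: Dict[str, str] = {
    "北京": "Location",
    "大学": "Organization",
}

Token = Tuple[str, str]  # (word, part-of-speech tag)


def token_features(sent: List[Token], i: int, use_ontology: bool = True) -> Dict[str, str]:
    """Build the feature dictionary for the token at position i."""
    word, pos = sent[i]
    feats = {"w0": word, "p0": pos}  # current word and POS tag

    # Previous token, or a beginning-of-sentence marker.
    if i > 0:
        feats["w-1"], feats["p-1"] = sent[i - 1]
    else:
        feats["BOS"] = "1"

    # Next token, or an end-of-sentence marker.
    if i < len(sent) - 1:
        feats["w+1"], feats["p+1"] = sent[i + 1]
    else:
        feats["EOS"] = "1"

    # Semantic feature: ontology class of the current word (assumed lookup).
    if use_ontology:
        feats["onto0"] = TOY_ONTOLOGY.get(word, "NONE")
    return feats


def sentence_features(sent: List[Token], use_ontology: bool = True) -> List[Dict[str, str]]:
    """Build feature dictionaries for every token in a sentence."""
    return [token_features(sent, i, use_ontology) for i in range(len(sent))]


# Example sentence with illustrative POS tags.
sentence = [("我", "r"), ("在", "p"), ("北京", "ns"), ("工作", "v")]
baseline = sentence_features(sentence, use_ontology=False)  # word + POS only
enriched = sentence_features(sentence, use_ontology=True)   # word + POS + ontology
```

Training a CRF once on `baseline`-style features and once on `enriched`-style features, and then comparing precision and recall, would mirror the kind of comparison the abstract reports.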

Info:

Pages: 1180-1185

Online since: August 2012

Copyright: © 2012 Trans Tech Publications Ltd. All Rights Reserved
