Paper Title:
Chinese New Word Identification Using N-Gram and PPM Models
  Abstract

New word identification is one of the difficult problems in Chinese information processing. This paper presents a new method for identifying new words. First, the text is segmented using an N-Gram model; then PPM is used to identify the new words in the text; finally, the newly identified words are added to the dictionary, which is updated using LRU. Compared with three well-known word segmentation systems, the experimental results show that this method improves the precision and recall of new word identification to a certain extent.
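
The abstract does not describe how the dictionary update works. As a rough illustration only, the sketch below assumes that LRU means a least-recently-used replacement policy over a dictionary of fixed capacity. The class name `LRUDictionary`, the `capacity` parameter, and the sample words are hypothetical and are not taken from the paper.

```python
from collections import OrderedDict


class LRUDictionary:
    """Bounded word dictionary that evicts the least recently used entry.

    Hypothetical sketch: the paper's actual update rule is not given in the abstract.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.words = OrderedDict()  # word -> hit count, least recently used first

    def add(self, word):
        # Insert or refresh a word, then evict the stalest entry if over capacity.
        self.words[word] = self.words.get(word, 0) + 1
        self.words.move_to_end(word)
        if len(self.words) > self.capacity:
            self.words.popitem(last=False)

    def __contains__(self, word):
        # A successful lookup counts as a use and refreshes the word's position.
        if word in self.words:
            self.words.move_to_end(word)
            return True
        return False


# Assumed example: candidate new words produced by an earlier identification step.
lexicon = LRUDictionary(capacity=2)
for candidate in ["微博", "团购", "微博", "给力"]:
    lexicon.add(candidate)
```

With a capacity of two, the word that has gone longest without being seen is dropped when a third distinct word arrives, so frequently recurring new words tend to stay in the dictionary.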

  Info
Periodical
Applied Mechanics and Materials, Vol. 109
Chapter
Chapter 6: Materials and Mechanics Information System
Edited by
Yongping Zhang, Linhua Zhou and Elwin Mao
Pages
612-616
DOI
10.4028/www.scientific.net/AMM.109.612
Citation
D. Li, W. Tu, L. Shi, "Chinese New Word Identification Using N-Gram and PPM Models", Applied Mechanics and Materials, Vol. 109, pp. 612-616, 2012
Online since
October 2011

