Chinese New Word Identification Using N-Gram and PPM Models

Abstract:


New word identification is one of the difficult problems in Chinese information processing. This paper presents a new method for identifying new words. First, the text is segmented using an N-Gram model; then PPM is used to identify the new words in the text; finally, the newly identified words are added to the dictionary, which is updated using LRU. In a comparison with three well-known word segmentation systems, the experimental results show that this method improves the precision and recall of new word identification to a certain extent.
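The last step of the method, updating the dictionary using LRU, can be illustrated with a minimal sketch. It assumes that LRU means least-recently-used eviction from a dictionary of bounded size; the abstract gives no implementation details, so the class name `LRUDictionary`, the `capacity` parameter, the `touch` method, and the sample words are all hypothetical and are not taken from the paper.

```python
from collections import OrderedDict


class LRUDictionary:
    """Bounded word list that evicts the least recently used entry.

    Hypothetical helper for illustration; not the paper's implementation.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._words = OrderedDict()  # word -> use count, least recent first

    def __contains__(self, word):
        return word in self._words

    def touch(self, word):
        """Record one use of `word`, adding it if it is not yet stored."""
        self._words[word] = self._words.get(word, 0) + 1
        self._words.move_to_end(word)  # mark as most recently used
        if len(self._words) > self.capacity:
            self._words.popitem(last=False)  # evict the least recently used

    def words(self):
        """Return stored words ordered from least to most recently used."""
        return list(self._words)


# Made-up candidate words standing in for the output of the PPM step.
lexicon = LRUDictionary(capacity=3)
for candidate in ["微博", "给力", "团购", "微博", "山寨"]:
    lexicon.touch(candidate)
```

With a capacity of three, the fourth distinct word pushes out whichever stored word was used longest ago, so a candidate that keeps reappearing in the text stays in the dictionary while one-off candidates eventually drop out.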

Info:

Periodical:

Applied Mechanics and Materials, Vol. 109

Edited by:

Yongping Zhang, Linhua Zhou and Elwin Mao

Pages:

612-616

DOI:

10.4028/www.scientific.net/AMM.109.612

Citation:

D. Li et al., "Chinese New Word Identification Using N-Gram and PPM Models", Applied Mechanics and Materials, Vol. 109, pp. 612-616, 2012

Online since:

October 2011
