Chinese New Word Identification Using N-Gram and PPM Models
Abstract:
New word identification is one of the difficult problems in Chinese information processing. This paper presents a new method for identifying new words. First, the text is segmented using an N-gram model; then a PPM (prediction by partial matching) model is used to identify the new words in the text; finally, the newly identified words are added to the dictionary, which is updated using an LRU (least recently used) policy. In a comparison with three well-known word segmentation systems, the experimental results show that this method improves the precision and recall of new word identification to a certain extent.
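The abstract does not say how the LRU update is carried out, so the following is only a minimal sketch of one plausible reading: a dictionary of bounded size in which each use of a word marks it as recent, and the least recently used word is dropped when the capacity is exceeded. The class name `LRUDictionary`, the method names, and the capacity parameter are all hypothetical and are not taken from the paper.

```python
from collections import OrderedDict


class LRUDictionary:
    """Bounded word dictionary with least-recently-used eviction.

    Hypothetical illustration only; the paper's actual update rule
    is not described in the abstract.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._words = OrderedDict()  # oldest entry first, newest last

    def __contains__(self, word):
        return word in self._words

    def touch(self, word):
        """Record a use of `word`, adding it if it is not yet known."""
        if word in self._words:
            # Known word: mark it as the most recently used.
            self._words.move_to_end(word)
        else:
            # Newly identified word: add it, then evict the least
            # recently used entry if the dictionary is over capacity.
            self._words[word] = True
            if len(self._words) > self.capacity:
                self._words.popitem(last=False)

    def words(self):
        """Return the words from least to most recently used."""
        return list(self._words)
```

In this sketch, a segmenter would call `touch` for every word it emits, so that words seen often stay in the dictionary while candidates that never recur are eventually evicted.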
Info:
Periodical:
Pages:
612-616
Online since:
October 2011
Keywords:
Copyright:
© 2012 Trans Tech Publications Ltd. All Rights Reserved