Improving Statistical Parser by Recognition of Chinese Number and Quantifier Prefix in Machine Translation

Abstract:

This paper treats the Chinese number and quantifier prefix (CNQP) as a special linguistic phenomenon in machine translation and presents a rule-based CNQP recognition method that is independent of word segmentation. The method expresses the composition of CNQPs in Backus-Naur Form (BNF), taking the numeral as the active information and the quantifier as the boundary of each CNQP. To avoid word segmentation noise, a forward maximum matching method is used to obtain the composition of each CNQP, which can then be fed into a statistical parser for the analysis of Chinese sentences. Experimental results on manually constructed data indicate that the proposed method, used as a pre-processing module, can effectively improve the output of the statistical parser without retraining, which can in turn improve translation quality.
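The recognition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the BNF rules or the lexicons, so the `NUMERALS` and `QUANTIFIERS` sets and the function names below are hypothetical. The sketch scans raw characters (no word segmenter), treats a run of numeral characters as the start of a candidate CNQP, and uses forward maximum matching to find the longest quantifier that closes it.

```python
# Hypothetical toy lexicons; the paper's actual rule set is not given in the abstract.
NUMERALS = set("零一二三四五六七八九十百千万亿两半几") | set("0123456789")
QUANTIFIERS = {"个", "本", "张", "只", "斤", "公斤", "公里", "年", "次"}
MAX_QUANTIFIER_LEN = max(len(q) for q in QUANTIFIERS)


def match_quantifier(text, start):
    """Forward maximum matching: longest quantifier beginning at `start`, or None."""
    longest = min(MAX_QUANTIFIER_LEN, len(text) - start)
    for length in range(longest, 0, -1):  # try the longest candidate first
        candidate = text[start:start + length]
        if candidate in QUANTIFIERS:
            return candidate
    return None


def find_cnqps(text):
    """Return (start, end, numeral, quantifier) for each numeral run closed by a quantifier."""
    spans = []
    i = 0
    while i < len(text):
        if text[i] not in NUMERALS:
            i += 1
            continue
        # The numeral run opens a candidate CNQP.
        j = i
        while j < len(text) and text[j] in NUMERALS:
            j += 1
        # The quantifier, if present, marks the right boundary.
        quantifier = match_quantifier(text, j)
        if quantifier is not None:
            end = j + len(quantifier)
            spans.append((i, end, text[i:j], quantifier))
            i = end
        else:
            i = j  # numeral without a quantifier: not a CNQP in this sketch
    return spans


if __name__ == "__main__":
    for span in find_cnqps("我买了三本书和两公斤苹果"):
        print(span)
```

Because matching works directly on characters, segmentation errors cannot split a numeral from its quantifier. In a pipeline like the one the abstract describes, the recognized spans would be handed to the parser as pre-bracketed units; how that hand-off is done is not specified in the source.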

Pages:

1841-1844

Online since:

September 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
