Health Database Oriented Word Alignment for Machine Translation Based on Generalized Intersection

Abstract:

Health database oriented data analysis and processing is very valuable, and word alignment plays an important role in it. Health databases contain many medical terms, and existing word alignment methods perform poorly on them because term dictionaries are lacking. This paper proposes a method of word alignment between Chinese and Japanese for health databases. The method is based on generalized intersection over the set form of a sentence-level aligned bilingual corpus. We use the GI (generalized intersection) model to align words. The GI model includes an algorithm based on generalized intersection operations on word sets and uses a special stop-word set to further improve recall. Experimental results indicate that the GI model performs well on health databases with large numbers of medical terms, as well as on language pairs with few linguistic resources, such as Chinese and Japanese.
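The abstract does not spell out the GI algorithm, so the following is only a minimal sketch of the general idea of intersecting word sets across sentence-aligned pairs. It is not the paper's method. The function name `align_by_intersection`, the toy corpus, and the stop-word list are all hypothetical, and plain set intersection is assumed in place of the paper's generalized intersection.

```python
def align_by_intersection(pairs, target_stop_words=frozenset()):
    """Map each source word to the target words shared by every
    sentence pair in which that source word occurs.

    pairs: iterable of (source_words, target_words), each a set of tokens.
    target_stop_words: target-side tokens to ignore (assumed list).
    """
    candidates = {}
    for src_words, tgt_words in pairs:
        # Drop target stop words before intersecting.
        content = set(tgt_words) - target_stop_words
        for word in set(src_words):
            if word in candidates:
                # Keep only target words seen in all pairs so far.
                candidates[word] &= content
            else:
                candidates[word] = set(content)
    return candidates


# Toy Chinese-Japanese sentence pairs (hypothetical, already segmented).
pairs = [
    ({"患者", "有", "高血压"}, {"患者", "は", "高血圧", "です"}),
    ({"高血压", "的", "治疗"}, {"高血圧", "の", "治療"}),
]
stop = frozenset({"は", "の", "です"})

result = align_by_intersection(pairs, stop)
print(result["高血压"])
```

In this toy corpus the source word 高血压 occurs in both pairs, so its candidate set shrinks to the single shared target word. Words seen in only one pair keep several candidates, which shows why a larger corpus is needed before intersection alone can disambiguate.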

Info:

Pages:

3368-3371

Online since:

August 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
