A Transformation Approach between the Similar Language Text

Abstract:

This paper first investigates the degree of similarity between closely related languages of the same family (such as languages of the Altaic family) and then examines transformation between their entries and texts. A cosine similarity measure and a dynamic programming (DP) algorithm are used to compute the similarity and the transformation between the source and target languages, using a multilingual parallel data set and a function word dictionary. The test data set comprises 7,854 parallel sentences in Chinese, Uyghur, Kazakh, and Mongolian in various writing systems. Experimental results show that similarity between languages of the same branch is higher than similarity between languages of different branches. A transformation test focused on the Mongolian language branch showed an accuracy of 86.7% for NM to TM and 91.1% for NM to TODO.
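The abstract names the two techniques but not their inputs, so the following is only a minimal sketch under stated assumptions. It treats each sentence as a bag of character bigrams for the cosine measure, and uses a longest common subsequence (LCS) table as one common form of DP string comparison. The feature choice, the function names (`char_ngrams`, `cosine_similarity`, `lcs_length`), and the use of LCS are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Count overlapping character n-grams (an assumed feature choice)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    # Counter returns 0 for missing keys, so unshared n-grams add nothing.
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction; report no similarity
    return dot / (norm_a * norm_b)


def lcs_length(s, t):
    """Length of the longest common subsequence, computed row by row."""
    prev = [0] * (len(t) + 1)
    for ch_s in s:
        curr = [0]
        for j, ch_t in enumerate(t, start=1):
            if ch_s == ch_t:
                curr.append(prev[j - 1] + 1)  # extend the diagonal match
            else:
                curr.append(max(prev[j], curr[j - 1]))  # best of skip-s, skip-t
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Toy Latin placeholder strings, not data from the paper.
    source, target = "ABCBDAB", "BDCABA"
    print(cosine_similarity(char_ngrams(source), char_ngrams(target)))
    print(lcs_length(source, target))
```

In a setup like this, the cosine score gives an order-insensitive measure of overlap between two texts, while the DP table respects character order and could be extended to recover an alignment from which substitution rules are read off. Whether the paper combines the two in this way is not stated in the abstract.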

Info:

Periodical:

Advanced Materials Research (Volumes 791-793)

Pages:

1716-1720

Online since:

September 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
