Discriminative Training for Native Chinese Speakers' Pronunciation Proficiency Evaluation

Abstract:

Most computer-assisted language learning (CALL) systems use acoustic models trained by maximum likelihood estimation (MLE) for pronunciation proficiency evaluation. However, MLE ignores information about the other phones during training and therefore cannot distinguish confusable phones well. This paper introduces the discriminative criteria of minimum phone error and minimum word error to refine the acoustic models and address this problem. A detailed analysis of the discriminatively trained acoustic models on the Putonghua proficiency test shows that: 1) they are much more distinguishable than the MLE-trained models; 2) even though the training and test conditions are mismatched, they still perform significantly better than the MLE-trained models under the same phone boundaries. The final system achieves a relative improvement of approximately 4.5%.
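For orientation, the contrast between the two training criteria can be written out as follows. This is a sketch of the standard textbook forms of the MLE and minimum phone error objectives, not a formula taken from the paper itself. All symbols are assumed notation: O_r is the r-th training utterance, s_r its reference transcription, λ the acoustic model parameters, κ an acoustic scaling factor, P(s) a language-model prior, and A(s, s_r) the phone accuracy of hypothesis s measured against the reference.

```latex
% Assumed notation, not taken from the abstract:
%   O_r        r-th training utterance (R utterances in total)
%   s_r        reference transcription of O_r
%   \lambda    acoustic model parameters
%   \kappa     acoustic scaling factor
%   P(s)       prior probability of hypothesis s
%   A(s, s_r)  phone accuracy of hypothesis s against the reference s_r
\begin{align}
F_{\mathrm{MLE}}(\lambda) &= \sum_{r=1}^{R} \log p_{\lambda}(O_r \mid s_r) \\
F_{\mathrm{MPE}}(\lambda) &= \sum_{r=1}^{R}
  \frac{\sum_{s} p_{\lambda}(O_r \mid s)^{\kappa} \, P(s) \, A(s, s_r)}
       {\sum_{u} p_{\lambda}(O_r \mid u)^{\kappa} \, P(u)}
\end{align}
```

The MLE objective involves only the reference transcription, which is why competing phones play no role in training. The minimum phone error objective sums over competing hypotheses and weights each by its accuracy, so the model is rewarded for separating the correct phones from confusable ones. The minimum word error variant has the same form with accuracy measured at the word level.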

Pages: 521-528

Online since: August 2012

Copyright: © 2012 Trans Tech Publications Ltd. All Rights Reserved
