An Approach to Tone Recognition of Mandarin Speech Based-On Two-Stage Model

Abstract:

An inspection of the pitch contours of tone 1 in Mandarin speech shows that they consist of upward and downward line segments, although the contour of tone 1 is conventionally assumed to be flat. Our study also found that tone 1 tends to be misrecognized as one of the other three tones when the recognition algorithm relies on the slope or shape of the tone contour. Based on our experiments, we conclude that the tone recognition rate improves when a two-stage recognition scheme is used: tone 1 is identified in the first stage, and the other three tones are identified in the second. The fundamental frequencies of tone 1 speech are first extracted from the training data, and a threshold related to the standard deviation of those fundamental frequencies is determined. In the first stage, if the standard deviation of the fundamental frequencies of the input speech is below this threshold, the speech is recognized as tone 1. Input speech that is not classified as tone 1 is passed to the second stage, where a so-called linear gradient analysis is conducted and the tones are identified according to whether the derived linear gradients are positive or negative. The proposed method is superior to traditional methods of Mandarin tone recognition in terms of effectiveness and recognition rate. Experiments demonstrating the necessity of the two recognition stages are described in detail.
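The two-stage scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions, because the abstract does not specify them: the threshold rule in `fit_tone1_threshold` (mean plus `k` times the spread of the per-syllable standard deviations), the split of the contour into two halves, and the mapping of gradient signs to tones 2, 3, and 4 (rising, falling then rising, and falling, following the usual textbook description of Mandarin tones). All function names are hypothetical, and the input is assumed to be a list of voiced-frame F0 values in Hz for one syllable.

```python
from statistics import mean, pstdev


def fit_tone1_threshold(tone1_contours, k=2.0):
    """Derive a standard-deviation threshold from tone 1 training contours.

    Assumed rule (not from the paper): mean of the per-syllable F0
    standard deviations plus k times their spread.
    """
    spreads = [pstdev(contour) for contour in tone1_contours]
    return mean(spreads) + k * pstdev(spreads)


def linear_gradient(f0):
    """Least-squares slope of an F0 contour, in Hz per frame."""
    n = len(f0)
    t_mean = (n - 1) / 2
    f_mean = mean(f0)
    num = sum((t - t_mean) * (f - f_mean) for t, f in enumerate(f0))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def classify_tone(f0, threshold):
    """Two-stage decision: flatness test first, then gradient signs."""
    if len(f0) < 3:
        raise ValueError("need at least three F0 frames")

    # Stage 1: a small F0 standard deviation is taken to mean tone 1.
    if pstdev(f0) < threshold:
        return 1

    # Stage 2 (assumed mapping): inspect the sign of the fitted gradients.
    mid = len(f0) // 2
    first_half = linear_gradient(f0[: mid + 1])
    second_half = linear_gradient(f0[mid:])
    if first_half < 0 and second_half > 0:
        return 3  # falling then rising
    return 2 if linear_gradient(f0) > 0 else 4  # rising vs. falling
```

In practice the threshold would come from `fit_tone1_threshold` applied to tone 1 training syllables, and each unknown syllable would then be passed to `classify_tone` together with that threshold. Testing the standard deviation before fitting any gradients reflects the abstract's observation that tone 1 contours contain small upward and downward segments that slope-based rules can misread.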

Pages:

297-303

Online since:

December 2011

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
