An Approach to Tone Recognition of Mandarin Speech Based-On Two-Stage Model

Wei Chih Hsu; Jung Nan Sun; Huai I Wang

doi:10.4028/www.scientific.net/AMM.145.297

Paper Titles

Adaptive Return Prediction Block Matching Algorithms for Video Coding
p.277

A New Control Scheme for Sealing System in Continuous Motion Form/Fill/Seal Packaging Machines
p.282

Applications of Genetic Algorithm and Text Mining on Technology Innovation
p.287

A Hybrid Algorithm of Mining Closed Itemsets for Large Databases
p.292

An Approach to Tone Recognition of Mandarin Speech Based-On Two-Stage Model
p.297

Evaluation of Light Irradiation on Decolorization of Azo Dyes by Tsukamurella sp. J8025
p.304

Predicting Failure Behavior of Reinforced Concrete Columns Subjected to Cyclic Loading
p.309

A Prediction Model for Phytoplankton Abundance Based on Relevance Vector Machine
p.314

Evaluation on Sectional Warping Constants of Equal Leg Angle with Lips
p.320

Abstract:

Inspecting the pitch contours of tone 1 in Mandarin speech, we found that the contour consists of upward and downward line segments, although tone 1 is conventionally assumed to be flat. Our study also found that tone 1 tends to be misrecognized as one of the other three tones when the recognition algorithm relies on the slope or shape of the tone contour. Based on our experiments, we conclude that the tone recognition rate improves when a two-stage recognition scheme is used: tone 1 is recognized in the first stage, and the other three tones are identified in the second stage. The fundamental frequencies of tone 1 Mandarin speech are first extracted from the training data, and a threshold on the standard deviation of the fundamental frequencies is then determined. In the first recognition stage, if the standard deviation of the fundamental frequencies of the input speech is less than this threshold, the speech is recognized as tone 1. Input speech that is not classified as tone 1 becomes the recognition target of the second stage. In the second stage, a so-called linear gradient analysis is conducted, and the tones are identified according to the derived positive or negative linear gradients. The proposed method is superior to traditional methods of Mandarin tone recognition in terms of effectiveness and recognition rate. Experiments demonstrating the necessity of the two recognition stages are described in detail.
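
The two-stage scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the threshold is derived or how the linear gradient analysis maps gradients to tones. The sketch assumes that the threshold is the mean of the tone 1 training standard deviations plus `k` times their spread, and that the second stage fits least-squares lines to the two halves of the contour (falling then rising is read as tone 3, otherwise an overall rising contour is tone 2 and a falling one is tone 4). The names `fit_tone1_threshold`, `classify_tone`, `slope` and the parameter `k` are hypothetical.

```python
from statistics import mean, pstdev


def slope(values):
    """Least-squares slope of values plotted against their frame index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def fit_tone1_threshold(tone1_contours, k=2.0):
    """Assumed rule: mean of the tone 1 F0 standard deviations plus k spreads."""
    stds = [pstdev(contour) for contour in tone1_contours]
    return mean(stds) + k * pstdev(stds)


def classify_tone(f0, threshold):
    """Return 1, 2, 3 or 4 for an F0 contour given in Hz, one value per frame."""
    if len(f0) < 3:
        raise ValueError("need at least three F0 values")
    # Stage 1: a small F0 standard deviation is taken as tone 1.
    if pstdev(f0) < threshold:
        return 1
    # Stage 2 (assumed mapping): signs of the fitted line gradients.
    mid = len(f0) // 2
    first, second = slope(f0[: mid + 1]), slope(f0[mid:])
    if first < 0 and second > 0:
        return 3  # falling then rising
    return 2 if slope(f0) > 0 else 4  # rising vs. falling


# Made-up contours for illustration only; they are not data from the paper.
threshold = fit_tone1_threshold([
    [220, 222, 219, 221, 220, 223],
    [200, 204, 198, 202, 203, 199],
])
results = [
    classify_tone([210, 211, 209, 210, 212, 210], threshold),
    classify_tone([180, 190, 200, 210, 220, 230], threshold),
    classify_tone([200, 185, 170, 165, 180, 200], threshold),
    classify_tone([240, 225, 210, 195, 180, 165], threshold),
]
```

Testing the standard deviation first means a tone 1 contour made of small upward and downward segments is accepted before any slope-based decision can misread it, which is the motivation the abstract gives for the two stages.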

Info:

Periodical:

Applied Mechanics and Materials (Volume 145)

Pages:

297-303

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.145.297

Online since:

December 2011

Authors:

Wei Chih Hsu, Jung Nan Sun, Huai I Wang

Keywords:

Fundamental Frequency Standard Deviation, Linear Gradient, Mandarin Tone Recognition
