A Mathematical Morphological Processing of Spectrograms for the Tone of Chinese Vowels Recognition

Abstract:

Tone is one of the features that distinguish Chinese from other languages. Because tone changes in Chinese are carried mainly by the vowels, the tonal variation of vowels is important in speech recognition research. Conventional tone recognition methods are based on the fundamental frequency of the signal, which cannot preserve the integrity of the tone signal. We propose a mathematical morphological processing of spectrograms for the tones of Chinese vowels. First, recorded tone signals are preprocessed with Cool Edit Pro software and converted into spectrograms. Second, the spectrograms are smoothed and normalized by mathematical morphological processing. Finally, direction-angle statistics over the whole tone signal are obtained by skeletonization. Neural network simulation shows that the speech emotion recognition rate can reach 92.50%.
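The three stages named in the abstract (morphological smoothing, skeletonization, direction-angle statistics) can be illustrated with a minimal sketch. Everything specific here is an assumption, not the authors' implementation: the spectrogram is taken to be already normalized and binarized, the structuring element is a 3x3 square, thinning uses the Zhang-Suen algorithm, and the helper names (`smooth`, `thin`, `direction_histogram`, `demo_image`) are hypothetical. Rows stand for frequency (row 0 is highest) and columns for time frames.

```python
# 3x3 square structuring element; an assumed choice, the abstract names none.
SQUARE = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def _apply(img, se, combine):
    """Apply `combine` (all = erosion, any = dilation) to each neighbourhood."""
    rows, cols = len(img), len(img[0])
    def at(r, c):  # pixels outside the image count as background
        return img[r][c] if 0 <= r < rows and 0 <= c < cols else 0
    return [[int(combine(at(r + dr, c + dc) for dr, dc in se))
             for c in range(cols)] for r in range(rows)]

def erode(img, se=SQUARE):
    return _apply(img, se, all)

def dilate(img, se=SQUARE):
    return _apply(img, se, any)

def smooth(img):
    """Opening removes small specks; closing then fills small gaps."""
    opened = dilate(erode(img))
    return erode(dilate(opened))

def thin(img):
    """Zhang-Suen thinning; expects a background border of at least one pixel."""
    img = [row[:] for row in img]  # work on a copy
    rows, cols = len(img), len(img[0])
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if not img[r][c]:
                        continue
                    # Neighbours clockwise from north: N, NE, E, SE, S, SW, W, NW
                    p = [img[r-1][c], img[r-1][c+1], img[r][c+1], img[r+1][c+1],
                         img[r+1][c], img[r+1][c-1], img[r][c-1], img[r-1][c-1]]
                    b = sum(p)  # number of foreground neighbours
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if step == 0:
                        ok = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                    else:
                        ok = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img

def direction_histogram(skel):
    """Count skeleton steps to the next time frame by angle in degrees."""
    rows, cols = len(skel), len(skel[0])
    hist = {45: 0, 0: 0, -45: 0}
    for r in range(rows):
        for c in range(cols - 1):
            if not skel[r][c]:
                continue
            # A step to a lower row index is a rise in frequency (+45 degrees).
            for dr, angle in ((-1, 45), (0, 0), (1, -45)):
                if 0 <= r + dr < rows and skel[r + dr][c + 1]:
                    hist[angle] += 1
    return hist

def demo_image():
    """Toy binary 'spectrogram': a flat band plus one isolated noise pixel."""
    img = [[0] * 12 for _ in range(7)]
    for r in range(2, 5):
        for c in range(2, 10):
            img[r][c] = 1
    img[5][11] = 1  # noise speck
    return img

if __name__ == "__main__":
    skeleton = thin(smooth(demo_image()))
    print(direction_histogram(skeleton))
```

In this sketch a level tone would concentrate its counts at 0 degrees, while rising or falling contours would shift them toward +45 or -45 degrees. A histogram of this kind could then serve as the input vector to a classifier such as the neural network mentioned in the abstract; the paper's actual angle statistic may be defined differently.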

Pages: 665-671

Online since: June 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
