HMM-Based Speaker Emotional Recognition Technology for Speech Signal

Abstract:

In emotion classification of speech signals, the features commonly employed are statistics of the fundamental frequency, the energy contour, the duration of silence, and voice quality. However, the performance of systems using these features degrades substantially when more than two categories of emotion must be classified. In this paper, a text-independent method for emotion classification of speech is proposed. The proposed method uses short-time log frequency power coefficients (LFPC) to represent the speech signal and a discrete Hidden Markov Model (HMM) as the classifier. The category labels are the archetypal emotions of anger, joy, sadness, and neutral. Results show that the proposed system yields an average accuracy of 82.55% and a best accuracy of 94.4% in classifying the four emotions. Results also reveal that LFPC is a better choice of feature parameters for emotion classification than the traditional features.
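The pipeline described in the abstract can be sketched in two parts: extracting short-time log frequency power coefficients from framed speech, and scoring a discrete observation sequence with an HMM's forward algorithm (classification then picks the emotion whose model scores highest). The sketch below is a minimal illustration, not the paper's implementation; the frame size, hop, number of log-spaced bands, and frequency range are all illustrative assumptions.

```python
import numpy as np

def lfpc(signal, sr=16000, frame_len=400, hop=160, n_bands=12, n_fft=512):
    """Short-time log frequency power coefficients (LFPC), sketched.

    Frames the signal, computes a power spectrum per frame, pools the
    power inside log-spaced frequency bands, and returns the log band
    energies. Band edges (100 Hz to Nyquist) and counts are assumed,
    not taken from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    edges = np.geomspace(100.0, sr / 2, n_bands + 1)        # log-spaced edges
    bins = np.clip((edges / (sr / 2) * (n_fft // 2)).astype(int), 1, n_fft // 2)
    feats = np.empty((n_frames, n_bands))
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame, n_fft)) ** 2
        for b in range(n_bands):
            lo, hi = bins[b], max(bins[b] + 1, bins[b + 1])
            feats[t, b] = np.log(power[lo:hi].mean() + 1e-10)
    return feats

def hmm_log_likelihood(obs, log_pi, log_A, log_B):
    """Forward-algorithm log-likelihood of a discrete symbol sequence.

    log_pi: initial state log-probabilities, log_A[i, j]: transition
    log-probabilities, log_B[i, k]: emission log-probabilities of
    symbol k in state i. For a discrete HMM the LFPC vectors would
    first be vector-quantized into such symbols.
    """
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)
```

In a classifier of this kind, one HMM is trained per emotion category, and an utterance is assigned to the category whose model gives the highest `hmm_log_likelihood` for its quantized LFPC sequence.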

Info:

Periodical:

Advanced Materials Research (Volumes 230-232)

Pages:

261-265

Online since:

May 2011

Copyright:

© 2011 Trans Tech Publications Ltd. All Rights Reserved
