HMM-Based Speaker Emotional Recognition Technology for Speech Signal
In emotion classification of speech signals, the commonly used features are statistics of fundamental frequency, energy contour, duration of silence, and voice quality. However, the performance of systems employing these features degrades substantially when more than two categories of emotion are to be classified. In this paper, a text-independent method of emotion classification of speech is proposed. The proposed method uses short-time log frequency power coefficients (LFPC) to represent the speech signals and a discrete Hidden Markov Model (HMM) as the classifier. The category labels are the archetypal emotions of anger, joy, sadness, and neutral. Results show that the proposed system yields an average accuracy of 82.55% and a best accuracy of 94.4% in the classification of the 4 emotions. Results also reveal that LFPC is a better choice of feature parameters for emotion classification than the traditional feature parameters.
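The pipeline described above (short-time LFPC features, vector quantization, and one discrete HMM per emotion) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the frame length, hop size, number of bands, lower band edge, and the function names `lfpc`, `quantize`, `log_likelihood`, and `classify` are all assumptions chosen for the example, and training (codebook design and Baum-Welch re-estimation) is omitted.

```python
import numpy as np

def lfpc(signal, sample_rate, frame_len=256, hop=128, n_bands=12, f_min=100.0):
    """Log power in log-spaced frequency bands, one row per frame.
    All default parameter values are illustrative assumptions."""
    edges = np.geomspace(f_min, sample_rate / 2.0, n_bands + 1)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, n_bands))
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for b in range(n_bands):
            if b == n_bands - 1:  # last band includes its upper edge
                mask = (freqs >= edges[b]) & (freqs <= edges[b + 1])
            else:
                mask = (freqs >= edges[b]) & (freqs < edges[b + 1])
            # Small constant avoids log(0) for empty or silent bands.
            feats[t, b] = 10.0 * np.log10(power[mask].sum() + 1e-12)
    return feats

def quantize(feats, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    dists = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM.
    start: (N,), trans: (N, N), emit: (N, n_symbols)."""
    alpha = start * emit[:, obs[0]]
    scale = alpha.sum()
    log_p = np.log(scale)
    alpha = alpha / scale
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
        scale = alpha.sum()
        log_p += np.log(scale)
        alpha = alpha / scale
    return log_p

def classify(obs, models):
    """Pick the emotion whose HMM gives the highest log-likelihood.
    models maps a label to a (start, trans, emit) tuple."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))
```

In use, an utterance would be passed through `lfpc`, converted to a symbol sequence with `quantize` against a trained codebook, and then scored by `classify` against four trained models labeled anger, joy, sadness, and neutral.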
Ran Chen and Wenli Yao
Y. Q. Qin and X. Y. Zhang, "HMM-Based Speaker Emotional Recognition Technology for Speech Signal", Advanced Materials Research, Vols. 230-232, pp. 261-265, 2011