Analysis of Sad Speech and Neutral-Sad Speech Conversion

Abstract:

This paper proposes a parameter mapping model for converting neutral speech to sad speech, built by comparing statistical parameters between neutral and sad speech sample pairs with the same text content. The comparison shows that sad speech generally has a lower fundamental frequency than neutral speech and a more stable F0 contour, while its formants are slightly higher. In terms of rhythm, sad speech is slightly slower than neutral speech, and the difference between voiced and voiceless segments is significant: voiceless segments are markedly longer in sad speech. Neutral-to-sad speech conversion was realized with this model and achieved good results.
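To make the parameter mapping concrete, the sketch below applies the four reported tendencies (lower F0, flatter F0 contour, slightly raised formants, slower rate) to a neutral utterance. It is a minimal illustration, not the paper's implementation: the WORLD vocoder (pyworld), the soundfile I/O, the file names neutral.wav/sad.wav, and all four scaling factors are assumptions, and the uniform time stretch ignores the paper's finding that voiceless segments lengthen more than voiced ones.

```python
# Minimal neutral-to-sad parameter mapping sketch using the WORLD vocoder.
# All four factors below are illustrative assumptions, not values from the paper.
import numpy as np
import pyworld as pw
import soundfile as sf

F0_SCALE = 0.90       # assumed: lower overall F0 in sad speech
F0_FLATTEN = 0.70     # assumed: compress F0 excursions (more stable contour)
FORMANT_SHIFT = 1.03  # assumed: slightly higher formants
TIME_STRETCH = 1.10   # assumed: slightly slower speaking rate

def neutral_to_sad(x, fs):
    x = np.asarray(x, dtype=np.float64)
    if x.ndim > 1:                       # pyworld expects a mono signal
        x = x.mean(axis=1)
    f0, t = pw.harvest(x, fs)            # F0 contour
    sp = pw.cheaptrick(x, f0, t, fs)     # spectral envelope
    ap = pw.d4c(x, f0, t, fs)            # aperiodicity

    # 1. Lower F0 and flatten the contour around its voiced-frame mean.
    voiced = f0 > 0
    mean_f0 = f0[voiced].mean()
    f0_sad = f0.copy()
    f0_sad[voiced] = (mean_f0 + F0_FLATTEN * (f0[voiced] - mean_f0)) * F0_SCALE

    # 2. Raise formants by stretching the spectral envelope along frequency:
    #    output bin k reads the envelope at bin k / FORMANT_SHIFT.
    bins = sp.shape[1]
    src = np.minimum(np.arange(bins) / FORMANT_SHIFT, bins - 1)
    sp_sad = np.array([np.interp(src, np.arange(bins), frame) for frame in sp])

    # 3. Slow the speech uniformly by resampling all frame sequences in time.
    #    (The paper stretches voiceless segments more; this sketch does not.)
    idx = np.round(np.linspace(0, len(f0_sad) - 1,
                               int(len(f0_sad) * TIME_STRETCH))).astype(int)
    return pw.synthesize(f0_sad[idx], sp_sad[idx], ap[idx], fs)

x, fs = sf.read('neutral.wav')           # assumed input file
sf.write('sad.wav', neutral_to_sad(x, fs), fs)
```

Working on WORLD's frame-level parameters keeps the three spectral/prosodic modifications independent of one another, which mirrors the parameter-by-parameter mapping the abstract describes.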

Pages:

1605-1609

Online since:

January 2015

Copyright:

© 2015 Trans Tech Publications Ltd. All Rights Reserved

