Emotional Analysis and Synthesis of Human Voice Based on STRAIGHT


Abstract:

Speech synthesis is a hot research topic in artificial intelligence today, and an urgent difficulty to overcome is how to give machines more "emotional intelligence" for human-computer interaction. Using the STRAIGHT algorithm, this paper extracts the acoustic feature parameters of speech signals, performs statistical analysis on them, and modifies the characteristic parameters of neutral utterances to synthesize emotional speech, including happy, angry, and frustrated styles. A single frame of the spectrum of the synthesized emotional speech is then analyzed under three conditions: standard (clean) voices, voices with added noise, and de-noised voices. The experimental results show that the method is feasible and that the emotional speech synthesized from de-noised voices is better than that synthesized from voices with added noise.
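The parameter-modification step described above — shifting the statistics of a neutral utterance's acoustic features toward an emotional target — can be sketched in a few lines. The following is a minimal illustration using only NumPy, not the paper's actual method or the STRAIGHT API; the emotion scale factors are hypothetical values chosen for demonstration.

```python
import numpy as np

def modify_f0_for_emotion(f0, mean_scale, range_scale):
    """Shift the mean and stretch the range of a frame-wise F0 contour (Hz).

    Unvoiced frames (f0 == 0) are left untouched, as is conventional
    for frame-based pitch contours.
    """
    voiced = f0 > 0
    out = f0.copy()
    mean_f0 = f0[voiced].mean()
    # Expand/compress the contour around its mean, then move the mean.
    out[voiced] = (f0[voiced] - mean_f0) * range_scale + mean_f0 * mean_scale
    return out

# Hypothetical emotion settings (illustrative only, not from the paper):
# happy and angry typically raise mean pitch and widen its range,
# while frustration tends to flatten the contour.
EMOTIONS = {
    "happy":       {"mean_scale": 1.15, "range_scale": 1.3},
    "angry":       {"mean_scale": 1.10, "range_scale": 1.5},
    "frustration": {"mean_scale": 0.95, "range_scale": 0.8},
}

f0_neutral = np.array([0.0, 120.0, 125.0, 130.0, 128.0, 0.0])
f0_happy = modify_f0_for_emotion(f0_neutral, **EMOTIONS["happy"])
```

In a full STRAIGHT-based pipeline, the modified F0 contour (together with adjusted spectral envelope and duration parameters) would be passed back to the synthesis stage to produce the emotional waveform.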


Pages: 105-110

Online since: April 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved

