Emotional Analysis and Synthesis of Human Voice Based on STRAIGHT


Abstract:

Speech synthesis is a hot research topic in artificial intelligence today, and an urgent difficulty to overcome is how to give machines more "emotional intelligence" for human-computer interaction. Using the STRAIGHT algorithm, this paper extracts the acoustic feature parameters of speech signals, performs statistical analysis on them, and modifies the characteristic parameters of neutral utterances to synthesize emotional speech, including happy, angry, and frustrated styles. A single frame of the spectrum of the synthesized emotional speech is then analyzed under three conditions: standard (clean) voices, voices with added noise, and de-noised voices. The experimental results show that the method is feasible and that the emotional speech synthesized from de-noised voices is better than that synthesized from voices with added noise.
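The parameter-modification step described above — shifting the statistics of a neutral utterance's acoustic features toward an emotional target — can be sketched in a few lines. The following is a minimal illustration using only NumPy, not the paper's actual method or the STRAIGHT API; the emotion scale factors are hypothetical values chosen for demonstration.

```python
import numpy as np

def modify_f0_for_emotion(f0, mean_scale, range_scale):
    """Shift the mean and stretch the range of a frame-wise F0 contour (Hz).

    Unvoiced frames (f0 == 0) are left untouched, as is conventional
    for frame-based pitch contours.
    """
    voiced = f0 > 0
    out = f0.copy()
    mean_f0 = f0[voiced].mean()
    # Expand/compress the contour around its mean, then move the mean.
    out[voiced] = (f0[voiced] - mean_f0) * range_scale + mean_f0 * mean_scale
    return out

# Hypothetical emotion settings (illustrative only, not from the paper):
# happy and angry typically raise mean pitch and widen its range,
# while frustration tends to flatten the contour.
EMOTIONS = {
    "happy":       {"mean_scale": 1.15, "range_scale": 1.3},
    "angry":       {"mean_scale": 1.10, "range_scale": 1.5},
    "frustration": {"mean_scale": 0.95, "range_scale": 0.8},
}

f0_neutral = np.array([0.0, 120.0, 125.0, 130.0, 128.0, 0.0])
f0_happy = modify_f0_for_emotion(f0_neutral, **EMOTIONS["happy"])
```

In a full STRAIGHT-based pipeline, the modified F0 contour (together with adjusted spectral envelope and duration parameters) would be passed back to the synthesis stage to produce the emotional waveform.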


Pages: 105-110

Online since: April 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved

