Formant Speech Synthesis Based on Trainable Model

Abstract:

The authors proposed a trainable formant synthesis method based on a multi-channel Hidden Trajectory Model (HTM). In this method, the phonetic targets, formant trajectories, and spectral states from the oral, nasal, voiceless, and background channels were designed to form hierarchical hidden layers, from which spectra were generated as observable features. In model training, the phonemic targets were learned from one hour of training speech, and the phoneme boundaries were also aligned. The experimental results showed that speech could be reconstructed from the trained formant model with a source-filter synthesizer.
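The source-filter synthesis step mentioned above can be illustrated with a minimal sketch: a crude impulse-train voicing source passed through a cascade of second-order formant resonators (the classic Klatt-style formulation). This is not the authors' trainable HTM; the formant and bandwidth values below are hypothetical, roughly /a/-like numbers chosen only for illustration.

```python
import numpy as np

def resonator_coeffs(freq, bw, fs):
    """Klatt-style second-order resonator: y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    C = -np.exp(-2 * np.pi * bw / fs)
    B = 2 * np.exp(-np.pi * bw / fs) * np.cos(2 * np.pi * freq / fs)
    A = 1 - B - C  # normalize for unity gain at DC
    return A, B, C

def apply_resonator(x, freq, bw, fs):
    """Filter x through one resonator centered at freq with bandwidth bw."""
    A, B, C = resonator_coeffs(freq, bw, fs)
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = A * x[n]
        if n >= 1:
            y[n] += B * y[n - 1]
        if n >= 2:
            y[n] += C * y[n - 2]
    return y

def synthesize_vowel(formants, bandwidths, f0=120.0, dur=0.3, fs=16000):
    """Excite a cascade of formant resonators with an impulse-train source."""
    n = int(dur * fs)
    src = np.zeros(n)
    src[::int(fs / f0)] = 1.0  # crude glottal pulse train at f0
    out = src
    for f, bw in zip(formants, bandwidths):
        out = apply_resonator(out, f, bw, fs)
    return out / np.max(np.abs(out))  # peak-normalize

# Hypothetical /a/-like formant frequencies (Hz) and bandwidths (Hz)
wave = synthesize_vowel([730, 1090, 2440], [90, 110, 170])
```

In a trainable system such as the one described here, the formant frequencies and bandwidths would not be fixed constants but trajectories generated frame by frame from the learned hidden layers.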

Info:

Pages: 1334-1337

Online since: February 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved

References:

[1] D. H. Klatt, Review of text-to-speech conversion for English, J. Acoust. Soc. Am. 82 (1987) 737-793.

[2] K. Tokuda, T. Masuko, et al., An Algorithm for Speech Parameter Generation from Continuous Mixture HMMs with Dynamic Features, EUROSPEECH'95, Madrid, Spain (1995).

DOI: 10.21437/eurospeech.1995-173

[3] R. E. Donovan, Trainable Speech Synthesis, Ph.D. thesis, Cambridge University (1996).

[4] J. Bridle, et al., An investigation of segmental hidden dynamic models of speech coarticulation for automatic speech recognition, in Final Report for the 1998 Workshop on Language Engineering, Center for Language and Speech Processing (1998).

[5] M. J. Russell and P. J. B. Jackson, A Multiple-level Linear/Linear segmental HMM with a formant-based intermediate layer, Computer Speech and Language, 19 (2005) 205-225.

DOI: 10.1016/j.csl.2004.08.001

[6] L. Deng, D. Yu, and A. Acero, A Bidirectional Target-Filtering Model of Speech Coarticulation and Reduction: Two-Stage Implementation for Phonetic Recognition, IEEE Trans. Audio, Speech and Language Proc. 14 (2006) 256-265.

DOI: 10.1109/tsa.2005.854107
