A Neural Network Model for Phoneme Generation

Abstract:

The paper discusses the possibility of generating phonemes with a recurrent neural network model. Each phoneme contains a typical, or elemental, pattern that repeats with slight fluctuations along the length of the signal. This elemental pattern constitutes the training data for the recurrent neural network. After training, the network can generate three new periods of the elemental pattern. Run in a repetitive loop, the network can generate the entire phoneme signal. The model proved simple and effective, and the generated phonemes gave the impression of a natural sound.
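The train-then-loop procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state the network architecture, the training algorithm, or the data, so everything here is an assumption: the sketch substitutes a small echo state network (a recurrent network whose linear readout is fit by ridge regression), trains it on a synthetic periodic waveform rather than recorded speech, and uses hypothetical names (`make_elemental_pattern`, `TinyESN`, `generate_phoneme`). Only the figure of three periods per generation call comes from the abstract.

```python
import numpy as np


def make_elemental_pattern(period=50, n_periods=20, seed=0):
    """Synthetic, hypothetical stand-in for a recorded elemental pattern:
    one waveform period repeated with slight random fluctuations."""
    rng = np.random.default_rng(seed)
    t = np.arange(period * n_periods)
    base = np.sin(2 * np.pi * t / period) + 0.5 * np.sin(4 * np.pi * t / period)
    return base + 0.01 * rng.standard_normal(t.size)


class TinyESN:
    """Small echo state network (an assumption, not the paper's model)."""

    def __init__(self, n_reservoir=100, spectral_radius=0.9, ridge=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        w = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
        # Rescale the fixed recurrent weights to the requested spectral radius.
        w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
        self.w = w
        self.w_in = rng.uniform(-0.5, 0.5, n_reservoir)
        self.ridge = ridge
        self.w_out = np.zeros(n_reservoir)
        self.state = np.zeros(n_reservoir)

    def _step(self, u):
        # Recurrent update: the new state depends on the old state and the input.
        self.state = np.tanh(self.w @ self.state + self.w_in * u)
        return self.state

    def fit(self, signal, washout=50):
        """Fit the linear readout to predict the next sample (ridge regression)."""
        self.state = np.zeros_like(self.state)
        states = np.array([self._step(u) for u in signal[:-1]])
        x, y = states[washout:], signal[1:][washout:]
        gram = x.T @ x + self.ridge * np.eye(x.shape[1])
        self.w_out = np.linalg.solve(gram, x.T @ y)
        return self

    def generate(self, n_samples, last_sample):
        """Closed-loop generation: each output becomes the next input."""
        out = np.empty(n_samples)
        u = last_sample
        for i in range(n_samples):
            u = self.w_out @ self._step(u)
            out[i] = u
        return out


def generate_phoneme(esn, seed_sample, period, n_calls, periods_per_call=3):
    """Call the trained network repeatedly, three periods at a time,
    and concatenate the pieces into one longer signal."""
    chunks, u = [], seed_sample
    for _ in range(n_calls):
        chunk = esn.generate(periods_per_call * period, u)
        chunks.append(chunk)
        u = chunk[-1]  # the last generated sample seeds the next call
    return np.concatenate(chunks)


period = 50
pattern = make_elemental_pattern(period=period)
esn = TinyESN().fit(pattern)
phoneme = generate_phoneme(esn, pattern[-1], period, n_calls=4)
print(phoneme.shape)
```

How closely the generated signal tracks the training waveform depends on the reservoir size, spectral radius, and ridge term chosen here, so the output of such a sketch should be compared against the original recording before drawing conclusions.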

Pages: 478-483

Online since: August 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
