Isolated Words Digits Speech Recognition

Abstract:

This paper implements a speech recognition program for isolated digit words, using the Hidden Markov Model (HMM) for speech modeling. The K-means and Baum-Welch algorithms are used for codebook construction and training, and the Viterbi decoding algorithm is used for recognition. The method takes a statistical approach to characterizing speech. Briefly, a speech utterance is fit into a probabilistic framework consisting of state transitions and observable sequences. The goal is to evaluate the probability score of the utterance under a given model and to find the model that gives the highest probability score. Research has shown that the HMM method is superior to conventional template-matching methods, and it has already been applied successfully by overseas companies in commercial speech recognition programs. The implementation consists of an LP cepstrum coefficient function, a training function that creates Hidden Markov Models of specific utterances, and a testing function that tests utterances against the models created by the training function. These functions were written in MATLAB. The recognized word is chosen by the maximum likelihood value. The speech database is TI46, downloaded from the Internet.
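The recognition step described in the abstract (score an utterance against each word model, then pick the model with the highest score) can be sketched as follows. This is a minimal illustration in Python rather than the paper's MATLAB code, and it assumes discrete HMMs whose observations are codebook indices. The function names, the two toy word models, and all probability values are hypothetical and are not taken from the paper.

```python
import math


def _log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_log_score(obs, start, trans, emit):
    """Log-probability of the single best state path for a discrete HMM.

    obs   : list of codebook indices (one per speech frame)
    start : start[s]    = P(first state is s)
    trans : trans[r][s] = P(next state is s | current state is r)
    emit  : emit[s][k]  = P(codebook symbol k | state s)
    """
    n_states = len(start)
    # Initialisation: best log-score of ending in each state after frame 0.
    delta = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n_states)]
    # Recursion: extend the best path into each state, one frame at a time.
    for symbol in obs[1:]:
        delta = [
            max(delta[r] + _log(trans[r][s]) for r in range(n_states))
            + _log(emit[s][symbol])
            for s in range(n_states)
        ]
    # Termination: best score over all possible final states.
    return max(delta)


def recognize(obs, models):
    """Score obs against every word model and return (best_word, scores)."""
    scores = {
        word: viterbi_log_score(obs, start, trans, emit)
        for word, (start, trans, emit) in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical two-state, left-to-right toy models over a 2-symbol codebook.
left_to_right = [[0.7, 0.3], [0.0, 1.0]]
models = {
    "one": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.2, 0.8]]),
    "two": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.8, 0.2]]),
}

obs = [0, 0, 1, 1]  # quantised feature vectors of one utterance (made up)
word, scores = recognize(obs, models)
print(word, scores)
```

Working in log-probabilities avoids numerical underflow on long utterances. In a full system the observation indices would come from quantising LP cepstrum vectors against the K-means codebook, and the model parameters from Baum-Welch training.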

Info:

Periodical:

Advanced Materials Research (Volumes 433-440)

Pages:

4983-4988

Online since:

January 2012

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
