Features Extraction for Lhasa Tibetan Speech Recognition

Abstract:

This paper discusses speech feature extraction. The Mel-frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP) methods are analyzed. Both types of features are extracted in a large-vocabulary continuous speech recognition system for Lhasa Tibetan, and the recognition results are compared.

Pages: 205-208

Online since: June 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
