Large-Vocabulary Continuous Speech Recognition of Lhasa Tibetan

Abstract:

This paper establishes a framework for automatic speech recognition of the Lhasa dialect of Tibetan. The phoneme was chosen as the basic modeling unit, and a phoneme set for the Lhasa dialect, together with its Latin transliteration, was designed. The vocabulary contained 5568 frequently used monosyllables. Triphone hidden Markov models (HMMs) were built and trained with HTK. Under the best configuration, the word error rate (WER) was 21.81%.
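For readers unfamiliar with the metric, the sketch below shows how a word error rate is conventionally computed: the minimum number of substitutions, deletions, and insertions needed to turn the recognized token sequence into the reference, divided by the number of reference tokens. It is an illustrative sketch only, not the scoring procedure used in the paper, which the abstract does not describe. The function name `word_error_rate` and the placeholder tokens are assumptions made for this example; in a syllable-based vocabulary such as the one described, each token would be one monosyllable.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference tokens.

    Tokens are whitespace-separated. This is a generic Levenshtein-based
    definition, assumed here for illustration.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] is the edit distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # aligning i reference tokens with an empty hypothesis
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not real Tibetan transliterations.
    rate = word_error_rate("a b c d", "a x c")
    print(f"WER = {rate:.2%}")
```

Because insertions count as errors, the rate can exceed 100% when the hypothesis is much longer than the reference.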

Info:

Pages: 802-806

Online since: February 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
