Design of Tibetan Continuous Speech Corpus Based on Triphone

Abstract:

The performance of a large-vocabulary continuous speech recognition system depends largely on the quality of its speech corpus, and the choice of which sentences to include is the key problem in corpus design. Taking the Tibetan Amdo dialect spoken in XiaHe as the research object, this paper builds a triphone-based continuous speech corpus. First, we collected a text corpus of 1000 thousand Tibetan sentences and transcribed them into IPA according to the actual pronunciation of the XiaHe dialect. We then summarized the structure of triphone junctures and used a text-processing platform to analyze in detail the combination types and frequencies of the triphones in the corpus. Finally, weighing both the coverage rate and the sparseness of triphones and class-triphones, we designed a corpus extraction algorithm and implemented automatic sentence selection.
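
The abstract does not spell out the extraction algorithm, so the following is only a minimal sketch of one common way to select sentences by triphone coverage: a greedy pass that repeatedly picks the sentence contributing the most not-yet-covered triphone types. The function names (`triphones`, `greedy_select`, `coverage_rate`), the `sil` boundary symbol, and the toy phone strings are all assumptions made for illustration; they are not taken from the paper, and the sketch ignores the sparseness and class-triphone criteria the authors mention.

```python
def triphones(phones):
    """Return the (left, center, right) triphones of a phone sequence.

    The sequence is padded with a hypothetical "sil" symbol so that the
    first and last phones also get a triphone context.
    """
    padded = ["sil"] + list(phones) + ["sil"]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]


def greedy_select(sentences, max_sentences):
    """Greedily pick sentences that add the most uncovered triphone types.

    sentences: dict mapping a sentence id to its phone sequence.
    Returns (ids in the order they were picked, set of covered triphones).
    """
    remaining = {sid: set(triphones(p)) for sid, p in sentences.items()}
    covered, selected = set(), []
    while remaining and len(selected) < max_sentences:
        # max() returns the first maximal item, so iterating over sorted
        # ids makes ties deterministic.
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:
            break  # no remaining sentence adds a new triphone type
        covered |= remaining.pop(best)
        selected.append(best)
    return selected, covered


def coverage_rate(covered, sentences):
    """Fraction of the corpus's distinct triphone types that are covered."""
    all_types = set()
    for phones in sentences.values():
        all_types.update(triphones(phones))
    return len(covered) / len(all_types) if all_types else 0.0


if __name__ == "__main__":
    # Toy phone strings, not real XiaHe transcriptions.
    corpus = {"s1": ["a", "b", "c"], "s2": ["a", "b"], "s3": ["d", "e"]}
    picked, covered = greedy_select(corpus, max_sentences=2)
    print(picked, coverage_rate(covered, corpus))
```

In this toy run the selector prefers `s3` over `s2` as the second sentence, because `s2` shares most of its triphones with the already selected `s1`.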

Pages: 2245-2248

Online since: September 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
