Connected Mandarin Digit Speech Recognition Using Two-Layer Acoustic Universal Structure

Xian Yi Rui; Yi Biao Yu; Ying Jiang

doi:10.4028/www.scientific.net/AMR.846-847.1380

Abstract:

Because Chinese words are monosyllabic and their pronunciations are easily confused, connected Mandarin digit speech recognition (CMDSR) is a challenging task in speech recognition. This paper applies a novel acoustic representation of speech, the acoustic universal structure (AUS), in which non-linguistic variations such as vocal tract length, lines, and noise are effectively removed. A two-layer matching strategy based on AUS models of speech, comprising digit AUS models and string AUS models, is proposed for CMDSR. The recognition system for connected Mandarin digits is described in detail, and experimental results show that the proposed method achieves a higher recognition rate.
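
The abstract does not state how an AUS is computed, so the following is only an illustrative sketch under stated assumptions. It assumes that each speech event is modeled as a Gaussian and that the "structure" is the matrix of pairwise Bhattacharyya distances between those Gaussians. Under these assumptions the matrix is unchanged by any invertible affine transform of the feature space, which is one way that speaker- or channel-like distortions could drop out of the representation. All function names (`bhattacharyya`, `structure`, `affine`, `demo`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two multivariate Gaussians."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    # Mahalanobis-like term using the averaged covariance.
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Log-determinant term, computed with slogdet for numerical stability.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    term2 = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return term1 + term2


def structure(means, covs):
    """Symmetric matrix of pairwise distances (assumed form of a structure)."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = bhattacharyya(means[i], covs[i], means[j], covs[j])
            dist[i, j] = dist[j, i] = d
    return dist


def affine(means, covs, A, b):
    """Apply x -> A x + b to every Gaussian (a stand-in for a distortion)."""
    return [A @ m + b for m in means], [A @ c @ A.T for c in covs]


def demo(seed=0, n_events=4, dim=3):
    """Build random Gaussians, distort them, and return both structures."""
    rng = np.random.default_rng(seed)
    means = [rng.normal(size=dim) for _ in range(n_events)]
    covs = []
    for _ in range(n_events):
        m = rng.normal(size=(dim, dim))
        covs.append(m @ m.T + dim * np.eye(dim))  # symmetric positive definite
    A = rng.normal(size=(dim, dim)) + 3.0 * np.eye(dim)  # invertible in practice
    b = rng.normal(size=dim)
    t_means, t_covs = affine(means, covs, A, b)
    return structure(means, covs), structure(t_means, t_covs)


if __name__ == "__main__":
    original, distorted = demo()
    print("max difference:", np.abs(original - distorted).max())
```

The two matrices returned by `demo()` agree up to floating-point error. How the paper builds its digit-level and string-level AUS models, and how it matches them in two layers, is not described in the abstract and is not modeled here.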

Info:

Periodical:

Advanced Materials Research (Volumes 846-847)

Pages:

1380-1383

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.846-847.1380

Online since:

November 2013

Authors:

Xian Yi Rui*, Yi Biao Yu, Ying Jiang

Keywords:

Acoustic Universal Structure, Candidate Digit Lattice, Connected Mandarin Digit Speech Recognition

* - Corresponding Author
