Adjustment Method between Phonological Attributes and Phone Boundaries

Lian Hai Zhang; Qi Chen; Dan Qu

doi:10.4028/www.scientific.net/AMM.433-435.316



Abstract:

Two kinds of imperfections, detection errors and asynchrony between phonological attributes and phone boundaries, can substantially reduce the recognition accuracy of a detection-based automatic speech recognition system. To address these problems, this paper proposes an adjustment method between phonological attributes and phone boundaries. First, prior knowledge of the corpus is combined with the detection results; then the asynchronies in the phone boundary area are compensated and the detection errors are corrected. Additionally, selectively deleting some frames with errors improves the precision of the phone models. With this adjustment method, the phoneme recognition rate improves by 1.4% in TIMIT phone classification experiments based on Conditional Random Fields.
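The abstract does not spell out the algorithm, so the following is only a minimal sketch of how the three steps could fit together, not the authors' implementation. It assumes binary frame-level attribute detections, frame-level phone labels taken from the corpus alignment, and a canonical phone-to-attribute table standing in for the prior knowledge. The table values, the function name `adjust_frames`, and the parameters `window` and `max_errors` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical prior knowledge: canonical (voiced, nasal, fricative) values per phone.
# The values are illustrative only and are not taken from the paper.
CANONICAL: Dict[str, Tuple[int, ...]] = {
    "m": (1, 1, 0),
    "a": (1, 0, 0),
    "s": (0, 0, 1),
}


def adjust_frames(
    phones: List[str],
    detected: List[Tuple[int, ...]],
    window: int = 1,      # assumed half-width (in frames) of the boundary area
    max_errors: int = 1,  # assumed limit on plain detection errors per kept frame
) -> Tuple[List[str], List[Tuple[int, ...]]]:
    """Return the kept phone labels and their adjusted attribute vectors."""
    n = len(phones)
    kept_phones: List[str] = []
    kept_attrs: List[Tuple[int, ...]] = []
    for t in range(n):
        canon = CANONICAL[phones[t]]
        # Phones on the other side of a boundary that lies within `window` frames.
        lo, hi = max(0, t - window), min(n, t + window + 1)
        neighbours = {phones[u] for u in range(lo, hi) if phones[u] != phones[t]}
        errors = 0
        for k, value in enumerate(detected[t]):
            if value == canon[k]:
                continue  # detection agrees with the prior knowledge
            if any(CANONICAL[p][k] == value for p in neighbours):
                continue  # asynchrony: the attribute follows the adjacent phone
            errors += 1   # mismatch that no neighbouring phone explains
        if errors > max_errors:
            continue      # selectively delete frames with too many errors
        # Compensated asynchronies and corrected errors both take the canonical value.
        kept_phones.append(phones[t])
        kept_attrs.append(canon)
    return kept_phones, kept_attrs


phones = ["m", "m", "m", "a", "a"]
detected = [(0, 0, 1), (1, 1, 0), (1, 0, 0), (1, 0, 1), (1, 0, 0)]
kept_phones, kept_attrs = adjust_frames(phones, detected)
print(kept_phones)
print(kept_attrs)
```

In this toy input the first frame disagrees with the canonical values on every attribute and is deleted, the third frame's early loss of nasality is treated as asynchrony with the following vowel, and the fourth frame's single unexplained mismatch is corrected. A real system would more likely operate on detector posteriors than on hard binary decisions.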


Info:

Periodical: Applied Mechanics and Materials (Volumes 433-435)

Pages: 316-321

DOI: https://doi.org/10.4028/www.scientific.net/AMM.433-435.316

Online since: October 2013

Authors: Lian Hai Zhang, Qi Chen, Dan Qu

Keywords: Automatic Speech Recognition, Conditional Random Fields, Phone Boundary Asynchrony, Phonological Attributes Detection

