Long Mandarin Spoken Term Detection Using Two-Stage Search

Zhen Zhang; Ji Xu; Xu Yang Wang; Qing Wei Zhao; Yong Hong Yan

doi:10.4028/www.scientific.net/AMM.380-384.2720

Abstract:

For efficient use of speech recording collections, the ability to search for spoken terms in the speech stream is essential. Although Chinese spoken term detection (STD) does not suffer from the out-of-vocabulary (OOV) problem as English does, it is still hard to retrieve long spoken terms that contain four or more characters. In this paper, we detail our approach to long Mandarin spoken term detection, which combines a search on the inverted index produced by a speech recognizer with a linear scan on syllable confusion networks (SCNs). First, we split the long spoken terms into syllables and search for the syllables in the inverted index file to get the segments that may contain the long spoken terms. Then we run a linear scan algorithm on the SCNs. On two Mandarin conversational telephone speech sets, we compare the performance of the proposed method with that of baseline syllable-based systems, and our approach yields satisfactory performance gains over them.
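The two stages described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the data layout (1-best syllable transcripts per segment, an SCN as a list of slots mapping syllables to posteriors), the function names, the thresholds `min_hits` and `min_score`, and the product-of-posteriors scoring are all assumptions made for illustration. In particular, the second stage here uses a simple position-by-position match rather than the MED search named in the keywords.

```python
from collections import defaultdict


def build_inverted_index(transcripts):
    """Map each syllable to the ids of segments whose 1-best transcript contains it."""
    index = defaultdict(set)
    for seg_id, syllables in transcripts.items():
        for syl in syllables:
            index[syl].add(seg_id)
    return index


def candidate_segments(index, term, min_hits):
    """Stage 1: keep segments containing at least min_hits distinct term syllables."""
    counts = defaultdict(int)
    for syl in set(term):
        for seg_id in index.get(syl, ()):
            counts[seg_id] += 1
    return {seg_id for seg_id, n in counts.items() if n >= min_hits}


def scan_scn(scn, term, min_score):
    """Stage 2: slide the term over the SCN slots and score each alignment.

    The score is the product of the posteriors of the term's syllables in
    consecutive slots (an assumed scoring rule, chosen for simplicity).
    """
    hits = []
    for start in range(len(scn) - len(term) + 1):
        score = 1.0
        for offset, syl in enumerate(term):
            score *= scn[start + offset].get(syl, 0.0)
        if score >= min_score:
            hits.append((start, score))
    return hits


def two_stage_search(transcripts, scns, term, min_hits, min_score):
    """Run stage 1 on the inverted index, then stage 2 only on the candidates."""
    index = build_inverted_index(transcripts)
    results = {}
    for seg_id in sorted(candidate_segments(index, term, min_hits)):
        hits = scan_scn(scns[seg_id], term, min_score)
        if hits:
            results[seg_id] = hits
    return results


# Hypothetical toy data: a four-syllable query and three segments.
TERM = ["zhong", "guo", "ren", "min"]
TRANSCRIPTS = {
    "seg1": ["wo", "ai", "zhong", "guo", "ren", "min"],
    "seg2": ["jin", "tian", "tian", "qi", "hao"],
    "seg3": ["zhong", "guo", "ren", "ming"],  # last syllable misrecognized
}
SCNS = {
    "seg1": [{"wo": 1.0}, {"ai": 0.9, "a": 0.1}, {"zhong": 0.9, "zong": 0.1},
             {"guo": 1.0}, {"ren": 0.8, "reng": 0.2}, {"min": 0.7, "ming": 0.3}],
    "seg2": [{"jin": 1.0}, {"tian": 1.0}, {"tian": 1.0}, {"qi": 1.0}, {"hao": 1.0}],
    "seg3": [{"zhong": 0.8, "zong": 0.2}, {"guo": 0.9, "gou": 0.1},
             {"ren": 1.0}, {"ming": 0.6, "min": 0.4}],
}

RESULTS = two_stage_search(TRANSCRIPTS, SCNS, TERM, min_hits=3, min_score=0.2)
for seg_id, hits in RESULTS.items():
    print(seg_id, hits)
```

In this toy data, `seg3` contains a recognition error in its 1-best transcript, so an exact match on the transcript would miss it. Requiring only three of the four syllables in stage 1 keeps it as a candidate, and the scan over its SCN recovers the term through the lower-ranked alternative `min`.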

Info:

Periodical:

Applied Mechanics and Materials (Volumes 380-384)

Pages:

2720-2723

Online since:

August 2013

Keywords:

Inverted Index, MED Search, Spoken Term Detection, Syllable Confusion Network
