Long Mandarin Spoken Term Detection Using Two-Stage Search
Abstract:
For efficient collection of speech recordings, the ability to search for spoken terms in the speech stream is essential. Although Chinese spoken term detection (STD) does not suffer from the out-of-vocabulary (OOV) problem as English does, it is still hard to retrieve long spoken terms that contain four or more characters. In this paper, we detail our approach to long Mandarin spoken term detection, which combines a search on the inverted index produced by a speech recognizer with a linear scan on syllable confusion networks (SCNs). First, we split the long spoken terms into syllables and search for the syllables in the inverted index file to get the segments that may contain the long spoken terms. Then we apply a linear scan algorithm to the SCNs of those segments. On two Mandarin conversational telephone speech sets, we compare the performance of the proposed method with that of baseline syllable-based systems, and our approach gives satisfactory performance gains over the baselines.
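The two-stage search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data layout (each segment as a list of SCN slots, each slot a dict from syllable to posterior), the function names, the product-of-posteriors score, and the exact-position matching are all assumptions made for illustration. In particular, the sketch builds its inverted index from the SCNs themselves, whereas the paper's index is produced by the speech recognizer.

```python
from collections import defaultdict


def build_inverted_index(segments):
    """Map each syllable to the ids of the segments whose SCN contains it.

    `segments` is a hypothetical layout: {segment_id: [slot, ...]}, where
    each slot is a dict {syllable: posterior}.
    """
    index = defaultdict(set)
    for seg_id, scn in segments.items():
        for slot in scn:
            for syllable in slot:
                index[syllable].add(seg_id)
    return index


def candidate_segments(index, query):
    """Stage 1: keep only segments that contain every query syllable."""
    postings = [index.get(syllable, set()) for syllable in query]
    return set.intersection(*postings) if postings else set()


def linear_scan(scn, query):
    """Stage 2: slide the query over consecutive SCN slots.

    The score of a match is the product of the posteriors of the query
    syllables in the aligned slots (an assumed scoring rule).
    """
    hits = []
    for start in range(len(scn) - len(query) + 1):
        score = 1.0
        for offset, syllable in enumerate(query):
            score *= scn[start + offset].get(syllable, 0.0)
            if score == 0.0:
                break  # a query syllable is missing from this slot
        if score > 0.0:
            hits.append((start, score))
    return hits


def detect(segments, index, query, threshold=0.0):
    """Run both stages and return (segment_id, start_slot, score) tuples."""
    results = []
    for seg_id in sorted(candidate_segments(index, query)):
        for start, score in linear_scan(segments[seg_id], query):
            if score >= threshold:
                results.append((seg_id, start, score))
    return results
```

In this sketch the inverted index only prunes the set of segments, so the more expensive slot-by-slot scan runs on a small number of candidates; a query would be passed as a list of syllables, for example `["zhong", "hua", "ren", "min"]`.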
Info:
Pages:
2720-2723
Online since:
August 2013
Copyright:
© 2013 Trans Tech Publications Ltd. All Rights Reserved