Speaker Identification in Total Variability Space

Abstract:

Approaches based on the Gaussian Mixture Model-Universal Background Model (GMM-UBM) have been widely used for speaker identification. In complex real-world environments, however, identification systems perform much worse than in the laboratory, mainly because of the mismatch between the training and testing channels and the variability of the speaker's own voice. In this paper we introduce the i-vector into the speaker identification system. In the i-vector approach, a low-dimensional subspace called the total variability space is used to model both speaker and channel variability. Baum-Welch statistics are first computed with respect to the given UBM and are then used to estimate the total variability. In our experiments, using the total variability space to compensate for the mismatch in both speaker and channel variability yields a 2.44% relative improvement in identification accuracy.
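To make the Baum-Welch step concrete, the following is a minimal sketch of how zeroth- and first-order statistics are commonly computed for one utterance against a UBM. It is not taken from the paper: the function name `baum_welch_stats`, the diagonal-covariance assumption, and the randomly generated toy UBM and feature frames are all illustrative assumptions. In the standard i-vector formulation these statistics are the inputs used to train the total variability matrix and to extract an i-vector.

```python
import numpy as np


def baum_welch_stats(frames, weights, means, variances):
    """Zeroth- and first-order Baum-Welch statistics of one utterance.

    Assumes a diagonal-covariance GMM standing in for the UBM.
    frames: (T, D) feature vectors; weights: (C,); means, variances: (C, D).
    """
    # Log-density of every frame under every Gaussian component.
    diff = frames[:, None, :] - means[None, :, :]              # (T, C, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )                                                           # (T, C)
    log_joint = np.log(weights)[None, :] + log_gauss

    # Component posteriors, normalised in the log domain for stability.
    log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
    post = np.exp(log_joint - log_norm)                         # (T, C)

    n = post.sum(axis=0)                    # zeroth-order statistics, (C,)
    f = post.T @ frames                     # first-order statistics, (C, D)
    f_centered = f - n[:, None] * means     # first-order stats centred on UBM means
    return n, f, f_centered


if __name__ == "__main__":
    # Toy data only: a random 4-component "UBM" and 50 random 3-dim frames.
    rng = np.random.default_rng(0)
    C, D, T = 4, 3, 50
    weights = np.full(C, 1.0 / C)
    means = rng.normal(size=(C, D))
    variances = np.ones((C, D))
    frames = rng.normal(size=(T, D))

    n, f, f_centered = baum_welch_stats(frames, weights, means, variances)
    print("frames accounted for:", n.sum())
```

Because the posteriors of each frame sum to one, the zeroth-order statistics sum to the number of frames, which gives a quick sanity check on any implementation.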

Info:

Pages:

1489-1492

Online since:

September 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
