Acoustic Features Selection of Speaker Verification Based on Average KL Distance

Abstract:

This paper proposes an Average Kullback-Leibler (KL) distance as the basis of a feature selection algorithm for matching score fusion in speaker verification. The new distance overcomes the asymmetry of the conventional Kullback-Leibler distance, which makes the computed information content between the matching scores of two acoustic features more accurate and robust. Experiments with a variety of fusion schemes show that fusing the matching scores of MFCC and residual phase gains the most information content, which indicates that this scheme can yield excellent performance.
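The asymmetry mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the Average KL distance, so the code below assumes one plausible form: the mean of the two directed KL divergences between discretized score distributions. The function names, the histogram binning, the add-one smoothing, and the synthetic Gaussian scores are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Directed KL divergence D(p || q) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()              # normalize to probabilities
    q = q / q.sum()
    p = np.clip(p, eps, None)    # guard against log(0)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def average_kl_distance(p, q):
    """Assumed symmetric form: mean of D(p || q) and D(q || p)."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def score_histogram(scores, bins):
    """Histogram of matching scores with add-one smoothing (an assumption)."""
    counts, _ = np.histogram(scores, bins=bins)
    return counts + 1.0

# Synthetic matching scores standing in for two acoustic features.
rng = np.random.default_rng(0)
scores_a = rng.normal(0.0, 1.0, 5000)
scores_b = rng.normal(0.5, 1.5, 5000)

bins = np.linspace(-6.0, 6.0, 41)
p = score_histogram(scores_a, bins)
q = score_histogram(scores_b, bins)

print("D(p || q)   =", kl_divergence(p, q))
print("D(q || p)   =", kl_divergence(q, p))
print("average KL  =", average_kl_distance(p, q))
```

Under this assumed definition, the two directed divergences generally differ, while the averaged value is the same regardless of which feature's scores are passed first.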

Pages:

629-633

Online since:

August 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
