Speaker Identification Using a Novel Combination of Sparse Representation and Gaussian Mixture Models

Abstract:

In recent years, sparse representation has become a popular method for pattern recognition and can outperform traditional methods. This paper presents a novel combination of sparse representation and traditional Gaussian mixture models (GMMs). Each speaker’s dictionary, termed a subspace in this paper, is learned with the K-SVD algorithm, with the union of that speaker’s GMM mean matrices as its entries. The test utterance is then projected onto each dictionary, and the decision is made according to the reconstruction errors. The experiments are conducted on a database collected in our anechoic chamber. The accuracy of the proposed approach varies with the sparsity and the dictionary size. With appropriate parameters, the accuracy reaches 98.5%.
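The decision step described in the abstract (code the test vector against each speaker's dictionary, then pick the speaker with the smallest reconstruction error) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the abstract does not name the sparse coder, so orthogonal matching pursuit (OMP) is swapped in; the dictionaries are tiny hand-written atoms rather than K-SVD output trained on GMM means; and the names `omp_residual` and `identify` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve a small square system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def omp_residual(atoms, y, sparsity):
    """Assumed sparse coder: greedy OMP. Returns the norm of the final residual."""
    chosen, residual = [], y[:]
    for _ in range(min(sparsity, len(atoms))):
        if dot(residual, residual) < 1e-12:
            break  # y is already reconstructed
        # Pick the unused atom most correlated with the current residual.
        best = max((j for j in range(len(atoms)) if j not in chosen),
                   key=lambda j: abs(dot(atoms[j], residual)))
        chosen.append(best)
        # Least-squares fit of y on the chosen atoms via the normal equations
        # (assumes the chosen atoms are linearly independent).
        gram = [[dot(atoms[i], atoms[j]) for j in chosen] for i in chosen]
        rhs = [dot(atoms[i], y) for i in chosen]
        coef = solve(gram, rhs)
        residual = [y[d] - sum(c * atoms[i][d] for c, i in zip(coef, chosen))
                    for d in range(len(y))]
    return math.sqrt(dot(residual, residual))

def identify(dictionaries, y, sparsity):
    """Return the speaker whose dictionary gives the smallest reconstruction error."""
    errors = {spk: omp_residual(atoms, y, sparsity)
              for spk, atoms in dictionaries.items()}
    return min(errors, key=errors.get), errors

# Toy data (hypothetical): two speakers, two unit-norm atoms each, 3-D features.
dictionaries = {
    "A": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "B": [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]],
}
test_vector = [0.9, 0.3, 0.0]
speaker, errors = identify(dictionaries, test_vector, sparsity=2)
print(speaker, errors)
```

In this toy setup the test vector lies in the span of speaker A's atoms, so A's reconstruction error is near zero and A is selected. The sparsity level and the number of atoms per dictionary are the two parameters the abstract reports as affecting accuracy.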

Info:

Pages:

265-269

Online since:

August 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
