Chinese Dialect Identification Using SC-GMM

Abstract:

Gaussian mixture models (GMMs) are an effective method for Chinese dialect identification, and estimating GMM parameters is an important step in building a state-of-the-art speech processing system. One of the most widely used approaches is maximum-likelihood estimation, in which the parameters of class-specific distributions are estimated with the Expectation-Maximization (EM) algorithm. Because the initial parameters strongly influence the convergence of EM, how to initialize the GMM parameters is a key problem. In this paper, we apply spectral clustering (SC) to initialize the GMM parameters. Experimental results show that initializing GMM parameters with spectral clustering outperforms the traditional K-Means method and gives the identification system a higher recognition rate.
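The abstract does not give the features, affinity function, or clustering details used in the paper, so the following is only a minimal sketch of the general idea: cluster the training vectors with spectral clustering, then turn the cluster assignments into initial GMM weights, means, and diagonal variances that an EM routine could start from. The RBF affinity, the normalized Laplacian, the `sigma` and `var_floor` parameters, the function names, and the synthetic two-blob data are all assumptions made for illustration, not the authors' configuration.

```python
import numpy as np

def spectral_labels(X, k, sigma=1.0, n_iter=50):
    """Assign each row of X to one of k clusters (normalized spectral clustering)."""
    # Assumed affinity: RBF kernel on pairwise squared Euclidean distances.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k smallest.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # Plain k-means on the embedded rows, seeded by farthest-point selection
    # so the sketch is deterministic.
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def gmm_init_from_labels(X, labels, k, var_floor=1e-3):
    """Turn hard cluster labels into initial GMM weights, means, diagonal variances."""
    n, dim = X.shape
    weights = np.zeros(k)
    means = np.zeros((k, dim))
    variances = np.ones((k, dim))
    for j in range(k):
        members = X[labels == j]
        if len(members) == 0:
            continue  # empty cluster keeps weight 0 and unit variance
        weights[j] = len(members) / n
        means[j] = members.mean(axis=0)
        variances[j] = np.maximum(members.var(axis=0), var_floor)
    return weights, means, variances

# Synthetic stand-in for acoustic feature vectors: two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(40, 2)),
    rng.normal(loc=6.0, scale=0.5, size=(40, 2)),
])
labels = spectral_labels(X, k=2, sigma=1.0)
weights, means, variances = gmm_init_from_labels(X, labels, k=2)
print("initial weights:", weights)
print("initial means:", means)
```

The returned triple would replace the K-Means-derived starting point in an EM implementation. On real speech features the affinity scale `sigma` would need tuning, and the dense pairwise matrix limits this sketch to small training sets.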

Info:

Periodical:

Advanced Materials Research (Volumes 433-440)

Pages:

3292-3296

Online since:

January 2012

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
