On Semi-Supervised Learning Genetic-Based and Deterministic Annealing EM Algorithm for Dirichlet Mixture Models

Article Preview

Abstract:

We propose a genetic-based and deterministic annealing expectation-maximization (GA&DA-EM) algorithm for learning Dirichlet mixture models from multivariate data. The algorithm selects the number of mixture components using the minimum description length (MDL) criterion. Our approach combines the strengths of genetic algorithms (GAs) and deterministic annealing (DA) in a single procedure: the population-based stochastic search of the GA&DA explores the search space more thoroughly than the EM method does, so the algorithm can escape locally optimal solutions and is less sensitive to initialization. The GA&DA-EM algorithm is elitist, which preserves the monotonic convergence property of the EM algorithm. We conducted experiments on the WebKB and 20 Newsgroups datasets. The results show that (1) GA&DA-EM outperforms the EM method, identifying the number of components used to generate the underlying data more often than EM does, and (2) it is a viable alternative to EM that overcomes the problem of local maxima.
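The abstract only outlines the method, so the sketch below illustrates two of the ingredients it names, the deterministic-annealing E-step and MDL-based model selection, for a Dirichlet mixture. This is a minimal illustration under our own assumptions, not the authors' implementation: all function and parameter names (dirichlet_logpdf, annealed_responsibilities, beta, mdl) are ours, and the GA machinery (crossover, mutation, elitist selection over a population of candidate mixtures) is omitted.

import numpy as np
from scipy.special import gammaln

def dirichlet_logpdf(x, alpha):
    # Log-density of Dirichlet(alpha) at points x; each row of x lies on the simplex.
    return (gammaln(alpha.sum()) - gammaln(alpha).sum()
            + ((alpha - 1.0) * np.log(x)).sum(axis=1))

def annealed_responsibilities(x, weights, alphas, beta):
    # Deterministic-annealing E-step: component posteriors are tempered by
    # beta in (0, 1]; beta -> 1 recovers the ordinary EM E-step.
    log_p = np.stack([np.log(w) + dirichlet_logpdf(x, a)
                      for w, a in zip(weights, alphas)], axis=1)
    log_p *= beta
    log_p -= log_p.max(axis=1, keepdims=True)   # stabilise the softmax
    r = np.exp(log_p)
    return r / r.sum(axis=1, keepdims=True)

def mdl(loglik, n_components, dim, n_samples):
    # MDL = -log L + (k/2) log n, where k counts the free parameters:
    # one alpha vector of length dim per component plus K-1 mixing weights.
    k = n_components * dim + (n_components - 1)
    return -loglik + 0.5 * k * np.log(n_samples)

In the full algorithm, each GA individual would encode one candidate mixture; individuals are refined by annealed EM steps, compared by their MDL scores, and the best-scoring individual is carried over unchanged between generations. That elitism is what preserves the monotonic convergence the abstract refers to.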

Info:

Pages: 151-156

Online since: November 2010

Copyright: © 2011 Trans Tech Publications Ltd. All Rights Reserved
