On Semi-Supervised Learning Genetic-Based and Deterministic Annealing EM Algorithm for Dirichlet Mixture Models
We propose a genetic-based and deterministic annealing expectation-maximization (GA&DA-EM) algorithm for learning Dirichlet mixture models from multivariate data. The algorithm selects the number of mixture components using the minimum description length (MDL) criterion. Our approach combines genetic algorithms and deterministic annealing into a single procedure and so benefits from the properties of both. The population-based stochastic search of GA&DA explores the search space more thoroughly than the EM method. As a result, the algorithm is less sensitive to its initialization and can escape locally optimal solutions. The GA&DA-EM algorithm is elitist, which preserves the monotonic convergence property of the EM algorithm. We conducted experiments on the WebKB and 20NEWSGROUPS data sets. The results show that 1) GA&DA-EM outperforms the EM method, in that it identifies the number of components used to generate the underlying data more often than the EM algorithm does, and 2) the algorithm is an alternative to EM that overcomes the challenge of local maxima.
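To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of a deterministic-annealing E-step and an MDL score for a Dirichlet mixture. It is not the authors' implementation: the abstract does not give the exact formulas, so the tempered posterior (responsibilities raised to an inverse temperature `beta`), the MDL form `-log L + (p / 2) * log n`, the parameter count, and all function names are assumptions chosen for illustration.

```python
import math

def dirichlet_logpdf(x, alpha):
    """Log density of Dirichlet(alpha) at a point x on the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))

def annealed_responsibilities(x, weights, alphas, beta):
    """Tempered E-step (assumed form): r_k proportional to (w_k * p(x | alpha_k)) ** beta.

    beta = 1 gives the ordinary EM posterior; beta = 0 gives uniform weights.
    """
    scores = [beta * (math.log(w) + dirichlet_logpdf(x, a))
              for w, a in zip(weights, alphas)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood(data, weights, alphas):
    """Mixture log-likelihood, computed with a log-sum-exp per data point."""
    total = 0.0
    for x in data:
        terms = [math.log(w) + dirichlet_logpdf(x, a)
                 for w, a in zip(weights, alphas)]
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total

def mdl_score(data, weights, alphas):
    """MDL in a common two-part form (assumed): -log L + (p / 2) * log n."""
    k = len(weights)
    d = len(alphas[0])
    num_params = k * d + (k - 1)  # Dirichlet parameters plus free mixing weights
    return -log_likelihood(data, weights, alphas) + 0.5 * num_params * math.log(len(data))

if __name__ == "__main__":
    x = [0.7, 0.2, 0.1]
    weights = [0.5, 0.5]
    alphas = [[5.0, 2.0, 1.0], [1.0, 2.0, 5.0]]
    for beta in (0.0, 0.5, 1.0):
        print(beta, annealed_responsibilities(x, weights, alphas, beta))
```

Under these assumptions, raising `beta` gradually from near 0 to 1 moves the assignments from uniform toward the ordinary EM posterior, and candidate models with different numbers of components can be compared by choosing the one with the lowest `mdl_score`.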
J. H. Bai et al., "On Semi-Supervised Learning Genetic-Based and Deterministic Annealing EM Algorithm for Dirichlet Mixture Models", Applied Mechanics and Materials, Vol. 39, pp. 151-156, 2011