Kevin Reilly

Review of:
Nikos Vlassis, Aristidis Likas
"A greedy EM algorithm for Gaussian mixture learning"
Neural Processing Letters 15(1): 77-87, 2002

Computing Reviews, 44, 9, 2003, 539.



Improvement in applying Gaussians mixtures (k weighted sums, weights summing to one) to heterogenous population data employs the EM (Expectation-Maximization) algorithm in a greedy algorithm manner, solving the problem successively over values of k. The problem reduces to EM learning of two-component mixtures, the k+1 mixture becoming a weighted sum of the (``old") k mixture and a new component, the old mixture being unchanged. The approach is based on quite recent (2000) theory showing that, if the new components are allocated optimally (left an open problem), the greedy approach works ``almost as good" (defined precisely in the paper) as any k mixture.

Component allocation builds on previous work, e.g., rendering the k+1 liklihood independent of the mixing weigh and employing pre-stored (Gaussian) matrices to simplify parameter search (to which the authors present an additional proposal). A 6-step algorithm summarizes all, (search) computation complexity being quadratic in the training set size. A Matlab implementation available on the internet is mentioned.

Experiments (synthetic and real data) include results better than proffered by the theory, possibly due to the usual initialization difficulty in the non-greedy case (the authors noting this). Final brief discussion lists other approaches and the author's future plans.

The paper is well written with surgically placed repetitions of important theory and practice points. Outined also are EM (general and greedy) algorithm details. The paper may have a slightly bloated introduction and text on better known information may squeeze out some paper-pertinent details which might promote wider readership.