Modeling Tutorial
Mixture Model
A mixture model is a probabilistic model for density estimation using a mixture distribution. A mixture model can be regarded as a type of unsupervised learning or clustering.
Figure: Normal distributions plotted with different means and variances.
Suppose researchers are trying to find the optimal mixture of ingredients for a fruit punch consisting of grape juice, mango juice, and pineapple juice. A mixture model is suitable here because the results of the taste tests depend not on the amount of each ingredient used to make the batch but on the fraction of each ingredient present in the punch. The components always sum to a whole, which a mixture model takes into account. (Strictly speaking, this is the experimental-design sense of "mixture", where the proportions are the inputs; the rest of this tutorial concerns mixture distributions.)
As another example, financial returns often behave differently in normal times and during crises, so a mixture model for return data seems reasonable. Some model returns with a jump-diffusion model, others as a mixture of two normal distributions.
Direct and indirect applications of mixture models
The financial example above is one direct application of the mixture model, a situation in which we assume an underlying mechanism so that each observation belongs to one of some number of different sources or categories. This underlying mechanism may or may not, however, be observable. In this form of mixture, each of the sources is described by a component probability density function, and its mixture weight is the probability that an observation comes from this component.
In an indirect application of the mixture model we do not assume such a mechanism. The mixture model is used simply for its mathematical flexibility. For example, a mixture of two normal distributions with different means may result in a density with two modes, which standard parametric distributions cannot model. As another example, mixture distributions can have fatter tails than a single Gaussian, which makes them candidates for modeling more extreme events. When combined with dynamical consistency, this approach has been applied to financial derivatives valuation in the presence of the volatility smile, in the context of local volatility models.
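The two flexibility claims above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names, weights, means, and standard deviations are illustrative choices, not values from the text. It counts the modes of a two-component normal mixture on a grid and computes the excess kurtosis of a same-mean, different-variance mixture in closed form (a positive value indicates heavier tails than a single normal).

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def mixture_pdf(x, weights, means, sds):
    """Weighted sum of normal component densities."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))

def count_modes(density):
    """Count strict interior local maxima of a density sampled on a grid."""
    interior = density[1:-1]
    return int(np.sum((interior > density[:-2]) & (interior > density[2:])))

def scale_mixture_excess_kurtosis(weights, sds):
    """Excess kurtosis of a zero-mean mixture of normals (closed form)."""
    w = np.asarray(weights, dtype=float)
    var = np.asarray(sds, dtype=float) ** 2
    second_moment = np.sum(w * var)
    fourth_moment = 3.0 * np.sum(w * var ** 2)  # E[X^4] = 3 * sd^4 for a normal
    return fourth_moment / second_moment ** 2 - 3.0

grid = np.linspace(-10.0, 10.0, 4001)

# Two well-separated means (illustrative values): the density has two modes.
bimodal_density = mixture_pdf(grid, [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
n_modes = count_modes(bimodal_density)

# Same mean, different variances: a single mode but positive excess kurtosis.
excess = scale_mixture_excess_kurtosis([0.9, 0.1], [1.0, 3.0])
```

A single normal distribution has one mode and zero excess kurtosis, so both quantities show behavior that one normal component cannot reproduce.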
Common approaches for estimation in mixture models
Parametric mixture models are often used when we know the component distribution family $Y$ and can sample from the mixture $X$, but would like to determine the mixing weights $a_i$ and the component parameters $\theta_i$. Such situations can arise in studies in which we sample from a population that is composed of several distinct subpopulations. It is common to think of probability mixture modeling as a missing data problem. One way to understand this is to assume that the data points under consideration have “membership” in one of the distributions we are using to model the data. When we start, this membership is unknown, or missing. The job of estimation is to devise appropriate parameters for the model functions we choose, with the connection to the data points being represented as their membership in the individual model distributions.
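For reference, the setup described above is usually written as a weighted sum of component densities. This is a sketch of the standard finite-mixture notation; the symbols $f_X$ and $f_Y$ for the mixture and component densities, and $n$ for the number of components, are assumptions introduced here for illustration.

```latex
f_X(x) = \sum_{i=1}^{n} a_i \, f_Y(x; \theta_i),
\qquad a_i \ge 0,
\qquad \sum_{i=1}^{n} a_i = 1 .
```

Estimation then means recovering the weights $a_i$ and the parameters $\theta_i$ from samples of $X$ alone.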
Expectation maximization (EM)
The expectation-maximization algorithm can be used to compute the parameters of a parametric mixture model distribution (the $a_i$’s and $\theta_i$’s). It is an iterative algorithm with two steps: an expectation step and a maximization step. Practical examples of EM and Mixture Modeling are included in the SOCR demonstrations.
The expectation step
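With initial guesses for the parameters of our mixture model, “partial membership” of each data point in each constituent distribution is computed by calculating expectation values for the membership variables of each data point. That is, for each data point $x_j$ and distribution $Y_i$ with density $f_Y(\cdot\,; \theta_i)$, the membership value $y_{i,j}$ is:
$$y_{i,j} = \frac{a_i \, f_Y(x_j; \theta_i)}{\sum_{k} a_k \, f_Y(x_j; \theta_k)}$$
The denominator is the mixture density evaluated at $x_j$, so the membership values of each data point sum to 1 over the components.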
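The expectation step can be sketched in a few lines. This is a minimal illustration, assuming one-dimensional data, normal components, and NumPy; the function names and the example parameter values are hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def e_step(x, weights, means, sds):
    """Return an (n_components, n_points) array of membership values y[i, j]."""
    x = np.asarray(x, dtype=float)
    # Numerator: mixing weight times component density, for each component and point.
    weighted = np.array([w * normal_pdf(x, m, s)
                         for w, m, s in zip(weights, means, sds)])
    # Dividing by the mixture density makes each column sum to 1.
    return weighted / weighted.sum(axis=0, keepdims=True)

# Illustrative guesses for a two-component mixture.
memberships = e_step([-2.0, 0.0, 2.0],
                     weights=[0.5, 0.5], means=[-1.0, 1.0], sds=[1.0, 1.0])
```

Here the point at 0 lies midway between the two components, so it is shared equally, while the points at -2 and 2 belong almost entirely to the nearer component.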
The maximization step
With expectation values in hand for group membership, plug-in estimates are recomputed for the distribution parameters. The mixing coefficients $a_i$ are the means of the membership values over the $N$ data points:
$$a_i = \frac{1}{N} \sum_{j=1}^{N} y_{i,j}$$
The component model parameters $\theta_i$ are also re-estimated from the data points $x_j$, weighted by the membership values. For example, if $\theta_i$ is a mean $\mu_i$:
$$\mu_i = \frac{\sum_{j} y_{i,j} \, x_j}{\sum_{j} y_{i,j}}$$
With new estimates for the $a_i$ and the $\theta_i$’s, the expectation step is repeated to recompute the membership values. The entire procedure is repeated until the model parameters converge.
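Putting the two steps together gives the full iteration. The sketch below assumes one-dimensional data, normal components, and NumPy; the function names, the stopping rule, the synthetic data, and the initial guesses are all illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def em_gaussian_mixture(x, weights, means, sds, n_iter=200, tol=1e-8):
    """Alternate expectation and maximization steps until the parameters settle."""
    x = np.asarray(x, dtype=float)
    weights, means, sds = (np.array(v, dtype=float) for v in (weights, means, sds))
    for _ in range(n_iter):
        # Expectation step: membership of every point in every component.
        weighted = weights[:, None] * normal_pdf(x[None, :], means[:, None], sds[:, None])
        y = weighted / weighted.sum(axis=0, keepdims=True)
        # Maximization step: plug-in estimates weighted by the memberships.
        totals = y.sum(axis=1)
        new_weights = totals / x.size
        new_means = (y @ x) / totals
        new_sds = np.sqrt((y * (x[None, :] - new_means[:, None]) ** 2).sum(axis=1) / totals)
        # Largest change in any parameter, used as the convergence check.
        change = max(np.max(np.abs(new_weights - weights)),
                     np.max(np.abs(new_means - means)),
                     np.max(np.abs(new_sds - sds)))
        weights, means, sds = new_weights, new_means, new_sds
        if change < tol:
            break
    return weights, means, sds

# Synthetic data from two normal subpopulations (illustrative values).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-3.0, 1.0, 600), rng.normal(2.0, 0.5, 400)])
weights, means, sds = em_gaussian_mixture(data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

EM converges to a local optimum that depends on the initial guesses, so in practice it is common to run it from several starting points.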
Markov-chain Monte Carlo
As an alternative to the EM algorithm, the mixture model parameters can be deduced using posterior sampling as indicated by Bayes’ theorem. This is still regarded as an incomplete data problem whereby membership of data points is the missing data. A two-step iterative procedure known as Gibbs sampling can be used.
The previous example of a mixture of two Gaussian distributions can demonstrate how the method works. As before, initial guesses of the parameters for the mixture model are made. Instead of computing partial memberships for each elemental distribution, a membership value for each data point is drawn from a Bernoulli distribution (that is, the point is assigned to either the first or the second Gaussian). The Bernoulli parameter for each data point is the probability, under the current parameter values, that the point came from a given constituent distribution; it is computed in the same way as the membership value in the expectation step of EM. Draws from this distribution generate membership associations for each data point. Plug-in estimators can then be used, as in the M step of EM, to generate a new set of mixture model parameters, and the Bernoulli draw step is repeated.
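The procedure described above can be sketched as follows, assuming one-dimensional data, two normal components, and NumPy. Note the simplification: like the text, this sketch updates the parameters with plug-in estimates after each round of membership draws, whereas a full Gibbs sampler would instead draw the parameters from their conditional posterior distributions. The function names, burn-in length, data, and initial guesses are illustrative.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def sample_memberships_and_update(x, weights, means, sds, rng):
    """One sweep: Bernoulli membership draws, then plug-in parameter estimates."""
    dens_first = weights[0] * normal_pdf(x, means[0], sds[0])
    dens_second = weights[1] * normal_pdf(x, means[1], sds[1])
    # Bernoulli parameter: probability that each point belongs to the second component.
    p_second = dens_second / (dens_first + dens_second)
    in_second = rng.random(x.size) < p_second
    weights, means, sds = list(weights), list(means), list(sds)
    for k, mask in enumerate((~in_second, in_second)):
        if mask.sum() >= 2:  # otherwise keep the previous mean and spread
            means[k] = x[mask].mean()
            sds[k] = x[mask].std()
        weights[k] = mask.mean()
    return weights, means, sds

# Synthetic data from two normal subpopulations (illustrative values).
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-3.0, 1.0, 600), rng.normal(2.0, 0.5, 400)])
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
kept_means = []
for sweep in range(200):
    params = sample_memberships_and_update(data, *params, rng)
    if sweep >= 100:  # discard the early sweeps as burn-in
        kept_means.append(params[1])
averaged_means = np.mean(kept_means, axis=0)
```

Averaging the retained draws gives a point summary; the spread of the retained draws gives a rough sense of the uncertainty that EM alone does not report.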
Spectral method
Some problems in mixture model estimation can be solved using spectral methods. In particular, they become useful if the data points $x_i$ are points in high-dimensional Euclidean space and the hidden distributions are known to be log-concave (such as the Gaussian or exponential distribution). Spectral methods of learning mixture models are based on the singular value decomposition of a matrix that contains the data points. The idea is to consider the top $k$ singular vectors, where $k$ is the number of distributions to be learned. Projecting each data point onto the linear subspace spanned by those vectors groups points originating from the same distribution very close together, while points from different distributions stay far apart. One distinctive feature of the spectral method is that it allows us to prove that, if the distributions satisfy a certain separation condition, then the estimated mixture will be very close to the true one with high probability.
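The projection idea can be illustrated with a small experiment. This is a minimal sketch, assuming NumPy, two spherical Gaussian clusters, and well-separated centers; the dimension, cluster sizes, and center positions are illustrative, and the sketch shows only the projection step, not a complete learning algorithm with guarantees.

```python
import numpy as np

def spectral_projection(points, k):
    """Project the rows of `points` onto the span of the top-k right singular vectors."""
    _, _, vt = np.linalg.svd(points, full_matrices=False)
    top = vt[:k]           # shape (k, dimension)
    return points @ top.T  # shape (n_points, k)

rng = np.random.default_rng(2)
dim, per_cluster = 50, 200

# Two cluster centers in 50 dimensions (illustrative positions).
center_a = np.zeros(dim)
center_a[0] = 6.0
center_b = np.zeros(dim)
center_b[1] = 6.0

points = np.vstack([center_a + rng.normal(size=(per_cluster, dim)),
                    center_b + rng.normal(size=(per_cluster, dim))])
labels = np.repeat([0, 1], per_cluster)

projected = spectral_projection(points, k=2)

# Distance between projected cluster means versus within-cluster spread.
mean_a = projected[labels == 0].mean(axis=0)
mean_b = projected[labels == 1].mean(axis=0)
separation = np.linalg.norm(mean_a - mean_b)
spread = max(projected[labels == 0].std(axis=0).max(),
             projected[labels == 1].std(axis=0).max())
```

In the original 50-dimensional space the noise contributes to every coordinate, while after projection only two noise coordinates remain, so the gap between the clusters is large relative to their spread.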
Other methods
Some other methods can even provably learn mixtures of heavy-tailed distributions, including those with infinite variance (see links to papers below). In this setting, EM-based methods would not work, since the expectation step would diverge in the presence of outliers.
A simulation
To simulate a sample of size $N$ from a mixture of distributions $F_i$, $i = 1$ to $n$, with probabilities $p_i$ (where the $p_i$ sum to 1):
1. Generate $N$ random numbers from a categorical distribution of size $n$ with probabilities $p_i$ for $i = 1$ to $n$. These tell you which of the $F_i$ each of the $N$ values will come from. Denote by $m_i$ the number of random numbers assigned to the $i$-th category.
2. For each $i$, generate $m_i$ random numbers from the $F_i$ distribution.
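The simulation recipe above translates directly into code. This is a minimal sketch assuming NumPy; the function name `simulate_mixture`, the sampler interface, and the two example component distributions are hypothetical choices for illustration.

```python
import numpy as np

def simulate_mixture(n_samples, probs, samplers, rng):
    """Draw n_samples values from a mixture with the given component probabilities.

    Each entry of `samplers` is a function (rng, m) -> array of m draws from F_i.
    """
    # Categorical draw: which component each value comes from.
    component = rng.choice(len(probs), size=n_samples, p=probs)
    # m_i: how many values were assigned to each component.
    counts = np.bincount(component, minlength=len(probs))
    sample = np.empty(n_samples)
    for i, sampler in enumerate(samplers):
        # Generate m_i draws from F_i and place them at the assigned positions.
        sample[component == i] = sampler(rng, counts[i])
    return sample, component

rng = np.random.default_rng(3)
# Illustrative components: a normal and an exponential distribution.
samplers = [lambda r, m: r.normal(-3.0, 1.0, m),
            lambda r, m: r.exponential(2.0, m)]
sample, component = simulate_mixture(10_000, [0.3, 0.7], samplers, rng)
```

Returning the component labels alongside the sample is optional, but it is convenient for checking an estimation method against the known memberships.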
The SOCR demonstrations of EM and Mixture Modeling
Interactive Mixture of Normal-distributions Java applet
Mixture modelling page (and the Snob program for Minimum Message Length (MML) applied to finite mixture models), maintained by D.L. Dowe
PyMix - Python Mixture Package, algorithms and data structures for a broad variety of mixture model based data mining applications in Python
em - A Python package for learning Gaussian Mixture Models with Expectation Maximization, currently packaged with SciPy
GMM.m - Matlab code for GMM implementation



