GMM
Cluster assignment z_n of point x_n
- described by a categorical distribution with mixing weights π_1, …, π_K
Each cluster has its own Gaussian distribution
- class-conditional density N(x | μ_j, Σ_j)
Gaussian mixture distribution:
p(x) = Σ_j π_j N(x | μ_j, Σ_j)
Modelling the data points as independent, the aim is to find the parameters θ that maximise the likelihood
p(X | θ) = Π_n Σ_j π_j N(x_n | μ_j, Σ_j)
where θ = {π_j, μ_j, Σ_j} collects the mixing weights, means and covariances.
Try the usual log trick: maximise log p(X | θ) = Σ_n log Σ_j π_j N(x_n | μ_j, Σ_j); the sum inside the log means there is no closed-form maximiser.
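The mixture log-likelihood above can still be evaluated numerically. A minimal NumPy sketch, restricted to 1-D data for simplicity (the function name is my own); the log-sum-exp trick keeps the inner sum numerically stable:

```python
import numpy as np

def gmm_log_likelihood(X, weights, means, variances):
    """log p(X) = sum_n log sum_j pi_j N(x_n | mu_j, var_j), for 1-D data.

    Computed via log-sum-exp to avoid underflow in the inner sum.
    """
    X = np.asarray(X, dtype=float)[:, None]              # shape (N, 1)
    weights = np.asarray(weights, dtype=float)           # pi_j, shape (K,)
    means = np.asarray(means, dtype=float)               # mu_j, shape (K,)
    variances = np.asarray(variances, dtype=float)       # var_j, shape (K,)
    # log N(x_n | mu_j, var_j) for every (n, j) pair, shape (N, K)
    log_norm = -0.5 * (np.log(2 * np.pi * variances) + (X - means) ** 2 / variances)
    log_joint = np.log(weights) + log_norm               # log pi_j + log N(...)
    m = log_joint.max(axis=1, keepdims=True)             # log-sum-exp stabiliser
    return float(np.sum(m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))))
```

For a single standard-normal component, this reduces to the usual Gaussian log-density.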
EM
MLE is a frequentist principle that suggests that given a dataset, the "best" parameters to use are the ones that maximise the probability of the data
- MLE is a way to formally pose the problem
EM is an algorithm
- EM is a way to solve the problem posed by MLE
- Especially convenient under unobserved data; the MLE can also be found by other methods such as gradient descent
EM is a general approach, goes beyond GMMs
- Purpose: implement MLE under latent variables Z
Variables in GMMs:
- Variables: point locations X and cluster assignments Z
- Parameters θ: the cluster locations and scales
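The variables above define a generative model: draw each latent assignment z_n from the categorical distribution over clusters, then draw the observed x_n from that cluster's Gaussian. A small NumPy sketch of this two-stage sampling, again in 1-D (the function name and seed handling are my own):

```python
import numpy as np

def sample_gmm(n, weights, means, stds, seed=0):
    """Draw n points from a 1-D GMM: latent cluster z_n first,
    then observed location x_n | z_n."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    # Latent variables Z: which cluster each datum comes from
    z = rng.choice(len(weights), size=n, p=weights)
    # Observed variables X: Gaussian draw from the assigned cluster
    x = rng.normal(np.asarray(means, dtype=float)[z],
                   np.asarray(stds, dtype=float)[z])
    return x, z
```

In clustering we only get to see x; EM's job is to recover the parameters despite z being hidden.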
What is EM really doing?
- Coordinate ascent on a lower bound on the log-likelihood
  - M-step: ascent in the model parameters
  - E-step: ascent in the latent-variable distribution, making the bound tight against the marginal likelihood
- Each step moves towards a local optimum
  - Can get stuck, so may need random restarts
EM for fitting the GMM
Initialisation step:
- Initialise K clusters: weights π_j, means μ_j and scales Σ_j for each cluster j.
Iteration step:
- Estimate the cluster responsibilities of each datum (Expectation)
- Re-estimate the cluster parameters for each cluster j (Maximisation)
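The initialise/iterate loop above can be sketched end to end in NumPy. This is an illustrative 1-D implementation under my own choices (quantile-based initialisation, a fixed iteration count, no convergence check), not a production fitter:

```python
import numpy as np

def em_gmm(X, K, n_iter=100):
    """EM for a 1-D Gaussian mixture.

    E-step: responsibilities gamma_nj proportional to pi_j N(x_n | mu_j, var_j).
    M-step: re-estimate pi_j, mu_j, var_j from the responsibilities.
    """
    X = np.asarray(X, dtype=float)
    N = len(X)
    # Initialisation: spread the K means over the data via quantiles,
    # shared variance, uniform mixing weights (one simple choice of init)
    mu = np.quantile(X, (np.arange(K) + 0.5) / K)
    var = np.full(K, X.var())
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each cluster for each datum
        dens = pi * np.exp(-0.5 * (X[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)                    # (N, K)
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted re-estimates of the parameters
        Nk = gamma.sum(axis=0)                               # effective counts
        pi = Nk / N
        mu = (gamma * X[:, None]).sum(axis=0) / Nk
        var = (gamma * (X[:, None] - mu) ** 2).sum(axis=0) / Nk
    return pi, mu, var
```

On well-separated data the recovered means land close to the true cluster centres; on harder data, random restarts (as noted above) guard against poor local optima.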