# Expectation–maximization algorithm

**expectation–maximization algorithm, EM algorithm, expectation maximization, EM**

In statistics, an expectation–maximization (EM) algorithm is an iterative method to find maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables.
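Concretely, writing X for the observed data, Z for the latent variables, and θ for the parameters, each iteration alternates two steps (standard notation, shown here for illustration):

```latex
\begin{aligned}
\text{E-step:}\quad & Q(\theta \mid \theta^{(t)}) = \operatorname{E}_{Z \mid X,\, \theta^{(t)}}\!\left[\log L(\theta; X, Z)\right] \\
\text{M-step:}\quad & \theta^{(t+1)} = \operatorname*{arg\,max}_{\theta}\; Q(\theta \mid \theta^{(t)})
\end{aligned}
```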

## Related Articles

### C. F. Jeff Wu

**C.F. Jeff Wu, Chien-Fu Jeff Wu, C.-F. Wu**

The convergence analysis of the Dempster–Laird–Rubin algorithm was flawed and a correct convergence analysis was published by C. F. Jeff Wu in 1983.

He is known for his work on the convergence of the EM algorithm, resampling methods such as the bootstrap and jackknife, and industrial statistics, including design of experiments, and robust parameter design (Taguchi methods).

### Per Martin-Löf

**Per Martin-Löf, Martin-Löf**

A very detailed treatment of the EM method for exponential families was published by Rolf Sundberg in his thesis and several papers following his collaboration with Per Martin-Löf and Anders Martin-Löf.

The research of Anders and Per Martin-Löf has influenced statistical theory, especially concerning exponential families, the expectation-maximization method for missing data, and model selection.

### Baum–Welch algorithm

**Baum-Welch algorithm, Baum Welch algorithm**

In natural language processing, two prominent instances of the algorithm are the Baum–Welch algorithm for hidden Markov models, and the inside-outside algorithm for unsupervised induction of probabilistic context-free grammars.

In electrical engineering, computer science, statistical computing and bioinformatics, the Baum–Welch algorithm is a special case of the EM algorithm used to find the unknown parameters of a hidden Markov model (HMM).
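As a sketch of the machinery Baum–Welch builds on, the forward recursion below computes the likelihood of an observation sequence under a small hypothetical HMM (all parameter values are illustrative, not taken from any source); Baum–Welch's E-step combines these forward probabilities with a symmetric backward pass to re-estimate the transition and emission matrices.

```python
import numpy as np

# Hypothetical 2-state, 2-symbol HMM (illustrative numbers only)
pi = np.array([0.6, 0.4])        # initial state distribution
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])       # transition probabilities
B = np.array([[0.5, 0.5],
              [0.1, 0.9]])       # emission probabilities (state x symbol)

def forward(obs):
    """alpha[t, i] = P(o_1..o_t, state_t = i); summing the last row
    over states gives the likelihood of the whole sequence."""
    alpha = np.zeros((len(obs), len(pi)))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

obs = [0, 1, 1]
likelihood = forward(obs)[-1].sum()
```

The recursion costs O(T·N²) instead of the O(N^T) of summing over all state paths, which is what makes EM for HMMs tractable.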

### Arthur P. Dempster

**Arthur Dempster, Arthur Pentland Dempster**

The EM algorithm was explained and given its name in a classic 1977 paper by Arthur Dempster, Nan Laird, and Donald Rubin.

Among his contributions to statistics are the Dempster–Shafer theory and the expectation-maximization (EM) algorithm.

### Mixture model

**Gaussian mixture model, mixture models, GMM**

For example, a mixture model can be described more simply by assuming that each observed data point has a corresponding unobserved data point, or latent variable, specifying the mixture component to which each data point belongs.

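A minimal sketch of this idea for a one-dimensional two-component Gaussian mixture, with hand-made data and starting values chosen purely for illustration: the E-step computes each point's responsibility (the posterior probability of its latent component), and the M-step re-estimates the weights, means, and variances from those responsibilities.

```python
import numpy as np

# Hand-made 1-D data from two well-separated clusters (illustrative only)
x = np.array([-1.2, -1.0, -0.8, 0.0, 3.8, 4.0, 4.2, 5.0])

# Initial weights, means, and variances (also illustrative)
w = np.array([0.5, 0.5])
mu = np.array([0.0, 3.0])
var = np.array([1.0, 1.0])

def normal_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

for _ in range(50):
    # E-step: responsibility r[n, k] = P(component k | x_n, current params)
    dens = w * normal_pdf(x[:, None], mu, var)      # shape (n, 2)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: maximize the expected complete-data log-likelihood
    nk = r.sum(axis=0)
    w = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
```

If the latent component labels were observed, the M-step formulas would be ordinary per-group sample statistics; EM replaces the unobserved labels with their posterior probabilities.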

### Maximum a posteriori estimation

**maximum a posteriori, MAP, posterior mode**

In statistics, an expectation–maximization (EM) algorithm is an iterative method to find maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables.

### Nan Laird

**Nan Laird, Laird NM**

The EM algorithm was explained and given its name in a classic 1977 paper by Arthur Dempster, Nan Laird, and Donald Rubin.

Laird is well known for many seminal papers in biostatistics applications and methods, including the Expectation-maximization algorithm.

### Missing data

**missing values, missing at random, incomplete data**

That is, either missing values exist among the data, or the model can be formulated more simply by assuming the existence of further unobserved data points. Given the statistical model which generates a set **X** of observed data, a set of unobserved latent data or missing values **Z**, and a vector of unknown parameters, along with a likelihood function, the maximum likelihood estimate (MLE) of the unknown parameters is determined by maximizing the marginal likelihood of the observed data.

The expectation-maximization algorithm is an approach in which values of the statistics which would be computed if a complete dataset were available are estimated (imputed), taking into account the pattern of missing data.

### Cluster analysis

**clustering, data clustering, cluster**

EM is frequently used for data clustering in machine learning and computer vision.

Third, it can be seen as a variation of model-based clustering, with Lloyd's algorithm as a variation of the expectation–maximization algorithm for this model.
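Lloyd's algorithm makes that correspondence concrete: the assignment step is a hard version of EM's E-step and the centroid update is a degenerate M-step. A toy sketch with made-up points and seeds:

```python
import numpy as np

# Made-up 2-D points in two obvious groups, with illustrative seeding
pts = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]])
centers = pts[[0, 3]].astype(float)

for _ in range(10):
    # "E-step": hard-assign each point to its nearest center
    d = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    labels = d.argmin(axis=1)
    # "M-step": move each center to the mean of its assigned points
    centers = np.array([pts[labels == k].mean(axis=0) for k in range(2)])
```

Replacing the hard argmin with soft responsibilities, and adding per-cluster covariances and weights, recovers EM for a Gaussian mixture.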

### Donald Rubin

**Don Rubin, Donald B. Rubin**

The EM algorithm was explained and given its name in a classic 1977 paper by Arthur Dempster, Nan Laird, and Donald Rubin.

### Marginal likelihood

**evidence, model evidence, Bayesian model evidence**

Given the statistical model which generates a set **X** of observed data, a set of unobserved latent data or missing values **Z**, and a vector of unknown parameters, along with a likelihood function, the maximum likelihood estimate (MLE) of the unknown parameters is determined by maximizing the marginal likelihood of the observed data.

In other cases, some kind of numerical integration method is needed, either a general method such as Gaussian integration or a Monte Carlo method, or a method specialized to statistical problems such as the Laplace approximation, Gibbs/Metropolis sampling, or the EM algorithm.
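As a toy illustration of the Monte Carlo option (model and numbers invented for this sketch): for θ ~ N(0, 1) and y | θ ~ N(θ, 1), the marginal likelihood of a single observation can be estimated by averaging the conditional density over prior draws, and checked against the closed form y ~ N(0, 2).

```python
import numpy as np

rng = np.random.default_rng(0)
y = 1.3                                   # single observation (illustrative)

# Prior draws theta ~ N(0, 1); evidence = E_theta[ p(y | theta) ]
theta = rng.normal(0.0, 1.0, size=200_000)
cond = np.exp(-(y - theta) ** 2 / 2) / np.sqrt(2 * np.pi)
mc_evidence = cond.mean()                 # Monte Carlo estimate of p(y)

# Closed form: marginally y ~ N(0, 2)
exact = np.exp(-y ** 2 / 4) / np.sqrt(4 * np.pi)
```

In conjugate models like this one the integral is available exactly; the Monte Carlo average is the fallback when it is not.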

### Latent variable

**latent variables, latent, hidden variables**

In statistics, an expectation–maximization (EM) algorithm is an iterative method to find maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables.

### Hidden Markov model

**hidden Markov models, HMM, Poisson hidden Markov model**

In natural language processing, two prominent instances of the algorithm are the Baum–Welch algorithm for hidden Markov models, and the inside-outside algorithm for unsupervised induction of probabilistic context-free grammars.

The Baum–Welch algorithm is a special case of the expectation-maximization algorithm.

### MM algorithm

**MM**

It is also possible to consider the EM algorithm as a subclass of the MM (Majorize/Minimize or Minorize/Maximize, depending on context) algorithm, and therefore use any machinery developed in the more general case.

The expectation–maximization algorithm can be treated as a special case of the MM algorithm.
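The minorization at work is the standard lower bound on the observed-data log-likelihood, where H denotes entropy:

```latex
\log p(X \mid \theta)
\;\ge\;
\underbrace{\operatorname{E}_{Z \mid X,\, \theta^{(t)}}\!\left[\log p(X, Z \mid \theta)\right]}_{Q(\theta \mid \theta^{(t)})}
\;+\; H\!\left(p(Z \mid X, \theta^{(t)})\right),
\qquad \text{with equality at } \theta = \theta^{(t)}.
```

The E-step thus builds a surrogate that touches the log-likelihood at the current iterate, and the M-step's maximization of Q cannot decrease it; this is exactly the minorize/maximize pattern.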

### Variational Bayesian methods

**variational Bayes, variational methods, variational free energy**

If using the factorized Q approximation as described above (variational Bayes), solving can iterate over each latent variable (now including θ) and optimize them one at a time.

Variational Bayes can be seen as an extension of the EM (expectation-maximization) algorithm from maximum a posteriori estimation (MAP estimation) of the single most probable value of each parameter to fully Bayesian estimation which computes (an approximation to) the entire posterior distribution of the parameters and latent variables.
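The extension rests on the usual decomposition of the log evidence: for any distribution q over the latent quantities (in variational Bayes, both Z and the parameters θ),

```latex
\log p(X)
\;=\;
\underbrace{\operatorname{E}_{q}\!\left[\log \frac{p(X, Z, \theta)}{q(Z, \theta)}\right]}_{\text{ELBO}}
\;+\;
\operatorname{KL}\!\left(q(Z, \theta)\,\middle\|\,p(Z, \theta \mid X)\right).
```

Maximizing the ELBO over a factorized family q(Z, θ) = q_Z(Z) q_θ(θ) yields the coordinate updates described above, whereas EM corresponds to collapsing q_θ to a point mass at a single parameter value.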

### Mixed model

**mixed effects models, linear mixed model**

EM is frequently used for parameter estimation of mixed models, notably in quantitative genetics.

One method used to fit such mixed models is that of the EM algorithm where the variance components are treated as unobserved nuisance parameters in the joint likelihood.

### Inside–outside algorithm

**Inside-Outside algorithm**

In natural language processing, two prominent instances of the algorithm are the Baum–Welch algorithm for hidden Markov models, and the inside-outside algorithm for unsupervised induction of probabilistic context-free grammars.

It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).

### Probabilistic context-free grammar

**stochastic context-free grammar, SCFGs, weighted context-free grammar**

The inside-outside algorithm is a recursive dynamic programming scoring algorithm that can be used within the expectation–maximization paradigm.

### Yasuo Matsuyama

**Matsuyama, Yasuo**

Thus, the α-EM algorithm by Yasuo Matsuyama is an exact generalization of the log-EM algorithm.

Because the alpha-logarithm includes the usual logarithm, the alpha-EM algorithm contains the EM algorithm (more precisely, the log-EM algorithm).

### Kalman filter

**Kalman filtering, unscented Kalman filter, information filter**

A Kalman filter is typically used for on-line state estimation and a minimum-variance smoother may be employed for off-line or batch state estimation.

Expectation-maximization algorithms may be employed to calculate approximate maximum likelihood estimates of unknown state-space parameters within minimum-variance filters and smoothers.

### Positron emission tomography

**PET, PET scan**

The EM algorithm (and its faster variant ordered subset expectation maximization) is also widely used in medical image reconstruction, especially in positron emission tomography and single photon emission computed tomography.

Iterative expectation-maximization algorithms such as the Shepp–Vardi algorithm are used for image reconstruction in PET.
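A minimal sketch of a Shepp–Vardi-style ML-EM update for a toy linear emission model (system matrix, data, and sizes invented for illustration): each iteration multiplies the current nonnegative image by a back-projected ratio of measured to predicted counts, which keeps the estimate nonnegative and monotonically increases the Poisson likelihood.

```python
import numpy as np

# Toy system: y ≈ A @ x with nonnegative x (all values illustrative)
A = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.7]])
x_true = np.array([2.0, 3.0])
y = A @ x_true                 # noise-free "measured counts"

x = np.ones(2)                 # positive initial image estimate
s = A.sum(axis=0)              # sensitivity of each unknown
for _ in range(500):
    # ML-EM multiplicative update: back-project measured/predicted ratio
    x = x / s * (A.T @ (y / (A @ x)))
```

OSEM accelerates this scheme by applying the same update to ordered subsets of the measurement rows in turn rather than to all of them at once.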

### Ordered subset expectation maximization

**OSEM, Osem (mathematics)**

The EM algorithm (and its faster variant ordered subset expectation maximization) is also widely used in medical image reconstruction, especially in positron emission tomography and single photon emission computed tomography.

The OSEM method is related to the expectation maximization (EM) method of statistics.

### Total absorption spectroscopy

**total absorption spectrometer**

Then the procedure to find the feedings is iterative: using the expectation-maximization algorithm to solve the inverse problem, the feedings are extracted; if they do not reproduce the experimental data, the initial guess of the branching ratios is wrong and has to be changed (it is also possible to adjust other parameters of the analysis).

### Compound probability distribution

**compound distribution, mixture, compounding**

Parameter estimation (maximum-likelihood or maximum-a-posteriori estimation) within a compound distribution model may sometimes be simplified by utilizing the EM-algorithm.

### Mixture distribution

**mixture density, mixture, density mixture**