Algorithms for calculating variance

computational algorithms, numerically stable algorithms, numerically stable alternatives, parallel algorithm
Algorithms for calculating variance play a major role in computational statistics.
30 Related Articles

Variance

sample variance, population variance, variability
A key difficulty in the design of good algorithms for this problem is that formulas for the variance may involve sums of squares, which can lead to numerical instability as well as to arithmetic overflow when dealing with large values.
There exist numerically stable alternatives.
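The instability described above can be seen in a short sketch. The function names `naive_variance` and `two_pass_variance` and the test data are illustrative assumptions, not taken from the article; the two-pass form is one commonly used stable alternative, not necessarily the one the article has in mind.

```python
def naive_variance(data):
    """Sample variance from the sum and the sum of squares (single pass).

    The subtraction below can cancel catastrophically when the mean is
    large relative to the spread of the data.
    """
    n = len(data)
    total = sum(data)
    total_sq = sum(x * x for x in data)
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(data):
    """Sample variance computed from deviations about the mean (two passes)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


small = [4.0, 7.0, 13.0, 16.0]
shifted = [x + 1e9 for x in small]  # same spread, much larger mean

print(naive_variance(small), two_pass_variance(small))
print(naive_variance(shifted), two_pass_variance(shifted))
```

Both data sets have the same true sample variance of 30. On the shifted data the two-pass version still returns that value, while the naive version is free to drift because the sums of squares are of order 10^18.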

Numerical stability

numerically stable, numerical instability, numerically unstable
A key difficulty in the design of good algorithms for this problem is that formulas for the variance may involve sums of squares, which can lead to numerical instability as well as to arithmetic overflow when dealing with large values.
Algorithms for calculating variance

Online algorithm

online, offline, online algorithms
For such an online algorithm, a recurrence relation is required between quantities from which the required statistics can be calculated in a numerically stable fashion.
Algorithms for calculating variance

Algebraic formula for the variance

computational formula for the variance, computational formula, formula for the covariance
The context here is that of deriving algebraic expressions for the theoretical variance of a random variable, in contrast to estimating the variance of a population from sample data, for which there are special considerations in implementing computational algorithms.

Covariance

covariant, covariation, covary
One can also find there similar formulas for covariance.
Numerically stable algorithms should be preferred in this case.
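As an illustration of what a numerically stable covariance computation can look like, here is a minimal two-pass sketch. The function name `two_pass_covariance` and the choice of the two-pass form are assumptions made for this example; the article may present other variants.

```python
def two_pass_covariance(xs, ys):
    """Sample covariance computed from deviations about the two means."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Second pass: accumulate products of deviations, which stay small
    # even when the raw values are large.
    co_moment = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return co_moment / (n - 1)


print(two_pass_covariance([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

Subtracting the means before multiplying avoids the difference of two large, nearly equal sums that the sum-of-products form would require.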

Squared deviations from the mean

squared deviations, sum of squared deviations, sum of squared differences
Algorithms for calculating variance

Kahan summation algorithm

compensated summation
Techniques such as compensated summation can be used to combat this error to a degree.
* Algorithms for calculating variance, which includes stable summation
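A minimal sketch of compensated (Kahan) summation follows. The function name `kahan_sum` is an assumption for illustration; the running compensation term follows the usual textbook description of the technique.

```python
def kahan_sum(values):
    """Sum floats while carrying a running compensation for lost low-order bits."""
    total = 0.0
    compensation = 0.0  # estimate of the rounding error accumulated so far
    for value in values:
        y = value - compensation         # apply the correction from the last step
        t = total + y                    # low-order bits of y may be lost here
        compensation = (t - total) - y   # recover what was lost (algebraically zero)
        total = t
    return total


print(kahan_sum([0.1] * 10), sum([0.1] * 10))
```

The same accumulation pattern can be applied to the sums that appear inside a variance computation.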

Computational statistics

statistical computing, scientific computing and statistical practice, computational methods in statistics
Algorithms for calculating variance play a major role in computational statistics.

Algorithm

algorithms, computer algorithm, algorithm design
A key difficulty in the design of good algorithms for this problem is that formulas for the variance may involve sums of squares, which can lead to numerical instability as well as to arithmetic overflow when dealing with large values.

Integer overflow

overflow, arithmetic overflow, overflows
A key difficulty in the design of good algorithms for this problem is that formulas for the variance may involve sums of squares, which can lead to numerical instability as well as to arithmetic overflow when dealing with large values.

Statistical population

population, subpopulation, subpopulations
A formula for calculating the variance of an entire population of size N is:
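In the sum-of-squares form that the surrounding snippets discuss, the standard expression can be written as below. The symbol x_i for the individual values is a notational assumption.

```latex
\sigma^2 = \frac{\sum_{i=1}^{N} x_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} x_i\right)^2}{N}
```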

Bessel's correction

Bessel-corrected
Using Bessel's correction to calculate an unbiased estimate of the population variance from a finite sample of n observations, the formula is:
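With the same sum-of-squares form, the Bessel-corrected estimate divides by n - 1 rather than n. The symbols x_i and s^2 are notational assumptions.

```latex
s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}{n - 1}
```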

Bias of an estimator

unbiased, unbiased estimator, bias
Using Bessel's correction to calculate an unbiased estimate of the population variance from a finite sample of n observations, the formula is:

Sample (statistics)

sample, samples, statistical sample
Using Bessel's correction to calculate an unbiased estimate of the population variance from a finite sample of n observations, the formula is:

Significant figures

precision, significant digits, significant digit
Because the sum of squares and the squared sum divided by n can be very similar numbers, cancellation can leave the result with much less precision than the inherent precision of the floating-point arithmetic used to perform the computation.

Floating-point arithmetic

floating point, floating-point, floating-point number
Because the sum of squares and the squared sum divided by n can be very similar numbers, cancellation can leave the result with much less precision than the inherent precision of the floating-point arithmetic used to perform the computation.

Assumed mean

However, the algorithm can be improved by adopting the method of the assumed mean.

Invariant (mathematics)

invariant, invariants, invariance
Namely, the variance is invariant with respect to changes in a location parameter: Var(X - K) = Var(X) for any constant K.

Location parameter

location, location model, shift parameter
Namely, the variance is invariant with respect to changes in a location parameter: Var(X - K) = Var(X) for any constant K.

Python (programming language)

Python, Python programming language, Python 3
If we take just the first sample as K, the algorithm can be written in the Python programming language as
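The code itself is not reproduced in this snippet. A minimal sketch of the shifted-data approach, assuming the function name `shifted_data_variance` and that the sample variance (divisor n - 1) is wanted, might look like this:

```python
def shifted_data_variance(data):
    """Sample variance using the first element as the shift K.

    Shifting every value by K leaves the variance unchanged but keeps
    the accumulated sums small, which limits cancellation.
    """
    if len(data) < 2:
        return 0.0  # assumption: report 0.0 when the variance is undefined
    k = data[0]
    n = 0
    sum_shifted = 0.0
    sum_shifted_sq = 0.0
    for x in data:
        n += 1
        sum_shifted += x - k
        sum_shifted_sq += (x - k) ** 2
    return (sum_shifted_sq - sum_shifted ** 2 / n) / (n - 1)


print(shifted_data_variance([1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]))
```

Any K close to the mean would serve; the first sample is simply a value that is available before the rest of the data has been seen.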

Recurrence relation

difference equation, difference equations, recurrence
For such an online algorithm, a recurrence relation is required between quantities from which the required statistics can be calculated in a numerically stable fashion.

Mean

mean value, population mean, average
The following formulas can be used to update the mean and (estimated) variance of the sequence, for an additional element x_n. Here, x̄_n denotes the sample mean of the first n samples (x_1, ..., x_n), s²_n their sample variance, and σ²_n their population variance.
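The update formulas are not reproduced in this snippet. One widely used recurrence of this kind is commonly attributed to Welford; the sketch below assumes that method, and the names `welford_update`, `welford_finalize`, and the running quantity `m2` (the sum of squared deviations from the current mean) are illustrative.

```python
def welford_update(state, x):
    """Fold one new observation x into (count, mean, m2)."""
    count, mean, m2 = state
    count += 1
    delta = x - mean          # deviation from the old mean
    mean += delta / count
    delta2 = x - mean         # deviation from the updated mean
    m2 += delta * delta2
    return count, mean, m2


def welford_finalize(state):
    """Return (mean, population variance, sample variance)."""
    count, mean, m2 = state
    if count < 2:
        return mean, float("nan"), float("nan")
    return mean, m2 / count, m2 / (count - 1)


state = (0, 0.0, 0.0)
for value in [4.0, 7.0, 13.0, 16.0]:
    state = welford_update(state, value)
print(welford_finalize(state))
```

Only three numbers are kept between updates, so the data can be processed as a stream without storing earlier observations.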

Loss of significance

catastrophic cancellation, cancellation, differences of similar values
Because the sum of squares and the squared sum divided by n can be very similar numbers, cancellation can leave the result with much less precision than the inherent precision of the floating-point arithmetic used to perform the computation.

Advanced Vector Extensions

AVX, AVX2, AVX instruction set
This can be generalized to allow parallelization with AVX, with GPU
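The snippet does not say what is being generalized. Assuming it refers to the pairwise merge of partial results usually credited to Chan et al., a minimal sketch is given below; the names `summarize` and `merge` and the `(count, mean, m2)` tuple layout are illustrative. Each partial summary could come from a separate SIMD lane, GPU thread block, or machine.

```python
def summarize(chunk):
    """Compute (count, mean, m2) for one chunk; m2 is the sum of squared deviations."""
    n = len(chunk)
    if n == 0:
        return 0, 0.0, 0.0
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return n, mean, m2


def merge(a, b):
    """Combine two partial summaries into the summary of their union."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


n, mean, m2 = merge(summarize([4.0, 7.0]), summarize([13.0, 16.0]))
print(mean, m2 / (n - 1))
```

Because `merge` is associative up to rounding, the partial summaries can be combined in a reduction tree in any order.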

Graphics processing unit

GPU, GPUs, graphics processor
This can be generalized to allow parallelization with AVX, with GPU