Unbiased estimation of standard deviation

sample standard deviation, anti-biasing, bias, unbiased estimation of the standard deviation of the normal distribution
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value.
Related Articles

Standard deviation

standard deviations, sample standard deviation, SD
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value. In statistics, the standard deviation of a population of numbers is often estimated from a random sample drawn from the population.
Unlike in the case of estimating the population mean, for which the sample mean is a simple estimator with many desirable properties (unbiased, efficient, maximum likelihood), there is no single estimator for the standard deviation with all these properties, and unbiased estimation of standard deviation is a very technically involved problem.

Bias of an estimator

unbiased, unbiased estimator, bias
It also provides an example where imposing the requirement for unbiased estimation might be seen as just adding inconvenience, with no real benefit. One way of seeing that this is a biased estimator of the standard deviation of the population is to start from the result that s² is an unbiased estimator for the variance σ² of the underlying population if that variance exists and the sample values are drawn independently with replacement.
A biased estimator may be used for various reasons: because an unbiased estimator does not exist without further assumptions about a population or is difficult to compute (as in unbiased estimation of standard deviation); because an estimator is median-unbiased but not mean-unbiased (or the reverse); because a biased estimator gives a lower value of some loss function (particularly mean squared error) compared with unbiased estimators (notably in shrinkage estimators); or because in some cases being unbiased is too strong a condition, and the only unbiased estimators are not useful.
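
As a rough illustration (not taken from the article), the following Python sketch simulates many normal samples with σ = 1 and compares the average of s² with σ² and the average of s with σ; the sample size n = 10, the replication count, and the seed are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)
    n, reps, sigma = 10, 200_000, 1.0

    samples = rng.normal(0.0, sigma, size=(reps, n))
    s2 = samples.var(axis=1, ddof=1)   # sample variance with the n - 1 denominator
    s = np.sqrt(s2)                    # sample standard deviation

    print("mean of s^2:", s2.mean())   # close to sigma^2 = 1.0 (s^2 is unbiased)
    print("mean of s  :", s.mean())    # about 0.97, i.e. below sigma (s is biased low)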

Variance

sample variance, population variance, variability
One way of seeing that this is a biased estimator of the standard deviation of the population is to start from the result that s² is an unbiased estimator for the variance σ² of the underlying population if that variance exists and the sample values are drawn independently with replacement.
Four common values for the denominator are n, n − 1, n + 1, and n − 1.5: n is the simplest (the population variance of the sample), n − 1 eliminates bias in the variance estimate, n + 1 minimizes mean squared error for the normal distribution, and n − 1.5 mostly eliminates bias in the resulting estimate of the standard deviation for the normal distribution.
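
A hedged sketch of that comparison for normal data: it estimates, by simulation, the bias and mean squared error of the variance estimate under each of the four denominators, and also the bias of the corresponding standard-deviation estimate. Sample size and replication count are illustrative.

    import numpy as np

    rng = np.random.default_rng(1)
    n, reps, sigma2 = 10, 200_000, 1.0
    x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)  # sum of squared deviations

    for d in (n, n - 1, n + 1, n - 1.5):
        v = ss / d                                   # variance estimate with denominator d
        print(f"denominator {d:>5}: "
              f"variance bias {v.mean() - sigma2:+.4f}, "
              f"variance MSE {((v - sigma2) ** 2).mean():.4f}, "
              f"SD bias {np.sqrt(v).mean() - np.sqrt(sigma2):+.4f}")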

Chi distribution

Chi, Chi-distributed, χ distributed
To derive the correction, note that for normally distributed X, Cochran's theorem implies that (n − 1)s²/σ² has a chi-squared distribution with n − 1 degrees of freedom, and thus its square root, √(n − 1) s/σ, has a chi distribution with n − 1 degrees of freedom.
Accordingly, the mean of that chi distribution, divided by √(n − 1), yields the correction factor c₄(n) used in the unbiased estimation of the standard deviation of the normal distribution; dividing s by c₄(n) gives an unbiased estimator of σ.
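
The resulting factor c₄(n) equals √(2/(n − 1)) · Γ(n/2) / Γ((n − 1)/2). A minimal sketch, assuming SciPy is available, evaluates it and checks by simulation that s / c₄(n) is approximately unbiased for σ:

    import numpy as np
    from scipy.special import gamma

    def c4(n):
        """Correction factor: E[s] = c4(n) * sigma for a normal sample of size n."""
        return np.sqrt(2.0 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

    rng = np.random.default_rng(2)
    n, reps, sigma = 10, 200_000, 1.0
    s = rng.normal(0.0, sigma, size=(reps, n)).std(axis=1, ddof=1)

    print("c4(10)           :", c4(n))               # about 0.9727
    print("mean of s        :", s.mean())            # about c4(10) * sigma
    print("mean of s / c4(n):", (s / c4(n)).mean())  # close to sigma = 1.0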

Standard error

SE, standard errors, standard error of the mean
When this condition is satisfied, another result about s involving c₄(n) is that the standard error of s is σ√(1 − c₄(n)²), while the standard error of the unbiased estimator s/c₄(n) is σ√(c₄(n)⁻² − 1).
See unbiased estimation of standard deviation for further discussion.
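
As a one-step check of where those two expressions come from (using E[s²] = σ² and E[s] = c₄(n)σ for a normal sample), in LaTeX notation:

    \operatorname{Var}(s) = \operatorname{E}[s^2] - (\operatorname{E}[s])^2 = \sigma^2\left(1 - c_4(n)^2\right),
    \qquad
    \operatorname{Var}\!\left(\frac{s}{c_4(n)}\right) = \frac{\sigma^2\left(1 - c_4(n)^2\right)}{c_4(n)^2} = \sigma^2\left(c_4(n)^{-2} - 1\right),

and taking square roots gives the two standard errors quoted above.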

Sample mean and covariance

sample mean, sample covariance, sample covariance matrix
where x = {x₁, x₂, …, xₙ} is the sample (formally, realizations from a random variable X) and x̄ is the sample mean.
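
The formula this clause refers to is not reproduced in the excerpt; it is most likely the corrected sample standard deviation, which in LaTeX notation reads

    s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}.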

Bessel's correction

Bessel-corrected, Bessel corrected variance
The use of n − 1 instead of n in the formula for the sample variance is known as Bessel's correction, which corrects the bias in the estimation of the population variance, and some, but not all, of the bias in the estimation of the population standard deviation.
There is no general formula for an unbiased estimator of the population standard deviation, though there are correction factors for particular distributions, such as the normal; see unbiased estimation of standard deviation for details.
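
A minimal sketch of the distinction, assuming NumPy's ddof convention (ddof=0 gives the n denominator, ddof=1 gives n − 1):

    import numpy as np

    x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # arbitrary example data, mean 5.0

    var_n  = x.var(ddof=0)   # denominator n: 4.0
    var_n1 = x.var(ddof=1)   # Bessel-corrected, denominator n - 1: about 4.571
    sd_n1  = x.std(ddof=1)   # about 2.138; still biased low as an estimate of the population SD

    print(var_n, var_n1, sd_n1)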

Autocorrelation

autocorrelation function, serial correlation, autocorrelated
However, real-world data often does not meet this requirement; it exhibits autocorrelation (also known as serial correlation).
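
As an illustrative sketch (not from the article): for a positively autocorrelated AR(1) series, the usual n − 1 sample variance tends to underestimate the true marginal variance, because neighbouring observations move together. The parameters below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(3)
    phi, n, reps = 0.9, 50, 10_000
    true_var = 1.0 / (1.0 - phi ** 2)   # marginal variance of a stationary AR(1) with unit innovations

    est = np.empty(reps)
    for r in range(reps):
        x = np.empty(n)
        x[0] = rng.normal(0.0, np.sqrt(true_var))   # start in the stationary distribution
        for t in range(1, n):
            x[t] = phi * x[t - 1] + rng.normal()
        est[r] = x.var(ddof=1)

    print("true marginal variance :", true_var)    # about 5.26
    print("mean of sample variance:", est.mean())  # noticeably smaller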

Statistics

statistical, statistical analysis, statistician
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value. In statistics, the standard deviation of a population of numbers is often estimated from a random sample drawn from the population.

Statistical theory

statistical, statistical theories, mathematical statistics
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value.

Sample (statistics)

sample, samples, statistical sample
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value.

Statistical dispersion

dispersion, variability, spread
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value.

Statistical population

population, subpopulation, subpopulations
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value.

Expected value

expectation, expected, mean
In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value.

Statistical hypothesis testing

hypothesis testing, statistical test, statistical tests
Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the use of significance tests and confidence intervals, or by using Bayesian analysis.

Confidence interval

confidence intervals, confidence level, confidence
Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the use of significance tests and confidence intervals, or by using Bayesian analysis.

Bayesian inference

Bayesian, Bayesian analysis, Bayesian method
Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the use of significance tests and confidence intervals, or by using Bayesian analysis.

Estimation theory

parameter estimation, estimation, estimated
However, for statistical theory, it provides an exemplar problem in the context of estimation theory which is both simple to state and for which results cannot be obtained in closed form.

Sampling (statistics)

sampling, random sample, sample
In statistics, the standard deviation of a population of numbers is often estimated from a random sample drawn from the population.

Random variable

random variables, random variation, random
where x = {x₁, x₂, …, xₙ} is the sample (formally, realizations from a random variable X) and x̄ is the sample mean.

Jensen's inequality

Jensen inequality
Since the square root is a strictly concave function, it follows from Jensen's inequality that the square root of the sample variance is, on average, an underestimate of the population standard deviation.
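
Written out, the step is: because s² is unbiased for σ² and the square root is strictly concave, Jensen's inequality gives (in LaTeX notation)

    \operatorname{E}[s] = \operatorname{E}\!\left[\sqrt{s^2}\right] < \sqrt{\operatorname{E}[s^2]} = \sigma,

so the sample standard deviation is, on average, below σ whenever s² is not degenerate.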

Normal distribution

normally distributed, Gaussian distribution, normal
Much of the following relates to estimation assuming a normal distribution.

Cochran's theorem

Cochran's Q, Cochran's theorem
To derive the correction, note that for normally distributed X, Cochran's theorem implies that (n − 1)s²/σ² has a chi-squared distribution with n − 1 degrees of freedom, and thus its square root, √(n − 1) s/σ, has a chi distribution with n − 1 degrees of freedom.

Statistical process control

statistical quality control, statistical control, SPC
The table below gives numerical values of c₄ and algebraic expressions for some values of n; more complete tables may be found in most textbooks on statistical quality control.
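
The table itself is not reproduced in this excerpt, but a short sketch like the following (SciPy assumed) recovers the usual c₄ entries quoted in statistical quality control references:

    import numpy as np
    from scipy.special import gamma

    def c4(n):
        # correction factor so that E[s] = c4(n) * sigma for a normal sample of size n
        return np.sqrt(2.0 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

    for n in (2, 3, 4, 5, 10, 25):
        print(n, round(float(c4(n)), 4))   # 0.7979, 0.8862, 0.9213, 0.94, 0.9727, 0.9896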