# Statistical inference

**inference, inferential statistics, inferences, inferential, statistical, formal statistical inference, hypothesis test, inductive, inductive statistics, infer**

Statistical inference is the process of using data analysis to deduce properties of an underlying probability distribution.

### Model selection

**selecting, choose a model, comparing statistical models**

Given a hypothesis about a population, for which we wish to draw inferences, statistical inference consists of (first) selecting a statistical model of the process that generates the data and (second) deducing propositions from the model.

state, "The majority of the problems in statistical inference can be considered to be problems related to statistical modeling".

### Data analysis

**data analytics, analysis, data analyst**

Statistical inference is the process of using data analysis to deduce properties of an underlying probability distribution.

Inferential statistics includes techniques to measure relationships between particular variables.

### Statistical hypothesis testing

**hypothesis testing, statistical test, statistical tests**

One common form of statistical proposition is the rejection of a hypothesis.

A statistical hypothesis test is a method of statistical inference.
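
As an illustration of a hypothesis test used as an inference method, here is a minimal sketch of a two-sided one-sample z-test. It assumes the population standard deviation is known; the function name `one_sample_z_test`, the data, and the 0.05 significance level are all hypothetical choices for this example, not taken from the source.

```python
from math import sqrt
from statistics import NormalDist, fmean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    n = len(sample)
    # Standardised distance between the sample mean and the hypothesised mean.
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    # Probability, under H0, of a statistic at least this far from zero.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Hypothetical measurements; H0 says the population mean is 5.0.
z, p, reject = one_sample_z_test(
    [5.1, 4.9, 5.6, 5.8, 6.0, 5.7, 5.5, 5.9], mu0=5.0, sigma=0.5
)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

The conclusion here is a rejection (or non-rejection) of the hypothesis at the chosen level. When the standard deviation has to be estimated from the data, a t-test would usually be the more appropriate choice.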

### Confidence interval

**confidence intervals, confidence level, confidence**

One common form of statistical proposition is an interval estimate, e.g. a confidence interval (or set estimate), i.e. an interval constructed using a dataset drawn from a population so that, under repeated sampling of such datasets, such intervals would contain the true parameter value with the probability at the stated confidence level.

However, this argument is the same as that which shows that a so-called confidence distribution is not a valid probability distribution and, since this has not invalidated the application of confidence intervals, it does not necessarily invalidate conclusions drawn from fiducial arguments.

The principle behind confidence intervals was formulated to provide an answer to the question raised in statistical inference of how to deal with the uncertainty inherent in results derived from data that are themselves only a randomly selected subset of a population.
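
The repeated-sampling interpretation can be checked by simulation. The sketch below builds a normal-approximation interval for a mean and counts how often it covers the true value across many simulated datasets. The function names, the simulated population (mean 10, standard deviation 2), the sample size, and the seed are assumptions made for this example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    centre = fmean(sample)
    std_error = stdev(sample) / sqrt(len(sample))
    # Quantile that leaves (1 - level) / 2 probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * std_error, centre + z * std_error


def estimate_coverage(true_mean=10.0, true_sd=2.0, n=50, repeats=2000, seed=0):
    """Fraction of simulated datasets whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = mean_confidence_interval(sample)
        hits += low <= true_mean <= high
    return hits / repeats


coverage = estimate_coverage()
print(f"empirical coverage of nominal 95% intervals: {coverage:.3f}")
```

The empirical coverage should land near the nominal level. Because the interval uses a normal quantile rather than a Student's t quantile, slight under-coverage is expected for small samples.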

### Nonparametric statistics

**non-parametric, nonparametric, non-parametric statistics**

Non-parametric: Far fewer assumptions are made about the process generating the data than in parametric statistics, and they may be minimal. For example, every continuous probability distribution has a median, which may be estimated using the sample median or the Hodges–Lehmann–Sen estimator, which has good properties when the data arise from simple random sampling.

Nonparametric statistics includes both descriptive statistics and statistical inference.
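
A minimal sketch of the one-sample Hodges–Lehmann estimator follows, computed as the median of all pairwise (Walsh) averages. Conventions differ on whether each observation is paired with itself; this sketch includes those pairs, which is an assumption of the example. Strictly, the estimator targets the pseudo-median, which coincides with the median for symmetric distributions.

```python
from statistics import fmean, median


def hodges_lehmann(sample):
    """Median of the Walsh averages (x_i + x_j) / 2 over all pairs with i <= j."""
    n = len(sample)
    walsh_averages = [
        (sample[i] + sample[j]) / 2 for i in range(n) for j in range(i, n)
    ]
    return median(walsh_averages)


# Hypothetical data with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
print("mean:           ", fmean(data))
print("sample median:  ", median(data))
print("Hodges-Lehmann: ", hodges_lehmann(data))
```

On this data the mean is pulled far toward the outlier, while the sample median and the Hodges–Lehmann estimate both stay near the bulk of the observations.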

### Statistical population

**population, subpopulation, subpopulations**

Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates.

In statistical inference, a subset of the population (a statistical sample) is chosen to represent the population in a statistical analysis.

### Normal distribution

**normally distributed, normal, Gaussian**

With finite samples, approximation results measure how closely a limiting distribution approaches the statistic's sample distribution: For example, with 10,000 independent samples the normal distribution approximates (to two digits of accuracy) the distribution of the sample mean for many population distributions, by the Berry–Esseen theorem.
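
The "two digits" figure can be reproduced from the Berry–Esseen inequality, which bounds the largest gap between the distribution function of the standardised sample mean and the standard normal distribution function by C·ρ/(σ³√n), where ρ is the third absolute central moment. The sketch below is illustrative only: the constant 0.4748 is an assumption taken from published upper bounds on C, and the fair-coin population is a hypothetical choice.

```python
from math import sqrt


def berry_esseen_bound(rho, sigma, n, constant=0.4748):
    """Upper bound on sup |F_n(x) - Phi(x)| for the standardised sample mean.

    rho   -- third absolute central moment E|X - mu|^3
    sigma -- population standard deviation
    n     -- number of independent observations
    """
    return constant * rho / (sigma ** 3 * sqrt(n))


# Hypothetical population: Bernoulli(p) with p = 0.5.
p = 0.5
sigma = sqrt(p * (1 - p))
rho = p * (1 - p) * (p ** 2 + (1 - p) ** 2)

bound = berry_esseen_bound(rho, sigma, n=10_000)
print(f"maximum CDF error is at most {bound:.6f}")
```

For this population the bound with 10,000 observations is below 0.005, consistent with two-digit accuracy. Populations with a larger ratio ρ/σ³ need more observations to reach the same guarantee.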

Therefore, it may not be an appropriate model when one expects a significant fraction of outliers—values that lie many standard deviations away from the mean—and least squares and other statistical inference methods that are optimal for normally distributed variables often become highly unreliable when applied to such data.

### Statistical classification

**classification, classifier, classifiers**

One common form of statistical proposition is the clustering or classification of data points into groups.

Algorithms of this nature use statistical inference to find the best class for a given instance.

### Bayesian inference

**Bayesian, Bayesian analysis, Bayesian methods**

In Bayesian inference, randomization is also of importance: in survey sampling, use of sampling without replacement ensures the exchangeability of the sample with the population; in randomized experiments, randomization warrants a missing at random assumption for covariate information. The classical (or frequentist) paradigm, the Bayesian paradigm, and the AIC-based paradigm are summarized below.

Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available.
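
A minimal sketch of this updating step for a finite set of hypotheses: the posterior is proportional to the prior times the likelihood of the new evidence. The two-coin scenario, the hypothesis labels, and the function name `bayes_update` are hypothetical and chosen only for illustration.

```python
def bayes_update(prior, likelihood):
    """Apply Bayes' theorem: posterior(h) is proportional to prior(h) * likelihood(h)."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())  # marginal probability of the data
    return {h: weight / evidence for h, weight in unnormalised.items()}


# Two hypotheses about a coin, initially equally plausible.
prior = {"fair": 0.5, "biased": 0.5}
# Probability of observing heads under each hypothesis.
heads_likelihood = {"fair": 0.5, "biased": 0.8}

posterior = prior
for _ in range(3):  # three heads observed one after another
    posterior = bayes_update(posterior, heads_likelihood)

print(posterior)
```

Each pass uses the previous posterior as the new prior, so the probability assigned to the biased coin grows as more heads are observed.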

### Frequentist inference

**frequentist, frequentist statistics, classical statistics**

The classical (or frequentist) paradigm, the Bayesian paradigm, and the AIC-based paradigm are summarized below.

Frequentist inference is a type of statistical inference that draws conclusions from sample data by emphasizing the frequency or proportion of the data.

### Biostatistics

**biostatistician, biometry, biometrician**

For example, limiting results are often invoked to justify the generalized method of moments and the use of generalized estimating equations, which are popular in econometrics and biostatistics.

Because of that, the sampling process is very important for statistical inference.

### Akaike information criterion

**AIC, AICc, Akaike criterion**

The classical (or frequentist) paradigm, the Bayesian paradigm, and the AIC-based paradigm are summarized below.

The Akaike information criterion is named after the statistician Hirotugu Akaike, who formulated it. It now forms the basis of a paradigm for the foundations of statistics, and it is also widely used for statistical inference.
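
As a sketch of how the criterion is used in practice, AIC is computed as 2k − 2 ln L̂, where k is the number of estimated parameters and L̂ is the maximised likelihood; the candidate model with the lowest value is preferred. The two Gaussian candidate models and the data below are hypothetical.

```python
from math import log, pi
from statistics import fmean


def gaussian_log_likelihood(data, mean, variance):
    """Log-likelihood of i.i.d. normal observations."""
    n = len(data)
    squared_error = sum((x - mean) ** 2 for x in data)
    return -0.5 * n * log(2 * pi * variance) - squared_error / (2 * variance)


def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood


data = [2.1, 1.9, 2.4, 2.2, 1.8, 2.0, 2.3, 2.1]

# Model A: mean fixed at 0, variance estimated by maximum likelihood (k = 1).
var_a = fmean(x ** 2 for x in data)
aic_a = aic(gaussian_log_likelihood(data, 0.0, var_a), 1)

# Model B: mean and variance both estimated by maximum likelihood (k = 2).
mean_b = fmean(data)
var_b = fmean((x - mean_b) ** 2 for x in data)
aic_b = aic(gaussian_log_likelihood(data, mean_b, var_b), 2)

print(f"AIC (mean fixed at 0): {aic_a:.2f}")
print(f"AIC (mean estimated):  {aic_b:.2f}")
```

Model B pays a penalty for its extra parameter but fits far better, so its AIC is lower. Only differences in AIC between models fitted to the same data are meaningful, not the absolute values.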

### Randomized experiment

**randomized trial, randomization, randomized**

Similarly, results from randomized experiments are recommended by leading statistical authorities as allowing inferences with greater reliability than do observational studies of the same phenomena.

Randomization also produces ignorable designs, which are valuable in model-based statistical inference, especially Bayesian or likelihood-based.
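
Randomization also supports inference directly from the design. As an illustrative sketch (not a method described in the source), the exact randomization test below enumerates every way the observed outcomes could have been split between treatment and control and asks how often the difference in means is at least as large as the one observed. The function name and data are hypothetical.

```python
from itertools import combinations
from statistics import fmean


def randomization_test(treated, control):
    """Exact two-sided randomization p-value for a difference in means."""
    pooled = treated + control
    observed = abs(fmean(treated) - fmean(control))
    indices = range(len(pooled))
    extreme = total = 0
    # Enumerate every possible assignment of units to the treatment group.
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        difference = abs(fmean(group_a) - fmean(group_b))
        # Small tolerance guards against floating-point ties.
        if difference >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


p_value = randomization_test([12.0, 14.0, 15.0, 16.0], [8.0, 9.0, 10.0, 11.0])
print(f"randomization p-value: {p_value:.4f}")
```

Full enumeration is only practical for small experiments; larger ones would typically sample random reassignments instead.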

### P-value

**p-value, p, p-values**

The p-value is widely used in statistical hypothesis testing, specifically in null hypothesis significance testing.
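
For a concrete case, the sketch below computes an exact two-sided p-value for a binomial null hypothesis by summing the probabilities of all outcomes that are no more likely than the one observed. This is one of several conventions for two-sided binomial p-values; the function name and the coin-toss numbers are assumptions of the example.

```python
from math import comb


def binomial_p_value(successes, trials, p_null=0.5):
    """Exact two-sided p-value: total probability of outcomes no likelier than observed."""

    def pmf(k):
        return comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Relative tolerance so outcomes tied with the observed one are included.
    return sum(
        pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9)
    )


# Null hypothesis: the coin is fair. Observation: 9 heads in 10 tosses.
p_value = binomial_p_value(9, 10)
print(f"p-value: {p_value:.4f}")
```

With 9 heads in 10 tosses, the outcomes counted are 0, 1, 9, and 10 heads, giving 22/1024. In null hypothesis significance testing this value is compared against a significance level chosen in advance.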

### Information theory

**information theorist, information-theoretic, information**

AIC is founded on information theory: it offers an estimate of the relative information lost when a given model is used to represent the process that generated the data. The MDL principle has been applied in communication-coding theory in information theory, in linear regression, and in data mining.

The theory has also found applications in other areas, including statistical inference, natural language processing, cryptography, neurobiology, human vision, the evolution and function of molecular codes (bioinformatics), model selection in statistics, thermal physics, quantum computing, linguistics, plagiarism detection, pattern recognition, and anomaly detection.

### Data mining

**data-mining, datamining, data mine**

The MDL principle has been applied in communication-coding theory in information theory, in linear regression, and in data mining.

Aside from the raw analysis step, data mining also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

### Fiducial inference

**fiducial, fiducial distribution, faith**

Fiducial inference was an approach to statistical inference based on fiducial probability, also known as a "fiducial distribution".

Fiducial inference is one of a number of different types of statistical inference.

### Confidence distribution

However, this argument is the same as that which shows that a so-called confidence distribution is not a valid probability distribution and, since this has not invalidated the application of confidence intervals, it does not necessarily invalidate conclusions drawn from fiducial arguments.

In statistical inference, the concept of a confidence distribution (CD) has often been loosely referred to as a distribution function on the parameter space that can represent confidence intervals of all levels for a parameter of interest.

### Interval estimation

**interval estimate, interval, interval (statistics)**

One common form of statistical proposition is an interval estimate, e.g. a confidence interval (or set estimate), i.e. an interval constructed using a dataset drawn from a population so that, under repeated sampling of such datasets, such intervals would contain the true parameter value with the probability at the stated confidence level.

There is another approach to statistical inference, namely fiducial inference, that also considers interval estimation.

### Statistical model

**model, probabilistic model, statistical modeling**

Given a hypothesis about a population, for which we wish to draw inferences, statistical inference consists of (first) selecting a statistical model of the process that generates the data and (second) deducing propositions from the model.

More generally, statistical models are part of the foundation of statistical inference.

### Algorithmic inference

**sampling mechanism**

Algorithmic inference gathers new developments in the statistical inference methods made feasible by the powerful computing devices widely available to any data analyst.

### Linear regression

**regression coefficient, regression, multiple linear regression**

The MDL principle has been applied in communication-coding theory in information theory, in linear regression, and in data mining.

β is a (p+1)-dimensional parameter vector, where β₀ is the intercept term (if one is included in the model; otherwise β is p-dimensional). Its elements are known as effects or regression coefficients (although the latter term is sometimes reserved for the estimated effects). Statistical estimation and inference in linear regression focus on β. The elements of this parameter vector are interpreted as the partial derivatives of the dependent variable with respect to the various independent variables.
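
To make the role of the parameter vector concrete, here is a minimal sketch of ordinary least squares for the simplest case, one independent variable plus an intercept, so the estimated vector is (β₀, β₁). The function name and data are hypothetical; the closed-form expressions are the standard least-squares solution for this special case.

```python
from statistics import fmean


def fit_simple_linear_regression(x, y):
    """Ordinary least squares estimates (beta_0, beta_1) for y = beta_0 + beta_1 * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_1 = sxy / sxx               # slope: change in y per unit change in x
    beta_0 = y_bar - beta_1 * x_bar  # intercept: fitted value at x = 0
    return beta_0, beta_1


# Hypothetical noise-free data generated from y = 1 + 2x.
beta_0, beta_1 = fit_simple_linear_regression(
    [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
)
print(f"intercept = {beta_0:.3f}, slope = {beta_1:.3f}")
```

The slope estimate matches the partial-derivative interpretation: it is the fitted change in the dependent variable for a one-unit change in the independent variable.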

### Descriptive statistics

**descriptive, descriptive statistic, statistics**

Inferential statistics can be contrasted with descriptive statistics.

Descriptive statistics is distinguished from inferential statistics (or inductive statistics), in that descriptive statistics aims to summarize a sample, rather than use the data to learn about the population that the sample of data is thought to represent.

### Predictive inference

**prediction theory, predict, predictions**

Predictive inference is an approach to statistical inference that emphasizes the prediction of future observations based on past observations.
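
One simple instance of predicting a future observation from past ones is the Beta-Bernoulli posterior predictive probability. With a uniform Beta(1, 1) prior it reduces to Laplace's rule of succession, (s + 1) / (n + 2). The sketch below is an illustration under that assumed model, not a summary of predictive inference as a whole; the function and parameter names are hypothetical.

```python
from fractions import Fraction


def predictive_probability(successes, trials, prior_a=1, prior_b=1):
    """Probability that the next Bernoulli trial succeeds, given past trials.

    Assumes a Beta(prior_a, prior_b) prior on the unknown success probability.
    """
    return Fraction(successes + prior_a, trials + prior_a + prior_b)


# After 7 successes in 10 past trials, predict the 11th.
next_success = predictive_probability(7, 10)
print(f"P(next trial succeeds) = {next_success} = {float(next_success):.3f}")
```

The result is a statement about the next observation itself rather than about the underlying parameter.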

### Statistical assumption

**assumptions, model assumptions, statistical assumptions**

There are two approaches to statistical inference: model-based inference and design-based inference.