Statistics

The normal distribution, a very common probability density, useful because of the central limit theorem.
Scatter plots are used in descriptive statistics to show the observed relationships between different variables, here using the Iris flower data set.
Gerolamo Cardano, a pioneer in the mathematics of probability.
Karl Pearson, a founder of mathematical statistics.
A least squares fit: in red the points to be fitted, in blue the fitted line.
Confidence intervals: the red line is the true value of the mean in this example; the blue lines are random confidence intervals for 100 realizations.
In this graph the black line is the probability distribution of the test statistic, the critical region is the set of values to the right of the observed data point (the observed value of the test statistic), and the p-value is represented by the green area.
The confounding variable problem: X and Y may be correlated, not because there is a causal relationship between them, but because both depend on a third variable Z. Z is called a confounding factor.
gretl, an example of an open source statistical package.

Discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.

Descriptive statistics

Summary statistic that quantitatively describes or summarizes features of a collection of information; descriptive statistics (in the mass noun sense) is the process of using and analysing those statistics.

Box plot of the Michelson–Morley experiment, showing several summary statistics.

The use of descriptive and summary statistics has an extensive history and, indeed, the simple tabulation of populations and of economic data was the first way the topic of statistics appeared.

Economics

Social science that studies the production, distribution, and consumption of goods and services.

The supply and demand model describes how prices vary as a result of a balance between product availability and demand. The graph depicts an increase (that is, right-shift) in demand from D1 to D2 along with the consequent increase in price and quantity required to reach a new equilibrium point on the supply curve (S).
A 1638 painting of a French seaport during the heyday of mercantilism
The publication of Adam Smith's The Wealth of Nations in 1776 is considered to be the first formalisation of economic thought.
The Marxist critique of political economy comes from the work of German philosopher Karl Marx.
John Maynard Keynes (right) was a key theorist in economics.
Economists study trade, production and consumption decisions, such as those that occur in a traditional marketplace.
Electronic trading brings together buyers and sellers through an electronic trading platform and network to create virtual market places. Pictured: São Paulo Stock Exchange, Brazil.
An example production–possibility frontier with illustrative points marked.
A map showing the main trade routes for goods within late medieval Europe
Pollution can be a simple example of market failure. If costs of production are not borne by producers but are by the environment, accident victims or others, then prices are distorted.
Environmental scientist sampling water
A basic illustration of economic/business cycles
US unemployment rate, 1990–2021
List of countries by GDP (PPP) per capita in 2014

In economics, statistical methods such as regression analysis are common.

Probability theory

Branch of mathematics concerned with probability.

The Poisson distribution, a discrete probability distribution.
The normal distribution, a continuous probability distribution.

As a mathematical foundation for statistics, probability theory is essential to many human activities that involve quantitative analysis of data.

Causality

Influence by which one event, process, state, or object (a cause) contributes to the production of another event, process, state, or object (an effect), where the cause is partly responsible for the effect and the effect is partly dependent on the cause.

Why-Because Graph of the capsizing of the Herald of Free Enterprise.
Whereas a mediator is a factor in the causal chain (1), a confounder is a spurious factor incorrectly suggesting causation (2)
Used in management and engineering, an Ishikawa diagram shows the factors that cause the effect. Smaller arrows connect the sub-causes to major causes.

Statistics and economics usually employ pre-existing data or experimental data to infer causality by regression methods.

Consistent estimator

{T1, T2, T3, ...} is a sequence of estimators for parameter θ0, the true value of which is 4. This sequence is consistent: the estimators are getting more and more concentrated near the true value θ0; at the same time, these estimators are biased. The limiting distribution of the sequence is a degenerate random variable which equals θ0 with probability 1.

In statistics, a consistent estimator or asymptotically consistent estimator is an estimator—a rule for computing estimates of a parameter θ0—having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ0.
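
The definition above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the estimator `biased_mean` (sample mean plus 1/n), the normal data with standard deviation 1, and the tolerance `eps` are hypothetical choices made for illustration, with θ0 = 4 taken from the caption. The estimator is biased for every finite n, yet the estimated probability of missing θ0 by more than `eps` shrinks as n grows, which is what convergence in probability means.

```python
import random

THETA0 = 4.0  # true parameter value, as in the caption's example


def biased_mean(sample):
    """Hypothetical estimator: sample mean plus a bias of 1/n that vanishes as n grows."""
    n = len(sample)
    return sum(sample) / n + 1.0 / n


def miss_rate(n, eps=0.1, trials=200, seed=0):
    """Monte Carlo estimate of P(|T_n - theta0| > eps) for samples of size n."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    misses = 0
    for _ in range(trials):
        # Assumed data model: independent normal draws centred on theta0.
        sample = [rng.gauss(THETA0, 1.0) for _ in range(n)]
        if abs(biased_mean(sample) - THETA0) > eps:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, miss_rate(n))
```

The printed miss rate should fall toward zero as n increases, even though each individual estimate carries the 1/n bias.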

Errors and residuals

In statistics and optimization, errors and residuals are two closely related and easily confused measures of the deviation of an observed value of an element of a statistical sample from its "true value" (not necessarily observable).
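
A short sketch may help keep the two apart. It assumes the simplest setting, where the "true value" is a population mean: an error is the deviation of an observation from that (usually unknown) population mean, while a residual is the deviation from the sample mean, which is always computable. The sample values and the population mean of 170 below are made up for illustration.

```python
from statistics import mean


def errors(sample, true_mean):
    """Deviations from the population mean (normally unobservable)."""
    return [x - true_mean for x in sample]


def residuals(sample):
    """Deviations from the sample mean (always computable from the data)."""
    m = mean(sample)
    return [x - m for x in sample]


if __name__ == "__main__":
    sample = [168.0, 172.0, 175.0, 169.0]  # hypothetical observations
    true_mean = 170.0                      # assumed known only for this illustration
    print("errors:   ", errors(sample, true_mean))
    print("residuals:", residuals(sample))
    # Residuals about the sample mean sum to zero; errors need not.
    print("sum of residuals:", sum(residuals(sample)))
    print("sum of errors:   ", sum(errors(sample, true_mean)))
```

Because the residuals are forced to sum to zero, they are not independent of one another, whereas the errors carry no such constraint.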

Data set

Collection of data.

Some of the different types of data.

In statistics, data sets usually come from actual observations obtained by sampling a statistical population, and each row corresponds to the observations on one element of that population.

Mean

Comparison of the arithmetic mean, median, and mode of two skewed (log-normal) distributions.
Geometric visualization of the mode, median and mean of an arbitrary probability density function.

There are several kinds of mean in mathematics, especially in statistics.
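
As an illustration, the sketch below computes three commonly cited kinds of mean (arithmetic, geometric, and harmonic) with Python's standard `statistics` module. The choice of these three and the sample data are assumptions made for the example; the text above does not name specific kinds. The helper `three_means` is a hypothetical name.

```python
from statistics import mean, geometric_mean, harmonic_mean


def three_means(data):
    """Return (arithmetic, geometric, harmonic) means of positive numbers."""
    return mean(data), geometric_mean(data), harmonic_mean(data)


if __name__ == "__main__":
    data = [1.0, 2.0, 4.0]  # hypothetical positive values
    arithmetic, geometric, harmonic = three_means(data)
    print("arithmetic:", arithmetic)
    print("geometric: ", geometric)
    print("harmonic:  ", harmonic)
```

For positive data that are not all equal, the arithmetic mean is the largest of the three and the harmonic mean the smallest.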

Time series

Series of data points indexed in time order.

Time series: random data plus trend, with best-fit line and different applied filters
Tuberculosis incidence in the US, 1953–2009

Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and largely in any domain of applied science and engineering which involves temporal measurements.

Statistical population

In statistics, a population is a set of similar items or events which is of interest for some question or experiment.