Statistics

The normal distribution, a very common probability density, useful because of the central limit theorem.
Scatter plots are used in descriptive statistics to show the observed relationships between different variables, here using the Iris flower data set.
Gerolamo Cardano, a pioneer in the mathematics of probability.
Karl Pearson, a founder of mathematical statistics.
A least squares fit: in red the points to be fitted, in blue the fitted line.
Confidence intervals: the red line is the true value of the mean in this example; the blue lines are random confidence intervals for 100 realizations.
In this graph, the black line is the probability distribution of the test statistic, the critical region is the set of values to the right of the observed data point (the observed value of the test statistic), and the p-value is represented by the green area.
The confounding variable problem: X and Y may be correlated not because there is a causal relationship between them, but because both depend on a third variable Z. Z is called a confounding factor.
gretl, an example of an open-source statistical package.

Discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.

500 related topics

Statistical population

In statistics, a population is a set of similar items or events which is of interest for some question or experiment.

Mathematical statistics

Illustration of linear regression on a data set. Regression analysis is an important part of mathematical statistics.

Mathematical statistics is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data.

Statistical dispersion

Example of samples from two populations with the same mean but different dispersion. The blue population is much more dispersed than the red population.

In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed.
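
As a minimal sketch of how dispersion can be quantified, the following compares two samples that share a mean but differ in spread, using the range, sample variance, and sample standard deviation from Python's standard library. The `red` and `blue` values are hypothetical numbers chosen for illustration; they are not the data behind the figure described above.

```python
import statistics

# Hypothetical samples: same mean (50), very different spread.
red = [48, 49, 50, 51, 52]
blue = [10, 30, 50, 70, 90]


def summarize(sample):
    """Return the mean and three common measures of dispersion."""
    return {
        "mean": statistics.mean(sample),
        "range": max(sample) - min(sample),
        "variance": statistics.variance(sample),  # sample variance (n - 1)
        "stdev": statistics.stdev(sample),        # sample standard deviation
    }


red_summary = summarize(red)
blue_summary = summarize(blue)

for name, summary in (("red", red_summary), ("blue", blue_summary)):
    print(name, summary)
```

Both samples report the same mean, while every dispersion measure is larger for `blue`, which is the sense in which one sample is "more dispersed" than the other.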

Probability theory

Branch of mathematics concerned with probability.

The Poisson distribution, a discrete probability distribution.
The normal distribution, a continuous probability distribution.

As a mathematical foundation for statistics, probability theory is essential to many human activities that involve quantitative analysis of data.

Probability distribution

The probability density function (pdf) of the normal distribution, also called Gaussian or "bell curve", the most important absolutely continuous random distribution. As noted in the figure, the probabilities of intervals of values correspond to the area under the curve.
The probability mass function of a discrete probability distribution. The probabilities of the singletons {1}, {3}, and {7} are respectively 0.2, 0.5, 0.3. A set not containing any of these points has probability zero.
The cumulative distribution function (cdf) of a discrete probability distribution, of a continuous probability distribution, and of a distribution that has both a continuous part and a discrete part.
One solution for the Rabinovich–Fabrikant equations. What is the probability of observing a state on a certain place of the support (i.e., the red subset)?

In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment.
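
To make the definition concrete, here is a small sketch of the discrete distribution from the probability mass function caption above, where the singletons {1}, {3}, and {7} have probabilities 0.2, 0.5, and 0.3. The helper names `probability` and `cdf` are assumptions introduced for this illustration, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# Probability mass function: outcome -> probability (0.2, 0.5, 0.3).
pmf = {1: Fraction(2, 10), 3: Fraction(5, 10), 7: Fraction(3, 10)}


def probability(event, pmf):
    """P(event): sum the mass of every outcome that belongs to the event."""
    return sum((p for x, p in pmf.items() if x in event), Fraction(0))


def cdf(t, pmf):
    """Cumulative distribution function: P(X <= t)."""
    return sum((p for x, p in pmf.items() if x <= t), Fraction(0))


total_mass = sum(pmf.values())          # a valid pmf sums to 1
p_one_or_three = probability({1, 3}, pmf)
p_no_support = probability({2, 4}, pmf)  # contains none of 1, 3, 7
cdf_at_three = cdf(3, pmf)

print(total_mass, p_one_or_three, p_no_support, cdf_at_three)
```

A set containing none of the points 1, 3, or 7 receives probability zero, and the cdf is a step function that jumps at each of those three points.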

Correlation

Several sets of (x, y) points, with the Pearson correlation coefficient of x and y for each set. The correlation reflects the noisiness and direction of a linear relationship (top row), but not the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom). N.B.: the figure in the center has a slope of 0 but in that case the correlation coefficient is undefined because the variance of Y is zero.
Example scatterplots of various datasets with various correlation coefficients.
Pearson/Spearman correlation coefficients between X and Y are shown when the two variables' ranges are unrestricted, and when the range of X is restricted to the interval (0,1).
Anscombe's quartet: four sets of data with the same correlation of 0.816.

In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data.
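
The points made in the Pearson correlation caption above can be checked with a short sketch. The function `pearson_r` below is a plain implementation of the Pearson correlation coefficient written for this illustration, and the data sets are hypothetical. It returns `None` when either variable has zero variance, since the coefficient is undefined in that case.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, or None when it is undefined."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # zero variance: the ratio would divide by zero
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: perfectly linear relationships with different slopes.
xs = [1, 2, 3, 4, 5]
r_steep = pearson_r(xs, [2.0 * x for x in xs])    # slope 2
r_shallow = pearson_r(xs, [0.5 * x for x in xs])  # slope 0.5
r_falling = pearson_r(xs, [-1.0 * x for x in xs])  # slope -1
r_flat = pearson_r(xs, [3.0] * len(xs))           # slope 0, Y is constant

print(r_steep, r_shallow, r_falling, r_flat)
```

The steep and shallow lines give the same coefficient, so the value reflects direction and noisiness rather than slope, and the constant series has no defined coefficient.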

Estimation theory

Estimation theory is a branch of statistics that deals with estimating the values of parameters based on measured empirical data that has a random component.

Central tendency

In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution.
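
As an illustration, the three most commonly used measures of central tendency (mean, median, and mode) can be computed with Python's standard library. The sample below is hypothetical and includes one extreme value to show how the measures can disagree.

```python
import statistics

# Hypothetical sample with one extreme value (100).
data = [1, 2, 2, 3, 4, 7, 100]

mean_value = statistics.mean(data)      # arithmetic average
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequent value

print(mean_value, median_value, mode_value)
```

The single large value pulls the mean well above the median and the mode, which is why the choice of "typical value" depends on the shape of the distribution.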

Data mining

An example of a result produced by data dredging, generated by a bot operated by statistician Tyler Vigen, apparently showing a close link between the number of letters in the winning word of a spelling bee competition and the number of people in the United States killed by venomous spiders. The similarity in trends is obviously a coincidence.

Data mining is the process of extracting and discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

Mathematics

Area of knowledge that includes such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and the spaces in which they are contained (geometry), and quantities and their changes (calculus and analysis).

3rd century BC Greek mathematician Euclid (holding calipers), as imagined by Raphael in this detail from The School of Athens (1509–1511)
The distribution of prime numbers is a central point of study in number theory. This Ulam spiral serves to illustrate it, hinting, in particular, at the conditional independence between being prime and being a value of certain quadratic polynomials.
The quadratic formula expresses concisely the solutions of all quadratic equations
Rubik's cube: the study of its possible moves is a concrete application of group theory
The Babylonian mathematical tablet Plimpton 322, dated to 1800 BC.
Archimedes used the method of exhaustion, depicted here, to approximate the value of pi.
The numerals used in the Bakhshali manuscript, dated between the 2nd century BC and the 2nd century AD.
A page from al-Khwārizmī's Algebra
Leonardo Fibonacci, the Italian mathematician who introduced the Hindu–Arabic numeral system, invented between the 1st and 4th centuries by Indian mathematicians, to the Western world.
Leonhard Euler created and popularized much of the mathematical notation used today.
Carl Friedrich Gauss, known as the prince of mathematicians
The front side of the Fields Medal
Euler's identity, which American physicist Richard Feynman once called "the most remarkable formula in mathematics".

Some areas of mathematics, such as statistics and game theory, are developed in close correlation with their applications and are often grouped under applied mathematics.