Statistics

Statistics is the discipline that concerns the collection, organization, display, analysis, interpretation, and presentation of data.

Probability theory

Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
As a mathematical foundation for statistics, probability theory is essential to many human activities that involve quantitative analysis of data.

Method of moments (statistics)

Pearson developed the Pearson product-moment correlation coefficient (defined as a product-moment), the method of moments for fitting distributions to samples, and the Pearson distribution, among many other things.
In statistics, the method of moments is a method of estimation of population parameters.
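As a minimal sketch of the idea (synthetic data; the Gamma distribution and its parameter values are chosen purely for illustration), the method of moments equates sample moments with their theoretical expressions and solves for the parameters:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical sample from a Gamma(shape=3, scale=2) population
data = rng.gamma(shape=3.0, scale=2.0, size=100_000)

# Match sample moments to the theoretical mean k*theta and variance k*theta^2
m1, var = data.mean(), data.var()
theta_hat = var / m1          # scale estimate
k_hat = m1 / theta_hat        # shape estimate
print(f"shape ~ {k_hat:.2f}, scale ~ {theta_hat:.2f}")
```

With a large sample, both estimates land close to the true values, though maximum likelihood is usually more efficient.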

Linear discriminant analysis

He originated the concepts of sufficiency, ancillary statistics, Fisher's linear discriminator and Fisher information.
Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics, pattern recognition, and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events.
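A rough NumPy sketch of Fisher's linear discriminant on synthetic two-class Gaussian data (the class means and sample sizes are invented for illustration): the discriminant direction is the within-class scatter matrix inverse applied to the difference of class means.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic two-class data with shared covariance (illustrative only)
X0 = rng.normal([0.0, 0.0], 1.0, size=(200, 2))
X1 = rng.normal([2.0, 2.0], 1.0, size=(200, 2))

mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)  # within-class scatter
w = np.linalg.solve(Sw, mu1 - mu0)   # Fisher's discriminant direction

# Classify by projecting onto w and thresholding at the projected midpoint
threshold = (X0 @ w).mean() / 2 + (X1 @ w).mean() / 2
accuracy = (((X1 @ w) > threshold).mean() + ((X0 @ w) <= threshold).mean()) / 2
print(f"accuracy ~ {accuracy:.2f}")
```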

Sufficient statistic

He originated the concepts of sufficiency, ancillary statistics, Fisher's linear discriminator and Fisher information.
In statistics, a statistic is sufficient with respect to a statistical model and its associated unknown parameter if "no other statistic that can be calculated from the same sample provides any additional information as to the value of the parameter".
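The quoted definition can be made concrete with a toy Bernoulli example (the samples below are made up): the likelihood depends on the data only through the sample sum, so the sum is sufficient for the success probability p.

```python
from math import prod, isclose

def bernoulli_likelihood(sample, p):
    """L(p) = p^(number of 1s) * (1 - p)^(number of 0s)."""
    return prod(p if x else 1 - p for x in sample)

a = [1, 0, 1, 1, 0]
b = [0, 1, 1, 0, 1]   # different arrangement, same sum of 3
for p in (0.2, 0.5, 0.9):
    # Equal likelihoods for every p: the sum carries all the information
    assert isclose(bernoulli_likelihood(a, p), bernoulli_likelihood(b, p))
print("likelihoods match for every p")
```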

Statistical Methods for Research Workers

Fisher's most important publications were his 1918 seminal paper The Correlation between Relatives on the Supposition of Mendelian Inheritance, which was the first to use the statistical term variance; his classic 1925 work Statistical Methods for Research Workers; and his 1935 The Design of Experiments, where he developed rigorous design of experiments models.
Statistical Methods for Research Workers is a classic book on statistics, written by the statistician R. A. Fisher.

Lady tasting tea

Ronald Fisher coined the term null hypothesis during the lady tasting tea experiment; the null hypothesis "is never proved or established, but is possibly disproved, in the course of experimentation".
In the design of experiments in statistics, the lady tasting tea is a randomized experiment devised by Ronald Fisher and reported in his book The Design of Experiments (1935).
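The arithmetic behind the experiment is a short combinatorial calculation: with eight cups, four of each preparation, there are C(8, 4) equally likely selections under the null hypothesis, so guessing all cups correctly by chance alone is quite unlikely.

```python
from math import comb

# The lady must identify which 4 of 8 cups had the milk added first
n_ways = comb(8, 4)        # 70 equally likely selections under the null
p_all_correct = 1 / n_ways
print(n_ways, round(p_all_correct, 4))  # 70 0.0143
```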

Cryptanalysis

This text laid the foundations for statistics and cryptanalysis.
Frequency analysis relies on a cipher failing to hide these statistics.
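A minimal sketch of frequency analysis against a Caesar cipher (the plaintext and shift are invented for the demo): assuming the most common ciphertext letter corresponds to the most common English letter, E, recovers the key from letter statistics alone.

```python
from collections import Counter

def caesar(text, shift):
    """Shift each letter by `shift` positions; leave other characters alone."""
    return "".join(chr((ord(c) - 65 + shift) % 26 + 65) if c.isalpha() else c
                   for c in text.upper())

plain = "FREQUENCY ANALYSIS EXPLOITS THE UNEVEN LETTER STATISTICS OF A LANGUAGE"
cipher = caesar(plain, 7)

# Guess the shift by assuming the most common ciphertext letter encodes 'E'
most_common = Counter(c for c in cipher if c.isalpha()).most_common(1)[0][0]
shift_guess = (ord(most_common) - ord("E")) % 26
recovered = caesar(cipher, -shift_guess)
print(shift_guess, recovered == plain)
```

This is exactly the sense in which a cipher must "hide these statistics" to resist frequency analysis.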

Survey sampling

When full census data cannot be collected, statisticians collect sample data by developing specific experiment designs and survey samples.
In statistics, survey sampling describes the process of selecting a sample of elements from a target population to conduct a survey.

Descriptive statistics

Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation).
The use of descriptive and summary statistics has an extensive history and, indeed, the simple tabulation of populations and of economic data was the first way the topic of statistics appeared.
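The split between numerical descriptors for continuous data and frequency/percentage summaries for categorical data can be shown with Python's standard library (the incomes and education levels below are made-up values):

```python
from statistics import mean, stdev
from collections import Counter

incomes = [32_000, 45_000, 51_000, 38_000, 62_000]   # continuous variable
education = ["BA", "HS", "BA", "PhD", "HS", "BA"]    # categorical variable

# Continuous data: mean and standard deviation
print(f"mean = {mean(incomes):.0f}, sd = {stdev(incomes):.0f}")

# Categorical data: frequencies and percentages
counts = Counter(education)
for level, count in counts.items():
    print(level, count, f"{100 * count / len(education):.0f}%")
```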

Sampling distribution

Probability is used in mathematical statistics to study the sampling distributions of sample statistics and, more generally, the properties of statistical procedures.
In statistics, a sampling distribution or finite-sample distribution is the probability distribution of a given random-sample-based statistic.
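A quick simulation makes the definition tangible (the exponential population and sample size are arbitrary choices for the demo): repeatedly drawing samples and computing each sample's mean traces out the sampling distribution of that statistic.

```python
import numpy as np

rng = np.random.default_rng(1)

# 5,000 samples of size n = 50 from a skewed (exponential) population
n, reps, scale = 50, 5_000, 2.0
samples = rng.exponential(scale, size=(reps, n))
means = samples.mean(axis=1)    # one sample mean per sample

# The sampling distribution of the mean centers on the population
# mean (2.0) with standard deviation close to sigma / sqrt(n)
print(f"{means.mean():.3f} +/- {means.std():.3f}")
```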

Type I and type II errors

Working from a null hypothesis, two basic forms of error are recognized: Type I errors (the null hypothesis is falsely rejected, giving a "false positive") and Type II errors (the null hypothesis fails to be rejected and an actual relationship between populations is missed, giving a "false negative").
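Both error rates can be estimated by simulation. The sketch below (a two-sided z-test with known variance; the sample size, effect size, and repetition count are arbitrary demo choices) shows the Type I rate tracking the chosen significance level, while the Type II rate depends on the true effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 30, 20_000
z_crit = 1.96                  # two-sided z-test at the 5% significance level

def reject_rate(mu):
    """Fraction of simulated samples in which H0: mu = 0 is rejected."""
    x = rng.normal(mu, 1.0, size=(reps, n))
    z = x.mean(axis=1) * np.sqrt(n)     # sigma = 1 assumed known
    return (np.abs(z) > z_crit).mean()

type1 = reject_rate(0.0)        # false-positive rate under a true null
type2 = 1 - reject_rate(0.5)    # miss rate under a real effect of 0.5
print(f"Type I ~ {type1:.3f}, Type II ~ {type2:.3f}")
```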

Biometrika

Galton and Pearson founded Biometrika as the first journal of mathematical statistics and biostatistics (then called biometry), and the latter founded the world's first university statistics department at University College London.
The principal focus of this journal is theoretical statistics.

The Design of Experiments

Fisher's most important publications were his 1918 seminal paper The Correlation between Relatives on the Supposition of Mendelian Inheritance, which was the first to use the statistical term variance; his classic 1925 work Statistical Methods for Research Workers; and his 1935 The Design of Experiments, where he developed rigorous design of experiments models.
The Design of Experiments is a 1935 book by the English statistician Ronald Fisher about the design of experiments and is considered a foundational work in experimental design.

Statistical theory

Probability is used in mathematical statistics to study the sampling distributions of sample statistics and, more generally, the properties of statistical procedures.
The theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics.

Instrumental variables estimation

While the tools of data analysis work best on data from randomized studies, they are also applied to other kinds of data—like natural experiments and observational studies—for which a statistician would use modified, more structured estimation methods (e.g., difference in differences estimation and instrumental variables, among many others) that produce consistent estimators.
In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment.
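A small simulation illustrates why IV helps (the data-generating process, coefficients, and instrument strength below are all invented): when an unobserved confounder drives both the regressor and the outcome, ordinary least squares is biased, while the simple IV (Wald) estimator recovers the causal effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
z = rng.normal(size=n)                 # instrument: shifts x but not y directly
u = rng.normal(size=n)                 # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)   # endogenous regressor
y = 2.0 * x + u + rng.normal(size=n)   # true causal effect of x on y is 2.0

# OLS is biased upward here because x and the error share the confounder u
beta_ols = np.cov(x, y)[0, 1] / np.cov(x, x)[0, 1]
# Simple IV (Wald) estimator: cov(z, y) / cov(z, x)
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
print(f"OLS ~ {beta_ols:.2f}, IV ~ {beta_iv:.2f}")
```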

Difference in differences

While the tools of data analysis work best on data from randomized studies, they are also applied to other kinds of data—like natural experiments and observational studies—for which a statistician would use modified, more structured estimation methods (e.g., difference in differences estimation and instrumental variables, among many others) that produce consistent estimators.
Difference in differences (DID or DD) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a 'treatment group' versus a 'control group' in a natural experiment.
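In its simplest two-group, two-period form, the estimator is one line of arithmetic (the outcome numbers below are invented for illustration):

```python
# Group-by-period mean outcomes (made-up numbers for illustration)
treat_pre, treat_post = 10.0, 16.0
ctrl_pre, ctrl_post = 9.0, 12.0

# The control group's change estimates the shared time trend; subtracting
# it from the treatment group's change isolates the treatment effect
did = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)
print(did)  # 3.0
```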

Decision theory

Probability is used in mathematical statistics to study the sampling distributions of sample statistics and, more generally, the properties of statistical procedures.
Empirical applications of this rich theory are usually done with the help of statistical and econometric methods.

Consistent estimator

While the tools of data analysis work best on data from randomized studies, they are also applied to other kinds of data—like natural experiments and observational studies—for which a statistician would use modified, more structured estimation methods (e.g., difference in differences estimation and instrumental variables, among many others) that produce consistent estimators.
In statistics, a consistent estimator or asymptotically consistent estimator is an estimator—a rule for computing estimates of a parameter θ₀—having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ₀.
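A small simulation suggests this behavior for the sample mean (the population parameters and sample sizes are arbitrary demo choices): as n grows, the estimation error shrinks toward zero.

```python
import numpy as np

rng = np.random.default_rng(3)
theta = 5.0   # hypothetical population mean we want to estimate

# The sample mean is a consistent estimator of theta: as n grows,
# its error converges in probability to zero
errors = []
for n in (100, 10_000, 1_000_000):
    sample = rng.normal(theta, 2.0, size=n)
    errors.append(abs(sample.mean() - theta))
print(errors)
```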

Twelvefold way

Al-Khalil (717–786) wrote the Book of Cryptographic Messages, which contains the first use of permutations and combinations, to list all possible Arabic words with and without vowels.
Another way to think of some of the cases is in terms of sampling in statistics.
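A toy version of Al-Khalil-style word counting with the standard library (the alphabet size of 28 and the root length of 3 are illustrative assumptions): permutations count ordered selections, combinations count unordered ones.

```python
from math import comb, perm, factorial

# Ordered vs unordered selections of k letters from an alphabet of n,
# e.g. 3-letter roots from a 28-letter alphabet (illustrative numbers)
n, k = 28, 3
print(perm(n, k))   # ordered, no repetition: 28 * 27 * 26
print(comb(n, k))   # unordered: perm(n, k) / k!
assert perm(n, k) == comb(n, k) * factorial(k)
```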

Big data

Statistics continues to be an area of active research, for example on the problem of how to analyze big data.
Relational database management systems, desktop statistics and software packages used to visualize data often have difficulty handling big data.

Blocking (statistics)

In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) that are similar to one another.
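A minimal sketch of a randomized block design (the block names are hypothetical): treatment is randomized separately within each block of similar units, so every block contributes both a treated and a control unit.

```python
import random

random.seed(0)
# Units grouped into blocks of similar pairs (hypothetical plots of land)
blocks = [["plot1a", "plot1b"], ["plot2a", "plot2b"], ["plot3a", "plot3b"]]

# Randomize treatment *within* each block rather than across all units
assignment = {}
for block in blocks:
    treated = random.choice(block)
    for unit in block:
        assignment[unit] = "treatment" if unit == treated else "control"
print(assignment)
```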

Experiment

An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements.
This equivalency is determined by statistical methods that take into account the amount of variation between individuals and the number of individuals in each group.

Categorical variable

Numerical descriptors include mean and standard deviation for continuous data types (like income), while frequency and percentage are more useful in terms of describing categorical data (like education).
In statistics, a categorical variable is a variable that can take on one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or nominal category on the basis of some qualitative property.

Calculus

In the 18th century, statistics also started to draw heavily from calculus.
Calculus is used in every branch of the physical sciences, actuarial science, computer science, statistics, engineering, economics, business, medicine, demography, and in other fields wherever a problem can be mathematically modeled and an optimal solution is desired.

The Correlation between Relatives on the Supposition of Mendelian Inheritance

Fisher's most important publications were his 1918 seminal paper The Correlation between Relatives on the Supposition of Mendelian Inheritance, which was the first to use the statistical term variance; his classic 1925 work Statistical Methods for Research Workers; and his 1935 The Design of Experiments, where he developed rigorous design of experiments models.
The paper also contains the first use of the statistical term variance.