# Statistics

**statistical, statistical analysis, statistician, applied statistics, statistical methods, statistically, statistical method, statistical data, stats, quantitative analysis**

Statistics is the discipline that concerns the collection, organization, displaying, analysis, interpretation and presentation of data.

4,012 Related Articles

### Survey methodology

**survey, surveys, statistical survey**

Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.

A field of applied statistics of human research surveys, survey methodology studies the sampling of individual units from a population and associated techniques of survey data collection, such as questionnaire construction and methods for improving the number and accuracy of responses to surveys.

### Statistical population

**population, subpopulation, subpopulations**

In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied.

In statistics, a population is a set of similar items or events which is of interest for some question or experiment.

### Sample (statistics)

**sample, samples, statistical sample**

When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples.

In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure.
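As a rough illustration of selecting a sample "by a defined procedure", the sketch below draws a simple random sample without replacement. The population of 100 unit IDs, the function name `simple_random_sample`, and the fixed seed are assumptions made for this example; simple random sampling is only one of many possible sampling procedures.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Draw n distinct units from the population without replacement.

    A fixed seed makes the draw reproducible; it is an illustrative choice.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: unit IDs 1..100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10)
print(sample)
```

Every unit has the same chance of selection here, which is the defining property of a simple random sample.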

### Standard deviation

**standard deviations, sample standard deviation, SD**

Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation).

In statistics, the standard deviation (SD, also represented by the lower case Greek letter sigma σ for the population standard deviation or the Latin letter s for the sample standard deviation) is a measure of the amount of variation or dispersion of a set of values.
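The distinction between the population standard deviation σ and the sample standard deviation s can be sketched in a few lines. The data set and the function names below are made up for illustration; the only difference between the two functions is the divisor (n versus n − 1).

```python
import math


def population_sd(values):
    """Standard deviation treating `values` as the whole population (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sample_sd(values):
    """Standard deviation treating `values` as a sample (divide by n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two values")
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values with mean 5
print(population_sd(data))  # 2.0
print(sample_sd(data))      # about 2.138
```

The standard library functions `statistics.pstdev` and `statistics.stdev` compute the same two quantities.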

### Mean

**mean value, average, population mean**

Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation).

There are several kinds of means in various branches of mathematics (especially statistics).

### Central tendency

**Locality, Locality (statistics), Measure of central tendency**

Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other.

In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution.
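Three common measures of central tendency (mean, median and mode) can be compared on a small data set. The sketch below uses Python's standard `statistics` module; the numbers are invented for illustration, and the added outlier is there only to show how differently the mean and median react to an extreme value.

```python
import statistics

data = [1, 2, 2, 3, 4, 7, 9]  # illustrative values

mean_value = statistics.mean(data)      # arithmetic mean: 28 / 7 = 4
median_value = statistics.median(data)  # middle value of the sorted data: 3
mode_value = statistics.mode(data)      # most frequent value: 2

# Adding one extreme value moves the mean a lot and the median only slightly.
with_outlier = data + [100]
mean_outlier = statistics.mean(with_outlier)      # 128 / 8 = 16
median_outlier = statistics.median(with_outlier)  # (3 + 4) / 2 = 3.5

print(mean_value, median_value, mode_value)
print(mean_outlier, median_outlier)
```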

### Statistical dispersion

**dispersion, variability, spread**

Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other.

In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed.

### Observational study

**observational studies, observational, observational data**

In contrast, an observational study does not involve experimental manipulation.

In fields such as epidemiology, social sciences, psychology and statistics, an observational study draws inferences from a sample to a population where the independent variable is not under the control of the researcher because of ethical concerns or logistical constraints.

### Bias (statistics)

**bias, biased, statistical bias**

Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also occur.

Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.
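A classic example of this definition is the variance estimator that divides by n instead of n − 1. The sketch below, under assumptions chosen for illustration (a tiny population of four values, samples of size 2 drawn with replacement), enumerates every possible sample and computes each estimator's expected value exactly with `fractions.Fraction`, so no random simulation is involved.

```python
from fractions import Fraction
from itertools import product


def variance(values, ddof):
    """Variance with divisor (n - ddof): ddof=0 divides by n, ddof=1 by n - 1."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - ddof)


population = [1, 2, 3, 4]               # illustrative population
true_var = variance(population, 0)      # the parameter being estimated

# All 16 equally likely ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

expected_biased = sum(variance(s, 0) for s in samples) / len(samples)
expected_unbiased = sum(variance(s, 1) for s in samples) / len(samples)

print(true_var)                    # 5/4
print(expected_biased)             # 5/8, so the bias is -5/8
print(expected_unbiased)           # 5/4, so the bias is 0
```

The n-divisor estimator's expected value differs from the true variance, which is exactly what the definition above calls bias; the n − 1 divisor removes it in this setting.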

### Censoring (statistics)

**censoring, censored, censored data**

The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.

In statistics, engineering, economics, and medical research, censoring is a condition in which the value of a measurement or observation is only partially known.

### Missing data

**missing values, missing at random, incomplete data**

The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.

In statistics, missing data, or missing values, occur when no data value is stored for the variable in an observation.
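To make the idea concrete, the sketch below represents missing values as `None` and compares two simple ways of handling them: using only the observed values, and filling each gap with the observed mean. The data and the choice of mean imputation are assumptions for illustration; mean imputation is shown because it is simple, not because it is recommended, and it visibly understates the spread of the data.

```python
import statistics

# Illustrative measurements; None marks a value that was never recorded.
values = [4.0, None, 6.0, None, 8.0]

# Option 1: keep only the observed values.
observed = [v for v in values if v is not None]
observed_mean = statistics.mean(observed)

# Option 2: replace each missing value with the observed mean.
imputed = [observed_mean if v is None else v for v in values]

var_observed = statistics.pvariance(observed)  # 8/3, about 2.67
var_imputed = statistics.pvariance(imputed)    # 8/5 = 1.6

print(observed_mean, var_observed, var_imputed)
```

The mean is unchanged by the imputation, but the variance shrinks because the filled-in values add no variation of their own.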

### Estimation theory

**parameter estimation, estimation, estimated**

These inferences may take the form of: answering yes/no questions about the data (hypothesis testing), estimating numerical characteristics of the data (estimation), describing associations within the data (correlation) and modeling relationships within the data (for example, using regression analysis).

Estimation theory is a branch of statistics that deals with estimating the values of parameters based on measured empirical data that has a random component.

### Probability and statistics

**probability, statistics**

The earliest writings on probability and statistics, statistical methods drawing from probability theory, date back to Arab mathematicians and cryptographers, notably Al-Khalil (717–786) and Al-Kindi (801–873).

Statistical analysis often uses probability distributions, and the two topics are often studied together.

### Glossary of probability and statistics

See glossary of probability and statistics.

The following is a glossary of terms used in the mathematical sciences statistics and probability.

### Data mining

**data-mining, datamining, knowledge discovery in databases**

Inference can extend to forecasting, prediction and estimation of unobserved values either in or associated with the population being studied; it can include extrapolation and interpolation of time series or spatial data, and can also include data mining.

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

### Time series

**time series analysis, time-series, time-series analysis**

Inference can extend to forecasting, prediction and estimation of unobserved values either in or associated with the population being studied; it can include extrapolation and interpolation of time series or spatial data, and can also include data mining.

Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and largely in any domain of applied science and engineering which involves temporal measurements.

### Mathematics

**mathematical, math, mathematician**

Statistics can be regarded either as a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data, or as a branch of mathematics.

Applied mathematics has led to entirely new mathematical disciplines, such as statistics and game theory.

### History of statistics

**foundational advances, historian of statistics, stat- etymology**

Early applications of statistical thinking revolved around the needs of states to base policy on demographic and economic data, hence its stat- etymology.

The history of statistics in its modern sense is usually traced to the term statistics itself, which was introduced in Germany in 1749.

### Prediction

**predict, predictions, predictive**

Inference can extend to forecasting, prediction and estimation of unobserved values either in or associated with the population being studied; it can include extrapolation and interpolation of time series or spatial data, and can also include data mining.

In statistics, prediction is a part of statistical inference.

### Francis Galton

**Sir Francis Galton; Galton; Galton, Francis**

The first wave, at the turn of the 20th century, was led by the work of Francis Galton and Karl Pearson, who transformed statistics into a rigorous mathematical discipline used for analysis, not just in science, but in industry and politics as well.

Sir Francis Galton, FRS (16 February 1822 – 17 January 1911) was an English Victorian era statistician, polymath, sociologist, psychologist, anthropologist, eugenicist, tropical explorer, geographer, inventor, meteorologist, proto-geneticist, and psychometrician.

### Correlation and dependence

**correlation, correlated, correlations**

These inferences may take the form of: answering yes/no questions about the data (hypothesis testing), estimating numerical characteristics of the data (estimation), describing associations within the data (correlation) and modeling relationships within the data (for example, using regression analysis).

In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or bivariate data.

### Pearson correlation coefficient

**correlation coefficient, Pearson product-moment correlation coefficient, Pearson correlation**

Pearson developed the Pearson product-moment correlation coefficient, defined as a product-moment, the method of moments for the fitting of distributions to samples and the Pearson distribution, among many other things.

In statistics, the Pearson correlation coefficient (PCC), also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y.
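A minimal sketch of the coefficient, using the standard definition of r as the covariance of X and Y divided by the product of their standard deviations, is given below. The function name `pearson_r` and the small data sets are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations; the 1/n factors cancel in r.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2, 4, 6, 8, 10]))    # perfectly linear, increasing
print(pearson_r(xs, [10, 8, 6, 4, 2]))    # perfectly linear, decreasing
print(pearson_r([1, 2, 3], [1, 3, 2]))    # weaker positive association
```

Values near 1 or −1 indicate a strong linear relationship and values near 0 a weak one; r measures only linear association, so a strong nonlinear relationship can still give a small r.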

### Biostatistics

**biostatistician, biometry, biometrician**

Galton and Pearson founded Biometrika as the first journal of mathematical statistics and biostatistics (then called biometry), and the latter founded the world's first university statistics department at University College London.

Biostatistics is the development and application of statistical methods to a wide range of topics in biology.

### Spatial analysis

**spatial statistics, spatial autocorrelation, geospatial analysis**

Statistics has contributed greatly through work in spatial statistics.

### Ronald Fisher

**R.A. Fisher, R. A. Fisher, Fisher**

Ronald Fisher coined the term null hypothesis during the Lady tasting tea experiment; such a hypothesis "is never proved or established, but is possibly disproved, in the course of experimentation".

Sir Ronald Aylmer Fisher (17 February 1890 – 29 July 1962) was a British statistician and geneticist.