Median

average, sample median, median-unbiased estimator
The median is used primarily for skewed distributions, which it summarizes differently from the arithmetic mean. Consider the multiset { 1, 2, 2, 2, 3, 14 }. The median is 2 in this case (as is the mode), and it might be seen as a better indication of central tendency (less susceptible to the exceptionally large value in the data) than the arithmetic mean of 4. The median is a popular summary statistic used in descriptive statistics, since it is simple to understand and easy to calculate, while also giving a measure that is more robust in the presence of outlier values than is the mean.
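The multiset example above can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean, median, mode

# The multiset from the text, with one exceptionally large value.
data = [1, 2, 2, 2, 3, 14]

center_mean = mean(data)      # pulled upward by the value 14
center_median = median(data)  # middle of the sorted values
center_mode = mode(data)      # most frequent value

print("mean:", center_mean, "median:", center_median, "mode:", center_mode)
```

Dropping the value 14 changes the mean from 4 to 2 but leaves the median at 2, which is the sense in which the median is the more robust summary here.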

Mode (statistics)

mode, modal, modes
Descriptive statistics. Moment (mathematics). Summary statistics. Unimodal function.

First-order logic

predicate logic, first-order, predicate calculus
Decidable subsets of first-order logic are also studied in the framework of description logics. The Löwenheim–Skolem theorem shows that if a first-order theory of cardinality λ has an infinite model, then it has models of every infinite cardinality greater than or equal to λ. One of the earliest results in model theory, it implies that it is not possible to characterize countability or uncountability in a first-order language. That is, there is no first-order formula φ(x) such that an arbitrary structure M satisfies φ if and only if the domain of discourse of M is countable (or, in the second case, uncountable).

Variance

sample variance, population variance, variability
Variance has a central role in statistics, where some ideas that use it include descriptive statistics, statistical inference, hypothesis testing, goodness of fit, and Monte Carlo sampling. Variance is an important tool in the sciences, where statistical analysis of data is common. The variance is the square of the standard deviation, the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by \sigma^2 or s^2.
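The three characterizations of variance given above can be compared numerically. This is a minimal sketch for the population (divide-by-n) case; `population_covariance` is a hypothetical helper written for this illustration, and the data set is made up.

```python
import math
from statistics import pstdev, pvariance

def population_covariance(xs, ys):
    """Population covariance of two equally long sequences (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Illustrative data with mean 5.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

var = pvariance(data)                       # second central moment
sd_squared = pstdev(data) ** 2              # square of the standard deviation
cov_self = population_covariance(data, data)  # covariance with itself

print(var, sd_squared, cov_self)
```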

Box plot

In descriptive statistics, a box plot or boxplot is a method for graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points. Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions of the underlying statistical distribution (though Tukey's boxplot assumes symmetry for the whiskers and normality for their length).

Standard deviation

standard deviations, sample standard deviation, sigma
For various values of z, the percentage of values expected to lie inside and outside the symmetric interval CI = (−zσ, zσ) can be tabulated. The mean and the standard deviation of a set of data are descriptive statistics usually reported together. In a certain sense, the standard deviation is a "natural" measure of statistical dispersion if the center of the data is measured about the mean. This is because the standard deviation from the mean is smaller than from any other point.
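The percentages for the interval (−zσ, zσ) can be sketched as follows. The excerpt does not name the distribution, so this sketch assumes a normal distribution centered at zero, for which the coverage is erf(z / √2). The function name `coverage_within` is illustrative.

```python
import math

def coverage_within(z):
    """Probability that a normal variable lies within z standard deviations
    of its mean (assumes normality, which the excerpt does not state)."""
    return math.erf(z / math.sqrt(2))

for z in (1, 2, 3):
    inside = coverage_within(z)
    print(f"z = {z}: {100 * inside:.2f}% inside, {100 * (1 - inside):.2f}% outside")
```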

Range (statistics)

range, ranging, sample range
However, in descriptive statistics, this concept of range has a more complex meaning. The range is the size of the smallest interval which contains all the data and provides an indication of statistical dispersion. It is measured in the same units as the data. Since it only depends on two of the observations, it is most useful in representing the dispersion of small data sets. Consider n independent and identically distributed continuous random variables X_1, X_2, ..., X_n with cumulative distribution function G(x) and probability density function g(x), and let T denote the range of a sample of size n from a population with distribution function G(x).
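As a small illustration of the descriptive definition, the range is the difference between the largest and smallest observations. The helper name `sample_range` and the data are assumptions made for this sketch.

```python
def sample_range(values):
    """Size of the smallest interval containing all the data: max minus min."""
    if not values:
        raise ValueError("range is undefined for an empty sample")
    return max(values) - min(values)

data = [1, 2, 2, 2, 3, 14]
print(sample_range(data))       # depends only on the two extreme observations
print(sample_range(data[:-1]))  # removing the extreme value changes it sharply
```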

Statistical dispersion

Summary statistics. Qualitative variation. Robust measures of scale. Measurement uncertainty.

Seven-number summary

Bowley's seven-figure summary, seven-figure summary
In descriptive statistics, the seven-number summary is a collection of seven summary statistics, and is an extension of the five-number summary. There are two similar, common forms. As with the five-number summary, it can be represented by a modified box plot, adding hatch-marks on the "whiskers" for two of the additional numbers. The following percentiles are evenly spaced under a normally distributed variable: The middle three values – the lower quartile, median, and upper quartile – are the usual statistics from the five-number summary and are the standard values for the box in a box plot.

Mean

mean value, population mean, average
Descriptive statistics. Kurtosis. Law of averages. Mean value theorem. Median. Mode (statistics). Summary statistics. Taylor's law.

Quartile

quartiles, lower quartile, lower and upper quartiles
Summary statistics. Quantile.

Central tendency

Locality, central location, central point
In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution. It may also be called a center or location of the distribution. Colloquially, measures of central tendency are often called averages. The term central tendency dates from the late 1920s.

Skewness

skewed, skew, skewed distribution
Skewness is a descriptive statistic that can be used in conjunction with the histogram and the normal quantile plot to characterize the data or distribution. Skewness indicates the direction and relative magnitude of a distribution's deviation from the normal distribution. With pronounced skewness, standard statistical inference procedures such as a confidence interval for a mean will be not only incorrect, in the sense of having true coverage level unequal to the nominal (e.g., 95%) level, but will also have unequal error probabilities on each side.
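One common way to quantify skewness from a sample is the moment coefficient, the third central moment divided by the 1.5 power of the second. The sketch below assumes that definition (other estimators exist), and the function name is illustrative.

```python
def sample_skewness(values):
    """Moment coefficient of skewness m3 / m2**1.5, using divide-by-n moments.
    This is one of several estimators; it is chosen here for simplicity."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5

print(sample_skewness([1, 2, 3, 4, 5]))      # symmetric data
print(sample_skewness([1, 2, 2, 2, 3, 14]))  # long right tail
```

The sign gives the direction of the asymmetry: positive for a longer right tail, negative for a longer left tail.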

Kurtosis

excess kurtosis, leptokurtic, platykurtic
In probability theory and statistics, kurtosis (from κυρτός, kyrtos or kurtos, meaning "curved, arching") is a measure of the "tailedness" of the probability distribution of a real-valued random variable. In a similar way to the concept of skewness, kurtosis is a descriptor of the shape of a probability distribution and, just as for skewness, there are different ways of quantifying it for a theoretical distribution and corresponding ways of estimating it from a sample from a population. Depending on the particular measure of kurtosis that is used, there are various interpretations of kurtosis, and of how particular measures should be interpreted.

Pearson correlation coefficient

correlation coefficient, correlation, Pearson correlation
In statistics, the Pearson correlation coefficient (PCC), also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y. According to the Cauchy–Schwarz inequality it has a value between +1 and −1, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation. It is widely used in the sciences. It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.
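A minimal sketch of the sample coefficient follows, written from the usual product-moment formula (sum of cross-deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly increasing line
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly decreasing line
```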

Correlation and dependence

correlation, correlated, correlate
These examples indicate that the correlation coefficient, as a summary statistic, cannot replace visual examination of the data. Note that the examples are sometimes said to demonstrate that the Pearson correlation assumes that the data follow a normal distribution, but this is not correct. If a pair (X,Y) of random variables follows a bivariate normal distribution, the conditional mean E(X | Y) is a linear function of Y, and the conditional mean E(Y | X) is a linear function of X.

Five-number summary

five-number summaries
The five-number summary is a set of descriptive statistics that provide information about a dataset. It consists of the five most important sample percentiles: the sample minimum, the lower quartile, the median, the upper quartile, and the sample maximum. In addition to the median of a single set of data there are two related statistics called the upper and lower quartiles. If data are placed in order, then the lower quartile is central to the lower half of the data and the upper quartile is central to the upper half of the data. These quartiles are used to calculate the interquartile range, which helps to describe the spread of the data, and determine whether or not any data points are outliers.
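The "central to the lower half / upper half" description can be sketched directly. Quartile conventions differ; this sketch assumes the convention that excludes the median from both halves when the sample size is odd, and the function name is illustrative.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).
    Quartiles are medians of the lower and upper halves; for odd n the
    middle observation is left out of both halves (one of several conventions)."""
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower_half = xs[:half]
    upper_half = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return (xs[0], median(lower_half), median(xs), median(upper_half), xs[-1])

print(five_number_summary([1, 2, 2, 2, 3, 14]))
print(five_number_summary([9, 1, 7, 3, 5]))
```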

Logic

logician, logical, logics
Many fundamental logical formalisms are essential to section I.2 on artificial intelligence, for example modal logic and default logic in Knowledge representation formalisms and methods, Horn clauses in logic programming, and description logic.

Interquartile range

inter-quartile range, below, interquartile
In descriptive statistics, the interquartile range (IQR), also called the midspread or middle 50%, or technically H-spread, is a measure of statistical dispersion, being equal to the difference between 75th and 25th percentiles, or between upper and lower quartiles, IQR = Q_3 − Q_1. In other words, the IQR is the first quartile subtracted from the third quartile; these quartiles can be clearly seen on a box plot on the data. It is a trimmed estimator, defined as the 25% trimmed range, and is a commonly used robust measure of scale. The IQR is a measure of variability, based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts.
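The relation IQR = Q_3 − Q_1 can be illustrated with `statistics.quantiles` from the Python standard library. The choice of `method="inclusive"` is an assumption made for this sketch; other quartile conventions give slightly different values.

```python
from statistics import quantiles

def interquartile_range(values):
    """IQR = Q3 - Q1, with quartiles from statistics.quantiles.
    The 'inclusive' interpolation method is one convention among several."""
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8]
print(interquartile_range(data))
```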

Order statistic

order statistics, ordered, th-smallest of items
In statistics, the kth order statistic of a statistical sample is equal to its kth-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference.
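The definition translates into a short sketch: sort the sample and take the kth entry, counting from 1. The function name `order_statistic` is illustrative, and sorting is used for clarity rather than efficiency.

```python
def order_statistic(sample, k):
    """Return the kth-smallest value of the sample, with k counted from 1."""
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(sample)")
    return sorted(sample)[k - 1]

sample = [7, 1, 5, 3]
print(order_statistic(sample, 1))            # first order statistic: the minimum
print(order_statistic(sample, len(sample)))  # last order statistic: the maximum
```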

Sufficient statistic

sufficient statistics, sufficient, sufficiency
Stephen Stigler noted in 1973 that the concept of sufficiency had fallen out of favor in descriptive statistics because of the strong dependence on an assumption of the distributional form (see Pitman–Koopman–Darmois theorem below), but remained very important in theoretical work. Roughly, given a set \mathbf{X} of independent identically distributed data conditioned on an unknown parameter \theta, a sufficient statistic is a function whose value contains all the information needed to compute any estimate of the parameter (e.g. a maximum likelihood estimate). Due to the factorization theorem (see below), for a sufficient statistic, the probability density can be written in factored form.

Fragment (logic)

fragment, fragments
The field of descriptive complexity theory aims at establishing a link between logics and computational complexity theory, by identifying logical fragments that exactly capture certain complexity classes.

Sample maximum and minimum

sample maximum, sample minimum, Maximum
They are basic summary statistics, used in descriptive statistics such as the five-number summary and Bowley's seven-figure summary and the associated box plot. The minimum and the maximum value are the first and last order statistics (often denoted X_{(1)} and X_{(n)} respectively, for a sample size of n). If the sample has outliers, they necessarily include the sample maximum or sample minimum, or both, depending on whether they are extremely high or low. However, the sample maximum and minimum need not be outliers, if they are not unusually far from other observations. The sample maximum and minimum are the least robust statistics: they are maximally sensitive to outliers.

List of statistics articles

list of statistical topics, list of statistics topics
Descriptive research. Descriptive statistics. Design effect. Design matrix. Design of experiments. The Design of Experiments (book by Fisher). Detailed balance. Detection theory. Determining the number of clusters in a data set. Detrended correspondence analysis. Detrended fluctuation analysis. Deviance (statistics). Deviance information criterion. Deviation (statistics). Deviation analysis (disambiguation). DFFITS – a regression diagnostic. Diagnostic odds ratio. Dickey–Fuller test. Difference in differences. Differential entropy. Diffusion process. Diffusion-limited aggregation. Dimension reduction. Dilution assay. Direct relationship. Directional statistics. Dirichlet distribution.

Decile

deciles
In descriptive statistics, a decile is any of the nine values that divide the sorted data into ten equal parts, so that each part represents 1/10 of the sample or population. A decile is one possible form of a quantile; others include the quartile and percentile. Summary statistics. Socio-economic decile.
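The nine cut points can be computed with `statistics.quantiles` and `n=10`. This sketch uses the function's default interpolation method, which is only one of several conventions, and the data set is made up for illustration.

```python
from statistics import quantiles

# Illustrative sample: the integers 1 through 100.
data = list(range(1, 101))

# n=10 asks for the nine cut points that split the data into ten equal parts.
deciles = quantiles(data, n=10)

for i, cut in enumerate(deciles, start=1):
    print(f"decile {i}: {cut}")
```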