# Pearson correlation coefficient

**correlation coefficient, correlation, Pearson correlation, correlation coefficients, r, correlated, Pearson's correlation coefficient, Pearson's r, Pearson, Pearson's Product-Moment Correlation Coefficient**

In statistics, the Pearson correlation coefficient (PCC), also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y.

## Related Articles

### Covariance

**covariant, covariation, covary**

Pearson's correlation coefficient is the covariance of the two variables divided by the product of their standard deviations. The population Pearson correlation coefficient is defined in terms of moments, and therefore exists for any bivariate probability distribution for which the population covariance is defined and the marginal population variances are defined and are non-zero.

The normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation.
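As a minimal sketch of this definition (the data and seed below are made up for illustration), the covariance-over-standard-deviations form can be checked against NumPy's built-in `corrcoef`:

```python
import numpy as np

# Illustrative data: y is a noisy linear function of x.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)

# Pearson's r: covariance divided by the product of the standard deviations.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
r = cov_xy / (x.std() * y.std())

# Matches NumPy's built-in computation.
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```

Because both the covariance and the standard deviations are rescaled together, the ratio is the same whether population (ddof=0) or sample (ddof=1) normalization is used, as long as it is used consistently.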

### Negative relationship

**inverse relationship, inversely related, negative correlation**

The correlation coefficient is negative (anti-correlation) if X_i and Y_i tend to lie on opposite sides of their respective means.

Negative correlation can be seen geometrically when two normalized random vectors are viewed as points on a sphere, and the correlation between them is the cosine of the arc of separation of the points on the sphere.
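This geometric picture can be sketched numerically (seed and data are illustrative): after centering, the Pearson correlation of two samples equals the cosine of the angle between them viewed as vectors, and anti-correlated data give a cosine near −1.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = -x + 0.1 * rng.normal(size=100)   # strongly negatively correlated

# Center each sample, then take the cosine of the angle between the vectors.
xc, yc = x - x.mean(), y - y.mean()
cos_angle = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))
r = np.corrcoef(x, y)[0, 1]

assert np.isclose(cos_angle, r)   # cosine of the angle equals r
assert r < -0.9                   # anti-correlation: angle greater than 90 degrees
```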

### Statistics

**statistical, statistical analysis, statistician**

In statistics, the Pearson correlation coefficient (PCC), also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y.

Pearson developed the Pearson product-moment correlation coefficient, defined as a product-moment, the method of moments for the fitting of distributions to samples and the Pearson distribution, among many other things.

### Fisher transformation

**Fisher's ''z'' transformation, Fisher's Z-transformation**

In practice, confidence intervals and hypothesis tests relating to ρ are usually carried out using the Fisher transformation, the inverse hyperbolic tangent function (artanh) of r: z = artanh(r) = (1/2) ln((1 + r)/(1 − r)).

In statistics, hypotheses about the value of the population correlation coefficient ρ between variables X and Y can be tested using the Fisher transformation (aka Fisher z-transformation) applied to the sample correlation coefficient.
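A minimal sketch of a Fisher-transformation confidence interval (the sample values r = 0.5, n = 100 are made up): z = artanh(r) is approximately normal with standard error 1/√(n − 3), so an interval on the z scale can be mapped back to the r scale with tanh.

```python
import numpy as np

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for rho via the Fisher z-transformation."""
    z = np.arctanh(r)                 # Fisher transformation: artanh(r)
    se = 1.0 / np.sqrt(n - 3)         # approximate standard error of z
    lo, hi = z - z_crit * se, z + z_crit * se
    return np.tanh(lo), np.tanh(hi)   # transform endpoints back to the r scale

lo, hi = fisher_ci(r=0.5, n=100)
assert lo < 0.5 < hi
assert -1 < lo and hi < 1             # interval stays inside [-1, 1]
```

Transforming back with tanh guarantees the interval respects the [−1, 1] range of r, which a naive interval on the r scale does not.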

### Karl Pearson

**Pearson; Pearson, Karl; Carl Pearson**

It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.

Correlation coefficient. The correlation coefficient (first conceived by Francis Galton) was defined as a product-moment, and its relationship with linear regression was studied.

### Multivariate normal distribution

**multivariate normal, bivariate normal distribution, jointly normally distributed**

For pairs from an uncorrelated bivariate normal distribution, the sampling distribution of the statistic t = r \sqrt{(n-2)/(1-r^2)} follows Student's t-distribution with n − 2 degrees of freedom.

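A quick simulation sketch of this fact (seed and sample size are illustrative), using the statistic t = r·√((n − 2)/(1 − r²)): under independence, the simulated t values should behave like draws from Student's t with df = n − 2, whose variance is df/(df − 2).

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 30, 5000
ts = []
for _ in range(reps):
    x, y = rng.normal(size=n), rng.normal(size=n)   # independent, so rho = 0
    r = np.corrcoef(x, y)[0, 1]
    ts.append(r * np.sqrt((n - 2) / (1 - r**2)))
ts = np.asarray(ts)

df = n - 2
assert abs(ts.var() - df / (df - 2)) < 0.15   # t-distribution variance = df/(df-2)
assert abs(ts.mean()) < 0.08                  # symmetric about zero
```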

### Coefficient of determination

**R^2, explained**

The square of the sample correlation coefficient is typically denoted r^2 and is a special case of the coefficient of determination.

One class of such cases includes that of simple linear regression, where r^2 is used instead of R^2. When an intercept is included, then r^2 is simply the square of the sample correlation coefficient (i.e., r) between the observed outcomes and the observed predictor values.
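A small sketch of this equivalence (data and seed are made up): in simple linear regression with an intercept, R² computed as 1 − SS_res/SS_tot equals the squared sample correlation r².

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=50)
y = 3.0 + 0.7 * x + rng.normal(size=50)

# Fit y = intercept + slope * x by least squares.
slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
R2 = 1 - ss_res / ss_tot                 # coefficient of determination

r = np.corrcoef(x, y)[0, 1]
assert np.isclose(R2, r ** 2)            # R^2 equals r^2 with an intercept
```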

### Cosine similarity

**cosine distance, cosine angle**

This uncentred correlation coefficient is identical with the cosine similarity.

If the attribute vectors are normalized by subtracting the vector means (e.g., A - \bar{A}), the measure is called the centered cosine similarity and is equivalent to the Pearson correlation coefficient.
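As a sketch of this equivalence (vectors below are made up), the uncentred cosine similarity of two strictly positive vectors generally differs from Pearson's r, but once each vector's mean is subtracted, cosine similarity and Pearson's r coincide:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(4)
a = rng.uniform(1, 5, size=40)   # strictly positive, so uncentred cosine is inflated
b = rng.uniform(1, 5, size=40)

r = np.corrcoef(a, b)[0, 1]
assert np.isclose(cosine(a - a.mean(), b - b.mean()), r)   # centred cosine = Pearson r
assert not np.isclose(cosine(a, b), r)                     # uncentred cosine differs
```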

### Resampling (statistics)

**resampling, statistical support, strongly supported**

In some situations, the bootstrap can be applied to construct confidence intervals, and permutation tests can be applied to carry out hypothesis tests.

Bootstrapping is a statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter like a mean, median, proportion, odds ratio, correlation coefficient or regression coefficient.
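A minimal percentile-bootstrap sketch for the correlation coefficient (data, seed, and replication count are illustrative): resample (x, y) pairs with replacement, recompute r each time, and take empirical percentiles.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n)
r_obs = np.corrcoef(x, y)[0, 1]

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, size=n)   # resample indices, keeping x-y pairing intact
    boot.append(np.corrcoef(x[idx], y[idx])[0, 1])
lo, hi = np.percentile(boot, [2.5, 97.5])   # 95% percentile interval

assert lo < r_obs < hi
assert hi - lo < 0.5   # interval is informative at this sample size
```

Note that pairs, not individual values, are resampled; resampling x and y separately would destroy the correlation being estimated.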

### Correlation and dependence

**correlation, correlated, correlate**

In statistics, the Pearson correlation coefficient (PCC), also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y.

The most common of these is the Pearson correlation coefficient, which is sensitive only to a linear relationship between two variables (which may be present even when one variable is a nonlinear function of the other).

### Distance correlation

**distance standard deviation, distance covariance**


This is in contrast to Pearson's correlation, which can only detect linear association between two random variables.

### Normally distributed and uncorrelated does not imply independent

**here for an example; in general, not sufficient; individually normally distributed**


In probability theory, although simple examples illustrate that linear uncorrelatedness of two random variables does not in general imply their independence, it is sometimes mistakenly thought that it does imply that when the two random variables are normally distributed.
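A classic counterexample can be sketched numerically (seed and sample size are illustrative): take X ~ N(0, 1) and Y = S·X, where S = ±1 is an independent random sign. Then Y is also N(0, 1) and Cov(X, Y) = E[S]·E[X²] = 0, yet X and Y are clearly dependent, since |Y| = |X| always. The pair is not jointly normal, which is why uncorrelatedness fails to imply independence here.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
x = rng.normal(size=n)
s = rng.choice([-1.0, 1.0], size=n)   # random sign, independent of x
y = s * x

assert abs(np.corrcoef(x, y)[0, 1]) < 0.02    # uncorrelated (up to sampling noise)
assert np.allclose(np.abs(x), np.abs(y))      # yet perfectly dependent: |Y| = |X|
assert abs(y.mean()) < 0.02                   # Y has a standard normal marginal
assert abs(y.std() - 1) < 0.02
```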

### Multiple correlation

**coefficient of multiple determination, coefficient of multiple correlation**

The multiple correlation of a variable is the correlation between that variable's values and the best predictions that can be computed linearly from the predictive variables.
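A sketch of this idea (data, coefficients, and seed are made up): fit the best linear prediction of y from several predictors by least squares, then take the Pearson correlation between y and the fitted values.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
X = rng.normal(size=(n, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

A = np.column_stack([np.ones(n), X])            # add an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)    # least-squares fit
y_hat = A @ coef                                # best linear prediction of y

R = np.corrcoef(y, y_hat)[0, 1]                 # multiple correlation coefficient
assert 0 <= R <= 1    # unlike r, the multiple correlation is non-negative
assert R > 0.8        # the predictors explain most of the variance here
```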

### RV coefficient

The RV coefficient is a multivariate generalization of the squared Pearson correlation coefficient (like r^2, the RV coefficient takes values between 0 and 1). It measures the closeness of two sets of points that may each be represented in a matrix.
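A sketch of the RV coefficient under the usual definition RV = tr(S_xy S_yx) / √(tr(S_xx²) tr(S_yy²)), where the S matrices are (cross-)covariance matrices of the centered data; for one-dimensional X and Y it reduces to the squared Pearson correlation. The data below are made up for illustration.

```python
import numpy as np

def rv_coefficient(X, Y):
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx, Syy = Xc.T @ Xc, Yc.T @ Yc
    Sxy = Xc.T @ Yc
    return np.trace(Sxy @ Sxy.T) / np.sqrt(np.trace(Sxx @ Sxx) * np.trace(Syy @ Syy))

rng = np.random.default_rng(8)
x = rng.normal(size=(50, 1))
y = 0.8 * x + 0.2 * rng.normal(size=(50, 1))

r = np.corrcoef(x[:, 0], y[:, 0])[0, 1]
assert np.isclose(rv_coefficient(x, y), r ** 2)   # 1-D case: RV = r^2

# Genuinely multivariate case: RV still lies in [0, 1].
rv = rv_coefficient(rng.normal(size=(50, 3)), rng.normal(size=(50, 2)))
assert 0 <= rv <= 1
```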

### Probability distribution

**distribution, continuous probability distribution, discrete probability distribution**

The population Pearson correlation coefficient is defined in terms of moments, and therefore exists for any bivariate probability distribution for which the population covariance is defined and the marginal population variances are defined and are non-zero.

F-distribution, the distribution of the ratio of two scaled chi-squared variables; useful, e.g., for inferences that involve comparing variances or that involve R^2 (the squared correlation coefficient)

### Simple linear regression

**simple regression; i.e. regression line; linear least squares regression with an intercept term and a single explanator**

In this case, it estimates the fraction of the variance in Y that is explained by X in a simple linear regression.

The product-moment correlation coefficient might also be calculated as r = \sum_i (x_i - \bar{x})(y_i - \bar{y}) \,/\, \sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}.

### Partial correlation

If a population or data-set is characterized by more than two variables, a partial correlation coefficient measures the strength of dependence between a pair of variables that is not accounted for by the way in which they both change in response to variations in a selected subset of the other variables.

If we compute the Pearson correlation coefficient between variables X and Y, the result is approximately 0.969, while if we compute the partial correlation between X and Y, using the formula given above, we find a partial correlation of 0.919.
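A sketch of a first-order partial correlation via the standard formula (r_xy − r_xz r_yz)/√((1 − r_xz²)(1 − r_yz²)). The data below are made up for illustration and are not the example from the text: both x and y are driven by a common variable z, so they appear correlated marginally, but controlling for z removes most of the association.

```python
import numpy as np

def partial_corr(x, y, z):
    """Partial correlation of x and y, controlling for z."""
    rxy = np.corrcoef(x, y)[0, 1]
    rxz = np.corrcoef(x, z)[0, 1]
    ryz = np.corrcoef(y, z)[0, 1]
    return (rxy - rxz * ryz) / np.sqrt((1 - rxz**2) * (1 - ryz**2))

rng = np.random.default_rng(9)
z = rng.normal(size=300)
x = z + 0.5 * rng.normal(size=300)   # x and y share the common driver z
y = z + 0.5 * rng.normal(size=300)

assert np.corrcoef(x, y)[0, 1] > 0.6       # marginally, x and y look correlated
assert abs(partial_corr(x, y, z)) < 0.2    # controlling for z removes most of it
```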

### Spearman's rank correlation coefficient

**rank correlation coefficient, Spearman, Spearman's rho**


The Spearman correlation between two variables is equal to the Pearson correlation between the rank values of those two variables; while Pearson's correlation assesses linear relationships, Spearman's correlation assesses monotonic relationships (whether linear or not).
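A sketch of this relationship (data and seed are illustrative): a monotonic but strongly nonlinear relationship gives Spearman's rho of exactly 1, while Pearson's r falls short of 1.

```python
import numpy as np

def ranks(a):
    """Rank positions 1..n (this example has no ties)."""
    return np.argsort(np.argsort(a)) + 1.0

rng = np.random.default_rng(10)
x = rng.uniform(0, 5, size=100)
y = np.exp(x)   # monotonic in x, but strongly nonlinear

spearman = np.corrcoef(ranks(x), ranks(y))[0, 1]   # Pearson r of the ranks
pearson = np.corrcoef(x, y)[0, 1]

assert np.isclose(spearman, 1.0)   # perfect monotonic association
assert pearson < 0.96              # Pearson is pulled down by the nonlinearity
```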

### Quadrant count ratio


The QCR is not commonly used in the practice of statistics; rather, it is a useful tool in statistics education because it can be used as an intermediate step in the development of Pearson's correlation coefficient.
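A sketch of the QCR (data and seed are made up): split the plane into quadrants at the point of means (x̄, ȳ) and compute (points in quadrants I and III minus points in quadrants II and IV) divided by the total count. Like r, it lies in [−1, 1] and is positive for an increasing trend.

```python
import numpy as np

def qcr(x, y):
    dx, dy = x - x.mean(), y - y.mean()
    concordant = np.sum(dx * dy > 0)   # points in quadrants I and III
    discordant = np.sum(dx * dy < 0)   # points in quadrants II and IV
    return (concordant - discordant) / len(x)

rng = np.random.default_rng(11)
x = rng.normal(size=500)
y = x + 0.5 * rng.normal(size=500)     # strong positive trend

assert -1 <= qcr(x, y) <= 1
assert qcr(x, y) > 0.5   # most points fall in quadrants I and III
```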

### Exchangeable random variables

**exchangeability, exchangeable, exchangeable sequence**

However, the standard versions of these approaches rely on exchangeability of the data, meaning that there is no ordering or grouping of the data pairs being analyzed that might affect the behavior of the correlation estimate.

Let (X, Y) have a bivariate normal distribution with parameters \mu = 0, unit variances, and an arbitrary correlation coefficient \rho. The random variables X and Y are then exchangeable, but independent only if \rho = 0. The density function is f(x, y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left(-\frac{x^2 - 2\rho xy + y^2}{2(1-\rho^2)}\right).

### Anscombe's quartet


In the second graph (top right), the data are not distributed normally; while a relationship between the two variables is obvious, it is not linear, and the Pearson correlation coefficient is not relevant. A more general regression and the corresponding coefficient of determination would be more appropriate.

### Cauchy–Schwarz inequality

**Cauchy Schwarz inequality, Cauchy's inequality, Cauchy-Schwarz inequality**

According to the Cauchy–Schwarz inequality it has a value between +1 and −1, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation.

### Francis Galton

**Sir Francis Galton; Galton; Galton, Francis**

It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.

### Moment (mathematics)

**moments, moment, raw moment**

The population Pearson correlation coefficient is defined in terms of moments, and therefore exists for any bivariate probability distribution for which the population covariance is defined and the marginal population variances are defined and are non-zero. The form of the definition involves a "product moment", that is, the mean (the first moment about the origin) of the product of the mean-adjusted random variables; hence the modifier product-moment in the name.

### Statistical population

**population, subpopulation, subpopulations**

The population Pearson correlation coefficient is defined in terms of moments, and therefore exists for any bivariate probability distribution for which the population covariance is defined and the marginal population variances are defined and are non-zero. Pearson's correlation coefficient when applied to a population is commonly represented by the Greek letter ρ (rho) and may be referred to as the population correlation coefficient or the population Pearson correlation coefficient.