Updated March 30, 2026

Correlation Coefficient Guide: Pearson r, Interpretation & Examples

By The hakaru Team · Last updated March 2026

Quick Answer

* The Pearson correlation coefficient (r) ranges from –1 to +1, measuring the strength and direction of a linear relationship.
* r = +1 means perfect positive correlation; r = –1 means perfect negative correlation; r = 0 means no linear relationship.
* r² (r-squared) tells you the proportion of variance explained: an r of 0.7 means about 49% of variance is shared.
* Correlation does not imply causation: confounders, reverse causation, and coincidence are always possible.

Try the Free Correlation Coefficient Calculator →

What Is a Correlation Coefficient?

A correlation coefficient is a number that quantifies the strength and direction of a relationship between two variables. The most common type, the Pearson product-moment correlation coefficient (denoted r), measures how closely two continuous variables follow a straight-line (linear) pattern.

Karl Pearson developed the formula in 1896, building on earlier work by Francis Galton. Today it is the most widely used statistical measure of association. According to a 2020 analysis in PLOS ONE, the Pearson correlation appears in over 75% of published research papers that report bivariate relationships.

The Pearson Correlation Formula

The formula for Pearson's r is:

r = ∑[(x−x̄)(y−ȳ)] / √[∑(x−x̄)² × ∑(y−ȳ)²]

In plain English: it divides the covariance of x and y by the product of their standard deviations. This normalization constrains r to the range [–1, +1].
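As a minimal sketch of the formula above, the following plain-Python function computes r directly from the sums of deviations. The function name `pearson_r` and the zero-variance check are illustrative choices, not part of any particular library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from the deviation sums in the formula above."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Numerator: sum of products of paired deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / sqrt(s_xx * s_yy)


# A perfectly linear increasing relationship gives r = 1 (up to rounding).
print(pearson_r([1, 2, 3, 4], [10, 20, 30, 40]))
```

The 1/(n − 1) factors in the covariance and in the two standard deviations cancel, which is why the raw sums are enough.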

Worked Example

Consider five students' hours studied (x) and exam scores (y):

Student	Hours (x)	Score (y)
A	2	65
B	4	78
C	6	82
D	8	90
E	10	95

Computing Pearson's r for these data gives r ≈ 0.982, indicating a very strong positive linear relationship. The r² of about 0.964 means that study hours explain about 96% of the variance in exam scores in this sample.
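The arithmetic for this example can be written out as a check. With means x̄ = 6 and ȳ = 82 taken from the table, the deviation sums and the resulting coefficient are:

```latex
\begin{aligned}
\sum (x-\bar{x})(y-\bar{y}) &= (-4)(-17) + (-2)(-4) + (0)(0) + (2)(8) + (4)(13) = 144 \\
\sum (x-\bar{x})^2 &= 16 + 4 + 0 + 4 + 16 = 40 \\
\sum (y-\bar{y})^2 &= 289 + 16 + 0 + 64 + 169 = 538 \\
r &= \frac{144}{\sqrt{40 \times 538}} = \frac{144}{\sqrt{21520}} \approx 0.982,
\qquad r^2 = \frac{144^2}{21520} \approx 0.964
\end{aligned}
```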

Interpreting Correlation Strength

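Jacob Cohen's widely cited 1988 guidelines for behavioral sciences set benchmarks of 0.10 (small), 0.30 (medium), and 0.50 (large). The table below extends those benchmarks with negligible and very large bands: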

|r| Value	Strength	Example
0.00–0.10	Negligible	Shoe size and IQ
0.10–0.30	Small	Income and happiness (r ≈ 0.20)
0.30–0.50	Medium	SAT scores and college GPA (r ≈ 0.40)
0.50–0.70	Large	Height and weight (r ≈ 0.60)
0.70–1.00	Very large	Study hours and test scores in controlled settings

Context matters enormously. In physics, an r of 0.70 might be disappointing. In psychology, r = 0.30 can represent a meaningful and publishable finding. The American Psychological Association (APA) emphasizes reporting effect sizes alongside p-values rather than relying on arbitrary thresholds.
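The bands in the table can be turned into a small lookup, sketched below. The function name `strength_label` is hypothetical, and because the table's ranges share their endpoints, the sketch assumes each band includes its lower bound and excludes its upper bound.

```python
def strength_label(r):
    """Map a correlation to the strength bands in the table above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength ignores the sign; direction is reported separately
    # Assumption: each band includes its lower bound and excludes its upper one.
    bands = [(0.10, "Negligible"), (0.30, "Small"), (0.50, "Medium"), (0.70, "Large")]
    for upper, label in bands:
        if size < upper:
            return label
    return "Very large"


for r in (0.05, -0.20, 0.40, 0.60, -0.80):
    direction = "positive" if r > 0 else "negative"
    print(f"r = {r:+.2f}: {strength_label(r)}, {direction}")
```

A label like this is only a starting point; the field-specific context described above still decides whether a given r is meaningful.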

R-Squared: The Coefficient of Determination

Squaring the correlation coefficient gives r², which represents the proportion of variance in one variable that is predictable from the other. This is often more intuitive than r itself.

r	r²	Variance Explained
0.30	0.09	9%
0.50	0.25	25%
0.70	0.49	49%
0.80	0.64	64%
0.90	0.81	81%

A correlation of 0.50 sounds strong, but it only explains 25% of the variance. The remaining 75% is driven by other factors. This distinction is critical for making predictions — even a “large” correlation leaves substantial unexplained variation.
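One way to see what "variance explained" means is to fit a least-squares line and compare the leftover (residual) variation with the total variation. The sketch below does this for the five-student data from the worked example; the variable names are illustrative, and the equality it checks holds for simple linear regression with one predictor.

```python
hours = [2, 4, 6, 8, 10]
scores = [65, 78, 82, 90, 95]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in scores)  # total variation in scores

# Squared Pearson correlation.
r_squared = s_xy ** 2 / (s_xx * s_yy)

# Least-squares line: score = intercept + slope * hours.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(hours, scores))

# Share of the total variation that the fitted line accounts for.
explained_share = 1 - ss_residual / s_yy

print(f"r^2 = {r_squared:.3f}, explained share = {explained_share:.3f}")
```

Both quantities come out the same, which is why r² is read as the proportion of variance accounted for by a straight-line fit.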

Correlation vs Causation

This is one of the most important concepts in applied statistics. Correlation tells you that two variables move together. It does not tell you why.

Three explanations exist for any observed correlation:

* Direct causation: X causes Y (or Y causes X).
* Confounding: A third variable Z drives both X and Y. Ice cream sales and drowning deaths are correlated (r ≈ 0.85 seasonally) because both increase in summer heat, not because ice cream causes drowning.
* Coincidence: Spurious correlations exist everywhere. Tyler Vigen's Spurious Correlations project catalogued hundreds, including the r ≈ 0.998 correlation between US spending on science, space, and technology and suicides by hanging, strangulation, and suffocation (1999–2009). Clearly meaningless.

A 2015 study in the American Journal of Epidemiology found that over 40% of media health headlines implied causation from correlational studies. Critical readers should always ask: was this a controlled experiment or an observational study?

Pearson vs Spearman vs Kendall

Method	Measures	Best For
Pearson (r)	Linear relationship	Continuous, normally distributed data
Spearman (ρ)	Monotonic relationship	Ordinal data, outliers present, non-normal distributions
Kendall (τ)	Concordance of pairs	Small samples, tied ranks, ordinal data

Pearson is the default choice for continuous data with roughly normal distributions. Use Spearman when your data has outliers, is ordinal (like satisfaction ratings), or the relationship is monotonic but curved. Kendall's tau is more robust with small sample sizes (<30) and handles ties better.

Common Mistakes When Using Correlation

Ignoring Outliers

A single outlier can dramatically inflate or deflate Pearson's r. Anscombe's quartet (1973) famously demonstrated four datasets with identical r = 0.816 but completely different patterns — one driven entirely by a single outlier. Always plot your data before interpreting r.

Restricting the Range

Measuring correlation on a subset of data with limited variability suppresses r. For example, the correlation between SAT scores and college GPA appears weak at highly selective universities because all students have similar SAT scores. The true population correlation is higher.

Assuming Linearity

Pearson's r only captures linear relationships. A symmetric inverted-U relationship between arousal and performance (the Yerkes–Dodson curve), sampled evenly on both sides of the peak, would yield r ≈ 0 despite a strong and real association. Use scatterplots and consider non-linear alternatives.

Frequently Asked Questions

What does a correlation coefficient of 0.7 mean?

An r value of 0.7 indicates a strong positive linear relationship between two variables. As one variable increases, the other tends to increase as well. The r-squared value (0.49) means that approximately 49% of the variance in one variable is explained by the other.

Does correlation imply causation?

No. Correlation measures the strength of a linear relationship, but it does not prove that one variable causes changes in the other. The correlation could be due to a confounding variable, reverse causation, or pure coincidence. Establishing causation requires controlled experiments or rigorous causal inference methods.

What is a good correlation coefficient?

It depends on the field. In physics, r values above 0.95 are common. In social sciences, r = 0.30 may be considered moderate and meaningful. Cohen's guidelines classify 0.10 as small, 0.30 as medium, and 0.50 as large for behavioral research. Always interpret r in context.

What is the difference between Pearson and Spearman correlation?

Pearson correlation (r) measures linear relationships between continuous variables; its standard significance tests additionally assume the data are approximately normally distributed. Spearman correlation (ρ) measures monotonic relationships using rank-ordered data and makes no distributional assumptions. Use Spearman when data are ordinal, contain outliers, or follow a relationship that is monotonic but not linear.

Can correlation be negative?

Yes. A negative correlation (r between –1 and 0) means that as one variable increases, the other tends to decrease. For example, the correlation between outdoor temperature and heating costs is strongly negative — as temperature rises, heating costs fall. An r of –0.8 is just as strong as r = 0.8, only in the opposite direction.