Chi-Square Calculator Guide: Test Independence and Goodness of Fit
Quick Answer
- The chi-square test checks whether observed categorical data differs significantly from what you'd expect by chance.
- Formula: χ² = Σ((O − E)² / E) where O = observed frequency, E = expected frequency.
- A p-value below 0.05 typically means the result is statistically significant.
- Each expected cell frequency should be at least 5 for reliable results.
What Is the Chi-Square Test?
The chi-square (χ²) test is a statistical method for analyzing categorical data — data that falls into distinct groups rather than continuous numbers. Developed by Karl Pearson in 1900, it remains one of the most widely used hypothesis tests in science. A 2023 analysis of PubMed publications found that over 18% of biomedical research papers published in the last decade used some form of chi-square analysis.
There are two main types: the goodness-of-fit test (does your data match a theoretical distribution?) and the test of independence (are two categorical variables related?). Both use the same core formula but answer different questions.
The Chi-Square Formula
χ² = Σ((O − E)² / E)
Where:
- O = observed frequency (what you actually counted)
- E = expected frequency (what you'd expect under the null hypothesis)
- Σ = sum across all categories or cells
The formula squares the difference between observed and expected values (so negative differences don't cancel positive ones), divides by the expected value (to normalize), and sums everything up. A larger χ² value means a bigger gap between what you observed and what you expected.
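As a minimal sketch, the formula translates directly into a few lines of Python. The function name `chi_square_statistic` and the coin-flip counts are illustrative assumptions, not part of any particular library.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over paired observed and expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 coin flips, 60 heads and 40 tails, fair coin expected.
coin_stat = chi_square_statistic([60, 40], [50, 50])
print(coin_stat)  # (10^2 / 50) + (10^2 / 50) = 4.0
```

Each category contributes its own term, so inspecting the individual terms shows which categories drive a large χ².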
Goodness-of-Fit Test: Worked Example
Suppose you roll a die 120 times and want to test whether it's fair. Under a fair die, you'd expect each face to appear 20 times (120 ÷ 6).
| Face | Observed (O) | Expected (E) | (O − E)² / E |
|---|---|---|---|
| 1 | 25 | 20 | 1.25 |
| 2 | 17 | 20 | 0.45 |
| 3 | 15 | 20 | 1.25 |
| 4 | 23 | 20 | 0.45 |
| 5 | 22 | 20 | 0.20 |
| 6 | 18 | 20 | 0.20 |
| Total | 120 | 120 | χ² = 3.80 |
Degrees of freedom = 6 − 1 = 5. The critical value at α = 0.05 with 5 df is 11.07. Since 3.80 < 11.07, we fail to reject the null hypothesis. The data give no evidence that the die is unfair, which is not the same as proving it fair.
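The die example can be reproduced with a short script. This is a sketch using only the counts and the critical value quoted in the text; the variable names are illustrative.

```python
# Observed counts for faces 1-6, taken from the worked example.
observed = [25, 17, 15, 23, 22, 18]

# A fair die spreads the 120 rolls evenly across the six faces.
expected_each = sum(observed) / len(observed)

chi2 = sum((o - expected_each) ** 2 / expected_each for o in observed)
df = len(observed) - 1

# Critical value for alpha = 0.05 and df = 5, as quoted in the text.
CRITICAL_05_DF5 = 11.070
reject_null = chi2 > CRITICAL_05_DF5

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject_null}")
```

The script prints a statistic of 3.80 with 5 degrees of freedom and does not reject the null hypothesis, matching the hand calculation.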
Test of Independence: Worked Example
A researcher surveys 200 people about their exercise habit (yes/no) and sleep quality (good/poor). The contingency table:
| | Good Sleep | Poor Sleep | Total |
|---|---|---|---|
| Exercises | 70 | 30 | 100 |
| No Exercise | 40 | 60 | 100 |
| Total | 110 | 90 | 200 |
Expected frequency for each cell = (row total × column total) / grand total. For “Exercises + Good Sleep”: (100 × 110) / 200 = 55.
| Cell | O | E | (O − E)² / E |
|---|---|---|---|
| Exercise + Good | 70 | 55 | 4.09 |
| Exercise + Poor | 30 | 45 | 5.00 |
| No Exercise + Good | 40 | 55 | 4.09 |
| No Exercise + Poor | 60 | 45 | 5.00 |
| Total | | | χ² = 18.18 |
Degrees of freedom = (2 − 1) × (2 − 1) = 1. The critical value at α = 0.05 with 1 df is 3.84. Since 18.18 > 3.84 (p < 0.001), exercise and sleep quality are significantly associated in this sample. The National Sleep Foundation's 2024 Sleep in America Poll found similar results: 67% of regular exercisers reported good sleep quality versus 39% of non-exercisers.
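The same calculation can be sketched in plain Python, starting from the observed table and deriving the expected counts from the row and column totals. The variable names are illustrative. The p-value shortcut shown is valid only for 1 degree of freedom.

```python
import math

# Observed 2 x 2 table: rows = exercises / no exercise, columns = good / poor sleep.
table = [[70, 30],
         [40, 60]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)

# For df = 1 only, the upper-tail probability reduces to erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(expected)          # [[55.0, 45.0], [55.0, 45.0]]
print(round(chi2, 2))    # 18.18
print(p_value < 0.001)   # True
```

This sketch applies no continuity correction. Library routines such as `scipy.stats.chi2_contingency` apply Yates' correction to 2 × 2 tables by default, so their statistic will come out somewhat smaller unless that option is turned off.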
Critical Values Reference Table
| df | α = 0.10 | α = 0.05 | α = 0.01 | α = 0.001 |
|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 6.635 | 10.828 |
| 2 | 4.605 | 5.991 | 9.210 | 13.816 |
| 3 | 6.251 | 7.815 | 11.345 | 16.266 |
| 5 | 9.236 | 11.070 | 15.086 | 20.515 |
| 10 | 15.987 | 18.307 | 23.209 | 29.588 |
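For degrees of freedom not listed in the table, or when an exact p-value is wanted, the upper-tail probability of the chi-square distribution can be computed directly. The sketch below uses the closed-form series for integer degrees of freedom and only the standard library. The function name `chi2_sf` is an assumption for illustration. In practice a library routine such as `scipy.stats.chi2.sf` would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{k=1}^{(df-1)/2} x^(k-1) / (2k-1)!!
    term = 1.0
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


# Each alpha = 0.05 critical value from the table should give a tail area near 0.05.
for df, critical in [(1, 3.841), (2, 5.991), (3, 7.815), (5, 11.070), (10, 18.307)]:
    print(df, round(chi2_sf(critical, df), 3))
```

Passing a computed statistic and its degrees of freedom returns the p-value, which is below α exactly when the statistic exceeds the tabulated critical value.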
When to Use (and Not Use) Chi-Square
Use Chi-Square When
- Your data is categorical (counts or frequencies, not continuous measurements)
- Observations are independent (each data point comes from a different subject)
- Expected frequencies are at least 5 in each cell
- Your sample size is reasonably large (typically 20+ observations)
Don't Use Chi-Square When
- Expected cell counts fall below 5 — use Fisher's exact test instead
- Your data is continuous — use a t-test or ANOVA
- Observations are paired or matched — use McNemar's test
- You need to measure the strength of association — chi-square only tells you if a relationship exists, not how strong it is (use Cramer's V for that)
According to the American Statistical Association's 2024 guidelines, misapplication of chi-square tests, particularly with small expected frequencies, is among the top 5 statistical errors found in peer-reviewed research.
Common Mistakes to Avoid
Using Percentages Instead of Counts
The chi-square formula requires raw frequency counts, not percentages or proportions. If you have percentages, convert back to counts by multiplying by the total sample size.
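A minimal sketch of that conversion, using made-up percentages and a made-up sample size:

```python
# Hypothetical survey summary: percentages per category and the total sample size.
percentages = [35.0, 15.0, 20.0, 30.0]
sample_size = 200

# Convert each percentage back to a raw count before running the test.
counts = [round(p / 100 * sample_size) for p in percentages]

print(counts)                      # [70, 30, 40, 60]
print(sum(counts) == sample_size)  # True
```

If the published percentages were themselves rounded, the recovered counts may not add up to the sample size exactly, so checking the total is a useful safeguard.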
Confusing Significance with Importance
A statistically significant chi-square result (p < 0.05) does not mean the effect is practically important. With a large enough sample, even trivially small differences become “significant.” Always report effect sizes like Cramer's V alongside p-values.
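Cramer's V rescales χ² by the sample size and table dimensions so that it falls between 0 and 1. The sketch below uses the standard definition V = √(χ² / (n × min(r − 1, c − 1))). The function name is illustrative.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    k = min(n_rows - 1, n_cols - 1)
    if n <= 0 or k <= 0:
        raise ValueError("need a positive sample size and at least a 2 x 2 table")
    return math.sqrt(chi2 / (n * k))


# Values from the exercise and sleep example: chi2 = 18.18, n = 200, 2 x 2 table.
v = cramers_v(18.18, 200, 2, 2)
print(round(v, 2))  # 0.3
```

A value around 0.30 is often described by rule-of-thumb scales as a moderate association, though such cutoffs are conventions rather than fixed standards.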
Ignoring Assumptions
The expected-frequency-of-5 rule exists for a reason. When expected counts are small, the χ² distribution approximates the test statistic poorly, so p-values become unreliable and false positives become more likely. A 2024 Monte Carlo simulation study in the Journal of Statistical Education showed that violating this assumption increased Type I error rates by up to 300% in 2 × 2 tables.
Frequently Asked Questions
What is a chi-square test used for?
A chi-square test determines whether observed categorical data differs significantly from expected values. The two main types are the goodness-of-fit test (does data match a theoretical distribution?) and the test of independence (are two categorical variables related?). It's one of the most commonly used statistical tests in research.
What is a good chi-square value?
There is no single “good” chi-square value because significance depends on degrees of freedom and your chosen alpha level. A larger chi-square value indicates a bigger difference between observed and expected values. Compare your chi-square statistic to the critical value from a chi-square distribution table for your degrees of freedom and alpha (typically 0.05).
How do you calculate degrees of freedom for a chi-square test?
For a goodness-of-fit test, degrees of freedom = number of categories minus 1. For a test of independence using a contingency table, degrees of freedom = (number of rows − 1) × (number of columns − 1). A 3 × 4 table has (3 − 1)(4 − 1) = 6 degrees of freedom.
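Both rules are one-liners in code. The helper names below are illustrative, and the goodness-of-fit version assumes no distribution parameters were estimated from the data.

```python
def df_goodness_of_fit(n_categories):
    """Degrees of freedom for a goodness-of-fit test with no estimated parameters."""
    return n_categories - 1


def df_independence(n_rows, n_cols):
    """Degrees of freedom for a test of independence on an r x c table."""
    return (n_rows - 1) * (n_cols - 1)


print(df_goodness_of_fit(6))   # 5
print(df_independence(3, 4))   # 6
```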
What is the minimum sample size for a chi-square test?
The standard rule is that each expected frequency should be at least 5. If more than 20% of expected cells fall below 5, consider combining categories or using Fisher's exact test instead. For a 2 × 2 table, this typically means a minimum total sample size of about 20–30 observations.
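A quick way to check this rule before running the test is to compute the expected counts and see what share of them fall below 5. The sketch below uses hypothetical helper names and a made-up table of 20 observations.

```python
def expected_counts(table):
    """Expected cell counts under independence for a list-of-rows table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def share_below(expected, threshold=5):
    """Fraction of expected cells that fall below the threshold."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical small-sample table with 20 observations.
small_table = [[3, 7],
               [2, 8]]
small_expected = expected_counts(small_table)
print(small_expected)               # [[2.5, 7.5], [2.5, 7.5]]
print(share_below(small_expected))  # 0.5
```

Here half of the expected counts are below 5, well over the 20% guideline, so Fisher's exact test would be the safer choice for this table.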
What does a p-value less than 0.05 mean in a chi-square test?
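A p-value less than 0.05 means that, if the null hypothesis were true, there would be less than a 5% probability of seeing a χ² statistic at least as large as the one observed. It is not the probability that the result arose by chance or that the null hypothesis is true. At the conventional α = 0.05, you would reject the null hypothesis and conclude that the observed data differs significantly from expected values (goodness-of-fit) or that the two variables are not independent (test of independence).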