Chi-Square Tests
So far, you have worked mainly with numerical variables and tests that compare means.
However, many real-world problems involve categorical data, such as gender, department, preference, or outcome category.
The Chi-Square test is used to analyze relationships between categorical variables.
What Is a Chi-Square Test?
A Chi-Square test of independence evaluates whether there is a statistically significant association between two categorical variables.
Instead of comparing means, this test compares:
- Observed frequencies
- Expected frequencies
If the observed frequencies differ from the expected frequencies by more than chance alone would plausibly explain, the two variables are likely associated.
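The comparison of observed and expected frequencies is usually summarized by the Pearson Chi-Square statistic. In the formula below, the symbols are notation introduced here for illustration: $O_i$ is the observed count in cell $i$, $E_i$ is the expected count in that cell, and the sum runs over all cells of the table.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```

Each cell adds a non-negative term, so the statistic is zero only when every observed count equals its expected count.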
When to Use the Chi-Square Test
Chi-Square tests are appropriate when:
- Both variables are categorical
- Data is presented as counts or frequencies
- Observations are independent
Common applications include:
- Gender vs product preference
- Department vs job satisfaction
- Education level vs employment status
Example Dataset
Consider a survey measuring product preference by gender:
| Gender | Product A | Product B |
|---|---|---|
| Male | 30 | 20 |
| Female | 25 | 35 |
The question is: Is product preference associated with gender?
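Before testing, it can help to look at the row, column, and grand totals of the table, since the expected frequencies are built from them. The following is a minimal Python sketch for checking those totals by hand outside SPSS; the variable names are chosen here for illustration and are not part of SPSS.

```python
# Observed counts from the survey table (rows: gender, columns: product).
observed = {
    "Male": {"Product A": 30, "Product B": 20},
    "Female": {"Product A": 25, "Product B": 35},
}
products = ["Product A", "Product B"]

# Row totals: number of respondents of each gender.
row_totals = {gender: sum(counts.values()) for gender, counts in observed.items()}

# Column totals: number of respondents preferring each product.
col_totals = {p: sum(observed[g][p] for g in observed) for p in products}

# Grand total: all respondents in the survey.
grand_total = sum(row_totals.values())

print(row_totals)   # {'Male': 50, 'Female': 60}
print(col_totals)   # {'Product A': 55, 'Product B': 55}
print(grand_total)  # 110
```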
Understanding Expected Frequencies
Expected frequencies represent what the counts would look like if no relationship existed between the variables. For each cell, the expected frequency is (row total × column total) / grand total.
Chi-Square tests compare observed counts to these expected values.
Large differences contribute to a larger Chi-Square statistic.
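To make this concrete for the gender and product table, here is a minimal Python sketch that computes each expected count as row total × column total / grand total and then sums (observed − expected)² / expected over all cells. It is only a hand-check under those standard formulas, not a replacement for the SPSS output; the variable names are illustrative.

```python
# Observed counts from the example table.
observed = [[30, 20],   # Male:   Product A, Product B
            [25, 35]]   # Female: Product A, Product B

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count per cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson Chi-Square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected)              # [[25.0, 25.0], [30.0, 30.0]]
print(round(chi_square, 3))  # 3.667
```

Every cell here differs from its expected count by 5, but the cells with the smaller expected count (25) contribute more to the statistic than those with the larger one (30).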
Running the Chi-Square Test (Menu)
To perform the test using SPSS menus:
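- Go to Analyze → Descriptive Statistics → Crosstabs
- Place one variable in Row(s) and the other in Column(s)
- Click Statistics, select Chi-square, then click Continue
- Optionally, click Cells, select Expected under Counts, then click Continue
- Click OK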
SPSS generates crosstabulation tables and Chi-Square statistics.
Using SPSS Syntax
CROSSTABS
  /TABLES=Gender BY Product
  /STATISTICS=CHISQ
  /CELLS=COUNT EXPECTED.
This syntax produces observed and expected counts along with Chi-Square results.
Interpreting the Output
Key elements to interpret:
- Pearson Chi-Square value
- Degrees of freedom (df)
- Asymptotic Significance (2-sided), which is the p-value (labeled Asymp. Sig. in older versions)
Interpretation rule:
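- p < 0.05 → the association is statistically significant at the 5% level
- p ≥ 0.05 → no statistically significant association was detected (this is not proof that the variables are independent)
Always verify that expected frequencies are sufficient: as a rule of thumb, every cell should have an expected count of at least 5. SPSS reports how many cells fall below 5 in a footnote under the Chi-Square Tests table.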
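The interpretation steps can be traced by hand for the gender and product example. The sketch below is a minimal Python illustration, assuming the example table and its expected counts under independence (row total × column total / grand total). It uses the identity that, for 1 degree of freedom only, the upper-tail Chi-Square probability equals erfc(√(χ²/2)); larger tables would need a statistics library instead. Variable names are illustrative.

```python
import math

observed = [[30, 20], [25, 35]]          # example survey counts
expected = [[25.0, 25.0], [30.0, 30.0]]  # row total * column total / grand total

# Pearson Chi-Square statistic.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)
if df != 1:
    raise ValueError("the erfc shortcut below is valid only for df = 1")

# Upper-tail probability of a Chi-Square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2))

# Rule-of-thumb check: every expected count should be at least 5.
min_expected = min(min(row) for row in expected)

print(df)                  # 1
print(round(p_value, 4))   # 0.0555
print(min_expected >= 5)   # True
print("significant" if p_value < 0.05 else "not significant")  # not significant
```

With p just above 0.05, the example data do not show a statistically significant association between gender and product preference at the 5% level, even though the observed counts differ from the expected ones.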
Common Mistakes
Typical errors include:
- Using Chi-Square for numerical data
- Ignoring low expected frequencies
- Assuming causation from association
Chi-Square indicates association, not cause-and-effect.
Quiz 1
What type of data does Chi-Square analyze?
Categorical data.
Quiz 2
What does the Chi-Square test compare?
Observed and expected frequencies.
Quiz 3
What does p < 0.05 indicate?
A significant association exists.
Quiz 4
Which SPSS menu is used for Chi-Square tests?
Analyze → Descriptive Statistics → Crosstabs.
Quiz 5
Does Chi-Square prove causation?
No.
Mini Practice
A company surveys customers to record Gender and Purchase Decision (Yes/No).
Create a contingency table and perform a Chi-Square test to determine whether purchase decision is associated with gender.
Use Crosstabs → Statistics → Chi-square and interpret the p-value.
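If the survey responses arrive as one record per customer, the contingency table is simply a tally of each (Gender, Purchase Decision) pair, which is what Crosstabs does internally. The sketch below shows that tallying step in Python. The six responses are made-up placeholders, not real survey data, and are far too few for a valid test; they only illustrate how the table is formed.

```python
from collections import Counter

# Made-up placeholder responses (NOT real data): (gender, purchase decision).
responses = [
    ("Male", "Yes"), ("Male", "No"), ("Female", "Yes"),
    ("Female", "Yes"), ("Male", "Yes"), ("Female", "No"),
]

# Count how often each (gender, decision) pair occurs.
cell_counts = Counter(responses)

# Fix the row and column order so the table is reproducible.
genders = sorted({g for g, _ in responses})
decisions = sorted({d for _, d in responses})

# Rows are genders, columns are purchase decisions.
table = [[cell_counts[(g, d)] for d in decisions] for g in genders]

print(decisions)  # ['No', 'Yes']
for g, row in zip(genders, table):
    print(g, row)
# Female [1, 2]
# Male [1, 2]
```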
What’s Next
In the next lesson, you will learn about One-Way ANOVA, which extends mean comparison to more than two groups.