SPSS Lesson 32 – Principal Component Analysis (PCA) | Dataplexa

Principal Component Analysis (PCA)

In the previous lesson, you learned how Factor Analysis identifies hidden underlying factors.

Principal Component Analysis (PCA) also reduces many variables into fewer components, but with a different objective.

PCA focuses on data compression and variance preservation, not on uncovering latent constructs.

Why PCA Is Used

Modern datasets often contain many correlated variables.

This creates problems such as:

Redundancy in information
Difficulty in visualization
Multicollinearity in regression

PCA transforms the original variables into a smaller set of new variables called principal components.

Core Idea Behind PCA

Each principal component is:

A linear combination of original variables
Uncorrelated with other components
Ordered by the amount of variance explained

The first component explains the maximum possible variance. The second explains the most of the remaining variance while staying uncorrelated with the first, and so on.
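The three properties above can be checked by hand in the simplest case of two standardized variables. The following is a minimal Python sketch (outside SPSS, standard library only), assuming made-up scores for six students; the variable names and values are illustrative, not data from this lesson. For two standardized variables with a positive correlation r, the two components are the scaled sum and the scaled difference, with variances 1 + r and 1 - r.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores using the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical scores for six students (illustrative values only).
math_scores = [55, 62, 70, 74, 81, 90]
science_scores = [58, 60, 73, 70, 85, 88]

z_math = standardize(math_scores)
z_sci = standardize(science_scores)

# Pearson correlation as the mean product of z-scores.
r = mean(a * b for a, b in zip(z_math, z_sci))

# Each component is a linear combination of the original variables.
pc1 = [(a + b) / math.sqrt(2) for a, b in zip(z_math, z_sci)]
pc2 = [(a - b) / math.sqrt(2) for a, b in zip(z_math, z_sci)]

# Both components have mean zero, so the mean square is the variance.
var_pc1 = mean(v * v for v in pc1)              # equals 1 + r
var_pc2 = mean(v * v for v in pc2)              # equals 1 - r
cov_pc = mean(a * b for a, b in zip(pc1, pc2))  # equals 0 (uncorrelated)

print(f"r = {r:.3f}")
print(f"variance of PC1 = {var_pc1:.3f}, variance of PC2 = {var_pc2:.3f}")
print(f"covariance between PC1 and PC2 = {cov_pc:.3f}")
```

The two variances add up to 2, the number of variables, which is why each eigenvalue can be read as a share of the total variance.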

PCA vs Factor Analysis

Aspect	PCA	Factor Analysis
Main goal	Variance preservation	Latent structure identification
Focus	Data reduction	Underlying factors
Error modeling	Analyzes total variance; does not separate error	Separates common & unique variance

In practice, PCA is often used as a preprocessing step, while factor analysis is used for theory building.

Eigenvalues and Variance Explained

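In PCA, each component has an eigenvalue, which is the amount of variance in the original variables that the component explains. By default, SPSS analyzes the correlation matrix, so each standardized variable contributes a variance of 1 and the eigenvalues sum to the number of variables.

Common rule (the Kaiser criterion):

Eigenvalue > 1 → retain the component, because it explains more variance than a single original variable

SPSS can also produce a scree plot, which shows the eigenvalues in descending order. Components that appear before the curve levels off (the "elbow") are usually retained.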
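The arithmetic behind the "% of Variance" and "Cumulative %" columns can be sketched in a few lines of Python (outside SPSS). The eigenvalues below are hypothetical values for four standardized variables, and the function name `summarize_components` is an assumption made for this illustration.

```python
def summarize_components(eigenvalues, threshold=1.0):
    """Return percent and cumulative percent of variance for each component.

    A component is flagged for retention when its eigenvalue exceeds
    the threshold (1.0 reproduces the eigenvalue-greater-than-1 rule).
    """
    total = sum(eigenvalues)
    rows = []
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for index, value in enumerate(ordered, start=1):
        percent = 100.0 * value / total
        cumulative += percent
        rows.append({
            "component": index,
            "eigenvalue": value,
            "percent": percent,
            "cumulative": cumulative,
            "retain": value > threshold,
        })
    return rows


# Hypothetical eigenvalues for four standardized variables (they sum to 4).
example_eigenvalues = [2.8, 0.5, 0.4, 0.3]
rows = summarize_components(example_eigenvalues)

for row in rows:
    print(
        f"Component {row['component']}: eigenvalue {row['eigenvalue']:.2f}, "
        f"{row['percent']:.1f}% of variance, "
        f"{row['cumulative']:.1f}% cumulative, retain={row['retain']}"
    )
```

With these made-up values only the first component passes the rule, and it accounts for about 70% of the total variance.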

Example Scenario

Suppose we collect data on:

Math score
Science score
English score
Logic score

These scores are correlated, so PCA can reduce them to one or two components that summarize overall academic performance.
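To make the scenario concrete, the Python sketch below (outside SPSS) extracts the first principal component from a made-up correlation matrix for the four scores. It uses power iteration, a simple technique swapped in here for illustration; SPSS relies on its own eigenvalue routines. The correlation values and the function name `first_component` are assumptions, not results from real data.

```python
import math


def multiply(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def first_component(matrix, iterations=500):
    """Approximate the largest eigenvalue and its unit eigenvector."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = multiply(matrix, vector)
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    # The Rayleigh quotient gives the eigenvalue for the converged vector.
    eigenvalue = sum(v * p for v, p in zip(vector, multiply(matrix, vector)))
    return eigenvalue, vector


# Hypothetical correlations among Math, Science, English, and Logic.
names = ["Math", "Science", "English", "Logic"]
correlations = [
    [1.00, 0.70, 0.50, 0.60],
    [0.70, 1.00, 0.45, 0.65],
    [0.50, 0.45, 1.00, 0.40],
    [0.60, 0.65, 0.40, 1.00],
]

eigenvalue, weights = first_component(correlations)
share = eigenvalue / len(names)  # eigenvalues of a correlation matrix sum to p

print(f"First eigenvalue: {eigenvalue:.3f} ({100 * share:.1f}% of variance)")
for name, weight in zip(names, weights):
    print(f"  weight for {name}: {weight:.3f}")
```

Because every score correlates positively with the others, all four weights come out positive, so the first component behaves like a weighted overall score.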

Running PCA in SPSS (Menu)

To perform PCA in SPSS:

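Go to Analyze → Dimension Reduction → Factor
Move the variables to analyze into the Variables box
Click Extraction and set Method to Principal components
Under Extract, keep Based on Eigenvalue with Eigenvalues greater than 1
Under Display, check Scree plot, then click Continue
Click OK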

SPSS Syntax for PCA


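* Principal component extraction on the four score variables.
FACTOR
  /VARIABLES Math Science English Logic
  /MISSING LISTWISE
  /ANALYSIS Math Science English Logic
  /PRINT INITIAL EXTRACTION
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1)
  /EXTRACTION PC
  /ROTATION NOROTATE.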

Interpreting PCA Output

When interpreting PCA results:

Look at eigenvalues
Check percentage of variance explained
Review component loadings

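Loadings with larger absolute values indicate a stronger relationship between a variable and a component. When the correlation matrix is analyzed, each loading is the correlation between that variable and the component.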
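The link between eigenvalues and loadings can be sketched in Python (outside SPSS). A loading is the component weight multiplied by the square root of the eigenvalue, and the squared loadings on a component add up to its eigenvalue. The example assumes a simplified, hypothetical case in which all four scores intercorrelate at r = 0.6; in that case the first eigenvalue is 1 + (p - 1) × r and all weights are equal. The function name `component_loadings` is an assumption for this illustration.

```python
import math


def component_loadings(eigenvalue, weights):
    """Scale unit-length component weights into loadings."""
    return [w * math.sqrt(eigenvalue) for w in weights]


# Hypothetical case: four variables that all intercorrelate at r = 0.6.
r, p = 0.6, 4
eigenvalue = 1 + (p - 1) * r       # first eigenvalue of this matrix
weights = [1 / math.sqrt(p)] * p   # equal weights with unit length

loadings = component_loadings(eigenvalue, weights)
squared = [value ** 2 for value in loadings]

for name, loading, square in zip(["Math", "Science", "English", "Logic"],
                                 loadings, squared):
    print(f"{name}: loading {loading:.3f}, squared loading {square:.3f}")

print(f"Sum of squared loadings: {sum(squared):.3f}")
print(f"Eigenvalue:              {eigenvalue:.3f}")
```

Each squared loading is the share of that variable's variance reproduced by the component, which is what SPSS reports as the communality when only one component is retained.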

Common Mistakes

Typical mistakes include:

Confusing PCA with Factor Analysis
Keeping too many components
Ignoring the scree plot

PCA is a mathematical technique, not a theory-driven model.

Quiz 1

What is the main goal of PCA?

To preserve maximum variance with fewer components.

Quiz 2

What does an eigenvalue represent?

Amount of variance explained by a component.

Quiz 3

Are PCA components correlated?

No.

Quiz 4

Which SPSS menu is used for PCA?

Analyze → Dimension Reduction → Factor.

Quiz 5

Does PCA identify latent constructs?

No.

Mini Practice

Use a dataset with multiple correlated variables.

Apply PCA and:

Decide the number of components
Report the variance explained

Use the eigenvalues and the scree plot to justify your component selection.

What’s Next

In the next lesson, you will learn about Reliability and Validity, which ensure measurement quality in research and analytics.
