SPSS Lesson 32 – Principal Component Analysis (PCA) | Dataplexa

Principal Component Analysis (PCA)

In the previous lesson, you learned how Factor Analysis identifies hidden underlying factors.

Principal Component Analysis (PCA) also reduces many variables into fewer components, but with a different objective.

PCA focuses on data compression and variance preservation, not on uncovering latent constructs.


Why PCA Is Used

Modern datasets often contain many correlated variables.

This creates problems such as:

  • Redundancy in information
  • Difficulty in visualization
  • Multicollinearity in regression

PCA transforms the original variables into a smaller set of new variables called principal components.


Core Idea Behind PCA

Each principal component is:

  • A linear combination of original variables
  • Uncorrelated with other components
  • Ordered by the amount of variance explained

The first component explains the maximum possible variance, the second explains the next most, and so on.
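
In symbols, the three properties above can be sketched as follows. The notation is introduced here for illustration only and is not how SPSS labels its output: z_1 ... z_p are the standardized original variables, w_k is the weight vector of component k, R is the correlation matrix, and lambda_k is the k-th eigenvalue of R.

```latex
\begin{aligned}
PC_k &= w_{k1} z_1 + w_{k2} z_2 + \dots + w_{kp} z_p, \qquad k = 1, \dots, p \\
R\,\mathbf{w}_k &= \lambda_k \mathbf{w}_k, \qquad \mathbf{w}_k^{\top}\mathbf{w}_k = 1 \\
\operatorname{Var}(PC_k) &= \lambda_k, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0 \\
\operatorname{Cov}(PC_j, PC_k) &= 0 \quad \text{for } j \ne k
\end{aligned}
```

The first line is the linear combination, the last line is the uncorrelatedness, and the ordering of the eigenvalues gives the ordering by variance explained.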


PCA vs Factor Analysis

Aspect          | PCA                     | Factor Analysis
Main goal       | Variance preservation   | Latent structure identification
Focus           | Data reduction          | Underlying factors
Error modeling  | Does not separate error | Separates common & unique variance

In practice, PCA is often used as a preprocessing step, while factor analysis is used for theory building.


Eigenvalues and Variance Explained

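In PCA, each component has an eigenvalue, which equals the variance of that component. SPSS analyzes the correlation matrix by default, so every standardized variable contributes a variance of 1; an eigenvalue above 1 therefore means the component carries more variance than any single original variable.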

Common rule (the Kaiser criterion):

  • Eigenvalue > 1 → retain the component

SPSS also provides a scree plot to visually decide the number of components.
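
The link between an eigenvalue and the percentage of variance explained can be worked through by hand. This sketch assumes the analysis is run on the correlation matrix of p variables, in which case the eigenvalues sum to p. The eigenvalues below are made-up numbers for p = 4, chosen only to illustrate the arithmetic; they are not results from any dataset.

```latex
\begin{aligned}
\text{share}_k &= \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_k}{p} \\[4pt]
(\lambda_1, \lambda_2, \lambda_3, \lambda_4) &= (2.8,\ 0.7,\ 0.3,\ 0.2), \qquad \textstyle\sum_j \lambda_j = 4 \\[4pt]
\text{share}_1 &= \frac{2.8}{4} = 70\%, \quad
\text{share}_2 = \frac{0.7}{4} = 17.5\%, \quad
\text{share}_3 = \frac{0.3}{4} = 7.5\%, \quad
\text{share}_4 = \frac{0.2}{4} = 5\%
\end{aligned}
```

With these illustrative values, only the first component has an eigenvalue above 1, so the eigenvalue rule would retain one component accounting for 70% of the total variance.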


Example Scenario

Suppose we collect data on:

  • Math score
  • Science score
  • English score
  • Logic score

These scores are correlated. PCA can reduce them into one or two components representing overall academic performance.
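
Before extracting components, it can help to confirm that the scores really are correlated. A minimal sketch, assuming the four variables are named Math, Science, English and Logic in the active dataset (the names are illustrative and simply follow the scenario above):

```spss
* Pearson correlation matrix for the four scores.
* Cases with any missing score are dropped (listwise deletion).
CORRELATIONS
  /VARIABLES=Math Science English Logic
  /PRINT=TWOTAIL NOSIG
  /MISSING=LISTWISE.
```

If most of the correlations are close to zero, there is little redundancy for PCA to compress.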


Running PCA in SPSS (Menu)

To perform PCA in SPSS:

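  • Go to Analyze → Dimension Reduction → Factor
  • Move the variables into the Variables box
  • Click Extraction and choose Principal components as the method
  • Under Extract, keep Based on Eigenvalue with Eigenvalues greater than 1
  • Under Display, check Scree plot
  • Click Continue, then OK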

SPSS Syntax for PCA


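* PCA on the four scores; keep components with eigenvalue > 1.
FACTOR
  /VARIABLES Math Science English Logic
  /MISSING LISTWISE
  /ANALYSIS Math Science English Logic
  /PRINT INITIAL EXTRACTION
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1)
  /EXTRACTION PC
  /ROTATION NONE.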
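
Because PCA is often used as a preprocessing step, a common follow-up is to save the component scores as new variables and use them in a later analysis. The sketch below assumes the same four variable names plus a hypothetical outcome variable called GPA, which is not part of the example data. FAC1_1 is the default name SPSS gives to the first saved score from the first analysis.

```spss
* Run the PCA and save regression-method component scores.
FACTOR
  /VARIABLES Math Science English Logic
  /MISSING LISTWISE
  /ANALYSIS Math Science English Logic
  /CRITERIA MINEIGEN(1)
  /EXTRACTION PC
  /ROTATION NONE
  /SAVE REG(ALL).

* Use the first component score as a predictor.
* GPA is a hypothetical outcome variable.
REGRESSION
  /DEPENDENT GPA
  /METHOD=ENTER FAC1_1.
```

Replacing several correlated predictors with one or two component scores is one way to sidestep multicollinearity in the regression.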

Interpreting PCA Output

When interpreting PCA results:

  • Look at eigenvalues
  • Check percentage of variance explained
  • Review component loadings

Loadings with a larger absolute value indicate a stronger relationship between a variable and a component; the sign shows the direction of that relationship.
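
Loadings are easier to scan when the component matrix is sorted and small values are hidden. A sketch using the /FORMAT subcommand of FACTOR, with the same assumed variable names; the 0.30 cutoff is a common convention assumed here, not a value taken from this lesson.

```spss
* Sort loadings by size and blank out those below .30 in absolute value.
FACTOR
  /VARIABLES Math Science English Logic
  /MISSING LISTWISE
  /ANALYSIS Math Science English Logic
  /PRINT INITIAL EXTRACTION
  /FORMAT SORT BLANK(.30)
  /CRITERIA MINEIGEN(1)
  /EXTRACTION PC
  /ROTATION NONE.
```

Blanking only changes what is displayed; the underlying solution is the same.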


Common Mistakes

Typical mistakes include:

  • Confusing PCA with Factor Analysis
  • Keeping too many components
  • Ignoring scree plot

PCA is a mathematical technique, not a theory-driven model.


Quiz 1

What is the main goal of PCA?

To preserve maximum variance with fewer components.


Quiz 2

What does an eigenvalue represent?

Amount of variance explained by a component.


Quiz 3

Are PCA components correlated?

No.


Quiz 4

Which SPSS menu is used for PCA?

Analyze → Dimension Reduction → Factor.


Quiz 5

Does PCA identify latent constructs?

No.


Mini Practice

Use a dataset with multiple correlated variables.

Apply PCA and:

  • Decide number of components
  • Report variance explained

Use eigenvalues and scree plot to justify component selection.
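
If the scree plot suggests a different number of components than the eigenvalue rule, the count can be fixed explicitly. A sketch, assuming hypothetical variable names v1 to v6 and a choice of two components; substitute your own variables and number.

```spss
* Extract a fixed number of components instead of using the eigenvalue rule.
FACTOR
  /VARIABLES v1 v2 v3 v4 v5 v6
  /MISSING LISTWISE
  /ANALYSIS v1 v2 v3 v4 v5 v6
  /PRINT INITIAL EXTRACTION
  /PLOT EIGEN
  /CRITERIA FACTORS(2)
  /EXTRACTION PC
  /ROTATION NONE.
```

The Total Variance Explained table in the output then gives the cumulative percentage to report for the retained components.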


What’s Next

In the next lesson, you will learn about Reliability and Validity, which ensure measurement quality in research and analytics.