
Hierarchical Clustering

Hierarchical Clustering is an unsupervised learning technique that groups data points by building a hierarchy of clusters. Unlike K-Means, it does not require us to specify the number of clusters in advance.

Instead of forming clusters in one step, Hierarchical Clustering creates a tree-like structure that shows how data points are merged or split step by step.

Why Hierarchical Clustering?

In many real-world problems, we do not know how many clusters exist in the data. Choosing the wrong value of K in K-Means can lead to poor grouping.

Hierarchical Clustering solves this by allowing us to explore the structure of data first and then decide how many clusters make sense.

Real-World Example

Think about organizing files on your computer. You first group files into folders, then folders into categories, and sometimes categories into broader groups.

This layered grouping is similar to how Hierarchical Clustering organizes data.

Types of Hierarchical Clustering

There are two main approaches:

  • Agglomerative (bottom-up): Starts with each data point as its own cluster and repeatedly merges the closest pair
  • Divisive (top-down): Starts with all data points in one cluster and recursively splits them

Agglomerative clustering is the most commonly used approach.

How Agglomerative Clustering Works

The algorithm follows these steps:

  • Start with each data point as a separate cluster
  • Find the two closest clusters, where "closest" is defined by a linkage criterion (e.g. single, complete, average, or Ward)
  • Merge them into one cluster
  • Repeat until all points are in a single cluster

The result is a hierarchy that can be visualized using a dendrogram.
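
To make these steps concrete, here is a minimal from-scratch sketch of the merge loop (not how library implementations work internally), using single linkage, where the distance between two clusters is the distance between their closest members. The data and the single_link helper are just for illustration.


# Toy one-dimensional incomes; each point starts in its own cluster
points = [30000, 40000, 60000, 65000, 80000, 90000]
clusters = [[p] for p in points]

def single_link(a, b):
    # Single linkage: distance between the closest pair of members
    return min(abs(x - y) for x in a for y in b)

# Repeatedly merge the two closest clusters until one remains
while len(clusters) > 1:
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda pair: single_link(clusters[pair[0]], clusters[pair[1]]),
    )
    print("Merging", clusters[i], "and", clusters[j])
    clusters[i] = clusters[i] + clusters[j]
    del clusters[j]

Each printed merge corresponds to one level of the hierarchy, from individual points up to a single cluster.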

Hierarchical Clustering Example

Below is a simple example that groups six people by age and annual income using agglomerative clustering.


from sklearn.cluster import AgglomerativeClustering

# Sample data: each row is [age, annual income]
X = [
    [25, 30000],
    [30, 40000],
    [35, 60000],
    [40, 65000],
    [45, 80000],
    [50, 90000]
]

# Create the model with Ward linkage (the default), which merges
# the pair of clusters that least increases within-cluster variance
model = AgglomerativeClustering(n_clusters=2, linkage="ward")

# Fit the model and assign a cluster label to each sample
labels = model.fit_predict(X)
print(labels)
  
[0 0 1 1 1 1]

The output shows that the first two samples form one cluster and the remaining four form another. The grouping is deterministic and based purely on distance: unlike K-Means, there is no random centroid initialization.
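
One caveat: in this example income spans a much larger numeric range than age, so the Euclidean distances are dominated almost entirely by income. In practice you would usually standardize the features first. Here is a minimal sketch using scikit-learn's StandardScaler; the variable names are just for illustration.


from sklearn.cluster import AgglomerativeClustering
from sklearn.preprocessing import StandardScaler

# Same data as above: each row is [age, annual income]
X = [
    [25, 30000],
    [30, 40000],
    [35, 60000],
    [40, 65000],
    [45, 80000],
    [50, 90000]
]

# Rescale each feature to zero mean and unit variance so that
# age and income contribute comparably to the distances
X_scaled = StandardScaler().fit_transform(X)

labels = AgglomerativeClustering(n_clusters=2).fit_predict(X_scaled)
print(labels)

After scaling, the labels may differ from the unscaled run, because age now influences the distances as much as income.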

Understanding Dendrograms

A dendrogram is a tree diagram that shows how clusters are merged at different distances.

By cutting the dendrogram at a chosen height, we can decide the number of clusters that best represent the data.
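
As a sketch of how this looks in code, SciPy's scipy.cluster.hierarchy module can build the hierarchy, draw the dendrogram, and cut it at a chosen height. The cut height of 25000 below is just an illustrative value, and the plot assumes matplotlib is installed.


import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Same data as the earlier example: each row is [age, annual income]
X = [
    [25, 30000],
    [30, 40000],
    [35, 60000],
    [40, 65000],
    [45, 80000],
    [50, 90000]
]

# Build the merge hierarchy with Ward linkage
Z = linkage(X, method="ward")

# Draw the dendrogram: the y-axis shows the distance at which clusters merge
dendrogram(Z)
plt.xlabel("Sample index")
plt.ylabel("Merge distance")
plt.show()

# Cut the tree at a chosen height to obtain flat cluster labels
labels = fcluster(Z, t=25000, criterion="distance")
print(labels)

Lowering the cut height produces more, smaller clusters; raising it produces fewer, larger ones.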

Advantages of Hierarchical Clustering

  • No need to specify the number of clusters in advance
  • Produces an interpretable hierarchical structure
  • Works well for small to medium datasets

Limitations of Hierarchical Clustering

  • Computationally expensive for large datasets
  • Merges are greedy: once two clusters are joined, the decision cannot be undone
  • Sensitive to noise and outliers

Practice Questions

Practice 1: Hierarchical Clustering belongs to which learning type?



Practice 2: What diagram is used to visualize hierarchical clustering?



Practice 3: Which hierarchical method starts with individual data points?



Quick Quiz

Quiz 1: Does Hierarchical Clustering require choosing K beforehand?



Quiz 2: What structure does Hierarchical Clustering produce?



Quiz 3: Does agglomerative clustering work by merging or by splitting clusters?


Coming up next: Dimensionality Reduction — understanding how to reduce features while preserving information.