Minimum Spanning Tree – Kruskal’s Algorithm
In the previous lesson, we studied Prim’s algorithm, which builds a Minimum Spanning Tree by growing outward from a chosen starting node.
In this lesson, we explore another powerful approach — Kruskal’s Algorithm.
While Prim focuses on vertices, Kruskal focuses on edges. Understanding both gives you flexibility when tackling graph optimization problems.
Recap: What Is a Minimum Spanning Tree?
A Minimum Spanning Tree (MST) connects all vertices of a weighted, undirected graph such that:
- No cycles are formed
- All vertices are connected
- Total edge weight is minimized
Multiple MSTs may exist, but all have the same minimum total cost.
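A quick illustration of this point: in a triangle where every edge has the same weight, there are three distinct spanning trees, and each one has the same minimum cost. The sketch below (an assumed toy graph, not the lesson's example graph) enumerates them.

```python
from itertools import combinations

# Triangle with equal weights: three distinct MSTs, all with cost 2.
edges = [('A', 'B', 1), ('B', 'C', 1), ('A', 'C', 1)]

# Any two of the three edges connect all vertices without forming a cycle.
for pair in combinations(edges, 2):
    cost = sum(w for _, _, w in pair)
    print(pair, "cost =", cost)  # every spanning tree costs 2
```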
The Core Idea of Kruskal’s Algorithm
Kruskal’s algorithm follows a simple greedy idea:
Always pick the cheapest edge available that does not create a cycle.
Instead of growing from a single node, Kruskal builds the MST by combining smaller components until the entire graph becomes connected.
How Kruskal’s Algorithm Works
The algorithm proceeds step by step:
- Sort all edges by increasing weight
- Pick the smallest edge
- Add it if it does not form a cycle
- Repeat until all vertices are connected
Cycle detection is the key challenge here.
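The steps above can be traced on the lesson's example graph with a deliberately naive cycle check: track connected components as plain sets and merge them as edges are accepted. This sketch is for illustration only; real implementations use Union-Find for efficiency.

```python
# Naive trace of Kruskal's greedy loop.
# Components are tracked as plain sets (slow, but easy to follow).
edges = [('A', 'B', 2), ('A', 'C', 3), ('B', 'C', 1),
         ('B', 'D', 1), ('C', 'D', 4)]

components = [{'A'}, {'B'}, {'C'}, {'D'}]

for u, v, w in sorted(edges, key=lambda e: e[2]):
    cu = next(c for c in components if u in c)  # component containing u
    cv = next(c for c in components if v in c)  # component containing v
    if cu is cv:
        print(f"skip  {u}-{v} ({w}): would form a cycle")
    else:
        components.remove(cu)
        components.remove(cv)
        components.append(cu | cv)  # merge the two components
        print(f"take  {u}-{v} ({w})")

# take  B-C (1)
# take  B-D (1)
# take  A-B (2)
# skip  A-C (3): would form a cycle
# skip  C-D (4): would form a cycle
```

Notice that the two skipped edges are exactly the ones whose endpoints were already connected.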
Why Cycle Detection Is Important
An edge that forms a cycle connects two vertices that are already connected, so adding it would increase cost without improving connectivity.
To efficiently detect cycles, Kruskal’s algorithm uses a data structure called Disjoint Set Union (Union-Find).
Union-Find (Disjoint Set) Concept
Union-Find supports two operations efficiently:
- find – determine which component a vertex belongs to (so we can check whether two vertices are already connected)
- union – merge the components of two vertices (connect them)
This allows Kruskal’s algorithm to remain fast even for large graphs.
Example Graph
We will use the same weighted graph as before:
edges = [
    ('A', 'B', 2),
    ('A', 'C', 3),
    ('B', 'C', 1),
    ('B', 'D', 1),
    ('C', 'D', 4)
]
Kruskal’s Algorithm – Implementation
Below is a clean implementation using Union-Find.
parent = {}

def find(node):
    # Return the root of node's component, compressing the path
    # so that future lookups are faster.
    if parent[node] != node:
        parent[node] = find(parent[node])
    return parent[node]

def union(a, b):
    # Merge the components containing a and b.
    rootA = find(a)
    rootB = find(b)
    if rootA != rootB:
        parent[rootB] = rootA

def kruskal(vertices, edges):
    # Each vertex starts in its own component.
    for v in vertices:
        parent[v] = v

    mst_cost = 0
    # Consider edges in increasing order of weight.
    for u, v, weight in sorted(edges, key=lambda e: e[2]):
        # Accept the edge only if it joins two different components.
        if find(u) != find(v):
            union(u, v)
            mst_cost += weight
    return mst_cost

vertices = {'A', 'B', 'C', 'D'}
print(kruskal(vertices, edges))  # 4
This algorithm guarantees the minimum total edge weight.
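Often you want not just the cost but the edges that make up the MST. A small variant of the function above (the name kruskal_edges and its return shape are choices made for this sketch, not part of the lesson's code) collects them as it goes:

```python
def kruskal_edges(vertices, edges):
    # Same greedy loop as kruskal, but also records the edges kept.
    parent = {v: v for v in vertices}

    def find(x):
        # Root lookup with path compression.
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    cost, chosen = 0, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru          # union the two components
            cost += w
            chosen.append((u, v, w))
    return cost, chosen

edges = [('A', 'B', 2), ('A', 'C', 3), ('B', 'C', 1),
         ('B', 'D', 1), ('C', 'D', 4)]
print(kruskal_edges({'A', 'B', 'C', 'D'}, edges))
# (4, [('B', 'C', 1), ('B', 'D', 1), ('A', 'B', 2)])
```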
How Kruskal Thinks
Kruskal’s algorithm does not care where the edge is located.
It only asks:
- Is this the cheapest edge available?
- Will adding it create a cycle?
If the answer to the second question is no, the edge is accepted.
Time Complexity
The dominant cost comes from sorting edges.
Time complexity:
O(E log E)
With path compression (and, optionally, union by rank), Union-Find operations take nearly constant amortized time, so the sort dominates.
Kruskal vs Prim
Both algorithms solve the same problem, but their behavior differs.
Prim grows a single tree from a start node.
Kruskal builds multiple small trees and merges them gradually.
In sparse graphs, Kruskal is often preferred.
Real-World Interpretation
Imagine connecting cities with roads.
Kruskal’s approach:
Build the cheapest roads first, regardless of where they are, as long as they don’t create loops.
Eventually, all cities are connected with minimum total cost.
Exercises
Exercise 1:
Why must edges be sorted in Kruskal’s algorithm?
Exercise 2:
What happens if cycle detection is skipped?
Exercise 3:
Which data structure helps Kruskal avoid cycles?
Quick Quiz
Q1. Is Kruskal’s algorithm greedy?
Q2. Does Kruskal require a starting node?
In the next lesson, we will move from MSTs to shortest path algorithms and begin with the Bellman–Ford algorithm.