Algorithms Lesson 33 – Topological Sorting | Dataplexa

Topological Sorting

In the previous lesson, we learned how to detect cycles in graphs. That concept is extremely important because topological sorting only works on graphs without cycles.

Now we move to one of the most practical graph techniques used in real software systems — Topological Sorting.

This lesson connects theory directly to real-world scheduling problems.


What Is Topological Sorting?

Topological sorting is a linear ordering of vertices in a directed acyclic graph (DAG) such that for every directed edge U → V, vertex U comes before V in the ordering.

In simple terms:

All dependencies are respected.


Where Is Topological Sorting Used?

Topological sorting appears in many real systems:

  • Task scheduling
  • Course prerequisite systems
  • Build systems (Maven, Gradle)
  • Dependency resolution in package managers

Whenever one task must be completed before another, topological sorting is involved.


Why DAG Is Mandatory

If a graph contains a cycle, topological ordering becomes impossible.

Example:

  • Task A depends on B
  • Task B depends on C
  • Task C depends on A

There is no valid starting point.

That is why cycle detection is performed first.


Two Common Approaches

There are two standard ways to perform topological sorting:

  • DFS-based approach
  • Kahn’s Algorithm (BFS-based)

We will start with the DFS approach.


DFS-Based Topological Sorting

The idea is simple:

Visit nodes using DFS, and push each node to a stack after all its neighbors are processed.

graph = {
    'A': ['C'],
    'B': ['C', 'D'],
    'C': ['E'],
    'D': ['F'],
    'E': [],
    'F': []
}

visited = set()
stack = []

def topo_sort(node):
    visited.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            topo_sort(neighbor)
    stack.append(node)

for node in graph:
    if node not in visited:
        topo_sort(node)

print(stack[::-1])

Reversing the stack gives the correct topological order.


How This Works Internally

DFS ensures that all dependencies of a node are visited before the node itself is added to the stack.

This naturally enforces correct ordering.


Kahn’s Algorithm (BFS-Based)

Kahn’s algorithm uses the concept of in-degrees.

The process:

  • Find nodes with in-degree 0
  • Add them to the result
  • Remove their outgoing edges
  • Repeat

Kahn’s Algorithm – Code

from collections import deque

graph = {
    'A': ['C'],
    'B': ['C', 'D'],
    'C': ['E'],
    'D': ['F'],
    'E': [],
    'F': []
}

in_degree = {node: 0 for node in graph}

for node in graph:
    for neighbor in graph[node]:
        in_degree[neighbor] += 1

queue = deque([node for node in in_degree if in_degree[node] == 0])
order = []

while queue:
    node = queue.popleft()
    order.append(node)

    for neighbor in graph[node]:
        in_degree[neighbor] -= 1
        if in_degree[neighbor] == 0:
            queue.append(neighbor)

print(order)

If the final ordering does not contain all nodes, the graph has a cycle.


Real-World Example

Consider a university course system:

  • Math → Algorithms
  • Algorithms → Machine Learning
  • Programming → Algorithms

Topological sorting gives a valid study order.


Time Complexity

Both DFS and Kahn’s Algorithm run in:

O(V + E)

This makes topological sorting efficient even for large graphs.


Mini Practice

Imagine a software build pipeline.

Why must compilation happen before testing?

That dependency is enforced by topological ordering.


Exercises

Exercise 1:
Why does topological sorting fail if a cycle exists?

Because there is no valid order that satisfies all dependencies.

Exercise 2:
Which data structure is used in Kahn’s algorithm?

A queue.

Exercise 3:
Can topological sorting be applied to undirected graphs?

No, it is defined only for directed acyclic graphs.

Quick Quiz

Q1. What type of graph is required for topological sorting?

Directed Acyclic Graph (DAG).

Q2. Which algorithm uses in-degree counts?

Kahn’s Algorithm.

In the next lesson, we will explore Minimum Spanning Trees starting with Prim’s Algorithm, which is widely used in network design.