Recursive Queries | Dataplexa

Recursive Queries

Many real-world datasets are hierarchical in nature.

Examples include:

  • Employee → Manager relationships
  • Category → Subcategory trees
  • Folder → Subfolder structures
  • Organizational charts

In this lesson, we learn how to query such data using Recursive Queries.


What is a Recursive Query?

A recursive query is a query that refers to its own result to repeatedly process hierarchical data.

In SQL, recursive queries are implemented using Recursive Common Table Expressions (CTEs).


When Do We Need Recursion?

Recursion is useful when:

  • Data has parent-child relationships
  • The hierarchy depth is unknown
  • Simple JOINs are not sufficient

Without recursion, such queries become complex or impossible.


Example Table: Employees

Each employee has a manager who is also an employee.

+----+----------+------------+
| id | name     | manager_id |
+----+----------+------------+
| 1  | Alice    | NULL       |
| 2  | Bob      | 1          |
| 3  | Charlie  | 1          |
| 4  | Diana    | 2          |
| 5  | Evan     | 2          |
+----+----------+------------+
  

Recursive CTE Structure

A recursive CTE has two parts:

  1. Anchor query – starting point
  2. Recursive query – references the CTE itself
WITH RECURSIVE cte_name AS (
  -- Anchor query
  SELECT ...

  UNION ALL

  -- Recursive query
  SELECT ...
  FROM cte_name
)
SELECT * FROM cte_name;
  

Anchor Query

The anchor query defines the root of the hierarchy.

Example: Top-level managers.

SELECT id, name, manager_id
FROM employees
WHERE manager_id IS NULL;
  

Recursive Query

The recursive part repeatedly joins the table to the previous result.

SELECT e.id, e.name, e.manager_id
FROM employees e
JOIN cte c
ON e.manager_id = c.id;
  

Complete Recursive Query Example

WITH RECURSIVE employee_hierarchy AS (
  -- Anchor
  SELECT id, name, manager_id, 1 AS level
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive
  SELECT e.id, e.name, e.manager_id, eh.level + 1
  FROM employees e
  JOIN employee_hierarchy eh
  ON e.manager_id = eh.id
)
SELECT * FROM employee_hierarchy;
  

This query returns the full organizational hierarchy.


Understanding the Level Column

The level column helps visualize depth:

  • Level 1 → Top management
  • Level 2 → Direct reports
  • Level 3 → Sub-reports

This is useful for reporting and indentation.


Recursive Queries vs Self JOIN

Feature Recursive CTE Self JOIN
Unknown depth Yes No
Readable Yes No
Scalable Yes No

Preventing Infinite Loops

Recursive queries must eventually stop.

Ways to prevent infinite loops:

  • Ensure no circular references
  • Limit recursion depth
  • Use proper join conditions

Performance Considerations

Recursive queries:

  • Are powerful but expensive
  • Should be indexed properly
  • Should not be overused

Use them only when necessary.


Common Beginner Mistakes

  • Missing UNION ALL
  • Incorrect join condition
  • Creating circular references
  • Forgetting termination logic

Real-World Use Cases

  • Company hierarchies
  • Category trees
  • Bill of materials
  • Folder structures

What’s Next?

In the next lesson, we will explore Working with JSON in SQL, which allows modern applications to store and query semi-structured data.