Recursive Queries
Many real-world datasets are hierarchical in nature.
Examples include:
- Employee → Manager relationships
- Category → Subcategory trees
- Folder → Subfolder structures
- Organizational charts
In this lesson, we learn how to query such data using Recursive Queries.
What is a Recursive Query?
A recursive query is a query that refers to its own result to repeatedly process hierarchical data.
In SQL, recursive queries are implemented using Recursive Common Table Expressions (CTEs).
When Do We Need Recursion?
Recursion is useful when:
- Data has parent-child relationships
- The hierarchy depth is unknown
- Simple JOINs are not sufficient
Without recursion, such queries become complex or impossible.
Example Table: Employees
Each employee has a manager who is also an employee.
+----+----------+------------+ | id | name | manager_id | +----+----------+------------+ | 1 | Alice | NULL | | 2 | Bob | 1 | | 3 | Charlie | 1 | | 4 | Diana | 2 | | 5 | Evan | 2 | +----+----------+------------+
Recursive CTE Structure
A recursive CTE has two parts:
- Anchor query – starting point
- Recursive query – references the CTE itself
WITH RECURSIVE cte_name AS ( -- Anchor query SELECT ... UNION ALL -- Recursive query SELECT ... FROM cte_name ) SELECT * FROM cte_name;
Anchor Query
The anchor query defines the root of the hierarchy.
Example: Top-level managers.
SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL;
Recursive Query
The recursive part repeatedly joins the table to the previous result.
SELECT e.id, e.name, e.manager_id FROM employees e JOIN cte c ON e.manager_id = c.id;
Complete Recursive Query Example
WITH RECURSIVE employee_hierarchy AS ( -- Anchor SELECT id, name, manager_id, 1 AS level FROM employees WHERE manager_id IS NULL UNION ALL -- Recursive SELECT e.id, e.name, e.manager_id, eh.level + 1 FROM employees e JOIN employee_hierarchy eh ON e.manager_id = eh.id ) SELECT * FROM employee_hierarchy;
This query returns the full organizational hierarchy.
Understanding the Level Column
The level column helps visualize depth:
- Level 1 → Top management
- Level 2 → Direct reports
- Level 3 → Sub-reports
This is useful for reporting and indentation.
Recursive Queries vs Self JOIN
| Feature | Recursive CTE | Self JOIN |
|---|---|---|
| Unknown depth | Yes | No |
| Readable | Yes | No |
| Scalable | Yes | No |
Preventing Infinite Loops
Recursive queries must eventually stop.
Ways to prevent infinite loops:
- Ensure no circular references
- Limit recursion depth
- Use proper join conditions
Performance Considerations
Recursive queries:
- Are powerful but expensive
- Should be indexed properly
- Should not be overused
Use them only when necessary.
Common Beginner Mistakes
- Missing UNION ALL
- Incorrect join condition
- Creating circular references
- Forgetting termination logic
Real-World Use Cases
- Company hierarchies
- Category trees
- Bill of materials
- Folder structures
What’s Next?
In the next lesson, we will explore Working with JSON in SQL, which allows modern applications to store and query semi-structured data.