Core Module
12 min forge
SQL Advanced: Recursive CTEs
Master hierarchical data processing in SQL. Navigating organizational charts and complex graph structures.
SQL Advanced: Recursive CTEs
π‘οΈ What is a Recursive CTE?
A Common Table Expression (CTE) is recursive if it refers to itself. Recursive CTEs are essential for querying hierarchical data structures like organizational charts, folder trees, or flight paths where the depth isn't known in advance.
β° When to Use
- Hierarchical Queries: Finding all employees under a manager at any level.
- Path Finding: Tracking connections in a social network or dependency graph.
- Series Generation: Creating a sequence of dates or numbers without a physical table.
π Complexity & Performance
- Infinite Recursion: If the termination condition isn't reached, a recursive query will loop until the system's recursion limit (e.g.,
MAXRECURSIONin SQL Server) is hit. - Memory Usage: Each recursive step adds rows to a temporary working table. Large hierarchies can consume significant memory.
- Execution Plan: Engines often use a "working table" to store the previous step's results and a "final table" to accumulate all steps.
π» Code Example: The Org Chart
sql Standard-- Recursive CTE to find the hierarchy from a top-level manager WITH RECURSIVE OrgChart AS ( -- 1. Anchor Member: Start with the CEO SELECT id, name, manager_id, 1 as level FROM employees WHERE name = 'Alice (CEO)' UNION ALL -- 2. Recursive Member: Join with the anchor/previous step SELECT e.id, e.name, e.manager_id, oc.level + 1 FROM employees e INNER JOIN OrgChart oc ON e.manager_id = oc.id ) SELECT * FROM OrgChart ORDER BY level;
β οΈ Interview Pitfalls
- The Anchor/Recursive Split: Candidates often forget that a recursive CTE MUST have an anchor (base case) and a recursive part joined by
UNION ALL. - Termination Logic: Be prepared to explain how the query knows when to stop (when the recursive join returns zero results).
- UNION vs UNION ALL: Almost all recursive queries use
UNION ALL.UNION(distinct) adds a significant performance overhead of duplicate checking. - Circular References: If the data has a cycle (A manages B, B manages A), the recursion will be infinite unless you manually track visited nodes.