Oracle DataBase Lesson 26 – Subqueries | Dataplexa

Subqueries

To follow along with the examples in this lesson, use the course dataset from Lesson 8.

What Is a Subquery?

A subquery is a SELECT statement nested inside another SQL statement. The outer query uses the result of the inner query as if it were a value, a list, or a table. Subqueries let you answer questions that require two steps — first calculate something, then use that result to filter or select further. Instead of running two separate queries and copying the result between them, you write one query that does both.

Subqueries can appear in three positions: in the WHERE clause to filter rows against a calculated value or list, in the FROM clause as an inline derived table, and in the SELECT clause as a calculated scalar value. Each position has its own rules and use cases.

  • The inner query runs first — its result is handed to the outer query
  • A subquery in WHERE can return a single value, a list, or exist just to test membership
  • A subquery in FROM is called an inline view — it must be given an alias
  • A subquery in SELECT must return exactly one row and one column — called a scalar subquery
  • A correlated subquery references a column from the outer query and re-runs once per outer row
  • A non-correlated subquery is independent of the outer query and runs only once

Subquery in WHERE — Single Value

The most common subquery returns a single value that the outer WHERE clause compares against. The inner query produces one number — an average, a maximum, a count — and the outer query uses it to filter rows. This is cleaner than hardcoding a value that might change as data changes.

-- Which products cost more than the average product price?
-- The subquery calculates the average once; the outer query uses it to filter.
SELECT
    product_name,
    category,
    price
FROM   dataplexa_store.products
WHERE  price > (
    SELECT AVG(price)
    FROM   dataplexa_store.products   -- inner query runs first, returns one number
)
ORDER  BY price DESC;
PRODUCT_NAME | CATEGORY | PRICE
---------------------|-------------|-------
Monitor 27-inch | Electronics | 299.99
Standing Desk | Furniture | 349.99
Mechanical Keyboard | Electronics | 89.99
(3 rows selected)

-- Subquery result (AVG price): 86.49
  • The subquery SELECT AVG(price) FROM products runs first and returns 86.49
  • The outer query then filters for rows where price > 86.49 — three products qualify
  • If prices change tomorrow the subquery recalculates automatically — no hardcoded value to maintain

Subquery in WHERE — List with IN

When the subquery returns multiple rows, use IN to test whether the outer column matches any value in the list. Use NOT IN to exclude matches — but only when you are certain the subquery will never return NULL, because a single NULL in a NOT IN list causes the entire condition to return no rows.

-- Which customers placed an order in June 2024?
-- The subquery returns a list of customer_ids; the outer query fetches their details.
SELECT
    customer_id,
    first_name || ' ' || last_name  AS customer_name,
    email,
    country
FROM   dataplexa_store.customers
WHERE  customer_id IN (
    SELECT DISTINCT customer_id
    FROM   dataplexa_store.orders
    WHERE  order_date >= DATE '2024-06-01'   -- DATE keyword required for Oracle date literals
      AND  order_date <  DATE '2024-07-01'
)
ORDER  BY customer_name;
CUSTOMER_ID | CUSTOMER_NAME | EMAIL | COUNTRY
------------|-----------------|--------------------------|----------------
7 | James Okafor | j.okafor@email.com | United Kingdom
14 | Sarah Chen | sarah.chen@email.com | United States
31 | Tom Wallace | tom.wallace@email.com | Canada
(3 rows selected)
  • The subquery returns the distinct customer_ids of everyone who ordered in June 2024
  • The outer query then fetches full customer details for only those ids
  • DATE '2024-06-01' is Oracle's date literal syntax — the DATE keyword is required
  • Using a date range (>= first day AND < next month) is safer than EXTRACT(MONTH FROM ...) because it can use an index on order_date

Subquery in FROM — Inline View

A subquery in the FROM clause acts as a temporary table for the duration of the query. Oracle calls this an inline view. It must be given an alias. Use it when you need to aggregate first and then filter or join on the aggregated result — something you cannot do in a single SELECT without a subquery or CTE.

-- Find customers whose average order value is above 150.
-- Step 1 (inner): calculate average order value per customer.
-- Step 2 (outer): filter on that average -- HAVING could do this too,
--                 but inline view makes the two steps explicitly visible.
SELECT
    customer_name,
    total_orders,
    ROUND(avg_order_value, 2)  AS avg_order_value
FROM (
    SELECT
        c.customer_id,
        c.first_name || ' ' || c.last_name  AS customer_name,
        COUNT(o.order_id)                   AS total_orders,
        AVG(o.total_amount)                 AS avg_order_value
    FROM   dataplexa_store.customers  c
    JOIN   dataplexa_store.orders     o ON c.customer_id = o.customer_id
    GROUP  BY c.customer_id, c.first_name, c.last_name
) customer_summary                          -- inline view must have an alias
WHERE  avg_order_value > 150
ORDER  BY avg_order_value DESC;
CUSTOMER_NAME | TOTAL_ORDERS | AVG_ORDER_VALUE
----------------|--------------|----------------
Sarah Chen | 12 | 153.54
James Okafor | 9 | 148.89 -- excluded (below 150)
Tom Wallace | 7 | 140.03 -- excluded
(1 row selected)
  • The inline view customer_summary runs first and produces one aggregated row per customer
  • The outer WHERE then filters on avg_order_value — a column that only exists after aggregation
  • You cannot reference an aggregate alias directly in HAVING across a JOIN like this — the inline view makes it clean and readable
  • The alias customer_summary is mandatory in Oracle — omitting it raises ORA-00933

Correlated Subquery

A correlated subquery references a column from the outer query. Because of this it cannot run once and be reused — Oracle re-executes it for every row the outer query processes. Use correlated subqueries when you need to compare each row against a value specific to its own group, such as "for this department, what is the average salary?"

-- Employees who earn more than the average salary in their own department.
-- The subquery references e.department_id from the outer query --
-- so it re-runs once per employee row with that employee's department_id.
SELECT
    e.employee_id,
    e.first_name || ' ' || e.last_name  AS employee_name,
    e.department_id,
    e.salary
FROM   dataplexa_hr.employees  e
WHERE  e.salary > (
    SELECT AVG(e2.salary)
    FROM   dataplexa_hr.employees e2
    WHERE  e2.department_id = e.department_id  -- correlated: references outer e.department_id
)
ORDER  BY e.department_id, e.salary DESC;
EMPLOYEE_ID | EMPLOYEE_NAME | DEPARTMENT_ID | SALARY
------------|------------------|---------------|--------
105 | Lena Fischer | 2 | 81000
112 | Ravi Menon | 2 | 78500
98 | Claire Dupont | 3 | 72000
134 | Omar Yusuf | 3 | 69000
77 | Dan Marsh | 4 | 58000
(5 rows selected -- truncated)
  • The subquery uses e2 as an alias for the inner employees table — e belongs to the outer query
  • For each outer row, Oracle runs the inner SELECT with that employee's department_id to get the department average
  • This is more flexible than a JOIN approach when the comparison must be row-specific
  • Correlated subqueries on large tables can be slow — consider a JOIN to an inline view with pre-aggregated averages as an alternative

EXISTS and NOT EXISTS

EXISTS tests whether a subquery returns any rows at all — it does not care about the actual values returned, only whether at least one row exists. It stops as soon as it finds the first match, making it efficient. NOT EXISTS is the safe alternative to NOT IN when NULLs may be present in the subquery result.

-- Customers who have placed at least one order (EXISTS)
SELECT
    c.customer_id,
    c.first_name || ' ' || c.last_name  AS customer_name
FROM   dataplexa_store.customers  c
WHERE  EXISTS (
    SELECT 1                            -- SELECT 1 is convention -- value doesn't matter
    FROM   dataplexa_store.orders o
    WHERE  o.customer_id = c.customer_id  -- correlated: re-runs per customer
)
ORDER  BY customer_name;
-- Customers who have never placed any order (NOT EXISTS)
-- Safer than NOT IN -- handles NULLs correctly
SELECT
    c.customer_id,
    c.first_name || ' ' || c.last_name  AS customer_name
FROM   dataplexa_store.customers  c
WHERE  NOT EXISTS (
    SELECT 1
    FROM   dataplexa_store.orders o
    WHERE  o.customer_id = c.customer_id
)
ORDER  BY customer_name;
-- EXISTS result (customers with orders)
CUSTOMER_ID | CUSTOMER_NAME
------------|------------------
7 | James Okafor
14 | Sarah Chen
23 | Priya Sharma
31 | Tom Wallace
55 | Nina Patel
(5 rows selected -- truncated)

-- NOT EXISTS result (customers with no orders)
CUSTOMER_ID | CUSTOMER_NAME
------------|------------------
42 | Anna Kovac
(1 row selected)
  • SELECT 1 inside EXISTS is convention — the value returned is irrelevant, only whether a row exists matters
  • NOT EXISTS correctly handles NULLs in the subquery — NOT IN would return zero rows if any order had a NULL customer_id
  • EXISTS stops at the first matching row — it is often faster than IN on large tables

Lesson Summary

Type Where It Goes Returns
Single-value subquery WHERE with =, >, < Exactly one row, one column
List subquery WHERE with IN / NOT IN Multiple rows, one column
Inline view FROM clause — must have alias A full result set used as a table
Scalar subquery SELECT clause Exactly one row, one column per outer row
Correlated subquery WHERE — references outer query column Re-executes once per outer row
EXISTS / NOT EXISTS WHERE — tests for row existence TRUE or FALSE — stops at first match

Practice Questions

Practice 1. What is the difference between a correlated and a non-correlated subquery?



Practice 2. A subquery in the FROM clause must have what?



Practice 3. Why is NOT EXISTS safer than NOT IN when the subquery might contain NULL values?



Practice 4. What does SELECT 1 inside an EXISTS subquery mean?



Practice 5. Write a subquery that finds all products priced above the average price in their own category.



Quiz

Quiz 1. A subquery in the WHERE clause returns more than one row and is used with =. What happens?






Quiz 2. Which subquery type re-executes once for every row processed by the outer query?






Quiz 3. What is an inline view in Oracle?






Quiz 4. You use NOT IN with a subquery and get zero rows back unexpectedly. What is the most likely cause?






Quiz 5. Which position can a subquery NOT appear in within a standard Oracle SELECT statement?






Next up — Views — How to save a SELECT statement as a named object in Oracle, when views are useful, and how updatable views work.