Oracle DataBase Lesson 19 – Aggregate Functions | Dataplexa

Aggregate Functions

Aggregate functions collapse a set of rows into a single summary value. Instead of returning one row per row in the table, they return one value per group — or one value for the entire table when no grouping is specified. Oracle's core aggregates are COUNT, SUM, AVG, MIN, and MAX, each with specific behaviours around NULL values, duplicate handling, and data types. This lesson covers all five in depth, explains how they interact with NULL, and shows practical patterns from both the DataplexaStore and DataplexaHR schemas.

To follow along with the examples in this lesson, use the course dataset from Lesson 8.

COUNT

COUNT returns the number of rows that satisfy a condition. It has three distinct forms: COUNT(*) counts every row including those with NULL values in any column; COUNT(column) counts only rows where that column is not NULL; and COUNT(DISTINCT column) counts unique non-null values. Understanding the difference between these forms is essential — they can return different numbers from the same table.

-- COUNT(*) — counts every row regardless of NULL values in any column
SELECT COUNT(*) AS total_customers
FROM   customers;

-- COUNT(column) — counts only rows where the column is NOT NULL
-- phone is nullable — this returns fewer rows than COUNT(*)
SELECT COUNT(*)          AS total_customers,
       COUNT(phone)      AS customers_with_phone,
       COUNT(city)       AS customers_with_city
FROM   customers;

-- COUNT(DISTINCT column) — counts unique non-null values
SELECT COUNT(DISTINCT country)      AS unique_countries,
       COUNT(DISTINCT loyalty_tier) AS unique_tiers
FROM   customers;

-- COUNT with a WHERE clause — counts matching rows only
SELECT COUNT(*) AS pending_orders
FROM   orders
WHERE  status = 'pending';

-- COUNT across multiple tables — how many products have been ordered at least once
SELECT COUNT(DISTINCT product_id) AS ordered_products
FROM   order_items;
-- total_customers:
TOTAL_CUSTOMERS
───────────────
12

-- NULL-aware COUNT:
TOTAL_CUSTOMERS CUSTOMERS_WITH_PHONE CUSTOMERS_WITH_CITY
─────────────── ──────────────────── ───────────────────
12 10 9

-- DISTINCT counts:
UNIQUE_COUNTRIES UNIQUE_TIERS
──────────────── ────────────
5 3

-- pending orders:
PENDING_ORDERS
──────────────
3

-- ordered products:
ORDERED_PRODUCTS
────────────────
11
  • COUNT(*) and COUNT(1) are equivalent in Oracle — both count every row; use COUNT(*) as it is the ANSI standard and communicates intent clearly
  • The gap between COUNT(*) and COUNT(column) is a quick data quality check — a large difference reveals how many rows have a NULL in that column
  • COUNT(DISTINCT column) is useful for cardinality profiling — comparing it to COUNT(*) shows how many unique values exist relative to total rows, which informs indexing and query design decisions

SUM and AVG

SUM adds all non-null values in a numeric column. AVG computes the arithmetic mean of non-null values. Both silently ignore NULL rows — a column with ten values where two are NULL produces a SUM and AVG based on the eight non-null values only. This means AVG(column) is not the same as SUM(column) / COUNT(*) when nulls are present.

-- Basic SUM and AVG on orders
SELECT SUM(total_amount)            AS total_revenue,
       AVG(total_amount)            AS avg_order_value,
       ROUND(AVG(total_amount), 2)  AS avg_order_rounded
FROM   orders;

-- SUM and AVG on products
SELECT category,
       SUM(stock_qty)               AS total_stock,
       SUM(unit_price * stock_qty)  AS total_stock_value,
       ROUND(AVG(unit_price), 2)    AS avg_price
FROM   products
WHERE  active_flag = 'Y'
GROUP BY category
ORDER BY total_stock_value DESC;

-- Demonstrating NULL behaviour in AVG
-- AVG ignores NULLs — divides by count of non-null rows, not total rows
SELECT COUNT(*)                     AS total_rows,
       COUNT(phone)                 AS non_null_phone,
       AVG(LENGTH(phone))           AS avg_phone_length,
       SUM(LENGTH(phone)) /
           COUNT(*)                 AS avg_incl_nulls_as_zero
FROM   customers;

-- SUM with NVL — treat NULL as zero before summing
SELECT SUM(NVL(stock_qty, 0)) AS total_stock_treating_null_as_zero
FROM   products;
-- orders totals:
TOTAL_REVENUE AVG_ORDER_VALUE AVG_ORDER_ROUNDED
───────────── ─────────────── ─────────────────
8749.71 874.971 874.97

-- by category:
CATEGORY TOTAL_STOCK TOTAL_STOCK_VALUE AVG_PRICE
─────────── ─────────── ───────────────── ─────────
Accessories 220 79997.80 286.66
Electronics 430 28149.30 55.99
Clothing 370 30997.30 76.66
Fitness 150 4498.50 29.99

-- NULL in AVG:
TOTAL_ROWS NON_NULL_PHONE AVG_PHONE_LENGTH AVG_INCL_NULLS_AS_ZERO
────────── ────────────── ──────────────── ──────────────────────
12 10 16.2 13.5
  • AVG(col) divides the sum by the count of non-null rows — if 2 of 12 customers have no phone, AVG(LENGTH(phone)) divides by 10, not 12; to include nulls as zero use AVG(NVL(col, 0)) or SUM(col) / COUNT(*)
  • SUM on a column where every row is NULL returns NULL, not zero — wrap with NVL(SUM(col), 0) when a guaranteed numeric result is required
  • Computing SUM(unit_price * stock_qty) is more efficient than summing a stored column — the multiplication happens during aggregation without needing a derived column or subquery

MIN and MAX

MIN and MAX return the smallest and largest non-null value in a column. They work on numeric, character, and date columns — on character columns they use alphabetical order, on date columns they use chronological order. Like all aggregates they ignore NULL values.

-- MIN and MAX on numeric columns
SELECT MIN(unit_price)  AS cheapest_product,
       MAX(unit_price)  AS most_expensive,
       MAX(unit_price) - MIN(unit_price) AS price_range
FROM   products
WHERE  active_flag = 'Y';

-- MIN and MAX on date columns — earliest and latest order
SELECT MIN(order_date)  AS first_order,
       MAX(order_date)  AS latest_order,
       MAX(order_date) - MIN(order_date) AS days_of_trading
FROM   orders;

-- MIN and MAX on character columns — alphabetical ordering
SELECT MIN(full_name)  AS first_alphabetically,
       MAX(full_name)  AS last_alphabetically
FROM   customers;

-- MIN and MAX in the HR schema — salary range per job
SELECT j.job_title,
       MIN(s.amount)           AS min_salary,
       MAX(s.amount)           AS max_salary,
       MAX(s.amount) -
           MIN(s.amount)       AS salary_spread
FROM   employees  e
JOIN   jobs       j ON j.job_id      = e.job_id
JOIN   salaries   s ON s.employee_id = e.employee_id
WHERE  s.salary_type = 'monthly'
GROUP BY j.job_title
ORDER BY max_salary DESC;
-- price range:
CHEAPEST_PRODUCT MOST_EXPENSIVE PRICE_RANGE
──────────────── ────────────── ───────────
19.99 499.99 480.00

-- order date range:
FIRST_ORDER LATEST_ORDER DAYS_OF_TRADING
──────────── ──────────── ───────────────
10-JAN-2024 28-MAR-2024 78

-- alphabetical names:
FIRST_ALPHABETICALLY LAST_ALPHABETICALLY
──────────────────── ───────────────────
Daniel Kim Yuki Tanaka

-- salary range by job:
JOB_TITLE MIN_SALARY MAX_SALARY SALARY_SPREAD
──────────────── ────────── ────────── ─────────────
IT Manager 7500.00 7500.00 0.00
HR Manager 6800.00 6800.00 0.00
IT Programmer 4200.00 5800.00 1600.00
HR Specialist 3800.00 4500.00 700.00
  • Subtracting two DATE values in Oracle returns the difference in days as a NUMBERMAX(order_date) - MIN(order_date) gives the number of days between the earliest and latest order
  • MIN and MAX on VARCHAR2 columns use Oracle's NLS character sort order — for most single-byte character sets this is standard alphabetical order, but it can vary with NLS_SORT settings
  • A salary_spread of zero means all employees in that job have identical salaries — useful for detecting jobs where pay is standardised vs jobs with a wide pay band

Aggregates and NULL — The Full Picture

Every aggregate function in Oracle ignores NULL values — except COUNT(*). This single rule explains most aggregate surprises. When a column is entirely NULL for the rows being aggregated, SUM, AVG, MIN, and MAX all return NULL. COUNT(column) returns zero. Only COUNT(*) returns the actual row count regardless of column nullability.

-- Demonstrate all aggregate behaviours on a column with NULLs
-- payments.notes is fully NULL in the dataset
SELECT COUNT(*)             AS count_star,
       COUNT(amount)        AS count_amount,
       SUM(amount)          AS sum_amount,
       AVG(amount)          AS avg_amount,
       MIN(amount)          AS min_amount,
       MAX(amount)          AS max_amount
FROM   payments
WHERE  order_id = 5001;

-- When every value is NULL — SUM, AVG, MIN, MAX all return NULL
SELECT SUM(phone)   AS sum_nulls,
       AVG(phone)   AS avg_nulls,
       MIN(phone)   AS min_nulls,
       MAX(phone)   AS max_nulls,
       COUNT(phone) AS count_nulls
FROM   customers
WHERE  phone IS NULL;

-- Safe pattern — NVL wraps protect against NULL aggregate results
SELECT NVL(SUM(total_amount), 0)           AS safe_sum,
       NVL(ROUND(AVG(total_amount), 2), 0) AS safe_avg,
       NVL(MIN(total_amount), 0)           AS safe_min,
       NVL(MAX(total_amount), 0)           AS safe_max
FROM   orders
WHERE  status = 'nonexistent';
-- payment aggregates for order 5001:
COUNT_STAR COUNT_AMOUNT SUM_AMOUNT AVG_AMOUNT MIN_AMOUNT MAX_AMOUNT
────────── ──────────── ────────── ────────── ────────── ──────────
1 1 1249.95 1249.95 1249.95 1249.95

-- all NULL column:
SUM_NULLS AVG_NULLS MIN_NULLS MAX_NULLS COUNT_NULLS
───────── ───────── ───────── ───────── ───────────
(null) (null) (null) (null) 0

-- safe NVL pattern (no matching rows):
SAFE_SUM SAFE_AVG SAFE_MIN SAFE_MAX
──────── ──────── ──────── ────────
0 0 0 0
  • When the WHERE clause matches no rows, all aggregates except COUNT(*) return NULLCOUNT(*) returns zero; wrap non-COUNT aggregates with NVL(..., 0) in application queries to prevent unexpected NULL values in result sets
  • COUNT(phone) returns zero when all phone values are NULL — this is different from SUM, AVG, MIN, and MAX which return NULL in the same situation, not zero
  • The NVL wrap pattern is especially important in reports and dashboards where a NULL total value is displayed as blank rather than zero, which can be misread as missing data

Combining Aggregates in One Query

Multiple aggregate functions can appear in the same SELECT list, each computing independently over the same set of rows. This is more efficient than running separate queries for each metric. Aggregates can also be combined with arithmetic to derive new metrics directly in the query.

-- Full order summary in one query
SELECT COUNT(*)                             AS total_orders,
       COUNT(DISTINCT customer_id)          AS unique_customers,
       SUM(total_amount)                    AS total_revenue,
       ROUND(AVG(total_amount), 2)          AS avg_order_value,
       MIN(total_amount)                    AS smallest_order,
       MAX(total_amount)                    AS largest_order,
       MAX(total_amount) - MIN(total_amount) AS order_value_range
FROM   orders;

-- HR summary — headcount, payroll cost, and salary range per department
SELECT d.department_name,
       COUNT(e.employee_id)          AS headcount,
       SUM(s.amount)                 AS monthly_payroll,
       ROUND(AVG(s.amount), 2)       AS avg_salary,
       MIN(s.amount)                 AS lowest_salary,
       MAX(s.amount)                 AS highest_salary
FROM   departments  d
JOIN   employees    e ON e.department_id = d.department_id
JOIN   salaries     s ON s.employee_id   = e.employee_id
WHERE  s.salary_type = 'monthly'
GROUP BY d.department_name
ORDER BY monthly_payroll DESC;

-- Product summary — stock value and pricing statistics per category
SELECT category,
       COUNT(*)                          AS product_count,
       SUM(stock_qty)                    AS units_in_stock,
       ROUND(AVG(unit_price), 2)         AS avg_price,
       MIN(unit_price)                   AS lowest_price,
       MAX(unit_price)                   AS highest_price,
       ROUND(SUM(unit_price * stock_qty), 2) AS total_inventory_value
FROM   products
GROUP BY category
ORDER BY total_inventory_value DESC;
-- Full order summary:
TOTAL_ORDERS UNIQUE_CUSTOMERS TOTAL_REVENUE AVG_ORDER_VALUE SMALLEST_ORDER LARGEST_ORDER ORDER_VALUE_RANGE
──────────── ──────────────── ───────────── ─────────────── ────────────── ───────────── ─────────────────
10 8 8749.71 874.97 149.99 1599.97 1449.98

-- HR by department:
DEPARTMENT_NAME HEADCOUNT MONTHLY_PAYROLL AVG_SALARY LOWEST_SALARY HIGHEST_SALARY
─────────────── ───────── ─────────────── ────────── ───────────── ──────────────
Engineering 5 26500.00 5300.00 4200.00 7500.00
HR 4 19900.00 4975.00 3800.00 6800.00
Finance 3 14400.00 4800.00 3900.00 6200.00

-- Products by category:
CATEGORY PRODUCT_COUNT UNITS_IN_STOCK AVG_PRICE LOWEST_PRICE HIGHEST_PRICE TOTAL_INVENTORY_VALUE
─────────── ───────────── ────────────── ───────── ──────────── ───────────── ─────────────────────
Accessories 3 220 286.66 59.99 499.99 79997.80
Clothing 4 370 76.66 19.99 189.99 30997.30
Electronics 5 430 55.99 34.99 149.99 28149.30
Fitness 2 150 29.99 29.99 29.99 4498.50
  • Running multiple aggregates in a single query is always preferable to separate queries — Oracle scans the source data once and computes all aggregates in the same pass
  • COUNT(DISTINCT customer_id) in the orders query reveals that 10 orders came from only 8 unique customers — some customers placed more than one order; this is a useful insight that COUNT(*) alone would not reveal
  • Arithmetic on aggregate results — MAX - MIN, SUM / COUNT — produces derived metrics in the same query without needing a subquery or outer wrapper

Summary

Function Returns NULL Behaviour When All Values Are NULL
COUNT(*) Row count including NULLs Counts every row Returns row count (not 0)
COUNT(col) Non-null row count Ignores NULLs Returns 0
COUNT(DISTINCT col) Unique non-null values Ignores NULLs Returns 0
SUM(col) Total of non-null values Ignores NULLs Returns NULL
AVG(col) Mean of non-null values Divides by non-null count Returns NULL
MIN(col) Smallest non-null value Ignores NULLs Returns NULL
MAX(col) Largest non-null value Ignores NULLs Returns NULL

Practice Questions

Practice 1. A table has 100 rows. A column called discount has NULL in 30 of them. What does AVG(discount) divide by — 100 or 70?



Practice 2. What is the difference between COUNT(*), COUNT(column), and COUNT(DISTINCT column)?



Practice 3. A WHERE clause filters orders to a status value that matches no rows. What does SUM(total_amount) return, and how do you make it return 0 instead?



Practice 4. Write a query that returns the total revenue, average order value, and number of orders for each order status in the orders table.



Practice 5. MIN and MAX work on character columns. What ordering do they use, and what does MIN(full_name) return?



Quiz

Quiz 1. A column has 5 rows with values 10, 20, NULL, 40, NULL. What does AVG(column) return?






Quiz 2. Which aggregate function returns 0 (not NULL) when no rows match the WHERE clause?






Quiz 3. What is the result of MAX(order_date) - MIN(order_date) in Oracle?






Quiz 4. COUNT(*) returns 12 but COUNT(phone) returns 10 for the same table. What does this tell you?






Quiz 5. Why is SUM(unit_price * stock_qty) more efficient than storing a total_value column and summing that?






Next up — GROUP BY & HAVING — Learn how to group rows into sets, apply aggregate functions per group, and filter groups using the HAVING clause.