Oracle DataBase Lesson 20 – GROUP BY | Dataplexa

GROUP BY

To follow along with the examples in this lesson, use the course dataset from Lesson 8.

What GROUP BY Does

GROUP BY collapses multiple rows that share a common value into a single summary row. Instead of seeing every individual order you see one row per customer. Instead of every product sale you see one row per category. You always pair GROUP BY with an aggregate functionCOUNT(), SUM(), AVG(), MIN(), or MAX() — which calculates one value across all the rows in each group.

The rule that catches most beginners: every column in your SELECT list must either appear in the GROUP BY clause or be wrapped inside an aggregate function. Oracle enforces this strictly — if you include a column that is neither grouped nor aggregated, Oracle immediately returns ORA-00979: not a GROUP BY expression.

  • GROUP BY creates one output row per distinct value in the grouping column
  • Every non-aggregated column in SELECT must appear in GROUP BY — Oracle enforces this with ORA-00979
  • NULL values form their own group — all rows with NULL in the grouping column land in the same bucket
  • WHERE filters rows before grouping — ORDER BY sorts the grouped result after
SELECT
    category,
    COUNT(*)              AS total_products,
    COUNT(DISTINCT brand) AS unique_brands   -- counts only distinct non-NULL brand values per group
FROM   dataplexa_store.products
GROUP  BY category
ORDER  BY total_products DESC;
CATEGORY | TOTAL_PRODUCTS | UNIQUE_BRANDS
------------|----------------|---------------
Electronics | 18 | 5
Stationery | 14 | 3
Furniture | 9 | 2
Accessories | 7 | 4
Clothing | 5 | 3
(5 rows selected)
  • Five distinct category values exist — Oracle produced exactly one row per group
  • COUNT(*) counts every row including those with NULLs in other columns — COUNT(DISTINCT brand) counts only unique non-NULL brand values
  • Oracle says rows selected — SQL Server says rows affected — same meaning, different wording

Multiple Aggregate Functions in One Query

You can include as many aggregate functions as you need in a single GROUP BY query. Oracle forms the groups once and applies every aggregate to each group independently. Wrapping ROUND() around AVG() works identically in Oracle and SQL Server — the inner function runs first, the outer one processes its result.

SELECT
    category,
    COUNT(*)             AS total_products,
    SUM(stock)           AS total_stock,
    ROUND(AVG(price), 2) AS avg_price,
    MIN(price)           AS min_price,
    MAX(price)           AS max_price
FROM   dataplexa_store.products
GROUP  BY category
ORDER  BY avg_price DESC;
CATEGORY | TOTAL_PRODUCTS | TOTAL_STOCK | AVG_PRICE | MIN_PRICE | MAX_PRICE
------------|----------------|-------------|-----------|-----------|----------
Furniture | 9 | 184 | 289.99 | 49.99 | 899.99
Electronics | 18 | 1842 | 87.45 | 8.99 | 1299.99
Clothing | 5 | 310 | 64.80 | 19.99 | 129.99
Accessories | 7 | 523 | 38.20 | 5.99 | 89.99
Stationery | 14 | 3108 | 14.75 | 1.99 | 49.99
(5 rows selected)
  • All five aggregates share the same five groups — GROUP BY is evaluated once and every aggregate reads from it
  • Furniture has only 9 products but the highest average price — this pattern is invisible in raw row data
  • SUM(stock) silently ignores NULL values — a product with no stock recorded does not contribute zero, it is skipped entirely
-- Headcount and salary breakdown per department using DataplexaHR.
-- The JOIN runs before GROUP BY -- department names are resolved first,
-- then the joined rows are collapsed into department groups.
SELECT
    d.department_name,
    COUNT(e.employee_id) AS headcount,      -- COUNT(column) skips NULLs -- safer than COUNT(*) after a LEFT JOIN
    ROUND(AVG(e.salary), 2) AS avg_salary,
    MIN(e.salary)    AS min_salary,
    MAX(e.salary)    AS max_salary,
    SUM(e.salary)    AS total_payroll
FROM   dataplexa_hr.employees   e
JOIN   dataplexa_hr.departments d ON e.department_id = d.department_id
GROUP  BY d.department_name
ORDER  BY total_payroll DESC;
DEPARTMENT_NAME | HEADCOUNT | AVG_SALARY | MIN_SALARY | MAX_SALARY | TOTAL_PAYROLL
----------------|-----------|------------|------------|------------|---------------
Engineering | 12 | 78500.00 | 55000.00 | 115000.00 | 942000.00
Sales | 9 | 62000.00 | 42000.00 | 88000.00 | 558000.00
Finance | 6 | 71000.00 | 52000.00 | 95000.00 | 426000.00
Warehouse | 8 | 38000.00 | 28000.00 | 52000.00 | 304000.00
HR | 5 | 58000.00 | 45000.00 | 72000.00 | 290000.00
(5 rows selected)
  • Engineering has the highest total payroll despite not being the largest team — SUM and AVG together show why
  • d.department_name must appear in GROUP BY because it is in SELECT and not aggregated — Oracle raises ORA-00979 if it is missing

Grouping by Multiple Columns

You can GROUP BY more than one column. Oracle creates one group for every unique combination of values across all the grouped columns. Grouping by category and brand gives one row per unique category-brand pair — not one per category alone, not one per brand alone. Five brands in Electronics means five Electronics rows.

SELECT
    category,
    brand,
    COUNT(*)    AS products,
    SUM(stock)  AS total_stock,
    MIN(price)  AS min_price,
    MAX(price)  AS max_price
FROM   dataplexa_store.products
WHERE  brand IS NOT NULL     -- WHERE runs before grouping -- unbranded rows are excluded before groups form
GROUP  BY category, brand
ORDER  BY category, products DESC;
CATEGORY | BRAND | PRODUCTS | TOTAL_STOCK | MIN_PRICE | MAX_PRICE
------------|------------|----------|-------------|-----------|----------
Electronics | LogiTech | 5 | 487 | 24.99 | 89.99
Electronics | ViewMax | 4 | 127 | 199.99 | 1299.99
Electronics | TechLink | 4 | 312 | 8.99 | 49.99
Electronics | KeyMaster | 3 | 184 | 69.99 | 149.99
Furniture | DeskPro | 5 | 98 | 149.99 | 899.99
Furniture | ChairCo | 4 | 86 | 49.99 | 449.99
Stationery | PaperCo | 9 | 1840 | 1.99 | 24.99
(7 rows selected)
  • Seven rows — one per unique category and brand combination found in the data
  • Rows where brand was NULL were removed by WHERE before grouping — they never entered any group
  • The order of columns in GROUP BY category, brand does not change which groups are formed

Grouping on Date Parts — The Key Oracle Difference

Grouping by year or month is one of the most common GROUP BY tasks. This is where Oracle and SQL Server differ most visibly. SQL Server provides standalone YEAR() and MONTH() functions. Oracle uses EXTRACT() to pull a named part from a DATE column as a number, or TO_CHAR() to format the date as a string. Both work — EXTRACT() is better when you need the value as a number, TO_CHAR() is better when you want a formatted label like 2024-03 directly in the output.

Task Oracle SQL Server
Extract year as number EXTRACT(YEAR FROM col) YEAR(col)
Extract month as number EXTRACT(MONTH FROM col) MONTH(col)
Format as year-month label TO_CHAR(col, 'YYYY-MM') FORMAT(col, 'yyyy-MM')
Truncate date to month TRUNC(col, 'MM') DATETRUNC('month', col)
Date literal syntax DATE '2024-01-01' '2024-01-01'
Alias in GROUP BY Not allowed — ORA-00979 if used Not allowed — same error
Alias in ORDER BY Allowed Allowed
-- EXTRACT returns year and month as separate numbers.
-- SQL Server equivalent: YEAR(order_date) and MONTH(order_date).
SELECT
    EXTRACT(YEAR  FROM order_date) AS order_year,
    EXTRACT(MONTH FROM order_date) AS order_month,
    COUNT(*)          AS total_orders,
    SUM(total_amount) AS revenue
FROM   dataplexa_store.orders
WHERE  order_date >= DATE '2024-01-01'   -- DATE keyword is required for Oracle date literals
GROUP  BY EXTRACT(YEAR  FROM order_date),
          EXTRACT(MONTH FROM order_date) -- full expression repeated -- aliases not allowed in GROUP BY
ORDER  BY order_year, order_month;       -- aliases ARE allowed in ORDER BY
ORDER_YEAR | ORDER_MONTH | TOTAL_ORDERS | REVENUE
-----------|-------------|--------------|----------
2024 | 1 | 38 | 4820.50
2024 | 2 | 41 | 5240.00
2024 | 3 | 52 | 6830.75
2024 | 4 | 47 | 5920.20
(4 rows selected)
-- TO_CHAR formats the date as a string label like '2024-03'.
-- Useful when you want the formatted month directly in the output column.
-- SQL Server equivalent: FORMAT(order_date, 'yyyy-MM').
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS order_month,
    COUNT(*)                       AS total_orders,
    SUM(total_amount)              AS revenue,
    ROUND(AVG(total_amount), 2)    AS avg_order_value
FROM   dataplexa_store.orders
WHERE  order_date >= DATE '2024-01-01'
GROUP  BY TO_CHAR(order_date, 'YYYY-MM') -- full expression repeated -- alias not allowed in GROUP BY
ORDER  BY order_month;
ORDER_MONTH | TOTAL_ORDERS | REVENUE | AVG_ORDER_VALUE
------------|--------------|----------|----------------
2024-01 | 38 | 4820.50 | 126.86
2024-02 | 41 | 5240.00 | 127.80
2024-03 | 52 | 6830.75 | 131.36
2024-04 | 47 | 5920.20 | 125.96
(4 rows selected)
  • EXTRACT() returns a NUMBER — better when you need to do arithmetic or comparison on the year or month value
  • TO_CHAR() returns a VARCHAR2 string — better when you want a readable label like 2024-03 in the output
  • In both cases the full expression must be repeated in GROUP BY — Oracle does not resolve SELECT aliases at that stage
  • ORDER BY order_month works because ORDER BY is the last clause Oracle evaluates — aliases are already resolved by then

Lesson Summary

Concept Oracle SQL Server Equivalent
Basic GROUP BY Identical syntax — one row per distinct value Identical
Ungrouped column ORA-00979: not a GROUP BY expression Column is invalid in select list
Extract year EXTRACT(YEAR FROM col) YEAR(col)
Extract month EXTRACT(MONTH FROM col) MONTH(col)
Month label TO_CHAR(col, 'YYYY-MM') FORMAT(col, 'yyyy-MM')
Date literal DATE '2024-01-01' '2024-01-01'
Alias in GROUP BY Not allowed — repeat the full expression Not allowed — same rule
Alias in ORDER BY Allowed Allowed
NULL in grouping column NULLs form their own group Identical

Practice Questions

Practice 1. What Oracle error code is returned when a column appears in SELECT but is not in GROUP BY and not aggregated?



Practice 2. What is the Oracle equivalent of SQL Server's MONTH(order_date)?



Practice 3. You define the alias order_month in SELECT. Can you write GROUP BY order_month in Oracle?



Practice 4. Does WHERE filter rows before or after GROUP BY executes?



Practice 5. Which Oracle function formats a DATE column as the string '2024-03' for grouping and display?



Quiz

Quiz 1. You want one row per department showing total payroll. Which query is correct in Oracle?






Quiz 2. GROUP BY category, brand produces one row per — what?






Quiz 3. You write GROUP BY order_month where order_month is a SELECT alias. What does Oracle do?






Quiz 4. Which Oracle syntax correctly groups orders by year and month as separate numeric columns?






Quiz 5. What happens to rows where the GROUP BY column contains NULL?






Next up — HAVING Clause — How to filter groups after aggregation, the difference between WHERE and HAVING, and when to use each one.