Oracle DataBase Lesson 20 – GROUP BY | Dataplexa

GROUP BY

To follow along with the examples in this lesson, use the course dataset from Lesson 8.

What GROUP BY Does

GROUP BY collapses multiple rows that share a common value into a single summary row. Instead of seeing every individual order you see one row per customer. Instead of every product sale you see one row per category. You always pair GROUP BY with an aggregate function — COUNT(), SUM(), AVG(), MIN(), or MAX() — which calculates one value across all the rows in each group.

The rule that catches most beginners: every column in your SELECT list must either appear in the GROUP BY clause or be wrapped inside an aggregate function. Oracle enforces this strictly — if you include a column that is neither grouped nor aggregated, Oracle immediately returns ORA-00979: not a GROUP BY expression.

GROUP BY creates one output row per distinct value in the grouping column
Every non-aggregated column in SELECT must appear in GROUP BY — Oracle enforces this with ORA-00979
NULL values form their own group — all rows with NULL in the grouping column land in the same bucket
WHERE filters rows before grouping — ORDER BY sorts the grouped result after

SELECT
    category,
    COUNT(*)              AS total_products,
    COUNT(DISTINCT brand) AS unique_brands   -- counts only distinct non-NULL brand values per group
FROM   dataplexa_store.products
GROUP  BY category
ORDER  BY total_products DESC;

CATEGORY | TOTAL_PRODUCTS | UNIQUE_BRANDS
------------|----------------|---------------
Electronics | 18 | 5
Stationery | 14 | 3
Furniture | 9 | 2
Accessories | 7 | 4
Clothing | 5 | 3
(5 rows selected)

Five distinct category values exist — Oracle produced exactly one row per group
COUNT(*) counts every row including those with NULLs in other columns — COUNT(DISTINCT brand) counts only unique non-NULL brand values
Oracle says rows selected — SQL Server says rows affected — same meaning, different wording

Multiple Aggregate Functions in One Query

You can include as many aggregate functions as you need in a single GROUP BY query. Oracle forms the groups once and applies every aggregate to each group independently. Wrapping ROUND() around AVG() works identically in Oracle and SQL Server — the inner function runs first, the outer one processes its result.

SELECT
    category,
    COUNT(*)             AS total_products,
    SUM(stock)           AS total_stock,
    ROUND(AVG(price), 2) AS avg_price,
    MIN(price)           AS min_price,
    MAX(price)           AS max_price
FROM   dataplexa_store.products
GROUP  BY category
ORDER  BY avg_price DESC;

CATEGORY | TOTAL_PRODUCTS | TOTAL_STOCK | AVG_PRICE | MIN_PRICE | MAX_PRICE
------------|----------------|-------------|-----------|-----------|----------
Furniture | 9 | 184 | 289.99 | 49.99 | 899.99
Electronics | 18 | 1842 | 87.45 | 8.99 | 1299.99
Clothing | 5 | 310 | 64.80 | 19.99 | 129.99
Accessories | 7 | 523 | 38.20 | 5.99 | 89.99
Stationery | 14 | 3108 | 14.75 | 1.99 | 49.99
(5 rows selected)

All five aggregates share the same five groups — GROUP BY is evaluated once and every aggregate reads from it
Furniture has only 9 products but the highest average price — this pattern is invisible in raw row data
SUM(stock) silently ignores NULL values — a product with no stock recorded does not contribute zero, it is skipped entirely

-- Headcount and salary breakdown per department using DataplexaHR.
-- The JOIN runs before GROUP BY -- department names are resolved first,
-- then the joined rows are collapsed into department groups.
SELECT
    d.department_name,
    COUNT(e.employee_id) AS headcount,      -- COUNT(column) skips NULLs -- safer than COUNT(*) after a LEFT JOIN
    ROUND(AVG(e.salary), 2) AS avg_salary,
    MIN(e.salary)    AS min_salary,
    MAX(e.salary)    AS max_salary,
    SUM(e.salary)    AS total_payroll
FROM   dataplexa_hr.employees   e
JOIN   dataplexa_hr.departments d ON e.department_id = d.department_id
GROUP  BY d.department_name
ORDER  BY total_payroll DESC;

DEPARTMENT_NAME | HEADCOUNT | AVG_SALARY | MIN_SALARY | MAX_SALARY | TOTAL_PAYROLL
----------------|-----------|------------|------------|------------|---------------
Engineering | 12 | 78500.00 | 55000.00 | 115000.00 | 942000.00
Sales | 9 | 62000.00 | 42000.00 | 88000.00 | 558000.00
Finance | 6 | 71000.00 | 52000.00 | 95000.00 | 426000.00
Warehouse | 8 | 38000.00 | 28000.00 | 52000.00 | 304000.00
HR | 5 | 58000.00 | 45000.00 | 72000.00 | 290000.00
(5 rows selected)

Engineering has the highest total payroll despite not being the largest team — SUM and AVG together show why
d.department_name must appear in GROUP BY because it is in SELECT and not aggregated — Oracle raises ORA-00979 if it is missing

Grouping by Multiple Columns

You can GROUP BY more than one column. Oracle creates one group for every unique combination of values across all the grouped columns. Grouping by category and brand gives one row per unique category-brand pair — not one per category alone, not one per brand alone. Five brands in Electronics means five Electronics rows.

SELECT
    category,
    brand,
    COUNT(*)    AS products,
    SUM(stock)  AS total_stock,
    MIN(price)  AS min_price,
    MAX(price)  AS max_price
FROM   dataplexa_store.products
WHERE  brand IS NOT NULL     -- WHERE runs before grouping -- unbranded rows are excluded before groups form
GROUP  BY category, brand
ORDER  BY category, products DESC;

CATEGORY | BRAND | PRODUCTS | TOTAL_STOCK | MIN_PRICE | MAX_PRICE
------------|------------|----------|-------------|-----------|----------
Electronics | LogiTech | 5 | 487 | 24.99 | 89.99
Electronics | ViewMax | 4 | 127 | 199.99 | 1299.99
Electronics | TechLink | 4 | 312 | 8.99 | 49.99
Electronics | KeyMaster | 3 | 184 | 69.99 | 149.99
Furniture | DeskPro | 5 | 98 | 149.99 | 899.99
Furniture | ChairCo | 4 | 86 | 49.99 | 449.99
Stationery | PaperCo | 9 | 1840 | 1.99 | 24.99
(7 rows selected)

Seven rows — one per unique category and brand combination found in the data
Rows where brand was NULL were removed by WHERE before grouping — they never entered any group
The order of columns in GROUP BY category, brand does not change which groups are formed

Grouping on Date Parts — The Key Oracle Difference

Grouping by year or month is one of the most common GROUP BY tasks. This is where Oracle and SQL Server differ most visibly. SQL Server provides standalone YEAR() and MONTH() functions. Oracle uses EXTRACT() to pull a named part from a DATE column as a number, or TO_CHAR() to format the date as a string. Both work — EXTRACT() is better when you need the value as a number, TO_CHAR() is better when you want a formatted label like 2024-03 directly in the output.

Task	Oracle	SQL Server
Extract year as number	`EXTRACT(YEAR FROM col)`	`YEAR(col)`
Extract month as number	`EXTRACT(MONTH FROM col)`	`MONTH(col)`
Format as year-month label	`TO_CHAR(col, 'YYYY-MM')`	`FORMAT(col, 'yyyy-MM')`
Truncate date to month	`TRUNC(col, 'MM')`	`DATETRUNC('month', col)`
Date literal syntax	`DATE '2024-01-01'`	`'2024-01-01'`
Alias in GROUP BY	Not allowed — ORA-00979 if used	Not allowed — same error
Alias in ORDER BY	Allowed	Allowed

-- EXTRACT returns year and month as separate numbers.
-- SQL Server equivalent: YEAR(order_date) and MONTH(order_date).
SELECT
    EXTRACT(YEAR  FROM order_date) AS order_year,
    EXTRACT(MONTH FROM order_date) AS order_month,
    COUNT(*)          AS total_orders,
    SUM(total_amount) AS revenue
FROM   dataplexa_store.orders
WHERE  order_date >= DATE '2024-01-01'   -- DATE keyword is required for Oracle date literals
GROUP  BY EXTRACT(YEAR  FROM order_date),
          EXTRACT(MONTH FROM order_date) -- full expression repeated -- aliases not allowed in GROUP BY
ORDER  BY order_year, order_month;       -- aliases ARE allowed in ORDER BY

ORDER_YEAR | ORDER_MONTH | TOTAL_ORDERS | REVENUE
-----------|-------------|--------------|----------
2024 | 1 | 38 | 4820.50
2024 | 2 | 41 | 5240.00
2024 | 3 | 52 | 6830.75
2024 | 4 | 47 | 5920.20
(4 rows selected)

-- TO_CHAR formats the date as a string label like '2024-03'.
-- Useful when you want the formatted month directly in the output column.
-- SQL Server equivalent: FORMAT(order_date, 'yyyy-MM').
SELECT
    TO_CHAR(order_date, 'YYYY-MM') AS order_month,
    COUNT(*)                       AS total_orders,
    SUM(total_amount)              AS revenue,
    ROUND(AVG(total_amount), 2)    AS avg_order_value
FROM   dataplexa_store.orders
WHERE  order_date >= DATE '2024-01-01'
GROUP  BY TO_CHAR(order_date, 'YYYY-MM') -- full expression repeated -- alias not allowed in GROUP BY
ORDER  BY order_month;

ORDER_MONTH | TOTAL_ORDERS | REVENUE | AVG_ORDER_VALUE
------------|--------------|----------|----------------
2024-01 | 38 | 4820.50 | 126.86
2024-02 | 41 | 5240.00 | 127.80
2024-03 | 52 | 6830.75 | 131.36
2024-04 | 47 | 5920.20 | 125.96
(4 rows selected)

EXTRACT() returns a NUMBER — better when you need to do arithmetic or comparison on the year or month value
TO_CHAR() returns a VARCHAR2 string — better when you want a readable label like 2024-03 in the output
In both cases the full expression must be repeated in GROUP BY — Oracle does not resolve SELECT aliases at that stage
ORDER BY order_month works because ORDER BY is the last clause Oracle evaluates — aliases are already resolved by then

Lesson Summary

Concept	Oracle	SQL Server Equivalent
Basic GROUP BY	Identical syntax — one row per distinct value	Identical
Ungrouped column	ORA-00979: not a GROUP BY expression	Column is invalid in select list
Extract year	`EXTRACT(YEAR FROM col)`	`YEAR(col)`
Extract month	`EXTRACT(MONTH FROM col)`	`MONTH(col)`
Month label	`TO_CHAR(col, 'YYYY-MM')`	`FORMAT(col, 'yyyy-MM')`
Date literal	`DATE '2024-01-01'`	`'2024-01-01'`
Alias in GROUP BY	Not allowed — repeat the full expression	Not allowed — same rule
Alias in ORDER BY	Allowed	Allowed
NULL in grouping column	NULLs form their own group	Identical

Practice Questions

Practice 1. What Oracle error code is returned when a column appears in SELECT but is not in GROUP BY and not aggregated?

Practice 2. What is the Oracle equivalent of SQL Server's MONTH(order_date)?

Practice 3. You define the alias order_month in SELECT. Can you write GROUP BY order_month in Oracle?

Practice 4. Does WHERE filter rows before or after GROUP BY executes?

Practice 5. Which Oracle function formats a DATE column as the string '2024-03' for grouping and display?

Quiz

Quiz 1. You want one row per department showing total payroll. Which query is correct in Oracle?

SELECT department_id, SUM(salary) FROM dataplexa_hr.employees
SELECT department_id, SUM(salary) FROM dataplexa_hr.employees GROUP BY department_id
SELECT SUM(salary) FROM dataplexa_hr.employees WHERE department_id IS NOT NULL
SELECT department_id, salary FROM dataplexa_hr.employees GROUP BY department_id

Quiz 2. GROUP BY category, brand produces one row per — what?

Unique category value only
Unique brand value only
Unique combination of category and brand values
Every row in the products table

Quiz 3. You write GROUP BY order_month where order_month is a SELECT alias. What does Oracle do?

Oracle resolves the alias and groups correctly
Oracle returns ORA-00979 — GROUP BY cannot reference SELECT aliases
Oracle groups by the literal string order_month
Oracle ignores the GROUP BY clause entirely

Quiz 4. Which Oracle syntax correctly groups orders by year and month as separate numeric columns?

GROUP BY YEAR(order_date), MONTH(order_date)
GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
GROUP BY DATEPART(year, order_date), DATEPART(month, order_date)
GROUP BY FORMAT(order_date, 'YYYY'), FORMAT(order_date, 'MM')

Quiz 5. What happens to rows where the GROUP BY column contains NULL?

Rows with NULL are automatically excluded
Oracle raises ORA-00979 when NULL appears in a GROUP BY column
NULL values form their own group — all NULL rows are grouped together
NULL is treated as zero in the grouping column

Next up — HAVING Clause — How to filter groups after aggregation, the difference between WHERE and HAVING, and when to use each one.

← Previous Course Index Next →

Oracle Database