Oracle Database
GROUP BY
To follow along with the examples in this lesson, use the course dataset from Lesson 8.
What GROUP BY Does
GROUP BY collapses multiple rows that share a common value into a single summary row. Instead of seeing every individual order you see one row per customer. Instead of every product sale you see one row per category. You always pair GROUP BY with an aggregate function — COUNT(), SUM(), AVG(), MIN(), or MAX() — which calculates one value across all the rows in each group.
The rule that catches most beginners: every column in your SELECT list must either appear in the GROUP BY clause or be wrapped inside an aggregate function. Oracle enforces this strictly — if you include a column that is neither grouped nor aggregated, Oracle immediately returns ORA-00979: not a GROUP BY expression.
- GROUP BY creates one output row per distinct value in the grouping column
- Every non-aggregated column in SELECT must appear in GROUP BY — Oracle enforces this with ORA-00979
- NULL values form their own group — all rows with NULL in the grouping column land in the same bucket
- WHERE filters rows before grouping — ORDER BY sorts the grouped result after
SELECT
category,
COUNT(*) AS total_products,
COUNT(DISTINCT brand) AS unique_brands -- counts only distinct non-NULL brand values per group
FROM dataplexa_store.products
GROUP BY category
ORDER BY total_products DESC;------------|----------------|---------------
Electronics | 18 | 5
Stationery | 14 | 3
Furniture | 9 | 2
Accessories | 7 | 4
Clothing | 5 | 3
(5 rows selected)
- Five distinct category values exist — Oracle produced exactly one row per group
COUNT(*)counts every row including those with NULLs in other columns —COUNT(DISTINCT brand)counts only unique non-NULL brand values- Oracle says rows selected — SQL Server says rows affected — same meaning, different wording
Multiple Aggregate Functions in One Query
You can include as many aggregate functions as you need in a single GROUP BY query. Oracle forms the groups once and applies every aggregate to each group independently. Wrapping ROUND() around AVG() works identically in Oracle and SQL Server — the inner function runs first, the outer one processes its result.
SELECT
category,
COUNT(*) AS total_products,
SUM(stock) AS total_stock,
ROUND(AVG(price), 2) AS avg_price,
MIN(price) AS min_price,
MAX(price) AS max_price
FROM dataplexa_store.products
GROUP BY category
ORDER BY avg_price DESC;------------|----------------|-------------|-----------|-----------|----------
Furniture | 9 | 184 | 289.99 | 49.99 | 899.99
Electronics | 18 | 1842 | 87.45 | 8.99 | 1299.99
Clothing | 5 | 310 | 64.80 | 19.99 | 129.99
Accessories | 7 | 523 | 38.20 | 5.99 | 89.99
Stationery | 14 | 3108 | 14.75 | 1.99 | 49.99
(5 rows selected)
- All five aggregates share the same five groups — GROUP BY is evaluated once and every aggregate reads from it
- Furniture has only 9 products but the highest average price — this pattern is invisible in raw row data
SUM(stock)silently ignores NULL values — a product with no stock recorded does not contribute zero, it is skipped entirely
-- Headcount and salary breakdown per department using DataplexaHR.
-- The JOIN runs before GROUP BY -- department names are resolved first,
-- then the joined rows are collapsed into department groups.
SELECT
d.department_name,
COUNT(e.employee_id) AS headcount, -- COUNT(column) skips NULLs -- safer than COUNT(*) after a LEFT JOIN
ROUND(AVG(e.salary), 2) AS avg_salary,
MIN(e.salary) AS min_salary,
MAX(e.salary) AS max_salary,
SUM(e.salary) AS total_payroll
FROM dataplexa_hr.employees e
JOIN dataplexa_hr.departments d ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY total_payroll DESC;----------------|-----------|------------|------------|------------|---------------
Engineering | 12 | 78500.00 | 55000.00 | 115000.00 | 942000.00
Sales | 9 | 62000.00 | 42000.00 | 88000.00 | 558000.00
Finance | 6 | 71000.00 | 52000.00 | 95000.00 | 426000.00
Warehouse | 8 | 38000.00 | 28000.00 | 52000.00 | 304000.00
HR | 5 | 58000.00 | 45000.00 | 72000.00 | 290000.00
(5 rows selected)
- Engineering has the highest total payroll despite not being the largest team — SUM and AVG together show why
d.department_namemust appear in GROUP BY because it is in SELECT and not aggregated — Oracle raises ORA-00979 if it is missing
Grouping by Multiple Columns
You can GROUP BY more than one column. Oracle creates one group for every unique combination of values across all the grouped columns. Grouping by category and brand gives one row per unique category-brand pair — not one per category alone, not one per brand alone. Five brands in Electronics means five Electronics rows.
SELECT
category,
brand,
COUNT(*) AS products,
SUM(stock) AS total_stock,
MIN(price) AS min_price,
MAX(price) AS max_price
FROM dataplexa_store.products
WHERE brand IS NOT NULL -- WHERE runs before grouping -- unbranded rows are excluded before groups form
GROUP BY category, brand
ORDER BY category, products DESC;------------|------------|----------|-------------|-----------|----------
Electronics | LogiTech | 5 | 487 | 24.99 | 89.99
Electronics | ViewMax | 4 | 127 | 199.99 | 1299.99
Electronics | TechLink | 4 | 312 | 8.99 | 49.99
Electronics | KeyMaster | 3 | 184 | 69.99 | 149.99
Furniture | DeskPro | 5 | 98 | 149.99 | 899.99
Furniture | ChairCo | 4 | 86 | 49.99 | 449.99
Stationery | PaperCo | 9 | 1840 | 1.99 | 24.99
(7 rows selected)
- Seven rows — one per unique category and brand combination found in the data
- Rows where brand was NULL were removed by WHERE before grouping — they never entered any group
- The order of columns in
GROUP BY category, branddoes not change which groups are formed
Grouping on Date Parts — The Key Oracle Difference
Grouping by year or month is one of the most common GROUP BY tasks. This is where Oracle and SQL Server differ most visibly. SQL Server provides standalone YEAR() and MONTH() functions. Oracle uses EXTRACT() to pull a named part from a DATE column as a number, or TO_CHAR() to format the date as a string. Both work — EXTRACT() is better when you need the value as a number, TO_CHAR() is better when you want a formatted label like 2024-03 directly in the output.
| Task | Oracle | SQL Server |
|---|---|---|
| Extract year as number | EXTRACT(YEAR FROM col) |
YEAR(col) |
| Extract month as number | EXTRACT(MONTH FROM col) |
MONTH(col) |
| Format as year-month label | TO_CHAR(col, 'YYYY-MM') |
FORMAT(col, 'yyyy-MM') |
| Truncate date to month | TRUNC(col, 'MM') |
DATETRUNC('month', col) |
| Date literal syntax | DATE '2024-01-01' |
'2024-01-01' |
| Alias in GROUP BY | Not allowed — ORA-00979 if used | Not allowed — same error |
| Alias in ORDER BY | Allowed | Allowed |
-- EXTRACT returns year and month as separate numbers.
-- SQL Server equivalent: YEAR(order_date) and MONTH(order_date).
SELECT
EXTRACT(YEAR FROM order_date) AS order_year,
EXTRACT(MONTH FROM order_date) AS order_month,
COUNT(*) AS total_orders,
SUM(total_amount) AS revenue
FROM dataplexa_store.orders
WHERE order_date >= DATE '2024-01-01' -- DATE keyword is required for Oracle date literals
GROUP BY EXTRACT(YEAR FROM order_date),
EXTRACT(MONTH FROM order_date) -- full expression repeated -- aliases not allowed in GROUP BY
ORDER BY order_year, order_month; -- aliases ARE allowed in ORDER BY-----------|-------------|--------------|----------
2024 | 1 | 38 | 4820.50
2024 | 2 | 41 | 5240.00
2024 | 3 | 52 | 6830.75
2024 | 4 | 47 | 5920.20
(4 rows selected)
-- TO_CHAR formats the date as a string label like '2024-03'.
-- Useful when you want the formatted month directly in the output column.
-- SQL Server equivalent: FORMAT(order_date, 'yyyy-MM').
SELECT
TO_CHAR(order_date, 'YYYY-MM') AS order_month,
COUNT(*) AS total_orders,
SUM(total_amount) AS revenue,
ROUND(AVG(total_amount), 2) AS avg_order_value
FROM dataplexa_store.orders
WHERE order_date >= DATE '2024-01-01'
GROUP BY TO_CHAR(order_date, 'YYYY-MM') -- full expression repeated -- alias not allowed in GROUP BY
ORDER BY order_month;------------|--------------|----------|----------------
2024-01 | 38 | 4820.50 | 126.86
2024-02 | 41 | 5240.00 | 127.80
2024-03 | 52 | 6830.75 | 131.36
2024-04 | 47 | 5920.20 | 125.96
(4 rows selected)
EXTRACT()returns a NUMBER — better when you need to do arithmetic or comparison on the year or month valueTO_CHAR()returns a VARCHAR2 string — better when you want a readable label like2024-03in the output- In both cases the full expression must be repeated in GROUP BY — Oracle does not resolve SELECT aliases at that stage
ORDER BY order_monthworks because ORDER BY is the last clause Oracle evaluates — aliases are already resolved by then
Lesson Summary
| Concept | Oracle | SQL Server Equivalent |
|---|---|---|
| Basic GROUP BY | Identical syntax — one row per distinct value | Identical |
| Ungrouped column | ORA-00979: not a GROUP BY expression | Column is invalid in select list |
| Extract year | EXTRACT(YEAR FROM col) |
YEAR(col) |
| Extract month | EXTRACT(MONTH FROM col) |
MONTH(col) |
| Month label | TO_CHAR(col, 'YYYY-MM') |
FORMAT(col, 'yyyy-MM') |
| Date literal | DATE '2024-01-01' |
'2024-01-01' |
| Alias in GROUP BY | Not allowed — repeat the full expression | Not allowed — same rule |
| Alias in ORDER BY | Allowed | Allowed |
| NULL in grouping column | NULLs form their own group | Identical |
Practice Questions
Practice 1. What Oracle error code is returned when a column appears in SELECT but is not in GROUP BY and not aggregated?
Practice 2. What is the Oracle equivalent of SQL Server's MONTH(order_date)?
Practice 3. You define the alias order_month in SELECT. Can you write GROUP BY order_month in Oracle?
Practice 4. Does WHERE filter rows before or after GROUP BY executes?
Practice 5. Which Oracle function formats a DATE column as the string '2024-03' for grouping and display?
Quiz
Quiz 1. You want one row per department showing total payroll. Which query is correct in Oracle?
Quiz 2. GROUP BY category, brand produces one row per — what?
Quiz 3. You write GROUP BY order_month where order_month is a SELECT alias. What does Oracle do?
Quiz 4. Which Oracle syntax correctly groups orders by year and month as separate numeric columns?
Quiz 5. What happens to rows where the GROUP BY column contains NULL?
Next up — HAVING Clause — How to filter groups after aggregation, the difference between WHERE and HAVING, and when to use each one.