PostgreSQL Lesson 23 – Aggregate Functions | Dataplexa

Aggregate Functions

Aggregate functions take a set of rows and return a single summary value. They are how you answer questions like — how many customers do we have, what is our total revenue, what is the average order value, who is our highest paid employee. Every reporting, analytics, and dashboard query you will ever write uses aggregate functions. This lesson covers all five core aggregates with real examples from the Dataplexa Store.

COUNT — Counting Rows

COUNT is the most used aggregate function. COUNT(*) counts every row including those with NULL values. COUNT(column) counts only the rows where that specific column is not NULL — a useful distinction when dealing with optional fields.

-- Count total number of customers
SELECT COUNT(*) AS total_customers FROM customers;

-- Count total orders
SELECT COUNT(*) AS total_orders FROM orders;

-- Count only orders with a recorded total amount (excludes NULLs)
SELECT COUNT(total_amount) AS orders_with_amount FROM orders;

-- Count how many customers have a phone number on record
SELECT COUNT(phone) AS customers_with_phone FROM customers;

-- Count distinct values — how many unique states are customers from
SELECT COUNT(DISTINCT state) AS unique_states FROM customers;

total_customers
-----------------
               50

total_orders
--------------
           200

orders_with_amount
--------------------
               187

customers_with_phone
----------------------
                   19

unique_states
---------------
            12

SUM — Adding Up Values

SUM adds up all the values in a numeric column. It ignores NULL values automatically. This is the function you use for total revenue, total inventory value, total payroll, and any other running total.

-- Total revenue from all orders
SELECT SUM(total_amount) AS total_revenue FROM orders;

-- Total value of all stock currently held
SELECT SUM(price * stock_qty) AS total_inventory_value FROM products;

-- Total payroll cost
SELECT SUM(salary) AS total_payroll FROM employees;

-- Total quantity of items ever sold
SELECT SUM(quantity) AS total_units_sold FROM order_items;

-- SUM with a WHERE filter — revenue from delivered orders only
SELECT SUM(total_amount) AS delivered_revenue
FROM orders
WHERE status = 'delivered';

total_revenue
---------------
    24876.43

total_inventory_value
-----------------------
         18432.60

total_payroll
---------------
   1438500.00

total_units_sold
------------------
            892

delivered_revenue
-------------------
       14203.18

AVG — Calculating the Average

AVG returns the arithmetic mean of a numeric column, ignoring NULLs. Use it for average order value, average product price, average salary, average rating — any metric where the mean tells you something meaningful.

-- Average order value
SELECT AVG(total_amount) AS avg_order_value FROM orders;

-- Average product price
SELECT AVG(price) AS avg_product_price FROM products;

-- Average salary across all employees
SELECT AVG(salary) AS avg_salary FROM employees;

-- Average salary by department
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

-- Round to 2 decimal places — AVG often returns many decimal digits
SELECT ROUND(AVG(total_amount), 2) AS avg_order_value FROM orders;

avg_order_value
-----------------
124.382150000000

avg_product_price
-------------------
  97.490000000000

-- Rounded:
avg_order_value
-----------------
          124.38

-- By department:
department  | avg_salary
-------------+------------
Engineering | 87500.00
Sales       | 63250.00
Marketing   | 58000.00

MIN and MAX — Smallest and Largest Values

MIN returns the smallest value in a column and MAX returns the largest. They work on numbers, dates, and text (alphabetical order). Use them to find the cheapest product, the most recent order, the highest salary, or the earliest signup date.

-- Cheapest and most expensive product
SELECT MIN(price) AS cheapest, MAX(price) AS most_expensive FROM products;

-- Smallest and largest order values
SELECT MIN(total_amount) AS smallest_order,
       MAX(total_amount) AS largest_order
FROM orders;

-- First and most recent order date
SELECT MIN(order_date) AS first_order,
       MAX(order_date) AS latest_order
FROM orders;

-- Lowest and highest salary
SELECT MIN(salary) AS lowest_salary,
       MAX(salary) AS highest_salary
FROM employees;

-- All five aggregates together in one query
SELECT
  COUNT(*)                      AS total_orders,
  SUM(total_amount)             AS revenue,
  ROUND(AVG(total_amount), 2)   AS avg_order,
  MIN(total_amount)             AS smallest,
  MAX(total_amount)             AS largest
FROM orders
WHERE status != 'cancelled';

GROUP BY — Aggregating in Groups

GROUP BY splits rows into groups and applies the aggregate function to each group separately. This is where aggregates become truly powerful — instead of one number for the whole table, you get one number per category, per month, per department, per customer.

-- Number of orders per status
SELECT status, COUNT(*) AS order_count
FROM orders
GROUP BY status
ORDER BY order_count DESC;

-- Revenue per order status
SELECT status,
       COUNT(*) AS order_count,
       SUM(total_amount) AS total_revenue
FROM orders
GROUP BY status
ORDER BY total_revenue DESC;

-- Number of products per category
SELECT category, COUNT(*) AS product_count, ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category
ORDER BY product_count DESC;

-- Total spending per customer
SELECT customer_id,
       COUNT(*) AS order_count,
       SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC
LIMIT 5;

HAVING — Filtering Groups

HAVING filters groups after GROUP BY has been applied — it is the WHERE clause for aggregated results. You cannot use WHERE to filter on an aggregate result like COUNT(*) > 5 because WHERE runs before grouping. HAVING runs after and lets you filter on those aggregate values.

-- Customers who have placed more than 5 orders
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY order_count DESC;

-- Categories where average price exceeds $50
SELECT category, ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category
HAVING AVG(price) > 50
ORDER BY avg_price DESC;

-- States with more than 3 customers
SELECT state, COUNT(*) AS customer_count
FROM customers
GROUP BY state
HAVING COUNT(*) > 3
ORDER BY customer_count DESC;

-- Customers who have spent more than $500 total
SELECT customer_id,
       SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(total_amount) > 500
ORDER BY total_spent DESC;

WHERE vs HAVING — Knowing Which to Use

WHERE filters individual rows before grouping — use it to limit which rows enter the aggregate calculation. HAVING filters groups after aggregation — use it to limit which groups appear in the results. You can use both in the same query.

-- WHERE filters rows first, HAVING filters groups after
-- Find categories of active products where average price exceeds $40
SELECT category,
       COUNT(*) AS product_count,
       ROUND(AVG(price), 2) AS avg_price
FROM products
WHERE is_active = TRUE          -- WHERE: only include active products (row filter)
GROUP BY category
HAVING AVG(price) > 40          -- HAVING: only show categories avg > $40 (group filter)
ORDER BY avg_price DESC;

  category   | product_count | avg_price
-------------+---------------+-----------
Furniture   |             4 |    274.99
Electronics |             8 |     62.44
Accessories |             9 |     43.22