Power BI Lesson 20 – Grouping Data | Dataplexa
Power Query · Lesson 20

Grouping and Aggregation

Group By is Power Query's way of summarising rows — collapsing many rows into fewer rows by aggregating values. It is the equivalent of a SQL GROUP BY or an Excel PivotTable, but it runs before data ever reaches your model. Knowing when to group inside Power Query versus letting DAX handle aggregation later is one of the most important judgement calls in building efficient Power BI reports.

What Group By Does

Group By takes a table with many rows and produces a smaller table where each row represents a unique combination of the grouping columns, with one or more aggregated values computed from all the rows that shared that combination.

Before — 8 detail rows
Region Category Revenue
NorthElectronics1,200
NorthElectronics980
NorthFurniture850
SouthElectronics670
SouthFurniture1,540
SouthFurniture430
EastAccessories210
EastAccessories95
After — Group By Region + Category → Sum Revenue
Region Category Total Revenue
NorthElectronics2,180
NorthFurniture850
SouthElectronics670
SouthFurniture1,970
EastAccessories305

The Group By Dialog

Group By is accessed from the Transform tab → Group By. The dialog has two modes — Basic (one group column, one aggregation) and Advanced (multiple group columns and multiple aggregated output columns). You will almost always use Advanced.

Group By dialog — Advanced mode
Group by these columns
Region
Category
+ Add grouping
Add aggregated columns
Total Revenue
Sum
Revenue
Order Count
Count Rows
(not required for Count Rows)
+ Add aggregation
OK
Cancel

All Available Aggregation Functions

Function What it computes Practical example
Sum Adds all values in the column for the group Total Revenue per Region
Average Mean of all values in the column for the group Average order value per Customer
Median Middle value when sorted — less sensitive to outliers than Average Median delivery time per Category
Min / Max Smallest or largest value in the column for the group First order date (Min) and last order date (Max) per Customer
Count Rows Number of rows in the group — no column selection needed Number of orders per Region
Count Distinct Rows Number of unique values in the selected column for the group Number of distinct products sold per Region
All Rows Returns the entire sub-table of rows for that group — advanced use only. Creates a nested table column that you can then expand or process further. Get all order rows for each Customer as a nested table

Multiple Aggregations in One Step

One of the biggest advantages of the Advanced mode is computing several aggregations simultaneously in a single Group By step. This is more efficient than chaining multiple separate steps and produces a clean summary table with all the metrics you need in one pass.

// Group By: Region — compute 4 aggregations at once
// Transform tab → Group By → Advanced

Group by columns:
  Region

Aggregated output columns:
  Name             Operation          Column
  ─────────────────────────────────────────────────
  Total Revenue  → Sum             → Revenue
  Order Count    → Count Rows      → (none needed)
  Avg Order Val  → Average         → Revenue
  Max Order      → Max             → Revenue

M code generated:
= Table.Group(
    #"Filtered Rows",
    {"Region"},
    {
        {"Total Revenue", each List.Sum([Revenue]),    type number},
        {"Order Count",   each Table.RowCount(_),      Int64.Type},
        {"Avg Order Val", each List.Average([Revenue]),type number},
        {"Max Order",     each List.Max([Revenue]),    type number}
    }
)
Result — one row per Region, four computed columns
Region Total Revenue Order Count Avg Order Val Max Order
North $3,030 3 $1,010 $1,200
South $2,640 2 $1,320 $1,540
East $640 2 $320 $430
West $2,520 2 $1,260 $1,540
8 detail rows collapsed into 4 summary rows · all four metrics computed in a single Group By step

When to Group in Power Query vs DAX

Both Power Query and DAX can aggregate data — but they do it at different points in the pipeline and for different reasons. Choosing the wrong tool leads to either wasted model memory or inflexible reports that cannot respond to slicer selections.

Group By in Power Query Aggregation in DAX
When it runs At load time — before the model is built. Runs once on refresh. At query time — when a visual renders. Recalculates every time a slicer changes.
Responds to slicers? No — the result is fixed. A Region slicer cannot re-aggregate a Power Query grouped table. Yes — DAX measures recalculate in response to every filter and slicer automatically.
Model size impact Reduces rows loaded into the model — smaller file, faster load. Ideal for very large transaction tables where full detail is not needed. Full detail table loaded. More rows in model but maximum analytical flexibility.
Use when You need a pre-summarised lookup or reference table. The source has millions of rows but reports only ever show monthly/regional totals. No drill-down to individual row detail is needed. You need totals that respond to filters and slicers. You want to drill from summary to detail. The report audience needs flexible, interactive analysis.
Real example A 50M-row web event log where you only need daily session counts — group to daily summaries in Power Query before loading. A 200K-row orders table where users drill from total revenue → by region → by customer → to individual order rows.

Practical Pattern — Building a Customer Summary Table

A common pattern is to use Group By to create a customer-level summary table alongside the detail Orders table. The summary table gives you pre-computed customer metrics — lifetime value, first purchase date, order count — that are fast to display in cards and tables without complex DAX.

Pattern: Create CustomerSummary as a reference query
─────────────────────────────────────────────────────

Step 1 — Right-click the Orders query → Reference
         Rename the new query: CustomerSummary

Step 2 — Transform tab → Group By → Advanced
         Group by: CustomerName

         Aggregated columns:
           Lifetime Value  = Sum        of Revenue
           Order Count     = Count Rows
           First Order     = Min        of OrderDate
           Last Order      = Max        of OrderDate
           Avg Order Value = Average    of Revenue

Step 3 — Click OK
         CustomerSummary now has one row per customer
         with all five metrics pre-computed

Step 4 — Close & Apply
         Both Orders (detail) and CustomerSummary load
         into the model

Step 5 — In Model View
         Connect CustomerSummary.CustomerName (1)
         to Orders.CustomerName (*) — one-to-many
         relationship

Result in reports:
  Cards showing Lifetime Value respond to customer
  selection even though values are pre-computed —
  because the relationship propagates the filter
CustomerSummary table — loaded into model
CustomerName Lifetime Value Order Count First Order Last Order Avg Order Val
Alice Brown $1,295 2 2024-01-05 2024-02-14 $647.50
Bob Singh $850 1 2024-01-18 2024-01-18 $850.00
Carol Lee $640 2 2024-02-03 2024-04-11 $320.00
David Kim $2,520 2 2024-03-07 2024-04-29 $1,260.00
One row per customer · five metrics pre-computed · connected to Orders table via CustomerName relationship

Counting vs Counting Distinct

The difference between Count Rows and Count Distinct Rows is one of the most common sources of incorrect metrics in Power BI reports. Understanding which one you need before you build prevents a whole category of wrong-number errors.

Function Counts Example — Orders grouped by Region
Count Rows Total number of rows in the group — including duplicates North has 3 rows → Count Rows = 3 (number of orders placed in North)
Count Distinct Rows Number of unique values in a specific column within the group North has 3 orders but only 2 unique customers (Alice placed 2 orders) → Count Distinct on CustomerName = 2

Teacher's Note: The most important thing to remember about Group By in Power Query is that it is permanent — once you load the grouped table into the model, the detail rows are gone. If a viewer later asks "can you show me the individual orders behind that total?", you cannot drill down because the detail no longer exists. Before grouping, always ask: will anyone ever need to see the row-level detail? If yes, keep the full table and let DAX do the aggregation. Use Power Query grouping only when you are absolutely certain the detail rows serve no purpose in your report.

Practice

Practice 1 of 3

You want to create a summary table showing Total Revenue and Order Count per Region. In the Group By dialog Advanced mode, you set the group column to Region, add one aggregation with Operation = Sum and Column = Revenue, and add a second aggregation with Operation = ___ Rows (which counts all rows in each group without needing a column selection).

Practice 2 of 3

You have an Orders table with 500,000 rows. Reports only ever show monthly revenue totals — nobody ever drills to individual order rows. To reduce model size, you should group the table in Power Query by Month and compute Sum of Revenue ___ loading the detail rows into the model.

Practice 3 of 3

You group Orders by Region and use Count Distinct Rows on CustomerName. This gives you the number of ___ customers per Region, not the total number of orders placed in each region.

Lesson Quiz

Quiz 1 of 3

You use Group By in Power Query to create a table showing Total Revenue per Region. A viewer adds a Category slicer to the report and selects "Electronics" — but the Total Revenue card does not change. Why?

Quiz 2 of 3

You need the earliest and latest order date per Customer from the Orders table. Which Group By configuration produces this correctly?

Quiz 3 of 3

You want to create a CustomerSummary query that shares the same source reading as the Orders query to avoid double-loading the Excel file. What is the correct starting point?

Next up — Lesson 21 covers Splitting and Merging Columns — every technique for separating values crammed into one column and combining values spread across multiple columns, including delimiter splits, character-count splits, and the Column from Examples shortcut.