Power BI Course
Grouping and Aggregation
Group By is Power Query's way of summarising rows — collapsing many rows into fewer rows by aggregating values. It is the equivalent of a SQL GROUP BY or an Excel PivotTable, but it runs before data ever reaches your model. Knowing when to group inside Power Query versus letting DAX handle aggregation later is one of the most important judgement calls in building efficient Power BI reports.
What Group By Does
Group By takes a table with many rows and produces a smaller table where each row represents a unique combination of the grouping columns, with one or more aggregated values computed from all the rows that shared that combination.
| Region | Category | Revenue |
|---|---|---|
| North | Electronics | 1,200 |
| North | Electronics | 980 |
| North | Furniture | 850 |
| South | Electronics | 670 |
| South | Furniture | 1,540 |
| South | Furniture | 430 |
| East | Accessories | 210 |
| East | Accessories | 95 |
| Region | Category | Total Revenue |
|---|---|---|
| North | Electronics | 2,180 |
| North | Furniture | 850 |
| South | Electronics | 670 |
| South | Furniture | 1,970 |
| East | Accessories | 305 |
The Group By Dialog
Group By is accessed from the Transform tab → Group By. The dialog has two modes — Basic (one group column, one aggregation) and Advanced (multiple group columns and multiple aggregated output columns). You will almost always use Advanced.
All Available Aggregation Functions
| Function | What it computes | Practical example |
|---|---|---|
| Sum | Adds all values in the column for the group | Total Revenue per Region |
| Average | Mean of all values in the column for the group | Average order value per Customer |
| Median | Middle value when sorted — less sensitive to outliers than Average | Median delivery time per Category |
| Min / Max | Smallest or largest value in the column for the group | First order date (Min) and last order date (Max) per Customer |
| Count Rows | Number of rows in the group — no column selection needed | Number of orders per Region |
| Count Distinct Rows | Number of unique values in the selected column for the group | Number of distinct products sold per Region |
| All Rows | Returns the entire sub-table of rows for that group — advanced use only. Creates a nested table column that you can then expand or process further. | Get all order rows for each Customer as a nested table |
Multiple Aggregations in One Step
One of the biggest advantages of the Advanced mode is computing several aggregations simultaneously in a single Group By step. This is more efficient than chaining multiple separate steps and produces a clean summary table with all the metrics you need in one pass.
// Group By: Region — compute 4 aggregations at once
// Transform tab → Group By → Advanced
Group by columns:
Region
Aggregated output columns:
Name Operation Column
─────────────────────────────────────────────────
Total Revenue → Sum → Revenue
Order Count → Count Rows → (none needed)
Avg Order Val → Average → Revenue
Max Order → Max → Revenue
M code generated:
= Table.Group(
#"Filtered Rows",
{"Region"},
{
{"Total Revenue", each List.Sum([Revenue]), type number},
{"Order Count", each Table.RowCount(_), Int64.Type},
{"Avg Order Val", each List.Average([Revenue]),type number},
{"Max Order", each List.Max([Revenue]), type number}
}
)
| Region | Total Revenue | Order Count | Avg Order Val | Max Order |
|---|---|---|---|---|
| North | $3,030 | 3 | $1,010 | $1,200 |
| South | $2,640 | 2 | $1,320 | $1,540 |
| East | $640 | 2 | $320 | $430 |
| West | $2,520 | 2 | $1,260 | $1,540 |
When to Group in Power Query vs DAX
Both Power Query and DAX can aggregate data — but they do it at different points in the pipeline and for different reasons. Choosing the wrong tool leads to either wasted model memory or inflexible reports that cannot respond to slicer selections.
| Group By in Power Query | Aggregation in DAX | |
|---|---|---|
| When it runs | At load time — before the model is built. Runs once on refresh. | At query time — when a visual renders. Recalculates every time a slicer changes. |
| Responds to slicers? | No — the result is fixed. A Region slicer cannot re-aggregate a Power Query grouped table. | Yes — DAX measures recalculate in response to every filter and slicer automatically. |
| Model size impact | Reduces rows loaded into the model — smaller file, faster load. Ideal for very large transaction tables where full detail is not needed. | Full detail table loaded. More rows in model but maximum analytical flexibility. |
| Use when | You need a pre-summarised lookup or reference table. The source has millions of rows but reports only ever show monthly/regional totals. No drill-down to individual row detail is needed. | You need totals that respond to filters and slicers. You want to drill from summary to detail. The report audience needs flexible, interactive analysis. |
| Real example | A 50M-row web event log where you only need daily session counts — group to daily summaries in Power Query before loading. | A 200K-row orders table where users drill from total revenue → by region → by customer → to individual order rows. |
Practical Pattern — Building a Customer Summary Table
A common pattern is to use Group By to create a customer-level summary table alongside the detail Orders table. The summary table gives you pre-computed customer metrics — lifetime value, first purchase date, order count — that are fast to display in cards and tables without complex DAX.
Pattern: Create CustomerSummary as a reference query
─────────────────────────────────────────────────────
Step 1 — Right-click the Orders query → Reference
Rename the new query: CustomerSummary
Step 2 — Transform tab → Group By → Advanced
Group by: CustomerName
Aggregated columns:
Lifetime Value = Sum of Revenue
Order Count = Count Rows
First Order = Min of OrderDate
Last Order = Max of OrderDate
Avg Order Value = Average of Revenue
Step 3 — Click OK
CustomerSummary now has one row per customer
with all five metrics pre-computed
Step 4 — Close & Apply
Both Orders (detail) and CustomerSummary load
into the model
Step 5 — In Model View
Connect CustomerSummary.CustomerName (1)
to Orders.CustomerName (*) — one-to-many
relationship
Result in reports:
Cards showing Lifetime Value respond to customer
selection even though values are pre-computed —
because the relationship propagates the filter
| CustomerName | Lifetime Value | Order Count | First Order | Last Order | Avg Order Val |
|---|---|---|---|---|---|
| Alice Brown | $1,295 | 2 | 2024-01-05 | 2024-02-14 | $647.50 |
| Bob Singh | $850 | 1 | 2024-01-18 | 2024-01-18 | $850.00 |
| Carol Lee | $640 | 2 | 2024-02-03 | 2024-04-11 | $320.00 |
| David Kim | $2,520 | 2 | 2024-03-07 | 2024-04-29 | $1,260.00 |
Counting vs Counting Distinct
The difference between Count Rows and Count Distinct Rows is one of the most common sources of incorrect metrics in Power BI reports. Understanding which one you need before you build prevents a whole category of wrong-number errors.
| Function | Counts | Example — Orders grouped by Region |
|---|---|---|
| Count Rows | Total number of rows in the group — including duplicates | North has 3 rows → Count Rows = 3 (number of orders placed in North) |
| Count Distinct Rows | Number of unique values in a specific column within the group | North has 3 orders but only 2 unique customers (Alice placed 2 orders) → Count Distinct on CustomerName = 2 |
Teacher's Note: The most important thing to remember about Group By in Power Query is that it is permanent — once you load the grouped table into the model, the detail rows are gone. If a viewer later asks "can you show me the individual orders behind that total?", you cannot drill down because the detail no longer exists. Before grouping, always ask: will anyone ever need to see the row-level detail? If yes, keep the full table and let DAX do the aggregation. Use Power Query grouping only when you are absolutely certain the detail rows serve no purpose in your report.
Practice
Practice 1 of 3
You want to create a summary table showing Total Revenue and Order Count per Region. In the Group By dialog Advanced mode, you set the group column to Region, add one aggregation with Operation = Sum and Column = Revenue, and add a second aggregation with Operation = ___ Rows (which counts all rows in each group without needing a column selection).
Practice 2 of 3
You have an Orders table with 500,000 rows. Reports only ever show monthly revenue totals — nobody ever drills to individual order rows. To reduce model size, you should group the table in Power Query by Month and compute Sum of Revenue ___ loading the detail rows into the model.
Practice 3 of 3
You group Orders by Region and use Count Distinct Rows on CustomerName. This gives you the number of ___ customers per Region, not the total number of orders placed in each region.
Lesson Quiz
Quiz 1 of 3
You use Group By in Power Query to create a table showing Total Revenue per Region. A viewer adds a Category slicer to the report and selects "Electronics" — but the Total Revenue card does not change. Why?
Quiz 2 of 3
You need the earliest and latest order date per Customer from the Orders table. Which Group By configuration produces this correctly?
Quiz 3 of 3
You want to create a CustomerSummary query that shares the same source reading as the Orders query to avoid double-loading the Excel file. What is the correct starting point?
Next up — Lesson 21 covers Splitting and Merging Columns — every technique for separating values crammed into one column and combining values spread across multiple columns, including delimiter splits, character-count splits, and the Column from Examples shortcut.