Power BI Course
Advanced Cleaning Techniques
Basic cleaning removes blank rows and fixes column names. Advanced cleaning handles the subtler problems — cell-level errors that break type changes, null values that silently corrupt calculations, inconsistent text that prevents accurate grouping, and source data that is messy in different ways every time it arrives. This lesson gives you the complete toolkit for making Power Query queries that are robust under real-world conditions.
Understanding Cell-Level Errors
An error at the cell level is different from a step-level error. A step-level error breaks the entire query — nothing loads. A cell-level error is contained to one cell in one row — the rest of the column loads fine, but that cell shows the word "Error" in red. Both need to be handled, but in different ways.
| OrderID | CustomerName | Revenue | Status |
|---|---|---|---|
| 1001 | Alice Brown | 1,200 | Shipped |
| 1002 | Bob Singh | Error | Delivered |
| 1003 | Carol Lee | 430 | Processing |
| 1004 | David Kim | Error | Shipped |
| 1005 | Eve Patel | 1,540 | Delivered |
To see the exact error message for any red cell, click the cell in the preview. Power Query shows the error type and message below the grid. The three most common causes of cell errors are type conversion failures, division by zero, and out-of-range values.
| Error type | Common cause | Example |
|---|---|---|
| DataFormat.Error | Type conversion failed — the value cannot be parsed as the target type | Revenue column contains "N/A" — cannot convert to Decimal Number |
| Expression.Error | A calculation produced an invalid result — often division by zero | Custom column divides Revenue by Quantity — some Quantity values are 0 |
| Number.FromText error | Text-to-number conversion on a cell containing non-numeric characters | Revenue cell contains "$1,200" — the $ and comma prevent numeric conversion |
| Formula.Firewall | Privacy level mismatch between two queries being combined | A Private-level query is referenced by an Organisational-level query |
Three Ways to Handle Cell Errors
Option 1 — Replace Errors
Replace Errors substitutes a fixed value for every error in the selected column. The replacement value becomes part of the loaded data — DAX measures will use this value in calculations. Choose null when you want errors to be ignored by aggregations, or 0 when you need a numeric placeholder.
| OrderID | Revenue |
|---|---|
| 1001 | 1,200 |
| 1002 | Error |
| 1003 | 430 |
| 1004 | Error |
| OrderID | Revenue |
|---|---|
| 1001 | 1,200 |
| 1002 | null |
| 1003 | 430 |
| 1004 | null |
Option 2 — Remove Errors
Remove Errors deletes the entire row for any row that contains an error in the selected column. Use this when error rows are genuinely invalid and should not be included in any calculation — for example, rows where the primary key is an error and the row is meaningless without it.
Option 3 — Fix the Source of Errors
The cleanest approach is to fix the root cause before the error occurs. If Revenue contains "N/A" strings that fail type conversion, add a Replace Values step before the Changed Type step — replace "N/A" with null first, then change the type. The type change now succeeds because null converts cleanly to any type.
// Robust pattern — fix before the type change, not after
// Step 1 — Replace Values: clean known bad text values first
// Transform tab → Replace Values
#"Replaced N/A" = Table.ReplaceValue(
#"Promoted Headers",
"N/A", null, // find "N/A", replace with null
Replacer.ReplaceValue,
{"Revenue"} // in the Revenue column only
),
// Step 2 — Replace Values: clean currency symbols
#"Removed Dollar Sign" = Table.ReplaceValue(
#"Replaced N/A",
"$", "", // find "$", replace with empty string
Replacer.ReplaceText, // ReplaceText for partial matches
{"Revenue"}
),
// Step 3 — Remove commas from number formatting
#"Removed Comma" = Table.ReplaceValue(
#"Removed Dollar Sign",
",", "",
Replacer.ReplaceText,
{"Revenue"}
),
// Step 4 — Now change type safely — no errors will occur
#"Changed Type" = Table.TransformColumnTypes(
#"Removed Comma",
{{"Revenue", type number}}
)
Handling Null Values
Null is Power Query's representation of a missing or unknown value. It is not zero, not an empty string, and not the word "null" — it is the complete absence of a value. Nulls behave differently depending on context, and understanding those differences prevents silent calculation errors.
| Context | How null behaves | Practical implication |
|---|---|---|
| In SUM / AVERAGE | Null rows are excluded from the calculation | SUM ignores nulls correctly. AVERAGE divides by the count of non-null rows, not total rows — which is usually what you want. |
| In COUNT | COUNT does not count null rows — COUNTA counts non-blank, COUNTROWS counts all rows including nulls | COUNT([Revenue]) returns 3 if 2 rows have null Revenue. COUNTROWS(Orders) returns 5 regardless of nulls. |
| In text concatenation | Concatenating null with text using & produces an error in M code | [FirstName] & " " & [LastName] errors if either is null. Wrap with: (if [FirstName] = null then "" else [FirstName]) |
| In relationships | Null key values in a fact table do not join to any row in the dimension table | Orders with null CustomerID have no customer in the model — they appear as "Blank" in visuals. Replace nulls with a known default value if you want them to map to a specific row. |
| In filters | Null is treated as a distinct value — filter dropdowns show "(blank)" as a separate option | When filtering out invalid rows, add a specific condition for null: filter where [Status] <> null AND [Status] <> "Cancelled". |
Filling Nulls — Fill Down and Fill Up
A common pattern in exported spreadsheets is a merged cell that was exported as one value followed by nulls. The Department column might say "Sales" in the first row of a group, then null for all subsequent rows in that group. Fill Down replaces each null with the value from the row above — reconstructing the implied grouping.
| Department | Employee |
|---|---|
| Sales | Alice |
| null | Bob |
| null | Carol |
| Engineering | David |
| null | Eve |
| Department | Employee |
|---|---|
| Sales | Alice |
| Sales | Bob |
| Sales | Carol |
| Engineering | David |
| Engineering | Eve |
Standardising Inconsistent Text
Inconsistent text in dimension columns silently breaks grouping, relationships, and filters. "New York", "new york", "NEW YORK", and "New York " (trailing space) are four different values in Power Query — they will not group together and will not match across tables. Fixing text consistency is one of the highest-impact cleaning steps you can do.
| Problem | Fix in Power Query | M function |
|---|---|---|
| Mixed case | Transform → Format → Capitalize Each Word (or Lowercase / Uppercase depending on your standard) | Text.Proper([City]) |
| Leading/trailing spaces | Transform → Format → Trim. Apply to every text column by default — invisible spaces are the most common cause of merge failures between tables. | Text.Trim([City]) |
| Double spaces | Replace Values: find two spaces " " → replace with one space " ". Repeat until no double spaces remain. Or use a Custom Column with Text.Trim which also collapses internal spaces. | Text.Trim handles edges only — use Replace for internals |
| Abbreviations vs full names | Replace Values: "NY" → "New York", "CA" → "California". Create a lookup table and merge it if there are many abbreviations to standardise. | Table.ReplaceValue(...) |
| Non-printable characters | Transform → Format → Clean. Removes characters below ASCII 32 — tab characters, line breaks, null bytes — that are invisible in the preview but break matching. | Text.Clean([Column]) |
Column Quality — Finding Problems Fast
Before writing any cleaning steps, turn on Column Quality in the View tab. It shows a mini data quality bar below each column header — the percentage of values that are valid, in error, or empty. This gives you an immediate picture of which columns need attention and how severe the problems are.
|
OrderID
100% valid
|
Revenue
60% valid40% error
|
CustomerName
80% valid20% empty
|
Status
100% valid
|
|---|---|---|---|
| 1001 | 1,200 | Alice Brown | Shipped |
| 1002 | Error | null | Delivered |
A Complete Robust Cleaning Template
// Robust cleaning sequence — apply to every new data source
// 1. Fix structure first
#"Removed Top Rows" = Table.Skip(Source, 2),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", ...),
#"Removed Bottom Rows" = Table.RemoveLastN(#"Promoted Headers", 1),
// 2. Clean text columns BEFORE type changes
// Trim removes leading/trailing spaces
// Clean removes non-printable characters
#"Trimmed Text" = Table.TransformColumns(
#"Removed Bottom Rows",
{
{"CustomerName", Text.Trim},
{"City", Text.Trim},
{"Status", Text.Trim}
}
),
#"Cleaned Text" = Table.TransformColumns(
#"Trimmed Text",
{
{"CustomerName", Text.Clean},
{"City", Text.Clean},
{"Status", Text.Clean}
}
),
// 3. Standardise case
#"Proper Case" = Table.TransformColumns(
#"Cleaned Text",
{
{"CustomerName", Text.Proper},
{"City", Text.Proper},
{"Status", Text.Proper}
}
),
// 4. Replace known bad values before type changes
#"Replaced N/A" = Table.ReplaceValue(#"Proper Case", "N/A", null, Replacer.ReplaceValue, {"Revenue"}),
#"Removed Dollar" = Table.ReplaceValue(#"Replaced N/A", "$", "", Replacer.ReplaceText, {"Revenue"}),
#"Removed Comma" = Table.ReplaceValue(#"Removed Dollar", ",", "", Replacer.ReplaceText, {"Revenue"}),
// 5. Now set types safely
#"Changed Type" = Table.TransformColumnTypes(
#"Removed Comma",
{{"OrderID", Int64.Type}, {"OrderDate", type date},
{"Revenue", type number}, {"CustomerName", type text}}
),
// 6. Handle any remaining errors after type change
#"Replaced Errors" = Table.ReplaceErrorValues(
#"Changed Type",
{{"Revenue", null}} // replace any remaining Revenue errors with null
),
// 7. Fill down merged-cell nulls
#"Filled Down Department" = Table.FillDown(#"Replaced Errors", {"Department"}),
// 8. Remove rows where key columns are null
#"Removed Null Orders" = Table.SelectRows(
#"Filled Down Department",
each [OrderID] <> null
)
Teacher's Note: The order of cleaning steps matters more than the steps themselves. The pattern that breaks the fewest queries in practice is: structure → clean text → replace bad values → change types → catch remaining errors. Every beginner's instinct is to change types first because it is the most visible step. Resist that instinct. Changing types before cleaning text is like painting a wall before plastering it — the errors underneath ruin the result. Clean first, type-convert second, catch errors third.
Practice
Practice 1 of 3
A Department column has "Sales" in the first row of each group and null in all subsequent rows — a result of merged cells in the original Excel file. To propagate the group label to every row, you select the column and apply Transform → Fill → ___.
Practice 2 of 3
To see at a glance which columns have errors, which are empty, and what percentage of values in each column are valid — all without clicking individual cells — you enable ___ Quality in the View tab.
Practice 3 of 3
A Revenue column contains the value "$1,200" as text. To convert it to a number without errors, you need to remove the $ and comma using Replace Values steps ___ the Changed Type step in the Applied Steps list.
Lesson Quiz
Quiz 1 of 3
You use Replace Errors with the value 0 on a Revenue column that has 15 error cells. Later you notice the average revenue per order is artificially low. What is the cause and what should you have used instead?
Quiz 2 of 3
A City column contains "new york", "New York", "NEW YORK", and "New York " (with a trailing space). After applying Trim and then Capitalize Each Word, how many distinct City values remain?
Quiz 3 of 3
You build a Custom Column with the formula [FirstName] & " " & [LastName]. Some rows have null in LastName. What happens to those rows and how do you fix it?
Next up — Lesson 24 covers Replacing Errors and Handling Exceptions — a deeper look at error detection, conditional error handling with try…otherwise in M code, and building queries that recover gracefully when source data is unpredictable.