Excel Lesson 28 – Data Cleaning | Dataplexa

Lesson 28 · Power Query Practical

Data Cleaning in Power Query

Data cleaning is where most analysts spend the majority of their time — and it is the area where Power Query delivers the most dramatic improvement over manual Excel work. Handling nulls, fixing inconsistent values, removing duplicates, unpivoting wide tables, and reshaping messy exports are all tasks that used to require formulas, manual editing, or VBA. In Power Query, every cleaning operation is a recorded step that runs automatically on every refresh. Fix the data once, and it stays fixed.

Handling Null Values

Power Query represents missing values as null — not blank cells, not zeros, not dashes. Null is a distinct state meaning "no value present." You can see nulls clearly in the preview pane as grey italic null labels. There are three main strategies for dealing with them: remove the rows that contain them, replace them with a default value, or fill them down from the previous non-null value above.

Three Ways to Handle Null Values

Remove Rows with Null

Right-click column → Remove Empty or use Home → Remove Rows → Remove Blank Rows.

Use when: A null in that column means the entire row is invalid or incomplete.

Replace Null with a Value

Right-click column → Replace Values → Value to find: null → Replace with: 0 (or "Unknown" etc).

Use when: A missing value has a known default — zero for missing sales, "Unknown" for missing category.

Fill Down

Select column → Transform → Fill → Down (or Up).

Use when: Nulls appear because a grouped label was only entered once — fill repeats it for every row in the group.

Fill Down — Fixing a Grouped Export

❌ Before — Department only entered once per group

Department	Name	Salary
Finance	Priya	£61,000
null	James	£67,000
null	Maria	£58,000
Operations	Sarah	£48,000
null	David	£52,000

Transform →
Fill → Down

→

✅ After — every row has its department

Department	Name	Salary
Finance	Priya	£61,000
Finance	James	£67,000
Finance	Maria	£58,000
Operations	Sarah	£48,000
Operations	David	£52,000

Removing Duplicates

Power Query can remove duplicate rows based on all columns or based on specific key columns. Removing based on all columns is the equivalent of Excel's Remove Duplicates feature. Removing based on key columns keeps the first occurrence of each unique key value — useful for deduplicating a table where the same ID appears multiple times and you want to keep one record per ID.

Remove duplicate rows (all columns must match to be a duplicate):
  Home → Remove Rows → Remove Duplicates

Remove duplicates based on a key column only:
  Select the key column (e.g. Employee ID)
  Right-click → Remove Duplicates
  Power Query keeps the first occurrence of each unique key value

Keep only duplicates (find rows that DO repeat):
  Home → Remove Rows → Keep Duplicates
  Useful for auditing — finding IDs or values that appear more than once

Remove Duplicates on Employee ID: 8 rows with 2 duplicate IDs → 6 unique rows retained (first occurrence of each ID kept)

A common scenario: a data extract contains one row per transaction, but multiple transactions share the same customer ID. Removing duplicates on the Customer ID column keeps one row per customer — the first one in the data. If you need the most recent transaction rather than the first, sort by date descending first, then remove duplicates on the ID column.

Unpivoting — Turning Wide Tables Tall

One of the most powerful and frequently needed transformations in Power Query is unpivoting. Wide tables — where months, categories, or regions are spread across multiple columns — need to be restructured into tall tables before they can be analysed properly in PivotTables, charts, or formulas. Unpivoting collapses those column headers into rows.

Unpivot — Wide to Tall

❌ Wide format — months as columns (cannot PivotTable or chart cleanly)

Rep	Jan	Feb	Mar	Apr
Priya	£32,000	£29,000	£31,000	£27,000
James	£28,000	£31,000	£33,000	£25,000

Select Jan, Feb, Mar, Apr → Right-click → Unpivot Columns

↓

✅ Tall format — one row per rep per month, ready for PivotTables and charts

Rep	Month	Sales
Priya	Jan	£32,000
Priya	Feb	£29,000
Priya	Mar	£31,000
Priya	Apr	£27,000
James	Jan	£28,000
James	Feb	£31,000

2 reps × 4 months = 8 rows total · Month column auto-named "Attribute" → rename to "Month"

Unpivot Other Columns — The Safer Method

There are three unpivot options and it is worth knowing the difference. The safest for future-proofing is "Unpivot Other Columns" — select the columns you want to keep as rows (like Rep), then right-click and choose Unpivot Other Columns. This way, if new month columns are added to the source in the future, they are automatically unpivoted too without any changes to the query.

Three unpivot options — right-click a column header:

Unpivot Columns
  Unpivots the columns you selected. New columns added to the source later
  will NOT be unpivoted automatically.

Unpivot Other Columns  ← recommended
  Select the columns to KEEP as identifiers (Rep, Product etc),
  then use this option. Everything else is unpivoted.
  New columns added to the source ARE automatically unpivoted.

Unpivot Only Selected Columns
  Unpivots exactly the selected columns regardless of what else exists.
  Best when you want to unpivot a specific fixed set only.

The Unpivot Other Columns pattern is the right default for any report that involves time series data — monthly sales, weekly metrics, quarterly results — because the source almost always gains new period columns over time. Build the query to handle that growth from the start and you will never need to edit it when January turns to February turns to March.

Fixing Inconsistent Text Values

Inconsistent casing and naming is one of the most common data quality issues in business data — especially in category columns that are typed manually. "North", "north", "NORTH", and " North" are four different values to Excel but should all be the same. Power Query gives you transformation options for all of these in one place.

Text Cleaning Operations — Transform → Format

Operation	What it does	Example
Lowercase	Converts all text to lower case	NORTH → north
Uppercase	Converts all text to upper case	north → NORTH
Capitalize Each Word	Title-cases the value (like PROPER in Excel)	north east → North East
Trim	Removes leading and trailing spaces	" North " → "North"
Clean	Removes non-printable characters	Removes hidden control characters
Replace Values	Substitutes a specific value for another	"N/A" → null, "Nrth" → "North"

The order matters when applying multiple text fixes. The recommended sequence for a messy text column is: Clean first (remove hidden characters) → Trim (remove spaces) → Capitalize Each Word or Lowercase (standardise casing) → Replace Values (fix any specific known bad values). Applying them in this order means each step builds on already-cleaner data.

Conditional Column — Power Query's IF Statement

Conditional Column is Power Query's point-and-click equivalent of a nested IF formula. You define rules — if this column equals this value, put this result — and Power Query builds the M code for you. It is in Add Column → Conditional Column.

Conditional Column — Region Grouping

Add Column → Conditional Column

Column Name

Operator

Value

Output

Region

equals

North

Region

equals

East

Region

equals

West

Otherwise:

Other

Pivoting — Tall Back to Wide

Pivoting is the reverse of unpivoting. If you have tall data with a column of category labels you want to spread out into separate columns, use Transform → Pivot Column. You choose which column becomes the new headers and which column provides the values to fill those headers.

Tall data:
  Rep   | Quarter | Sales
  Priya | Q1      | £32,000
  Priya | Q2      | £29,000
  James | Q1      | £28,000
  James | Q2      | £31,000

Pivot: Select the Quarter column → Transform → Pivot Column
  Values column: Sales
  Aggregate: Sum (or Don't Aggregate for 1:1 data)

Result:
  Rep   | Q1      | Q2
  Priya | £32,000 | £29,000
  James | £28,000 | £31,000

Pivot on Quarter column → Q1 and Q2 become column headers with Sales values filled in per rep

Pivot and Unpivot are mirror operations and you will use them in different situations. Unpivot is far more common in data cleaning work because most raw data arrives wide and needs to go tall for analysis. Pivot is useful when building output tables or summary views that people want to read in a wide format.

💡 Teacher's Note

The unpivot transformation is genuinely one of the most valuable things you can learn in Power Query — and it is something that cannot be done cleanly with formulas at all. If you have worked with any kind of business reporting data, you will have seen the wide-format problem: a spreadsheet with months or products spread across dozens of columns that you need to turn into a proper database-style table. In Power Query this takes about four clicks and thirty seconds. Understanding when data is in the wrong shape — and knowing that Unpivot Other Columns is the fix — is one of those skills that immediately makes you look like an expert to anyone watching.

🟠 Practice

Q1. Your Department column has nulls wherever the department name was only entered for the first row of each group. What Power Query operation fills those nulls from the value above?

Q2. You have a wide table with columns: Rep, Jan, Feb, Mar, Apr, May, Jun. You want to unpivot the month columns into rows, and you want new months to be included automatically in future. Which unpivot option should you use and how?

Q3. The Region column contains a mix of "north", "NORTH", and "North". What is the quickest way in Power Query to standardise all values to "North"?

Next up — Merging Tables in Power Query, where you will learn how to join two queries together using match columns — the Power Query equivalent of VLOOKUP, but far more powerful and fully automated on every refresh.

← Previous Course Index Next →

Excel Course

Data Cleaning in Power Query

Handling Null Values

Removing Duplicates

Unpivoting — Turning Wide Tables Tall

Unpivot Other Columns — The Safer Method

Fixing Inconsistent Text Values

Conditional Column — Power Query's IF Statement

Pivoting — Tall Back to Wide

🟠 Practice

🟣 Quiz