Tableau Lesson 11 – Cleaning Data Overview | Dataplexa
Section II — Lesson 11

Data Cleaning Overview

Real-world data is almost never clean. Before you build a single chart, the quality of your data determines the accuracy of every insight — and Tableau gives you several tools to catch and fix the most common problems without leaving the application.

Why Data Quality Matters

A chart built on dirty data does not show you the truth — it shows you a distorted version of it. If a Sales column contains nulls where it should contain zeroes, your SUM will be understated. If a Region column contains both "East" and "east" as values, your category bar chart will show six bars instead of four. If dates are stored as strings, the date hierarchy will not work at all.

Data cleaning is the process of identifying these problems and correcting them before analysis begins. In a professional workflow, most cleaning happens in the source system or in a dedicated tool like Tableau Prep. But Tableau Desktop itself offers several lightweight cleaning options that are worth knowing for quick fixes on the Data Source tab and in the Data pane.

The Five Most Common Data Quality Problems

Null Values

A null is an empty cell — a missing value. Nulls in Measure fields cause aggregations to be lower than they should be. Nulls in Dimension fields create an "Unknown" or "Null" category in your charts. In Tableau, nulls appear as a separate mark or as blank headers depending on the field type.

Inconsistent Capitalisation

Tableau treats "East", "east", and "EAST" as three completely different values. A column that should have four Region values ends up with eight or twelve, splitting your data and breaking every chart that uses Region.

Wrong Data Types

A date stored as text, a number stored as a string, or a postal code stored as an integer. Wrong types prevent aggregations, break date hierarchies, and strip leading zeros from codes. These must be fixed at the Data Source tab or in the Data pane before building any analysis.

Duplicate Rows

Rows that appear more than once in the dataset inflate every aggregate. SUM(Sales) doubles for every duplicated order row. Duplicates are especially common when joining tables — a topic covered in the next lesson.

Trailing Spaces and Hidden Characters

A value of "East " (with a trailing space) and "East" are treated as different by Tableau. These invisible differences are impossible to spot by looking at the data preview but cause silent splits in Dimension groups. They most commonly come from Excel files exported by legacy systems.

Quick Fixes Available Directly in Tableau Desktop

Tableau Desktop is not a dedicated data cleaning tool, but it provides several options for lightweight fixes that are faster than going back to the source file:

Problem Fix in Tableau Where to Apply
Wrong data type Click the type icon in the Data Source preview or right-click the field in the Data pane → Change Data Type Data Source tab or Data pane
Inconsistent values / typos Right-click the field in the Data pane → Aliases → assign the same alias to variant spellings Data pane → Aliases
Null Dimension values Use a calculated field: IFNULL([Region], "Unknown") to replace nulls with a label Calculated field
Trailing spaces Use a calculated field: TRIM([Region]) to strip leading and trailing spaces Calculated field
Capitalisation variants Use a calculated field: UPPER([Region]) to normalise all values to uppercase Calculated field

Using Aliases to Fix Inconsistent Values

Aliases are the fastest way to rename or consolidate Dimension values without touching the source data. An alias applies a display label over the raw value — the underlying data stays the same, but Tableau shows the alias everywhere the field appears.

1
Right-click the field in the Data pane and select Aliases. The Aliases dialog opens, listing every unique value in the field.
2
Click the Value (Alias) column next to any raw value to edit it. For example, rename "east" → "East" and "EAST" → "East". They will all display as "East" in every chart, even though the raw data still contains the original strings.
3
Click OK. The chart immediately reflects the new aliases. All views in the workbook that use this field are updated at once.
Edit Aliases — Region
Raw Value
Value (Alias)
east
East
EAST
East
East
East
West
West
Clear Aliases
OK

Handling Nulls in Tableau

When Tableau encounters a null in a chart, it handles it differently depending on the field type. For Measure nulls in a line chart, Tableau breaks the line by default — you can right-click the null indicator at the bottom of the view and choose to filter nulls or show data at default value. For Dimension nulls, a separate "Null" mark appears in the view. You can right-click it and select Exclude to hide it from that chart.

📌 Teacher's Note

Tableau Desktop's cleaning tools are useful for quick fixes, but they are not a substitute for cleaning data at the source. Aliases and calculated fields fix the display — the dirty data still exists in the source file. If others connect to the same file and do not apply the same aliases, they see the dirty values. The professional approach is to clean data before it reaches Tableau, either in the source system, in Excel, or in Tableau Prep. For this course, aliases and TRIM/UPPER calculated fields will handle everything we need — but always note in your workbook documentation that the original source contains inconsistencies.

Practice Questions

1. Which Tableau feature lets you rename or consolidate Dimension values — such as mapping "east" and "EAST" to "East" — without modifying the source data?

2. Which Tableau string function removes leading and trailing spaces from a text field value?

3. What does Tableau do to a line chart when it encounters a null value in the Measure field being plotted?

Quiz

1. A Region field contains the values "East", "east", and "EAST" in the source data. What problem does this cause in a Tableau bar chart?


2. Which calculated field formula replaces null values in the Region field with the label "Unknown"?


3. What is the key limitation of fixing data quality problems using aliases and calculated fields in Tableau Desktop?


Next up — Lesson 12: Joins in Tableau — combining data from multiple tables using inner, left, right, and full outer joins to enrich your analysis.