DISTINCT | Dataplexa

DISTINCT Keyword

In many real-world datasets, duplicate values are very common. The DISTINCT keyword allows us to remove duplicate values from query results.

This helps in understanding unique categories, clean reporting, and accurate analysis.


What is DISTINCT?

The DISTINCT keyword is used with the SELECT statement to return only unique values.

Basic syntax:

SELECT DISTINCT column_name
FROM table_name;
  

The database removes duplicate values before returning the result.


Example Table

Consider the following employees table:

+----+----------+------------+--------+
| id | name     | department | salary |
+----+----------+------------+--------+
| 1  | Alice    | IT         | 75000  |
| 2  | Bob      | IT         | 72000  |
| 3  | Charlie  | HR         | 60000  |
| 4  | Diana    | Finance    | 70000  |
| 5  | Evan     | HR         | 58000  |
+----+----------+------------+--------+
  

Without DISTINCT

If we select the department column without DISTINCT:

SELECT department
FROM employees;
  

The result may contain repeated values:

  • IT
  • IT
  • HR
  • Finance
  • HR

Using DISTINCT

To retrieve only unique department names:

SELECT DISTINCT department
FROM employees;
  

Result:

  • IT
  • HR
  • Finance

DISTINCT on Multiple Columns

DISTINCT can also be applied to combinations of columns. In this case, uniqueness is evaluated across all selected columns.

Example:

SELECT DISTINCT department, salary
FROM employees;
  

This returns unique department + salary combinations.


DISTINCT vs ALL (Default)

Keyword Behavior Example
ALL Returns all values (default) SELECT department FROM employees;
DISTINCT Removes duplicate values SELECT DISTINCT department FROM employees;

Using DISTINCT with WHERE

DISTINCT can be combined with WHERE to get unique values from filtered data.

Example: Unique departments where salary is greater than 65,000.

SELECT DISTINCT department
FROM employees
WHERE salary > 65000;
  

Common Beginner Mistakes

  • Using DISTINCT on the wrong column
  • Expecting DISTINCT to remove duplicate rows from the table
  • Forgetting that DISTINCT applies to all selected columns

Remember: DISTINCT affects only the query result, not the stored data.


Real-World Use Cases

  • Finding unique product categories
  • Listing distinct countries or cities
  • Generating dropdown values
  • Removing duplicates in reports

What’s Next?

In the next lesson, we will learn about basic SQL functions such as COUNT, SUM, and AVG, which help us analyze data.