
Advanced Sets in Python

Sets in Python are well suited to working with unique data: testing membership, removing duplicates, and performing mathematical operations such as union, intersection, and difference. This lesson goes beyond the basics and explores set features that are widely used in data processing, analytics, and application development.

Understanding these operations will help you write cleaner and more efficient Python programs.

What Makes Sets Powerful?

Unlike lists, sets:

Store only unique values
Offer fast membership checks with the `in` operator (constant time on average, compared with a linear scan for a list)
Support mathematical operations like union and intersection
Automatically ignore duplicate entries

Let’s explore advanced operations and real-world examples.

1. Removing Duplicates from a List

This is one of the most common uses of sets. Converting a list with repeated values to a set removes the duplicates in a single step. Keep in mind that a set is unordered, so the original order of the list is not preserved.

emails = ["alex@example.com", "emma@example.com", "alex@example.com"]
unique_emails = set(emails)
print(unique_emails)  # two unique addresses; element order may vary between runs

This is widely used when cleaning user data, product lists, or log files.
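
Converting to a set discards the order in which items first appeared. When order matters, one common alternative is `dict.fromkeys`, which keeps the first occurrence of each item because dictionaries preserve insertion order (Python 3.7 and later). A minimal sketch, reusing the same email list:

```python
emails = ["alex@example.com", "emma@example.com", "alex@example.com"]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so this keeps only the first occurrence of each email.
ordered_unique = list(dict.fromkeys(emails))
print(ordered_unique)  # ['alex@example.com', 'emma@example.com']

# If alphabetical order is acceptable, sorting the set also gives a stable result.
sorted_unique = sorted(set(emails))
print(sorted_unique)
```

Use the plain `set(...)` conversion when order is irrelevant, and one of the variants above when the result is shown to users or compared in tests.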

2. Set Union (Combining Two Sets)

Union returns all elements from both sets without duplicates.

team_a = {"Alex", "Sophia", "Liam"}
team_b = {"Emma", "Liam", "Oliver"}

combined = team_a.union(team_b)
print(combined)
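
The same union can be written with the `|` operator. The sketch below also shows that `union()` accepts several arguments and any iterable, whereas `|` requires both operands to be sets. The third team, `team_c`, is a hypothetical addition used only for illustration.

```python
team_a = {"Alex", "Sophia", "Liam"}
team_b = {"Emma", "Liam", "Oliver"}
team_c = ["Noah", "Alex"]  # hypothetical third team, stored as a list

# The | operator is equivalent to union() when both operands are sets.
combined = team_a | team_b
print(combined == team_a.union(team_b))  # True

# union() takes multiple arguments and accepts any iterable, not only sets.
everyone = team_a.union(team_b, team_c)
print(sorted(everyone))  # sorted() gives a stable order for display
```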

3. Set Intersection (Common Values)

Intersection returns only the items that appear in both sets.

group1 = {"John", "Emma", "Lucas"}
group2 = {"Lucas", "Sophia", "Emma"}

common = group1.intersection(group2)
print(common)
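
Intersection also has an operator form, `&`. When you only need to know whether two sets overlap at all, `isdisjoint()` answers that without building a new set. A short sketch; `group3` is a hypothetical group added for illustration:

```python
group1 = {"John", "Emma", "Lucas"}
group2 = {"Lucas", "Sophia", "Emma"}
group3 = {"Mia", "Noah"}  # hypothetical group with no shared members

# The & operator is equivalent to intersection() for two sets.
common = group1 & group2
print(sorted(common))  # ['Emma', 'Lucas']

# isdisjoint() returns True when the two sets share no elements.
print(group1.isdisjoint(group3))  # True
print(group1.isdisjoint(group2))  # False
```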

4. Set Difference (Values Only in One Set)

Difference returns the elements that belong to the first set but not to the second. The order of the operands matters.

plan_basic = {"Email", "Storage", "Support"}
plan_premium = {"Storage", "Support", "Analytics"}

only_basic = plan_basic.difference(plan_premium)
print(only_basic)
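
Unlike union and intersection, difference is not symmetric: swapping the operands changes the result. The sketch below uses the `-` operator, which is equivalent to `difference()` for two sets, to compare the plans in both directions.

```python
plan_basic = {"Email", "Storage", "Support"}
plan_premium = {"Storage", "Support", "Analytics"}

# Features that only the basic plan has.
only_basic = plan_basic - plan_premium
print(only_basic)  # {'Email'}

# Swapping the operands gives the features that only the premium plan has.
only_premium = plan_premium - plan_basic
print(only_premium)  # {'Analytics'}
```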

5. Symmetric Difference (Unique to Each Set)

This returns elements that are in either set, but not in both.

a = {"A", "B", "C"}
b = {"B", "C", "D"}

unique_values = a.symmetric_difference(b)
print(unique_values)
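
One way to read symmetric difference is "the union minus the intersection". The sketch below checks that identity with the `^` operator, which is equivalent to `symmetric_difference()` for two sets.

```python
a = {"A", "B", "C"}
b = {"B", "C", "D"}

# The ^ operator is equivalent to symmetric_difference() for two sets.
unique_values = a ^ b
print(sorted(unique_values))  # ['A', 'D']

# Same result as taking everything and removing the shared elements.
print(unique_values == (a | b) - (a & b))  # True

# Operand order does not matter for symmetric difference.
print(a ^ b == b ^ a)  # True
```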

6. Checking Subsets and Supersets

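A set is a subset of another when every one of its elements also appears in the other set; the other set is then a superset. These checks are useful when verifying permissions, roles, or data groups.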

Subset Example

required = {"Email", "Login"}
user_features = {"Login", "Dashboard", "Email", "Reports"}

print(required.issubset(user_features))

Superset Example

all_items = {"A", "B", "C", "D"}
some_items = {"A", "C"}

print(all_items.issuperset(some_items))
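
Subset and superset checks also have operator forms: `<=` and `>=`, plus `<` and `>` for the strict ("proper") versions that exclude equal sets. The sketch below pairs a subset check with a difference to report what is missing. The helper `missing_features` and the `"Billing"` feature are hypothetical names introduced for illustration.

```python
required = {"Email", "Login"}
user_features = {"Login", "Dashboard", "Email", "Reports"}

print(required <= user_features)  # True: same as required.issubset(user_features)
print(required < user_features)   # True: proper subset, the sets are not equal
print(user_features >= required)  # True: same as user_features.issuperset(required)


def missing_features(required, available):
    """Return the required features that are not available (hypothetical helper)."""
    return required - available


# A failed subset check can be explained by listing what is missing.
needed = {"Email", "Billing"}
print(needed <= user_features)                  # False
print(missing_features(needed, user_features))  # {'Billing'}
```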

7. Set Comprehensions

Like list comprehensions, set comprehensions build a collection from an iterable in a single expression. They use curly braces instead of square brackets, and the result contains each value only once.

values = {x * 2 for x in range(5)}
print(values)
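
Set comprehensions can also transform and filter values, and any duplicates produced by the expression collapse automatically. A brief sketch; the `words` list is made-up sample data:

```python
words = ["Apple", "apple", "Banana", "cherry", "BANANA"]  # hypothetical sample data

# Lowercasing makes "Apple" and "apple" identical, so the set keeps one copy.
normalized = {w.lower() for w in words}
print(sorted(normalized))  # ['apple', 'banana', 'cherry']

# An if clause filters items, just as in a list comprehension.
even_squares = {n * n for n in range(10) if n % 2 == 0}
print(sorted(even_squares))  # [0, 4, 16, 36, 64]
```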

8. Immutable Sets — `frozenset`

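A frozenset is an immutable set: once it is created, elements cannot be added or removed. Use it when set data must stay constant, for example configuration options, constant values, or security rules. Because a frozenset is hashable, it can also serve as a dictionary key or as an element of another set.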

roles = frozenset(["Admin", "User", "Guest"])
print(roles)

Trying to modify it raises an error: a frozenset has no add() or remove() method, so calling one raises AttributeError.
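
The sketch below illustrates both properties of a frozenset: mutation attempts fail, and instances can be used as dictionary keys. The `permissions` mapping and the `"Root"` and `"Auditor"` role names are hypothetical, added only for illustration.

```python
roles = frozenset(["Admin", "User", "Guest"])

# frozenset defines no mutating methods, so this raises AttributeError.
try:
    roles.add("Root")
except AttributeError as exc:
    print("Cannot modify:", exc)

# A frozenset is hashable, so it can be a dictionary key (a regular set cannot).
permissions = {
    frozenset(["Admin"]): "full access",
    frozenset(["User", "Guest"]): "read only",
}
# Lookup ignores element order, because sets compare by content.
print(permissions[frozenset(["Guest", "User"])])  # read only

# Set operations still work; they return a new frozenset and leave roles unchanged.
extended = roles | {"Auditor"}
print(type(extended).__name__)  # frozenset
print(len(roles))               # 3
```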

Real-World Use Cases of Advanced Sets

Cleaning duplicate entries in datasets
Comparing permissions or user access levels
Finding common users between two platforms
Validating required features or configurations
Speeding up repeated membership lookups in large collections

📝 Practice Exercises

Exercise 1

Remove duplicates from: ["NY", "LA", "NY", "TX", "LA"]

Exercise 2

Create a set of all unique letters from the word "Dataplexa".

Exercise 3

Find the intersection of:
Set A = {"Google", "Amazon", "Meta"}
Set B = {"Netflix", "Meta", "Amazon"}

Exercise 4

Write a set comprehension that produces the squares of the numbers from 1 to 10.

✅ Practice Answers

Answer 1

cities = ["NY", "LA", "NY", "TX", "LA"]
unique = set(cities)
print(unique)

Answer 2

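# set("Dataplexa") gives the same result; letters are case-sensitive, so "D" and "d" would count separately
letters = {ch for ch in "Dataplexa"}
print(letters)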

Answer 3

A = {"Google", "Amazon", "Meta"}
B = {"Netflix", "Meta", "Amazon"}

result = A.intersection(B)
print(result)

Answer 4

squares = {n * n for n in range(1, 11)}
print(squares)
