Advanced Sets in Python
Python sets are well suited to handling unique data: testing membership, removing duplicates, and performing mathematical operations such as union, intersection, and difference. This lesson goes beyond the basics and explores advanced set features that are widely used in data processing, analytics, and application development.
Understanding these operations will help you write cleaner, faster, and more efficient Python programs.
What Makes Sets Powerful?
Unlike lists, sets:
- Store only unique values
- Have extremely fast membership checking (the in keyword)
- Support mathematical operations like union and intersection
- Automatically ignore duplicate entries
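The points above can be seen in a few lines. This is a minimal sketch; the names allowed_ids and tags are made up for illustration.

```python
# Hypothetical data, used only to illustrate the list above.
allowed_ids = {101, 102, 103, 104}

# A repeated value in a set literal is stored only once.
tags = {"python", "sets", "python"}

print(102 in allowed_ids)   # True
print(999 in allowed_ids)   # False
print(len(tags))            # 2
```

Membership tests on a set use hashing, so on average they do not slow down as the set grows, whereas a list has to be scanned item by item.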
Let’s explore advanced operations and real-world examples.
1. Removing Duplicates from a List
This is one of the most common uses of sets. If you have a list with repeated values, converting it to a set removes the duplicates. Keep in mind that sets are unordered, so the result may not keep the list's original order.
emails = ["alex@example.com", "emma@example.com", "alex@example.com"]
unique_emails = set(emails)
print(unique_emails)
This is widely used when cleaning user data, product lists, or log files.
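When the original order of the items matters, one common alternative is dict.fromkeys, which drops duplicates while keeping the first occurrence of each value in place. The sketch below reuses the emails list from the example; the name ordered_unique is just for illustration.

```python
emails = ["alex@example.com", "emma@example.com", "alex@example.com"]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so this removes duplicates without reshuffling the list.
ordered_unique = list(dict.fromkeys(emails))

print(ordered_unique)  # ['alex@example.com', 'emma@example.com']
```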
2. Set Union (Combining Two Sets)
Union returns all elements from both sets without duplicates.
team_a = {"Alex", "Sophia", "Liam"}
team_b = {"Emma", "Liam", "Oliver"}
combined = team_a.union(team_b)
print(combined)
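The same result can be written with the | operator. One difference worth knowing: the union() method accepts any iterable, while | requires both sides to be sets. A short sketch using the same teams (the extra names are made up):

```python
team_a = {"Alex", "Sophia", "Liam"}
team_b = {"Emma", "Liam", "Oliver"}

# Operator form of union; both operands must be sets.
combined_op = team_a | team_b

# Method form also accepts a list (or any other iterable).
with_list = team_a.union(["Noah", "Alex"])

# sorted() is used only to get a stable print order.
print(sorted(combined_op))  # ['Alex', 'Emma', 'Liam', 'Oliver', 'Sophia']
print(sorted(with_list))    # ['Alex', 'Liam', 'Noah', 'Sophia']
```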
3. Set Intersection (Common Values)
Intersection returns only the items that appear in both sets.
group1 = {"John", "Emma", "Lucas"}
group2 = {"Lucas", "Sophia", "Emma"}
common = group1.intersection(group2)
print(common)
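Intersection also has an operator form, &. If you only need to know whether two sets overlap at all, isdisjoint() answers that without building a new set. A brief sketch with the same groups:

```python
group1 = {"John", "Emma", "Lucas"}
group2 = {"Lucas", "Sophia", "Emma"}

# Operator form of intersection.
common_op = group1 & group2
print(sorted(common_op))          # ['Emma', 'Lucas']

# isdisjoint() is True only when the sets share no elements.
print(group1.isdisjoint(group2))  # False
```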
4. Set Difference (Values Only in One Set)
This returns elements that belong to the first set but NOT the second set.
plan_basic = {"Email", "Storage", "Support"}
plan_premium = {"Storage", "Support", "Analytics"}
only_basic = plan_basic.difference(plan_premium)
print(only_basic)
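Unlike union and intersection, difference depends on the order of the operands. The sketch below uses the - operator on the same plans to show both directions; only_premium is a name added here for illustration.

```python
plan_basic = {"Email", "Storage", "Support"}
plan_premium = {"Storage", "Support", "Analytics"}

# Features in the basic plan that premium does not list.
only_basic = plan_basic - plan_premium
# Swapping the operands gives a different answer.
only_premium = plan_premium - plan_basic

print(only_basic)    # {'Email'}
print(only_premium)  # {'Analytics'}
```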
5. Symmetric Difference (Unique to Each Set)
This returns elements that are in either set, but not in both.
a = {"A", "B", "C"}
b = {"B", "C", "D"}
unique_values = a.symmetric_difference(b)
print(unique_values)
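The operator form is ^. One way to think about symmetric difference is "the union minus the intersection", which the following sketch checks on the same two sets:

```python
a = {"A", "B", "C"}
b = {"B", "C", "D"}

# Operator form of symmetric difference.
unique_op = a ^ b

# Equivalent definition: everything in either set, minus what they share.
by_definition = (a | b) - (a & b)

print(sorted(unique_op))           # ['A', 'D']
print(unique_op == by_definition)  # True
```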
6. Checking Subsets and Supersets
These are extremely useful when verifying permissions, roles, or data groups.
Subset Example
required = {"Email", "Login"}
user_features = {"Login", "Dashboard", "Email", "Reports"}
print(required.issubset(user_features))
Superset Example
all_items = {"A", "B", "C", "D"}
some_items = {"A", "C"}
print(all_items.issuperset(some_items))
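These checks can also be written with comparison operators: <= for subset, >= for superset, and < or > for a proper (strict) subset or superset. The sketch below also shows one way to report which required items are missing; the wanted set and its "Billing" entry are hypothetical.

```python
required = {"Email", "Login"}
user_features = {"Login", "Dashboard", "Email", "Reports"}

print(required <= user_features)  # True  (subset)
print(user_features >= required)  # True  (superset)

# A set is a subset of itself, but not a proper subset of itself.
print(required <= required)       # True
print(required < required)        # False

# Hypothetical check: which wanted features does the user lack?
wanted = {"Email", "Billing"}
missing = wanted - user_features
print(missing)                    # {'Billing'}
```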
7. Set Comprehensions
Just like list comprehensions, Python allows set comprehensions for creating sets in a clean, powerful way.
values = {x * 2 for x in range(5)}
print(values)
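A set comprehension can also filter items with an if clause or normalize values before they are stored, so duplicates that differ only in form collapse into one entry. A small sketch with made-up data:

```python
# Hypothetical input with mixed capitalization.
words = ["Apple", "banana", "apple", "Cherry", "BANANA"]

# Lowercasing inside the comprehension merges case variants.
normalized = {w.lower() for w in words}
print(sorted(normalized))    # ['apple', 'banana', 'cherry']

# An if clause keeps only the values that pass the test.
even_squares = {n * n for n in range(10) if n % 2 == 0}
print(sorted(even_squares))  # [0, 4, 16, 36, 64]
```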
8. Immutable Sets — frozenset
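A frozenset is an immutable set: once it is created, elements cannot be added or removed. It is useful when set data must remain constant, for example configuration options, constant values, or security rules. Because it is immutable, a frozenset is also hashable, so it can be used as a dictionary key or as an element of another set.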
roles = frozenset(["Admin", "User", "Guest"])
print(roles)
Trying to modify it raises an AttributeError, because a frozenset has no add() or remove() methods.
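The sketch below shows what that looks like in practice, along with two things a frozenset can still do: act as a dictionary key and take part in set operations that return a new frozenset. The "Owner" role and the access_rules mapping are hypothetical.

```python
roles = frozenset(["Admin", "User", "Guest"])

# frozenset has no add() method, so this raises AttributeError.
try:
    roles.add("Owner")
except AttributeError as exc:
    error_name = type(exc).__name__
print(error_name)  # AttributeError

# A frozenset is hashable, so it can be used as a dictionary key.
access_rules = {frozenset({"Admin", "User"}): "full access"}
print(access_rules[frozenset({"User", "Admin"})])  # full access

# Set operations do not change roles; they build a new frozenset.
extended = roles | {"Owner"}
print(type(extended).__name__)  # frozenset
print(len(roles))               # 3
```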
Real-World Use Cases of Advanced Sets
- Cleaning duplicate entries in datasets
- Comparing permissions or user access levels
- Finding common users between two platforms
- Validating required features or configurations
- Speeding up repeated membership lookups
📝 Practice Exercises
Exercise 1
Remove duplicates from: ["NY", "LA", "NY", "TX", "LA"]
Exercise 2
Create a set of all unique letters from the word "Dataplexa".
Exercise 3
Find the intersection of:
Set A = {"Google", "Amazon", "Meta"}
Set B = {"Netflix", "Meta", "Amazon"}
Exercise 4
Create a set comprehension that contains squares of numbers from 1 to 10.
✅ Practice Answers
Answer 1
cities = ["NY", "LA", "NY", "TX", "LA"]
unique = set(cities)
print(unique)
Answer 2
# set() accepts any iterable; the result is case-sensitive ("D" and "d" would differ)
letters = set("Dataplexa")
print(letters)
Answer 3
A = {"Google", "Amazon", "Meta"}
B = {"Netflix", "Meta", "Amazon"}
result = A.intersection(B)
print(result)
Answer 4
squares = {n * n for n in range(1, 11)}
print(squares)