Hash Tables
In the previous lesson, we studied Segment Trees and learned how to efficiently answer range-based queries using a tree structure.
Now we move to one of the most powerful and widely used data structures in computer science — Hash Tables.
Hash tables are the reason why many operations in modern software feel instantaneous, even when working with millions of records.
The Core Problem Hash Tables Solve
Imagine you store student records using a simple list or array.
If you want to find a student by ID, you may need to scan the entire list.
This is inefficient for large data.
Hash tables solve this by allowing near constant-time lookup.
What Is a Hash Table?
A hash table stores data as key–value pairs.
The key is passed through a hash function, which converts it into an index.
The value is stored at that index.
Instead of searching sequentially, we jump directly to the location.
Understanding the Hash Function
A hash function takes an input key and produces a numeric index.
A good hash function:
- Is fast to compute
- Distributes keys uniformly
- Minimizes collisions
Poor hash functions lead to performance degradation.
Simple Hash Function Example
def hash_function(key, size):
return key % size
This function maps a key to a valid index in the table.
Inserting Values into a Hash Table
Let us create a simple hash table using a list.
table_size = 10
hash_table = [None] * table_size
def insert(key, value):
index = hash_function(key, table_size)
hash_table[index] = value
Insertion is fast because we compute the index directly.
Searching in a Hash Table
Searching follows the same logic as insertion.
def search(key):
index = hash_function(key, table_size)
return hash_table[index]
This gives near O(1) lookup time in ideal conditions.
Collisions – The Main Challenge
A collision occurs when two keys produce the same index.
Since the table size is finite, collisions are unavoidable.
Handling collisions correctly is critical for performance.
Collision Handling Overview
Common collision resolution techniques include:
- Chaining
- Open Addressing
- Linear Probing
We will explore these in the next lesson.
Time Complexity
In ideal conditions:
Insert: O(1)
Search: O(1)
Delete: O(1)
In worst cases (many collisions), performance can degrade to O(n).
Real-World Applications
Hash tables are used in:
- Databases and indexing systems
- Caches (Redis, Memcached)
- Compilers and symbol tables
- Authentication systems
Exercises
Exercise 1:
Why is direct indexing faster than linear search?
Exercise 2:
What happens if many keys map to the same index?
Exercise 3:
Is perfect hashing always possible?
Quick Quiz
Q1. What does a hash function produce?
Q2. Why are hash tables faster than arrays for lookup?
In the next lesson, we will dive deeper into collision handling techniques and see how hash tables remain efficient even when collisions occur.