Algorithms Lesson 42 – Hash Tables | Dataplexa

Hash Tables

In the previous lesson, we studied Segment Trees and learned how to efficiently answer range-based queries using a tree structure.

Now we move to one of the most powerful and widely used data structures in computer science — Hash Tables.

Hash tables are the reason why many operations in modern software feel instantaneous, even when working with millions of records.


The Core Problem Hash Tables Solve

Imagine you store student records using a simple list or array.

If you want to find a student by ID, you may need to scan the entire list.

This is inefficient for large data.

Hash tables solve this by allowing near constant-time lookup.


What Is a Hash Table?

A hash table stores data as key–value pairs.

The key is passed through a hash function, which converts it into an index.

The value is stored at that index.

Instead of searching sequentially, we jump directly to the location.


Understanding the Hash Function

A hash function takes an input key and produces a numeric index.

A good hash function:

  • Is fast to compute
  • Distributes keys uniformly
  • Minimizes collisions

Poor hash functions lead to performance degradation.


Simple Hash Function Example

def hash_function(key, size):
    return key % size

This function maps a key to a valid index in the table.


Inserting Values into a Hash Table

Let us create a simple hash table using a list.

table_size = 10
hash_table = [None] * table_size

def insert(key, value):
    index = hash_function(key, table_size)
    hash_table[index] = value

Insertion is fast because we compute the index directly.


Searching in a Hash Table

Searching follows the same logic as insertion.

def search(key):
    index = hash_function(key, table_size)
    return hash_table[index]

This gives near O(1) lookup time in ideal conditions.


Collisions – The Main Challenge

A collision occurs when two keys produce the same index.

Since the table size is finite, collisions are unavoidable.

Handling collisions correctly is critical for performance.


Collision Handling Overview

Common collision resolution techniques include:

  • Chaining
  • Open Addressing
  • Linear Probing

We will explore these in the next lesson.


Time Complexity

In ideal conditions:

Insert: O(1)
Search: O(1)
Delete: O(1)

In worst cases (many collisions), performance can degrade to O(n).


Real-World Applications

Hash tables are used in:

  • Databases and indexing systems
  • Caches (Redis, Memcached)
  • Compilers and symbol tables
  • Authentication systems

Exercises

Exercise 1:
Why is direct indexing faster than linear search?

Because it avoids scanning the entire data structure.

Exercise 2:
What happens if many keys map to the same index?

Performance degrades due to collisions.

Exercise 3:
Is perfect hashing always possible?

No. Collisions are unavoidable in practical systems.

Quick Quiz

Q1. What does a hash function produce?

An index for the hash table.

Q2. Why are hash tables faster than arrays for lookup?

They avoid sequential searching using direct access.

In the next lesson, we will dive deeper into collision handling techniques and see how hash tables remain efficient even when collisions occur.