NoSQL Lesson 12 – Redis Introduction | Dataplexa
NoSQL Database Types · Lesson 12

Redis Introduction

In 2009, Salvatore Sanfilippo was building a real-time web analytics tool in Sicily. His MySQL database couldn't keep up with the write rate. He needed something different — so he built it himself. He called it Redis. Sixteen years later, Redis runs inside Twitter, GitHub, Snapchat, Stack Overflow, Airbnb, and tens of thousands of other production systems. This lesson puts it in your hands — every core command, every data structure, every production pattern — from scratch.

Redis Architecture — Why It's So Fast

Before touching a single command, understanding why Redis is fast removes the mystery from everything else. Three architectural decisions make it exceptional:

1. Everything lives in RAM

RAM access is 100,000x faster than a disk seek. PostgreSQL stores data on disk and caches hot data in memory. Redis stores everything in memory and persists to disk as a background operation. The read path never touches disk.

2. Single-threaded command processing

Redis processes commands on a single thread. This sounds like a weakness — it's actually a strength. No locks, no mutexes, no thread contention. Every command is atomic by default. The single thread runs at CPU speed — 100,000+ commands per second on a basic server.

3. Non-blocking I/O with event loop

Redis uses an event loop (like Node.js) to handle thousands of concurrent clients without spawning threads. While waiting for network I/O from one client, it serves other clients. High concurrency with zero thread overhead.

Getting Started — Your First Redis Commands

The scenario: You've just installed Redis and connected to the CLI with redis-cli. Let's run through the essential commands that power 80% of real-world Redis usage. Every command shown here runs in the Redis CLI exactly as written:

# Connect to Redis CLI
redis-cli

# Check Redis is running
PING

# Set and get a simple string
SET greeting "Hello, Redis!"
GET greeting

# Set with expiry (60 seconds)
SET temp_key "I disappear soon" EX 60
TTL temp_key

# Check how many keys are in Redis
DBSIZE

# Delete a key
DEL greeting

127.0.0.1:6379> PING
PONG

127.0.0.1:6379> SET greeting "Hello, Redis!"
OK

127.0.0.1:6379> GET greeting
"Hello, Redis!"

127.0.0.1:6379> SET temp_key "I disappear soon" EX 60
OK

127.0.0.1:6379> TTL temp_key
(integer) 58

127.0.0.1:6379> DBSIZE
(integer) 2

127.0.0.1:6379> DEL greeting
(integer) 1

Line by line:

PING → PONG

The health check. If Redis returns PONG, it's alive and accepting commands. Use this in monitoring scripts and load balancer health checks.

TTL temp_key → 58

Returns seconds remaining. 2 seconds elapsed since we set 60. Returns -1 if the key has no expiry. Returns -2 if the key doesn't exist at all. These two return values are important to distinguish in application code.
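As a sketch of how application code might branch on those return values (assuming redis-py, whose `r.ttl(key)` returns the same integers as the TTL command; the helper itself is plain Python):

```python
def describe_ttl(ttl_seconds: int) -> str:
    # TTL's return codes: -2 means the key does not exist,
    # -1 means the key exists but has no expiry set
    if ttl_seconds == -2:
        return "missing"
    if ttl_seconds == -1:
        return "no expiry"
    return f"expires in {ttl_seconds}s"
```

So `describe_ttl(58)` yields `"expires in 58s"`, while `describe_ttl(-2)` tells you the key is gone rather than merely unexpiring.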

DEL greeting → (integer) 1

The integer return is how many keys were deleted. DEL key1 key2 key3 deletes multiple keys in one command and returns how many actually existed. Useful for bulk cleanup operations.
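For larger cleanups, avoid the blocking KEYS command and walk the keyspace with SCAN instead. A hedged sketch using redis-py's `scan_iter` cursor wrapper (the `client` parameter and prefix are illustrative, not from the lesson):

```python
def purge_prefix(client, prefix, batch_size=500):
    """Delete every key starting with `prefix`, in batches.

    client: a redis-py Redis instance (assumed). SCAN iterates the
    keyspace incrementally, so it never blocks the server the way
    KEYS would on a large dataset.
    """
    pending, deleted = [], 0
    for key in client.scan_iter(match=prefix + "*", count=batch_size):
        pending.append(key)
        if len(pending) >= batch_size:
            deleted += client.delete(*pending)  # DEL returns keys removed
            pending = []
    if pending:
        deleted += client.delete(*pending)
    return deleted
```

Batching the DEL calls keeps the round-trip count low without ever sending one giant multi-key command.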

Sorted Sets — The Crown Jewel

Sorted Sets are Redis's most powerful data structure. Every member has a floating-point score. Members are always kept in sorted order by score. You can add, update, remove, and range-query in O(log N) time. Leaderboards, priority queues, time windows, delayed jobs — all built on Sorted Sets.

The scenario: You're running a live gaming tournament. Players earn points throughout the day. You need a real-time leaderboard that updates instantly and handles millions of score changes per hour:

# ZADD — add members with scores to a sorted set
ZADD leaderboard:tournament_9 9820 "player:ali"
ZADD leaderboard:tournament_9 8750 "player:sara"
ZADD leaderboard:tournament_9 9100 "player:james"
ZADD leaderboard:tournament_9 9950 "player:lin"

# Update a score — ZADD overwrites existing member's score
ZADD leaderboard:tournament_9 10200 "player:ali"

ZADD key score member — the score is a float, the member is a string. If the member already exists, its score is updated and its position in the sorted order adjusts automatically. No separate UPDATE command needed.

Internal structure: Redis uses a skip list + hash table combination. The skip list maintains sort order for range queries. The hash table gives O(1) score lookup for any member. You get the best of both worlds.

# Get top 3 players (highest scores first)
ZREVRANGE leaderboard:tournament_9 0 2 WITHSCORES

# Get a specific player's rank (0-indexed, lowest rank = best)
ZREVRANK leaderboard:tournament_9 "player:ali"

# Get a specific player's score
ZSCORE leaderboard:tournament_9 "player:sara"

# Get all players with scores between 9000 and 10000
ZRANGEBYSCORE leaderboard:tournament_9 9000 10000 WITHSCORES

127.0.0.1:6379> ZREVRANGE leaderboard:tournament_9 0 2 WITHSCORES
1) "player:ali"
2) "10200"
3) "player:lin"
4) "9950"
5) "player:james"
6) "9100"

127.0.0.1:6379> ZREVRANK leaderboard:tournament_9 "player:ali"
(integer) 0    ← rank 0 = #1 (best player)

127.0.0.1:6379> ZSCORE leaderboard:tournament_9 "player:sara"
"8750"

127.0.0.1:6379> ZRANGEBYSCORE leaderboard:tournament_9 9000 10000 WITHSCORES
1) "player:james"
2) "9100"
3) "player:lin"
4) "9950"
5) "player:ali"
6) "10200"

ZREVRANGE 0 2 WITHSCORES

Indexes 0 to 2 = top 3 members. REV means highest score first. WITHSCORES includes the score in the output. Result alternates: member, score, member, score — your application groups these into pairs.
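That member/score alternation comes from the raw protocol reply (redis-py already pairs them into tuples when you pass `withscores=True`). If you ever handle a flattened reply yourself, a small helper (illustrative, plain Python) does the grouping:

```python
def pair_with_scores(flat_reply):
    # ["player:ali", "10200", "player:lin", "9950"] becomes
    # [("player:ali", 10200.0), ("player:lin", 9950.0)]
    return [(flat_reply[i], float(flat_reply[i + 1]))
            for i in range(0, len(flat_reply), 2)]
```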

ZREVRANK → 0

Rank 0 means first place. Zero-indexed. So rank 0 = #1, rank 1 = #2, etc. Use ZREVRANK + 1 in your application to display the human-readable rank to users.

ZRANGEBYSCORE 9000 10000

Returns all members with scores in a range — inclusive by default. Prefix a score with ( to make it exclusive: ZRANGEBYSCORE key (9000 10000 excludes exactly 9000. Useful for pagination and time windows.
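Putting the range and rank ideas together, a leaderboard endpoint might look like this sketch (assuming a redis-py client passed in as `client`; the formatting helper is plain Python, and the "+1 for human rank" mirrors the ZREVRANK advice above):

```python
def top_players(client, key, n):
    # In redis-py, zrevrange(..., withscores=True) returns
    # [(member, score), ...], highest score first, score as a float
    rows = client.zrevrange(key, 0, n - 1, withscores=True)
    return [format_row(i, member, score)
            for i, (member, score) in enumerate(rows)]

def format_row(rank_zero_based, member, score):
    # ZREVRANGE positions are zero-indexed; humans expect rank 1 = best
    return f"#{rank_zero_based + 1} {member}: {int(score)} pts"
```

The `+ 1` lives in one place, the display layer, so every internal computation can stay zero-indexed like Redis itself.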

Pub/Sub — Real-Time Messaging

Redis Pub/Sub is a publish-subscribe messaging system built directly into Redis. Publishers send messages to channels. Subscribers listen to channels and receive messages instantly. Zero configuration, zero extra infrastructure.

The scenario: You're building a live order tracking system. When an order status changes, all connected clients watching that order need to be notified immediately — without polling:

import redis

r = redis.Redis(host='localhost', port=6379)

# SUBSCRIBER — runs in a separate thread/process
def listen_for_updates(order_id):
    pubsub = r.pubsub()
    pubsub.subscribe(f'order:updates:{order_id}')  # subscribe to channel

    print(f"Listening for updates on order {order_id}...")
    for message in pubsub.listen():
        if message['type'] == 'message':           # filter out subscribe confirmations
            print(f"Update received: {message['data'].decode()}")

r.pubsub() — creates a Pub/Sub client. This is a separate connection from your regular Redis connection — once subscribed, a client can only issue SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, PUNSUBSCRIBE, PING, and QUIT. It cannot run regular commands on the same connection.

pubsub.listen() — a blocking generator that yields messages as they arrive. The subscriber sleeps with zero CPU usage between messages. When a publisher sends to this channel, Redis wakes the subscriber instantly.
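Each item `pubsub.listen()` yields is a dict with 'type', 'channel', and 'data' keys, and 'data' arrives as bytes. A small decoding helper (a sketch; it assumes JSON payloads like the publisher in this lesson sends) keeps the receive loop tidy:

```python
import json

def extract_updates(messages):
    # keep only real messages (redis-py also yields 'subscribe'
    # confirmations) and decode the JSON payload from bytes
    updates = []
    for m in messages:
        if m.get("type") != "message":
            continue
        data = m["data"]
        if isinstance(data, bytes):
            data = data.decode()
        updates.append(json.loads(data))
    return updates
```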

import json
from datetime import datetime

# PUBLISHER — runs in your order service
def update_order_status(order_id, new_status):
    # Update the database first (db = your primary datastore client)
    db.orders.update_one({'_id': order_id}, {'$set': {'status': new_status}})

    # Then publish the change to all subscribers
    channel = f'order:updates:{order_id}'
    message = json.dumps({'status': new_status, 'ts': datetime.now().isoformat()})
    subscriber_count = r.publish(channel, message)

    print(f"Published to {subscriber_count} subscribers")

-- Publisher calls update_order_status('ord_8821', 'shipped')
Published to 3 subscribers   ← 3 clients watching this order

-- Each subscriber instantly receives:
Update received: {"status": "shipped", "ts": "2024-01-15T14:22:01"}
Update received: {"status": "shipped", "ts": "2024-01-15T14:22:01"}
Update received: {"status": "shipped", "ts": "2024-01-15T14:22:01"}

-- Latency from publish to receive: ~1ms

Important limitation of Redis Pub/Sub:

Messages are fire-and-forget. If no subscriber is listening when a message is published, the message is lost. If a subscriber disconnects and reconnects, it misses all messages sent while it was gone. Redis Pub/Sub is not a message queue.

For durability: Use Redis Streams (Lesson 11) or a proper message broker like Kafka. Pub/Sub is ideal for real-time notifications where a missed message is acceptable — live dashboards, chat notifications, online presence indicators.

Pipelines — Batching Commands for Speed

Every Redis command has a round-trip time — the time for the command to travel from your app to Redis and back. On a local network this is ~0.1ms. On a remote server it's 1–5ms. If you send 1,000 commands one at a time, you pay that round-trip 1,000 times. Pipelines batch multiple commands into a single network round-trip.

❌ Without pipeline (slow)

for i in range(1000):
  r.set(f'key:{i}', i)
  # 1000 round trips
  # 1000 × 1ms = ~1 second

✅ With pipeline (fast)

pipe = r.pipeline()
for i in range(1000):
  pipe.set(f'key:{i}', i)
pipe.execute() # 1 round trip
# ~10ms total

The scenario: You're importing 10,000 product prices from an overnight batch job. At ~1ms per round trip, sending them one at a time takes about 10 seconds. Here's the pipelined version:

import redis

r = redis.Redis(host='localhost', port=6379)

# Load 10,000 product prices in one pipeline batch
def bulk_load_prices(price_data):
    pipe = r.pipeline(transaction=False)  # transaction=False = faster, no MULTI/EXEC

    for product_id, price in price_data.items():
        pipe.set(f'price:{product_id}', price, ex=86400)  # expire in 24 hours

    results = pipe.execute()   # ALL commands sent in one network call
    return results.count(True) # count successful writes

-- Without pipeline: 10,000 commands × 1ms each = 10 seconds
-- With pipeline: 10,000 commands in 1 network round trip = 85ms

Successful writes: 10000

-- Pipeline batches commands in memory client-side
-- Sends them all at once when execute() is called
-- Redis processes them sequentially and returns all results together

transaction=False

By default, r.pipeline() wraps commands in a MULTI/EXEC transaction block. Setting transaction=False removes this overhead — commands are still batched but not atomic as a group. For bulk writes where each command is independent, this is faster.

pipe.execute()

This is the moment all queued commands are actually sent. Returns a list of responses — one per command. Check each response to verify success. For SET commands, a successful write returns True.

Transactions — MULTI/EXEC

Redis transactions are different from SQL transactions. They guarantee that a group of commands runs without interruption from other clients — but they don't roll back on failure. All commands in a MULTI/EXEC block either all queue successfully or none run — but if one command errors at runtime, the others still execute.

The scenario: You're implementing a token transfer between two users. Both the debit and credit must happen together without another command interrupting:

def transfer_tokens(from_user, to_user, amount):
    with r.pipeline() as pipe:
        while True:
            try:
                # WATCH the sender's key — abort if it changes before EXEC
                pipe.watch(f'tokens:{from_user}')

                balance = int(pipe.get(f'tokens:{from_user}') or 0)
                if balance < amount:
                    raise ValueError("Insufficient tokens")

                # Queue the two writes atomically
                pipe.multi()                                          # start transaction
                pipe.decrby(f'tokens:{from_user}', amount)           # debit
                pipe.incrby(f'tokens:{to_user}',   amount)           # credit
                pipe.execute()                                        # run both atomically
                break

            except redis.WatchError:
                continue  # another client changed the key — retry

pipe.watch()

WATCH implements optimistic locking. Redis monitors the watched key. If any other client modifies it between WATCH and EXEC, the transaction aborts and raises WatchError. You retry the whole operation with the fresh value. This is the Redis way to handle concurrent writes safely.

pipe.multi() → pipe.execute()

multi() opens the transaction block. All commands after this are queued. execute() sends them all atomically — no other client's command can run between the debit and credit. The two operations are guaranteed to run together.

decrby / incrby

DECRBY key amount atomically subtracts the amount. INCRBY key amount atomically adds it. Both return the new value. These are safer than read-modify-write patterns because the arithmetic happens inside Redis — no risk of stale reads.
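Because DECRBY returns the new value atomically, a common pattern checks that return value instead of reading first. A sketch (the `client` parameter is an assumed redis-py instance and the key name is illustrative):

```python
def reserve_stock(client, item_id, quantity=1):
    # DECRBY is atomic: the value it returns is exactly what this
    # call produced, even with concurrent buyers. If we pushed the
    # count below zero, undo our own decrement and reject the sale.
    remaining = client.decrby(f"stock:{item_id}", quantity)
    if remaining < 0:
        client.incrby(f"stock:{item_id}", quantity)
        return False
    return True
```

Two concurrent buyers of the last unit both decrement, but only one sees a non-negative result; the other rolls back. No read-modify-write race is possible.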

Redis in a Real Production Stack

Redis is almost never the only database in a production system. Here's how it fits alongside other components in a typical modern stack:

Web App and Mobile App traffic flows through an API Gateway to the Application Servers, which talk to three data stores:

Redis: Sessions · Cache · Queues · Leaderboards · Rate limits (~0.3ms reads)
PostgreSQL: Users · Orders · Payments · Products · Transactions (~5ms reads)
MongoDB: Profiles · Content · Activity · Preferences (~3ms reads)

Redis sits in front of slower databases as a fast-access layer. Hot data lives in Redis. Cold data lives in PostgreSQL or MongoDB. The app checks Redis first — cache hit means no SQL query needed.

Cache-Aside Pattern — The Most Common Redis Pattern

The scenario: Your product detail page queries PostgreSQL on every load. At 10,000 page views per hour for popular products, that's 10,000 identical SQL queries. Cache-Aside stores the result in Redis after the first query:

def get_product(product_id):
    cache_key = f"product:{product_id}"

    # Step 1: Check Redis first
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)          # cache hit — return immediately

    # Step 2: Cache miss — query PostgreSQL
    product = db.query("SELECT * FROM products WHERE id = %s", product_id)

    # Step 3: Store in Redis for next time (cache for 10 minutes)
    r.set(cache_key, json.dumps(product), ex=600)

    return product

-- Request 1 (cache miss):
Redis: key not found
PostgreSQL: query executed (5ms)
Redis: product cached for 600 seconds
Total: 5.3ms

-- Requests 2–1000 within 10 minutes (cache hit):
Redis: key found
Total: 0.3ms each
PostgreSQL: 0 queries

-- After 10 minutes (TTL expires):
Redis: key not found → back to cache miss path → refreshed

The 10-minute TTL: This is intentional. Even if the product price changes in PostgreSQL, the cached version serves users for up to 10 minutes. For most products this is acceptable — the 17x speed improvement (5ms → 0.3ms) is worth a short staleness window. For price-sensitive pages like checkout, use a shorter TTL or invalidate the cache explicitly on update.

Cache invalidation on update: When a product is updated in PostgreSQL, call r.delete(f"product:{product_id}") to immediately remove the stale cache entry. The next read will repopulate it from the fresh database value.
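The write path can be sketched like this (`db` and `cache` stand in for your SQL client and redis-py client; both names are illustrative):

```python
def update_product_price(db, cache, product_id, new_price):
    # 1. Update the source of truth first
    db.execute("UPDATE products SET price = %s WHERE id = %s",
               (new_price, product_id))
    # 2. Drop the stale cache entry; the next read repopulates it
    #    from the fresh database row via the cache-aside path
    cache.delete(f"product:{product_id}")
```

Deleting rather than overwriting keeps the write path simple: only the read path ever serializes products into the cache.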

Teacher's Note

Redis rewards engineers who understand it deeply. Most teams use 10% of what Redis offers — SET, GET, and maybe EXPIRE. The teams that get extraordinary results are the ones who reach for the right data structure: Sorted Sets for leaderboards instead of repeated ORDER BY queries, Pub/Sub for real-time instead of polling, Pipelines for bulk writes instead of loops. Redis commands are your vocabulary. The more of them you know, the more problems you can solve in milliseconds instead of seconds.

Practice Questions — You're the Engineer

Scenario:

Your gaming app shows each player their current rank on the leaderboard. The leaderboard is stored in a Redis Sorted Set called leaderboard:global with scores as points (higher = better rank). Which single Redis command returns a specific player's rank position with the highest-scoring player at rank 0?


Scenario:

Your batch job updates 50,000 Redis keys every night with fresh recommendation scores. Running them one by one takes 50 seconds. Your team needs it under 2 seconds. You don't need the commands to be atomic as a group — you just need to eliminate round-trip overhead. Which Redis feature should you use?


Scenario:

Your application checks Redis first, then queries the database on a cache miss, then stores the result in Redis before returning it. On subsequent requests, the data is served from Redis without touching the database. What is this caching pattern called?


Quiz — Redis in the Real World

Scenario:

Your team is using Redis Pub/Sub to deliver order confirmation messages to fulfilment workers. Occasionally workers restart and miss messages published during the restart window. Orders are going unprocessed. What is the fundamental problem and what is the correct fix?

Scenario:

You store flash sale inventory in Redis. Two customers purchase simultaneously. Your code reads inventory (= 1), checks it's greater than 0, then decrements it. Both read 1 simultaneously and both decrement — inventory goes to -1 and both purchases succeed. You need to prevent this without switching to SQL. What is the correct Redis approach?

Scenario:

You use the cache-aside pattern with a 10-minute TTL for product data. A product's price is updated from £49 to £29 as part of a flash sale that starts right now. Some users will see the old price for up to 10 minutes. This causes real orders at the wrong price. How do you fix this without removing the caching layer entirely?

Up Next · Lesson 13

DynamoDB Introduction

Amazon's fully managed NoSQL database — how it achieves single-digit millisecond latency at any scale, partition keys, sort keys, and why it powers the world's largest e-commerce platform.