MongoDB · Lesson 5 – MongoDB Architecture | Dataplexa

MongoDB Architecture

Understanding MongoDB's architecture is what separates a developer who just uses MongoDB from one who can design, deploy, tune, and troubleshoot it confidently. MongoDB is not a single process — it is a system of components that work together to deliver flexible storage, high availability, and horizontal scale. This lesson walks through every major architectural component: the storage engine, the server process, replica sets, sharded clusters, and how they all connect.

The Core Process — mongod

mongod is the primary daemon process of MongoDB. It is the actual database server — it manages data files on disk, handles client connections, processes queries, and enforces access control. Every MongoDB deployment, no matter how simple or complex, runs at least one mongod process.

Why it matters: understanding what mongod does helps you configure it correctly — setting the right port, data directory, log path, replica set name, and storage engine for your workload.

# mongod startup — key configuration options

mongod_config = {
    "process": "mongod",
    "default_port": 27017,
    "responsibilities": [
        "Accept client connections",
        "Process read and write operations",
        "Manage data files and indexes on disk",
        "Enforce authentication and authorisation",
        "Replicate data to/from other replica set members",
        "Run background maintenance tasks (compaction, TTL cleanup)"
    ],
    "key_config_options": {
        "--dbpath":    "/data/db",          # where data files are stored
        "--port":      27017,               # port to listen on
        "--logpath":   "/var/log/mongod.log",
        "--replSet":   "rs0",              # replica set name
        "--auth":      True,               # require authentication
        "--bind_ip":   "127.0.0.1"         # which IP to accept connections from
    }
}

for option, value in mongod_config["key_config_options"].items():
    print(f"  {option}: {value}")
--dbpath: /data/db
--port: 27017
--logpath: /var/log/mongod.log
--replSet: rs0
--auth: True
--bind_ip: 127.0.0.1
  • mongod listens on port 27017 by default
  • Configuration can be passed as command-line flags or through a YAML config file (mongod.conf)
  • One mongod process manages one data directory — running multiple instances requires separate data directories
  • In production, mongod is always run as part of a replica set — never as a standalone process
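Since the flags above can also be written into the YAML config file mentioned earlier, the mapping between the two forms can be sketched as follows. The YAML key names are the documented option names; the values reuse the illustrative settings from the dict above:

```python
# Command-line flags and their mongod.conf (YAML) equivalents.
flag_to_yaml = {
    "--dbpath":  ("storage.dbPath",          "/data/db"),
    "--port":    ("net.port",                27017),
    "--logpath": ("systemLog.path",          "/var/log/mongod.log"),
    "--replSet": ("replication.replSetName", "rs0"),
    "--auth":    ("security.authorization",  "enabled"),
    "--bind_ip": ("net.bindIp",              "127.0.0.1"),
}

for flag, (yaml_key, value) in flag_to_yaml.items():
    print(f"{flag:10s} -> {yaml_key} = {value}")
```

A config file is usually preferable in production: it is versionable, reviewable, and avoids long launch commands.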

The Storage Engine — WiredTiger

The storage engine is the component responsible for how data is actually written to and read from disk. MongoDB's default and recommended storage engine is WiredTiger, introduced in MongoDB 3.0 and made the default in MongoDB 3.2.

Why it matters: the storage engine determines compression ratios, concurrency behaviour, and write performance. Understanding WiredTiger explains why MongoDB handles concurrent workloads efficiently.

# WiredTiger storage engine — key characteristics

wiredtiger = {
    "concurrency_model": "Document-level locking",
    # Only the specific document being written is locked
    # Other documents in the same collection can be read/written simultaneously
    # This is far more granular than table-level locking used by older engines

    "compression": {
        "data":    "Snappy (default) or zlib / zstd",
        "indexes": "Prefix compression"
        # Snappy: fast compression, moderate ratio
        # zlib / zstd: higher compression, slightly slower
    },

    "cache": "50% of RAM by default (or 1GB minimum)",
    # WiredTiger keeps frequently accessed data in memory
    # Data is read from disk only when not in the cache

    "checkpoints": "Every 60 seconds or 2GB of journal data",
    # Checkpoint = consistent snapshot written to disk
    # Allows fast recovery after a crash

    "journaling": "Write-ahead log (WAL) for durability",
    # Every write is recorded in the journal before being applied
    # Ensures no data loss even if mongod crashes mid-write
}

for key, value in wiredtiger.items():
    print(f"{key}: {value}")
concurrency_model: Document-level locking
compression: {'data': 'Snappy (default) or zlib / zstd', 'indexes': 'Prefix compression'}
cache: Larger of 50% of (RAM - 1 GB) or 256 MB by default
checkpoints: Every 60 seconds or 2GB of journal data
journaling: Write-ahead log (WAL) for durability
  • Document-level locking means two writes to different documents in the same collection never block each other
  • Compression reduces the storage footprint by 30–80% depending on the data — Snappy is enabled by default and its CPU overhead is negligible for most workloads
  • The WiredTiger cache is separate from the OS file cache — together they form two layers of caching
  • The journal (write-ahead log) ensures durability — if mongod crashes, uncommitted writes are replayed from the journal on restart
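The default cache-size rule can be made concrete with a small helper. The formula is the documented default; the RAM sizes are just examples:

```python
# Default WiredTiger cache: the larger of 50% of (RAM - 1 GB) or 256 MB.
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    return max(0.5 * (ram_gb - 1), 0.25)

for ram_gb in (2, 4, 16, 64):
    print(f"{ram_gb:3d} GB RAM -> {default_wiredtiger_cache_gb(ram_gb):.2f} GB cache")
```

Note how the 256 MB floor only matters on very small machines; on any realistic server the cache is roughly half of RAM minus one gigabyte.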

Replica Set Architecture

A replica set is a group of mongod instances that maintain the same data. It is the fundamental unit of high availability in MongoDB. Every production deployment should use a replica set — it provides automatic failover, data redundancy, and read scaling.

Why it matters: without a replica set, a single server failure means downtime. With a replica set, a failed primary is replaced automatically within seconds.

# Replica set architecture — roles and data flow

replica_set_architecture = {
    "PRIMARY": {
        "role": "Receives all writes from clients",
        "count": "Always exactly one at any time",
        "writes_to": "Its own data, then oplog",
    },
    "SECONDARY": {
        "role": "Replicates from primary via oplog",
        "count": "One or more",
        "can_serve": "Read operations (if read preference allows)",
    },
    "ARBITER": {
        "role": "Votes in elections but holds no data",
        "use_case": "Keeps replica set member count odd for clean elections",
        "note": "Use only when a full data-bearing member is not possible"
    }
}

# The oplog — operations log
oplog_concept = {
    "what_it_is": "A capped collection in the 'local' database",
    "what_it_stores": "Every write operation applied to the primary",
    "how_secondaries_use_it": "Poll the oplog and replay operations",
    "size": "5% of disk space by default (configurable)"
}

print("Replica Set Members:")
for role, details in replica_set_architecture.items():
    print(f"  {role}: {details['role']}")

print("\nOplog:", oplog_concept["what_it_is"])
Replica Set Members:
PRIMARY: Receives all writes from clients
SECONDARY: Replicates from primary via oplog
ARBITER: Votes in elections but holds no data

Oplog: A capped collection in the 'local' database
  • The oplog (operations log) is the mechanism for replication — secondaries tail the oplog and replay each operation
  • Replica sets require an odd number of voting members (1, 3, 5…) to guarantee a majority in elections
  • Failover is automatic — if the primary becomes unreachable, the remaining members hold an election within ~10 seconds
  • Write concern controls how many replicas must acknowledge a write before the client receives confirmation
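The odd-member rule falls out of simple majority arithmetic, which can be sketched as:

```python
# Votes needed to elect a primary, and how many member failures a
# replica set can absorb while still being able to hold an election.
def majority(voting_members: int) -> int:
    return voting_members // 2 + 1

def fault_tolerance(voting_members: int) -> int:
    return voting_members - majority(voting_members)

for n in (3, 4, 5):
    print(f"{n} members: majority = {majority(n)}, "
          f"tolerates {fault_tolerance(n)} failure(s)")
```

A 4-member set needs 3 votes for a majority, so it tolerates only one failure — exactly the same as a 3-member set. Adding an even member buys no extra fault tolerance, which is why member counts are kept odd.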

Sharded Cluster Architecture

When a single replica set can no longer handle the data volume or write throughput, MongoDB scales horizontally through sharding. A sharded cluster has three distinct components working together: shards, the config server replica set, and mongos routers.

# Sharded cluster — three component types

sharded_cluster = {
    "SHARDS": {
        "what": "Each shard is a replica set holding a subset of the data",
        "example": "Shard A holds user_id 1–1M, Shard B holds user_id 1M–2M",
        "minimum": "2 shards (but 3+ recommended for production)"
    },
    "CONFIG_SERVERS": {
        "what": "A replica set storing cluster metadata and shard key mappings",
        "stores": [
            "Which chunks of data live on which shard",
            "Shard key ranges (chunk map)",
            "Cluster topology"
        ],
        "minimum": "3-node replica set for production"
    },
    "MONGOS": {
        "what": "Query router — the entry point for all client connections",
        "role": [
            "Receives queries from the application",
            "Reads chunk map from config servers",
            "Routes query to the correct shard(s)",
            "Merges results and returns to client"
        ],
        "note": "Stateless — can run multiple mongos instances for load balancing"
    }
}

for component, details in sharded_cluster.items():
    print(f"{component}: {details['what']}")
SHARDS: Each shard is a replica set holding a subset of the data
CONFIG_SERVERS: A replica set storing cluster metadata and shard key mappings
MONGOS: Query router — the entry point for all client connections
  • Applications connect to mongos, never directly to individual shards — the sharding is completely transparent
  • The config server replica set is critical — losing it makes the cluster unmanageable (though data is still accessible)
  • MongoDB's balancer runs in the background, migrating chunks between shards to keep data evenly distributed
  • A targeted query (includes the shard key) hits one shard; a scatter-gather query (no shard key) hits all shards
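The targeted vs scatter-gather distinction can be sketched with a toy chunk map. This is a simplified model of what mongos does, assuming user_id is the shard key; the ranges mirror the Shard A/B example above:

```python
# Toy chunk map: (range_start, range_end_exclusive, shard).
chunk_map = [
    (1,         1_000_001, "shard_a"),
    (1_000_001, 2_000_001, "shard_b"),
    (2_000_001, 3_000_001, "shard_c"),
]

def route(query: dict) -> list[str]:
    """Targeted if the query includes the shard key; otherwise scatter-gather."""
    if "user_id" in query:
        uid = query["user_id"]
        return [shard for lo, hi, shard in chunk_map if lo <= uid < hi]
    return [shard for _, _, shard in chunk_map]  # no shard key: hit every shard

print(route({"user_id": 42}))   # targeted: a single shard
print(route({"name": "Ada"}))   # scatter-gather: all shards
```

This is why shard key choice is the most important sharding decision: queries that include it touch one shard, queries that omit it fan out to every shard.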

The Full Architecture Picture

Putting it all together — a production MongoDB deployment layers these components. Clients talk to mongos, mongos routes to shards, each shard is a replica set, and the config server replica set keeps the routing map up to date.

# Full production architecture — component diagram in code

production_deployment = {
    "client_applications": ["App Server 1", "App Server 2", "App Server 3"],

    "mongos_routers": ["mongos-1:27017", "mongos-2:27017"],
    # Multiple mongos instances for load balancing and redundancy

    "config_server_replica_set": {
        "name": "configRS",
        "members": ["cfg1:27019", "cfg2:27019", "cfg3:27019"]
    },

    "shards": {
        "shard_a": {
            "replica_set": "shardA",
            "members": ["sA-1:27018", "sA-2:27018", "sA-3:27018"],
            "data": "user_id: 1 → 1,000,000"
        },
        "shard_b": {
            "replica_set": "shardB",
            "members": ["sB-1:27018", "sB-2:27018", "sB-3:27018"],
            "data": "user_id: 1,000,001 → 2,000,000"
        },
        "shard_c": {
            "replica_set": "shardC",
            "members": ["sC-1:27018", "sC-2:27018", "sC-3:27018"],
            "data": "user_id: 2,000,001 → 3,000,000"
        }
    }
}

total_nodes = (
    len(production_deployment["mongos_routers"]) +
    len(production_deployment["config_server_replica_set"]["members"]) +
    sum(len(s["members"]) for s in production_deployment["shards"].values())
)
print(f"Total MongoDB processes in this cluster: {total_nodes}")
Total MongoDB processes in this cluster: 14
  • This cluster has 14 processes: 2 mongos + 3 config servers + 9 shard members (3 shards × 3 nodes each)
  • Each shard replica set operates independently — a shard failure affects only the data on that shard
  • mongos instances are stateless and can be added or removed without affecting data
  • In cloud deployments (Atlas), all of this infrastructure is managed automatically
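An application reaches this cluster through a seed list of the mongos routers in a standard mongodb:// connection URI. The database name mydb is a placeholder:

```python
# Build a mongodb:// URI listing every mongos router as a seed,
# using the router hostnames from the deployment above.
mongos_routers = ["mongos-1:27017", "mongos-2:27017"]
uri = "mongodb://" + ",".join(mongos_routers) + "/mydb"
print(uri)  # mongodb://mongos-1:27017,mongos-2:27017/mydb
```

Listing several routers in the seed list means the driver can fail over between them, matching the "stateless, add or remove freely" property noted above.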

Memory and Data Flow

Understanding how data moves between disk, cache, and the client explains MongoDB's performance characteristics and helps you tune it correctly.

# Data flow — read and write paths

read_path = [
    "1. Client sends find() query to mongos (or mongod directly)",
    "2. mongos checks config servers for chunk routing",
    "3. Query routed to correct shard(s)",
    "4. mongod checks WiredTiger cache — is the data in memory?",
    "5a. Cache HIT  — return data directly from RAM (fast)",
    "5b. Cache MISS — read from disk into cache, then return (slower)",
    "6. Results returned to mongos → merged → returned to client"
]

write_path = [
    "1. Client sends insertOne/updateOne to mongos (or mongod)",
    "2. Write recorded in WiredTiger journal (WAL) — durability guaranteed",
    "3. Write applied to WiredTiger cache (in memory)",
    "4. Write appended to oplog (for replication to secondaries)",
    "5. Secondaries read oplog and replay the operation",
    "6. WiredTiger checkpoint flushes cache to disk every 60 seconds"
]

print("READ PATH:")
for step in read_path:
    print(f"  {step}")

print("\nWRITE PATH:")
for step in write_path:
    print(f"  {step}")
READ PATH:
1. Client sends find() query to mongos (or mongod directly)
2. mongos checks config servers for chunk routing
3. Query routed to correct shard(s)
4. mongod checks WiredTiger cache — is the data in memory?
5a. Cache HIT — return data directly from RAM (fast)
5b. Cache MISS — read from disk into cache, then return (slower)
6. Results returned to mongos → merged → returned to client

WRITE PATH:
1. Client sends insertOne/updateOne to mongos (or mongod)
2. Write recorded in WiredTiger journal (WAL) — durability guaranteed
3. Write applied to WiredTiger cache (in memory)
4. Write appended to oplog (for replication to secondaries)
5. Secondaries read oplog and replay the operation
6. WiredTiger checkpoint flushes cache to disk every 60 seconds
  • Cache hit ratio is the most important performance metric — a high cache miss rate means the working set exceeds RAM
  • With a journaled write concern (j: true, which the default w: "majority" implies), the journal write is synchronous — the client does not receive acknowledgment until the journal is written
  • The checkpoint write to disk is asynchronous — writes are fast because they only go to journal + cache initially
  • Increasing RAM is the single most effective performance improvement for a read-heavy MongoDB workload
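The cache hit ratio mentioned above can be computed from two counters of the kind a server exposes. The counter names and numbers here are illustrative:

```python
# Hit ratio = 1 - (pages that had to come from disk / pages requested).
def cache_hit_ratio(pages_requested: int, pages_read_from_disk: int) -> float:
    return 1 - pages_read_from_disk / pages_requested

# Working set fits in RAM: nearly every request is served from cache.
print(f"{cache_hit_ratio(1_000_000, 8_000):.1%}")
# Working set larger than RAM: a large share of requests go to disk.
print(f"{cache_hit_ratio(1_000_000, 400_000):.1%}")
```

A ratio that drifts downward over time is the classic sign that the working set has outgrown the cache — the signal that it is time to add RAM or shard.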

Summary Table

Component       Role                           Key Detail
mongod          Core database server process   Listens on port 27017 by default
WiredTiger      Default storage engine         Document-level locking, compression, journaling
Oplog           Replication operations log     Capped collection in the local database
Replica set     High availability group        Primary + secondaries + optional arbiter
mongos          Sharded cluster query router   Stateless — routes to the correct shard
Config server   Stores cluster metadata        Chunk map and shard topology
Balancer        Keeps shards evenly loaded     Migrates chunks between shards automatically

Practice Questions

Practice 1. What is mongod and what is its default port?



Practice 2. What concurrency model does WiredTiger use and why is it better than table-level locking?



Practice 3. What is the oplog and how do secondaries use it?



Practice 4. In a sharded cluster, what is the role of the config server replica set?



Practice 5. What is the difference between a targeted query and a scatter-gather query in a sharded cluster?



Quiz

Quiz 1. What is WiredTiger's default data compression algorithm?






Quiz 2. How often does WiredTiger write a checkpoint to disk by default?






Quiz 3. What type of member in a replica set votes in elections but holds no data?






Quiz 4. How is the default size of the WiredTiger cache calculated?






Quiz 5. Why is mongos described as stateless in a sharded cluster?






Next up — Installing MongoDB: getting MongoDB running locally on Windows, macOS, and Linux.