NoSQL Lesson 1 – What is NoSQL | Dataplexa
NoSQL Fundamentals · Lesson 1

What is NoSQL

It's 2 AM. Picture Instagram's engineers in the early days, staring at a screen while their PostgreSQL database strains under millions of new users. Rows are locking. Queries are timing out. The schema they designed six months ago is struggling with the flood of photos, likes, and follower graphs hitting it all at once. The scene is dramatised, but the outcome is real: Instagram kept PostgreSQL for its core data and moved some high-volume workloads onto NoSQL systems such as Redis and Cassandra. This lesson is about understanding exactly why that decision made sense, and why so many engineering teams have made the same call since.

The Name Everyone Gets Wrong

The very first thing to clear up: NoSQL is not a rejection of SQL. The name began as shorthand for non-relational databases, and the community later settled on "Not Only SQL" as the preferred reading. That tiny reframe changes everything about how you think about it.

NoSQL databases don't hate SQL. They're not trying to destroy relational databases. They're a different tool that was built to solve a different set of problems — problems that started appearing loudly around 2004–2009 when the web exploded in scale and unstructured data became the norm, not the exception.

Teacher's Note — The Word "NoSQL" is a marketing term

Johan Oskarsson popularised the hashtag #nosql in 2009 while organising a meetup about non-relational databases. It was never meant to be a formal definition; it just stuck. When you're in an interview and someone asks "what does NoSQL stand for?", say Not Only SQL and watch them nod approvingly.

The Filing Cabinet vs. The Backpack

Before we look at any code, let's make this real with an analogy that will stick.

🗃️

Relational Database (SQL)

Imagine a strict filing cabinet. Every folder must have the same tabs — Name, Date, Amount, Status. If you want to add a new tab, you have to pull out every single folder and add that tab to all of them. It's incredibly organised. It's also incredibly rigid.

🎒

NoSQL Database

Now imagine a flexible backpack. You throw in whatever you need — a water bottle, an umbrella, a laptop, some papers. Each backpack can hold completely different stuff. One backpack might have 3 things. Another has 30. Neither forces the other to match.

Neither one is better. A filing cabinet is perfect for tax records. A backpack is perfect for a hiking trip. The problem engineers ran into is that they were trying to pack a hiking trip using only filing cabinets — and that's when things break.

What Problem Was the World Actually Facing?

By the late 2000s, companies like Google, Amazon, and Facebook were hitting three walls simultaneously:

1. The Scale Wall

Relational databases traditionally scale vertically: you buy a bigger, more expensive server. But at Google's scale, there was no single server big enough. They needed to spread data across thousands of cheap machines (horizontal scaling), which the relational databases of the time were not designed to do well.

2. The Schema Wall

User profiles on social networks look nothing alike. One user has a bio, 14 phone numbers, 3 profile photos, and speaks 4 languages. Another has just a username. A strict table schema forces both users into the same mould — wasting space and causing constant painful migrations.

3. The Data Type Wall

The web generates data that doesn't fit neatly into rows and columns — JSON blobs, social graphs, time-series logs, geolocation streams. Forcing these into relational tables is technically possible but deeply unnatural and slow.

SQL Row vs. NoSQL Document — Side by Side

Here's the clearest way to feel the difference. Imagine storing a user profile in both systems:

❌ SQL Table (rigid, forced)

id | name | email | phone | bio
1 | Aisha | a@x.com | NULL | NULL
2 | Carlos | c@x.com | +1-555-0101 | Engineer…

NULLs everywhere for data that doesn't apply. Adding "languages spoken" means altering the entire table.

✅ NoSQL Document (flexible, natural)

{
  "id": "u_001",
  "name": "Aisha",
  "email": "a@x.com"
}

{
  "id": "u_002",
  "name": "Carlos",
  "email": "c@x.com",
  "phone": "+1-555-0101",
  "bio": "Engineer…",
  "languages": ["en", "es"]
}

Each document carries only what it needs. No NULLs. Carlos can have fields Aisha doesn't. No schema migration needed.
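
To make the contrast concrete, here is a minimal Python sketch that models the same two users both ways. It is not tied to any database: the `COLUMNS` list and the `to_row` helper are illustrative assumptions standing in for a fixed table schema.

```python
# Fixed column set, as a relational table would require (assumed for illustration).
COLUMNS = ["id", "name", "email", "phone", "bio"]

# Documents: each one carries only the fields it actually has.
documents = [
    {"id": "u_001", "name": "Aisha", "email": "a@x.com"},
    {
        "id": "u_002",
        "name": "Carlos",
        "email": "c@x.com",
        "phone": "+1-555-0101",
        "bio": "Engineer…",
        "languages": ["en", "es"],
    },
]


def to_row(doc):
    """Flatten a document into a fixed-width row; missing columns become None."""
    return tuple(doc.get(col) for col in COLUMNS)


rows = [to_row(doc) for doc in documents]

# Fields with no matching column cannot be stored until the schema is altered.
dropped = [sorted(set(doc) - set(COLUMNS)) for doc in documents]

print(rows[0])     # Aisha's row is padded with None for phone and bio
print(dropped[1])  # Carlos's extra field has no column to live in
```

The row form needs padding for Aisha and has nowhere to put Carlos's `languages` list, which is the schema-migration pressure described above.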

The Four Families of NoSQL

NoSQL is not one thing. It is usually grouped into four very different families of database, each built for a specific kind of data problem. Think of them as four different specialists, each world-class at their job:

🔑

Key-Value Stores

The simplest model. A key maps to a value — like a dictionary or a hash map. Blazing fast. Perfect for caching, sessions, and leaderboards.

Examples: Redis, DynamoDB, Memcached
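
As a rough illustration of the caching use case, the sketch below shows a cache-aside read in Python. A plain dictionary stands in for the key-value store, and `load_profile_from_database` is a hypothetical slow lookup invented for this example.

```python
cache = {}      # stands in for a key-value store such as Redis
db_reads = 0    # counts how often we fall through to the slow source


def load_profile_from_database(user_id):
    """Hypothetical slow lookup; a real one would query the primary database."""
    global db_reads
    db_reads += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_profile(user_id):
    """Cache-aside read: try the key first, fall back to the source on a miss."""
    key = f"profile:{user_id}"
    if key not in cache:
        cache[key] = load_profile_from_database(user_id)
    return cache[key]


first = get_profile(42)    # miss: reads the database and fills the cache
second = get_profile(42)   # hit: served straight from the cache
print(db_reads)
```

The whole interaction is two operations on a single key, which is why this model can be so fast.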

📄

Document Stores

Stores data as JSON-like documents. Each document is self-describing and can have its own structure. Great for user profiles, product catalogues, content management.

Examples: MongoDB, CouchDB, Firestore

📊

Column-Family Stores

Stores data in column groups rather than rows. Built for massive datasets with heavy write loads — analytics pipelines, IoT telemetry, time-series data.

Examples: Apache Cassandra, HBase, ScyllaDB
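
One way to picture the column-family model is as a nested map: row key, then column family, then individual columns. The Python sketch below is a simplification under that assumption; real engines add sorted on-disk storage and partitioning, and the sensor names and timestamps here are made up.

```python
from collections import defaultdict

# row key -> column family -> {column name: value}
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    """Write one cell; rows may hold any number of columns per family."""
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Read a single column family of a row without touching the others."""
    return dict(table[row_key][family])


# Hypothetical IoT telemetry: one row per sensor, one column per timestamp.
put("sensor-17", "temperature", "2024-01-01T00:00", 21.4)
put("sensor-17", "temperature", "2024-01-01T00:01", 21.6)
put("sensor-17", "meta", "location", "warehouse-3")
put("sensor-22", "temperature", "2024-01-01T00:00", 19.8)

print(get_family("sensor-17", "temperature"))
```

Each new reading is just another column on the sensor's row, so writes never require a schema change.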

🕸️

Graph Databases

Models data as nodes and edges — relationships are first-class citizens. Unbeatable for social networks, fraud detection, recommendation engines.

Examples: Neo4j, Amazon Neptune, ArangoDB
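
To show why relationships as first-class data matter, here is a small Python sketch of a "people you may know" query over an adjacency list. The names and the `friends_of_friends` helper are invented for illustration; a graph database would express this as a traversal in its own query language.

```python
# Adjacency list: each person maps to the set of people they are connected to.
friendships = {
    "aisha": {"carlos", "mei"},
    "carlos": {"aisha", "dev"},
    "mei": {"aisha", "dev", "lena"},
    "dev": {"carlos", "mei"},
    "lena": {"mei"},
}


def friends_of_friends(graph, person):
    """Return people two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


suggestions = friends_of_friends(friendships, "aisha")
print(sorted(suggestions))
```

The query follows edges directly instead of joining a friendships table against itself, which is the access pattern graph databases are built around.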

Scaling: The Biggest Structural Difference

This is the architectural reason NoSQL databases became essential at web scale. The two approaches to handling more traffic couldn't be more different:

⬆️ SQL: Vertical Scaling

One server keeps getting bigger: 8 CPU, then 32 CPU, then 128 CPU.

Costs explode. Hard ceiling. Single point of failure.

vs

➡️ NoSQL: Horizontal Scaling

Node 1, Node 2, Node 3 (all cheap), then add more nodes: N4, N5, N6, and so on.

Roughly linear cost growth. No single-machine ceiling. Resilient to node failures.
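
Horizontal scaling depends on a rule for deciding which node owns which key. The Python sketch below shows the simplest such rule, hash-then-modulo. The node names and the `shard_for` helper are assumptions for illustration, and production systems often use consistent hashing instead, because plain modulo reshuffles most keys whenever the node count changes.

```python
import hashlib


def shard_for(key, nodes):
    """Pick a node by hashing the key and taking it modulo the node count."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


nodes = ["node-1", "node-2", "node-3"]
keys = [f"user_{i}" for i in range(1000)]

# Group keys by the node that owns them.
placement = {}
for key in keys:
    placement.setdefault(shard_for(key, nodes), []).append(key)

for node in nodes:
    print(node, len(placement.get(node, [])))
```

Because the hash is deterministic, any application server can work out where a key lives without asking a central coordinator.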

Your First Look at Real NoSQL Code

The scenario: You're a backend developer at a fast-growing e-commerce startup. Your team just decided to store product listings in MongoDB because each product has wildly different attributes — a t-shirt has size and colour, a laptop has RAM and GPU specs, a book has ISBN and author. A rigid SQL table would be a nightmare. Your tech lead asks you to insert a couple of products and verify they're stored correctly. Here's exactly what you'd run.

// MongoDB — inserting two products with completely different shapes
// This runs in MongoDB Shell (mongosh); drivers select the database through their own API instead of `use`

// Switch to (or create) the storefront database
use storefront_db

// Insert a t-shirt — has size and colour attributes
db.products.insertOne({
  name: "Classic Cotton Tee",         // product name
  category: "apparel",                // broad category
  price: 29.99,                       // price in USD
  sizes: ["S", "M", "L", "XL"],      // array — multiple values in one field
  colour: "midnight blue",            // apparel-specific field
  stock: 148                          // units available
})

// Insert a laptop — completely different fields, same collection
db.products.insertOne({
  name: "ProBook X14 Laptop",         // same field names where they overlap
  category: "electronics",
  price: 1199.00,
  ram_gb: 16,                         // laptop-specific — no 'sizes' or 'colour'
  storage_gb: 512,
  gpu: "integrated",
  warranty_years: 2,
  stock: 23
})

// Fetch all products in the collection
db.products.find().pretty()           // .pretty() formats the output nicely

Output:

[
  {
    _id: ObjectId("64f1a2b3c4d5e6f7a8b9c0d1"),
    name: 'Classic Cotton Tee',
    category: 'apparel',
    price: 29.99,
    sizes: [ 'S', 'M', 'L', 'XL' ],
    colour: 'midnight blue',
    stock: 148
  },
  {
    _id: ObjectId("64f1a2b3c4d5e6f7a8b9c0d2"),
    name: 'ProBook X14 Laptop',
    category: 'electronics',
    price: 1199,
    ram_gb: 16,
    storage_gb: 512,
    gpu: 'integrated',
    warranty_years: 2,
    stock: 23
  }
]

What just happened?

db.products refers to a collection — MongoDB's equivalent of a SQL table, except it has no fixed schema. You never ran a CREATE TABLE command. The collection was created automatically the moment you inserted the first document.

insertOne() takes a JavaScript object (JSON-like) and stores it as a document. MongoDB added an _id field automatically — this is a globally unique ObjectId, MongoDB's equivalent of a primary key, generated without you asking for it.

The key insight: The t-shirt document has sizes and colour. The laptop has ram_gb, storage_gb, gpu, and warranty_years. They live in the same collection and share only name, category, price, and stock. In a SQL table, both rows would need columns for ALL attributes, most of them NULL for any given product.

find().pretty() queries all documents in the collection and formats the output readably. No SELECT * FROM table — MongoDB's query language is its own API built around JSON objects.
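
To give a feel for querying with JSON-like objects, here is a simplified Python sketch of equality matching over the two products. The `find` function below is a stand-in written for this lesson, not MongoDB's implementation, and it models only plain field equality.

```python
products = [
    {"name": "Classic Cotton Tee", "category": "apparel", "price": 29.99,
     "sizes": ["S", "M", "L", "XL"], "colour": "midnight blue", "stock": 148},
    {"name": "ProBook X14 Laptop", "category": "electronics", "price": 1199.00,
     "ram_gb": 16, "storage_gb": 512, "gpu": "integrated",
     "warranty_years": 2, "stock": 23},
]


def find(collection, query):
    """Return documents whose fields equal every key/value pair in the query.

    Only simple equality is modelled; in this sketch a document that lacks
    a queried field does not match.
    """
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in query.items())
    ]


apparel = find(products, {"category": "apparel"})
with_ram = find(products, {"ram_gb": 16})
everything = find(products, {})

print([doc["name"] for doc in apparel])
```

Querying on ram_gb works even though the t-shirt has no such field: documents without it are skipped rather than raising an error.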

The scenario: Your company's checkout service needs lightning-fast session storage. Every time a user adds something to their cart, that data needs to be read and written in about a millisecond or less. A disk-backed relational database such as PostgreSQL can do the job, but it is heavier than this simple key-lookup workload needs. Your infrastructure team spins up Redis. Here's how a session gets stored and retrieved.

# Redis — key-value storage for shopping cart session
# This runs in redis-cli (Redis command line interface)

# Store a shopping cart session for user_8821
# SET key value EX seconds (EX sets expiry — cart expires in 30 minutes)
SET cart:user_8821 '{"items":[{"sku":"TEE-M-BLUE","qty":2},{"sku":"BOOK-978-0","qty":1}],"total":74.97}' EX 1800

# Retrieve the cart — single key lookup, O(1) speed
GET cart:user_8821

# Check how many seconds until this key expires
TTL cart:user_8821

# Update the cart (just overwrite the key)
SET cart:user_8821 '{"items":[{"sku":"TEE-M-BLUE","qty":3}],"total":89.97}' EX 1800

# Delete the cart when checkout is complete
DEL cart:user_8821

Output:

127.0.0.1:6379> SET cart:user_8821 '{"items":[...],"total":74.97}' EX 1800
OK

127.0.0.1:6379> GET cart:user_8821
"{\"items\":[{\"sku\":\"TEE-M-BLUE\",\"qty\":2},{\"sku\":\"BOOK-978-0\",\"qty\":1}],\"total\":74.97}"

127.0.0.1:6379> TTL cart:user_8821
(integer) 1793

127.0.0.1:6379> SET cart:user_8821 '{"items":[{"sku":"TEE-M-BLUE","qty":3}],"total":89.97}' EX 1800
OK

127.0.0.1:6379> DEL cart:user_8821
(integer) 1

What just happened?

SET key value is the most fundamental Redis command. cart:user_8821 is the key — notice the colon naming convention. It's just a naming pattern, not special syntax. It helps organise keys like namespaces: cart:, session:, rate_limit:.

EX 1800 sets the key to auto-expire in 1800 seconds (30 minutes). Redis handles the deletion automatically — you don't need a cron job to clean up stale sessions. This is called a TTL (Time To Live) and it's one of Redis's killer features for session and cache management.

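GET key retrieves the value. Redis keeps its dataset in RAM and serves reads from memory, which is why lookups typically complete in well under a millisecond on the server. No disk read happens on this path, even when persistence is enabled.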

DEL cart:user_8821 returns (integer) 1 — that's the number of keys deleted. If the key didn't exist, it would return 0. A clean, predictable response that makes error handling easy in your application code.
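
The expiry behaviour is easier to reason about with a toy model. The Python sketch below imitates SET with EX, GET, TTL, and DEL using an in-memory dictionary and an injectable fake clock. The `ExpiringStore` class is an assumption for illustration only, and it evicts lazily on read, whereas Redis also expires keys in the background.

```python
class ExpiringStore:
    """Tiny in-memory key-value store with per-key expiry (illustrative only)."""

    def __init__(self, clock):
        self._clock = clock   # callable returning the current time in seconds
        self._data = {}       # key -> (value, expires_at or None)

    def set(self, key, value, ex=None):
        expires_at = self._clock() + ex if ex is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]   # lazily evict on read
            return None
        return value

    def ttl(self, key):
        """Seconds left, -1 if the key has no expiry, -2 if it does not exist."""
        if self.get(key) is None:
            return -2
        expires_at = self._data[key][1]
        return -1 if expires_at is None else int(expires_at - self._clock())

    def delete(self, key):
        """Return the number of keys removed (0 or 1)."""
        existed = self.get(key) is not None
        self._data.pop(key, None)
        return 1 if existed else 0


now = [0.0]   # fake clock so the example is deterministic
store = ExpiringStore(clock=lambda: now[0])

store.set("cart:user_8821", '{"total":74.97}', ex=1800)
now[0] = 7                                  # seven seconds pass
remaining = store.ttl("cart:user_8821")     # seconds left on the cart
now[0] = 1800                               # the full 30 minutes pass
expired_value = store.get("cart:user_8821")
deleted = store.delete("cart:user_8821")
print(remaining, expired_value, deleted)
```

Once the 30 minutes are up the key is simply gone, so a later delete reports 0 keys removed, matching the return-value convention described above.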

NoSQL vs SQL — When to Use What

This is the question you'll face in every architecture meeting. Here's an honest, no-hype comparison:

Criteria | ✅ NoSQL Wins | ✅ SQL Wins
Data structure | Variable, nested, unstructured | Consistent, tabular, well-defined
Scale | Millions of ops/sec, global distribution | Moderate scale with strong consistency
Relationships | Simple or managed in application layer | Complex joins, referential integrity
Schema changes | Usually no migration step, evolve freely | Migrations required, riskier at scale
Transactions | Limited (improving: MongoDB 4.0+ supports multi-document transactions) | Full ACID transactions, rock solid
Use cases | Caching, social feeds, IoT, catalogs, search | Banking, payroll, ERP, inventory systems
Developer speed | Fast early-stage, schema-free prototyping | Better long-term with complex business logic

Where NoSQL Powers the Apps You Use Every Day

📸

Instagram

Cassandra for photo metadata. Redis for feed caching and counters. Handles 100M+ daily uploads.

🎵

Spotify

Cassandra for user activity. Redis for real-time playlist state. 600M+ users served globally.

🛒

Amazon

DynamoDB (which they built!) for shopping carts and order state. Sub-10ms latency at global scale.

💬

WhatsApp

Mnesia (the distributed database that ships with Erlang) for message routing state. Handles 100 billion messages per day.

🎬

Netflix

Cassandra for viewing history. EVCache (Memcached-based) for personalised recommendation serving.

🔗

LinkedIn

Espresso (document store). Voldemort (key-value). Graph DB for the 900M+ member connection network.

Teacher's Note — When NOT to Use NoSQL

Every year, engineering teams migrate from NoSQL databases back to PostgreSQL because they chose NoSQL for the wrong reasons. If your data has strong relationships, if you need complex multi-table transactions (think: bank transfers, inventory deductions), or if your team knows SQL deeply, stick with SQL. NoSQL is not automatically "more modern" or "better." It's a deliberate trade-off. The engineers who understand both tools and pick the right one for the right job are the ones who don't get paged at 3 AM.

A guiding rule: if your data is mostly relationships and joins, stay relational. If your data is mostly documents, events, or key-lookups at scale — NoSQL is probably your answer.

Practice Questions

1. What does NoSQL actually stand for? (three words, all lowercase)



2. NoSQL databases are designed to scale __________ (adding more machines), whereas SQL databases traditionally scale vertically.



3. Which NoSQL family type stores data as nodes and edges, making it ideal for social networks and fraud detection?



Quiz

1. A colleague says "NoSQL means we're replacing all our SQL databases." What's the correct response?


2. Your team needs to cache user session data with sub-millisecond read speed and automatic key expiry. Which database is the best fit?


3. Why is a SQL table a poor choice for storing a mixed product catalogue (t-shirts, laptops, books)?


Up Next · Lesson 2

History of NoSQL

From Google's Bigtable paper to the 2009 hashtag that named a movement — the origin story of the databases running the modern web.