NoSQL
NoSQL in AWS
Running your own NoSQL cluster means patching servers, planning capacity, handling failovers at 3am, and paying engineers to do all of it. AWS managed NoSQL services flip that equation — you hand over the operational burden and get back unlimited scale, multi-AZ availability, and pay-per-use pricing. The trade-off is reduced control and a pricing model that can surprise you. This lesson is about knowing which service fits which problem — and what it actually costs to get it wrong.
The AWS NoSQL Landscape
AWS offers four managed NoSQL services, each targeting a different data model and use case. Picking the right one at the start saves a painful migration six months later.
DynamoDB — Core Concepts and Table Design
DynamoDB is the most opinionated NoSQL service AWS offers. Its data model is simple — every item has a partition key (mandatory) and an optional sort key. Together they form the primary key. Everything else — throughput, cost, query flexibility — flows from how well you design these two fields. The golden rule: DynamoDB rewards you for knowing your access patterns before you create the table, and punishes you severely for discovering them afterward.
PK only
A single-attribute key — e.g., look up a config item by configKey or a user by userId. Supports exact-match lookups only; you cannot query a range.
PK + SK
A composite key — the partition key groups related items and the sort key orders them within the partition, enabling range queries (begins_with, between) and sorted retrieval.
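As a sketch, the table for a composite-key design like this could be created with boto3 as shown below. The table and attribute names follow this lesson's UserEvents example; the create_table call is left commented out because it requires live AWS credentials.

```python
# Table specification for a composite primary key: userId (partition)
# plus SK (sort). Names follow the UserEvents example in this lesson.
table_spec = {
    "TableName": "UserEvents",
    "KeySchema": [
        {"AttributeName": "userId", "KeyType": "HASH"},   # partition key
        {"AttributeName": "SK", "KeyType": "RANGE"},      # sort key
    ],
    "AttributeDefinitions": [
        {"AttributeName": "userId", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    "BillingMode": "PAY_PER_REQUEST",  # start on-demand; revisit once traffic is known
}

# import boto3
# boto3.client("dynamodb", region_name="us-east-1").create_table(**table_spec)
```

Note that only key attributes appear in AttributeDefinitions — DynamoDB is schemaless for everything else.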
The scenario: You are building the backend for a multi-tenant SaaS platform. Each tenant has users, and each user generates events. You need to support three access patterns: (1) get a specific event by ID, (2) get all events for a user sorted by timestamp, (3) get all events of a specific type for a user within a date range. You are designing a single DynamoDB table to handle all three.
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
# Table design:
# PK = userId → partition key (groups all events for a user)
# SK = timestamp#id → sort key (allows range queries by time)
# GSI on eventType + timestamp for access pattern 3
table = dynamodb.Table("UserEvents")
# Write an event item
table.put_item(Item={
"userId": "user_4412",
"SK": "2025-03-10T14:22:01Z#evt_8821", # timestamp + unique suffix
"eventType": "purchase",
"productId": "prod_001",
"amount": 149.99
})
# Access pattern 2: all events for a user, newest first
response = table.query(
KeyConditionExpression=Key("userId").eq("user_4412"),
ScanIndexForward=False # False = descending SK order (newest first)
)
# Access pattern 3: purchase events for a user in a date range
response = table.query(
KeyConditionExpression=(
Key("userId").eq("user_4412") &
Key("SK").begins_with("2025-03-") # SK prefix = March events
),
FilterExpression=Attr("eventType").eq("purchase")
)
// put_item
{'ResponseMetadata': {'HTTPStatusCode': 200}} ✓
// Access pattern 2 — all events for user_4412, newest first
Items returned: 47
[
{ userId: "user_4412", SK: "2025-03-10T14:22:01Z#evt_8821",
eventType: "purchase", amount: 149.99 },
{ userId: "user_4412", SK: "2025-03-10T09:11:44Z#evt_8820",
eventType: "pageview", productId: "prod_002" },
...
]
Consumed capacity: 0.5 RCU | Latency: 3.1ms
// Access pattern 3 — purchase events in March
Items scanned: 47 | Items returned: 12 (FilterExpression applied after read)
Consumed capacity: 2.0 RCU ← filter doesn't reduce read cost, only returned items
SK = "2025-03-10T14:22:01Z#evt_8821"
The sort key combines an ISO timestamp with a unique event ID using a # separator. The timestamp prefix enables range queries — begins_with("2025-03-") retrieves all March events. The unique ID suffix guarantees uniqueness even if two events arrive at the exact same second. This composite sort key pattern is one of the most common in DynamoDB single-table design.
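If you build these keys in more than one place, it helps to centralise the format in two tiny helpers (hypothetical names, sketched here for illustration):

```python
def make_sk(timestamp_iso: str, event_id: str) -> str:
    # Composite sort key: timestamp prefix enables range queries,
    # unique event-id suffix guarantees uniqueness within the same second.
    return f"{timestamp_iso}#{event_id}"

def parse_sk(sk: str) -> tuple[str, str]:
    # Split on the first '#' only, in case a future id ever contains one.
    timestamp_iso, event_id = sk.split("#", 1)
    return timestamp_iso, event_id

sk = make_sk("2025-03-10T14:22:01Z", "evt_8821")
# sk == "2025-03-10T14:22:01Z#evt_8821"
```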
ScanIndexForward=False
DynamoDB stores items within a partition sorted by the sort key in ascending order by default. ScanIndexForward=False reverses the traversal direction, returning items newest-first. This is a free operation — DynamoDB walks the sorted index in reverse. You do not need to create a separate index or sort results in application code.
FilterExpression reads all items, then filters
DynamoDB reads all 47 items matching the key condition, then discards 35 of them based on the filter. You are billed for reading all 47 items — 2.0 RCU — even though only 12 are returned. For access patterns where you always filter by eventType, create a Global Secondary Index (GSI) with eventType as the GSI partition key so queries are targeted and billed only for what they actually need.
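A sketch of what that GSI creation request might look like with the boto3 client, following the lesson's table design. The index name and projection type are illustrative choices, not fixed values, and the update_table call is commented out because it needs live AWS credentials:

```python
# Hypothetical GSI for access pattern 3: eventType as the index partition
# key, the timestamp-based SK as its sort key.
gsi_update = {
    "AttributeDefinitions": [
        {"AttributeName": "eventType", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [
        {
            "Create": {
                "IndexName": "eventType-SK-index",
                "KeySchema": [
                    {"AttributeName": "eventType", "KeyType": "HASH"},
                    {"AttributeName": "SK", "KeyType": "RANGE"},
                ],
                # ALL copies every attribute into the index; project less
                # (KEYS_ONLY, INCLUDE) to reduce index storage cost.
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
}

# import boto3
# boto3.client("dynamodb").update_table(TableName="UserEvents", **gsi_update)
```

Queries against the index then use IndexName plus a KeyConditionExpression on eventType, so only matching items are read and billed.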
DynamoDB Capacity Modes — On-Demand vs Provisioned
DynamoDB charges for reads and writes in units: one Read Capacity Unit (RCU) covers one strongly consistent read per second of an item up to 4KB (an eventually consistent read costs half), and one Write Capacity Unit (WCU) covers one write per second of an item up to 1KB. You choose how DynamoDB handles capacity provisioning:
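The rounding rules matter: capacity is consumed in whole 4KB (read) and 1KB (write) blocks, rounded up. A quick back-of-the-envelope helper:

```python
import math

def rcu_per_read(item_kb: float, strongly_consistent: bool = True) -> float:
    # One RCU = one strongly consistent read/sec of up to 4KB;
    # eventually consistent reads cost half. Size rounds up to 4KB blocks.
    units = math.ceil(item_kb / 4)
    return units if strongly_consistent else units / 2

def wcu_per_write(item_kb: float) -> int:
    # One WCU = one write/sec of up to 1KB; size rounds up to 1KB blocks.
    return math.ceil(item_kb)

# A 5KB item read: ceil(5/4) = 2 RCU strongly consistent, 1 eventually consistent
assert rcu_per_read(5) == 2
assert rcu_per_read(5, strongly_consistent=False) == 1
# A 2.5KB item write rounds up to 3 WCU
assert wcu_per_write(2.5) == 3
```

This is why a 4.1KB item costs twice the RCUs of a 4.0KB one — trimming item size just under a block boundary is a real cost lever.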
The scenario: Your application is moving to production. Traffic is predictable — your analytics show consistent 800 reads/sec and 200 writes/sec throughout the day with a 3× spike during a nightly batch job. You want to switch from on-demand to provisioned mode with auto-scaling to reduce your monthly DynamoDB bill by 40% while ensuring you never get throttled during the batch spike.
# Step 1: switch table to provisioned mode
aws dynamodb update-table \
--table-name UserEvents \
--billing-mode PROVISIONED \
--provisioned-throughput ReadCapacityUnits=1000,WriteCapacityUnits=250
# Step 2: register the table as an auto-scaling target
aws application-autoscaling register-scalable-target \
--service-namespace dynamodb \
--resource-id "table/UserEvents" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--min-capacity 100 \
--max-capacity 800 # covers the 3x batch spike (200 * 3 = 600, with headroom)
# Step 3: create the scaling policy — target 70% utilisation
aws application-autoscaling put-scaling-policy \
--policy-name UserEvents-write-scaling \
--service-namespace dynamodb \
--resource-id "table/UserEvents" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration \
'{"TargetValue":70.0,"PredefinedMetricSpecification":
{"PredefinedMetricType":"DynamoDBWriteCapacityUtilization"}}'
Table updated: PROVISIONED mode ✓
Auto-scaling target registered: WCU 100 → 800 ✓
Scaling policy created: target 70% utilisation ✓
// Cost comparison (us-east-1, approximate):
On-demand: 800 reads/s + 200 writes/s ≈ $847/month
Provisioned: 1000 RCU + 250 WCU ≈ $486/month (-43%)
// During batch spike:
WCU utilisation hits 85% → auto-scaling triggers
WCU scales from 250 → 600 within ~2 minutes
Zero throttled requests ✓
TargetValue: 70.0 — why not 90%?
Auto-scaling takes 1–2 minutes to provision additional capacity after a threshold is crossed. If you target 90% and traffic spikes instantly, the 1–2 minutes before new capacity arrives will see throttling. Targeting 70% gives a 30% buffer — enough headroom for traffic to grow while auto-scaling catches up. For write-critical workloads, 60–70% is the safe range.
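The buffer can be made concrete with a one-line calculation (a sketch, not AWS tooling): at a target utilisation of T%, traffic can grow by (100 − T)/T before it saturates the currently provisioned capacity.

```python
def growth_headroom(target_utilisation_pct: float) -> float:
    # At target T%, provisioned capacity ≈ traffic / (T / 100), so traffic
    # can grow by (100 - T) / T before hitting that capacity.
    return (100 - target_utilisation_pct) / target_utilisation_pct

print(f"{growth_headroom(70):.0%}")  # 43% — room for auto-scaling to catch up
print(f"{growth_headroom(90):.0%}")  # 11% — almost any spike throttles first
```

At 70% you tolerate a ~43% instantaneous jump; at 90% an 11% jump already starts throttling before new capacity arrives.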
max-capacity: 800 — set above your peak
Auto-scaling will never provision beyond max-capacity. If your batch job spikes to 700 WCU and your max is 800, you have headroom. If you had set max to 600 and the batch runs hot at 650 WCU, auto-scaling cannot help — requests get throttled. Always set max to at least 150% of your observed peak, not your average.
43% cost reduction vs on-demand
On-demand mode charges roughly 6.25× more per WCU than provisioned mode because AWS absorbs all capacity risk. At predictable, sustained load you are paying a large premium for flexibility you do not need. The crossover point is roughly when your traffic is predictable enough that auto-scaling can handle it — typically after 2–3 months of production data showing a consistent pattern.
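To find your own crossover point, plug current AWS prices into a small calculator like this sketch. Prices vary by region and change over time, so they are left as named parameters here rather than quoted rates:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.59M seconds

def on_demand_monthly(reads_per_s: float, writes_per_s: float,
                      price_per_million_reads: float,
                      price_per_million_writes: float) -> float:
    # On-demand bills per request: (total requests / 1M) * unit price.
    reads = reads_per_s * SECONDS_PER_MONTH
    writes = writes_per_s * SECONDS_PER_MONTH
    return (reads / 1e6) * price_per_million_reads \
         + (writes / 1e6) * price_per_million_writes

def provisioned_monthly(rcu: int, wcu: int,
                        price_per_rcu_hour: float,
                        price_per_wcu_hour: float) -> float:
    # Provisioned bills per capacity-hour, whether or not it is consumed.
    hours = 30 * 24
    return rcu * hours * price_per_rcu_hour + wcu * hours * price_per_wcu_hour
```

Compare the two for your observed steady-state rates; when provisioned capacity (including auto-scaling headroom) comes out clearly cheaper over several consecutive months, the switch is justified.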
ElastiCache — Caching DynamoDB with Redis
Even with DynamoDB's single-digit millisecond latency, some read patterns justify a cache layer. A product detail page requested 10,000 times per second reads the same 5KB document every time — paying for 10,000 RCU/sec for data that hasn't changed in hours. ElastiCache Redis sits in front of DynamoDB, serving those reads from memory at sub-millisecond latency and eliminating the DynamoDB read cost entirely for cached items.
The scenario: Your e-commerce platform's product pages are hammering DynamoDB at 15,000 reads/sec during a flash sale. Product data changes at most once per hour. Your DynamoDB bill has tripled. You are adding an ElastiCache Redis cluster with a cache-aside pattern — check Redis first, fall back to DynamoDB on a cache miss, and write back to Redis with a 1-hour TTL.
import redis, boto3, json
# ElastiCache Redis cluster endpoint (from AWS console)
cache = redis.Redis(
host="myapp.cache.amazonaws.com",
port=6379,
decode_responses=True
)
table = boto3.resource("dynamodb").Table("Products")
def get_product(product_id: str) -> dict:
cache_key = f"product:{product_id}"
# Step 1: check Redis cache first
cached = cache.get(cache_key)
if cached:
return json.loads(cached) # cache hit — sub-millisecond, zero DynamoDB cost
# Step 2: cache miss — read from DynamoDB
response = table.get_item(Key={"productId": product_id})
product = response.get("Item")
if product:
# Step 3: write to cache with 1-hour TTL
cache.setex(
name=cache_key,
time=3600, # seconds — expires after 1 hour
value=json.dumps(product, default=str)
)
return product
// First request — cache miss
get_product("prod_001"):
Redis: MISS
DynamoDB: GET → { productId: "prod_001", name: "Trainers XL", price: 89.99 }
Redis: SET product:prod_001 (TTL: 3600s)
Latency: 4.2ms
// All subsequent requests for 1 hour — cache hit
get_product("prod_001"):
Redis: HIT → { productId: "prod_001", name: "Trainers XL", price: 89.99 }
DynamoDB: not called
Latency: 0.3ms
// Flash sale traffic — 15,000 req/sec for prod_001
DynamoDB reads: 1 per hour (the initial cache miss; everything after that from cache)
DynamoDB cost saved: ~14,999 RCU/sec ✓
Redis hit rate: 99.99%
cache.setex(name, time=3600, value)
setex sets a key with an expiry time atomically. If you used set() followed by expire(), a process crash between the two calls would leave a key with no TTL — it would live in Redis forever, serving stale data indefinitely. Always use setex or set(ex=3600) to set the value and TTL in a single atomic operation.
DynamoDB reads drop from 15,000 to 1 per hour
After the first cache miss populates Redis, every subsequent request for that product is served from memory. During a 1-hour flash sale with 15,000 req/sec, you make exactly 1 DynamoDB read per product per hour instead of 54 million. The ElastiCache cluster cost is a small fraction of what those 54 million reads would cost — this is the economics of caching at scale.
Cache-aside vs write-through
Cache-aside (this pattern) populates the cache lazily on miss. The first request after a TTL expiry always hits DynamoDB. Write-through populates the cache on every write — so the cache is always warm and there are never cold misses. Write-through adds latency to every write; cache-aside adds latency to cache misses only. For read-heavy data that changes infrequently, cache-aside is simpler and equally effective.
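For contrast, a minimal write-through counterpart to the get_product function above might look like this sketch (same hypothetical table and cache clients as in the cache-aside example):

```python
import json

def update_product_write_through(table, cache, product: dict,
                                 ttl_s: int = 3600) -> None:
    # Write-through: persist to DynamoDB first, then refresh the cache
    # immediately, so readers never see a cold miss for this item.
    table.put_item(Item=product)
    cache.setex(
        name=f"product:{product['productId']}",
        time=ttl_s,
        value=json.dumps(product, default=str),
    )
```

Every write now pays the extra Redis round-trip, but the cache stays warm. With cache-aside you would instead delete the cached key on write and let the next read repopulate it.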
Amazon Keyspaces — Managed Cassandra
Amazon Keyspaces is a serverless Cassandra-compatible service. You write CQL (Cassandra Query Language) exactly as you would against a self-managed cluster, and AWS handles replication, patching, scaling, and availability. The key difference from self-managed Cassandra: you do not control compaction strategies, JVM tuning, or node placement — AWS abstracts all of it.
The scenario: Your team runs a self-managed 6-node Cassandra cluster for IoT sensor data. Two engineers spend 30% of their time on cluster operations — patching, compaction tuning, node replacements. You are evaluating Keyspaces as a migration target. You want to verify the CQL compatibility and test that your existing application queries work without modification.
-- Create keyspace (identical syntax to self-managed Cassandra)
CREATE KEYSPACE iot
WITH replication = {'class': 'SingleRegionStrategy'};
-- Keyspaces handles replication internally across 3 AZs automatically
-- Create table — same CQL as Cassandra
CREATE TABLE iot.sensor_readings (
device_id TEXT,
recorded_at TIMESTAMP,
temperature DOUBLE,
humidity DOUBLE,
PRIMARY KEY (device_id, recorded_at)
) WITH CLUSTERING ORDER BY (recorded_at DESC);
-- Insert and query — identical to Cassandra
INSERT INTO iot.sensor_readings
(device_id, recorded_at, temperature, humidity)
VALUES ('device_882', toTimestamp(now()), 22.4, 61.2);
-- Get last 24h of readings for a device
SELECT * FROM iot.sensor_readings
WHERE device_id = 'device_882'
AND recorded_at >= toTimestamp(now()) - 86400s
LIMIT 1000;
Keyspace 'iot' created ✓
Table 'iot.sensor_readings' created ✓
INSERT executed (1.8ms) ✓
SELECT results:
device_id  | recorded_at              | temperature | humidity
-----------+--------------------------+-------------+---------
device_882 | 2025-03-10 14:22:01.000Z | 22.4        | 61.2
device_882 | 2025-03-10 14:21:58.000Z | 22.3        | 61.1
... (247 rows, 14.2ms)
// No application code changes required — same CQL driver, same queries ✓
// Operational overhead: zero node management, zero compaction tuning
SingleRegionStrategy — no RF setting needed
In self-managed Cassandra you set replication_factor: 3 explicitly. In Keyspaces, SingleRegionStrategy tells AWS to handle replication internally — it stores three copies across three Availability Zones automatically. You do not choose the replication factor; AWS enforces it for you as part of the managed service contract.
Same CQL, same driver — zero application changes
Keyspaces accepts connections from the standard Apache Cassandra drivers — Python, Java, Node.js. You only change the connection endpoint and add AWS SigV4 authentication. All existing CQL queries, table schemas, and application code work without modification. The migration effort is infrastructure-only, not application code.
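Keyspaces exposes a regional endpoint (cassandra.&lt;region&gt;.amazonaws.com) and accepts TLS connections on port 9142. A small helper (hypothetical name) captures the endpoint format, with the driver wiring sketched in comments since it needs the Cassandra driver, the AWS SigV4 plugin, and live credentials:

```python
KEYSPACES_TLS_PORT = 9142  # Keyspaces accepts only TLS connections, on this port

def keyspaces_endpoint(region: str) -> str:
    # Public service endpoint format for Amazon Keyspaces.
    return f"cassandra.{region}.amazonaws.com"

# With the standard Python driver, the connection would look roughly like:
#   from cassandra.cluster import Cluster
#   from cassandra_sigv4.auth import SigV4AuthProvider  # AWS-provided plugin
#   import boto3
#   cluster = Cluster(
#       [keyspaces_endpoint("us-east-1")],
#       port=KEYSPACES_TLS_PORT,
#       auth_provider=SigV4AuthProvider(boto3.Session()),
#       ssl_context=...,  # TLS context trusting the Amazon CA certificate
#   )
#   session = cluster.connect()
```

Everything after cluster.connect() — prepared statements, queries, result handling — is unchanged from self-managed Cassandra.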
What Keyspaces does not support
Keyspaces does not support all Cassandra features. Lightweight Transactions (LWT / IF NOT EXISTS) have limited support, custom compaction strategies cannot be changed, and ALLOW FILTERING is restricted. Before migrating, audit your CQL against the Keyspaces compatibility matrix. Most standard CRUD and time-series query patterns work without issues.
Teacher's Note
The most common DynamoDB mistake I see is teams treating it like MongoDB or PostgreSQL — designing the table first and figuring out queries later. DynamoDB does not forgive this. If your access patterns change after you have millions of items and no room for a GSI that supports them, your options are expensive: a full table scan, a parallel application-level join, or a painful migration. Write down every query your application needs to run before you create the table. Every single one. Then design the primary key and GSIs to make those queries targeted. DynamoDB rewards preparation and punishes improvisation.
Practice Questions — You're the Engineer
Scenario:
eventType = "purchase", which narrows the result to 8 items. Your DynamoDB cost dashboard shows you were billed for 200 item reads, not 8. A colleague explains that DynamoDB reads all items matching the key condition first, then applies the secondary condition afterwards, before returning results — you are billed for everything DynamoDB reads, not just what it returns. What is the name of this secondary condition parameter in the boto3 query() call?
Quiz — NoSQL in AWS in Production
Scenario:
userId and sort key of timestamp#eventId. A new reporting query retrieves all eventType = "refund" events across all users in the past 30 days. You add FilterExpression=Attr("eventType").eq("refund") to the query. In production the query reads 4 million items but returns 1,200 refund events. Your DynamoDB bill for this single query is $780 — based on 4 million item reads. What is the architectural cause of the excessive cost, and what is the correct fix?
Scenario:
ProvisionedThroughputExceededException errors. After the scale-up completes, throttling stops. What is the root cause of the 90-second throttling window, and what should the target utilisation have been set to?
Up Next · Lesson 39
Cloud-Native NoSQL
Serverless databases, multi-cloud patterns, and the architecture decisions that let your data layer scale to zero and back — automatically.