NoSQL
NoSQL in AWS
Running your own NoSQL cluster means patching servers, planning capacity, handling failovers at 3am, and paying engineers to do all of it. AWS managed NoSQL services flip that equation — you hand over the operational burden and get back unlimited scale, multi-AZ availability, and pay-per-use pricing. The trade-off is reduced control and a pricing model that can surprise you. This lesson is about knowing which service fits which problem — and what it actually costs to get it wrong.
The AWS NoSQL Landscape
AWS offers four managed NoSQL services, each targeting a different data model and use case. Picking the right one at the start saves a painful migration six months later.
DynamoDB — Core Concepts and Table Design
DynamoDB is the most opinionated NoSQL service AWS offers. Its data model is simple — every item has a partition key (mandatory) and an optional sort key. Together they form the primary key. Everything else — throughput, cost, query flexibility — flows from how well you design these two fields. The golden rule: DynamoDB rewards you for knowing your access patterns before you create the table, and punishes you severely for discovering them afterward.
PK only
A single-attribute key — e.g., look up a config item by configKey or a user by userId. Supports exact-match lookups only; you cannot query a range.
PK + SK
A composite key — the partition key groups related items and the sort key orders them within the partition, enabling range queries (begins_with, between) and sorted retrieval.
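As a sketch, the table for a composite-key design like this could be created with boto3 as shown below. The table and attribute names follow this lesson's UserEvents example; the create_table call is left commented out because it requires live AWS credentials.

```python
# Table specification for a composite primary key: userId (partition)
# plus SK (sort). Names follow the UserEvents example in this lesson.
table_spec = {
    "TableName": "UserEvents",
    "KeySchema": [
        {"AttributeName": "userId", "KeyType": "HASH"},   # partition key
        {"AttributeName": "SK", "KeyType": "RANGE"},      # sort key
    ],
    "AttributeDefinitions": [
        {"AttributeName": "userId", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    "BillingMode": "PAY_PER_REQUEST",  # start on-demand; revisit once traffic is known
}

# import boto3
# boto3.client("dynamodb", region_name="us-east-1").create_table(**table_spec)
```

Note that only key attributes appear in AttributeDefinitions — DynamoDB is schemaless for everything else.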
The scenario: You are building the backend for a multi-tenant SaaS platform. Each tenant has users, and each user generates events. You need to support three access patterns: (1) get a specific event by ID, (2) get all events for a user sorted by timestamp, (3) get all events of a specific type for a user within a date range. You are designing a single DynamoDB table to handle all three.
import boto3
from boto3.dynamodb.conditions import Key, Attr
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
# Table design:
# PK = userId → partition key (groups all events for a user)
# SK = timestamp#id → sort key (allows range queries by time)
# GSI on eventType + timestamp for access pattern 3
table = dynamodb.Table("UserEvents")
# Write an event item
table.put_item(Item={
"userId": "user_4412",
"SK": "2025-03-10T14:22:01Z#evt_8821", # timestamp + unique suffix
"eventType": "purchase",
"productId": "prod_001",
"amount": 149.99
})
# Access pattern 2: all events for a user, newest first
response = table.query(
KeyConditionExpression=Key("userId").eq("user_4412"),
ScanIndexForward=False # False = descending SK order (newest first)
)
# Access pattern 3: purchase events for a user in a date range
response = table.query(
KeyConditionExpression=(
Key("userId").eq("user_4412") &
Key("SK").begins_with("2025-03-") # SK prefix = March events
),
FilterExpression=Attr("eventType").eq("purchase")
)
// put_item
{'ResponseMetadata': {'HTTPStatusCode': 200}} ✓
// Access pattern 2 — all events for user_4412, newest first
Items returned: 47
[
{ userId: "user_4412", SK: "2025-03-10T14:22:01Z#evt_8821",
eventType: "purchase", amount: 149.99 },
{ userId: "user_4412", SK: "2025-03-10T09:11:44Z#evt_8820",
eventType: "pageview", productId: "prod_002" },
...
]
Consumed capacity: 0.5 RCU | Latency: 3.1ms
// Access pattern 3 — purchase events in March
Items scanned: 47 | Items returned: 12 (FilterExpression applied after read)
Consumed capacity: 2.0 RCU ← filter doesn't reduce read cost, only returned items
SK = "2025-03-10T14:22:01Z#evt_8821"
The sort key combines an ISO timestamp with a unique event ID using a # separator. The timestamp prefix enables range queries — begins_with("2025-03-") retrieves all March events. The unique ID suffix guarantees uniqueness even if two events arrive at the exact same second. This composite sort key pattern is one of the most common in DynamoDB single-table design.
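If you build these keys in more than one place, it helps to centralise the format in two tiny helpers (hypothetical names, sketched here for illustration):

```python
def make_sk(timestamp_iso: str, event_id: str) -> str:
    # Composite sort key: timestamp prefix enables range queries,
    # unique event-id suffix guarantees uniqueness within the same second.
    return f"{timestamp_iso}#{event_id}"

def parse_sk(sk: str) -> tuple[str, str]:
    # Split on the first '#' only, in case a future id ever contains one.
    timestamp_iso, event_id = sk.split("#", 1)
    return timestamp_iso, event_id

sk = make_sk("2025-03-10T14:22:01Z", "evt_8821")
# sk == "2025-03-10T14:22:01Z#evt_8821"
```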
ScanIndexForward=False
DynamoDB stores items within a partition sorted by the sort key in ascending order by default. ScanIndexForward=False reverses the traversal direction, returning items newest-first. This is a free operation — DynamoDB walks the sorted index in reverse. You do not need to create a separate index or sort results in application code.
FilterExpression reads all items, then filters
DynamoDB reads all 47 items matching the key condition, then discards 35 of them based on the filter. You are billed for reading all 47 items — 2.0 RCU — even though only 12 are returned. For access patterns where you always filter by eventType, create a Global Secondary Index (GSI) with eventType as the GSI partition key so queries are targeted and billed only for what they actually need.
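A sketch of what that GSI creation request might look like with the boto3 client, following the lesson's table design. The index name and projection type are illustrative choices, not fixed values, and the update_table call is commented out because it needs live AWS credentials:

```python
# Hypothetical GSI for access pattern 3: eventType as the index partition
# key, the timestamp-based SK as its sort key.
gsi_update = {
    "AttributeDefinitions": [
        {"AttributeName": "eventType", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [
        {
            "Create": {
                "IndexName": "eventType-SK-index",
                "KeySchema": [
                    {"AttributeName": "eventType", "KeyType": "HASH"},
                    {"AttributeName": "SK", "KeyType": "RANGE"},
                ],
                # ALL copies every attribute into the index; project less
                # (KEYS_ONLY, INCLUDE) to reduce index storage cost.
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
}

# import boto3
# boto3.client("dynamodb").update_table(TableName="UserEvents", **gsi_update)
```

Queries against the index then use IndexName plus a KeyConditionExpression on eventType, so only matching items are read and billed.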
DynamoDB Capacity Modes — On-Demand vs Provisioned
DynamoDB charges for reads and writes in units: one Read Capacity Unit (RCU) covers one strongly consistent read per second of an item up to 4KB (an eventually consistent read costs half), and one Write Capacity Unit (WCU) covers one write per second of an item up to 1KB. You choose how DynamoDB handles capacity provisioning:
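The rounding rules matter: capacity is consumed in whole 4KB (read) and 1KB (write) blocks, rounded up. A quick back-of-the-envelope helper:

```python
import math

def rcu_per_read(item_kb: float, strongly_consistent: bool = True) -> float:
    # One RCU = one strongly consistent read/sec of up to 4KB;
    # eventually consistent reads cost half. Size rounds up to 4KB blocks.
    units = math.ceil(item_kb / 4)
    return units if strongly_consistent else units / 2

def wcu_per_write(item_kb: float) -> int:
    # One WCU = one write/sec of up to 1KB; size rounds up to 1KB blocks.
    return math.ceil(item_kb)

# A 5KB item read: ceil(5/4) = 2 RCU strongly consistent, 1 eventually consistent
assert rcu_per_read(5) == 2
assert rcu_per_read(5, strongly_consistent=False) == 1
# A 2.5KB item write rounds up to 3 WCU
assert wcu_per_write(2.5) == 3
```

This is why a 4.1KB item costs twice the RCUs of a 4.0KB one — trimming item size just under a block boundary is a real cost lever.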
The scenario: Your application is moving to production. Traffic is predictable — your analytics show consistent 800 reads/sec and 200 writes/sec throughout the day with a 3× spike during a nightly batch job. You want to switch from on-demand to provisioned mode with auto-scaling to reduce your monthly DynamoDB bill by 40% while ensuring you never get throttled during the batch spike.
# Step 1: switch table to provisioned mode
aws dynamodb update-table \
--table-name UserEvents \
--billing-mode PROVISIONED \
--provisioned-throughput ReadCapacityUnits=1000,WriteCapacityUnits=250
# Step 2: register the table as an auto-scaling target
aws application-autoscaling register-scalable-target \
--service-namespace dynamodb \
--resource-id "table/UserEvents" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--min-capacity 100 \
--max-capacity 800 # covers the 3x batch spike (200 * 3 = 600, with headroom)
# Step 3: create the scaling policy — target 70% utilisation
aws application-autoscaling put-scaling-policy \
--policy-name UserEvents-write-scaling \
--service-namespace dynamodb \
--resource-id "table/UserEvents" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration \
'{"TargetValue":70.0,"PredefinedMetricSpecification":
{"PredefinedMetricType":"DynamoDBWriteCapacityUtilization"}}'
Table updated: PROVISIONED mode ✓
Auto-scaling target registered: WCU 100 → 800 ✓
Scaling policy created: target 70% utilisation ✓
// Cost comparison (us-east-1, approximate):
On-demand: 800 reads/s + 200 writes/s ≈ $847/month
Provisioned: 1000 RCU + 250 WCU ≈ $486/month (-43%)
// During batch spike:
WCU utilisation hits 85% → auto-scaling triggers
WCU scales from 250 → 600 within ~2 minutes
Zero throttled requests ✓
TargetValue: 70.0 — why not 90%?
Auto-scaling takes 1–2 minutes to provision additional capacity after a threshold is crossed. If you target 90% and traffic spikes instantly, the 1–2 minutes before new capacity arrives will see throttling. Targeting 70% gives a 30% buffer — enough headroom for traffic to grow while auto-scaling catches up. For write-critical workloads, 60–70% is the safe range.
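The buffer can be made concrete with a one-line calculation (a sketch, not AWS tooling): at a target utilisation of T%, traffic can grow by (100 − T)/T before it saturates the currently provisioned capacity.

```python
def growth_headroom(target_utilisation_pct: float) -> float:
    # At target T%, provisioned capacity ≈ traffic / (T / 100), so traffic
    # can grow by (100 - T) / T before hitting that capacity.
    return (100 - target_utilisation_pct) / target_utilisation_pct

print(f"{growth_headroom(70):.0%}")  # 43% — room for auto-scaling to catch up
print(f"{growth_headroom(90):.0%}")  # 11% — almost any spike throttles first
```

At 70% you tolerate a ~43% instantaneous jump; at 90% an 11% jump already starts throttling before new capacity arrives.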
max-capacity: 800 — set above your peak
Auto-scaling will never provision beyond max-capacity. If your batch job spikes to 700 WCU and your max is 800, you have headroom. If you had set max to 600 and the batch runs hot at 650 WCU, auto-scaling cannot help — requests get throttled. Always set max to at least 150% of your observed peak, not your average.
43% cost reduction vs on-demand
On-demand mode charges roughly 6.25× more per WCU than provisioned mode because AWS absorbs all capacity risk. At predictable, sustained load you are paying a large premium for flexibility you do not need. The crossover point is roughly when your traffic is predictable enough that auto-scaling can handle it — typically after 2–3 months of production data showing a consistent pattern.
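To find your own crossover point, plug current AWS prices into a small calculator like this sketch. Prices vary by region and change over time, so they are left as named parameters here rather than quoted rates:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.59M seconds

def on_demand_monthly(reads_per_s: float, writes_per_s: float,
                      price_per_million_reads: float,
                      price_per_million_writes: float) -> float:
    # On-demand bills per request: (total requests / 1M) * unit price.
    reads = reads_per_s * SECONDS_PER_MONTH
    writes = writes_per_s * SECONDS_PER_MONTH
    return (reads / 1e6) * price_per_million_reads \
         + (writes / 1e6) * price_per_million_writes

def provisioned_monthly(rcu: int, wcu: int,
                        price_per_rcu_hour: float,
                        price_per_wcu_hour: float) -> float:
    # Provisioned bills per capacity-hour, whether or not it is consumed.
    hours = 30 * 24
    return rcu * hours * price_per_rcu_hour + wcu * hours * price_per_wcu_hour
```

Compare the two for your observed steady-state rates; when provisioned capacity (including auto-scaling headroom) comes out clearly cheaper over several consecutive months, the switch is justified.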
ElastiCache — Caching DynamoDB with Redis
Even with DynamoDB's single-digit millisecond latency, some read patterns justify a cache layer. A product detail page requested 10,000 times per second reads the same 5KB document every time — paying for 10,000 RCU/sec for data that hasn't changed in hours. ElastiCache Redis sits in front of DynamoDB, serving those reads from memory at sub-millisecond latency and eliminating the DynamoDB read cost entirely for cached items.
The scenario: Your e-commerce platform's product pages are hammering DynamoDB at 15,000 reads/sec during a flash sale. Product data changes at most once per hour. Your DynamoDB bill has tripled. You are adding an ElastiCache Redis cluster with a cache-aside pattern — check Redis first, fall back to DynamoDB on a cache miss, and write back to Redis with a 1-hour TTL.
import redis, boto3, json
# ElastiCache Redis cluster endpoint (from AWS console)
cache = redis.Redis(
host="myapp.cache.amazonaws.com",
port=6379,
decode_responses=True
)
table = boto3.resource("dynamodb").Table("Products")
def get_product(product_id: str) -> dict:
cache_key = f"product:{product_id}"
# Step 1: check Redis cache first
cached = cache.get(cache_key)
if cached:
return json.loads(cached) # cache hit — sub-millisecond, zero DynamoDB cost
# Step 2: cache miss — read from DynamoDB
response = table.get_item(Key={"productId": product_id})
product = response.get("Item")
if product:
# Step 3: write to cache with 1-hour TTL
cache.setex(
name=cache_key,
time=3600, # seconds — expires after 1 hour
value=json.dumps(product, default=str)
)
return product
// First request — cache miss
get_product("prod_001"):
Redis: MISS
DynamoDB: GET → { productId: "prod_001", name: "Trainers XL", price: 89.99 }
Redis: SET product:prod_001 (TTL: 3600s)
Latency: 4.2ms
// All subsequent requests for 1 hour — cache hit
get_product("prod_001"):
Redis: HIT → { productId: "prod_001", name: "Trainers XL", price: 89.99 }
DynamoDB: not called
Latency: 0.3ms
// Flash sale traffic — 15,000 req/sec for prod_001
DynamoDB reads: 1 per hour (the initial cache miss; everything after that from cache)
DynamoDB cost saved: ~14,999 RCU/sec ✓
Redis hit rate: 99.99%
cache.setex(name, time=3600, value)
setex sets a key with an expiry time atomically. If you used set() followed by expire(), a process crash between the two calls would leave a key with no TTL — it would live in Redis forever, serving stale data indefinitely. Always use setex or set(ex=3600) to set the value and TTL in a single atomic operation.
DynamoDB reads drop from 15,000 to 1 per hour
After the first cache miss populates Redis, every subsequent request for that product is served from memory. During a 1-hour flash sale with 15,000 req/sec, you make exactly 1 DynamoDB read per product per hour instead of 54 million. The ElastiCache cluster cost is a small fraction of what those 54 million reads would cost — this is the economics of caching at scale.
Cache-aside vs write-through
Cache-aside (this pattern) populates the cache lazily on miss. The first request after a TTL expiry always hits DynamoDB. Write-through populates the cache on every write — so the cache is always warm and there are never cold misses. Write-through adds latency to every write; cache-aside adds latency to cache misses only. For read-heavy data that changes infrequently, cache-aside is simpler and equally effective.
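For contrast, a minimal write-through counterpart to the get_product function above might look like this sketch (same hypothetical table and cache clients as in the cache-aside example):

```python
import json

def update_product_write_through(table, cache, product: dict,
                                 ttl_s: int = 3600) -> None:
    # Write-through: persist to DynamoDB first, then refresh the cache
    # immediately, so readers never see a cold miss for this item.
    table.put_item(Item=product)
    cache.setex(
        name=f"product:{product['productId']}",
        time=ttl_s,
        value=json.dumps(product, default=str),
    )
```

Every write now pays the extra Redis round-trip, but the cache stays warm. With cache-aside you would instead delete the cached key on write and let the next read repopulate it.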
Amazon Keyspaces — Managed Cassandra
Amazon Keyspaces is a serverless Cassandra-compatible service. You write CQL (Cassandra Query Language) exactly as you would against a self-managed cluster, and AWS handles replication, patching, scaling, and availability. The key difference from self-managed Cassandra: you do not control compaction strategies, JVM tuning, or node placement — AWS abstracts all of it.
The scenario: Your team runs a self-managed 6-node Cassandra cluster for IoT sensor data. Two engineers spend 30% of their time on cluster operations — patching, compaction tuning, node replacements. You are evaluating Keyspaces as a migration target. You want to verify the CQL compatibility and test that your existing application queries work without modification.
-- Create keyspace (identical syntax to self-managed Cassandra)
CREATE KEYSPACE iot
WITH replication = {'class': 'SingleRegionStrategy'};
-- Keyspaces handles replication internally across 3 AZs automatically
-- Create table — same CQL as Cassandra
CREATE TABLE iot.sensor_readings (
device_id TEXT,
recorded_at TIMESTAMP,
temperature DOUBLE,
humidity DOUBLE,
PRIMARY KEY (device_id, recorded_at)
) WITH CLUSTERING ORDER BY (recorded_at DESC);
-- Insert and query — identical to Cassandra
INSERT INTO iot.sensor_readings
(device_id, recorded_at, temperature, humidity)
VALUES ('device_882', toTimestamp(now()), 22.4, 61.2);
-- Get last 24h of readings for a device
SELECT * FROM iot.sensor_readings
WHERE device_id = 'device_882'
AND recorded_at >= toTimestamp(now()) - 86400s
LIMIT 1000;
Keyspace 'iot' created ✓
Table 'iot.sensor_readings' created ✓
INSERT executed (1.8ms) ✓
SELECT results:
device_id  | recorded_at              | temperature | humidity
-----------+--------------------------+-------------+---------
device_882 | 2025-03-10 14:22:01.000Z | 22.4        | 61.2
device_882 | 2025-03-10 14:21:58.000Z | 22.3        | 61.1
... (247 rows, 14.2ms)
// No application code changes required — same CQL driver, same queries ✓
// Operational overhead: zero node management, zero compaction tuning
SingleRegionStrategy — no RF setting needed
In self-managed Cassandra you set replication_factor: 3 explicitly. In Keyspaces, SingleRegionStrategy tells AWS to handle replication internally — it stores three copies across three Availability Zones automatically. You do not choose the replication factor; AWS enforces it for you as part of the managed service contract.
Same CQL, same driver — zero application changes
Keyspaces accepts connections from the standard Apache Cassandra drivers — Python, Java, Node.js. You only change the connection endpoint and add AWS SigV4 authentication. All existing CQL queries, table schemas, and application code work without modification. The migration effort is infrastructure-only, not application code.
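Keyspaces exposes a regional endpoint (cassandra.&lt;region&gt;.amazonaws.com) and accepts TLS connections on port 9142. A small helper (hypothetical name) captures the endpoint format, with the driver wiring sketched in comments since it needs the Cassandra driver, the AWS SigV4 plugin, and live credentials:

```python
KEYSPACES_TLS_PORT = 9142  # Keyspaces accepts only TLS connections, on this port

def keyspaces_endpoint(region: str) -> str:
    # Public service endpoint format for Amazon Keyspaces.
    return f"cassandra.{region}.amazonaws.com"

# With the standard Python driver, the connection would look roughly like:
#   from cassandra.cluster import Cluster
#   from cassandra_sigv4.auth import SigV4AuthProvider  # AWS-provided plugin
#   import boto3
#   cluster = Cluster(
#       [keyspaces_endpoint("us-east-1")],
#       port=KEYSPACES_TLS_PORT,
#       auth_provider=SigV4AuthProvider(boto3.Session()),
#       ssl_context=...,  # TLS context trusting the Amazon CA certificate
#   )
#   session = cluster.connect()
```

Everything after cluster.connect() — prepared statements, queries, result handling — is unchanged from self-managed Cassandra.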
What Keyspaces does not support
Keyspaces does not support all Cassandra features. Lightweight Transactions (LWT / IF NOT EXISTS) have limited support, custom compaction strategies cannot be changed, and ALLOW FILTERING is restricted. Before migrating, audit your CQL against the Keyspaces compatibility matrix. Most standard CRUD and time-series query patterns work without issues.
Teacher's Note
The most common DynamoDB mistake I see is teams treating it like MongoDB or PostgreSQL — designing the table first and figuring out queries later. DynamoDB does not forgive this. If your access patterns change after you have millions of items and no room for a GSI that supports them, your options are expensive: a full table scan, a parallel application-level join, or a painful migration. Write down every query your application needs to run before you create the table. Every single one. Then design the primary key and GSIs to make those queries targeted. DynamoDB rewards preparation and punishes improvisation.
Practice Questions — You're the Engineer
Scenario:
eventType = "purchase", which narrows the result to 8 items. Your DynamoDB cost dashboard shows you were billed for 200 item reads, not 8. A colleague explains that DynamoDB reads all items matching the key condition first, then applies the secondary condition afterwards, before returning results — you are billed for everything DynamoDB reads, not just what it returns. What is the name of this secondary condition parameter in the boto3 query() call?
Quiz — NoSQL in AWS in Production
Scenario:
userId and sort key of timestamp#eventId. A new reporting query retrieves all eventType = "refund" events across all users in the past 30 days. You add FilterExpression=Attr("eventType").eq("refund") to the query. In production the query reads 4 million items but returns 1,200 refund events. Your DynamoDB bill for this single query is $780 — based on 4 million item reads. What is the architectural cause of the excessive cost, and what is the correct fix?
Scenario:
ProvisionedThroughputExceededException errors. After the scale-up completes, throttling stops. What is the root cause of the 90-second throttling window, and what should the target utilisation have been set to?
Up Next · Lesson 39
Cloud-Native NoSQL
Serverless databases, multi-cloud patterns, and the architecture decisions that let your data layer scale to zero and back — automatically.