NoSQL
Transactions in NoSQL
For years, "NoSQL means no transactions" was treated as gospel. Engineers chose NoSQL for scale and accepted that multi-document atomicity was simply off the table. Then MongoDB added multi-document transactions in 4.0. Cassandra added lightweight transactions. DynamoDB added transactional APIs. The trade-off did not disappear — transactions in NoSQL are real, but they cost more than people expect, and using them carelessly undoes the performance gains that made you choose NoSQL in the first place.
The Problem Transactions Solve
A bank transfer involves two operations: debit one account, credit another. Both must succeed or both must fail — you cannot have a world where the debit succeeds and the credit fails. This all-or-nothing guarantee is called atomicity, and it is one of the four ACID properties.
Without transactions, a process crash, a network timeout, or a write failure between the two operations leaves the data in a partially-updated state — money gone from one account, never arriving in the other. At scale, these partial writes accumulate silently until a user notices their balance is wrong.
Partial write failure — the problem without transactions:
❌ No transaction — crash between writes
db.accounts.updateOne(
  { _id: "alice" },
  { $inc: { balance: -100 } }
);
// 💥 SERVER CRASH HERE
// Step 2: credit Bob — never runs
// Alice: -£100. Bob: unchanged.
// £100 vanishes from the system.
✅ With transaction — crash-safe
session.startTransaction();
// debit Alice
// credit Bob
// 💥 SERVER CRASH HERE
// On recovery: transaction is
// rolled back — neither write
// persists. Both balances intact.
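The difference can be reproduced without a database. The sketch below is a toy in-memory model, not MongoDB: `transfer_unsafe`, `transfer_atomic`, and `SimulatedCrash` are hypothetical names, and the "crash" is just an exception raised between the two writes.

```python
class SimulatedCrash(Exception):
    """Stands in for a process crash between two writes."""


def transfer_unsafe(balances, src, dst, amount, crash=False):
    # Two independent writes: a crash between them leaves a partial update.
    balances[src] -= amount
    if crash:
        raise SimulatedCrash("crashed after debit, before credit")
    balances[dst] += amount


def transfer_atomic(balances, src, dst, amount, crash=False):
    # Stage both writes on a private copy, then publish them in one step.
    staged = dict(balances)
    staged[src] -= amount
    if crash:
        raise SimulatedCrash("crashed before commit")
    staged[dst] += amount
    balances.update(staged)  # "commit": both changes become visible together


def run(transfer):
    balances = {"alice": 500, "bob": 200}
    try:
        transfer(balances, "alice", "bob", 100, crash=True)
    except SimulatedCrash:
        pass  # the caller only sees whatever state was left behind
    return balances


unsafe_result = run(transfer_unsafe)
atomic_result = run(transfer_atomic)
print("unsafe:", unsafe_result, "total:", sum(unsafe_result.values()))
print("atomic:", atomic_result, "total:", sum(atomic_result.values()))
```

In the unsafe run the total across both accounts drops by 100; in the atomic run nothing changes, because the staged writes were never published.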
Multi-Document Transactions in MongoDB
MongoDB has supported multi-document ACID transactions since version 4.0 for replica sets and 4.2 for sharded clusters. The API mirrors what you would expect from a relational database — start a session, start a transaction, run your operations, commit or abort.
The scenario: You are building a payment service for a marketplace. Every payment involves three writes: debit the buyer's wallet, credit the seller's wallet, and insert a transaction record for the audit log. If any of the three fails, all three must be rolled back. A partial write would mean money leaves the buyer but never reaches the seller — or the audit log shows a payment that never completed.
from pymongo import MongoClient
import datetime

client = MongoClient("mongodb://localhost:27017/")
db = client["marketplace"]

def process_payment(buyer_id, seller_id, amount_gbp):
    # A session is required for multi-document transactions
    with client.start_session() as session:
        try:
            session.start_transaction()

            # Step 1: debit buyer; matches nothing if funds are insufficient
            result = db.wallets.update_one(
                {"_id": buyer_id, "balance": {"$gte": amount_gbp}},
                {"$inc": {"balance": -amount_gbp}},
                session=session
            )
            if result.modified_count == 0:
                raise ValueError("Insufficient funds")

            # Step 2: credit seller
            db.wallets.update_one(
                {"_id": seller_id},
                {"$inc": {"balance": amount_gbp}},
                session=session
            )

            # Step 3: insert audit record (timezone-aware UTC timestamp)
            db.transactions.insert_one({
                "buyer_id": buyer_id,
                "seller_id": seller_id,
                "amount": amount_gbp,
                "status": "completed",
                "ts": datetime.datetime.now(datetime.timezone.utc)
            }, session=session)

            session.commit_transaction()
            return {"status": "ok", "amount": amount_gbp}
        except Exception as e:
            # Abort only if the transaction is still open; a failed commit
            # may already have ended it.
            if session.in_transaction:
                session.abort_transaction()
            return {"status": "failed", "reason": str(e)}
>>> process_payment("buyer_441", "seller_882", 149.99)
{'status': 'ok', 'amount': 149.99}
Transaction committed: 3 writes, all or nothing ✓
Latency: 8.4ms (vs 2.1ms without transaction)
>>> process_payment("buyer_999", "seller_882", 5000.00)
{'status': 'failed', 'reason': 'Insufficient funds'}
Transaction aborted: 0 writes persisted ✓
session=session on every operation
Every database operation inside a transaction must receive the session object as a parameter. Operations run without the session object are outside the transaction — they commit immediately and cannot be rolled back if the transaction aborts. This is the most common transaction bug: a developer forgets to pass session=session to one operation and it commits independently of the rest.
{"balance": {"$gte": amount_gbp}}: conditional debit
The debit uses a filter that requires balance >= amount. If the buyer does not have enough funds, modified_count is 0 and we raise an exception — which triggers abort_transaction(). The seller's credit and the audit record never execute. This is the correct pattern for conditional multi-step writes: check-and-act atomically inside the transaction.
Latency: 8.4ms vs 2.1ms without transaction
Transactions in MongoDB are typically 3–5× slower than non-transactional writes, as the 8.4ms versus 2.1ms figures above illustrate. The overhead comes from session and transaction bookkeeping, from holding locks and snapshot state until commit, and, on sharded clusters, from the two-phase commit that coordinates the participating shards. That cost is expected and acceptable for payment flows, but transactions are the wrong tool for high-throughput event ingestion or feed writes.
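The manual start/commit/abort handling in the payment example can also be written with PyMongo's `ClientSession.with_transaction` helper, which runs a callback inside a transaction and retries it on transient transaction errors. The sketch below is one possible shape, not the lesson's own code: `process_payment_with_retry` is a hypothetical name, and `client` is assumed to be a `pymongo.MongoClient` connected to a replica set.

```python
import datetime


def process_payment_with_retry(client, buyer_id, seller_id, amount_gbp):
    """Run the three payment writes through with_transaction.

    Assumption: `client` is a pymongo.MongoClient on a replica set.
    """
    db = client["marketplace"]

    def callback(session):
        # The callback may run more than once on transient errors, so it
        # contains database operations only, each with session=session.
        result = db.wallets.update_one(
            {"_id": buyer_id, "balance": {"$gte": amount_gbp}},
            {"$inc": {"balance": -amount_gbp}},
            session=session,
        )
        if result.modified_count == 0:
            # Raising aborts the transaction; nothing is persisted.
            raise ValueError("Insufficient funds")
        db.wallets.update_one(
            {"_id": seller_id},
            {"$inc": {"balance": amount_gbp}},
            session=session,
        )
        db.transactions.insert_one(
            {
                "buyer_id": buyer_id,
                "seller_id": seller_id,
                "amount": amount_gbp,
                "status": "completed",
                "ts": datetime.datetime.now(datetime.timezone.utc),
            },
            session=session,
        )
        return {"status": "ok", "amount": amount_gbp}

    with client.start_session() as session:
        try:
            # Commits when the callback returns; returns the callback's value.
            return session.with_transaction(callback)
        except ValueError as e:
            return {"status": "failed", "reason": str(e)}
```

Because every write lives inside one callback that receives the session, a missing `session=session` is easier to spot in review than when operations are scattered through a long try block.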
Transaction Limitations in MongoDB
MongoDB transactions are real and they work — but they come with hard limits that matter in production.
60-second limit
Transactions time out after 60 seconds by default. Any transaction touching slow external services or doing complex computation will be aborted. Keep transactions short — database operations only, no HTTP calls inside a transaction.
16MB document limit still applies
Transactions do not bypass the 16MB document size limit. Attempting to insert or grow a document beyond 16MB inside a transaction fails and aborts the entire transaction.
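Limited DDL inside transactions
Before MongoDB 4.4, you cannot create collections or indexes inside a transaction, and attempting it causes an error. From 4.4 onward, creating a collection or index inside a transaction is allowed under restrictions, but other DDL operations (such as dropping a collection) must still happen outside any active transaction. The simplest rule is to create collections and indexes up front, before the transactional code runs.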
Replica set required
Multi-document transactions require a replica set or sharded cluster — they do not work on standalone MongoDB instances. Running standalone in development and replica set in production can hide transaction-related bugs until they surface in production.
Lightweight Transactions in Cassandra — Compare-and-Swap
Cassandra does not support multi-statement ACID transactions. What it does offer are Lightweight Transactions (LWTs) — single-row conditional writes using the IF clause. Internally, LWTs use the Paxos consensus protocol to ensure that a write only succeeds if the current state matches a condition. This is a compare-and-swap operation: check the current value, update only if it matches what you expect.
The scenario: You are building a seat reservation system for a concert ticketing platform. Two users try to book the last seat simultaneously. Without a conditional write, both could succeed and you would have two people holding the same seat. An LWT ensures only the first write commits.
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

session = Cluster(['localhost']).connect('ticketing')

def reserve_seat(event_id, seat_id, user_id):
    # IF status = 'available': only succeeds if the seat is currently available.
    # If two requests arrive simultaneously, only one can win.
    stmt = SimpleStatement(
        """UPDATE seats
           SET status = 'reserved', reserved_by = %s
           WHERE event_id = %s AND seat_id = %s
           IF status = 'available'""",
        consistency_level=ConsistencyLevel.QUORUM
    )
    result = session.execute(stmt, (user_id, event_id, seat_id))
    row = result.one()
    if row.applied:
        return {"status": "reserved", "seat": seat_id, "user": user_id}
    # A failed LWT returns [applied] plus the columns named in the IF
    # condition (here, status), so read reserved_by with a separate query.
    lookup = SimpleStatement(
        "SELECT reserved_by FROM seats WHERE event_id = %s AND seat_id = %s",
        consistency_level=ConsistencyLevel.QUORUM
    )
    current = session.execute(lookup, (event_id, seat_id)).one()
    return {"status": "already_taken", "reserved_by": current.reserved_by}
>>> reserve_seat("evt_glastonbury", "A14", "user_441")
{'status': 'reserved', 'seat': 'A14', 'user': 'user_441'}
LWT applied: True — seat was available, write committed ✓
>>> reserve_seat("evt_glastonbury", "A14", "user_882")
{'status': 'already_taken', 'reserved_by': 'user_441'}
LWT applied: False — seat already reserved, write rejected ✓
LWT latency: 28ms (vs 2ms for regular write)
Reason: 4 Paxos round trips required per LWT
IF status = 'available'
The IF clause turns a regular CQL write into a Paxos-backed compare-and-swap. Cassandra checks the current value atomically at the replica level — not in application code. This guarantees that even under high concurrency, only one of the simultaneous requests succeeds. Without IF, both requests would write 'reserved' and both would get a success response.
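The compare-and-swap behaviour can be illustrated in a single process. The sketch below is a simulation only: a `threading.Lock` stands in for the Paxos round, and `SeatTable`, `reserve_if_available`, and `race` are hypothetical names, not part of any Cassandra driver.

```python
import threading


class SeatTable:
    """Toy in-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = {"A14": {"status": "available", "reserved_by": None}}

    def reserve_if_available(self, seat_id, user_id):
        # The check and the write happen under one lock, so no other
        # request can slip in between them (the compare-and-swap property).
        with self._lock:
            row = self._rows[seat_id]
            if row["status"] != "available":
                return False  # condition failed: nothing is written
            row["status"] = "reserved"
            row["reserved_by"] = user_id
            return True

    def holder(self, seat_id):
        return self._rows[seat_id]["reserved_by"]


def race(table, seat_id, user_ids):
    """Fire one reservation attempt per user at the same moment."""
    results = {}
    barrier = threading.Barrier(len(user_ids))

    def attempt(user_id):
        barrier.wait()  # release all requests together
        results[user_id] = table.reserve_if_available(seat_id, user_id)

    threads = [threading.Thread(target=attempt, args=(u,)) for u in user_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


table = SeatTable()
outcome = race(table, "A14", ["user_441", "user_882"])
winners = [user for user, applied in outcome.items() if applied]
print(outcome, "holder:", table.holder("A14"))
```

Which user wins varies from run to run, but exactly one attempt reports success and the stored holder always matches that winner.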
row.applied — the LWT result flag
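Every LWT returns a result row with an [applied] boolean column, which the Python driver exposes as row.applied. True means the condition was met and the write committed. False means the condition failed and the row was not updated. On failure, the result row also carries the current values of the columns named in the IF condition (here, status); other columns, such as reserved_by, need a separate read if you want to show the user who already holds the seat.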
LWT latency: 28ms vs 2ms regular write
LWTs are 10–15× slower than regular Cassandra writes. Paxos requires four network round trips between replicas to reach consensus. In a cluster spread across regions, this latency compounds dramatically. Use LWTs only where the correctness guarantee is strictly necessary — not as a general-purpose locking mechanism.
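As a rough back-of-the-envelope check, the network cost of an LWT can be estimated as round trips multiplied by the replica-to-replica round-trip time. The function name and both RTT values below are hypothetical; the 7ms figure is chosen only so the result lines up with the 28ms example above, and real latency also includes disk and queueing time.

```python
def estimated_lwt_latency_ms(replica_rtt_ms, round_trips=4):
    """Lower bound on LWT latency from network round trips alone."""
    return replica_rtt_ms * round_trips


# Hypothetical round-trip times between replicas.
single_region = estimated_lwt_latency_ms(7)
cross_region = estimated_lwt_latency_ms(80)

print(f"single region: {single_region}ms, cross region: {cross_region}ms")
```

With these assumed numbers, moving replicas across regions takes the same conditional write from tens of milliseconds to several hundred.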
Transactions in DynamoDB — TransactWriteItems
The scenario: You are building a flash sale system on DynamoDB. When a user purchases an item, you need to atomically: decrement the inventory count (only if stock > 0), and insert an order record. If the inventory decrement fails because stock is 0, the order must not be created. Both operations must be atomic across two separate DynamoDB tables.
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client('dynamodb', region_name='eu-west-1')

def purchase_item(product_id, user_id, order_id):
    try:
        dynamodb.transact_write_items(
            TransactItems=[
                # Operation 1: decrement inventory, only if stock > 0
                {
                    'Update': {
                        'TableName': 'inventory',
                        'Key': {'product_id': {'S': product_id}},
                        'UpdateExpression': 'SET stock = stock - :one',
                        'ConditionExpression': 'stock > :zero',
                        'ExpressionAttributeValues': {
                            ':one': {'N': '1'},
                            ':zero': {'N': '0'}
                        }
                    }
                },
                # Operation 2: create the order record
                {
                    'Put': {
                        'TableName': 'orders',
                        'Item': {
                            'order_id': {'S': order_id},
                            'user_id': {'S': user_id},
                            'product_id': {'S': product_id},
                            'status': {'S': 'confirmed'}
                        },
                        # Prevent duplicate orders for the same order_id
                        'ConditionExpression': 'attribute_not_exists(order_id)'
                    }
                }
            ]
        )
        return {"status": "purchased", "order_id": order_id}
    except ClientError as e:
        if e.response['Error']['Code'] == 'TransactionCanceledException':
            return {"status": "out_of_stock_or_duplicate"}
        raise
>>> purchase_item("prod_trainers_xl", "user_441", "ord_9912")
{'status': 'purchased', 'order_id': 'ord_9912'}
TransactWriteItems: 2 tables, both writes committed ✓
>>> purchase_item("prod_trainers_xl", "user_882", "ord_9913")
{'status': 'out_of_stock_or_duplicate'}
TransactionCanceledException: stock condition failed
Inventory unchanged, order not created ✓
transact_write_items: up to 100 operations, up to 4MB
DynamoDB's transaction API supports up to 100 operations across multiple tables in a single atomic call, with a total payload limit of 4MB. All operations succeed or all are cancelled. Unlike MongoDB, you do not need to manage a session — the entire transaction is expressed as a single API call. DynamoDB handles the two-phase commit internally.
ConditionExpression on both items
Both operations carry their own condition: the inventory decrement requires stock > 0, and the order insert requires attribute_not_exists(order_id) to prevent duplicate orders if the request is retried. If either condition fails, TransactionCanceledException is raised and neither write persists. Idempotent transactions — safe to retry — are essential for any payment flow.
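The purchase example collapses both failure cases into one status. A possible refinement is to inspect the cancellation reasons DynamoDB reports, one per item in the transaction. The sketch below is a pure function over a response dictionary; `classify_cancellation` is a hypothetical helper, and it assumes the error response carries a `CancellationReasons` list in the same order as `TransactItems` (inventory update first, order put second).

```python
def classify_cancellation(error_response):
    """Map a TransactionCanceledException response to a specific outcome.

    Assumption: reasons are ordered like TransactItems, so index 0 is the
    inventory update and index 1 is the order put.
    """
    reasons = error_response.get("CancellationReasons", [])
    codes = [reason.get("Code", "None") for reason in reasons]
    if len(codes) > 0 and codes[0] == "ConditionalCheckFailed":
        return "out_of_stock"
    if len(codes) > 1 and codes[1] == "ConditionalCheckFailed":
        return "duplicate_order"
    return "cancelled_other"


# Hand-written example responses, shaped like the assumed error payload.
stock_failed = {
    "Error": {"Code": "TransactionCanceledException"},
    "CancellationReasons": [{"Code": "ConditionalCheckFailed"}, {"Code": "None"}],
}
order_exists = {
    "Error": {"Code": "TransactionCanceledException"},
    "CancellationReasons": [{"Code": "None"}, {"Code": "ConditionalCheckFailed"}],
}

print(classify_cancellation(stock_failed))
print(classify_cancellation(order_exists))
```

In the `except ClientError as e` branch this would be called as `classify_cancellation(e.response)`, letting a retried request that hits the duplicate check be treated as already successful rather than as out of stock.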
DynamoDB transaction cost — 2× read/write capacity units
DynamoDB charges twice the normal read/write capacity for transactional operations. A regular PutItem consuming 1 WCU costs 2 WCUs inside a transaction. At high throughput this doubles your DynamoDB bill instantly. Reserve transactions for operations where correctness is non-negotiable — not for writes where eventual consistency is acceptable.
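The doubling is easy to put into numbers. The sketch below assumes the standard write sizing of one WCU per 1KB of item size, rounded up, and doubles it for transactional writes; the function name and the traffic figures are hypothetical.

```python
import math


def write_capacity_units(item_size_kb, transactional=False):
    """WCUs for one write: 1 per KB (rounded up), doubled in a transaction."""
    units = max(1, math.ceil(item_size_kb))
    return units * 2 if transactional else units


# Hypothetical workload: 1,000 writes per second of 1KB items.
writes_per_second = 1000
standard = writes_per_second * write_capacity_units(1)
transactional = writes_per_second * write_capacity_units(1, transactional=True)

print(f"standard: {standard} WCU/s, transactional: {transactional} WCU/s")
```

Under these assumptions the same traffic needs twice the provisioned write capacity once it goes through `transact_write_items`.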
The Saga Pattern — Transactions Without Transactions
When you need multi-step consistency across services or databases that do not support native transactions, the Saga pattern is the alternative. A saga breaks a multi-step operation into a sequence of local transactions. If a step fails, a compensating transaction undoes the preceding steps.
Saga — Order Fulfilment Across Three Services
The saga does not provide isolation — a concurrent reader could see the intermediate state (inventory reserved, payment charged, no label yet). It provides eventual consistency with explicit rollback logic.
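A minimal orchestration sketch makes the compensation logic concrete. Everything here is illustrative: `run_saga` is a hypothetical helper, the three steps are in-memory stand-ins for the inventory, payment, and shipping-label services, and the label step is made to fail on purpose.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, run the compensations of the steps that already
    completed, in reverse order, and report which step failed.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for _done_name, undo in reversed(completed):
                undo()
            return {"status": "rolled_back", "failed_step": name, "reason": str(exc)}
        completed.append((name, compensate))
    return {"status": "completed"}


log = []


def fail_label():
    raise RuntimeError("label service unavailable")


steps = [
    ("reserve_inventory",
     lambda: log.append("inventory reserved"),
     lambda: log.append("inventory released")),
    ("charge_payment",
     lambda: log.append("payment charged"),
     lambda: log.append("payment refunded")),
    ("create_label",
     fail_label,
     lambda: log.append("label cancelled")),
]

result = run_saga(steps)
print(result)
print(log)
```

Between "payment charged" and "payment refunded" the intermediate state is visible to any concurrent reader, which is exactly the missing isolation described above. A production saga would also need to persist its progress so that compensations still run after a crash of the orchestrator itself.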
Transactions — Costs Compared
| Database | Transaction support | Latency overhead | Scope | Use for |
|---|---|---|---|---|
| MongoDB | Full multi-document ACID | 3–5× slower | Multiple collections, sharded clusters | Payments, transfers, audit trails |
| Cassandra LWT | Single-row compare-and-swap | 10–15× slower (Paxos) | Single partition only | Seat reservations, unique usernames |
| DynamoDB | TransactWriteItems (up to 100 ops) | 2× capacity cost | Multiple tables, same region | Flash sales, inventory + order atomicity |
| Neo4j | Full ACID per Cypher statement | Minimal — native to engine | Multiple nodes/relationships per query | Graph mutations needing consistency |
Teacher's Note
The temptation after learning about MongoDB multi-document transactions is to wrap everything in a transaction "just to be safe." Resist it. A session that spends 60 seconds holding document-level locks is an availability problem waiting to happen. Use transactions for the 5% of operations that genuinely require all-or-nothing semantics — money movement, inventory allocation, audit records. The other 95% should stay lean, fast, and unlocked.
Practice Questions — You're the Engineer
Scenario:
try block. After reviewing the code you spot the bug immediately. The db.transactions.insert_one() call is missing something that would bind it to the active transaction. What is missing?
Scenario:
INSERT INTO users (username, user_id) VALUES ('alice', 'u_441') IF NOT EXISTS. Two users try to register "alice" simultaneously. You execute the LWT and get a result row back. Your code checks a specific boolean column in the result to determine whether the username was successfully claimed or was already taken. What is this column called?
Quiz — Transactions in Production
Scenario:
transact_write_items. The feature works correctly but the AWS bill doubles overnight. Your CTO asks you to explain the cost impact to the team.