SQL Lesson 30 – | Dataplexa

Data Modeling & Design · Lesson 30

Transactions in NoSQL

For years, "NoSQL means no transactions" was treated as gospel. Engineers chose NoSQL for scale and accepted that multi-document atomicity was simply off the table. Then MongoDB added multi-document transactions in 4.0. Cassandra added lightweight transactions. DynamoDB added transactional APIs. The trade-off did not disappear — transactions in NoSQL are real, but they cost more than people expect, and using them carelessly undoes the performance gains that made you choose NoSQL in the first place.

The Problem Transactions Solve

A bank transfer involves two operations: debit one account, credit another. Both must succeed or both must fail — you cannot have a world where the debit succeeds and the credit fails. This all-or-nothing guarantee is called atomicity, and it is one of the four ACID properties.

Without transactions, a process crash, a network timeout, or a write failure between the two operations leaves the data in a partially-updated state — money gone from one account, never arriving in the other. At scale, these partial writes accumulate silently until a user notices their balance is wrong.

Partial write failure — the problem without transactions:

❌ No transaction — crash between writes

// Step 1: debit Alice
db.accounts.updateOne(
  { _id: "alice" },
  { $inc: { balance: -100 } }
);
// 💥 SERVER CRASH HERE
// Step 2: credit Bob (never runs)
// Alice: -£100. Bob: unchanged.
// £100 vanishes from the system.

✅ With transaction — crash-safe

// Both ops inside one transaction
session.startTransaction();
// debit Alice
// credit Bob
// 💥 SERVER CRASH HERE
// On recovery: transaction is rolled back, neither write
// persists. Both balances intact.
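The difference between the two panels can be reproduced without a database. The sketch below is a minimal in-memory illustration, not how MongoDB implements transactions: the names `SimulatedCrash`, `transfer_without_transaction`, and `transfer_with_transaction` are hypothetical, and "commit" is modelled simply as publishing a private copy of the balances once every write has succeeded.

```python
import copy


class SimulatedCrash(Exception):
    """Stands in for a process crash between two writes (hypothetical)."""


def transfer_without_transaction(balances, src, dst, amount, crash=False):
    balances[src] -= amount           # write 1 is visible immediately
    if crash:
        raise SimulatedCrash("crashed after debit")
    balances[dst] += amount           # write 2 never runs if we crashed


def transfer_with_transaction(balances, src, dst, amount, crash=False):
    staged = copy.deepcopy(balances)  # all writes go to a private copy
    staged[src] -= amount
    if crash:
        raise SimulatedCrash("crashed before commit")
    staged[dst] += amount
    balances.update(staged)           # "commit": publish both writes together


def run_and_swallow_crash(transfer, balances):
    """Run a crashing transfer of 100 from alice to bob, return the balances."""
    try:
        transfer(balances, "alice", "bob", 100, crash=True)
    except SimulatedCrash:
        pass
    return balances
```

Calling `run_and_swallow_crash(transfer_without_transaction, {"alice": 500, "bob": 200})` leaves Alice debited and Bob unchanged, while the same call with `transfer_with_transaction` leaves both balances exactly as they started.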

Multi-Document Transactions in MongoDB

MongoDB has supported multi-document ACID transactions since version 4.0 for replica sets and 4.2 for sharded clusters. The API mirrors what you would expect from a relational database — start a session, start a transaction, run your operations, commit or abort.

The scenario: You are building a payment service for a marketplace. Every payment involves three writes: debit the buyer's wallet, credit the seller's wallet, and insert a transaction record for the audit log. If any of the three fails, all three must be rolled back. A partial write would mean money leaves the buyer but never reaches the seller — or the audit log shows a payment that never completed.

from pymongo import MongoClient
from pymongo.errors import OperationFailure
import datetime

client  = MongoClient("mongodb://localhost:27017/")
db      = client["marketplace"]

def process_payment(buyer_id, seller_id, amount_gbp):
    # A session is required for multi-document transactions
    with client.start_session() as session:
        try:
            session.start_transaction()

            # Step 1: debit buyer — fail if insufficient funds
            result = db.wallets.update_one(
                {"_id": buyer_id, "balance": {"$gte": amount_gbp}},
                {"$inc": {"balance": -amount_gbp}},
                session=session
            )
            if result.modified_count == 0:
                raise ValueError("Insufficient funds")

            # Step 2: credit seller
            db.wallets.update_one(
                {"_id": seller_id},
                {"$inc": {"balance": amount_gbp}},
                session=session
            )

            # Step 3: insert audit record
            db.transactions.insert_one({
                "buyer_id":  buyer_id,
                "seller_id": seller_id,
                "amount":    amount_gbp,
                "status":    "completed",
                "ts":        datetime.datetime.now(datetime.timezone.utc)
            }, session=session)

            session.commit_transaction()
            return {"status": "ok", "amount": amount_gbp}

        except Exception as e:
            session.abort_transaction()
            return {"status": "failed", "reason": str(e)}

>>> process_payment("buyer_441", "seller_882", 149.99)
{'status': 'ok', 'amount': 149.99}
Transaction committed: 3 writes, all or nothing  ✓
Latency: 8.4ms (vs 2.1ms without transaction)

>>> process_payment("buyer_999", "seller_882", 5000.00)
{'status': 'failed', 'reason': 'Insufficient funds'}
Transaction aborted: 0 writes persisted  ✓

session=session on every operation

Every database operation inside a transaction must receive the session object as a parameter. Operations run without the session object are outside the transaction — they commit immediately and cannot be rolled back if the transaction aborts. This is the most common transaction bug: a developer forgets to pass session=session to one operation and it commits independently of the rest.
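One way to make the session harder to forget is to put all the writes in a single callback that receives the session as its only argument. The sketch below assumes `client` is a PyMongo `MongoClient` connected to a replica set and uses the driver's `ClientSession.with_transaction()` helper, which commits when the callback returns and aborts when it raises. The function name `process_payment_with_callback` is hypothetical, and the collections are the ones from the example above.

```python
def process_payment_with_callback(client, buyer_id, seller_id, amount_gbp):
    """Run the three payment writes through session.with_transaction()."""
    db = client["marketplace"]

    def callback(session):
        # Every operation receives session=session, so all three writes
        # belong to the same transaction.
        result = db.wallets.update_one(
            {"_id": buyer_id, "balance": {"$gte": amount_gbp}},
            {"$inc": {"balance": -amount_gbp}},
            session=session,
        )
        if result.modified_count == 0:
            # Raising inside the callback aborts the transaction.
            raise ValueError("Insufficient funds")
        db.wallets.update_one(
            {"_id": seller_id},
            {"$inc": {"balance": amount_gbp}},
            session=session,
        )
        db.transactions.insert_one(
            {"buyer_id": buyer_id, "seller_id": seller_id,
             "amount": amount_gbp, "status": "completed"},
            session=session,
        )
        return {"status": "ok", "amount": amount_gbp}

    with client.start_session() as session:
        try:
            # with_transaction returns whatever the callback returns.
            return session.with_transaction(callback)
        except ValueError as exc:
            return {"status": "failed", "reason": str(exc)}
```

Because the session only exists inside `callback`, a write that omits `session=session` stands out in review: it is the one call that does not mention the callback's sole parameter.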

{"balance": {"$gte": amount_gbp}}: conditional debit

The debit uses a filter that requires balance >= amount. If the buyer does not have enough funds, modified_count is 0 and we raise an exception — which triggers abort_transaction(). The seller's credit and the audit record never execute. This is the correct pattern for conditional multi-step writes: check-and-act atomically inside the transaction.

Latency: 8.4ms vs 2.1ms without transaction

In the example above, the transactional path takes about four times as long as the plain writes (8.4 ms vs 2.1 ms), in line with the 3–5× slowdown this lesson uses as a rule of thumb for MongoDB transactions. The overhead comes from acquiring document-level locks, maintaining a transaction log, and, on sharded clusters, the two-phase commit protocol needed to ensure all shards commit together. That cost is expected and acceptable for payment flows; never pay it for high-throughput event ingestion or feed writes.

Transaction Limitations in MongoDB

MongoDB transactions are real and they work — but they come with hard limits that matter in production.

60-second limit

Transactions time out after 60 seconds by default (the server parameter transactionLifetimeLimitSeconds). A transaction that waits on a slow external service or runs a long computation risks being aborted when that limit is reached. Keep transactions short: database operations only, no HTTP calls inside a transaction.

16MB document limit still applies

Transactions do not bypass the 16MB document size limit. Attempting to insert or grow a document beyond 16MB inside a transaction fails and aborts the entire transaction.

No DDL inside transactions

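Schema-changing (DDL) operations are restricted inside transactions. Before MongoDB 4.4, you could not create collections or indexes in a transaction at all. From 4.4 onward, creating a collection or an index is allowed as long as the transaction is not a cross-shard write transaction, but other DDL, such as dropping a collection, still causes an error. The safest habit is to run schema changes outside of any active transaction.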

Replica set required

Multi-document transactions require a replica set or sharded cluster — they do not work on standalone MongoDB instances. Running standalone in development and replica set in production can hide transaction-related bugs until they surface in production.

Lightweight Transactions in Cassandra — Compare-and-Swap

Cassandra does not support multi-statement ACID transactions. What it does offer are Lightweight Transactions (LWTs) — single-row conditional writes using the IF clause. Internally, LWTs use the Paxos consensus protocol to ensure that a write only succeeds if the current state matches a condition. This is a compare-and-swap operation: check the current value, update only if it matches what you expect.

The scenario: You are building a seat reservation system for a concert ticketing platform. Two users try to book the last seat simultaneously. Without a conditional write, both could succeed and you would have two people holding the same seat. An LWT ensures only the first write commits.

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

session = Cluster(['localhost']).connect('ticketing')

def reserve_seat(event_id, seat_id, user_id):
    # IF status = 'available' — only succeeds if seat is currently available
    # If two requests arrive simultaneously, only one can win
    stmt = SimpleStatement(
        """UPDATE seats
           SET status = 'reserved', reserved_by = %s
           WHERE event_id = %s AND seat_id = %s
           IF status = 'available'""",
        consistency_level=ConsistencyLevel.QUORUM
    )
    result = session.execute(stmt, (user_id, event_id, seat_id))
    row = result.one()

    if row.applied:
        return {"status": "reserved", "seat": seat_id, "user": user_id}
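    else:
        # A failed LWT returns [applied] plus the columns named in the IF
        # clause (here only status), so read the current holder separately.
        current = session.execute(
            "SELECT reserved_by FROM seats WHERE event_id = %s AND seat_id = %s",
            (event_id, seat_id)
        ).one()
        return {"status": "already_taken", "reserved_by": current.reserved_by}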

>>> reserve_seat("evt_glastonbury", "A14", "user_441")
{'status': 'reserved', 'seat': 'A14', 'user': 'user_441'}
LWT applied: True  — seat was available, write committed  ✓

>>> reserve_seat("evt_glastonbury", "A14", "user_882")
{'status': 'already_taken', 'reserved_by': 'user_441'}
LWT applied: False — seat already reserved, write rejected  ✓

LWT latency: 28ms (vs 2ms for regular write)
Reason: 4 Paxos round trips required per LWT

IF status = 'available'

The IF clause turns a regular CQL write into a Paxos-backed compare-and-swap. Cassandra checks the current value atomically at the replica level — not in application code. This guarantees that even under high concurrency, only one of the simultaneous requests succeeds. Without IF, both requests would write 'reserved' and both would get a success response.
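The compare-and-swap behaviour can be tried out locally without a cluster. The sketch below is an in-memory stand-in, not Cassandra's implementation: the `SeatTable` class and `race_for_seat` helper are hypothetical, and a `threading.Lock` plays the role that Paxos agreement plays between replicas, making the check and the write one indivisible step.

```python
import threading


class SeatTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()   # stands in for Paxos agreement

    def add_seat(self, seat_id):
        self._rows[seat_id] = {"status": "available", "reserved_by": None}

    def holder(self, seat_id):
        return self._rows[seat_id]["reserved_by"]

    def reserve_if_available(self, seat_id, user_id):
        # The check and the write happen as one indivisible step.
        with self._lock:
            row = self._rows[seat_id]
            if row["status"] != "available":
                return False            # comparable to [applied] = False
            row["status"] = "reserved"
            row["reserved_by"] = user_id
            return True                 # comparable to [applied] = True


def race_for_seat(table, seat_id, user_ids):
    """Let every user try to reserve the same seat from its own thread."""
    results = {}

    def attempt(user_id):
        results[user_id] = table.reserve_if_available(seat_id, user_id)

    threads = [threading.Thread(target=attempt, args=(u,)) for u in user_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

However many users race for seat A14, exactly one entry in the returned dictionary is `True`, and `table.holder("A14")` names that same user.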

row.applied — the LWT result flag

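Every LWT returns a result row with an [applied] boolean column, which the Python driver exposes as row.applied. True means the condition was met and the write committed. False means the condition failed and the row was not updated. On failure, the result row also carries the current values of the columns named in the IF clause (or the existing row for IF NOT EXISTS); any other column, such as reserved_by here, needs a separate read before you can show the user who already holds the seat.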

LWT latency: 28ms vs 2ms regular write

LWTs are 10–15× slower than regular Cassandra writes. Paxos requires four network round trips between replicas to reach consensus. In a cluster spread across regions, this latency compounds dramatically. Use LWTs only where the correctness guarantee is strictly necessary — not as a general-purpose locking mechanism.

Transactions in DynamoDB — TransactWriteItems

The scenario: You are building a flash sale system on DynamoDB. When a user purchases an item, you need to atomically: decrement the inventory count (only if stock > 0), and insert an order record. If the inventory decrement fails because stock is 0, the order must not be created. Both operations must be atomic across two separate DynamoDB tables.

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client('dynamodb', region_name='eu-west-1')

def purchase_item(product_id, user_id, order_id):
    try:
        dynamodb.transact_write_items(
            TransactItems=[
                # Operation 1: decrement inventory, only if stock > 0
                {
                    'Update': {
                        'TableName': 'inventory',
                        'Key': {'product_id': {'S': product_id}},
                        'UpdateExpression': 'SET stock = stock - :one',
                        'ConditionExpression': 'stock > :zero',
                        'ExpressionAttributeValues': {
                            ':one':  {'N': '1'},
                            ':zero': {'N': '0'}
                        }
                    }
                },
                # Operation 2: create the order record
                {
                    'Put': {
                        'TableName': 'orders',
                        'Item': {
                            'order_id':   {'S': order_id},
                            'user_id':    {'S': user_id},
                            'product_id': {'S': product_id},
                            'status':     {'S': 'confirmed'}
                        },
                        # Prevent duplicate orders for the same order_id
                        'ConditionExpression': 'attribute_not_exists(order_id)'
                    }
                }
            ]
        )
        return {"status": "purchased", "order_id": order_id}

    except ClientError as e:
        if e.response['Error']['Code'] == 'TransactionCanceledException':
            return {"status": "out_of_stock_or_duplicate"}
        raise

>>> purchase_item("prod_trainers_xl", "user_441", "ord_9912")
{'status': 'purchased', 'order_id': 'ord_9912'}
TransactWriteItems: 2 tables, both writes committed  ✓

>>> purchase_item("prod_trainers_xl", "user_882", "ord_9913")
{'status': 'out_of_stock_or_duplicate'}
TransactionCanceledException: stock condition failed
Inventory unchanged, order not created  ✓

transact_write_items — up to 100 operations, up to 4MB

DynamoDB's transaction API supports up to 100 operations across multiple tables in a single atomic call, with a total payload limit of 4MB. All operations succeed or all are cancelled. Unlike MongoDB, you do not need to manage a session — the entire transaction is expressed as a single API call. DynamoDB handles the two-phase commit internally.

ConditionExpression on both items

Both operations carry their own condition: the inventory decrement requires stock > 0, and the order insert requires attribute_not_exists(order_id) to prevent duplicate orders if the request is retried. If either condition fails, TransactionCanceledException is raised and neither write persists. Idempotent transactions — safe to retry — are essential for any payment flow.
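The example above reports a single combined status, but the error response can usually tell the two cases apart. The sketch below assumes the `ClientError` response for `TransactionCanceledException` carries a `CancellationReasons` list in the same order as `TransactItems` (inventory update first, order put second). The helper name `classify_cancellation` and the status strings are hypothetical.

```python
def classify_cancellation(error_response):
    """Map a TransactionCanceledException response to a caller-facing status.

    Assumes CancellationReasons is ordered like TransactItems:
    index 0 is the inventory update, index 1 is the order put.
    """
    reasons = error_response.get("CancellationReasons", [])
    codes = [reason.get("Code", "None") for reason in reasons]

    if len(codes) > 0 and codes[0] == "ConditionalCheckFailed":
        return "out_of_stock"        # stock > :zero did not hold
    if len(codes) > 1 and codes[1] == "ConditionalCheckFailed":
        return "duplicate_order"     # order_id already exists
    if "TransactionConflict" in codes:
        return "retry_later"         # another transaction touched the item
    return "cancelled"               # no recognised reason
```

Inside the `except ClientError as e:` branch this would be called as `classify_cancellation(e.response)`. A `duplicate_order` result on a retried request suggests the first attempt already went through, so the caller can treat it as success rather than as a failure.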

DynamoDB transaction cost — 2× read/write capacity units

DynamoDB charges twice the normal read/write capacity for transactional operations. A regular PutItem consuming 1 WCU costs 2 WCUs inside a transaction. At high throughput this doubles your DynamoDB bill instantly. Reserve transactions for operations where correctness is non-negotiable — not for writes where eventual consistency is acceptable.
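The doubling is easy to put into numbers. The sketch below assumes the usual provisioned-capacity rule of one write capacity unit per 1 KB of item size (rounded up, with 1 KB taken as 1024 bytes) for a standard write, and two per 1 KB for a transactional write. The function name `write_capacity_units` is hypothetical.

```python
import math


def write_capacity_units(item_size_bytes, transactional=False):
    """Estimate the WCUs consumed by writing one item."""
    # Item size is rounded up to the next whole KB, with a 1 KB minimum.
    size_kb = max(1, math.ceil(item_size_bytes / 1024))
    # Transactional writes consume twice the capacity of standard writes.
    return size_kb * (2 if transactional else 1)


def transaction_wcus(item_sizes_bytes):
    """Total WCUs for one TransactWriteItems call over the given item sizes."""
    return sum(write_capacity_units(size, transactional=True)
               for size in item_sizes_bytes)
```

For the flash sale example, a 300-byte inventory item and a 700-byte order item would each consume 1 WCU as plain writes, but together they consume 4 WCUs inside a transaction.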

The Saga Pattern — Transactions Without Transactions

When you need multi-step consistency across services or databases that do not support native transactions, the Saga pattern is the alternative. A saga breaks a multi-step operation into a sequence of local transactions. If a step fails, a compensating transaction undoes the preceding steps.

Saga — Order Fulfilment Across Three Services

Step 1: Reserve inventory  →  success
Step 2: Charge payment card  →  success
Step 3: Create shipping label  →  FAILS

Compensating transactions run in reverse:

Undo 2: Refund payment card
Undo 1: Release inventory reservation

The saga does not provide isolation — a concurrent reader could see the intermediate state (inventory reserved, payment charged, no label yet). It provides eventual consistency with explicit rollback logic.
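The control flow of a saga fits in a few lines. The sketch below is a minimal in-process orchestrator, assuming each step is a pair of plain Python callables. The names `run_saga` and `demo_order_saga` are hypothetical, and a production saga would also need durable state and retries for compensations that themselves fail.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, run the compensations of the steps that
    already succeeded, in reverse order. Returns (succeeded, log).
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undo {done_name}")
            return False, log
        log.append(f"{name}: ok")
        completed.append((name, compensate))
    return True, log


def demo_order_saga():
    """The three-step order fulfilment example, with step 3 failing."""
    state = {"reserved": False, "charged": False}

    def reserve():
        state["reserved"] = True

    def release():
        state["reserved"] = False

    def charge():
        state["charged"] = True

    def refund():
        state["charged"] = False

    def create_label():
        raise RuntimeError("label service unavailable")

    ok, log = run_saga([
        ("reserve inventory", reserve, release),
        ("charge card", charge, refund),
        ("create label", create_label, lambda: None),
    ])
    return ok, log, state
```

Between the charge succeeding and the refund running, `state` really does show a charged card with no label, which is the intermediate state a concurrent reader could observe.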

Transactions — Costs Compared

Database	Transaction support	Latency overhead	Scope	Use for
MongoDB	Full multi-document ACID	3–5× slower	Multiple collections, sharded clusters	Payments, transfers, audit trails
Cassandra LWT	Single-row compare-and-swap	10–15× slower (Paxos)	Single partition only	Seat reservations, unique usernames
DynamoDB	TransactWriteItems (up to 100 ops)	2× capacity cost	Multiple tables, same region	Flash sales, inventory + order atomicity
Neo4j	Full ACID per Cypher statement	Minimal — native to engine	Multiple nodes/relationships per query	Graph mutations needing consistency

Teacher's Note

The temptation after learning about MongoDB multi-document transactions is to wrap everything in a transaction "just to be safe." Resist it. A session that spends 60 seconds holding document-level locks is an availability problem waiting to happen. Use transactions for the 5% of operations that genuinely require all-or-nothing semantics — money movement, inventory allocation, audit records. The other 95% should stay lean, fast, and unlocked.

Practice Questions — You're the Engineer

Scenario:

A developer on your team implements a MongoDB payment transaction. During testing, they notice that when the transaction is aborted due to insufficient funds, the audit log entry is still being inserted — even though it is inside the try block. After reviewing the code you spot the bug immediately. The db.transactions.insert_one() call is missing something that would bind it to the active transaction. What is missing?

Scenario:

You implement a Cassandra lightweight transaction to register unique usernames: INSERT INTO users (username, user_id) VALUES ('alice', 'u_441') IF NOT EXISTS. Two users try to register "alice" simultaneously. You execute the LWT and get a result row back. Your code checks a specific boolean column in the result to determine whether the username was successfully claimed or was already taken. What is this column called?

Scenario:

Your order fulfilment saga has three steps: reserve inventory, charge the payment card, create a shipping label. The shipping label creation fails. Your saga framework automatically triggers a refund on the payment card and releases the inventory reservation. These reversal operations are triggered by the failure to undo the effects of the steps that already succeeded. What are these reversal operations called?


Up Next · Lesson 31

Consistency vs Availability

Every system makes a choice under partition — and most teams only discover which choice their database made when production breaks at 3 AM.
