MongoDB
Delete Documents
Deleting data is irreversible in MongoDB: once a document is gone, it cannot be recovered without a backup. PyMongo provides two core delete methods: delete_one() for removing a single matching document and delete_many() for removing every document that matches a filter. Both accept the same filter syntax used in find() and update_one(), which makes them easy to reason about. This lesson covers both methods, safe deletion patterns, soft deletes, and how to use find_one_and_delete() to atomically fetch and remove a document in one operation, all against the Dataplexa Store dataset.
delete_one() — Removing a Single Document
delete_one() finds the first document that matches the filter and removes it permanently. It returns a result object with a deleted_count property — either 1 if a document was found and deleted, or 0 if no match was found.
Why it exists: most application deletes target one specific record — a user closes their account, an admin removes a product, a job is dequeued. delete_one() is precise and safe — it will never accidentally remove more than one document even if multiple documents match the filter.
# delete_one() — remove the first matching document
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
# First check what we are about to delete
target = db.reviews.find_one({"_id": "r008"})
print("About to delete:", target)
# Delete the review
result = db.reviews.delete_one({"_id": "r008"})
print("Deleted count:", result.deleted_count)
# Confirm it is gone
gone = db.reviews.find_one({"_id": "r008"})
print("After delete:", gone)
# delete_one() with no match — returns 0, no error raised
result = db.reviews.delete_one({"_id": "r999"})
print("No match deleted_count:", result.deleted_count)

Deleted count: 1
After delete: None
No match deleted_count: 0
- deleted_count is 1 on success and 0 when no matching document was found; no exception is raised for a no-match
- Always identify the document with find_one() before deleting, to confirm you are targeting the right record
- Deleting by _id is the safest approach: it is unique and indexed, so there is zero ambiguity
- If multiple documents match the filter, delete_one() removes only the first match it finds, typically in natural order, which is not guaranteed to be stable; use delete_many() if you need all of them gone
delete_many() — Removing Multiple Documents
delete_many() removes every document that matches the filter in a single operation. It is the right tool for bulk cleanup — removing all cancelled orders, purging expired sessions, clearing test data.
Why it matters: deleting records one by one in a loop is slow and creates unnecessary round trips. delete_many() handles bulk removal efficiently in a single server-side operation.
# delete_many() — remove all matching documents
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
# First — see what we are about to remove
cancelled = list(db.orders.find({"status": "cancelled"}, {"_id": 1, "status": 1}))
print("Cancelled orders to delete:", [o["_id"] for o in cancelled])
# Delete all cancelled orders
result = db.orders.delete_many({"status": "cancelled"})
print(f"Deleted {result.deleted_count} cancelled order(s)")
# Confirm the remaining orders
remaining = db.orders.count_documents({})
print("Remaining orders:", remaining)
# Delete products with zero stock
result = db.products.delete_many({"stock": {"$lte": 0}})
print(f"\nOut-of-stock products deleted: {result.deleted_count}")
# Remove all documents from a collection — pass empty filter
# CAUTION: this deletes everything — use drop() if you want to remove the collection too
result = db.user_sessions.delete_many({})
print(f"User sessions cleared: {result.deleted_count}")

Deleted 1 cancelled order(s)
Remaining orders: 6
Out-of-stock products deleted: 0
User sessions cleared: 1
- Always run the equivalent find() with the same filter before delete_many(), and review what will be deleted first
- delete_many({}) with an empty filter removes every document in the collection; use it with extreme caution
- For removing a collection entirely, db.collection.drop() is faster than delete_many({}): it removes the collection and its indexes in one step
- delete_many() is not transactional by default: if it is interrupted partway through, some documents will have been deleted and some will not
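The preview-then-delete advice above can be wrapped in a small guard. The following is a minimal sketch, not part of the lesson's dataset code: guarded_delete_many and its max_docs parameter are hypothetical names, and the collection argument is assumed to behave like a PyMongo collection (only count_documents() and delete_many() are called).

```python
from typing import Any, Mapping


def guarded_delete_many(collection: Any, flt: Mapping[str, Any], max_docs: int) -> int:
    """Delete matching documents only if the filter is non-empty and the
    match count stays within max_docs. Returns the number deleted.

    `collection` is assumed to behave like a PyMongo Collection.
    """
    if not flt:
        # An empty filter matches every document, so refuse it outright.
        raise ValueError("refusing to run delete_many with an empty filter")

    # Preview: count what the same filter would remove.
    matched = collection.count_documents(flt)
    if matched > max_docs:
        raise RuntimeError(
            f"filter matches {matched} documents, more than the allowed {max_docs}"
        )

    result = collection.delete_many(flt)
    return result.deleted_count


# Usage against the lesson's database (assumes a local MongoDB server):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017/")["dataplexa"]
#   guarded_delete_many(db.orders, {"status": "cancelled"}, max_docs=10)
```

The count and the delete are two separate operations, so documents inserted in between can still be removed. The guard catches mistakes in the filter; it does not make the delete atomic.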
The Safe Delete Pattern
Deletes are permanent. A safe delete pattern has three core steps: verify the target with a query first, perform the delete, and confirm the result. Logging the document before it is removed adds an audit trail. For critical data, consider wrapping the steps in a transaction.
# Safe delete pattern — verify before deleting
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
def safe_delete_user(user_id: str) -> dict:
    """
    Delete a user only after confirming they exist.
    Returns a result dict with status and details.
    """
    # Step 1: verify the target exists
    user = db.users.find_one({"_id": user_id})
    if not user:
        return {"status": "not_found", "deleted": False}
    # Step 2: log what we are deleting (audit trail)
    print(f"Deleting user: {user['name']} ({user['email']})")
    # Step 3: perform the delete
    result = db.users.delete_one({"_id": user_id})
    # Step 4: confirm
    if result.deleted_count == 1:
        return {"status": "deleted", "deleted": True, "user": user["name"]}
    else:
        return {"status": "error", "deleted": False}

# Test the function
print(safe_delete_user("u006"))  # inserted in Lesson 11, should succeed
print(safe_delete_user("u999"))  # does not exist

{'status': 'deleted', 'deleted': True, 'user': 'Frank Rossi'}
{'status': 'not_found', 'deleted': False}
- Always fetch and log the document before deleting — creates an audit trail and confirms you have the right record
- In regulated industries (finance, healthcare) deletes must be logged to an audit collection before execution
- Consider returning the deleted document to the caller; the find_one_and_delete() method does this atomically
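Logging to an audit collection before the delete, as the list above recommends, might look like the sketch below. The function name delete_with_audit, the audit field names (deleted_id, deleted_by, deleted_at, snapshot), and the deletion_audit collection in the usage comment are all hypothetical. Both arguments are assumed to behave like PyMongo collections.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def delete_with_audit(collection: Any, audit: Any, doc_id: Any, actor: str) -> Optional[dict]:
    """Copy a document into an audit collection, then delete it.

    Returns the deleted document, or None if nothing matched.
    """
    doc = collection.find_one({"_id": doc_id})
    if doc is None:
        return None

    # Write the audit record first, so a crash between the two steps leaves
    # an extra audit entry rather than an unlogged delete.
    audit.insert_one({
        "deleted_id": doc_id,
        "deleted_by": actor,
        "deleted_at": datetime.now(timezone.utc),
        "snapshot": doc,
    })

    result = collection.delete_one({"_id": doc_id})
    return doc if result.deleted_count == 1 else None


# Usage (assumes a local MongoDB server and a hypothetical audit collection):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017/")["dataplexa"]
#   delete_with_audit(db.users, db.deletion_audit, "u006", actor="admin")
```

The two writes are not atomic here. If the audit entry and the delete must succeed or fail together, a transaction would be needed.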
find_one_and_delete() — Atomic Fetch and Remove
find_one_and_delete() finds a document, returns it, and deletes it — all in a single atomic operation. This is essential for queue processing where two workers must never receive the same job. It eliminates the race condition between a find_one() and a subsequent delete_one().
# find_one_and_delete() — atomic fetch then delete
from pymongo import MongoClient, ASCENDING
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
# Simulate a job queue — pop the next pending order for processing
# In a real queue, multiple workers call this simultaneously — atomicity prevents double-processing
def pop_next_order(status="processing"):
    """
    Atomically fetch and remove the next order with the given status.
    Returns the order document or None if the queue is empty.
    """
    order = db.orders.find_one_and_delete(
        {"status": status},
        sort=[("date", ASCENDING)]  # process oldest first
    )
    return order

# Pop the next processing order
order = pop_next_order("processing")
if order:
    print(f"Processing order: {order['_id']} user: {order['user_id']} total: ${order['total']}")
else:
    print("No processing orders in queue")

# Confirm it was removed
count_after = db.orders.count_documents({"status": "processing"})
print(f"Remaining processing orders: {count_after}")

Remaining processing orders: 0
- find_one_and_delete() returns the document as it was before deletion, which is useful for logging and processing
- The sort parameter ensures the operation is deterministic; without it, which document is deleted is undefined
- This is the correct pattern for building work queues and task processors; never use find_one() followed by delete_one() for this use case
- Returns None if no document matches, so always check before accessing the result
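A worker built on this pattern usually pops documents in a loop until the queue is empty. Below is a minimal sketch under assumptions: drain_queue and handle are hypothetical names, and the collection argument is assumed to behave like a PyMongo collection with status and date fields as in the orders example.

```python
from typing import Any, Callable


def drain_queue(collection: Any, status: str, handle: Callable[[dict], None]) -> int:
    """Pop and handle matching documents, oldest first, until none remain.

    Returns the number of documents processed.
    """
    processed = 0
    while True:
        job = collection.find_one_and_delete(
            {"status": status},
            sort=[("date", 1)],  # 1 is the value of pymongo.ASCENDING
        )
        if job is None:  # nothing left that matches
            return processed
        handle(job)
        processed += 1


# Usage (assumes a local MongoDB server):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017/")["dataplexa"]
#   drain_queue(db.orders, "processing", lambda order: print(order["_id"]))
```

One design consequence to keep in mind: the document is already deleted when handle() runs, so a handler that raises loses that job unless the caller stores it elsewhere first.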
Soft Deletes — Hiding Instead of Removing
A soft delete does not remove the document — it marks it as deleted with a flag field and excludes it from normal queries. This preserves history, supports undo, and satisfies audit requirements. It is the preferred pattern in any system where data must be recoverable or auditable.
# Soft delete pattern — mark as deleted rather than remove
from pymongo import MongoClient
from datetime import datetime, timezone
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
def soft_delete_review(review_id: str, deleted_by: str):
    """Mark a review as deleted without removing it from the database."""
    result = db.reviews.update_one(
        {"_id": review_id, "deleted": {"$exists": False}},  # not already deleted
        {"$set": {
            "deleted": True,
            "deleted_at": datetime.now(timezone.utc),
            "deleted_by": deleted_by
        }}
    )
    return result.modified_count == 1
# Soft delete review r007
success = soft_delete_review("r007", deleted_by="admin")
print("Soft deleted:", success)
# Normal query — exclude soft-deleted documents
active_reviews = db.reviews.find(
{"deleted": {"$exists": False}}, # only non-deleted reviews
{"_id": 1, "product_id": 1, "rating": 1}
)
print("\nActive reviews:")
for r in active_reviews:
    print(f" {r['_id']} product: {r['product_id']} rating: {r['rating']}")
# Admin query — include soft-deleted documents to see full history
all_reviews = db.reviews.find({}, {"_id": 1, "deleted": 1})
print("\nAll reviews including soft-deleted:")
for r in all_reviews:
    status = "deleted" if r.get("deleted") else "active"
    print(f" {r['_id']} - {status}")

Active reviews:
r001 product: p001 rating: 5
r002 product: p002 rating: 4
r003 product: p004 rating: 5
r004 product: p007 rating: 4
r005 product: p005 rating: 4
r006 product: p002 rating: 5
All reviews including soft-deleted:
r001 - active
r002 - active
r003 - active
r004 - active
r005 - active
r006 - active
r007 - deleted
- Always filter with {"deleted": {"$exists": False}} in application queries to exclude soft-deleted documents automatically
- Add an index on the deleted field, or better, use a partial index that covers only active documents. Partial index filters do not accept $exists: False, so that approach needs an explicit deleted: False value on active documents
- The trade-off: soft deletes grow your collection over time. Schedule a background job to hard-delete documents where deleted_at is older than your retention period
- Soft deletes support undelete: simply remove the deleted, deleted_at, and deleted_by fields with $unset
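The undelete and retention-purge steps from the list above can be sketched as follows. The names undelete, purge_filter, purge_soft_deleted, and retention_days are hypothetical, and the collection argument is assumed to behave like a PyMongo collection whose soft-deleted documents carry the deleted, deleted_at, and deleted_by fields set by soft_delete_review().

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional

# Removes the three fields that the soft delete sets.
UNDELETE_UPDATE = {"$unset": {"deleted": "", "deleted_at": "", "deleted_by": ""}}


def undelete(collection: Any, doc_id: Any) -> bool:
    """Restore a soft-deleted document. Returns True if one was restored."""
    result = collection.update_one({"_id": doc_id, "deleted": True}, UNDELETE_UPDATE)
    return result.modified_count == 1


def purge_filter(retention_days: int, now: Optional[datetime] = None) -> dict:
    """Build the filter for soft-deleted documents older than the retention period."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return {"deleted": True, "deleted_at": {"$lt": cutoff}}


def purge_soft_deleted(collection: Any, retention_days: int) -> int:
    """Hard-delete soft-deleted documents past retention. Returns the count removed."""
    return collection.delete_many(purge_filter(retention_days)).deleted_count


# Usage (assumes a local MongoDB server):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017/")["dataplexa"]
#   undelete(db.reviews, "r007")
#   purge_soft_deleted(db.reviews, retention_days=90)
```

Keeping the filter in its own function makes it easy to run the same filter through find() first and inspect what a purge would remove.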
Dropping a Collection vs Deleting All Documents
When you need to remove everything, choosing between delete_many({}) and drop() matters — they have different performance characteristics and different effects on indexes.
# drop() vs delete_many({}) — knowing when to use each
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
comparison = {
"delete_many({})": {
"removes_documents": True,
"removes_indexes": False, # indexes are preserved
"removes_collection": False,
"speed": "Slower — deletes document by document",
"use_when": "You want to clear data but keep the collection structure and indexes"
},
"collection.drop()": {
"removes_documents": True,
"removes_indexes": True, # indexes are removed
"removes_collection": True,
"speed": "Instant — removes all data files in one step",
"use_when": "You want to completely destroy and recreate the collection"
}
}
for method, details in comparison.items():
    print(f"\n{method}:")
    for k, v in details.items():
        print(f" {k:25} {v}")
# Example: drop and recreate for test data reset
# db.test_collection.drop()
# All documents and indexes removed instantly

delete_many({}):
removes_documents True
removes_indexes False
removes_collection False
speed Slower — deletes document by document
use_when You want to clear data but keep the collection structure and indexes
collection.drop():
removes_documents True
removes_indexes True
removes_collection True
speed Instant — removes all data files in one step
use_when You want to completely destroy and recreate the collection
- Use drop() for resetting test environments, clearing temp collections, or rebuilding from scratch
- Use delete_many({}) when you want to empty a collection but keep its indexes and validation rules intact
- After a drop(), re-creating the collection and its indexes is necessary before re-importing data
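Because drop() removes the indexes too, a reset routine usually recreates them straight away. Here is a minimal sketch, assuming a db object that behaves like a PyMongo database. reset_collection is a hypothetical helper, and the index keys in the usage comment are illustrative rather than taken from the Dataplexa dataset.

```python
from typing import Any, Iterable, Sequence, Tuple

IndexKeys = Sequence[Tuple[str, int]]


def reset_collection(db: Any, name: str, indexes: Iterable[IndexKeys]) -> None:
    """Drop a collection and recreate its indexes so it is ready for re-import."""
    db[name].drop()  # removes documents and indexes together
    for keys in indexes:
        # Creating an index also creates the collection if it does not exist yet.
        db[name].create_index(list(keys))


# Usage (assumes a local MongoDB server; index fields are illustrative):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017/")["dataplexa"]
#   reset_collection(db, "user_sessions", [[("user_id", 1)], [("expires_at", 1)]])
```

Validation rules and other collection options are not restored by this sketch; they would need to be reapplied separately.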
Summary Table
| Method | Removes | Returns | Best For |
|---|---|---|---|
| delete_one(filter) | First matching document | deleted_count (0 or 1) | Targeted single-record removal |
| delete_many(filter) | All matching documents | deleted_count (n) | Bulk cleanup, purging expired data |
| find_one_and_delete() | First matching document | The deleted document | Queue processing, atomic fetch-and-remove |
| Soft delete | Nothing (marks as deleted) | modified_count | Auditable systems, undo support |
| collection.drop() | All docs + indexes + collection | None | Reset environments, rebuild from scratch |
Practice Questions
Practice 1. What does delete_one() return when no document matches the filter?
Practice 2. Why is find_one_and_delete() safer than a find_one() followed by delete_one() for queue processing?
Practice 3. What are three advantages of using soft deletes over hard deletes?
Practice 4. What is the key difference between drop() and delete_many({}) when clearing a collection?
Practice 5. Write the filter for a normal application query that excludes soft-deleted documents.
Quiz
Quiz 1. If three documents match the filter in delete_one(), how many are removed?
Quiz 2. What does find_one_and_delete() return?
Quiz 3. What is the main disadvantage of soft deletes over time?
Quiz 4. Which delete method should you use to remove a whole collection including all its indexes in one step?
Quiz 5. Why is deleting by _id safer than deleting by another field like name?
Next up — Comparison Operators: Mastering $eq, $ne, $gt, $gte, $lt, $lte, $in, and $nin to build precise range and value queries.