MongoDB
Insert Documents
Inserting documents is the first write operation you perform in MongoDB. Every piece of data in your application — a new user signing up, a product added to a catalogue, an order placed — enters the database through an insert operation. MongoDB provides two methods: insertOne() for adding a single document and insertMany() for adding multiple documents in one efficient operation. This lesson covers both, explores how MongoDB handles the _id field, and shows how to deal with errors and duplicate keys using the Dataplexa Store dataset.
insertOne() — Adding a Single Document
insertOne() writes a single document to a collection and returns a result object confirming the operation and reporting the inserted document's _id.
Why it exists: individual inserts are the most common write operation in transactional applications (a user registers, an event fires, a log entry is created). insertOne() is the right tool whenever you add one document at a time.
Real-world use: a new customer completes registration on the Dataplexa Store — their profile document is inserted into the users collection immediately.
# insertOne() — add a single document to a collection
from pymongo import MongoClient
from datetime import datetime, timezone

client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]

# New user signing up to the Dataplexa Store
new_user = {
    "_id": "u006",
    "name": "Frank Rossi",
    "email": "frank@example.com",
    "age": 33,
    "city": "Rome",
    "country": "Italy",
    "membership": "basic",
    "joined": datetime.now(timezone.utc),
    "tags": []
}

result = db.users.insert_one(new_user)
print("Acknowledged:", result.acknowledged)
print("Inserted _id:", result.inserted_id)

Output:
Acknowledged: True
Inserted _id: u006
- result.acknowledged: True means the server confirmed the write was received and applied
- result.inserted_id: the _id of the newly inserted document
- If you omit the _id field, PyMongo generates an ObjectId automatically and adds it to your original dictionary in place
- The collection is created automatically if it does not already exist, so no setup is needed
Auto-Generated _id with ObjectId
When you do not supply an _id, a 12-byte ObjectId is generated for the document. With PyMongo, the driver creates it on the client before sending the insert and adds it to the original Python dict, a behaviour worth knowing about.
# Auto-generated ObjectId — MongoDB assigns _id automatically
from pymongo import MongoClient
from bson import ObjectId
from datetime import datetime, timezone

client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]

# No _id provided — MongoDB will generate one
new_product = {
    "name": "Ergonomic Chair",
    "category": "Furniture",
    "brand": "DeskPro",
    "price": 249.99,
    "stock": 15,
    "rating": 4.7,
    "tags": ["ergonomic", "adjustable"],
    "added_at": datetime.now(timezone.utc)
}

print("Before insert — _id in dict:", "_id" in new_product)
result = db.products.insert_one(new_product)

# PyMongo adds _id to the original dict after insert
print("After insert — _id in dict:", "_id" in new_product)
print("Auto _id:", new_product["_id"])
print("Type:", type(new_product["_id"]).__name__)
print("Created at:", new_product["_id"].generation_time)

Output:
Before insert — _id in dict: False
After insert — _id in dict: True
Auto _id: 64a1f2e3b4c5d6e7f8a9b0c1
Type: ObjectId
Created at: 2024-03-15 09:30:00+00:00
- PyMongo mutates the original dict after insert_one(): the _id key is added in place
- The ObjectId's generation_time property returns the time the ObjectId was created (to the second), which gives you an approximate insertion timestamp without an extra field
- If you insert the same dict object twice, the second insert will fail with a DuplicateKeyError because the _id was already set from the first insert
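The timestamp behind generation_time comes from the first 4 bytes of the ObjectId, which hold the creation time as big-endian seconds since the Unix epoch. The sketch below shows that layout with the standard library only. Both helper names are hypothetical, and the 8 trailing bytes are a made-up placeholder rather than a real random value and counter.

```python
from datetime import datetime, timezone


def objectid_hex_from_time(moment, tail_hex="b4c5d6e7f8a9b0c1"):
    """Build an ObjectId-shaped hex string whose first 4 bytes encode `moment`.

    `tail_hex` is a made-up placeholder for the remaining 8 bytes
    (5 random bytes plus a 3-byte counter in a real ObjectId).
    """
    seconds = int(moment.timestamp())
    return seconds.to_bytes(4, "big").hex() + tail_hex


def generation_time_from_hex(oid_hex):
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId hex string has 24 characters")
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = datetime(2024, 3, 15, 9, 30, tzinfo=timezone.utc)
oid_hex = objectid_hex_from_time(created)
decoded = generation_time_from_hex(oid_hex)
print(oid_hex[:8], decoded.isoformat())
```

In application code you would normally read new_product["_id"].generation_time directly. Decoding by hand is mainly useful for understanding why the value has one-second resolution.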
insertMany() — Adding Multiple Documents
insertMany() writes a list of documents to a collection in one operation, normally a single round trip to the server (the driver splits very large batches automatically). It is far more efficient than calling insertOne() in a loop, especially when seeding data, importing records, or processing batches.
Real-world use: bulk loading product inventory from a supplier feed, importing customer records from a CSV export, or seeding a database with test data.
# insertMany() — add multiple documents in one operation
from pymongo import MongoClient
from datetime import datetime, timezone

client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]

# Three new reviews to add at once
new_reviews = [
    {
        "_id": "r006",
        "product_id": "p002",
        "user_id": "u003",
        "rating": 5,
        "comment": "Absolutely love the tactile feedback.",
        "date": datetime(2024, 4, 10, tzinfo=timezone.utc)
    },
    {
        "_id": "r007",
        "product_id": "p003",
        "user_id": "u004",
        "rating": 3,
        "comment": "Good notebook but paper is a bit thin.",
        "date": datetime(2024, 4, 11, tzinfo=timezone.utc)
    },
    {
        "_id": "r008",
        "product_id": "p007",
        "user_id": "u001",
        "rating": 5,
        "comment": "Incredible colour accuracy for design work.",
        "date": datetime(2024, 4, 12, tzinfo=timezone.utc)
    },
]

result = db.reviews.insert_many(new_reviews)
print("Acknowledged:", result.acknowledged)
print("Inserted count:", len(result.inserted_ids))
print("Inserted IDs:", result.inserted_ids)

Output:
Acknowledged: True
Inserted count: 3
Inserted IDs: ['r006', 'r007', 'r008']
- result.inserted_ids: a list of _id values for every inserted document, in the same order as the input list
- insertMany() sends all documents in one network round trip, which is much faster than a loop of insertOne() calls for large batches
- By default, insertMany() inserts documents in order: if one fails, the remaining documents after it are not inserted
- Pass ordered=False to continue inserting remaining documents even when one fails
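When the source is a large feed or CSV export, it can help to cut it into fixed-size batches before handing each one to insert_many(), so the whole import never has to sit in memory at once. The helper below is a minimal sketch using only the standard library. The name chunked, the sample feed, and the batch sizes are illustrative assumptions, not part of PyMongo.

```python
from itertools import islice


def chunked(documents, batch_size):
    """Yield lists of at most `batch_size` documents from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical feed of product rows; in practice these might come from a CSV reader.
feed = ({"sku": f"sku-{n}", "stock": n} for n in range(5))
batches = list(chunked(feed, batch_size=2))
print([len(batch) for batch in batches])

# With a live connection, each batch would be passed to PyMongo, for example:
#     for batch in chunked(feed, batch_size=1000):
#         db.products.insert_many(batch)
```

Because chunked() accepts any iterable, it works with generators that read rows lazily. A suitable batch size depends on document size and available memory, so treat 1000 as a starting point rather than a recommendation.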
Ordered vs Unordered Inserts
The ordered parameter controls what happens when a batch insert encounters an error such as a duplicate _id. Understanding this saves you from silent data loss in bulk operations.
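The difference between the two modes can be modelled in plain Python before touching a server. The function below is a teaching model only: it tracks _id values in a set and mimics how duplicates are treated. It is not how the server or PyMongo is implemented, and simulate_insert_many is a hypothetical name.

```python
def simulate_insert_many(existing_ids, batch, ordered=True):
    """Model insert_many() duplicate handling on a plain set of _id values.

    Returns (inserted_ids, failed_indexes). Teaching model only.
    """
    seen = set(existing_ids)  # copy, so the caller's set is not modified
    inserted_ids = []
    failed_indexes = []
    for index, doc in enumerate(batch):
        if doc["_id"] in seen:
            failed_indexes.append(index)
            if ordered:
                break  # ordered: stop at the first failure
            continue  # unordered: skip this document and keep going
        seen.add(doc["_id"])
        inserted_ids.append(doc["_id"])
    return inserted_ids, failed_indexes


batch = [
    {"_id": "new_001", "name": "Doc A"},
    {"_id": "u001", "name": "Doc B"},  # assumed to exist already
    {"_id": "new_002", "name": "Doc C"},
]
existing = {"u001"}

ordered_result = simulate_insert_many(existing, batch, ordered=True)
unordered_result = simulate_insert_many(existing, batch, ordered=False)
print("ordered:  ", ordered_result)
print("unordered:", unordered_result)
```

In the ordered run, the document after the duplicate is never attempted. In the unordered run, only the duplicate itself is skipped.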
# ordered=True vs ordered=False — handling batch insert errors
from pymongo import MongoClient
from pymongo.errors import BulkWriteError

client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]

batch = [
    {"_id": "new_001", "name": "Doc A"},  # new — will succeed
    {"_id": "u001", "name": "Doc B"},     # duplicate — u001 already exists
    {"_id": "new_002", "name": "Doc C"},  # new — what happens to this?
]

# ordered=True (default) — stops at the first error
# Doc A inserts, Doc B fails, Doc C is NEVER attempted
try:
    db.users.insert_many(batch, ordered=True)
except BulkWriteError as e:
    print("ordered=True — stopped at error")
    print("  Inserted before error:", e.details["nInserted"])

# Remove Doc A again so the second run starts from the same state;
# otherwise new_001 would count as a second duplicate below
db.users.delete_many({"_id": {"$in": ["new_001", "new_002"]}})

# ordered=False — skips errors and continues
# Doc A inserts, Doc B fails (skipped), Doc C inserts
try:
    db.users.insert_many(batch, ordered=False)
except BulkWriteError as e:
    print("\nordered=False — continued after error")
    print("  Successfully inserted:", e.details["nInserted"])
    print("  Write errors:", len(e.details["writeErrors"]))

Output:
ordered=True — stopped at error
  Inserted before error: 1

ordered=False — continued after error
  Successfully inserted: 2
  Write errors: 1
- Use ordered=True (default) when insertion order matters and you want to stop at the first failure; documents inserted before the error are not rolled back, so use a transaction if partial inserts are unacceptable
- Use ordered=False for bulk imports where you want maximum throughput and can tolerate some failures
- BulkWriteError.details contains full information about how many documents were inserted and which ones failed
- Always wrap insertMany() in a try/except when working with data that may contain duplicates
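To act on BulkWriteError.details, a small helper can map each write error back to the document that caused it. The sketch below assumes the shape PyMongo reports: an nInserted count and a writeErrors list whose entries carry index, code, and errmsg. The sample dictionary is a hand-built, abbreviated stand-in for e.details, and summarize_bulk_error is a hypothetical name.

```python
DUPLICATE_KEY_CODE = 11000  # server error code for a duplicate key


def summarize_bulk_error(details, batch):
    """Split a failed batch into inserted count, duplicate _ids and other failures."""
    duplicates = []
    other_failures = []
    for error in details.get("writeErrors", []):
        # "index" is the position of the failing document in the submitted batch
        doc_id = batch[error["index"]].get("_id")
        if error.get("code") == DUPLICATE_KEY_CODE:
            duplicates.append(doc_id)
        else:
            other_failures.append((doc_id, error.get("errmsg", "")))
    return {
        "inserted": details.get("nInserted", 0),
        "duplicates": duplicates,
        "other_failures": other_failures,
    }


batch = [
    {"_id": "new_001", "name": "Doc A"},
    {"_id": "u001", "name": "Doc B"},
    {"_id": "new_002", "name": "Doc C"},
]

# Hand-built stand-in for e.details after an unordered insert (abbreviated)
sample_details = {
    "nInserted": 2,
    "writeErrors": [
        {"index": 1, "code": 11000, "errmsg": "E11000 duplicate key error"},
    ],
}

summary = summarize_bulk_error(sample_details, batch)
print(summary)
```

In real code you would call summarize_bulk_error(e.details, batch) inside the except BulkWriteError block, then log the duplicates and decide whether any other failures need a retry.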
Handling Duplicate Key Errors
Attempting to insert a document with an _id that already exists raises a DuplicateKeyError. Handling this gracefully is essential in real applications, for example when a user tries to register with an email address that is already taken (which requires a unique index on email).
# Handling DuplicateKeyError — graceful error handling
from pymongo import MongoClient
from pymongo.errors import DuplicateKeyError

client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]

def add_user(user_doc):
    try:
        result = db.users.insert_one(user_doc)
        print(f"User created — _id: {result.inserted_id}")
        return result.inserted_id
    except DuplicateKeyError as e:
        print(f"User already exists — _id '{user_doc['_id']}' is taken")
        return None

# First insert — succeeds
add_user({"_id": "u007", "name": "Grace Kim", "email": "grace@example.com"})

# Second insert — same _id, raises DuplicateKeyError
add_user({"_id": "u007", "name": "Grace Kim", "email": "grace@example.com"})

# Trying to re-insert an existing Dataplexa user
add_user({"_id": "u001", "name": "Alice Johnson", "email": "alice@example.com"})

Output:
User created — _id: u007
User already exists — _id 'u007' is taken
User already exists — _id 'u001' is taken
- DuplicateKeyError is a subclass of WriteError; import it from pymongo.errors
- DuplicateKeyError is also raised by unique indexes on non-_id fields, for example a unique index on email
- A common pattern is to catch it and return a user-friendly message rather than letting the error bubble up to the API
Inserting Documents with mongosh
All insert operations work identically in the mongosh shell — useful for quick data entry, testing, and administrative tasks without writing application code.
# Insert operations in mongosh — reference syntax
mongosh_inserts = {
    "insertOne": 'db.users.insertOne({ "_id": "u008", "name": "Hana Park", "city": "Seoul" })',
    "insertMany": """db.products.insertMany([
    { "name": "Desk Lamp", "category": "Furniture", "price": 39.99 },
    { "name": "Webcam HD", "category": "Electronics", "price": 79.99 }
  ])""",
    "check result": "db.users.countDocuments()",
    "ordered false": 'db.users.insertMany([...], { ordered: false })',
}

for operation, syntax in mongosh_inserts.items():
    print(f"── {operation} ──")
    print(f"  {syntax}")
    print()

Output:
── insertOne ──
  db.users.insertOne({ "_id": "u008", "name": "Hana Park", "city": "Seoul" })

── insertMany ──
  db.products.insertMany([
    { "name": "Desk Lamp", "category": "Furniture", "price": 39.99 },
    { "name": "Webcam HD", "category": "Electronics", "price": 79.99 }
  ])

── check result ──
  db.users.countDocuments()

── ordered false ──
  db.users.insertMany([...], { ordered: false })
- mongosh uses camelCase method names (insertOne, insertMany); PyMongo uses snake_case (insert_one, insert_many)
- The result object in mongosh shows acknowledged, plus insertedId (singular) for insertOne or insertedIds (plural) for insertMany
- Always follow an insert with countDocuments() or findOne() to verify the result when working interactively
Summary Table
| Method / Feature | Input or Trigger | Result | Key Option or Advice |
|---|---|---|---|
| insert_one() | Single dict | InsertOneResult with inserted_id | None |
| insert_many() | List of dicts | InsertManyResult with inserted_ids | ordered=True/False |
| Auto _id | Omit _id field | ObjectId generated and injected | None |
| DuplicateKeyError | Duplicate _id or unique field | Exception raised | Catch and handle gracefully |
| ordered=False | insert_many() option | Continues after errors | Use for bulk imports with expected duplicates |
Practice Questions
Practice 1. What property of the insertOne() result gives you the _id of the newly inserted document?
Practice 2. What does PyMongo do to the original dict after a successful insert_one() call when no _id was provided?
Practice 3. What is the difference between ordered=True and ordered=False in insertMany()?
Practice 4. What exception should you catch when inserting a document whose _id already exists in the collection?
Practice 5. Why is insertMany() more efficient than calling insertOne() in a loop?
Quiz
Quiz 1. What does result.acknowledged = True mean after an insert operation?
Quiz 2. What happens to a collection that does not exist when you call insert_one() on it?
Quiz 3. Which module in PyMongo contains DuplicateKeyError?
Quiz 4. What does result.inserted_ids return after a successful insertMany() call?
Quiz 5. In mongosh, which method name is the equivalent of PyMongo's insert_many()?
Next up — Find Documents: reading data from collections using find(), findOne(), filters, and projections.