MongoDB Lesson 11 – Insert Documents | Dataplexa

Insert Documents

Inserting documents is the first write operation you perform in MongoDB. Every piece of data in your application — a new user signing up, a product added to a catalogue, an order placed — enters the database through an insert operation. MongoDB provides two methods: insertOne() for adding a single document and insertMany() for adding multiple documents in one efficient operation. This lesson covers both, explores how MongoDB handles the _id field, and shows how to deal with errors and duplicate keys using the Dataplexa Store dataset.

insertOne() — Adding a Single Document

insertOne() writes a single document to a collection and returns a result object confirming the operation and reporting the inserted document's _id.

Why it exists: individual inserts are the most common write operation in transactional applications: a user registers, an event fires, a log entry is created. insertOne() is the right tool whenever you add one document at a time.

Real-world use: a new customer completes registration on the Dataplexa Store — their profile document is inserted into the users collection immediately.

# insertOne() — add a single document to a collection

from pymongo import MongoClient
from datetime import datetime, timezone

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# New user signing up to the Dataplexa Store
new_user = {
    "_id":        "u006",
    "name":       "Frank Rossi",
    "email":      "frank@example.com",
    "age":        33,
    "city":       "Rome",
    "country":    "Italy",
    "membership": "basic",
    "joined":     datetime.now(timezone.utc),
    "tags":       []
}

result = db.users.insert_one(new_user)

print("Acknowledged:", result.acknowledged)
print("Inserted _id:", result.inserted_id)
Acknowledged: True
Inserted _id: u006
  • result.acknowledged: True means the server acknowledged the write under the configured write concern
  • result.inserted_id — the _id of the newly inserted document
  • If you omit the _id field, PyMongo generates an ObjectId before sending the document and adds it to your original dictionary in place
  • The collection is created automatically if it does not already exist — no setup needed

Auto-Generated _id with ObjectId

When you do not supply an _id, a 12-byte ObjectId is generated for the document. With PyMongo, the driver creates it on the client before sending the insert and mutates the original Python dict to include it, a behaviour worth knowing.

# Auto-generated ObjectId: the driver assigns _id automatically

from pymongo import MongoClient
from bson import ObjectId
from datetime import datetime, timezone

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# No _id provided: PyMongo will generate an ObjectId before sending
new_product = {
    "name":     "Ergonomic Chair",
    "category": "Furniture",
    "brand":    "DeskPro",
    "price":    249.99,
    "stock":    15,
    "rating":   4.7,
    "tags":     ["ergonomic", "adjustable"],
    "added_at": datetime.now(timezone.utc)
}

print("Before insert — _id in dict:", "_id" in new_product)

result = db.products.insert_one(new_product)

# PyMongo adds _id to the original dict after insert
print("After insert  — _id in dict:", "_id" in new_product)
print("Auto _id:", new_product["_id"])
print("Type:", type(new_product["_id"]).__name__)
print("Created at:", new_product["_id"].generation_time)
Before insert — _id in dict: False
After insert — _id in dict: True
Auto _id: 64a1f2e3b4c5d6e7f8a9b0c1
Type: ObjectId
Created at: 2024-03-15 09:30:00+00:00
  • PyMongo mutates the original dict after insert_one() — the _id key is added in place
  • The ObjectId's generation_time property returns its creation time (to the second, in UTC), which closely approximates the insertion time, so a separate created-at field is often unnecessary
  • If you insert the same dict object twice, the second insert will fail with a DuplicateKeyError because the _id was already set from the first insert
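
The timestamp behind generation_time comes from the ObjectId layout: the first 4 of its 12 bytes hold the creation time as big-endian seconds since the Unix epoch. The following is a minimal sketch using only the standard library; objectid_timestamp is a hypothetical helper name, and the sample id is built by hand for illustration rather than taken from a real insert.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    raw = bytes.fromhex(oid_hex)
    # Bytes 0-3: big-endian seconds since the Unix epoch
    seconds = int.from_bytes(raw[:4], byteorder="big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Illustrative id: a known timestamp followed by 8 zero bytes
# (a real ObjectId carries a random value and a counter in those bytes)
known = datetime(2024, 3, 15, 9, 30, tzinfo=timezone.utc)
sample_hex = format(int(known.timestamp()), "08x") + "00" * 8

print(objectid_timestamp(sample_hex))  # 2024-03-15 09:30:00+00:00
```

In application code you would normally read new_product["_id"].generation_time directly; the sketch only shows where that value comes from.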

insertMany() — Adding Multiple Documents

insertMany() writes a list of documents to a collection in a single call, normally one round trip to the server (the driver splits very large batches into several messages automatically). It is far more efficient than calling insertOne() in a loop, especially when seeding data, importing records, or processing batches.

Real-world use: bulk loading product inventory from a supplier feed, importing customer records from a CSV export, or seeding a database with test data.

# insertMany() — add multiple documents in one operation

from pymongo import MongoClient
from datetime import datetime, timezone

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# Three new reviews to add at once
new_reviews = [
    {
        "_id":        "r006",
        "product_id": "p002",
        "user_id":    "u003",
        "rating":     5,
        "comment":    "Absolutely love the tactile feedback.",
        "date":       datetime(2024, 4, 10, tzinfo=timezone.utc)
    },
    {
        "_id":        "r007",
        "product_id": "p003",
        "user_id":    "u004",
        "rating":     3,
        "comment":    "Good notebook but paper is a bit thin.",
        "date":       datetime(2024, 4, 11, tzinfo=timezone.utc)
    },
    {
        "_id":        "r008",
        "product_id": "p007",
        "user_id":    "u001",
        "rating":     5,
        "comment":    "Incredible colour accuracy for design work.",
        "date":       datetime(2024, 4, 12, tzinfo=timezone.utc)
    },
]

result = db.reviews.insert_many(new_reviews)

print("Acknowledged:", result.acknowledged)
print("Inserted count:", len(result.inserted_ids))
print("Inserted IDs:", result.inserted_ids)
Acknowledged: True
Inserted count: 3
Inserted IDs: ['r006', 'r007', 'r008']
  • result.inserted_ids — a list of _id values for every inserted document, in the same order as the input list
  • insertMany() sends the documents together, typically in one network round trip, which is much faster than a loop of insertOne() calls for large batches
  • By default, insertMany() inserts documents in order — if one fails, the remaining documents after it are not inserted
  • Pass ordered=False to continue inserting remaining documents even when one fails
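
When the source is a large file or feed, it can help to pass documents to insert_many() in fixed-size batches so that memory use stays bounded and progress can be reported. The sketch below is one way to do that with the standard library; chunked is a hypothetical helper, and the batch size of 1000 in the trailing comment is an arbitrary assumption, not a recommended value.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Five illustrative documents split into batches of two
docs = [{"_id": f"p{i:03d}"} for i in range(1, 6)]
batches = list(chunked(docs, 2))
print([len(batch) for batch in batches])  # [2, 2, 1]

# With a live connection, each batch would go to the server in turn:
# for batch in chunked(docs, 1000):
#     db.products.insert_many(batch, ordered=False)
```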

Ordered vs Unordered Inserts

The ordered parameter controls what happens when a batch insert encounters an error such as a duplicate _id. Understanding this saves you from silent data loss in bulk operations.

# ordered=True vs ordered=False — handling batch insert errors

from pymongo import MongoClient
from pymongo.errors import BulkWriteError

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

batch = [
    {"_id": "new_001", "name": "Doc A"},   # new — will succeed
    {"_id": "u001",    "name": "Doc B"},   # duplicate — u001 already exists
    {"_id": "new_002", "name": "Doc C"},   # new — what happens to this?
]

# ordered=True (default) — stops at the first error
# Doc A inserts, Doc B fails, Doc C is NEVER attempted
try:
    db.users.insert_many(batch, ordered=True)
except BulkWriteError as e:
    print("ordered=True  — stopped at error")
    print("  Inserted before error:", e.details["nInserted"])

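# Remove Doc A again so the second run starts from the same state as the first
db.users.delete_one({"_id": "new_001"})

# ordered=False: skips errors and continues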
# Doc A inserts, Doc B fails (skipped), Doc C inserts
try:
    db.users.insert_many(batch, ordered=False)
except BulkWriteError as e:
    print("\nordered=False — continued after error")
    print("  Successfully inserted:", e.details["nInserted"])
    print("  Write errors:         ", len(e.details["writeErrors"]))
ordered=True — stopped at error
Inserted before error: 1

ordered=False — continued after error
Successfully inserted: 2
Write errors: 1
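  • Use ordered=True (default) when documents depend on each other and must be applied in sequence; documents inserted before the failure stay in the collection, so this is not an all-or-nothing guarantee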
  • Use ordered=False for bulk imports where you want maximum throughput and can tolerate some failures
  • BulkWriteError.details contains full information about which documents succeeded and which failed
  • Always wrap insertMany() in a try/except when working with data that may contain duplicates
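
To act on a failed batch you usually need to know which documents were written, which were rejected, and which were never attempted. Each entry in details["writeErrors"] carries an index field giving the position of the failing document in the input list. The sketch below uses that to sort a batch into three groups; classify_batch is a hypothetical helper, and the details dictionary is hand-written to mirror the shape of BulkWriteError.details rather than captured from a real server.

```python
from typing import Any, Dict, List


def classify_batch(batch: List[dict], details: Dict[str, Any],
                   ordered: bool = True) -> Dict[str, List[Any]]:
    """Group the _id values of a batch by what happened to each document."""
    failed_idx = {err["index"] for err in details.get("writeErrors", [])}
    if ordered and failed_idx:
        # An ordered insert stops at the first error: later docs are not attempted
        skipped_idx = set(range(min(failed_idx) + 1, len(batch)))
    else:
        skipped_idx = set()

    groups: Dict[str, List[Any]] = {"inserted": [], "failed": [], "skipped": []}
    for position, doc in enumerate(batch):
        if position in failed_idx:
            groups["failed"].append(doc["_id"])
        elif position in skipped_idx:
            groups["skipped"].append(doc["_id"])
        else:
            groups["inserted"].append(doc["_id"])
    return groups


batch = [{"_id": "new_001"}, {"_id": "u001"}, {"_id": "new_002"}]

# Hand-written stand-in for e.details; 11000 is the duplicate key error code
sample_details = {
    "nInserted": 1,
    "writeErrors": [{"index": 1, "code": 11000, "errmsg": "duplicate key (illustrative)"}],
}

print(classify_batch(batch, sample_details, ordered=True))
print(classify_batch(batch, sample_details, ordered=False))
```

Inside a real except BulkWriteError as e block, you would pass e.details and the same ordered value you gave to insert_many().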

Handling Duplicate Key Errors

Attempting to insert a document with an _id that already exists raises a DuplicateKeyError. Handling this gracefully is essential in real applications, for example when a user tries to register with an _id (or, given a unique index, an email) that is already taken.

# Handling DuplicateKeyError — graceful error handling

from pymongo import MongoClient
from pymongo.errors import DuplicateKeyError

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

def add_user(user_doc):
    try:
        result = db.users.insert_one(user_doc)
        print(f"User created — _id: {result.inserted_id}")
        return result.inserted_id

    except DuplicateKeyError as e:
        print(f"User already exists — _id '{user_doc['_id']}' is taken")
        return None

# First insert — succeeds
add_user({"_id": "u007", "name": "Grace Kim", "email": "grace@example.com"})

# Second insert — same _id, raises DuplicateKeyError
add_user({"_id": "u007", "name": "Grace Kim", "email": "grace@example.com"})

# Trying to re-insert an existing Dataplexa user
add_user({"_id": "u001", "name": "Alice Johnson", "email": "alice@example.com"})
User created — _id: u007
User already exists — _id 'u007' is taken
User already exists — _id 'u001' is taken
  • DuplicateKeyError is a subclass of WriteError — import it from pymongo.errors
  • DuplicateKeyError is also raised by unique indexes on non-_id fields — for example a unique index on email
  • A common pattern is to catch it and return a user-friendly message rather than letting the error bubble up to the API

Inserting Documents with mongosh

The same insert operations are available in the mongosh shell, which is useful for quick data entry, testing, and administrative tasks without writing application code.

# Insert operations in mongosh — reference syntax

mongosh_inserts = {
    "insertOne": 'db.users.insertOne({ "_id": "u008", "name": "Hana Park", "city": "Seoul" })',

    "insertMany": """db.products.insertMany([
  { "name": "Desk Lamp", "category": "Furniture", "price": 39.99 },
  { "name": "Webcam HD", "category": "Electronics", "price": 79.99 }
])""",

    "check result": "db.users.countDocuments()",

    "ordered false": 'db.users.insertMany([...], { ordered: false })',
}

for operation, syntax in mongosh_inserts.items():
    print(f"── {operation} ──")
    print(f"  {syntax}")
    print()
── insertOne ──
db.users.insertOne({ "_id": "u008", "name": "Hana Park", "city": "Seoul" })

── insertMany ──
db.products.insertMany([
{ "name": "Desk Lamp", "category": "Furniture", "price": 39.99 },
{ "name": "Webcam HD", "category": "Electronics", "price": 79.99 }
])

── check result ──
db.users.countDocuments()

── ordered false ──
db.users.insertMany([...], { ordered: false })
  • mongosh uses camelCase method names: insertOne, insertMany — PyMongo uses snake_case: insert_one, insert_many
  • The result object in mongosh shows acknowledged, insertedId (singular), and insertedIds (plural for many)
  • Always follow an insert with countDocuments() or findOne() to verify the result when working interactively

Summary Table

Method | Input | Returns | Key Option
insert_one() | Single dict | InsertOneResult (inserted_id) | -
insert_many() | List of dicts | InsertManyResult (inserted_ids) | ordered=True/False
Auto _id | Omit _id field | ObjectId generated and injected | -
DuplicateKeyError | Duplicate _id or unique field | Exception raised | Catch and handle gracefully
ordered=False | insert_many() option | Continues after errors | Use for bulk imports with expected duplicates

Practice Questions

Practice 1. What property of the insertOne() result gives you the _id of the newly inserted document?



Practice 2. What does PyMongo do to the original dict after a successful insert_one() call when no _id was provided?



Practice 3. What is the difference between ordered=True and ordered=False in insertMany()?



Practice 4. What exception should you catch when inserting a document whose _id already exists in the collection?



Practice 5. Why is insertMany() more efficient than calling insertOne() in a loop?



Quiz

Quiz 1. What does result.acknowledged = True mean after an insert operation?






Quiz 2. What happens to a collection that does not exist when you call insert_one() on it?






Quiz 3. Which module in PyMongo contains DuplicateKeyError?






Quiz 4. What does result.inserted_ids return after a successful insertMany() call?






Quiz 5. In mongosh, which method name is the equivalent of PyMongo's insert_many()?






Next up — Find Documents: reading data from collections using find(), findOne(), filters, and projections.