MongoDB Lesson 20 – Array Queries | Dataplexa

Array Queries

Arrays are one of MongoDB's most powerful features: a single field can hold a list of values, embedded documents, or a mix of both. The Dataplexa Store uses arrays throughout: users have a tags array, products have a tags array, and orders have an items array of embedded line-item documents. Querying into arrays relies on a specific set of techniques: plain element matching, $all, $elemMatch, $size, and positional operators. Each solves a different problem, and knowing when to use which is essential for working with any real MongoDB dataset.

Plain Element Match — Does the Array Contain This Value?

The simplest array query uses no special operator at all. When you filter on an array field with a plain value, MongoDB returns every document whose array contains that value anywhere in the array, regardless of position. The same filter syntax works whether the field holds a single value or an array.

# Plain element match — array contains a specific value

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# Find users whose tags array contains "newsletter"
print("Users tagged 'newsletter':")
results = db.users.find(
    {"tags": "newsletter"},
    {"name": 1, "tags": 1, "_id": 0}
)
for u in results:
    print(f"  {u['name']:15} tags: {u['tags']}")

# Find products whose tags contain "bestseller"
print("\nProducts tagged 'bestseller':")
results = db.products.find(
    {"tags": "bestseller"},
    {"name": 1, "tags": 1, "_id": 0}
)
for p in results:
    print(f"  {p['name']:25} tags: {p['tags']}")

# Dot notation reaches into the sub-documents stored in an array
# Find orders containing an item with a specific product_id
print("\nOrders containing product p002:")
results = db.orders.find(
    {"items.product_id": "p002"},
    {"_id": 1, "user_id": 1}
)
for o in results:
    print(f"  {o['_id']}  user: {o['user_id']}")

Output:
Users tagged 'newsletter':
Bob Smith tags: ['newsletter']
Alice Johnson tags: ['early_adopter', 'newsletter']
Eva Müller tags: ['early_adopter', 'newsletter']

Products tagged 'bestseller':
Wireless Mouse tags: ['wireless', 'bestseller']
Mechanical Keyboard tags: ['mechanical', 'bestseller', 'rgb']

Orders containing product p002:
o002 user: u002
o006 user: u005
  • Plain element match uses dot notation for arrays of sub-documents: "items.product_id" checks the product_id field inside every element of items
  • Position in the array does not matter: a match on any element is enough
  • A standard index on an array field is a multikey index: MongoDB creates one index entry per element, so plain element match queries can use it
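
The matching rule described in this section can be modelled in a few lines of plain Python. This is only a sketch of the semantics, not how MongoDB evaluates queries internally; the function name matches_plain and the sample documents are hypothetical and exist purely for illustration.

```python
# Pure-Python model of a plain element match (illustrative only).

def matches_plain(doc, field, value):
    """Return True if doc[field] equals value, or is a list containing value."""
    stored = doc.get(field)
    if isinstance(stored, list):
        # Array field: a match on ANY element is enough, position is ignored
        return value in stored
    # Scalar (or missing) field: ordinary equality
    return stored == value


# Hypothetical sample documents shaped like the users collection
users = [
    {"name": "Bob Smith", "tags": ["newsletter"]},
    {"name": "Alice Johnson", "tags": ["early_adopter", "newsletter"]},
    {"name": "David Lee", "tags": []},
]

tagged = [u["name"] for u in users if matches_plain(u, "tags", "newsletter")]
print(tagged)  # ['Bob Smith', 'Alice Johnson']
```

The same function handles a scalar field and an array field, which mirrors why the filter {"tags": "newsletter"} needs no special operator.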

$all — Array Must Contain All Listed Values

$all returns documents where the array field contains every value in the provided list. The array may contain other values too — $all only checks that all listed values are present, in any order.

Why it matters: a plain element match checks for one value. $all checks for multiple values simultaneously — "find users who have both the early_adopter AND the newsletter tag".

# $all — array must contain every listed value

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# Users who have BOTH early_adopter AND newsletter tags
print("Users with both 'early_adopter' AND 'newsletter' tags ($all):")
results = db.users.find(
    {"tags": {"$all": ["early_adopter", "newsletter"]}},
    {"name": 1, "tags": 1, "_id": 0}
)
for u in results:
    print(f"  {u['name']} — {u['tags']}")

# Products tagged with both 'mechanical' and 'rgb'
print("\nProducts tagged 'mechanical' AND 'rgb':")
results = db.products.find(
    {"tags": {"$all": ["mechanical", "rgb"]}},
    {"name": 1, "tags": 1, "_id": 0}
)
for p in results:
    print(f"  {p['name']} — {p['tags']}")

# $all with a single value is equivalent to a plain element match
single_all  = db.users.count_documents({"tags": {"$all": ["newsletter"]}})
plain_match = db.users.count_documents({"tags": "newsletter"})
print(f"\n$all single value: {single_all}  |  plain match: {plain_match}  |  Same: {single_all == plain_match}")

Output:
Users with both 'early_adopter' AND 'newsletter' tags ($all):
Alice Johnson — ['early_adopter', 'newsletter']
Eva Müller — ['early_adopter', 'newsletter']

Products tagged 'mechanical' AND 'rgb':
Mechanical Keyboard — ['mechanical', 'bestseller', 'rgb']

$all single value: 3 | plain match: 3 | Same: True
  • Order of values in the $all list does not matter — MongoDB checks for presence, not position
  • The array may contain additional values beyond those listed — $all does not require an exact match of the whole array
  • To find arrays that contain exactly a specific set of values with no extras, use a plain equality match on the whole array: {"tags": ["a", "b"]} — order and content must match exactly
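
To make the difference between $all and whole-array equality concrete, here is a small pure-Python model of both rules. It is a sketch of the semantics only; matches_all and matches_exact are hypothetical helper names, not part of pymongo or MongoDB.

```python
# Pure-Python model of $all versus whole-array equality (illustrative only).

def matches_all(array, required):
    """$all: every required value is present; order and extras are ignored."""
    return all(value in array for value in required)


def matches_exact(array, expected):
    """Whole-array equality: same elements, same order, no extras."""
    return array == expected


keyboard_tags = ["mechanical", "bestseller", "rgb"]

print(matches_all(keyboard_tags, ["rgb", "mechanical"]))    # True
print(matches_exact(keyboard_tags, ["rgb", "mechanical"]))  # False
print(matches_exact(["a", "b"], ["b", "a"]))                # False, order differs
```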

$elemMatch — Multiple Conditions on the Same Array Element

$elemMatch is essential when you need multiple conditions to apply to the same single element within an array of sub-documents. Without it, each condition can be satisfied by different elements — which produces false positives.

Real-world use: find orders that contain a specific product bought in a quantity greater than 1 — both conditions must be true for the same item in the items array.

# $elemMatch — all conditions apply to the same array element

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# Find orders where a single item is product p003 AND qty >= 2
print("Orders with p003 bought in qty >= 2 ($elemMatch):")
results = db.orders.find(
    {"items": {"$elemMatch": {
        "product_id": "p003",
        "qty":        {"$gte": 2}
    }}},
    {"_id": 1, "user_id": 1, "items": 1}
)
for o in results:
    matching = [i for i in o["items"] if i["product_id"] == "p003"]
    print(f"  {o['_id']}  user: {o['user_id']}  p003 items: {matching}")

# WHY $elemMatch matters — the false positive problem
print("\nWithout $elemMatch (may match across different elements):")
# This filter checks items.product_id == p003 AND items.qty >= 2
# but those conditions can be satisfied by DIFFERENT items in the array
without = db.orders.find(
    {
        "items.product_id": "p003",
        "items.qty": {"$gte": 2}
    },
    {"_id": 1, "items": 1}
)
for o in without:
    print(f"  {o['_id']}  all items: {[(i['product_id'], i['qty']) for i in o['items']]}")

# $elemMatch on a simple array of numbers
# Create a test document with a scores array
db.test_scores.drop()
db.test_scores.insert_many([
    {"student": "Ana",   "scores": [45, 72, 88]},
    {"student": "Ben",   "scores": [30, 55, 61]},
    {"student": "Carol", "scores": [80, 90, 95]},
])

# Find students who have at least one score between 70 and 85
print("\nStudents with a score between 70 and 85 ($elemMatch):")
results = db.test_scores.find(
    {"scores": {"$elemMatch": {"$gte": 70, "$lte": 85}}},
    {"student": 1, "scores": 1, "_id": 0}
)
for s in results:
    print(f"  {s['student']} — {s['scores']}")

db.test_scores.drop()

Output:
Orders with p003 bought in qty >= 2 ($elemMatch):
o001 user: u001 p003 items: [{'product_id': 'p003', 'qty': 3, 'price': 4.99}]
o007 user: u002 p003 items: [{'product_id': 'p003', 'qty': 2, 'price': 4.99}]

Without $elemMatch (may match across different elements):
o001 all items: [('p001', 1), ('p003', 3)]
o007 all items: [('p006', 2), ('p003', 2)]

Students with a score between 70 and 85 ($elemMatch):
Ana — [45, 72, 88]
Carol — [80, 90, 95]
  • Without $elemMatch, conditions on array sub-document fields are checked independently across all elements — they can be satisfied by different elements
  • $elemMatch on a plain array of scalars applies multiple operators to the same single element — essential for range queries on score arrays and similar structures
  • When querying a single condition on a sub-document field, a plain dot notation query is sufficient — $elemMatch is only needed for multiple conditions on the same element
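
In the sample data, both filters happen to return the same orders, so the false positive is easy to miss. The sketch below models the two rules in plain Python and runs them against a hypothetical order (not part of the Dataplexa dataset) in which p003 was bought once and a different product was bought five times. The function names are illustrative assumptions.

```python
# Pure-Python model of dot notation versus $elemMatch (illustrative only).

def dot_notation_match(items):
    """Each condition may be satisfied by a DIFFERENT element."""
    has_p003 = any(i["product_id"] == "p003" for i in items)
    has_qty_2 = any(i["qty"] >= 2 for i in items)
    return has_p003 and has_qty_2


def elem_match(items):
    """Both conditions must hold for the SAME element."""
    return any(i["product_id"] == "p003" and i["qty"] >= 2 for i in items)


# Hypothetical order: p003 bought once, another product bought five times
mixed_order = [
    {"product_id": "p003", "qty": 1},
    {"product_id": "p006", "qty": 5},
]

print(dot_notation_match(mixed_order))  # True  (false positive)
print(elem_match(mixed_order))          # False
```

The dot-notation version accepts the order because one element supplies the product_id and another supplies the quantity; the $elemMatch version rejects it.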

$size — Match by Array Length

$size matches documents where an array field has exactly the specified number of elements. It is useful for finding documents with empty arrays, exactly one value, or a specific count of items.

# $size — match documents by array length

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# Users with exactly 0 tags (empty array)
print("Users with no tags ($size: 0):")
results = db.users.find(
    {"tags": {"$size": 0}},
    {"name": 1, "tags": 1, "_id": 0}
)
for u in results:
    print(f"  {u['name']} — tags: {u['tags']}")

# Users with exactly 2 tags
print("\nUsers with exactly 2 tags ($size: 2):")
results = db.users.find(
    {"tags": {"$size": 2}},
    {"name": 1, "tags": 1, "_id": 0}
)
for u in results:
    print(f"  {u['name']} — {u['tags']}")

# Orders with exactly 2 line items
print("\nOrders with exactly 2 items ($size: 2):")
results = db.orders.find(
    {"items": {"$size": 2}},
    {"_id": 1, "user_id": 1, "items": 1}
)
for o in results:
    product_ids = [i["product_id"] for i in o["items"]]
    print(f"  {o['_id']}  user: {o['user_id']}  products: {product_ids}")

# $size limitation — cannot use range operators
# WRONG: {"tags": {"$size": {"$gte": 1}}}  ← raises an error
# Workaround: use $exists on index 0 to check for non-empty arrays
has_tags = db.users.count_documents({"tags.0": {"$exists": True}})
print(f"\nUsers with at least one tag (workaround): {has_tags}")

Output:
Users with no tags ($size: 0):
David Lee — tags: []

Users with exactly 2 tags ($size: 2):
Alice Johnson — ['early_adopter', 'newsletter']
Eva Müller — ['early_adopter', 'newsletter']

Orders with exactly 2 items ($size: 2):
o001 user: u001 products: ['p001', 'p003']
o005 user: u004 products: ['p007', 'p001']
o007 user: u002 products: ['p006', 'p003']

Users with at least one tag (workaround): 4
  • $size only accepts an exact integer — it cannot be combined with $gt, $lt, or other range operators
  • To check whether an array is non-empty, use {"array.0": {"$exists": true}} — this checks whether index 0 exists
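  • To find arrays with more than N elements, use $expr with the aggregation $size operator, for example {"$expr": {"$gt": [{"$size": "$tags"}, 2]}}, or test that the element at index N exists, for example {"tags.2": {"$exists": true}} for more than 2 elements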
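
The two length workarounds can be wrapped in small filter builders. This is a minimal sketch: longer_than and at_least are hypothetical helper names, and the $ifNull guard is an addition that only covers missing or null fields (the aggregation $size operator still errors on other non-array values).

```python
# Filter builders for array-length checks (hypothetical helpers).

def longer_than(field, n):
    """Filter for documents whose array `field` has more than n elements."""
    return {
        "$expr": {
            "$gt": [
                # $ifNull substitutes [] when the field is missing or null,
                # because the aggregation $size operator requires an array
                {"$size": {"$ifNull": [f"${field}", []]}},
                n,
            ]
        }
    }


def at_least(field, n):
    """Filter for arrays with at least n elements (n >= 1) via an index check."""
    return {f"{field}.{n - 1}": {"$exists": True}}


print(longer_than("tags", 1))
print(at_least("items", 2))

# With a live connection (not run here):
# db.users.count_documents(longer_than("tags", 1))
```

The index-existence form is usually the simpler choice for a fixed minimum; the $expr form is more flexible when the length is compared against another expression.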

Querying Specific Array Positions

MongoDB lets you query a specific index position in an array using dot notation with a numeric index. This is useful when the position in the array is meaningful — for example, the first tag is always the primary category, or the first item in an order is the lead product.

# Positional array queries — dot notation with index

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# Find users whose FIRST tag is "early_adopter"
print("Users whose first tag is 'early_adopter' (index 0):")
results = db.users.find(
    {"tags.0": "early_adopter"},
    {"name": 1, "tags": 1, "_id": 0}
)
for u in results:
    print(f"  {u['name']} — {u['tags']}")

# Find orders whose first item has product_id "p001"
print("\nOrders where first item is p001:")
results = db.orders.find(
    {"items.0.product_id": "p001"},
    {"_id": 1, "user_id": 1, "items": 1}
)
for o in results:
    print(f"  {o['_id']}  user: {o['user_id']}  first item: {o['items'][0]}")

# Existence check — does index 1 exist? (array has at least 2 elements)
print("\nOrders with at least 2 items (index 1 exists):")
results = db.orders.find(
    {"items.1": {"$exists": True}},
    {"_id": 1, "user_id": 1}
)
for o in results:
    print(f"  {o['_id']}  user: {o['user_id']}")

Output:
Users whose first tag is 'early_adopter' (index 0):
Alice Johnson — ['early_adopter', 'newsletter']
Clara Diaz — ['early_adopter']
Eva Müller — ['early_adopter', 'newsletter']

Orders where first item is p001:
o001 user: u001 first item: {'product_id': 'p001', 'qty': 1, 'price': 29.99}

Orders with at least 2 items (index 1 exists):
o001 user: u001
o005 user: u004
o007 user: u002
  • Positional dot notation syntax: "arrayField.0" for first element, "arrayField.1" for second, and so on
  • Combine with $exists to check whether a particular index position is populated — an efficient way to test minimum array length
  • Positional index queries are not index-backed on standard array indexes — consider a dedicated field if you frequently query a specific position
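
One way to read a path such as "items.0.product_id" is as a sequence of steps: a field name, then an array index, then another field name. The sketch below resolves such a path in plain Python. It is a simplified model (value_at is a hypothetical helper) and ignores edge cases such as embedded documents with numeric field names.

```python
# Pure-Python model of resolving a dotted path with array indexes.

def value_at(doc, path):
    """Resolve a path like 'items.0.product_id'; return None if any step is missing."""
    current = doc
    for part in path.split("."):
        if isinstance(current, list):
            # Inside an array, the step must be a valid numeric index
            if not part.isdigit() or int(part) >= len(current):
                return None
            current = current[int(part)]
        elif isinstance(current, dict):
            if part not in current:
                return None
            current = current[part]
        else:
            return None
    return current


order = {"_id": "o001", "items": [
    {"product_id": "p001", "qty": 1},
    {"product_id": "p003", "qty": 3},
]}

print(value_at(order, "items.0.product_id"))  # p001
print(value_at(order, "items.1.qty"))         # 3
print(value_at(order, "items.2"))             # None, the array has only 2 elements
```

The last call shows why {"items.1": {"$exists": True}} works as a minimum-length test: the path resolves only when the array is long enough.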

Updating Array Elements with the Positional $ Operator

The positional $ operator in an update references the first array element that matched the query filter. It lets you update a specific element inside an array without knowing its index in advance.

# Positional $ operator — update the matched array element

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db     = client["dataplexa"]

# Increase the price of product p001 within order o001's items array
# The $ refers to the first items element where product_id == "p001"
before = db.orders.find_one({"_id": "o001"}, {"items": 1, "_id": 0})
print("Order o001 items before update:")
for item in before["items"]:
    print(f"  {item}")

db.orders.update_one(
    {"_id": "o001", "items.product_id": "p001"},  # filter — match document AND element
    {"$set": {"items.$.price": 32.99}}             # $ = the matched element index
)

after = db.orders.find_one({"_id": "o001"}, {"items": 1, "_id": 0})
print("\nOrder o001 items after update:")
for item in after["items"]:
    print(f"  {item}")

Output:
Order o001 items before update:
{'product_id': 'p001', 'qty': 1, 'price': 29.99}
{'product_id': 'p003', 'qty': 3, 'price': 4.99}

Order o001 items after update:
{'product_id': 'p001', 'qty': 1, 'price': 32.99}
{'product_id': 'p003', 'qty': 3, 'price': 4.99}
  • The $ positional operator only updates the first matching element — use $[] to update all elements or $[identifier] with array filters for conditional updates on multiple elements
  • The filter must include a condition on the array field that identifies which element to update — the $ refers to that matched element
  • This is the correct pattern for updating a specific item within an embedded array without fetching the whole document into Python first
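
For the case where several elements in the same array should be updated, the filtered positional operator $[identifier] can be combined with pymongo's array_filters argument. The sketch below only builds the arguments; price_update_args and the identifier name "line" are assumptions made for illustration, and the actual call is shown as a comment because it needs a live connection.

```python
# Building arguments for a filtered positional update (hypothetical helper).

def price_update_args(product_id, new_price):
    """Return (query, update, array_filters) that set the price on every
    items element whose product_id matches."""
    query = {"items.product_id": product_id}
    # "line" is an arbitrary identifier; it links the update path
    # to the condition given in array_filters
    update = {"$set": {"items.$[line].price": new_price}}
    array_filters = [{"line.product_id": product_id}]
    return query, update, array_filters


query, update, array_filters = price_update_args("p001", 32.99)
print(query)
print(update)
print(array_filters)

# With a live connection (not run here):
# db.orders.update_many(query, update, array_filters=array_filters)
```

Unlike the plain $ operator, this form touches every matching element in each matched document, not only the first one.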

Summary Table

Operator / Technique | What It Does | Example | Key Note
Plain element match | Array contains value | {"tags": "vip"} | Position-independent, index-backed
$all | Array contains all listed values | {"tags": {"$all": ["a","b"]}} | Order of list does not matter
$elemMatch | One element satisfies all conditions | {"items": {"$elemMatch": {...}}} | Prevents false positives across elements
$size | Array has exact length | {"tags": {"$size": 2}} | Cannot use range operators with $size
Positional index | Query element at specific position | {"tags.0": "vip"} | Not standard index-backed
Positional $ | Update matched array element | {"$set": {"arr.$.field": val}} | Updates first matching element only

Practice Questions

Practice 1. Write a filter to find all products in the Dataplexa Store whose tags array contains both "bestseller" and "wireless".



Practice 2. Why is $elemMatch necessary when querying arrays of sub-documents with multiple conditions?



Practice 3. What is the workaround to find documents where an array has at least one element, given that $size cannot use range operators?



Practice 4. What does the positional $ operator refer to in an update operation?



Practice 5. What is the difference between {"tags": ["a", "b"]} and {"tags": {"$all": ["a", "b"]}} as a filter?



Quiz

Quiz 1. When you filter on an array field with a plain value like {"tags": "vip"}, what does MongoDB return?






Quiz 2. What is the key difference between $all and a plain element match on an array field?






Quiz 3. Which operator finds documents where the array contains exactly 3 elements?






Quiz 4. In an update using the positional $ operator, how many array elements are updated?






Quiz 5. What does the dot notation "items.0.product_id" in a query filter do?






Next up — Cursor and Pagination: Understanding how MongoDB cursors stream data and implementing efficient offset and keyset pagination patterns.
