MongoDB
Comparison Operators
Comparison operators are the building blocks of every non-trivial MongoDB query. They let you filter documents by asking questions like "is the price greater than $50?", "is the status anything other than delivered?", or "is the country one of these five values?". Every comparison operator follows the same pattern — it sits inside the filter document as the value of a field, wrapped in its own dictionary. Once you understand the pattern, all eight operators feel natural and consistent.
Lesson 13 introduced these operators briefly. This lesson goes deeper — covering edge cases, type-aware comparisons, performance implications, and a full set of worked examples against the Dataplexa Store dataset.
The Comparison Operator Pattern
Every comparison operator follows one consistent structure — the field name maps to a dictionary containing the operator and its operand. This pattern applies to all eight operators and composes cleanly with logical operators.
# The comparison operator pattern — consistent across all operators
# Basic structure:
# { "field": { "$operator": value } }
# Simple exact match (no operator needed — implicit $eq)
implicit_eq = {"membership": "premium"}
# Explicit $eq — identical result to above
explicit_eq = {"membership": {"$eq": "premium"}}
# Range query — combining two operators on one field
range_query = {"price": {"$gte": 20, "$lte": 100}}
# Operator on a nested field using dot notation
nested_query = {"items.price": {"$gt": 10}}
# Operator combined with a filter — all conditions must match
combined = {"category": "Electronics", "price": {"$lt": 50}}
patterns = [
("Implicit eq", implicit_eq),
("Explicit eq", explicit_eq),
("Range", range_query),
("Nested field", nested_query),
("Combined", combined),
]
for label, pattern in patterns:
print(f" {label:15} → {pattern}")Explicit eq → {'membership': {'$eq': 'premium'}}
Range → {'price': {'$gte': 20, '$lte': 100}}
Nested field → {'items.price': {'$gt': 10}}
Combined → {'category': 'Electronics', 'price': {'$lt': 50}}
- The implicit equality shorthand
{"field": value}is equivalent to{"field": {"$eq": value}}— use whichever is more readable - Multiple operators on the same field are combined with an implicit AND — both conditions must be true
- Comparison operators work on any BSON type — strings, numbers, dates, ObjectIds — following BSON comparison order
$eq and $ne — Equal and Not Equal
$eq matches documents where a field equals a specific value. $ne matches documents where a field does not equal the value — including documents where the field does not exist at all.
# $eq and $ne — equality and inequality
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
# $eq — find all premium users (explicit form)
print("Premium users ($eq):")
results = db.users.find(
{"membership": {"$eq": "premium"}},
{"name": 1, "membership": 1, "_id": 0}
)
for u in results:
print(f" {u['name']} — {u['membership']}")
# $ne — find all orders that are NOT delivered
print("\nNon-delivered orders ($ne):")
results = db.orders.find(
{"status": {"$ne": "delivered"}},
{"_id": 1, "status": 1, "total": 1}
)
for o in results:
print(f" {o['_id']} {o['status']:12} ${o['total']:.2f}")
# $ne also matches documents where the field does not exist
# Adding a document with no membership field to demonstrate
db.users.insert_one({"_id": "u_temp", "name": "Temp User"})
no_membership = db.users.find(
{"membership": {"$ne": "premium"}},
{"name": 1, "membership": 1, "_id": 0}
)
print("\nUsers who are not premium (including those with no membership field):")
for u in no_membership:
print(f" {u['name']} — {u.get('membership', '[field missing]')}")
# Clean up the temp document
db.users.delete_one({"_id": "u_temp"})Alice Johnson — premium
Clara Diaz — premium
Eva Müller — premium
Non-delivered orders ($ne):
o002 shipped $89.99
o004 processing $349.99
Users who are not premium (including those with no membership field):
Bob Smith — basic
David Lee — basic
Temp User — [field missing]
$nematches documents where the field is absent — use{"field": {"$ne": value}, "field": {"$exists": true}}if you want to exclude missing fields too- For exact equality the implicit shorthand is always preferred —
{"status": "delivered"}is cleaner than{"status": {"$eq": "delivered"}} - On indexed fields, both
$eqand$neuse the index — but$neoften results in a large portion of the index being scanned since it matches most documents
$gt and $gte — Greater Than
$gt matches documents where the field value is strictly greater than the operand. $gte includes equality — greater than or equal. Both work on numbers, dates, strings, and any other comparable BSON type.
# $gt and $gte — greater than comparisons
from pymongo import MongoClient
from datetime import datetime, timezone
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
# Products strictly more expensive than $50
print("Products over $50 ($gt):")
results = db.products.find(
{"price": {"$gt": 50}},
{"name": 1, "price": 1, "_id": 0}
)
for p in results:
print(f" ${p['price']:>7.2f} {p['name']}")
# Products $50 or more ($gte includes the boundary)
print("\nProducts $50 or more ($gte):")
results = db.products.find(
{"price": {"$gte": 50}},
{"name": 1, "price": 1, "_id": 0}
)
for p in results:
print(f" ${p['price']:>7.2f} {p['name']}")
# Users aged 30 or older
print("\nUsers aged 30+ ($gte):")
results = db.users.find(
{"age": {"$gte": 30}},
{"name": 1, "age": 1, "_id": 0}
)
for u in results:
print(f" {u['name']} — age {u['age']}")
# Orders placed after a specific date ($gt on a date field)
cutoff = datetime(2024, 2, 1, tzinfo=timezone.utc)
print("\nOrders after 2024-02-01 ($gt on date):")
results = db.orders.find(
{"date": {"$gt": cutoff.strftime("%Y-%m-%d")}},
{"_id": 1, "date": 1, "total": 1}
)
for o in results:
print(f" {o['_id']} {o['date']} ${o['total']:.2f}")$ 89.99 Mechanical Keyboard
$299.99 Monitor 27-inch
$349.99 Standing Desk
Products $50 or more ($gte):
$ 89.99 Mechanical Keyboard
$299.99 Monitor 27-inch
$349.99 Standing Desk
Users aged 30+ ($gte):
Alice Johnson — age 30
David Lee — age 32
Orders after 2024-02-01 ($gt on date):
o003 2024-02-20 $99.98
o004 2024-03-01 $349.99
o005 2024-03-10 $329.98
o006 2024-03-15 $89.99
o007 2024-04-01 $11.97
- The difference between
$gtand$gteonly matters at the boundary value — choose based on whether the boundary should be included - String comparison with
$gt/$gtefollows UTF-8 lexicographic order —"b" > "a"is true, but"B" < "a"because uppercase letters have lower byte values - For date comparisons always use BSON Date (
datetime) — string date comparisons only work correctly if dates are stored in ISO 8601 format (YYYY-MM-DD)
$lt and $lte — Less Than
$lt matches documents where the field is strictly less than the operand. $lte includes the boundary. Combined with $gt/$gte on the same field, they create range queries — one of the most common query patterns in any application.
# $lt and $lte — less than comparisons and range queries
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
# Budget products — strictly under $10
print("Budget products under $10 ($lt):")
results = db.products.find(
{"price": {"$lt": 10}},
{"name": 1, "price": 1, "_id": 0}
)
for p in results:
print(f" ${p['price']:>6.2f} {p['name']}")
# Products $50 or under ($lte includes the boundary)
print("\nProducts $50 and under ($lte):")
results = db.products.find(
{"price": {"$lte": 50}},
{"name": 1, "price": 1, "_id": 0}
)
for p in results:
print(f" ${p['price']:>6.2f} {p['name']}")
# Range query — mid-range products between $20 and $100
print("\nMid-range products $20–$100 ($gte + $lte):")
results = db.products.find(
{"price": {"$gte": 20, "$lte": 100}},
{"name": 1, "price": 1, "_id": 0}
)
for p in results:
print(f" ${p['price']:>6.2f} {p['name']}")
# Users under 30 years old
print("\nUsers under 30 ($lt):")
results = db.users.find(
{"age": {"$lt": 30}},
{"name": 1, "age": 1, "_id": 0}
)
for u in results:
print(f" {u['name']} — age {u['age']}")$ 4.99 Notebook A5
$ 3.49 Ballpoint Pens 10-pack
Products $50 and under ($lte):
$29.99 Wireless Mouse
$ 4.99 Notebook A5
$49.99 USB-C Hub
$ 3.49 Ballpoint Pens 10-pack
Mid-range products $20–$100 ($gte + $lte):
$29.99 Wireless Mouse
$49.99 USB-C Hub
$89.99 Mechanical Keyboard
Users under 30 ($lt):
Bob Smith — age 25
Clara Diaz — age 28
- The range pattern
{"price": {"$gte": min, "$lte": max}}is extremely common — both operators apply to the same field simultaneously - A single index on
pricecovers this range query efficiently — MongoDB scans only the relevant portion of the index - Open ranges are valid —
{"age": {"$lt": 18}}has no lower bound,{"age": {"$gte": 65}}has no upper bound
$in and $nin — Match a List of Values
$in matches documents where the field equals any value in a provided list — equivalent to multiple $eq conditions joined with OR, but more efficient. $nin is the inverse — the field must not equal any value in the list.
# $in and $nin — matching against a list of values
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client["dataplexa"]
# $in — users from UK or Spain
print("Users from UK or Spain ($in):")
results = db.users.find(
{"country": {"$in": ["UK", "Spain"]}},
{"name": 1, "country": 1, "_id": 0}
)
for u in results:
print(f" {u['name']} ({u['country']})")
# $in — multiple specific orders by _id (bulk fetch)
print("\nFetch specific orders by _id list ($in):")
results = db.orders.find(
{"_id": {"$in": ["o001", "o003", "o005"]}},
{"_id": 1, "status": 1, "total": 1}
)
for o in results:
print(f" {o['_id']} {o['status']:12} ${o['total']:.2f}")
# $nin — products NOT in Stationery or Furniture
print("\nProducts not in Stationery or Furniture ($nin):")
results = db.products.find(
{"category": {"$nin": ["Stationery", "Furniture"]}},
{"name": 1, "category": 1, "_id": 0}
)
for p in results:
print(f" {p['name']} ({p['category']})")
# $in on an array field — find users who have any of these tags
print("\nUsers with 'vip' or 'early_adopter' tag ($in on array):")
results = db.users.find(
{"tags": {"$in": ["vip", "early_adopter"]}},
{"name": 1, "tags": 1, "_id": 0}
)
for u in results:
print(f" {u['name']} — {u['tags']}")Alice Johnson (UK)
Bob Smith (UK)
Clara Diaz (Spain)
Fetch specific orders by _id list ($in):
o001 delivered $44.96
o003 delivered $99.98
o005 delivered $329.98
Products not in Stationery or Furniture ($nin):
Wireless Mouse (Electronics)
Mechanical Keyboard (Electronics)
USB-C Hub (Electronics)
Monitor 27-inch (Electronics)
Users with 'vip' or 'early_adopter' tag ($in on array):
Alice Johnson — ['early_adopter', 'vip']
Clara Diaz — ['early_adopter']
Eva Müller — ['early_adopter', 'newsletter']
$inagainst an array field matches documents where the array contains any of the listed values — not all of them (use$allfor that)- Fetching multiple documents by a list of known IDs with
$inon_idis the standard bulk-fetch pattern — one query, no loop $inwith a large list (hundreds of values) can be slow — consider a different data model or a lookup aggregation stage instead$ninis generally less index-efficient than$inbecause it matches most of the collection — apply it only to well-indexed fields
BSON Type Ordering in Comparisons
When MongoDB compares values of different BSON types — for example, a string field against a number — it follows a defined BSON type ordering. Understanding this prevents surprising results in mixed-type collections.
# BSON comparison order — types compared when mixed types exist in a field
bson_type_order = [
(1, "MinKey (internal)"),
(2, "Null"),
(3, "Numbers — Int32, Int64, Double, Decimal128"),
(4, "Symbol, String"),
(5, "Object (embedded document)"),
(6, "Array"),
(7, "BinData (binary)"),
(8, "ObjectId"),
(9, "Boolean"),
(10, "Date"),
(11, "Timestamp"),
(12, "Regular expression"),
(13, "MaxKey (internal)"),
]
print("BSON type comparison order (lowest to highest):")
for rank, type_name in bson_type_order:
print(f" {rank:2}. {type_name}")
print("\nPractical implication:")
print(" A field storing '30' (string) is GREATER than 29 (int)")
print(" because strings sort higher than numbers in BSON order.")
print(" Always store values in a consistent type to avoid this.")1. MinKey (internal)
2. Null
3. Numbers — Int32, Int64, Double, Decimal128
4. Symbol, String
5. Object (embedded document)
6. Array
7. BinData (binary)
8. ObjectId
9. Boolean
10. Date
11. Timestamp
12. Regular expression
13. MaxKey (internal)
Practical implication:
A field storing '30' (string) is GREATER than 29 (int)
because strings sort higher than numbers in BSON order.
Always store values in a consistent type to avoid this.
- All numeric types (Int32, Int64, Double, Decimal128) compare by value —
30as Int32 equals30.0as Double - Mixed-type collections produce surprising comparison results — use the
$typeoperator to audit and clean up inconsistent field types - Null sorts lower than all other types including numbers —
{"age": {"$gt": 0}}will not match documents where age is null
Summary Table
| Operator | Meaning | Example | Note |
|---|---|---|---|
$eq |
Equals | {"status": {"$eq": "shipped"}} |
Shorthand: {"status": "shipped"} |
$ne |
Not equal | {"status": {"$ne": "delivered"}} |
Also matches missing field |
$gt |
Greater than | {"price": {"$gt": 50}} |
Boundary not included |
$gte |
Greater than or equal | {"age": {"$gte": 18}} |
Boundary included |
$lt |
Less than | {"stock": {"$lt": 10}} |
Boundary not included |
$lte |
Less than or equal | {"rating": {"$lte": 3}} |
Boundary included |
$in |
Equals any in list | {"country": {"$in": ["UK","US"]}} |
Also works on array fields |
$nin |
Not equal to any in list | {"category": {"$nin": ["X","Y"]}} |
Less index-efficient than $in |
Practice Questions
Practice 1. Write a filter to find all products with a rating greater than or equal to 4.5.
Practice 2. What is the difference between $gt and $gte — give an example where only one would match a document with price = 50?
Practice 3. Write the filter to find all Dataplexa users who are from Germany, USA, or Spain using a single operator.
Practice 4. Why might $ne produce a large result set on an indexed field compared to $eq?
Practice 5. What is the BSON comparison rule that means {"age": {"$gt": 0}} will not match a document where age is null?
Quiz
Quiz 1. Which operator would you use to find all orders with a total between $50 and $200 inclusive?
Quiz 2. What does $ne match in addition to documents where the field has a different value?
Quiz 3. What is the most efficient way to fetch five specific documents by known _id values?
Quiz 4. In BSON comparison order, which type sorts higher — a string or a number?
Quiz 5. When $in is applied to an array field, what does it match?
Next up — Logical Operators: Combining filters with $and, $or, $not, and $nor to express complex multi-condition queries.