3 Steps to Optimize Your Queries for Speed
When performance matters, how you store your application objects, write queries, and design indexes can be the difference between scanning a collection for minutes and getting results in milliseconds—without touching a single document.
In this article, we’ll walk through an example in MongoDB. We’ll start with a normalized model, move to an embedded model, add an index to avoid a full collection scan, and then refine both the index and query to read only what’s needed for the result. The execution plan and execution statistics will guide us at each step.
Scenario
Our domain is an e‑commerce order‑entry application with customers, products, orders, and order lines. We want to:
Find the most recent order in Switzerland containing the book “MongoDB Data Modeling and Schema Design”.
Initial model: normalized with references
If we build the data model without considering how applications will access it—or simply migrate from a relational database by creating one collection per table—we end up with a model where a single domain aggregate is split across multiple collections.
In this model, we insert a customer, some products (books), an order for this customer, and the corresponding order lines for those books:
// Customer
db.customers.insertOne({
"_id": "C0001",
"name": "Franck Pachot",
"email": "franck.pachot@example.com",
"address": { "city": "Geneva", "country": "Switzerland" }
})
// Products
db.products.insertOne({
"_id": "P0001",
"title": "MongoDB in Action, Third Edition: Building on the Atlas Data Platform",
"author": "Arek Borucki",
"price": 59.99, "category": "Database", "format": "Paperback"
})
db.products.insertOne({
"_id": "P0002",
"title": "MongoDB Data Modeling and Schema Design",
"author": ["Daniel Coupal", "Pascal Desmarets", "Steve Hoberman"],
"price": 54.99, "category": "Database", "format": "Paperback"
})
db.products.insertOne({
"_id": "P0003",
"title": "High Performance with MongoDB: Best Practices for Performance Tuning, Scaling, and Architecture",
"author": ["Asya Kamsky", "Ger Hartnett", "Alex Bevilacqua"],
"price": 49.99, "category": "Database", "format": "Paperback"
})
// Order
db.orders.insertOne({
"_id": "O0001",
"customerId": "C0001",
"orderDate": ISODate("2024-10-22T09:00:00Z"),
"status": "Processing"
})
// Order lines
db.orderlines.insertMany([
{ "orderId": "O0001", "productId": "P0001", "quantity": 1 },
{ "orderId": "O0001", "productId": "P0002", "quantity": 1 },
{ "orderId": "O0001", "productId": "P0003", "quantity": 1 }
])
This is a classic relational database pattern, but querying and optimizing it is complex because joins must occur before filtering.
In MongoDB, to rebuild the domain object aggregate, we use an aggregation pipeline with $lookup and $match:
db.orders.aggregate([
// Get customer info to filter by country
{ $lookup: {
from: "customers",
localField: "customerId",
foreignField: "_id",
as: "customer"
}},
{ $unwind: "$customer" },
// Get product info to filter by title
{ $lookup: {
from: "orderlines",
let: { orderIdVar: "$_id" },
pipeline: [
{ $match: { $expr: { $eq: ["$orderId", "$$orderIdVar"] } } },
{ $lookup: {
from: "products",
localField: "productId",
foreignField: "_id",
as: "product"
}},
{ $unwind: "$product" }
],
as: "orderLines"
}},
// Apply filters
{ $match: {
"customer.address.country": "Switzerland",
"orderLines.product.title": "MongoDB Data Modeling and Schema Design"
}},
// Sort and limit
{ $sort: { orderDate: -1 } },
{ $limit: 1 }
])
This query works, but it scans the entire collection. Indexes aren’t effective here because $match runs only after the $lookup and $unwind stages.
With one million orders, it would read all one million, perform three million lookups, filter on country and book title, sort a large intermediate result, and then return just one document.
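We can confirm this with the execution plan. A shortened variant of the pipeline (customer lookup only) is enough to show the problem; the exact counters depend on your data, but with no supporting index the plan starts with a COLLSCAN on orders, because the country filter can only run after the $lookup:
db.orders.explain("executionStats").aggregate([
  // The filter target doesn't exist until after the join...
  { $lookup: {
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customer"
  }},
  { $unwind: "$customer" },
  // ...so this $match cannot be pushed down to an index on orders
  { $match: { "customer.address.country": "Switzerland" } },
  { $sort: { orderDate: -1 } },
  { $limit: 1 }
])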
That’s why joins can be slow: when data is split across many tables or collections, you lose the benefit of efficient index access, because a compound index can only cover fields from a single collection. It also forces you to hand-code every query in the domain repository.
We need to rethink the data model.
Step 1: Embedding to fit the domain model
Knowing the database is for an order‑entry system, we can design a document model that matches the domain. An order should be a self‑contained document with all customer info and complete product details for each order line—because that’s how it’s displayed, printed, or emailed. If it’s a single logical document on paper, it should be a single document in MongoDB.
Here’s the embedded form:
db.orders.insertOne({
_id: 'O0002',
customerId: 'C0001',
orderDate: ISODate('2024-10-22T09:00:00.000Z'),
status: 'Processing',
customer: {
_id: 'C0001',
name: 'Franck Pachot',
email: 'franck.pachot@example.com',
address: { city: 'Geneva', country: 'Switzerland' }
},
orderLines: [
{ quantity: 1, product: { _id: 'P0001', title: 'MongoDB in Action...', price: 59.99 } },
{ quantity: 1, product: { _id: 'P0002', title: 'MongoDB Data Modeling and Schema Design', price: 54.99 } },
{ quantity: 1, product: { _id: 'P0003', title: 'High Performance with MongoDB...', price: 49.99 } }
]
})
Duplicated data isn’t a concern here—documents are compressed on disk, and changes to product descriptions don’t affect completed orders.
Querying becomes far simpler:
db.orders.find(
{
"customer.address.country": "Switzerland",
"orderLines.product.title": "MongoDB Data Modeling and Schema Design"
}
).sort({ orderDate: -1 }).limit(1)
No lookups needed. But it still does a collection scan, filters, sorts, and finally returns one document.
We add one million documents and run the query with execution statistics:
// Pad the collection: roughly one third Swiss orders, two thirds German
db.orders.insertMany(Array.from({ length: 333333 }, () => ({ customer: { address: { country: "Switzerland" } } })))
db.orders.insertMany(Array.from({ length: 666666 }, () => ({ customer: { address: { country: "Germany" } } })))
db.orders.find(
{
"customer.address.country": "Switzerland",
"orderLines.product.title": "MongoDB Data Modeling and Schema Design"
}
).sort({ orderDate: -1 }).limit(1).explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 559,
totalKeysExamined: 0,
totalDocsExamined: 1000001,
executionStages: {
isCached: false,
stage: 'SORT',
nReturned: 1,
executionTimeMillisEstimate: 542,
works: 1000004,
advanced: 1,
needTime: 1000002,
needYield: 0,
saveState: 29,
restoreState: 29,
isEOF: 1,
sortPattern: { orderDate: -1 },
memLimit: 104857600,
limitAmount: 1,
type: 'simple',
totalDataSizeSorted: 0,
usedDisk: false,
spills: 0,
spilledRecords: 0,
spilledBytes: 0,
spilledDataStorageSize: 0,
inputStage: {
stage: 'COLLSCAN',
filter: {
'$and': [
{ 'customer.address.country': { '$eq': 'Switzerland' } },
{
'orderLines.product.title': { '$eq': 'MongoDB Data Modeling and Schema Design' }
}
]
},
nReturned: 1,
executionTimeMillisEstimate: 516,
works: 1000002,
advanced: 1,
needTime: 1000000,
needYield: 0,
saveState: 29,
restoreState: 29,
isEOF: 1,
direction: 'forward',
docsExamined: 1000001
}
}
}
To return a single document (nReturned: 1), one million documents were read (totalDocsExamined: 1000001) and then sorted (stage: 'SORT', works: 1000004).
Now that everything lives in one collection, we can optimize with an index. Our objective is to reduce totalDocsExamined to nReturned.
Step 2: Indexing for filtering, sorting, and pagination
Our query filters by:
- Equality on customer country
- Equality on product title
- A range on order date (due to sort().limit()).
An ideal compound index places the equality fields first and the sort/range field last:
db.orders.createIndex({
"customer.address.country": 1, // equality
"orderLines.product.title": 1, // equality
"orderDate": 1 // sort/range
})
Here is the execution plan:
db.orders.find(
{
"customer.address.country": "Switzerland",
"orderLines.product.title": "MongoDB Data Modeling and Schema Design"
}
).sort({ orderDate: -1 }).limit(1).explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
executionStages: {
isCached: false,
stage: 'LIMIT',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
limitAmount: 1,
inputStage: {
stage: 'FETCH',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 1,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
docsExamined: 1,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 1,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
keyPattern: {
'customer.address.country': 1,
'orderLines.product.title': 1,
orderDate: 1
},
indexName: 'customer.address.country_1_orderLines.product.title_1_orderDate_1',
isMultiKey: true,
multiKeyPaths: {
'customer.address.country': [],
'orderLines.product.title': [ 'orderLines' ],
orderDate: []
},
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'backward',
indexBounds: {
'customer.address.country': [ '["Switzerland", "Switzerland"]' ],
'orderLines.product.title': [
'["MongoDB Data Modeling and Schema Design", "MongoDB Data Modeling and Schema Design"]'
],
orderDate: [ '[MaxKey, MinKey]' ]
},
keysExamined: 1,
seeks: 1,
dupsTested: 1,
dupsDropped: 0
}
}
}
}
With this index, the query jumps straight to the target document:
- totalKeysExamined: 1 from a single range scan (seeks: 1 in stage: 'IXSCAN')
- totalDocsExamined: 1 from stage: 'FETCH', with no SORT stage needed to return nReturned: 1
The result is retrieved in sorted order directly from the index, minimizing the number of documents to read. This index shape is also what serves pagination, the last item in this step's title; see the sketch below.
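A minimal sketch, assuming the application keeps the orderDate of the last document it displayed (lastSeenDate is a hypothetical variable) and uses it as the cursor for the next page:
// Next page of Swiss orders for this book, older than the last one shown.
// Equality + equality + range matches the compound index exactly.
const lastSeenDate = ISODate("2024-10-22T09:00:00Z") // from the previous page
db.orders.find({
  "customer.address.country": "Switzerland",
  "orderLines.product.title": "MongoDB Data Modeling and Schema Design",
  "orderDate": { $lt: lastSeenDate }
}).sort({ orderDate: -1 }).limit(10)
In production, add a tiebreaker (such as _id) to the sort and the range condition so that orders sharing the same orderDate are not skipped. With that aside, we can still do better: our next goal is to eliminate totalDocsExamined.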
Step 3: Projection and covering index
If the application only needs the customer’s name and country, we can project:
db.orders.find(
{
"customer.address.country": "Switzerland",
"orderLines.product.title": "MongoDB Data Modeling and Schema Design"
},
{ "customer.name": 1, "customer.address.country": 1, _id: 0 }
)
[
{
customer: { name: 'Franck Pachot', address: { country: 'Switzerland' } }
}
]
Now, we can create a covering index:
db.orders.createIndex({
"customer.address.country": 1,
"orderLines.product.title": 1,
"orderDate": 1,
"customer.name": 1
})
Here is the execution plan:
db.orders.find(
{
"customer.address.country": "Switzerland",
"orderLines.product.title": "MongoDB Data Modeling and Schema Design"
},
{ "customer.name": 1, "customer.address.country": 1, _id: 0 }
).limit(1).explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 0,
executionStages: {
isCached: false,
stage: 'LIMIT',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
limitAmount: 1,
inputStage: {
stage: 'PROJECTION_DEFAULT',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 1,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
transformBy: { 'customer.name': 1, 'customer.address.country': 1, _id: 0 },
inputStage: {
stage: 'IXSCAN',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 1,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 0,
keyPattern: {
'customer.address.country': 1,
'orderLines.product.title': 1,
orderDate: 1,
'customer.name': 1
},
indexName: 'customer.address.country_1_orderLines.product.title_1_orderDate_1_customer.name_1',
isMultiKey: true,
multiKeyPaths: {
'customer.address.country': [],
'orderLines.product.title': [ 'orderLines' ],
orderDate: [],
'customer.name': []
},
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
'customer.address.country': [ '["Switzerland", "Switzerland"]' ],
'orderLines.product.title': [
'["MongoDB Data Modeling and Schema Design", "MongoDB Data Modeling and Schema Design"]'
],
orderDate: [ '[MinKey, MaxKey]' ],
'customer.name': [ '[MinKey, MaxKey]' ]
},
keysExamined: 1,
seeks: 1,
dupsTested: 1,
dupsDropped: 0
}
}
}
}
This is the optimal plan: everything returned (nReturned: 1) comes from the index entries that were scanned (totalKeysExamined: 1), without fetching any document (totalDocsExamined: 0).
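One housekeeping note: because the covering index begins with the same three fields, the index created in step 2 is now a redundant prefix. Assuming no other query depends on it specifically, it can be dropped to save write overhead:
// Any query that used the step-2 index can use the covering index instead,
// since the first three fields (and their order) are identical.
db.orders.dropIndex("customer.address.country_1_orderLines.product.title_1_orderDate_1")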
Conclusion
MongoDB query performance isn’t just about adding indexes—it’s about aligning your data model and queries with how your application really uses data. Turning a four-collection join with millions of lookups into a single index scan that runs in milliseconds comes down to three rules:
- Model for Access: Embed data that’s accessed together. If it belongs on one piece of paper, like an order with its customer and product details, it belongs in one document.
- Index for Your Query: Use compound indexes that follow your filter, sort, and projection. Equality fields first, range fields last. This turns full scans into targeted index scans, cutting totalDocsExamined from millions to one.
- Project Only What You Need: Return only required fields. When a query is fully covered by an index, MongoDB can achieve totalDocsExamined: 0 and return the expected result fast.
You don’t need to apply this to all queries—use these steps only when necessary. Excessive lookups and limited indexing options drove us to optimize the data model. After that, the remaining need for selective filters and the absence of a suitable index led to creating ... (truncated)