MongoDB sits at the crossroads between SQL, where developers can write queries on a logical model without naming indexes, and the NoSQL world, where developers are accustomed to querying directly by mentioning indexes in their queries.
Except for Atlas Search and Vector indexes, where you must explicitly indicate whether it uses approximate search, you typically query a collection, and the MongoDB query planner determines which index to use.
One exception to this rule happens when using an optimizer hint to reference an index. This feature is supported in most popular databases, except PostgreSQL, where it requires pg_hint_plan. When the hint isn't fully compatible with the query planner's execution plans, different databases may behave differently.
Here is my test case to show how it behaves in MongoDB:
mdb> db.demo.insertMany([
{ _id: 1, name: "Alice", age: 30 }, // has age
{ _id: 2, name: "Bob" }, // missing age
{ _id: 3, name: "Charlie", age: null }, // age is null
{ _id: 4, name: "Dave", age: 35, status: "active" }, // age + status
{ _id: 5, name: "Eve", age: 28, status: "inactive" } // age + status
]);
{
acknowledged: true,
insertedIds: { '0': 1, '1': 2, '2': 3, '3': 4, '4': 5 }
}
mdb> db.demo.find().sort({ age: 1 });
[
{ _id: 2, name: 'Bob' },
{ _id: 3, name: 'Charlie', age: null },
{ _id: 5, name: 'Eve', age: 28, status: 'inactive' },
{ _id: 1, name: 'Alice', age: 30 },
{ _id: 4, name: 'Dave', age: 35, status: 'active' }
]
mdb> db.demo.find(
{ age: { $gte: 28 } }, // WHERE age >= 28
{ _id:0, age:1, name: 1 } // SELECT age, name
);
[
{ name: 'Alice', age: 30 },
{ name: 'Dave', age: 35 },
{ name: 'Eve', age: 28 }
]
Without any index, the query scans the collection:
mdb> db.demo.find(
{ age: { $gte: 28 } }, // WHERE age >= 28
{ _id:0, age:1, name: 1 } // SELECT age, name
).explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 3,
executionTimeMillis: 0,
totalKeysExamined: 0,
totalDocsExamined: 5,
executionStages: {
isCached: false,
stage: 'PROJECTION_SIMPLE',
nReturned: 3,
executionTimeMillisEstimate: 0,
works: 6,
advanced: 3,
needTime: 2,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, age: 1, name: 1 },
inputStage: {
stage: 'COLLSCAN',
filter: { age: { '$gte': 28 } },
nReturned: 3,
executionTimeMillisEstimate: 0,
works: 6,
advanced: 3,
needTime: 2,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
direction: 'forward',
docsExamined: 5
}
}
}
With an index, the same query does an index scan:
mdb> db.demo.createIndex({ age: 1 }, { name: "idx_age" })
idx_age
mdb> db.demo.find(
{ age: { $gte: 28 } }, // WHERE age >= 28
{ _id:0, age:1, name: 1 } // SELECT age, name
).explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 3,
executionTimeMillis: 1,
totalKeysExamined: 3,
totalDocsExamined: 3,
executionStages: {
isCached: false,
stage: 'PROJECTION_SIMPLE',
nReturned: 3,
executionTimeMillisEstimate: 1,
works: 4,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, age: 1, name: 1 },
inputStage: {
stage: 'FETCH',
nReturned: 3,
executionTimeMillisEstimate: 1,
works: 4,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 3,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 3,
executionTimeMillisEstimate: 1,
works: 4,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { age: 1 },
indexName: 'idx_age',
isMultiKey: false,
multiKeyPaths: { age: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { age: [ '[28, inf.0]' ] },
keysExamined: 3,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
}
I create a better index, that covers the projection, and it is used:
mdb> db.demo.createIndex({ age: 1, name: 1 }, { name: "idx_age_name" })
idx_age_name
mdb> db.demo.find(
{ age: { $gte: 28 } }, // WHERE age >= 28
{ _id:0, age:1, name: 1 } // SELECT age, name
).explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 3,
executionTimeMillis: 1,
totalKeysExamined: 3,
totalDocsExamined: 0,
executionStages: {
isCached: false,
stage: 'PROJECTION_COVERED',
nReturned: 3,
executionTimeMillisEstimate: 1,
works: 5,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, age: 1, name: 1 },
inputStage: {
stage: 'IXSCAN',
nReturned: 3,
executionTimeMillisEstimate: 1,
works: 5,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { age: 1, name: 1 },
indexName: 'idx_age_name',
isMultiKey: false,
multiKeyPaths: { age: [], name: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { age: [ '[28, inf.0]' ], name: [ '[MinKey, MaxKey]' ] },
keysExamined: 3,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
I can decide to force the other index with a hint:
mdb> db.demo.find(
{ age: { $gte: 28 } }, // WHERE age >= 28
{ _id:0, age:1, name: 1 } // SELECT age, name
).hint("idx_age").explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 3,
executionTimeMillis: 0,
totalKeysExamined: 3,
totalDocsExamined: 3,
executionStages: {
isCached: false,
stage: 'PROJECTION_SIMPLE',
nReturned: 3,
executionTimeMillisEstimate: 0,
works: 4,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, age: 1, name: 1 },
inputStage: {
stage: 'FETCH',
nReturned: 3,
executionTimeMillisEstimate: 0,
works: 4,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 3,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 3,
executionTimeMillisEstimate: 0,
works: 4,
advanced: 3,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { age: 1 },
indexName: 'idx_age',
isMultiKey: false,
multiKeyPaths: { age: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { age: [ '[28, inf.0]' ] },
keysExamined: 3,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
}
A hint() is more than a hint here. It forces the index scan. I can even force its bounds with min() and max():
mdb> db.demo.find(
{ age: { $gte: 28 } }, // WHERE age >= 28
{ _id:0, age:1, name: 1 } // SELECT age, name
).hint("idx_age").max({age:31}) // scan from begining to age: 30
[ { name: 'Eve', age: 28 }, { name: 'Alice', age: 30 } ]
Rather than the bounds defined by the filter ([ '[28, inf.0]' ]
), I forced an index scan of range [ '[-inf.0, 30]' ]
. Then the filter { '$gte': 28 }
applied and the result has only two documents.
This is visible in the execution plan.
mdb> db.demo.find(
{ age: { $gte: 28 } }, // WHERE age >= 28
{ _id:0, age:1, name: 1 } // SELECT age, name
).hint("idx_age").max({age:31}).explain("executionStats").executionStats
{
executionSuccess: true,
nReturned: 2,
executionTimeMillis: 0,
totalKeysExamined: 4,
totalDocsExamined: 4,
executionStages: {
isCached: false,
stage: 'PROJECTION_SIMPLE',
nReturned: 2,
executionTimeMillisEstimate: 0,
works: 5,
advanced: 2,
needTime: 2,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, age: 1, name: 1 },
inputStage: {
stage: 'FETCH',
filter: { age: { '$gte': 28 } },
nReturned: 2,
executionTimeMillisEstimate: 0,
works: 5,
advanced: 2,
needTime: 2,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 4,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 4,
executionTimeMillisEstimate: 0,
works: 5,
advanced: 4,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { age: 1 },
indexName: 'idx_age',
isMultiKey: false,
multiKeyPaths: { age: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {},
keysExamined: 4,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
}
The bounds are not correctly displayed, but it shows that 4 keys were examined, the 4 keys before {age:31}
.
The most important to remember is that hints in MongoDB replace the query planner decision, and then may may give a different results. This is different from SQL hints where they only prune the query planner possibilities.
This is also visible with sparse and partial indexes:
// Named sparse index
db.demo.createIndex(
{ age: 1 },
{ sparse: true, name: "age_sparse" }
)
// Named partial index (only for status: 'active')
db.demo.createIndex(
{ age: 1 },
{
partialFilterExpression: { status: "active" },
name: "age_active_partial"
}
);
The result depends not only on the query, but also the index definition:
mdb> db.demo.find(
).sort({ age: 1 }) // ORDER BY age NULLS FIRST
[
{ _id: 2, name: 'Bob' },
{ _id: 3, name: 'Charlie', age: null },
{ _id: 5, name: 'Eve', age: 28, status: 'inactive' },
{ _id: 1, name: 'Alice', age: 30 },
{ _id: 4, name: 'Dave', age: 35, status: 'active' }
]
mdb> db.demo.find(
).hint("age_sparse").sort({ age: 1 }) // sparse index
[
{ _id: 3, name: 'Charlie', age: null },
{ _id: 5, name: 'Eve', age: 28, status: 'inactive' },
{ _id: 1, name: 'Alice', age: 30 },
{ _id: 4, name: 'Dave', age: 35, status: 'active' }
<... (truncated)
by Franck Pachot
Percona Database Performance Blog
When troubleshooting a MySQL crash, having only the error log is rarely enough to pinpoint the exact root cause. To truly understand what happened, we need to go deeper—into the memory state of the process at the moment it crashed. That’s where GDB, the GNU Debugger, comes in. GDB lets us inspect a core dump […]
by Vinicius Grippa
Avinash Sajjanshetty
SQLite WAL has checksums, but on corruption it drops all the data and does not raise error