May 02, 2025
MongoDB Operators Explained: Features, Limitations, and Open Source Alternatives
The impact of innodb_doublewrite_pages in MySQL 8.0.41
After reading a blog post from JFG on changes to innodb_doublewrite_pages and bug 111353, I wanted to understand the impact of that change on the Insert Benchmark using a large server.
I test the impact of:
- using a larger (non-default) value for innodb_doublewrite_pages
- disabling the doublewrite buffer
tl;dr
- Using a larger value for innodb_doublewrite_pages improves QPS by up to 10%
- Disabling the InnoDB doublewrite buffer is great for performance, but bad for durability. I don't suggest you do this in production.
The configurations tested are:
- cz11a_c32r128 - the base configuration file, which does not set innodb_doublewrite_pages and so gets the default of innodb_doublewrite_pages=8
- cz11e_c32r128 - adds innodb_doublewrite_pages=128 to the base config
- cz11f_c32r128 - adds innodb_doublewrite=0 to the base config (disables doublewrite)
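As a sketch, the two non-default variants map to my.cnf fragments like the following (assuming MySQL 8.0.20 or later, where innodb_doublewrite_pages exists; everything else from the real config files is omitted):
[mysqld]
# cz11e_c32r128: larger doublewrite batch
innodb_doublewrite_pages = 128
# cz11f_c32r128: disable the doublewrite buffer
# (faster, but torn pages are possible after a crash -- not for production)
# innodb_doublewrite = 0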
The benchmark steps are:
- l.i0
- insert 200 million rows per table in PK order. The table has a PK index but no secondary indexes. There is one connection per client.
- l.x
- create 3 secondary indexes per table. There is one connection per client.
- l.i1
- use 2 connections/client. One inserts 4M rows per table and the other does deletes at the same rate as the inserts. Each transaction modifies 50 rows (big transactions). This step is run for a fixed number of inserts, so the run time varies depending on the insert rate.
- l.i2
- like l.i1 but each transaction modifies 5 rows (small transactions) and 1M rows are inserted and deleted per table.
- Wait for X seconds after the step finishes to reduce variance during the read-write benchmark steps that follow. The value of X is a function of the table size.
- qr100
- use 3 connections/client. One does range queries and performance is reported for this. The second does 100 inserts/s and the third does 100 deletes/s. The second and third are less busy than the first. The range queries use covering secondary indexes. This step is run for 1800 seconds. If the target insert rate is not sustained then that is considered to be an SLA failure. If the target insert rate is sustained then the step does the same number of inserts for all systems tested.
- qp100
- like qr100 except uses point queries on the PK index
- qr500
- like qr100 but the insert and delete rates are increased from 100/s to 500/s
- qp500
- like qp100 but the insert and delete rates are increased from 100/s to 500/s
- qr1000
- like qr100 but the insert and delete rates are increased from 100/s to 1000/s
- qp1000
- like qp100 but the insert and delete rates are increased from 100/s to 1000/s
When the relative QPS is > 1.0, the configuration performs better than the base config; when it is < 1.0, there are regressions. The Q in relative QPS measures:
- insert/s for l.i0, l.i1, l.i2
- indexed rows/s for l.x
- range queries/s for qr100, qr500, qr1000
- point queries/s for qp100, qp500, qp1000
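For example, if the base config sustains 100,000 inserts/s on a step and a variant sustains 110,000 inserts/s, the relative QPS is 110000 / 100000 = 1.10, a 10% improvement.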
Results for cz11e_c32r128 (innodb_doublewrite_pages=128) relative to the base config:
- the impact on write-heavy steps is mixed: create index was ~7% slower and l.i2 was ~10% faster
- the impact on range query + write steps is positive but small. The improvements were 0%, 0% and 4%. Note that these steps are not as IO-bound as point query + write steps and the range queries do ~0.3 reads per query (see here).
- the impact on point query + write steps is positive and larger. The improvements were 3%, 8% and 9%. These benchmark steps are much more IO-bound than the steps that do range queries.
Results for cz11f_c32r128 (doublewrite disabled) relative to the base config:
- the impact on write-heavy steps is large -- from 1% to 36% faster.
- the impact on range query + write steps is positive but small. The improvements were 0%, 2% and 15%. Note that these steps are not as IO-bound as point query + write steps and the range queries do ~0.3 reads per query (see here).
- the impact on point query + write steps is positive and larger. The improvements were 14%, 41% and 42%.
May 01, 2025
Microsoft DocumentDB: RUM instead of GIN but same limitations on JSON paths
Storing documents in PostgreSQL does not turn it into a document database. Embedded documents in JSONB require GIN indexes, which are not effective for range or pagination queries. Microsoft recognized some limitations of JSONB and GIN indexes and developed the DocumentDB extension for BSON storage. However, this does not resolve the pagination issues: in DocumentDB, GIN indexes are replaced by RUM indexes, but they show the same limitation.
Install DocumentDB
To verify this, I used FerretDB v2, which bundles PostgreSQL with the DocumentDB extension and adds a MongoDB API emulation.
I start FerretDB using the Docker Compose file provided by the documentation. I've added a mongosh service to run the MongoDB client, and an auto_explain configuration to the PostgreSQL startup command:
services:
postgres:
image: ghcr.io/ferretdb/postgres-documentdb:17-0.102.0-ferretdb-2.1.0
platform: linux/amd64
restart: on-failure
environment:
- POSTGRES_USER=username
- POSTGRES_PASSWORD=password
- POSTGRES_DB=postgres
volumes:
- ./data:/var/lib/postgresql/data
command:
postgres -c shared_preload_libraries=auto_explain,pg_stat_statements,pg_cron,pg_documentdb_core,pg_documentdb -c auto_explain.log_min_duration=0 -c auto_explain.log_analyze=on -c auto_explain.log_buffers=on -c auto_explain.log_nested_statements=on
ferretdb:
image: ghcr.io/ferretdb/ferretdb:2.1.0
restart: on-failure
ports:
- 27017:27017
environment:
- FERRETDB_POSTGRESQL_URL=postgres://username:password@postgres:5432/postgres
mongosh:
image: mongo
deploy:
replicas: 0
command: mongosh mongodb://username:password@ferretdb/
I start the services and follow the logs to display the output of auto_explain:
docker compose up -d
docker compose logs -f postgres
I connect to the MongoDB API emulation:
docker compose run --rm -it mongosh
It emulates MongoDB 7.0 on top of PostgreSQL 17 with the DocumentDB extension:
Current Mongosh Log ID: 6813828db4f75c9887d861df
Connecting to: mongodb://<credentials>@ferretdb/?directConnection=true&appName=mongosh+2.5.0
Using MongoDB: 7.0.77
Using Mongosh: 2.5.0
------
The server generated these startup warnings when booting
2025-05-01T14:17:52.106Z: Powered by FerretDB v2.1.0 and DocumentDB 0.102.0 (PostgreSQL 17.4).
2025-05-01T14:17:52.124Z: Please star 🌟 us on GitHub: https://github.com/FerretDB/FerretDB.
------
Test with FerretDB
I create a simple collection with 10,000 documents:
for (let i = 0; i < 10000; i++) {
db.demo.insertOne( {
a: 1 ,
b: Math.random(),
ts: new Date()
} );
}
Here is what Auto Explain logged in PostgreSQL:
postgres-1 | 2025-05-01 13:49:54.012 UTC [37] LOG: duration: 0.219 ms plan:
postgres-1 | Query Text:
postgres-1 | Query Parameters: $1 = '\x13000000070068137c029be252893cd8696900', $2 = '\x34000000075f69640068137c029be252893cd86969106100f30100000162006466d90936e7d73f09747300a6671c8c9601000000'
postgres-1 | Insert on documents_13 collection (cost=0.00..0.01 rows=1 width=80) (actual time=0.216..0.217 rows=1 loops=1)
postgres-1 | Buffers: shared hit=20
postgres-1 | -> Values Scan on "values" (cost=0.00..0.01 rows=1 width=80) (actual time=0.003..0.003 rows=1 loops=1)
postgres-1 | 2025-05-01 13:49:54.012 UTC [37] LOG: duration: 0.485 ms plan:
postgres-1 | Query Text: SELECT p_result::bytea, p_success FROM documentdb_api.insert($1, $2::bytea, $3::bytea)
postgres-1 | Query Parameters: $1 = 'test', $2 = '\x9900000002696e73657274000500000064656d6f0004646f63756d656e7473003c00000003300034000000106100f30100000162006466d90936e7d73f09747300a6671c8c96010000075f69640068137c029be252893cd869690000086f7264657265640001036c736964001e00000005696400100000000431ae93d9c7074468a630c26b67a8709b00022464620005000000746573740000', $3 = NULL
postgres-1 | Function Scan on insert (cost=0.01..0.02 rows=1 width=33) (actual time=0.482..0.482 rows=1 loops=1)
postgres-1 | Buffers: shared hit=20
I'm happy that FerretDB provides a MongoDB-like API to avoid calling the raw DocumentDB functions like:
documentdb_api.insert('test', '\x9900000002696e73657274000500000064656d6f0004646f63756d656e7473003c00000003300034000000106100f30100000162006466d90936e7d73f09747300a6671c8c96010000075f69640068137c029be252893cd869690000086f7264657265640001036c736964001e00000005696400100000000431ae93d9c7074468a630c26b67a8709b00022464620005000000746573740000'::bytea, NULL::bytea)
In PostgreSQL with the DocumentDB extension, inserting a small document (rows=1) into a collection without indexes touches 20 pages (Buffers: shared hit=20). While the syntax resembles that of MongoDB, the performance differs due to PostgreSQL's heap tables and 8k blocks, which introduce additional overhead.
I created a simple index that adheres to the MongoDB Equality, Sort, Range rule. This index is designed for queries utilizing an equality filter on "a" and sorting based on "ts":
db.demo.createIndex({ "a": 1 , ts: -1 }) ;
My goal is to test the most frequent pattern in OLTP applications: pagination queries. This exists in many domains, like retrieving the last ten orders for a customer, the last ten measures from a device, or the last ten payments on an account.
Heap table and RUM index
I connect to PostgreSQL and check the SQL table that stores the documents:
# docker compose run -it -e PGUSER=username -e PGPASSWORD=password postgres psql -h postgres postgres
psql (17.4 (Debian 17.4-1.pgdg120+2))
Type "help" for help.
postgres=# \d+ documentdb_data.documents_15*
Table "documentdb_data.documents_15"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------------+--------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
shard_key_value | bigint | | not null | | plain | | |
object_id | documentdb_core.bson | | not null | | extended | | |
document | documentdb_core.bson | | not null | | extended | | |
creation_time | timestamp with time zone | | | | plain | | |
Indexes:
"collection_pk_15" PRIMARY KEY, btree (shard_key_value, object_id)
"documents_rum_index_35" documentdb_rum (document documentdb_api_catalog.bson_rum_single_path_ops (path=a, tl='2691'), document documentdb_api_catalog.bson_rum_single_path_ops (path=ts, tl='2691'))
Check constraints:
"shard_key_value_check" CHECK (shard_key_value = '15'::bigint)
Access method: heap
The table is a heap table with two extended-storage columns, a primary key index, and a secondary index supporting the collection index I created. It is important to note that this is a RUM index, not a GIN index. The differences between the two are well explained in an Alibaba blog post.
The table stores 154 pages and the index has to read 3 pages to find the first row:
postgres=# explain (analyze , buffers, serialize binary)
select * from documentdb_data.documents_15
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on documents_15 (cost=0.00..254.00 rows=10000 width=89) (actual time=0.010..0.994 rows=10000 loops=1)
Buffers: shared hit=154
Planning Time: 0.070 ms
Serialization: time=4.160 ms output=1026kB format=binary
Execution Time: 6.176 ms
(5 rows)
postgres=# explain (analyze , buffers, serialize binary) select * from documentdb_data.documents_15 order by shard_key_value,object_id limit 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..0.36 rows=1 width=89) (actual time=0.021..0.021 rows=1 loops=1)
Buffers: shared hit=3
-> Index Scan using _id_ on documents_15 (cost=0.29..760.10 rows=10000 width=89) (actual time=0.020..0.020 rows=1 loops=1)
Buffers: shared hit=3
Planning Time: 0.091 ms
Serialization: time=0.005 ms output=1kB format=binary
Execution Time: 0.085 ms
(7 rows)
Simple query (Equality)
The first query I tested has no pagination. On my small collection, it retrieves 14 documents for one value of "a":
test> db.demo.countDocuments( { a: 1 } );
14
test> db.demo.find( { a: 1 } ).explain("executionStats")
{
queryPlanner: {
Plan: {
'Node Type': 'Bitmap Heap Scan',
'Parallel Aware': false,
'Async Capable': false,
'Relation Name': 'documents_15',
'Plan Rows': 5000,
'Recheck Cond': "(document OPERATOR(documentdb_api_catalog.@=) 'BSONHEX0c0000001061000100000000'::documentdb_core.bson)",
Plans: [
{
'Parent Relationship': 'Outer',
'Parallel Aware': false,
'Index Name': 'a_1_ts_-1',
'Startup Cost': 0,
'Total Cost': 0,
'Plan Rows': 100,
'Node Type': 'Bitmap Index Scan',
'Async Capable': false,
'Plan Width': 0,
'Index Cond': "(document OPERATOR(documentdb_api_catalog.@=) 'BSONHEX0c0000001061000100000000'::documentdb_core.bson)"
}
],
Alias: 'collection',
'Startup Cost': 1.25,
'Total Cost': 146.58,
'Plan Width': 53
}
},
explainVersion: '1',
command: { find: 'demo', filter: { a: 1 }, '$db': 'test' },
serverInfo: {
host: 'a898d7c3cd9a',
port: 27017,
version: '7.0.77',
gitVersion: '05ed2b952c612533cb12c1ff1a0319a4e7f2e4b5',
ferretdb: { version: 'v2.1.0' }
},
ok: 1
}
The access methods for RUM and GIN indexes are quite similar, utilizing bitmaps for operations like 'Bitmap Index Scan' and 'Bitmap Heap Scan'. However, FerretDB only displays the planner's estimates rather than execution statistics, even when using explain("executionStats").
Additional insights can be gathered from PostgreSQL Auto Explain:
postgres-1 | 2025-05-01 14:01:47.749 UTC [74] LOG: duration: 0.041 ms plan:
postgres-1 | Query Text: SELECT (index_spec).index_name FROM documentdb_api_catalog.collection_indexes WHERE index_id = 35
postgres-1 | Seq Scan on collection_indexes (cost=0.00..1.09 rows=1 width=32) (actual time=0.030..0.031 rows=1 loops=1)
postgres-1 | Filter: (index_id = 35)
postgres-1 | Rows Removed by Filter: 7
postgres-1 | Buffers: shared hit=1
postgres-1 | 2025-05-01 14:01:47.749 UTC [45] LOG: duration: 1.023 ms plan:
postgres-1 | Query Text:
postgres-1 | Query Parameters: $1 = 'BSONHEX6600000004636f6e74696e756174696f6e00050000000010676574706167655f6261746368436f756e7400ca00000010676574706167655f626174636853697a6548696e74000000000110676574706167655f626174636853697a6541747472000100000000'
postgres-1 | Custom Scan (DocumentDBApiScan) (cost=0.42..150.16 rows=1667 width=85) (actual time=0.974..1.012 rows=14 loops=1)
postgres-1 | Page Row Count: 202 rows
postgres-1 | Page Size Hint: 16777216 bytes
postgres-1 | Buffers: shared hit=16
postgres-1 | -> Bitmap Heap Scan on documents_15 collection (cost=0.42..146.00 rows=1667 width=89) (actual time=0.968..0.994 rows=14 loops=1)
postgres-1 | Recheck Cond: (document OPERATOR(documentdb_api_catalog.@=) 'BSONHEX0c0000001061000100000000'::documentdb_core.bson)
postgres-1 | Filter: documentdb_api_internal.cursor_state(document, 'BSONHEX6600000004636f6e74696e756174696f6e00050000000010676574706167655f6261746368436f756e7400ca00000010676574706167655f626174636853697a6548696e74000000000110676574706167655f626174636853697a6541747472000100000000'::documentdb_core.bson)
postgres-1 | Heap Blocks: exact=14
postgres-1 | Buffers: shared hit=16
postgres-1 | -> Bitmap Index Scan on "a_1_ts_-1" (cost=0.00..0.00 rows=100 width=0) (actual time=0.949..0.950 rows=14 loops=1)
postgres-1 | Index Cond: (document OPERATOR(documentdb_api_catalog.@=) 'BSONHEX0c0000001061000100000000'::documentdb_core.bson)
postgres-1 | Buffers: shared hit=2
postgres-1 | 2025-05-01 14:01:47.749 UTC [45] LOG: duration: 39.943 ms plan:
postgres-1 | Query Text: SELECT cursorpage::bytea, continuation::bytea, persistconnection, cursorid FROM documentdb_api.find_cursor_first_page($1, $2::bytea, $3)
postgres-1 | Query Parameters: $1 = 'test', $2 = '\x5a0000000266696e64000500000064656d6f000366696c746572000c0000001061000100000000036c736964001e00000005696400100000000431ae93d9c7074468a630c26b67a8709b00022464620005000000746573740000', $3 = '0'
postgres-1 | Function Scan on find_cursor_first_page (cost=0.00..0.02 rows=1 width=73) (actual time=39.937..39.938 rows=1 loops=1)
postgres-1 | Buffers: shared hit=202 read=1 dirtied=1
There's a lot happening, but the crucial detail is the number of pages and rows that have been processed. The Bitmap Index Scan efficiently located the 14 entries needed for the result (rows=14) by only accessing two index pages (Buffers: shared hit=2). The Bitmap Heap Scan expanded this by including one heap page per document (Heap Blocks: exact=14).
On top of this, a custom scan (DocumentDBApiScan) keeps track of the MongoDB cursor and paging. It reports the following:
Page Row Count: 202 rows
Page Size Hint: 16777216 bytes
Buffers: shared hit=16
The PostgreSQL scans have read 16 pages, but the DocumentDBApiScan emulates 16MB pages of MongoDB with 202 rows. I don't know exactly how to interpret the numbers here. DocumentDB is not PostgreSQL, and even though it is open-source, its code lacks the internal documentation quality of PostgreSQL.
On the one hand, I don't think there are really 202 rows in that page, as only 14 were read from storage; on the other, it seems to have iterated over those 202 rows while re-reading PostgreSQL pages, as indicated by Buffers: shared hit=202.
Pagination query (Equality, Sort)
OLTP applications commonly implement pagination to limit their results to what is displayed to the user. I executed a query ordered by timestamp to retrieve the last ten entries:
test> db.demo.find(
{ a: 1 } ).sort({ts:-1}).limit(10).explain("executionStats")
;
{
queryPlanner: {
Plan: {
'Parallel Aware': false,
'Async Capable': false,
'Startup Cost': 267.13,
'Total Cost': 267.15,
'Plan Rows': 10,
'Plan Width': 85,
Plans: [
{
'Node Type': 'Sort',
'Parallel Aware': false,
'Async Capable': false,
'Startup Cost': 267.13,
'Total Cost': 279.63,
'Plan Width': 85 ... (truncated)
TLA+ Community Event at ETAPS 2025
This Sunday, I'll be attending (and speaking at) the TLA+ Community Event, co-located with ETAPS 2025 in Hamilton, Ontario.
The setting is fitting. ETAPS (European Joint Conferences on Theory and Practice of Software) has long been a hub for research that combines theory with software engineering. It seems that, while U.S. academia largely left software engineering to industry, European researchers remained more strongly involved in the software engineering discipline. ETAPS has consistently hosted work on model checking, type systems, static analysis, and formal methods. Think of work on abstract interpretation, the K framework, or compilers verified in Coq.
I have never been to ETAPS before. It seems that they are rebranding as "International Joint Conferences On Theory and Practice of Software" and dropping the "European". And this year is the first time, after 28 years, that the event moves outside of Europe. Interesting.
McMaster University, the ETAPS 2025 host, is a strong research school, particularly in health sciences and engineering. Huh, the department is called "Department of Computing & Software", and it gives a degree in Computer Science and several others in Software Engineering. It's also just an hour's drive from Buffalo, where I live, so this is a rare hometown event for me.
The TLA+ Community Event runs all day Sunday, May 4. The program features researchers and practitioners from academia and industry. Some highlights:
- ModelFuzz for distributed systems (MPI-SWS)
- Source-level safety checking via C-to-PlusCal translation (Asterios Technologies)
- TLA+ in Python notebooks (Loyola University Chicago)
- Modeling and Modular Verification of MongoDB’s distributed transactions (joint work between Will Schultz and yours truly)
- How do we use TLA+ for statistical properties? (by Jesse Jiryu Davis)
- And a talk on building TLA+ tooling (by Andrew Helwer)
There are no formal proceedings from the event, but slides and recordings will be online.
I would be remiss if I didn't mention the TLA+ Foundation Grant Program. The TLA+ Foundation is accepting proposals for grant funding to support projects that advance the state of the art in TLA+ and improve the experience of using TLA+ in research and industry. Grants will be awarded based on the significance of the proposed work and its potential to benefit the TLA+ community.
April 30, 2025
Querying embedded arrays in JSON (PostgreSQL JSONB and MongoDB documents)
When working with document-based data structures, the fields at the root can be indexed using simple expression indexes. However, when an array is present in the path, representing a One-to-Many relationship, PostgreSQL requires a GIN index and JSON path operators, which are more efficient than SQL/JSON queries.
Example
I create the following table to store books. I decided to embed more information with a flexible schema and added a "data" column to store JSON data:
create table books (
primary key(book_id),
book_id bigint,
title text,
data jsonb
);
I insert one book and add some reviews in my flexible schema document:
insert into books values (
8675309,
'Brave New World',
'{ "reviews":[
{ "name": "John", "text": "Amazing!" },
{ "name": "Jane", "text": "Incredible book!" }
] }'
);
There's no need for another table, as reviews are inherently linked to the books they discuss. A book cannot be reviewed without being displayed alongside its review, making any separate table unnecessary. I know it looks like a violation of first normal form, but no update anomaly is possible here because there is no duplication. From a normalization point of view, this is not very different from storing text, which is an array of characters, or embeddings, which are arrays of numbers.
Inefficient query with SQL join
If you're comfortable with SQL, you might want to query this structure using SQL. Simply unnest the JSON document array and use it like a relational table:
SELECT DISTINCT title FROM books
JOIN LATERAL jsonb_array_elements(books.data->'reviews') AS review
ON review->>'name' = 'John'
;
jsonb_array_elements expands a JSON array into rows for SQL queries. The lateral join adds the book information, the ON or WHERE clause filters by reviewer name, and DISTINCT removes duplicate titles. This is familiar SQL syntax, but it cannot use an index to filter on the reviewer name before unnesting, so it requires reading all rows and documents:
QUERY PLAN
-----------------------------------------------------------------------
Unique
-> Sort
Sort Key: books.title
-> Nested Loop
-> Seq Scan on books
-> Function Scan on jsonb_array_elements review
Filter: ((value ->> 'name'::text) = 'John'::text)
While this is valid SQL syntax, and JSON is a valid SQL datatype, the two are not so friendly, because a relational database is not a document database. When using documents in PostgreSQL, you must learn how to query and index them.
Note that jsonb_array_elements is not standard SQL, but PostgreSQL 17 introduced JSON_TABLE, which is part of the standard. The query can be re-written as:
SELECT books.title
FROM books
JOIN JSON_TABLE(
books.data->'reviews',
'$[*]' COLUMNS (
name TEXT PATH '$.name'
)
) AS review
ON review.name = 'John'
;
This is the standard SQL/JSON way to query documents. Unfortunately, it is not efficient as no index scan is possible. Don't forget that SQL indexes are not part of the SQL standard.
Efficient query with JSON operators
To efficiently query JSONB data for reviews by a specific person, we need to utilize PostgreSQL's containment operator @> instead of relying on standard SQL:
SELECT title FROM books
WHERE data->'reviews' @> '[{"name": "John"}]'
;
Now that I filter directly on the table without transforming the document, I can create an index. Since there can be multiple keys per table row, an inverted index is necessary:
CREATE INDEX ON books USING gin ((data->'reviews') jsonb_path_ops)
;
With an index for the JSON path operators, each key corresponds to an item in the array. This can be utilized when querying with an equality filter on the embedded array field:
QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on books
Recheck Cond: ((data -> 'reviews'::text) @> '[{"name": "John"}]'::jsonb)
-> Bitmap Index Scan on books_expr_idx
Index Cond: ((data -> 'reviews'::text) @> '[{"name": "John"}]'::jsonb)
GIN (Generalized Inverted Index) is designed for datatypes that include multiple keys, such as array items, or the words, stems, or trigrams in text. While powerful, GIN has limitations: it cannot support range queries, optimize ORDER BY clauses, or perform covering projections (no Index Only Scan).
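As an illustration of these limits, here is a hypothetical pagination query on the same table; the published_at field is assumed for the example. The GIN index can serve the @> containment filter, but not the ORDER BY and LIMIT, so PostgreSQL must fetch and sort all matching rows:
SELECT title FROM books
WHERE data->'reviews' @> '[{"name": "John"}]'
ORDER BY data->>'published_at' DESC -- assumed field; the GIN index cannot provide this order
LIMIT 10
;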
Comparison with a document database
While PostgreSQL offers flexibility in storing and indexing JSON documents, it does not replace a document database where documents are native types. For instance, in MongoDB the fields within an array are used like any other fields. I insert a similar document in MongoDB:
db.books.insertOne({
book_id: 8675309,
title: "Brave New World",
reviews: [
{ name: "John", text: "Amazing!" },
{ name: "Jane", text: "Incredible book!" }
]
});
There is no need for special operators, and I can query the embedded field like any other field:
db.books.find(
{ "reviews.name": "John" } // filter
, { title: 1, _id: 0 } // projection
);
[ { title: 'Brave New World' } ]
There is no need for a special index type, and I can index the embedded field like any other field:
db.books.createIndex({ "reviews.name": 1 })
;
The execution plan confirms that the index is used to filter on "reviews.name":
db.books.find(
{ "reviews.name": "John" } // filter
, { title: 1, _id: 0 } // projection
).explain().queryPlanner.winningPlan
;
{
isCached: false,
stage: 'PROJECTION_SIMPLE',
transformBy: { title: 1, _id: 0 },
inputStage: {
stage: 'FETCH',
inputStage: {
stage: 'IXSCAN',
keyPattern: { 'reviews.name': 1 },
indexName: 'reviews.name_1',
isMultiKey: true,
multiKeyPaths: { 'reviews.name': [ 'reviews' ] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { 'reviews.name': [ '["John", "John"]' ] }
}
}
}
It is a regular index; the only particularity is that it allows multiple keys per document.
Unlike PostgreSQL's GIN index, which requires a Bitmap Scan that doesn't maintain entry order, MongoDB employs a regular index that supports range queries. For instance, if I only know the beginning of a name, I can utilize a Regular Expression to filter the results effectively:
db.books.find(
{ "reviews.name": { $regex: "^Joh" } }, // filter using regex
{ title: 1, _id: 0 } // projection
).explain().queryPlanner.winningPlan
;
{
isCached: false,
stage: 'PROJECTION_SIMPLE',
transformBy: { title: 1, _id: 0 },
inputStage: {
stage: 'FETCH',
inputStage: {
stage: 'IXSCAN',
keyPattern: { 'reviews.name': 1 },
indexName: 'reviews.name_1',
isMultiKey: true,
multiKeyPaths: { 'reviews.name': [ 'reviews' ] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { 'reviews.name': [ '["Joh", "Joi")', '[/^Joh/, /^Joh/]' ] }
}
}
}
MongoDB utilized the index efficiently, as the query planner transformed the regular expression /^Joh/ into a range scan, specifically ["Joh", "Joi").
Conclusion
When comparing PostgreSQL and MongoDB, it is essential to understand their querying and indexing mechanisms, and not rely only on their ability to store JSON.
Like other RDBMS, PostgreSQL excels as a centralized, monolithic database utilized by multiple applications. With its specialized JSONB functions and GIN indexes, it adds some flexibility to the normalized tables.
MongoDB is ideal for development agility, particularly in microservices and domain-driven design, where access patterns are well-defined, but the application evolves with high velocity. Its document model aligns well with business objects.
Ultimately, the choice of a database should be based on your team's expertise, comfort with the database syntax, data modeling, optimal indexing, and access to new hires and educational resources. The best database for a specific workload will not perform as expected if there's no expertise to code efficient queries, read execution plans, and index the access paths. Another database may be good enough when it fits the development organization better, provides a better developer experience, and simplifies optimization.
Benchmarking PostgreSQL: The Hidden Cost of Over-Indexing
April 29, 2025
Announcing Vitess 22
April 28, 2025
PostgreSQL aborts the transactions on error
You may be surprised by this in PostgreSQL:
postgres=!# commit;
ROLLBACK
postgres=#
Yes, I issued a COMMIT but got a ROLLBACK!
I'll demo how it happens and how to avoid it. In short, the transaction was already rolled back, and the only possible command left is a ROLLBACK, which is implicit when terminating the transaction.
I created a table, started a transaction and inserted one row:
postgres=# create table demo ( id int primary key );
CREATE TABLE
postgres=# begin transaction;
BEGIN
postgres=*# insert into demo values (1);
INSERT 0 1
postgres=*#
The * in the prompt (which comes from the %x in the default %/%R%x%# PROMPT1) shows that I'm still in a transaction.
I try to add the same row, with the same key, which violates the primary key constraint:
postgres=*# insert into demo values (1);
ERROR: duplicate key value violates unique constraint "demo_pkey"
DETAIL: Key (id)=(1) already exists.
postgres=!#
The ! in the prompt shows that the transaction failed.
I can check its status from another transaction (trying it within a failed transaction would have raised ERROR: current transaction is aborted, commands ignored until end of transaction block):
postgres=!# \! psql -c "select pid, application_name, state from pg_stat_activity where wait_event='ClientRead'"
pid | application_name | state
-------+------------------+-------------------------------
66420 | psql | idle in transaction (aborted)
(1 row)
postgres=!#
As the transaction has been aborted, it has released all its locks. To verify this, I insert the same row from another session:
postgres=!# \! psql -c "insert into demo values (1)"
INSERT 0 1
postgres=!#
The only thing I can do is end the transaction block, with ABORT, ROLLBACK, or even COMMIT:
postgres=!# commit;
ROLLBACK
postgres=#
If you're used to Oracle Database, you might find it surprising that, in an interactive transaction, you must restart from the beginning even when previous statements completed and only one failed. Oracle Database automatically creates a savepoint before each user call and rolls back to this savepoint in case of an error, so that the user can continue with another statement once aware of the error.
In PostgreSQL, creating implicit savepoints is the client's or driver's responsibility.
For instance, PgJDBC can set autosave=always to achieve this. However, it's important to note that using savepoints in PostgreSQL may be more resource-intensive than in other databases.
Another example is a PL/pgSQL statement with an exception block, which creates an implicit savepoint to roll back the main block before running the exception block. This differs from Oracle Database, which rolls back only the statement that failed when continuing to the exception block of PL/SQL.
If the exceptions are managed by your application code, you must use savepoints to achieve the same.
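Here is a minimal sketch of that pattern on the demo table from above: create a savepoint before a statement that may fail, and roll back to it on error so the rest of the transaction survives (the savepoint name is mine, for illustration):
begin transaction;
insert into demo values (1);
savepoint before_risky_insert;             -- the savepoint an application (or autosave) would create
insert into demo values (1);               -- fails: duplicate key violates the primary key
rollback to savepoint before_risky_insert; -- the transaction is usable again
insert into demo values (2);               -- continue with another value
commit;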
With an interactive user interface, like PSQL, it might be preferable to create an implicit savepoint before each statement, and this is possible with ON_ERROR_ROLLBACK. Here is an example:
postgres=# drop table demo;
DROP TABLE
postgres=# create table demo ( id int primary key );
CREATE TABLE
postgres=# \set ON_ERROR_ROLLBACK on
postgres=# begin transaction;
BEGIN
postgres=*# insert into demo values (1);
INSERT 0 1
postgres=*# insert into demo values (1);
ERROR: duplicate key value violates unique constraint "demo_pkey"
DETAIL: Key (id)=(1) already exists.
postgres=*# \! psql -c "select pid, application_name, state from pg_stat_activity where wait_event='ClientRead'"
pid | application_name | state
-------+------------------+---------------------
66461 | psql | idle in transaction
(1 row)
postgres=*# insert into demo values (2);
INSERT 0 1
postgres=*# commit;
COMMIT
In this interactive transaction, with ON_ERROR_ROLLBACK set to on, I was able to continue with another value after learning that the one I tried to insert was a duplicate.
When managing exceptions in your application, such as implementing retry logic for serializable errors, consider creating a savepoint before executing a statement. This allows you to continue with the same transaction if an exception is caught. However, be cautious as it is not always the right solution. In cases of deadlocks, one transaction must abort to release its locks. The scope of rollback on errors depends on what has been executed before, so it makes sense that the application controls it rather than relying on defaults.
In interactive usage with PSQL, setting ON_ERROR_ROLLBACK is advisable, to prevent a simple typo from rolling back all previous work. It is unnecessary if you do not start a transaction explicitly and rely on autocommit, but running interactive commands without the ability to verify outcomes before committing changes is not recommended.
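Note that ON_ERROR_ROLLBACK also accepts the value interactive, which applies the implicit savepoint only in interactive sessions and not when running scripts, making \set ON_ERROR_ROLLBACK interactive a reasonable default for a ~/.psqlrc.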
April 27, 2025
Index Only Scan on JSON Documents, with Covering and Multi-Key Indexes in MongoDB
Storing JSON documents in a SQL database does not make it a document database. The strength of any database lies in its indexing capabilities, and SQL databases, even with JSON datatypes, lack the flexibility of document databases, particularly when dealing with arrays in embedded One-to-Many relationships.
In SQL databases, normalized tables are first-class citizens, with single-key indexes optimized for handling equality and range filters (WHERE), sorting (ORDER BY), and projection coverage (SELECT) for individual tables. To avoid joining before filtering, a denormalized model is preferred when those operations involve multiple tables.
While SQL databases can store documents, including arrays for One-to-Many relationships, working with arrays in JSON necessitates the use of inverted indexes like PostgreSQL's GIN, which do not cover range filters, sorting, and projection like regular indexes.
In contrast, MongoDB treats documents as the core of its data model. Its indexing mechanisms naturally extend to handle documents and their arrays, retaining functionality for filtering, sorting, and projection coverage, as regular indexes can handle multiple keys per document.
In previous posts, we examined how a multi-key index supports sort operations. Now, let's explore the conditions under which a query projection is covered by the index, eliminating the need to fetch the document from the collection.
Here are the execution plans tested:
- With default projection: IXSCAN ➤ FETCH
- Partial projection: IXSCAN ➤ FETCH ➤ PROJECTION_SIMPLE
- Full projection: IXSCAN ➤ PROJECTION_COVERED
- Query on a single-key entry: IXSCAN ➤ FETCH ➤ PROJECTION_SIMPLE
- Query on a multi-key entry: IXSCAN ➤ FETCH ➤ PROJECTION_SIMPLE
- Projection of "_id" : IXSCAN ➤ FETCH ➤ PROJECTION_SIMPLE
- Projection of "$recordId" : IXSCAN ➤ PROJECTION_DEFAULT <!-- TOC end -->
Here is a collection of friends, with their first name, last name, and phone number.
db.friends.insertMany([
{ firstName: "Rachel", lastName: "Green", phoneNumber: "555-1234" },
{ firstName: "Ross", lastName: "Geller", phoneNumber: "555-2345" },
{ firstName: "Monica", lastName: "Geller", phoneNumber: "555-3456" },
{ firstName: "Chandler", lastName: "Bing", phoneNumber: "555-4567" },
{ firstName: "Joey", lastName: "Tribbiani", phoneNumber: "555-6789" },
{ firstName: "Janice", lastName: "Hosenstein", phoneNumber: "555-7890" },
{ firstName: "Gunther", lastName: "Centralperk", phoneNumber: "555-8901" },
{ firstName: "Carol", lastName: "Willick", phoneNumber: "555-9012" },
{ firstName: "Susan", lastName: "Bunch", phoneNumber: "555-0123" },
{ firstName: "Mike", lastName: "Hannigan", phoneNumber: "555-1123" },
{ firstName: "Emily", lastName: "Waltham", phoneNumber: "555-2234" }
])
In any database, relational or document, implementing new use cases often requires an index for effective access patterns. For instance, for a reverse phone directory, I create an index where the key starts with the phone number. I add the names to the key in order to allow for index-only scans to benefit from the O(log n) scalability of B-Tree indexes:
db.friends.createIndex(
{ phoneNumber:1, firstName:1, lastName:1 }
)
To confirm that an index-only scan is occurring, I examine the execution plan, looking for PROJECTION_COVERED rather than FETCH.
With default projection: IXSCAN ➤ FETCH
Due to its flexible schema, MongoDB cannot assume that all fields in every document within a collection are covered by the index. As a result, it must fetch the entire document:
db.friends.find(
{ phoneNumber:"555-6789" }
).explain('executionStats').executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
executionStages: {
isCached: false,
stage: 'FETCH',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 1,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { phoneNumber: 1, firstName: 1, lastName: 1 },
indexName: 'phoneNumber_1_firstName_1_lastName_1',
isMultiKey: false,
multiKeyPaths: { phoneNumber: [], firstName: [], lastName: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
phoneNumber: [ '["555-6789", "555-6789"]' ],
firstName: [ '[MinKey, MaxKey]' ],
lastName: [ '[MinKey, MaxKey]' ]
},
keysExamined: 1,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
Looking at the result, I can see the "_id" which is stored in the document:
db.friends.find(
{ phoneNumber:"555-6789" }
)
[
{
_id: ObjectId('680d46a1672e2e146dd4b0c6'),
firstName: 'Joey',
lastName: 'Tribbiani',
phoneNumber: '555-6789'
}
]
I can remove it from the projection as I don't need it.
Partial projection: IXSCAN ➤ FETCH ➤ PROJECTION_SIMPLE
I add a projection to exclude the "_id" from the result, but it doesn't remove the FETCH that gets the document with all fields:
db.friends.find(
{ phoneNumber:"555-6789" }
, { "_id":0 }
).explain('executionStats').executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
executionStages: {
isCached: false,
stage: 'PROJECTION_SIMPLE',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0 },
inputStage: {
stage: 'FETCH',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 1,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { phoneNumber: 1, firstName: 1, lastName: 1 },
indexName: 'phoneNumber_1_firstName_1_lastName_1',
isMultiKey: false,
multiKeyPaths: { phoneNumber: [], firstName: [], lastName: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
phoneNumber: [ '["555-6789", "555-6789"]' ],
firstName: [ '[MinKey, MaxKey]' ],
lastName: [ '[MinKey, MaxKey]' ]
},
keysExamined: 1,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
}
Even if I know that my documents have no other fields, the query planner doesn't know it and must plan to get the document.
Full projection: IXSCAN ➤ PROJECTION_COVERED
When the projection declares all fields, and they are in the index key, there's no need to fetch the document as the projection is covered:
db.friends.find(
{ phoneNumber:"555-6789" }
, { "_id":0 , firstName:1, lastName:1, phoneNumber:1 }
).explain('executionStats').executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 0,
executionStages: {
isCached: false,
stage: 'PROJECTION_COVERED',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, firstName: 1, lastName: 1, phoneNumber: 1 },
inputStage: {
stage: 'IXSCAN',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { phoneNumber: 1, firstName: 1, lastName: 1 },
indexName: 'phoneNumber_1_firstName_1_lastName_1',
isMultiKey: false,
multiKeyPaths: { phoneNumber: [], firstName: [], lastName: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
phoneNumber: [ '["555-6789", "555-6789"]' ],
firstName: [ '[MinKey, MaxKey]' ],
lastName: [ '[MinKey, MaxKey]' ]
},
keysExamined: 1,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
}
Such a plan is an index-only scan, optimal because it doesn't need to read the documents.
Adding an array instead of a scalar value
Now that we have examined how a query is covered by a single-key index, where each document has a unique index entry, let's explore the implications of a multi-key index. In MongoDB, a field can contain a single value in one document and an array of values in another. I add such a document, where one of the friends has three phone numbers:
db.friends.insertOne({
firstName: "Phoebe",
lastName: "Buffay",
phoneNumber: ["555-3344", "555-4455", "555-5566"]
})
We refer to the index as a multi-key index, but in reality, it remains the same index in MongoDB. The distinction lies in its capacity to hold multiple entries per document, rather than solely single-key entries.
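Conceptually, Phoebe's document now has three index entries, one per array element: ("555-3344", "Phoebe", "Buffay"), ("555-4455", "Phoebe", "Buffay"), and ("555-5566", "Phoebe", "Buffay"), all pointing to the same document.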
Query on a single-key entry: IXSCAN ➤ FETCH ➤ PROJECTION_SIMPLE
When I query the same single-key document as before, the index is now flagged as multi-key (isMultiKey: true), so the projection is no longer covered and a FETCH stage appears:
db.friends.find(
{ phoneNumber:"555-6789" }
, { "_id":0 , firstName:1, lastName:1, phoneNumber:1 }
).explain('executionStats').executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
executionStages: {
isCached: false,
stage: 'PROJECTION_SIMPLE',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, firstName: 1, lastName: 1, phoneNumber: 1 },
inputStage: {
stage: 'FETCH',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 1,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { phoneNumber: 1, firstName: 1, lastName: 1 },
indexName: 'phoneNumber_1_firstName_1_lastName_1',
isMultiKey: true,
multiKeyPaths: { phoneNumber: [ 'phoneNumber' ], firstName: [], lastName: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
phoneNumber: [ '["555-6789", "555-6789"]' ],
firstName: [ '[MinKey, MaxKey]' ],
lastName: [ '[MinKey, MaxKey]' ]
},
keysExamined: 1,
seeks: 1,
dupsTested: 1,
dupsDropped: 0
}
}
}
}
A key advantage of MongoDB's flexible document model is that changes in structure, as the business evolves, do not impact existing documents. This is more agile than SQL databases where changing a One-to-One relationship to a One-to-Many requires complete refactoring of the model and extensive non-regression testing.
Query on a multi-key entry: IXSCAN ➤ FETCH ➤ PROJECTION_SIMPLE
The same happens when I query the document that holds an array of values, visible with isMultiKey: true in the IXSCAN and a FETCH stage:
db.friends.find(
{ phoneNumber:"555-4455" }
, { "_id":0 , firstName:1, lastName:1, phoneNumber:1 }
).explain('executionStats').executionStats
{
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
executionStages: {
isCached: false,
stage: 'PROJECTION_SIMPLE',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
transformBy: { _id: 0, firstName: 1, lastName: 1, phoneNumber: 1 },
inputStage: {
stage: 'FETCH',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
docsExamined: 1,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 1,
executionTimeMillisEstimate: 0,
works: 2,
advanced: 1,
needTime: 0,
needYield: 0,
saveState: 0,
restoreState: 0,
isEOF: 1,
keyPattern: { phoneNumber: 1, firstName: 1, lastName: 1 },
indexName: 'phoneNumber_1_firstName_1_lastName_1',
isMultiKey: true,
multiKeyPaths: { phoneNumber: [ 'phoneNumber' ], firstName: [], lastName: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
phoneNumber: [ '["555-4455", "555-4455"]' ],
firstName: [ '[MinKey, MaxKey]' ],
lastName: [ '[MinKey, MaxKey]' ]
},
keysExamined: 1,
seeks: 1,
dupsTested: 1,
dupsDropped: 0
}
}
}
}
Understanding the behavior is simplified by recognizing that there is one index entry for each key, with only one entry being read (keysExamined: 1). However, the projection requires access to all associated values. Even if a single value is used to locate the document, the result must display all relevant values:
db.friends.find(
{ phoneNumber:"555-4455" }
, { "_id":0 , firstName:1, lastName:1, phoneNumber:1 }
)
[
{
firstName: 'Phoebe',
lastName: 'Buffay',
phoneNumber: [ '555-3344', '555-4455', '555-5566' ]
}
]
April 25, 2025
pgvector: The Critical PostgreSQL Component for Your Enterprise AI Strategy
You’re likely racing to enhance your applications with more intelligent, data-driven capabilities, whether through AI-powered models (which have moved into “must implement now!” territory), advanced search functions, real-time fraud detection, or geospatial analysis. As these demands grow, you face a significant challenge: efficiently storing, managing, and querying high-dimensional vector data within your existing database infrastructure. […]