$exists and non-sparse indexes in MongoDB and in other DocumentDB
In SQL databases, NULL represents an unknown value — not the absence of a value. When a value is simply non-applicable for a given entity, the correct relational modeling approach is normalization: the entity gets no row in the relevant table at all, rather than a NULL in a column. This distinction becomes tricky with OUTER JOIN results, where the absence of a row is surfaced as NULL across all columns of the unmatched side, including key columns — making it easy to confuse "unknown value" with "no row existed."
MongoDB has its own subtlety: a field can be explicitly set to null or simply not exist in the document at all. In the BSON representation, these are distinct: one is a key with a null-typed value, the other is the absence of the key entirely. The schema is flexible: you can define a field or not. But in indexes, this distinction disappears. Except for sparse and partial indexes, an index must hold a key value for every document in the collection. For documents where the field is missing, MongoDB uses null as a stand-in, the same key value used for explicit nulls. This means an index scan cannot distinguish between the two states, and resolving null vs. missing requires fetching the full document to apply a residual filter.
Consequently, a standard index scan with a filter on null or $exists is inexact: the query planner performs an index scan on the null key and then fetches the full document to verify whether the field is truly null or simply absent.
An example: { $exists: true } filter
When you query with { num: { $exists: true } }, you expect MongoDB to use an index on num. Let's test it on MongoDB and on some emulations: Oracle Database, Amazon DocumentDB (AWS), and the DocumentDB extension for PostgreSQL (Microsoft).
Here is my test collection:
db.test.insertMany([
{ _id: 1, num: 42 },
{ _id: 2, num: 7 },
{ _id: 3, num: null },
{ _id: 4 },
{ _id: 5, num: 99 },
{ _id: 6 },
{ _id: 7, num: null },
{ _id: 8, num: 15 }
])
I have inserted eight documents:
- four with real values (_id 1, 2, 5, 8),
- two with the field explicitly set to null (_id 3, 7), and
- two where the field is entirely absent (_id 4, 6).
The query { num: { $exists: true } } should return six documents — everything except _id 4 and 6.
Before touching indexes, notice that $exists is not the same as a null check:
db.test.find({ num: null })
[
{ _id: 3, num: null },
{ _id: 4 },
{ _id: 6 },
{ _id: 7, num: null }
]
db.test.find({ num: { $exists: false } })
[
{ _id: 4 },
{ _id: 6 }
]
db.test.find({ num: { $exists: true } })
[
{ _id: 1, num: 42 },
{ _id: 2, num: 7 },
{ _id: 3, num: null },
{ _id: 5, num: 99 },
{ _id: 7, num: null },
{ _id: 8, num: 15 }
]
A field set to null exists. A field not written into the document does not. This distinction is perfectly clear at the document level. At the index level, it is not.
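The three matching rules above can be mimicked in plain JavaScript, without a database. This is only a sketch of the semantics shown by the queries: hasField, matchesNull, and matchesExists are hypothetical helper names, not MongoDB APIs, and the sketch ignores arrays and nested paths.

```javascript
// Same eight documents as in the test collection above.
const docs = [
  { _id: 1, num: 42 }, { _id: 2, num: 7 }, { _id: 3, num: null }, { _id: 4 },
  { _id: 5, num: 99 }, { _id: 6 }, { _id: 7, num: null }, { _id: 8, num: 15 },
];

// Hypothetical helpers (not MongoDB APIs): they only mimic the matching rules.
const hasField = (doc, field) =>
  Object.prototype.hasOwnProperty.call(doc, field);

// { field: null } matches an explicit null OR a missing field.
const matchesNull = (doc, field) => !hasField(doc, field) || doc[field] === null;

// { field: { $exists: <bool> } } only checks whether the key is present.
const matchesExists = (doc, field, expected) => hasField(doc, field) === expected;

// Return the _id values of the documents accepted by a predicate.
const ids = (pred) => docs.filter(pred).map((d) => d._id);

console.log(ids((d) => matchesNull(d, "num")));          // [ 3, 4, 6, 7 ]
console.log(ids((d) => matchesExists(d, "num", false))); // [ 4, 6 ]
console.log(ids((d) => matchesExists(d, "num", true)));  // [ 1, 2, 3, 5, 7, 8 ]
```

The null check is the union of the two states, while $exists separates them, which is exactly the information a single null index key cannot carry.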
Null in the index key is ambiguous
When MongoDB builds a B-tree index on num, it must create an entry for every document. For documents with no num field, the index key exists with a null value. For documents where num is explicitly set to null, it also stores null. Both cases produce the same index key.
Here is how the non-sparse index looks:
Non-sparse index on { num: 1 }:
null → _id:3 { num: null } explicit null
null → _id:4 { } missing field
null → _id:6 { } missing field
null → _id:7 { num: null } explicit null
7 → _id:2
15 → _id:8
42 → _id:1
99 → _id:5
The four entries under the null key are indistinguishable from the index alone. To evaluate $exists, the engine must read the actual document. This is called a residual predicate — a filter condition the index cannot resolve, deferred to a later fetch stage.
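To make the ambiguity concrete, here is a minimal JavaScript model of that index layout. It is an illustration under simplifying assumptions (numeric and null keys only, an in-memory sorted array instead of a B-tree), and buildNonSparseIndex and compareKeys are hypothetical names, not MongoDB internals.

```javascript
const docs = [
  { _id: 1, num: 42 }, { _id: 2, num: 7 }, { _id: 3, num: null }, { _id: 4 },
  { _id: 5, num: 99 }, { _id: 6 }, { _id: 7, num: null }, { _id: 8, num: 15 },
];

// Order keys with null before any number, as in the layout above.
function compareKeys(a, b) {
  if (a === null && b === null) return 0;
  if (a === null) return -1;
  if (b === null) return 1;
  return a - b;
}

// Hypothetical model of a non-sparse index: one entry per document,
// with null standing in for a missing field.
function buildNonSparseIndex(collection, field) {
  return collection
    .map((doc) => ({ key: field in doc ? doc[field] : null, id: doc._id }))
    .sort((a, b) => compareKeys(a.key, b.key) || a.id - b.id);
}

const index = buildNonSparseIndex(docs, "num");

// Entries under the null key: explicit nulls and missing fields are mixed.
const nullBucket = index.filter((e) => e.key === null).map((e) => e.id);

console.log(index.map((e) => `${e.key} -> _id:${e.id}`));
console.log(nullBucket); // [ 3, 4, 6, 7 ]
```

Each entry holds only a key and a document identifier, so nothing in the null bucket says which of the four documents actually has the field.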
Another way to look at it: the document schema is flexible, with no structure declared upfront and fields that may or may not exist, whereas indexes are different—their schema is declared, and the key fields always exist.
MongoDB with a non-sparse index
I create a regular index, which is by default non-sparse and has one index entry per document (or more for multi-key indexes).
db.test.createIndex({ num: 1 })
db.test.find({ num: { $exists: true } }).explain("executionStats")
The execution plan shows what happens across the IXSCAN and FETCH stages:
executionStats: {
nReturned: 6,
totalKeysExamined: 8,
totalDocsExamined: 8,
executionStages: {
stage: 'FETCH',
filter: { num: { '$exists': true } },
nReturned: 6,
docsExamined: 8,
inputStage: {
stage: 'IXSCAN',
nReturned: 8,
isSparse: false,
indexBounds: { num: [ '[MinKey, MaxKey]' ] },
keysExamined: 8
}
}
}
The IXSCAN returns all 8 index entries across the full [MinKey, MaxKey] range. The FETCH stage then reads all 8 documents and applies filter: { num: { $exists: true } } as a residual predicate, discarding _id 4 and 6. Notice docsExamined: 8 but nReturned: 6 — two fetches were wasted. The index was used, but the null bucket forced unnecessary work.
MongoDB with a sparse index
A sparse index excludes documents where the indexed field is entirely absent. It does not exclude explicit null values. Documents _id 3 and 7 have num: null and are still indexed.
db.test.createIndex({ num: 1 }, { sparse: true })
db.test.find({ num: { $exists: true } }).explain("executionStats")
As I have no projection, there is still a FETCH, but only for the documents in the final result:
executionStats: {
nReturned: 6,
totalKeysExamined: 6,
totalDocsExamined: 6,
executionStages: {
stage: 'FETCH',
nReturned: 6,
docsExamined: 6,
inputStage: {
stage: 'IXSCAN',
nReturned: 6,
isSparse: true,
indexBounds: { num: [ '[MinKey, MaxKey]' ] },
keysExamined: 6
}
}
}
keysExamined dropped from 8 to 6 — the two missing-field documents are not in the index. More importantly, the FETCH stage has no filter. There is no residual predicate. Every document pointed to by the sparse index either has a real value or has an explicit null — both satisfy $exists: true. The index structure itself proves the condition. The FETCH still happens because find() needs to return the documents, but it is doing useful work only, not wasted disambiguation.
Here is how the sparse index looks:
Sparse index on { num: 1 }:
null → _id:3 { num: null } explicit null — indexed
null → _id:7 { num: null } explicit null — indexed
7 → _id:2
15 → _id:8
42 → _id:1
99 → _id:5
_id:4 { } — not indexed
_id:6 { } — not indexed
The null bucket still exists in a sparse index, but it contains only explicit nulls. The ambiguity is gone.
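The difference between the two plans can be reproduced with a small simulation. This is a sketch that assumes a full [MinKey, MaxKey] scan followed by one fetch per index entry; buildIndex and existsTrue are hypothetical names, and the counters only imitate the explain fields shown above.

```javascript
const docs = [
  { _id: 1, num: 42 }, { _id: 2, num: 7 }, { _id: 3, num: null }, { _id: 4 },
  { _id: 5, num: 99 }, { _id: 6 }, { _id: 7, num: null }, { _id: 8, num: 15 },
];
const byId = new Map(docs.map((d) => [d._id, d]));

// Hypothetical index builder: a sparse index skips documents without the field,
// a non-sparse index stores null for them. Key order is irrelevant for a full scan.
function buildIndex(field, sparse) {
  return docs
    .filter((d) => !sparse || field in d)
    .map((d) => ({ key: field in d ? d[field] : null, id: d._id }));
}

// Imitates FETCH <- IXSCAN for { field: { $exists: true } }.
function existsTrue(field, sparse) {
  const stats = { keysExamined: 0, docsExamined: 0, nReturned: 0 };
  for (const entry of buildIndex(field, sparse)) {
    stats.keysExamined++;
    const doc = byId.get(entry.id); // FETCH: read the full document
    stats.docsExamined++;
    // Non-sparse: the residual filter must inspect the document.
    // Sparse: being in the index already proves that the field exists.
    if (sparse || field in doc) stats.nReturned++;
  }
  return stats;
}

const nonSparseStats = existsTrue("num", false);
const sparseStats = existsTrue("num", true);
console.log(nonSparseStats); // { keysExamined: 8, docsExamined: 8, nReturned: 6 }
console.log(sparseStats);    // { keysExamined: 6, docsExamined: 6, nReturned: 6 }
```

On eight documents the two extra fetches are negligible, but the wasted work grows with the number of documents that lack the field.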
Oracle Database
I reproduced the same on Oracle Database with the MongoDB emulation:
ora> db.test.createIndex({ num: 1 })
num_1
ora> db.test.find({ num: { $exists: true } }).explain("executionStats")
{
queryPlanner: {
namespace: 'ora.test',
parsedQuery: { num: { '$exists': true } },
rewrittenQuery: { num: { '$exists': true } },
generatedSql: `select "DATA",rawtohex("RESID"),"ETAG" from "ORA"."test" where JSON_EXISTS("DATA",'$?(exists(@.num)) ' type(strict))`,
winningPlan: ' Plan Hash Value : 3552627291 \n' +
'\n' +
'--------------------------------------------------------------------------------------------------\n' +
'| Id | Operation | Name | Rows | Bytes | Cost | Time |\n' +
'--------------------------------------------------------------------------------------------------\n' +
'| 0 | SELECT STATEMENT | | 1 | 24501 | 2 | 00:00:01 |\n' +
'| 1 | TABLE ACCESS BY INDEX ROWID BATCHED | test | 1 | 24501 | 2 | 00:00:01 |\n' +
'| 2 | HASH UNIQUE | | 1 | 24501 | | |\n' +
'| * 3 | INDEX RANGE SCAN (MULTI VALUE) | $ora:test.num_1 | 1 | | 1 | 00:00:01 |\n' +
'--------------------------------------------------------------------------------------------------\n' +
'\n' +
'Predicate Information (identified by operation id):\n' +
'------------------------------------------\n' +
`* 3 - access(JSON_QUERY("DATA" /*+ LOB_BY_VALUE */ FORMAT OSON , '$."num"[*]' RETURNING ANY ORA_RAWCOMPARE ASIS WITHOUT ARRAY WRAPPER ERROR ON ERROR PRESENT ON EMPTY NULL ON MISMATCH TYPE(LAX)\n` +
" MULTIVALUE)>HEXTORAW('01'))\n" +
'\n' +
'\n' +
'Notes\n' +
'-----\n' +
'- Dynamic sampling used for this statement ( level = 2 )\n' +
'\n'
},
serverInfo: { host: 'localhost', port: 27017, version: '7.0.22' },
ok: 1
}
ora>
It doesn't display the execution statistics, but I can get them from the SQL endpoint:
sql> select /*+ gather_plan_statistics */ "DATA",rawtohex("RESID"),"ETAG" from "ORA"."test" where JSON_EXISTS("DATA",'$?(exists(@.num)) ' type(strict));
DATA RAWTOHEX("RESID") ETAG
_______________________ ____________________ ___________________________________
{"_id":3,"num":null} 03C104 523160F3D2777CB2E0637B5B000A71CD
{"_id":7,"num":null} 03C108 523160F3D27F7CB2E0637B5B000A71CD
{"_id":2,"num":7} 03C103 523160F3D2757CB2E0637B5B000A71CD
{"_id":8,"num":15} 03C109 523160F3D2817CB2E0637B5B000A71CD
{"_id":1,"num":42} 03C102 523160F3D2737CB2E0637B5B000A71CD
{"_id":5,"num":99} 03C106 523160F3D27B7CB2E0637B5B000A71CD
6 rows selected.
sql> select * from dbms_xplan.display_cursor(format=>'allstats last');
PLAN_TABLE_OUTPUT
____________________________________________________________________________________________________________________
SQL_ID c08vsvqpn75vw, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ "DATA",rawtohex("RESID"),"ETAG"
from "ORA"."test" where JSON_EXISTS("DATA",'$?(exists(@.num)) '
type(strict))
Plan hash value: 3552627291
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 |00:00:00.01 | 2 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| test | 1 | 1 | 6 |00:00:00.01 | 2 |
| 2 | HASH UNIQUE | | 1 | 1 | 6 |00:00:00.01 | 1 |
|* 3 | INDEX RANGE SCAN (MULTI VALUE) | $ora:test.num_1 | 1 | 1 | 6 |00:00:00.01 | 1 |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("test"."SYS_NC00005$">HEXTORAW('01'))
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
26 rows selected.
The index range scan returned 6 entries, as if it were a sparse index. We cannot create a sparse index on Oracle Database:
ora> db.test.dropIndex({ num: 1 })
{ nIndexesWas: 2, ok: 1 }
ora> db.test.createIndex({ num: 1 } , { sparse: 1 })
MongoServerError[MONGO-67]: Unsupported index option: sparse
Amazon DocumentDB (AWS)
AWS DocumentDB speaks the MongoDB wire protocol but is built on a completely different architecture. The storage layer is distributed like Aurora, replicated across three availability zones. The query planner and storage engine are specific to Amazon DocumentDB and deliver performance characteristics that differ from both MongoDB and standard PostgreSQL.
Non-sparse index on Amazon DocumentDB
The { num: { $exists: true } } query does not use the non-sparse index (created as db.test.createIndex({ num: 1 })) on Amazon DocumentDB (tested on version 8, planner version 3):
queryPlanner: {
plannerVersion: 3,
winningPlan: { stage: 'COLLSCAN', filter: { num: { '$exists': true } } }
},
executionStats: {
nReturned: '6',
executionTimeMillis: '14.121',
planningTimeMillis: '14.019',
executionStages: {
stage: 'COLLSCAN',
nReturned: '6',
executionTimeMillisEstimate: '0.025'
}
}
The index is completely abandoned. The planner chose a full collection scan.
Sparse index on Amazon DocumentDB
With the index created as db.test.createIndex({ num: 1 }, { sparse: true }), the index is used:
queryPlanner: {
plannerVersion: 3,
winningPlan: { stage: 'IXSCAN', indexName: 'num_1', direction: 'forward' }
},
executionStats: {
nReturned: '6',
executionTimeMillis: '10.034',
planningTimeMillis: '8.128',
executionStages: {
stage: 'IXSCAN',
nReturned: '6',
executionTimeMillisEstimate: '1.842',
indexName: 'num_1',
direction: 'forward'
}
}
Every entry in the sparse index provably satisfies $exists: true. It scans 6 index entries and returns 6 documents. While a sparse index is optional in MongoDB, it is mandatory in Amazon DocumentDB to use an index for this query at all.
Microsoft DocumentDB on PostgreSQL
Microsoft DocumentDB is implemented as an open-source PostgreSQL extension, accessed via the MongoDB wire protocol through a compatible endpoint.
With DocumentDB on PostgreSQL, a sparse index is not required for an optimal access path. I created the index as db.test.createIndex({ num: 1 }) and used a hint to force the index, since on a small collection the cost-based planner would otherwise prefer a sequential scan:
db.test.find(
{ num: { $exists: true } }
).hint("num_1").explain("executionStats")
(Be careful when using a hint in MongoDB: forcing a sparse or partial index limits the scan to the documents that are indexed, which can change the query result.)
The execution plan reads only the necessary entries from the index:
executionStats: {
nReturned: Long('6'),
executionTimeMillis: 0.093,
executionStartAtTimeMillis: 0.089,
totalDocsExamined: Long('6'),
totalKeysExamined: Long('6'),
executionStages: {
stage: 'FETCH',
nReturned: Long('6'),
executionTimeMillis: 0.093,
executionStartAtTimeMillis: 0.089,
totalKeysExamined: 6,
numBlocksFromCache: 24,
inputStage: {
stage: 'IXSCAN',
nReturned: Long('6'),
executionTimeMillis: 0.093,
executionStartAtTimeMillis: 0.089,
indexName: 'num_1',
totalKeysExamined: 6,
numBlocksFromCache: 24
}
}
}
It shows the same count for index entries (totalKeysExamined: 6) and documents fetched (totalDocsExamined: 6).
Here we can go further: we can bypass the MongoDB layer entirely and query PostgreSQL directly, seeing exactly what the database engine sees.
To understand why, we can look at the underlying implementation from the PostgreSQL catalog:
\d documentdb_data.documents_15
Table "documentdb_data.documents_15"
Column | Type | Collation | Nullable | Default
-----------------+--------+-----------+----------+---------
shard_key_value | bigint | | not null |
object_id | bson | | not null |
document | bson | | not null |
Indexes:
"collection_pk_15" PRIMARY KEY, btree (shard_key_value, object_id)
"documents_rum_index_47" documentdb_extended_rum
(document bson_extended_rum_composite_path_ops
(pathspec='[ "num" ]', tl='2691'))
There are no individual columns for num, name, or any other document field. The entire document is stored as a single bson blob in the document column. PostgreSQL has no native knowledge of what is inside it. The collection name test maps to documents_15, where 15 is the collection's internal identifier.
The index is not a standard PostgreSQL B-tree. It is an Extended RUM index — documentdb_extended_rum — with a custom operator class: bson_extended_rum_composite_path_ops. RUM is an extension of GIN (Generalized Inverted Index) that adds support for ordering, range scans, and additional per-entry metadata. The operator class is the critical piece: it knows how to extract the num field from the opaque BSON blob and store it in a structure PostgreSQL can search. pathspec='[ "num" ]' tells it which field to index.
We can obtain the PostgreSQL execution plan directly using the DocumentDB API. I disabled sequential scans to override the cost-based planner's preference on this small table:
postgres=# set enable_seqscan to off;
explain (analyze, buffers, verbose, costs off)
select document from bson_aggregation_find(
'test',
'{
"find": "test",
"filter": { "num": { "$exists": true } }
}'::documentdb_core.bson
);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Index Scan using num_1 on documentdb_data.documents_15 collection (actual time=0.036..0.040 rows=6 loops=1)
Output: document
Index Cond: (collection.document @>= '{ "num" : { "$minKey" : 1 } }'::bson)
Buffers: shared hit=3
Planning:
Buffers: shared hit=26
Planning Time: 0.416 ms
Execution Time: 0.052 ms
(8 rows)
The $exists: true predicate has been translated into a PostgreSQL index condition: document @>= '{ "num": { "$minKey": 1 } }'::bson. This uses a custom BSON operator @>= meaning "document has field num with a value greater than or equal to MinKey."
MinKey is a special BSON sentinel value that sits below every other BSON value in the type ordering. The condition @>= MinKey therefore means "field num exists and has any BSON value at all" — which is exactly $exists: true. Existence becomes a range scan from the minimum possible value: an elegant encoding.
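The idea can be sketched in JavaScript with a simplified comparator. It covers only a subset of the BSON comparison order (MinKey, null, numbers, strings, MaxKey), uses symbols as stand-ins for the sentinel values, and every name in it is hypothetical rather than taken from the DocumentDB extension or a driver.

```javascript
// Stand-ins for the BSON sentinels (hypothetical, not driver classes).
const MinKey = Symbol("MinKey");
const MaxKey = Symbol("MaxKey");

// Simplified subset of the BSON comparison order:
// MinKey < null < numbers < strings < MaxKey.
function typeRank(v) {
  if (v === MinKey) return 0;
  if (v === null) return 1;
  if (typeof v === "number") return 2;
  if (typeof v === "string") return 3;
  if (v === MaxKey) return 4;
  throw new TypeError("type not covered by this sketch");
}

function compareBson(a, b) {
  const byType = typeRank(a) - typeRank(b);
  if (byType !== 0) return byType;
  if (typeof a === "number") return a - b;
  if (typeof a === "string") return a < b ? -1 : a > b ? 1 : 0;
  return 0; // MinKey, null, MaxKey: a single value per type
}

// $exists: true expressed as "field present and value >= MinKey".
function existsViaMinKey(doc, field) {
  return field in doc && compareBson(doc[field], MinKey) >= 0;
}

console.log(existsViaMinKey({ _id: 1, num: 42 }, "num"));   // true
console.log(existsViaMinKey({ _id: 3, num: null }, "num")); // true
console.log(existsViaMinKey({ _id: 4 }, "num"));            // false
```

Any value that is present compares greater than or equal to MinKey, including an explicit null, while a missing field has no value to compare and falls outside the range.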
The RUM index is path-based and only creates entries for paths that actually exist in documents. However, documents where num is absent also have an index entry, which is scanned by the opposite filter { num: { $exists: false } }:
postgres=# explain (analyze, buffers, verbose, costs off)
select document from bson_aggregation_find(
'test',
'{
"find": "test",
"filter": { "num": { "$exists": false } }
}'::documentdb_core.bson
);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Index Scan using num_1 on documentdb_data.documents_15 collection (actual time=0.030..0.033 rows=2 loops=1)
Output: document
Index Cond: (collection.document @? '{ "num" : false }'::bson)
Buffers: shared hit=3
Planning:
Buffers: shared hit=80
Planning Time: 0.282 ms
Execution Time: 0.070 ms
(8 rows)
This has read the two rows (rows=2) without the num field.
The complete picture
Here is the summary for { $exists: true } queries on the example above:
| | Non-sparse index | Sparse index |
|---|---|---|
| MongoDB | FETCH ← IXSCAN, 8 keys, 8 docs, residual filter, 2 wasted fetches | FETCH ← IXSCAN, 6 keys, 6 docs, no residual filter |
| Amazon DocumentDB (AWS) | COLLSCAN, no index, 8 docs | IXSCAN, 6 keys |
| DocumentDB on PostgreSQL (Microsoft) | FETCH ← IXSCAN, 6 keys, 6 docs, no residua... (truncated)