A curated list of database news from authoritative sources

June 17, 2025

Percona Software for MongoDB Release Plan Updates

Percona commits to delivering robust, enterprise-grade, and community-driven software for MongoDB. We build our databases and tools to meet evolving data needs. As MongoDB continues to evolve, we’ve updated and clarified our release strategy for the Percona Software for MongoDB suite. This suite includes: Percona Server for MongoDB (PSMDB) Percona Backup for MongoDB (PBM) Percona […]

June 16, 2025

Implement row-level security in Amazon Aurora MySQL and Amazon RDS for MySQL

Row-level security (RLS) is a security mechanism that enhances data protection in scalable applications by controlling access at the individual row level. It enables organizations to implement fine-grained access controls based on user attributes, so users can only view and modify data they’re authorized to access. This post focuses on implementing a cost-effective custom RLS solution using native MySQL features, making it suitable for a wide range of use cases without requiring additional software dependencies. This solution is applicable for both Amazon Relational Database Service (Amazon RDS) for MySQL and Amazon Aurora MySQL-Compatible Edition, providing flexibility for users of either service.

Integrating Repmgr with Barman: Managing Backups During Switchovers

Repmgr is an open source tool suite for managing replication and failover in a cluster of PostgreSQL servers. It enhances PostgreSQL’s built-in hot-standby capabilities with tools to set up standby servers, monitor replication, and perform administrative tasks such as failover or manual switchover operations. If you’re already familiar with Patroni, think of repmgr as a […]

June 15, 2025


Queries on JSON 👉🏻 compound indexes (Equality, Sort, Range)

In a blog post titled New Benchmarks Show Postgres Dominating MongoDB in Varied Workloads, EDB claims that Postgres outperforms MongoDB in document-based data tests. While I generally find such marketing benchmarks useless, they highlight mistakes made by vendors who compare a database where they are experts with one they don't know and have no desire to learn. This provides an opportunity to revisit some basics of efficient index and query design.

There are four queries in this benchmark, with data loaded from the GitHub Archive for 2015. The data is in JSON, which makes it suitable for testing queries on documents, and the queries are OLAP-style, using aggregation pipelines in MongoDB.

Load documents to a collection

I used the same method to load data in a small lab to reproduce the queries:

for file in http://data.gharchive.org/2015-{01..12}-{01..31}-{0..23}.json.gz
 do
  wget -q -o /dev/null -O - $file |
  gunzip -c |
  mongoimport --collection="github2015" 
 done

Here is an example of one document:

db.github2015.find().limit(1);

[
  {
    _id: ObjectId('684ee281d8d90a383a078112'),
    id: '2489368070',
    type: 'PushEvent',
    actor: {
      id: 9152315,
      login: 'davidjhulse',
      gravatar_id: '',
      url: 'https://api.github.com/users/davidjhulse',
      avatar_url: 'https://avatars.githubusercontent.com/u/9152315?'
    },
    repo: {
      id: 28635890,
      name: 'davidjhulse/davesbingrewardsbot',
      url: 'https://api.github.com/repos/davidjhulse/davesbingrewardsbot'                                                                      
    },
    payload: {
      push_id: 536740396,
      size: 1,
      distinct_size: 1,
      ref: 'refs/heads/master',
      head: 'a9b22a6d80c1e0bb49c1cf75a3c075b642c28f81',
      before: '86ffa724b4d70fce46e760f8cc080f5ec3d7d85f',
      commits: [
        {
          sha: 'a9b22a6d80c1e0bb49c1cf75a3c075b642c28f81',
          author: {
            email: 'da8d7d1118ca5befd4d0d3e4f449c76ba6f1ee7e@live.com',
            name: 'davidjhulse'
          },
          message: 'Altered BingBot.jar\n\nFixed issue with multiple account support',                                                         
          distinct: true,
          url: 'https://api.github.com/repos/davidjhulse/davesbingrewardsbot/commits/a9b22a6d80c1e0bb49c1cf75a3c075b642c28f81'                 
        }
      ]
    },
    public: true,
    created_at: '2015-01-01T00:00:00Z'
  }
]

This dataset is ideal for testing a document database because:

  • documents have an average size of three kilobytes, with some up to two megabytes.
  • it features a structured format with sub-documents like "actor" and "payload."
  • it contains arrays, such as "payload.commits," necessitating multi-key or inverted indexes.

I will test the four queries used by the benchmark and include an additional one that queries the array "payload.commits," which the benchmark overlooked. MongoDB's multi-key indexes significantly outperform all SQL databases in this regard (see the Multi-key Indexes series). While many vendor benchmarks limit their test coverage to create a favorable impression, my objective is to transparently showcase indexing best practices.

Single-field indexes

Here are the indexes that were created for the vendor benchmark:

 db.github2015.createIndex( {type:1} )
 db.github2015.createIndex( {"repo.name":1} )
 db.github2015.createIndex( {"payload.action":1} )
 db.github2015.createIndex( {"actor.login":1} )
 db.github2015.createIndex( {"payload.issue.comments":1} )

Seeing the index definitions, I already know why they got better results on PostgreSQL for some queries. Complex queries rarely filter or sort on a single field, and a well-designed database should have compound indexes. Without the right compound indexes, PostgreSQL can combine multiple indexes with bitmap scans, which is not ideal, as it cannot cover range filters or sort orders. However, combining single-field indexes can reduce the number of indexes to create, which matters because additional indexes slow down the vacuum process.

The document model in MongoDB offers the advantage of having all important fields consolidated within a single document. This allows for the use of a compound index that can effectively handle equality, range, and sort order. These indexes can be applied to both scalar values and arrays.

Benchmark queries

I ran the benchmark queries in my lab to illustrate indexing practices, and created the appropriate indexes for each. The best starting point in the documentation is The ESR (Equality, Sort, Range) Guideline.

Query a) Repositories order by most open issues quantity

The query is:

db.github2015.aggregate([  
  { $match: { $and: [ { type: "IssuesEvent"} , { "payload.action" : "opened" } ] }},  
  { $group: { _id: "$repo.name", total: { $sum: 1 } } },  
  { $sort: { total: -1 } }  
]).explain("executionStats")

Index according to the ESR guideline:

  • Equality: "type" and "payload.action", with the less selective first (better compression)
  • Sort: "repo.name", as grouping is faster on sorted keys
db.github2015.createIndex(
 { "payload.action": 1, "type": 1, "repo.name": 1 }
)  

In my test, the query scanned nearly three hundred thousand index keys and returned in under 800 milliseconds:

        executionStats: {                                                                                                                       
          executionSuccess: true,                                                                                                               
          nReturned: 72361,                                                                                                                     
          executionTimeMillis: 777,                                                                                                             
          totalKeysExamined: 278489,                                                                                                            
          totalDocsExamined: 0,    

In the vendor's benchmark results, MongoDB outperformed PostgreSQL for this query even though the single-field index meant fetching half of the documents. This suggests how much faster MongoDB can be than PostgreSQL with appropriate indexing.

Query b) Return git event type order by quantity from major to minor

The query is:

db.github2015.aggregate([  
  { $group: { _id: "$type", total: { $sum: 1 } } },  
  { $sort: { total: -1 } }  
])

Index according to the ESR guideline:

  • Sort: "type", as grouping is faster on sorted keys
db.github2015.createIndex({ "type": 1 })  

This index was created by the benchmark; however, it is important to know that the query planner will not consider the index without a filter on its leading key (this may be improved in the future if SERVER-13197 is implemented).

The query should simply add an unbounded range on the sort key:

db.github2015.aggregate([  
  { $match: { "type": { $gt: MinKey}  }},  
  { $group: { _id: "$type", total: { $sum: 1 } } },  
  { $sort: { total: -1 } }  
])

In my test, the query aggregated nearly ten million keys in 4.5 seconds:

        executionStats: {
          executionSuccess: true,
          nReturned: 14,
          executionTimeMillis: 4585,
          totalKeysExamined: 9480600,
          totalDocsExamined: 0, 

I enhanced performance significantly by optimizing the query without creating a new index. MongoDB gives users more control over data access than PostgreSQL, even allowing query planner hints to improve efficiency. In this case, it is sufficient to add a {$gt: MinKey} or {$lt: MaxKey} bound.
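
For completeness, a hint can also be passed directly to the aggregation; this is only a sketch of the syntax (the MinKey bound above is what I measured, and whether a hint yields the same covered index scan depends on the plan chosen):

db.github2015.aggregate([
  { $group: { _id: "$type", total: { $sum: 1 } } },
  { $sort: { total: -1 } }
], { hint: { type: 1 } })   // force the planner to use the { type: 1 } index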

Query c) Return the top 10 most active actors

The query is:

db.github2015.aggregate([  
  { $group: { _id: "$actor.login", events: { $sum: 1 } } },  
  { $sort: { events: -1 } },  
  { $limit: 10 }  
])

Index according to the ESR guideline:

  • Sort: "actor.login", as grouping is faster on sorted keys
db.github2015.createIndex({ "actor.login": 1 })  

As seen above, we need to add an unbounded range filter to get the index considered by the query planner:

db.github2015.aggregate([  
  { $match: { "actor.login": { $gt: MinKey}  }}, 
  { $group: { _id: "$actor.login", events: { $sum: 1 } } },  
  { $sort: { events: -1 } },  
  { $limit: 10 }  
])

In my test, the query aggregated nearly ten million keys in 13.5 seconds:

        executionStats: {                                                                                                                       
          executionSuccess: true,                                                                                                               
          nReturned: 727312,
          executionTimeMillis: 13498,
          totalKeysExamined: 9480600,
          totalDocsExamined: 0,

If you need to optimize this access pattern further, use the computed design pattern: increment a per-actor event counter in a separate collection, leveraging the atomic $inc operator.
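
As a minimal sketch of that pattern (the "actor_stats" collection name is my assumption, not part of the benchmark), each event insert would also bump a counter with an atomic, upserted $inc:

db.actor_stats.updateOne(
  { _id: "davidjhulse" },        // one counter document per actor login
  { $inc: { events: 1 } },       // atomic increment, safe under concurrency
  { upsert: true }               // create the counter document on the first event
)
// the top-10 query then reads a handful of small documents:
db.actor_stats.find().sort({ events: -1 }).limit(10)

An index on { events: -1 } in that collection would make the top-10 read fully indexed.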

Query d) Return repositories that have more than two comments and a specific event type, order by average comments from major to minor

The query is:

db.github2015.aggregate([  
  { $match: { "type": "PushEvent", "payload.issue.comments": { $gt : 2 } } },  
  { $group: { _id: "$repo.name", avg: { $avg: "$payload.issue.comments" } } },  
  { $sort: { avg: -1 } }  
])

Index according to the ESR guideline:

  • Equality: "type"
  • Sort: "repo.name", as grouping is faster on sorted keys
  • Range: "payload.issue.comments"
db.github2015.createIndex(
 { "type": 1, "repo.name": 1, "payload.issue.comments": 1  }
)

I don't know if it was done on purpose for the benchmark, but the dataset has no "PushEvent" with a "payload.issue", so the query returns an empty result.
For this particular case, as there are no keys to sort, it is better to place the range key before the sort key:

db.github2015.createIndex(
 { "type": 1, "payload.issue.comments": 1, "repo.name": 1 }
)  

The index immediately finds that there are no keys for those bounds:


        executionStats: {
          executionSuccess: true,
          nReturned: 0,
          executionTimeMillis: 1,
          totalKeysExamined: 0,
          totalDocsExamined: 0,

The query seems to mistakenly expect a "PushEvent" to be an "IssuesEvent," as it looks for "payload.issue" comments. To eliminate this ambiguity, MongoDB offers schema validation (see "Schema Later" considered harmful).
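
As a minimal sketch of such a rule (the exact constraint is my assumption, not part of the benchmark), a validator can reject "PushEvent" documents that carry a "payload.issue" field:

db.runCommand({
  collMod: "github2015",
  validator: {
    $or: [
      { type: { $ne: "PushEvent" } },            // the rule only concerns PushEvent documents
      { "payload.issue": { $exists: false } }    // which must not carry a payload.issue
    ]
  },
  validationLevel: "moderate"   // validate inserts and updates of already-valid documents
})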

Ad-Hoc queries for OLAP

While acceptable performance can be achieved with the right index, it may not be the optimal solution for OLAP use cases. Instead, consider creating a single Atlas Search index to handle all related queries. For further guidance, refer to my previous post: Search Index for Reporting.

Atlas Search Indexes are maintained asynchronously on a dedicated node, ensuring no impact on the operational database. They provide a near-real-time state without the complexity of streaming changes to another database.
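
As a sketch (assuming an Atlas deployment and a recent mongosh, where search indexes are available), a single dynamic index definition is enough to cover ad-hoc reporting fields:

db.github2015.createSearchIndex(
  "default",
  { mappings: { dynamic: true } }   // index all fields dynamically for ad-hoc queries
)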

Another Query: Recent push events by user's commit

Finding recent push events by a user's commits is a relevant use case for this dataset. However, the benchmark did not run any query on "commits", since it is an array. While PostgreSQL supports JSON, it cannot be directly compared to a document database like MongoDB, which handles non-scalar fields natively. If you run the benchmark on PostgreSQL, try this query:

SELECT   
 data->'repo'->>'name' as repo_name,  
 data->'payload'->'commits' as commits_info  
FROM "github2015"  
WHERE   
 data->>'type' = 'PushEvent'  
 AND data->'payload'->'commits' @> '[{"author": {"name": "ggolden@umich.edu"}}]'  
ORDER BY   
 data->>'created_at' DESC  
LIMIT 5;  

You can explore various indexes, such as a GIN index, but you will never find one that directly retrieves the five documents needed for the result. If you do, please correct me and show the execution plan in a comment.

On MongoDB, the same index guideline applies:

  • Equality: "type" (low selectivity) and "payload.commits.author.name" (will be multi-key)
  • Sort: "created_at" (a must for pagination query)
db.github2015.createIndex({  
  "type": 1,  
  "payload.commits.author.name": 1,  
  "created_at": -1  
}) 

The query is simple and doesn't even need an aggregation pipeline:

db.github2015.find({      
  "type": "PushEvent",      
  "payload.commits.author.name": "ggolden@umich.edu"      
}, {  
  "repo.name": 1,  
  "payload.commits.author.name": 1,  
  "payload.commits.message": 1,  
  _id: 0  
}).sort({      
  "created_at": -1      
}).limit(5) 

The execution statistics indicate that only 5 documents have been read, which is the minimum required for the results:

  executionStats: {                                                                                                                                                                                          
    executionSuccess: true,                                                                                                                                                                                  
    nReturned: 5,                                                                                                                                                                                            
    executionTimeMillis: 0,                                                                                                                                                                                  
    totalKeysExamined: 5,                                                                                                                                                                                    
    totalDocsExamined: 5,                                                                                                                                                                                    
    executionStages: {                                                                                                                                                                                       

The execution plan illustrates how a single seek has read five index entries by utilizing a multi-key index. It applies the index bounds for both the equality filters and the sort order, ensuring an efficient retrieval process:

  stage: 'IXSCAN',          
  nReturned: 5,          
  executionTimeMillisEstimate: 0,          
...       
  isMultiKey: true,          
  multiKeyPaths: {          
    type: [],          
    'payload.commits.author.name': [ 'payload.commits' ],          
    created_at: []          
  },          
...                                                                                                                                                                                        
  direction: 'forward',
  indexBounds: {
    type: [ '["PushEvent", "PushEvent"]' ],
    'payload.commits.author.name': [ '["ggolden@umich.edu", "ggolden@umich.edu"]' ],
    created_at: [ '[MaxKey, MinKey]' ]
  },
  keysExamined: 5,
  seeks: 1, 
  dupsTested: 5,

Conclusion

Benchmarks are often biased because vendors optimize the tests for their own database and neglect the others. While their performance claims and time comparisons have little value, looking at the queries and execution plans can serve an educational role by highlighting common design mistakes made with the databases they compare against.

In this post, I applied The ESR (Equality, Sort, Range) Guideline, examined the execution plans, and tested queries relevant to a document model to demonstrate the power of MongoDB for querying JSON data. After replacing two indexes and improving two queries, you should find that MongoDB performs significantly faster than PostgreSQL on those four queries.

June 13, 2025

DuckDB to query MongoDB

DuckDB is a high-performance analytical database, but it is single-process, so it cannot replace an operational database. MongoDB is a general-purpose database for operational data in a flexible format. DuckDB lacks a native connector to MongoDB, but you can combine pg_duckdb and mongo_fdw, two PostgreSQL extensions, to query MongoDB with DuckDB.

A PostgreSQL Docker container with DuckDB and Mongo_FDW

Here is an example. I built a pg_duckdb image, which is PostgreSQL with DuckDB as an extension, and installed mongo_fdw, the Foreign Data Wrapper that accesses MongoDB from PostgreSQL. Here is my Dockerfile:

ARG pg_version=17

# build stage to compile the MongoDB Foreign Data Wrapper

FROM pgduckdb/pgduckdb:${pg_version}-main as build

ARG pg_version

ARG MONGO_FDW_VERSION=5_5_2
ARG MONGO_FDW_URL=https://github.com/EnterpriseDB/mongo_fdw/archive/REL-${MONGO_FDW_VERSION}.tar.gz
ARG SOURCE_FILES=/tmp/mongo_fdw

ENV PKG_CONFIG_PATH=${SOURCE_FILES}/mongo-c-driver/src/libmongoc/src:${SOURCE_FILES}/mongo-c-driver/src/libbson/src
ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu
ENV MONGOC_INSTALL_DIR=${LD_LIBRARY_PATH}
ENV JSONC_INSTALL_DIR=${LD_LIBRARY_PATH}

USER root

# dependencies (compilation and runtime)
RUN apt-get update && apt-get install -y --no-install-recommends wget ca-certificates make gcc cmake pkg-config postgresql-server-dev-${pg_version} libssl-dev libzstd-dev libmongoc-dev libjson-c-dev libsnappy1v5 libmongocrypt0 
# compile mongo_fdw
WORKDIR /var/tmp
ADD ${MONGO_FDW_URL} /var/tmp
WORKDIR /var/tmp
RUN tar -zxvf $(basename ${MONGO_FDW_URL})
# install mongo_fdw
WORKDIR /var/tmp/mongo_fdw-REL-${MONGO_FDW_VERSION}
RUN ./autogen.sh && make USE_PGXS=1 && make USE_PGXS=1 install;

# add mongosh because it can be useful
WORKDIR /var/tmp
ADD https://downloads.mongodb.com/compass/mongosh-2.5.2-linux-x64.tgz /tmp
RUN tar -xvf /tmp/mongosh-2.5.2-linux-x64.tgz
RUN cp ./mongosh-2.5.2-linux-x64/bin/mongosh /usr/local/bin

# final stage to add mongo_fdw to pgduckdb

FROM pgduckdb/pgduckdb:${pg_version}-main

USER root

ARG pg_version
ARG extdir=/usr/share/postgresql/${pg_version}/extension
ARG extlibdir=/usr/lib/postgresql/${pg_version}/lib
ARG libdir=/usr/lib/x86_64-linux-gnu

COPY --from=build ${extdir}/mongo_fdw* ${extdir}/
COPY --from=build ${extlibdir}/mongo_fdw.so ${extlibdir}/
COPY --from=build ${libdir}/libmongoc-1.0.so.0.0.0 \
                  ${libdir}/libbson-1.0.so.0.0.0 \
                  ${libdir}/libmongocrypt.so.0.0.0 \
                  ${libdir}/libsnappy.so.1.1.9 \
                  ${libdir}/ 

RUN cd ${libdir} && \
    ln -sf libmongoc-1.0.so.0.0.0 libmongoc-1.0.so.0 && \
    ln -sf libmongoc-1.0.so.0 libmongoc-1.0.so && \
    ln -sf libbson-1.0.so.0.0.0 libbson-1.0.so.0 && \
    ln -sf libbson-1.0.so.0 libbson-1.0.so && \
    ln -sf libmongocrypt.so.0.0.0 libmongocrypt.so.0 && \
    ln -sf libmongocrypt.so.0 libmongocrypt.so && \
    ln -sf libsnappy.so.1.1.9 libsnappy.so.1 && \
    ln -sf libsnappy.so.1 libsnappy.so;

COPY --from=build /usr/local/bin /usr/local/bin

USER postgres

Start the databases

I started a MongoDB container:

docker run --name mongodb -d mongodb/mongodb-community-server:latest

I built the image and started a container from my image, with network link to mongodb:

docker build -t duckduckmongo . 

docker run --name duckduckmongo -d --link mongodb:mongodb -e POSTGRES_PASSWORD=postgres4mongo duckduckmongo

Create MongoDB collections

In this container, where I've installed the MongoDB Shell, I connected to verify access to the MongoDB database:

docker exec -it duckduckmongo mongosh mongodb:27017/test

While there, I created a user:

db.createUser( { 
  user: "duckduckmongo",
  pwd: "mongo4postgres",
  roles: [ { role: "readWrite", db: "test" } ] 
} );

I created two collections with some data:

db.dim.insertMany( Array.from({ length: 42 }, (_, i) => (
 { 
   dimid: i + 1, 
   dimvalue: Math.random(),
   tags: [ "fr" , "de" , "it" ],
   coord: { x:1, y:-11 } 
})) );

db.fact.insertMany( Array.from({ length: 100000 }, () => (
 { 
   ref: Math.ceil(Math.random() * 42), 
   value: Math.random(),
}
)) );

From the field names, you can guess that I wanted to test some joins, with one collection referencing the other.

Create the Foreign Tables

I connected to PostgreSQL:

docker exec -it duckduckmongo psql -U postgres 

I enabled the extensions:

create extension if not exists mongo_fdw;
create extension if not exists pg_duckdb;

I declared the Foreign Data Wrapper to connect to MongoDB:


CREATE SERVER "MongoDB server" FOREIGN DATA WRAPPER mongo_fdw
  OPTIONS ( address 'mongodb', port '27017' )
;

GRANT USAGE ON FOREIGN SERVER "MongoDB server" TO postgres;

CREATE USER MAPPING FOR postgres SERVER "MongoDB server"
  OPTIONS ( username 'duckduckmongo', password 'mongo4postgres' )
;

I declared the foreign tables that map to the MongoDB collections:


-- Note: first column of the table must be "_id" of type "name" according to the doc

CREATE FOREIGN TABLE dim ( 
   _id name,
   dimid int,
   dimvalue float,
   "coord.x" int,    -- mapping nested object fields to column
   "coord.y" int,
   tags text[]       -- mappint arrays as arrays
 ) SERVER "MongoDB server"
 OPTIONS ( database 'test', collection 'dim' )
;

CREATE FOREIGN TABLE fact ( 
   _id name, 
   ref int, 
   value float 
 ) SERVER "MongoDB server"
 OPTIONS ( database 'test', collection 'fact' )
;

Those foreign tables are like views and can be queried from PostgreSQL, and the Foreign Data Wrapper will query the MongoDB database server.

Query MongoDB from PostgreSQL with join

I ran the following query, with a filter and a join:

select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

It takes several seconds, and I can check the execution plan:

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Foreign Scan (actual time=16.379..7545.432 rows=49889 loops=1)
   Output: fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
   Foreign Namespace: (test.fact) INNER JOIN (test.dim)
 Planning Time: 2.226 ms
 Execution Time: 7548.817 ms

The slowness occurs because the join in MongoDB uses a $lookup within an aggregation pipeline, which isn't optimized for fifty thousand documents.

In MongoDB, the $lookup operation can be slower than joins in an RDBMS due to its document model, which is designed for embedding related data and reducing the need for joins. The flexible schema allows fields to be scalars, arrays, or nested documents, making $lookup operations more complex than traditional equi-joins. A lookup is a key-intersection join, which requires implicit array traversal and matching across two sets of keys, resembling a semijoin but including all matching elements from the inner collection, typically as an array that can also be unwound to multiple documents.
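
For reference, the pushed-down join corresponds roughly to this aggregation on the MongoDB side (a sketch, not the exact pipeline mongo_fdw generates):

db.fact.aggregate([
  { $match: { value: { $gt: 0.5 } } },                                               // filter pushed down
  { $lookup: { from: "dim", localField: "ref", foreignField: "dimid", as: "dim" } }, // the join
  { $unwind: "$dim" }                                                                // one result per match
])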

When joining thousands of documents, it's more effective to run two separate queries than a single $lookup. This can be done transparently through the Foreign Data Wrapper by not pushing down the join. A hash join is preferable for my query, as it needs most rows from the "dim" collection. Note that MongoDB can also use a hash join for lookups, but only when the looked-up collection has fewer than ten thousand documents and temporary disk usage is allowed.

Query MongoDB collection but join in PostgreSQL

To execute the join in PostgreSQL, I disabled the join pushdown:

set mongo_fdw.enable_join_pushdown to false;

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Hash Join (actual time=1.524..102.797 rows=49889 loops=1)
   Output: fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on public.fact (actual time=0.923..90.954 rows=49889 loops=1)
         Output: fact._id, fact.ref, fact.value
         Foreign Namespace: test.fact
   ->  Hash (actual time=0.575..0.575 rows=42 loops=1)
         Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
         Buckets: 1024  Batches: 1  Memory Usage: 15kB
         ->  Foreign Scan on public.dim (actual time=0.488..0.552 rows=42 loops=1)
               Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
               Foreign Namespace: test.dim
 Planning Time: 2.341 ms
 Execution Time: 105.252 ms

This pushed the filter down to MongoDB but not the join, so it read all 42 documents from "dim" in 0.6 milliseconds to build the hash table, and fifty thousand documents from "fact" in about 91 milliseconds to serve as the probe side of the hash join.

I disabled join pushdown for my session, but it can also be set as an option on the foreign table declaration.

Query planner cardinalities

When the join is not pushed down, good cardinality estimations matter because the join is planned by the PostgreSQL cost-based optimizer. By default, the MongoDB Foreign Data Wrapper uses the same constant estimate for every table:

explain
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Hash Join  (cost=1062.50..2240.00 rows=5000 width=192)
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on fact  (cost=25.00..1025.00 rows=1000 width=76)
         Foreign Namespace: test.fact
   ->  Hash  (cost=1025.00..1025.00 rows=1000 width=116)
         ->  Foreign Scan on dim  (cost=25.00..1025.00 rows=1000 width=116)
               Foreign Namespace: test.dim

I was fortunate to have the small table hashed, but it is not a guarantee, as both options have the same cost. To get more accurate cardinality estimations, I enabled the remote estimate feature, which requests a count estimation from the MongoDB database:


ALTER SERVER "MongoDB server" 
 OPTIONS (ADD use_remote_estimate 'true')
;

explain
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

                              QUERY PLAN                              
----------------------------------------------------------------------
 Hash Join  (cost=14.04..3887.03 rows=7000 width=192)
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on fact  (cost=5.00..3682.98 rows=33333 width=76)
         Foreign Namespace: test.fact
   ->  Hash  (cost=8.52..8.52 rows=42 width=116)
         ->  Foreign Scan on dim  (cost=5.00..8.52 rows=42 width=116)
               Foreign Namespace: test.dim

To determine the correct join order, even an estimated selectivity is sufficient. In a hash join, the smaller table should be the build table, while the larger one serves as the probe table.

Invoke DuckDB to query the foreign tables

This didn't go through DuckDB because DuckDB, here, is an extension within PostgreSQL. To run the same query through DuckDB, I force it:

set duckdb.force_execution to true;

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

WARNING:  Unsupported Postgres type: No conversion to DuckDB available for type with oid=19)

                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Hash Join (actual time=1.598..102.117 rows=49889 loops=1)
   Output: fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on public.fact (actual time=1.018..90.344 rows=49889 loops=1)
         Output: fact._id, fact.ref, fact.value
         Foreign Namespace: test.fact
   ->  Hash (actual time=0.558..0.558 rows=42 loops=1)
         Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
         Buckets: 1024  Batches: 1  Memory Usage: 15kB
         ->  Foreign Scan on public.dim (actual time=0.464..0.527 rows=42 loops=1)
               Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
               Foreign Namespace: test.dim
 Planning Time: 25.667 ms
 Execution Time: 104.390 ms

The execution looks the same because it has actually been run by PostgreSQL - it is a PostgreSQL execution plan. Note that I got the following warning:

WARNING:  Unsupported Postgres type: No conversion to DuckDB available for type with oid=19)

To analyze the query prepared by the DuckDB extension, I enabled DuckDB debug messages:

set client_min_messages   to debug;
set duckdb.log_pg_explain to true;

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

DEBUG:  (PGDuckDB/DuckdbPrepare) Preparing: 
 SELECT fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue , dim."coord.x", dim."coord.y", dim.tags
 FROM (pgduckdb.public.fact 
 JOIN pgduckdb.public.dim ON ((fact.ref = dim.dimid))) 
 WHERE (fact.value > (0.5)::double precision)

DEBUG:  (DuckDB/SetTableInfo) Column name: '_id', Type: "No conversion to DuckDB available for type with oid=19)"
DEBUG:  (DuckDB/SetTableInfo) Column name: 'ref', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'value', Type: DOUBLE
DEBUG:  (DuckDB/SetTableInfo) Column name: '_id', Type: "No conversion to DuckDB available for type with oid=19)"
DEBUG:  (DuckDB/SetTableInfo) Column name: 'dimid', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'dimvalue', Type: DOUBLE
DEBUG:  (DuckDB/SetTableInfo) Column name: 'coord.x', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'coord.y', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'tags', Type: VARCHAR[]

I ran the same query without the "_id" columns, which I don't need, and got a DuckDB execution plan:

explain (analyze, costs off, verbose)
select fact.ref, fact.value, dim.dimid, dim.dimvalue, dim.dimvalue 
 , dim."coord.x", dim."coord.y", dim.tags
 from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;
                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Custom Scan (DuckDBScan) (actual time=0.001..0.002 rows=0 loops=1)
   Output: duckdb_scan.ref, duckdb_scan.value, duckdb_scan.dimid, duckdb_scan.dimvalue, duckdb_scan.dimvalue_1, duckdb_scan."coord.x", duckdb_scan."coord.y", duckdb_scan.tags
   DuckDB Execution Plan: 

 ┌─────────────────────────────────────┐
 │┌───────────────────────────────────┐│
 ││    Query Profiling Information    ││
 │└───────────────────────────────────┘│
 └─────────────────────────────────────┘
 EXPLAIN ANALYZE  SELECT fact.ref, fact.value, dim.dimid, dim.dimvalue, dim.dimvalue, dim.

DuckDB to query MongoDB

DuckDB is a high-performance analytical database, but it is single-process, so it cannot replace an operational database. MongoDB is a general-purpose database for operational data in a flexible format. DuckDB lacks a native connector to MongoDB, but you can combine pg_duckdb and mongo_fdw, two PostgreSQL extensions, to query MongoDB with DuckDB.

  • A PostgreSQL Docker container with DuckDB and Mongo_FDW
  • Start the databases
  • Create MongoDB collections
  • Create the Foreign Tables
  • Query MongoDB from PostgreSQL with join
  • Query MongoDB collection but join in PostgreSQL
  • Query planner cardinalities
  • Invoke DuckDB to query the foreign tables
  • Cache static tables as temporary tables
  • Map the whole document

A PostgreSQL Docker container with DuckDB and Mongo_FDW

Here is an example. I built a pg_duckdb image, With is PostgreSQL with DuckDB as an extension, where I install mongo_fdw, the Foreign Data Wrapper to access MongoDB from PostgreSQL. Here is my dockerfile:

ARG pg_version=17

# build stage to compole the MongoDB Foreign Data Wrapper

FROM pgduckdb/pgduckdb:${pg_version}-main as build

ARG pg_version

ARG MONGO_FDW_VERSION=5_5_2
ARG MONGO_FDW_URL=https://github.com/EnterpriseDB/mongo_fdw/archive/REL-${MONGO_FDW_VERSION}.tar.gz
ARG SOURCE_FILES=/tmp/mongo_fdw

ENV PKG_CONFIG_PATH=${SOURCE_FILES}/mongo-c-driver/src/libmongoc/src:${SOURCE_FILES}/mongo-c-driver/src/libbson/src
ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu
ENV MONGOC_INSTALL_DIR=${LD_LIBRARY_PATH}
ENV JSONC_INSTALL_DIR=${LD_LIBRARY_PATH}

USER root

# dependencies (compilation and runtime)
RUN apt-get update && apt-get install -y --no-install-recommends wget ca-certificates make gcc cmake pkg-config postgresql-server-dev-${pg_version} libssl-dev libzstd-dev libmongoc-dev libjson-c-dev libsnappy1v5 libmongocrypt0 
# compile mongo_fdw
WORKDIR /var/tmp
ADD ${MONGO_FDW_URL} /var/tmp
WORKDIR /var/tmp
RUN tar -zxvf $(basename ${MONGO_FDW_URL})
# install monfo_fdw
WORKDIR /var/tmp/mongo_fdw-REL-${MONGO_FDW_VERSION}
RUN ./autogen.sh && make USE_PGXS=1 && make USE_PGXS=1 install;

# add mongosh because it can be useful
WORKDIR /var/tmp
ADD https://downloads.mongodb.com/compass/mongosh-2.5.2-linux-x64.tgz /tmp
RUN tar -xvf /tmp/mongosh-2.5.2-linux-x64.tgz
RUN cp ./mongosh-2.5.2-linux-x64/bin/mongosh /usr/local/bin

# final stage to add mongo_fdw to pgduckdb

FROM pgduckdb/pgduckdb:${pg_version}-main

USER root

ARG pg_version
ARG extdir=/usr/share/postgresql/${pg_version}/extension
ARG extlibdir=/usr/lib/postgresql/${pg_version}/lib
ARG libdir=/usr/lib/x86_64-linux-gnu

COPY --from=build ${extdir}/mongo_fdw* ${extdir}/
COPY --from=build ${extlibdir}/mongo_fdw.so ${extlibdir}/
COPY --from=build ${libdir}/libmongoc-1.0.so.0.0.0 \
                  ${libdir}/libbson-1.0.so.0.0.0 \
                  ${libdir}/libmongocrypt.so.0.0.0 \
                  ${libdir}/libsnappy.so.1.1.9 \
                  ${libdir}/ 

RUN cd ${libdir} && \
    ln -sf libmongoc-1.0.so.0.0.0 libmongoc-1.0.so.0 && \
    ln -sf libmongoc-1.0.so.0 libmongoc-1.0.so && \
    ln -sf libbson-1.0.so.0.0.0 libbson-1.0.so.0 && \
    ln -sf libbson-1.0.so.0 libbson-1.0.so && \
    ln -sf libmongocrypt.so.0.0.0 libmongocrypt.so.0 && \
    ln -sf libmongocrypt.so.0 libmongocrypt.so && \
    ln -sf libsnappy.so.1.1.9 libsnappy.so.1 && \
    ln -sf libsnappy.so.1 libsnappy.so;

COPY --from=build /usr/local/bin /usr/local/bin

USER postgres

Start the databases

I started a MongoDB container:

docker run --name mongodb $e -d mongodb/mongodb-community-server:latest

I built the image and started a container from my image, with network link to mongodb:

docker build -t duckduckmongo . 

docker run --name duckduckmongo -d --link mongodb:mongodb -e POSTGRES_PASSWORD=postgres4mongo duckduckmongo

Create MongoDB collections

In this container, where I've installed the MongoDB Shell, I connected to test the connection to the MongoDB database:

docker exec -it duckduckmongo mongosh mongodb:27017/test

While there, I created a user:

db.createUser( { 
  user: "duckduckmongo",
  pwd: "mongo4postgres",
  roles: [ { role: "readWrite", db: "test" } ] 
} );

I created two collections with some data:

db.dim.insertMany( Array.from({ length: 42 }, (_, i) => (
 { 
   dimid: i + 1, 
   dimvalue: Math.random(),
   tags: [ "fr" , "de" , "it" ],
   coord: { x:1, y:-11 } 
})) );

db.fact.insertMany( Array.from({ length: 100000 }, () => (
 { 
   ref: Math.ceil(Math.random() * 42), 
   value: Math.random(),
}
)) );

From the field names, you can guess that I wanted to test some joins, with one collection referencing the other.

Create the Foreign Tables

I connected to PostgreSQL:

docker exec -it duckduckmongo psql -U postgres 

I enabled the extensions:

create extension if not exists mongo_fdw;
create extension if not exists pg_duckdb;

I declared the Foreign Data Wrapper to connect to MongoDB:


CREATE SERVER "MongoDB server" FOREIGN DATA WRAPPER mongo_fdw
  OPTIONS ( address 'mongodb', port '27017' )
;

GRANT USAGE ON FOREIGN SERVER "MongoDB server" TO postgres;

CREATE USER MAPPING FOR postgres SERVER "MongoDB server"
  OPTIONS ( username 'duckduckmongo', password 'mongo4postgres' )
;

I declared the foreign tables that map to the MongoDB collections:


-- Note: first column of the table must be "_id" of type "name" according to the doc

CREATE FOREIGN TABLE dim ( 
   _id name,
   dimid int,
   dimvalue float,
   "coord.x" int,    -- mapping nested object fields to column
   "coord.y" int,
   tags text[]       -- mappint arrays as arrays
 ) SERVER "MongoDB server"
 OPTIONS ( database 'test', collection 'dim' )
;

CREATE FOREIGN TABLE fact ( 
   _id name, 
   ref int, 
   value float 
 ) SERVER "MongoDB server"
 OPTIONS ( database 'test', collection 'fact' )
;

Those foreign tables are like views and can be queried from PostgreSQL, and the Foreign Data Wrapper will query the MongoDB database server.

Query MongoDB from PostgreSQL with join

I run the following query, with a filter and a join:

select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

It takes several second and I can check the execution plan:

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Foreign Scan (actual time=16.379..7545.432 rows=49889 loops=1)
   Output: fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
   Foreign Namespace: (test.fact) INNER JOIN (test.dim)
 Planning Time: 2.226 ms
 Execution Time: 7548.817 ms

The slowness occurs because the join in MongoDB uses a $lookup within an aggregation pipeline, which isn't optimized for fifty thousand documents.

In MongoDB, the $lookup operation can be slower than joins in an RDBMS due to its document model, which is designed for embedding related data and reducing the need for joins. The flexible schema allows fields to be scalars, arrays, or nested documents, making $lookup operations more complex than traditional equi-joins. A lookup is a key-intersection join, which requires implicit array traversal and matching across two sets of keys, resembling a semijoin but including all matching elements from the inner collection, typically as an array that can also be unwound to multiple documents.

When querying thousands of documents, it's more effective to run two separate queries instead of one $lookup. This can be done transparently through the Foreign Data Wrapper, which can avoid pushing down the join. A hash join is preferable for my query as it requires most rows from the "dim' collection. Note that MongoDB can also use a hash join for lookups but it is only effective if the table has fewer than ten thousand documents and temporary disk usage is allowed.

Query MongoDB collection but join in PostgreSQL

To execute the join in PostgreSQL, I disabled the join pushdown:

set mongo_fdw.enable_join_pushdown to false;

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Hash Join (actual time=1.524..102.797 rows=49889 loops=1)
   Output: fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on public.fact (actual time=0.923..90.954 rows=49889 loops=1)
         Output: fact._id, fact.ref, fact.value
         Foreign Namespace: test.fact
   ->  Hash (actual time=0.575..0.575 rows=42 loops=1)
         Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
         Buckets: 1024  Batches: 1  Memory Usage: 15kB
         ->  Foreign Scan on public.dim (actual time=0.488..0.552 rows=42 loops=1)
               Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
               Foreign Namespace: test.dim
 Planning Time: 2.341 ms
 Execution Time: 105.252 ms

This has pushed the filter to MongoDB but not the join, so it has read all documents from "dim" in 0.6 milliseconds to build the hash table, and fifty thousand documents from "fact" in about 91 milliseconds, serving as the probe side of the hash join.

I disabled join pushdown for my session, but it can be an option of the foreign table declaration.
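
For reference, here is a minimal sketch of what that could look like, assuming mongo_fdw accepts the same enable_join_pushdown option at the foreign table level (check the documentation of your mongo_fdw version):

ALTER FOREIGN TABLE fact
  OPTIONS (ADD enable_join_pushdown 'false')  -- assumed table-level option, mirroring the session setting
;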

Query planner cardinalities

When the join is not pushed down, it is important to get good cardinality estimations, because the join order and method are chosen by the PostgreSQL cost-based optimizer. By default, the MongoDB Foreign Data Wrapper uses the same constant row estimate for all foreign tables:

explain
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Hash Join  (cost=1062.50..2240.00 rows=5000 width=192)
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on fact  (cost=25.00..1025.00 rows=1000 width=76)
         Foreign Namespace: test.fact
   ->  Hash  (cost=1025.00..1025.00 rows=1000 width=116)
         ->  Foreign Scan on dim  (cost=25.00..1025.00 rows=1000 width=116)
               Foreign Namespace: test.dim

I was fortunate that the small table was chosen as the hashed (build) side, but it is not guaranteed, as both foreign scans have the same estimated cost. To get more accurate cardinality estimations, I enabled the remote estimate feature, which requests a count estimation from the MongoDB database:


ALTER SERVER "MongoDB server" 
 OPTIONS (ADD use_remote_estimate 'true')
;

explain
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

                              QUERY PLAN                              
----------------------------------------------------------------------
 Hash Join  (cost=14.04..3887.03 rows=7000 width=192)
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on fact  (cost=5.00..3682.98 rows=33333 width=76)
         Foreign Namespace: test.fact
   ->  Hash  (cost=8.52..8.52 rows=42 width=116)
         ->  Foreign Scan on dim  (cost=5.00..8.52 rows=42 width=116)
               Foreign Namespace: test.dim

To determine the correct join order, even a rough selectivity estimate is sufficient. In a hash join, the smaller table should be the build side, while the larger one serves as the probe side.

Invoke DuckDB to query the foreign tables

This didn't go through DuckDB because DuckDB, here, is an extension within PostgreSQL. To run the same query through DuckDB, I force it:

set duckdb.force_execution to true;

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

WARNING:  Unsupported Postgres type: No conversion to DuckDB available for type with oid=19)

                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Hash Join (actual time=1.598..102.117 rows=49889 loops=1)
   Output: fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
   Hash Cond: (fact.ref = dim.dimid)
   ->  Foreign Scan on public.fact (actual time=1.018..90.344 rows=49889 loops=1)
         Output: fact._id, fact.ref, fact.value
         Foreign Namespace: test.fact
   ->  Hash (actual time=0.558..0.558 rows=42 loops=1)
         Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
         Buckets: 1024  Batches: 1  Memory Usage: 15kB
         ->  Foreign Scan on public.dim (actual time=0.464..0.527 rows=42 loops=1)
               Output: dim._id, dim.dimid, dim.dimvalue, dim."coord.x", dim."coord.y", dim.tags
               Foreign Namespace: test.dim
 Planning Time: 25.667 ms
 Execution Time: 104.390 ms

The execution looks the same because it has actually been run by PostgreSQL - it is a PostgreSQL execution plan. Note that I got the following warning:

WARNING:  Unsupported Postgres type: No conversion to DuckDB available for type with oid=19)
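
The oid in the warning can be resolved from the catalog. As a quick check (a standard PostgreSQL catalog query, not part of the original demo), oid 19 is the name type used for the "_id" columns:

select oid, typname
  from pg_type
 where oid = 19  -- returns 'name', the type that pg_duckdb cannot convert
;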

To analyze the query prepared by the DuckDB extension, I enabled DuckDB debug messages:

set client_min_messages   to debug;
set duckdb.log_pg_explain to true;

explain (analyze, costs off, verbose)
select * from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;

DEBUG:  (PGDuckDB/DuckdbPrepare) Preparing: 
 SELECT fact._id, fact.ref, fact.value, dim._id, dim.dimid, dim.dimvalue , dim."coord.x", dim."coord.y", dim.tags
 FROM (pgduckdb.public.fact 
 JOIN pgduckdb.public.dim ON ((fact.ref = dim.dimid))) 
 WHERE (fact.value > (0.5)::double precision)

DEBUG:  (DuckDB/SetTableInfo) Column name: '_id', Type: "No conversion to DuckDB available for type with oid=19)"
DEBUG:  (DuckDB/SetTableInfo) Column name: 'ref', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'value', Type: DOUBLE
DEBUG:  (DuckDB/SetTableInfo) Column name: '_id', Type: "No conversion to DuckDB available for type with oid=19)"
DEBUG:  (DuckDB/SetTableInfo) Column name: 'dimid', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'dimvalue', Type: DOUBLE
DEBUG:  (DuckDB/SetTableInfo) Column name: 'coord.x', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'coord.y', Type: INTEGER
DEBUG:  (DuckDB/SetTableInfo) Column name: 'tags', Type: VARCHAR[]

I ran the same query without the "_id" columns, which I don't need, and got a DuckDB execution plan:

explain (analyze, costs off, verbose)
select fact.ref, fact.value, dim.dimid, dim.dimvalue, dim.dimvalue 
 , dim."coord.x", dim."coord.y", dim.tags
 from fact join dim on fact.ref=dim.dimid
 where fact.value>0.5
;
                                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Custom Scan (DuckDBScan) (actual time=0.001..0.002 rows=0 loops=1)
   Output: duckdb_scan.ref, duckdb_scan.value, duckdb_scan.dimid, duckdb_scan.dimvalue, duckdb_scan.dimvalue_1, duckdb_scan."coord.x", duckdb_scan."coord.y", duckdb_scan.tags
   DuckDB Execution Plan: 

 ┌─────────────────────────────────────┐
 │┌───────────────────────────────────┐│
 ││    Query Profiling Information    ││
 │└───────────────────────────────────┘│
 └─────────────────────────────────────┘
 EXPLAIN ANALYZE  SELECT fact.ref, fact

Percona Earns Kubernetes Certified Services Provider Status for All Three Major Open Source Databases

As a member of the Kubernetes Certified Services Provider program, Percona is now part of a select, “pre-qualified tier of vetted service providers who have deep experience helping enterprises successfully adopt Kubernetes…” Kubernetes (frequently abbreviated as K8s) has come a long way over the past decade. From being used almost exclusively for orchestrating stateless container […]

June 11, 2025

Postgres 18 beta1: small server, IO-bound Insert Benchmark (v2)

This is my second attempt at IO-bound Insert Benchmark results with a small server. The first attempt is here and has been deprecated because sloppy programming by me meant the benchmark client was creating too many connections, and that hurt results in some cases for Postgres 18 beta1.

There might be regressions from 17.5 to 18 beta1

  • QPS decreases by ~5% and CPU increases by ~5% on the l.i2 (write-only) step
  • QPS decreases by <= 2% and CPU increases by ~2% on the qr* (range query) steps
There might be regressions from 14.0 to 18 beta1
  • QPS decreases by ~6% and ~18% on the write-heavy steps (l.i1, l.i2)

Builds, configuration and hardware

I compiled Postgres from source using -O2 -fno-omit-frame-pointer for versions  14.0, 14.18, 15.0, 15.13, 16.0, 16.9, 17.0, 17.5 and 18 beta1.

The server is an ASUS ExpertCenter PN53 with an AMD Ryzen 7 7735HS CPU, 8 cores, SMT disabled, 32G of RAM and one NVMe device for the database. The OS has been updated to Ubuntu 24.04. More details on it are here.

For Postgres versions 14.0 through 17.5 the configuration files are in the pg* subdirectories here with the name conf.diff.cx10a_c8r32. For Postgres 18 beta1 I used 3 variations, which are here:
  • conf.diff.cx10b_c8r32
    • uses io_method='sync' to match Postgres 17 behavior
  • conf.diff.cx10c_c8r32
    • uses io_method='worker' and io_workers=16 to do async IO via a thread pool. I eventually learned that 16 is too large.
  • conf.diff.cx10d_c8r32
    • uses io_method='io_uring' to do async IO via io_uring
The Benchmark

The benchmark is explained here and is run with 1 client and 1 table with 800M rows. I provide two performance reports:
  • one to compare Postgres 14.0 through 18 beta1, all using synchronous IO
  • one to compare Postgres 17.5 with 18 beta1 using 3 configurations for 18 beta1 -- one for each of io_method= sync, workers and io_uring.
The benchmark steps are:

  • l.i0
    • insert 20 million rows per table in PK order. The table has a PK index but no secondary indexes. There is one connection per client.
  • l.x
    • create 3 secondary indexes per table. There is one connection per client.
  • l.i1
    • use 2 connections/client. One inserts 4M rows per table and the other does deletes at the same rate as the inserts. Each transaction modifies 50 rows (big transactions). This step is run for a fixed number of inserts, so the run time varies depending on the insert rate.
  • l.i2
    • like l.i1 but each transaction modifies 5 rows (small transactions) and 1M rows are inserted and deleted per table.
    • Wait for X seconds after the step finishes to reduce variance during the read-write benchmark steps that follow. The value of X is a function of the table size.
  • qr100
    • use 3 connections/client. One does range queries and performance is reported for this. The second does 100 inserts/s and the third does 100 deletes/s. The second and third are less busy than the first. The range queries use covering secondary indexes. This step is run for 1800 seconds. If the target insert rate is not sustained then that is considered to be an SLA failure. If the target insert rate is sustained then the step does the same number of inserts for all systems tested.
  • qp100
    • like qr100 except uses point queries on the PK index
  • qr500
    • like qr100 but the insert and delete rates are increased from 100/s to 500/s
  • qp500
    • like qp100 but the insert and delete rates are increased from 100/s to 500/s
  • qr1000
    • like qr100 but the insert and delete rates are increased from 100/s to 1000/s
  • qp1000
    • like qp100 but the insert and delete rates are increased from 100/s to 1000/s
Results: overview

The performance report is here for Postgres 14 through 18 and here for Postgres 18 configurations.

The summary sections (here, here and here) have 3 tables. The first shows absolute throughput for each DBMS tested and benchmark step. The second has throughput relative to the version in the first row of the table. The third shows the background insert rate for the benchmark steps. The second table makes it easy to see how performance changes over time. The third table makes it easy to see which DBMS+configs failed to meet the SLA.

Below I use relative QPS (rQPS) to explain how performance changes. It is: (QPS for $me / QPS for $base), where $me is the result for some version and $base is the result for the base version.

When rQPS is > 1.0 then performance improved over time. When it is < 1.0 then there are regressions. When it is 0.90 then I claim there is a 10% regression. The Q in relative QPS measures: 
  • insert/s for l.i0, l.i1, l.i2
  • indexed rows/s for l.x
  • range queries/s for qr100, qr500, qr1000
  • point queries/s for qp100, qp500, qp1000
Below I use colors to highlight the relative QPS values with red for <= 0.97, green for >= 1.03 and grey for values between 0.98 and 1.02.

Results: Postgres 14.0 through 18 beta1

The performance summary is here.

See the previous section for the definition of relative QPS (rQPS). For the rQPS formula, Postgres 14.0 is the base version and that is compared with more recent Postgres versions. The results here are similar to what I reported prior to fixing the too many connections problem in the benchmark client.

For 14.0 through 18 beta1, QPS on ...
  • the initial load (l.i0)
    • Performance is stable across versions
    • 18 beta1 and 17.5 have similar performance
    • rQPS for (17.5, 18 beta1 with io_method=sync) is (1.00, 0.99)
  • create index (l.x)
    • ~10% faster starting in 15.0
    • 18 beta1 and 17.5 have similar performance
    • rQPS for (17.5, 18 beta1 with io_method=sync) is (1.11, 1.12)
  • first write-only step (l.i1)
    • Performance decreases ~7% from version 16.9 to 17.0. CPU overhead (see cpupq here) increases by ~5% in 17.0.
    • 18 beta1 and 17.5 have similar performance
    • rQPS for (17.5, 18 beta1 with io_method=sync) is (0.93, 0.94)
  • second write-only step (l.i2)
    • Performance decreases ~6% in 15.0, ~8% in 17.0 and then ~5% in 18.0. CPU overhead (see cpupq here) increases ~5%, ~6% and ~5% in 15.0, 17.0 and 18beta1. Of all benchmark steps, this has the largest perf regression from 14.0 through 18 beta1 which is ~20%.
    • 18 beta1 is ~4% slower than 17.5
    • rQPS for (17.5, 18 beta1 with io_method=sync) is (0.86, 0.82)
  • range query steps (qr100, qr500, qr1000)
    • 18 beta1 and 17.5 have similar performance, but 18 beta1 is slightly slower
    • rQPS for (17.5, 18 beta1 with io_method=sync) is (1.00, 0.99) for qr100, (0.97, 0.98) for qr500 and (0.97, 0.95) for qr1000. The issue is new CPU overhead, see cpupq here.
  • point query steps (qp100, qp500, qp1000)
    • 18 beta1 and 17.5 have similar performance but 18 beta1 is slightly slower
    • rQPS for (17.5, 18 beta1 with io_method=sync) is (1.00, 0.98) for qp100, (0.99, 0.98) for qp500 and (0.97, 0.96) for qp1000. The issue is new CPU overhead, see cpupq here.
Results: Postgres 17.5 vs 18 beta1

The performance summary is here.

See the previous section for the definition of relative QPS (rQPS). For the rQPS formula, Postgres 17.5 is the base version and that is compared with results from 18 beta1 using the three configurations explained above:
  • x10b with io_method=sync
  • x10c with io_method=worker and io_workers=16
  • x10d with io_method=io_uring
The summary is:
  • initial load step (l.i0)
    • rQPS for (x10b, x10c, x10d) was (0.99, 1.00, 1.00)
  • create index step (l.x)
    • rQPS for (x10b, x10c, x10d) was (1.01, 1.02, 1.02)
  • first write-heavy step (l.i1)
    • for l.i1 the rQPS for (x10b, x10c, x10d) was (1.00, 0.99, 1.01)
  • second write-heavy step (l.i2)
    • for l.i2 the rQPS for (x10b, x10c, x10d) was (0.96, 0.93, 0.94)
    • CPU overhead (see cpupq here) increases by ~5% in 18 beta1
  • range query steps (qr100, qr500, qr1000)
    • for qr100 the rQPS for (x10b, x10c, x10d) was (1.00, 0.99, 0.99)
    • for qr500 the rQPS for (x10b, x10c, x10d) was (1.00, 0.97, 0.99)
    • for qr1000 the rQPS for (x10b, x10c, x10d) was (0.99, 0.98, 0.97)
    • CPU overhead (see cpupq here, here and here) increases by ~2% in 18 beta1
  • point query steps (qp100, qp500, qp1000)
    • for qp100 the rQPS for (x10b, x10c, x10d) was (0.98, 0.99, 0.99)
    • for qp500 the rQPS for (x10b, x10c, x10d) was (0.99, 1.00, 1.00)
    • for qp1000 the rQPS for (x10b, x10c, x10d) was (0.99, 0.99, 0.99)

June 10, 2025

Percona Named to DBTA’s 2025 List of 100 Companies That Matter Most in Data

We’re proud to share that Percona has been named to the 2025 DBTA 100—Database Trends and Applications’ annual list of “The Companies That Matter Most in Data.” This recognition highlights our success in empowering organizations to build, scale, and optimize open source database environments for today’s most demanding applications. At Percona, we believe open source […]

June 09, 2025

Using the PostgreSQL extension tds_fdw to validate data migration from SQL Server to Amazon Aurora PostgreSQL

Data validation is an important process during data migrations, helping to verify that the migrated data matches the source data. In this post, we present alternatives you can use for data validation when dealing with tables that lack primary keys. We discuss alternative approaches, best practices, and potential solutions to make sure that your data migration process remains thorough and reliable, even in the absence of traditional primary key-based validation methods. Specifically, we demonstrate how to perform data validation after a full load migration from SQL Server to Amazon Aurora PostgreSQL-Compatible Edition using the PostgreSQL tds_fdw extension.

Comparison of JOINS 👉🏻 aggregation pipeline and CTEs

In a blog post titled Comparison of JOINS: MongoDB vs. PostgreSQL on EDB's site, Michael Stonebraker runs a join with $lookup in an aggregation pipeline to compare with a PostgreSQL join. I take this opportunity to discuss some effective design practices for working with a document database.

A common mistake vendors make is publishing comparisons between one database, where the author of the article is an expert, and another database they are unfamiliar with and unwilling to learn about. This leads to biased conclusions, as they contrast best practices from one database with a database where the design is incorrect. In the EDB blog post, the following aggregation pipeline is used to claim that "JOINS are Brittle in MongoDB" - do you spot the problem?

db.employee.aggregate([
  {
    $lookup: {
      from: "department",
      localField: "department",
      foreignField: "_id",
      as: "dept"
    }
  },
  {
    $unwind: "$dept"
  },
  {
    $group: {
      "_id": "$dept.dname",
      "salary": { "$sum": "$salary" },
    }
  }
]);

This query reads the "employee" collection, which contains a "department" field referencing the _id field in the "department" collection. It performs a lookup into the "department" collection to retrieve additional information about each employee's department, specifically the department name ("dname"). After unwinding the "dept" array (produced by the lookup; in our case it holds a single value because the relationship is many-to-one, but MongoDB can embed a many-to-many as well), it groups the data by department name ("dept.dname") and calculates the total salary of employees for each department by summing the salary field.

In a document model, the department name should be included as a field within the employee document instead of using a reference. While one might argue that normalizing it to another collection simplifies updates, this operation is infrequent. Additionally, any department renaming is likely part of a broader enterprise reorganization, which would prompt updates to the employee collection regardless.
The model does not account for departments without employees, as it is inherently tied to a specific business domain, HR in this case. It focuses on employees and does not share sensitive information like salary details with other domains. In this bounded context, the department information is an employee attribute.

But let's say that we need to avoid duplication and normalize it with a department relation that implements the dependency between the department surrogate key "_id" and the department name. Still, the department name must be unique, as it is the natural key, and this can be enforced with a unique index ({ dname: 1 }, { unique: true }). Knowing this, grouping by "_id" or grouping by "dname" is the same. There's no need to look up the name for each employee. In an aggregation pipeline, it is better to aggregate first and look up afterwards, once per department rather than once per employee.

Here is the correct aggregation pipeline:

db.employee.aggregate([  
  {    
    $group: {  
      _id: "$department",  
      salary: { $sum: "$salary" }  
    }  
  },  
  {  
    $lookup: {  
      from: "department",  
      localField: "_id",  
      foreignField: "_id",  
      as: "dept"  
    }  
  },  
  {  
    $unwind: "$dept"  
  },  
  {  
    $project: {  
      _id: "$dept.dname",  
      salary: 1  
    }  
  }  
]);  

For an expert in relational theory, coding the order of execution might seem surprising, as RDBMS are built to optimize access paths from declarative queries on a logical model. However, similar considerations apply to SQL databases, where those concerns are usually deferred until production, when data grows, and you need to analyze the execution plan and "tune" the query.

For example, the article compares it with the following schema and query in PostgreSQL:

create table department (
    dname varchar    primary key
,   floor            integer
,   budget           integer
);

create table employee (
    ename varchar
,   age              integer
,   salary           integer
,   department       varchar references department(dname)
);

select dname, sum(salary)
from employee as e
            inner join
            department as d
            on e.department = d.dname
group by dname
;

There are two main differences between the queries they use in MongoDB and PostgreSQL. First, PostgreSQL utilizes a natural key instead of a surrogate key, which simplifies joins but does not resolve the issue of cascading updates to the department names. This is equivalent to embedding or an extended reference in MongoDB. While the author may have ignored it, MongoDB, like SQL, can reference and join fields other than the generated "_id", and secondary indexes make it fast.
Second, MongoDB performs a left outer join for lookups, because they are lookups, not relational joins. However, the author used an inner join in the PostgreSQL example. Given the data, the outer join makes sense because you don't want to miss an employee's salary just because the department is missing. PostgreSQL does not optimize this either and executes the join prior to aggregation:

explain (costs off)
select dname, sum(salary)
from employee as e
            left outer join
            department as d
            on e.department = d.dname
group by dname
;

                         QUERY PLAN                          
-------------------------------------------------------------
 HashAggregate
   Group Key: d.dname
   ->  Hash Left Join
         Hash Cond: ((e.department)::text = (d.dname)::text)
         ->  Seq Scan on employee e
         ->  Hash
               ->  Seq Scan on department d

In PostgreSQL, you also need to look at the execution plan and change the query. For example, using the department name from the employee table instead of the one in the department table eliminates the join:

explain (costs off)
select
 department,
 sum(salary)
from employee as e
      left outer join
      department as d
      on e.department = d.dname
group by
 department
;

          QUERY PLAN          
------------------------------
 HashAggregate
   Group Key: e.department
   ->  Seq Scan on employee e

This is not different from MongoDB. In PostgreSQL as well:

  • You must understand the access path.
  • You accept some update complexity to eliminate joins.

The join was eliminated because no column is read from the inner table, which is possible because the natural key, the department name, was chosen. If you query an additional column from departments, like "floor", the query becomes more complex: this column must be added to the GROUP BY clause even though the normalized model doesn't allow more than one floor per department, and the join now happens before the aggregation:

explain (costs off)
select 
 department,
 floor,
 sum(salary)
from employee as e
      left outer join
      department as d
      on e.department = d.dname
group by 
 department,
 floor
;
                         QUERY PLAN                          
-------------------------------------------------------------
 HashAggregate
   Group Key: e.department, d.floor
   ->  Hash Left Join
         Hash Cond: ((e.department)::text = (d.dname)::text)
         ->  Seq Scan on employee e
         ->  Hash
               ->  Seq Scan on department d

In PostgreSQL, you can adopt a style similar to the MongoDB aggregation pipeline by declaring stages within a Common Table Expression (CTE) using a WITH clause. This approach executes the GROUP BY before the JOIN, making the code's intent clearer:

explain (costs off)
with employee_agg as (  
  select 
   department,
   sum(salary) as total_salary  
  from employee  
  group by department  
)  
select 
  d.dname as department_name, 
  d.floor as department_floor,
  ea.total_salary as total_salary  
from employee_agg as ea  
left outer join department as d  
on ea.department = d.dname; 

                       QUERY PLAN                       
--------------------------------------------------------
 Hash Right Join
   Hash Cond: ((d.dname)::text = (ea.department)::text)
   ->  Seq Scan on department d
   ->  Hash
         ->  Subquery Scan on ea
               ->  HashAggregate
                     Group Key: employee.department
                     ->  Seq Scan on employee

This method is more efficient as it aggregates before the join. Using Common Table Expressions (CTEs) imitates the MongoDB aggregation pipeline, which provides greater control over data access optimization. Both are high-level languages that enable developers to decompose queries into logical steps effectively.

When writing an SQL query, I prefer to start with aggregations and projections in Common Table Expressions (CTEs) before performing natural joins. This method is valid as long as all projections are clearly defined, ensuring an organized and efficient query structure:

explain (costs off)
with "EmployeesPerDepartment" as (  
  select 
   department     as "DepartmentName",
   sum(salary)    as "TotalSalary"  
  from employee  
  group by department  
),  "Departments" as (
 select 
  dname         as "DepartmentName", 
  floor         as "DepartmentFloor"
 from department
)  select 
  "DepartmentName", 
  "DepartmentFloor",
  "TotalSalary"  
from "EmployeesPerDepartment"
natural left join "Departments"
;
                              QUERY PLAN                               
-----------------------------------------------------------------------
 Hash Right Join
   Hash Cond: ((department.dname)::text = (employee.department)::text)
   ->  Seq Scan on department
   ->  Hash
         ->  HashAggregate
               Group Key: employee.department
               ->  Seq Scan on employee

Because a SQL result is a single tabular result, it is possible to rename columns to their final names (the projection) within the CTEs first. This eliminates the need for table aliases and complex join clauses. It also makes debugging easier, because the intermediate steps can be run on their own.
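
For example, here is a minimal sketch (reusing the first CTE from the query above, not an excerpt from the original post) of how one stage can be checked on its own while debugging:

with "EmployeesPerDepartment" as (
  select
   department     as "DepartmentName",
   sum(salary)    as "TotalSalary"
  from employee
  group by department
)
select * from "EmployeesPerDepartment"
;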

To conclude, it is true that joins in PostgreSQL are generally faster than lookups in MongoDB. This is because PostgreSQL is designed for normalized schemas, where a single business query can retrieve data from multiple tables, while MongoDB is optimized for document models that align with business domain entities, and adding more complexity to the query planner to optimize joins is not a priority.
In SQL databases, the challenge lies not in executing joins but in the complexity faced by developers when crafting optimal queries. To achieve acceptable response times, SQL databases must utilize multiple join algorithms. This requires the query planner to perform cost-based optimization, which heavily relies on accurate cardinality estimations. As the number of tables to join increases, so does the risk of obtaining a poor execution plan. This complexity impacting the developer and the optimizer can create the perception that joins are slow.

If MongoDB lookups are causing slow queries, consider improving your data model and aggregation pipelines first. Embed the one-to-one or one-to-many that belong to the same business object. Filter and aggregate before joining, and utilize multi-key indexes on well-designed document schemas.
Additionally, be aware that MongoDB's aggregation pipeline includes an optimization phase to reshape the pipeline for enhanced performance.

If you find the aggregation pipeline syntax complex, note that it's easy to learn and developers like it (other databases like Google BigQuery or DuckDB adopted a similar pipeline approach). It resembles Common Table Expressions in SQL, making it straightforward to test each stage. Additionally, you can use the Atlas UI or Compass to construct it with a wizard and view the output of each stage, as shown in the header of this post. But do not abuse it: the document model should avoid lookups on many documents, and aggregation pipelines should filter (on indexes) and aggregate first.

"Schema Later" considered harmful 👉🏻 schema validation

In a blog post titled "Schema Later" Considered Harmful on EDB's site, Michael Stonebraker demonstrates that inserting junk data can be harmful to queries. While this conclusion is evident, it’s important to note that all databases have some form of schema and a full "schema later" doesn't exist. Otherwise, indexes couldn't be added, and data couldn't be queried and processed.
It's recommended to declare the schema-on-write part in the database in addition to the application code, once it has been established and used in the application.

A common mistake vendors make is comparing one database, where the author is an expert, to another database they are unfamiliar with and unwilling to learn about. This leads to biased conclusions, as they contrast best practices from one database with a database where the design was incorrect. In the EDB blog post, the schema for PostgreSQL was defined using data types and check constraints. However, nothing similar was done for the example on MongoDB.

In MongoDB, you can begin with a flexible schema defined by your application. Once the structure is established, MongoDB schema validation ensures that there are no unintended changes or improper data types, maintaining the integrity of your data. Although this feature has been available since MongoDB 3.6, released in 2017, it remains overlooked due to persistent myths about NoSQL.

In the EDB blog, they created the PostgreSQL table as:

create table employee (
 name varchar,
 age int4,
 salary int4 check (salary > 0)
);

To compare, they should have created the MongoDB collection as:

db.createCollection("employee", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "age", "salary"],
      properties: {
        name:   {  bsonType: "string",            description: "VARCHAR equivalent"  },                   
        age:    {  bsonType: "int",               description: "INT4 equivalent"  },                     
        salary: {  bsonType: "int",  minimum: 0,  description: "CHECK salary > 0 equivalent"                                 
        }
      }
    }
  },
  validationAction: "error" // Strict validation: reject invalid documents
});

With such schema validation, the incorrect inserts are rejected:

db.employee.insertOne ({name : "Stonebraker", age : 45, salary : -99})

MongoServerError: Document failed validation
Additional information: {
  failingDocumentId: ObjectId('6845cfe3f9e37e21a1d4b0c8'),
 ...
        propertyName: 'salary',
            description: 'CHECK salary > 0 equivalent',
            details: [
              {
                operatorName: 'minimum',
                specifiedAs: { minimum: 0 },
                reason: 'comparison failed',
                consideredValue: -99
...


db.employee.insertOne ({name : "Codd", age : "old", salary : 40000})

MongoServerError: Document failed validation
Additional information: {
  failingDocumentId: ObjectId('6845d041f9e37e21a1d4b0c9'),
...
            propertyName: 'age',
            description: 'INT4 equivalent',
            details: [
              {
                operatorName: 'bsonType',
                specifiedAs: { bsonType: 'int' },
                reason: 'type did not match',
                consideredValue: 'old',
                consideredType: 'string'
              }
...

The application receives all information regarding the violation, in JSON that is parsable by the exception handling. Unlike many SQL databases that provide only the constraint name in a text message, this approach avoids exposing parts of the physical data model to the application, enhancing logical data independence.

Note that the examples are taken from the EDB blog post. You should probably store the date of birth rather than the age (validated as { bsonType: 'date' } and with a range of acceptable dates), and a currency along with the salary (with a sub-object { salary: { amount: 40000, currency: "CHF" } }).

MongoDB schema validation is declarative, and the Atlas UI or Compass can help you start with an existing collection by populating rules from sample data (the screenshot in the header of this post used rule generation from this example).

"Schema Later" considered harmful 👉🏻 schema validation

In a blog post titled "Schema Later" Considered Harmful on EDB's site, Michael Stonebraker demonstrates that inserting junk data can be harmful to queries. While this conclusion is evident, it’s important to note that all databases have some form of schema and a full "schema later" doesn't exist. Otherwise, indexes couldn't be added, and data couldn't be queried and processed.
It's recommended to declare the schema-on-write part in the database in addition to the application code, once it has been established and used in the application.

A common mistake vendors make is comparing one database, where the author is an expert, to another database they are unfamiliar with and unwilling to learn about. This leads to biased conclusions, as they contrast best practices from one database with a database where the design was incorrect. In the EDB blog post, the schema for PostgreSQL was defined using data types and check constraints. However, this work was left out in the example for MongoDB.

In MongoDB, you can begin with a flexible schema defined by your application. Once the structure is established, MongoDB schema validation ensures that there are no unintended changes or improper data types, maintaining the integrity of your data. Although this feature has been available since MongoDB 3.6, released in 2017, it remains overlooked due to persistent myths about NoSQL and unstructured data.

In the EDB blog, they created the PostgreSQL table as:

create table employee (
 name varchar,
 age int4,
 salary int4 check (salary > 0)
);

To compare, they should have created the MongoDB collection as:

db.createCollection("employee", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "age", "salary"],
      properties: {
        name:   {  bsonType: "string",            description: "VARCHAR equivalent"  },                   
        age:    {  bsonType: "int",               description: "INT4 equivalent"  },                     
        salary: {  bsonType: "int",  minimum: 0,  description: "CHECK salary > 0 equivalent"                                 
        }
      }
    }
  },
  validationAction: "error" // Strict validation: reject invalid documents
});

With such schema validation, the incorrect inserts are rejected:

db.employee.insertOne ({name : "Stonebraker", age : 45, salary : -99})

MongoServerError: Document failed validation
Additional information: {
  failingDocumentId: ObjectId('6845cfe3f9e37e21a1d4b0c8'),
 ...
        propertyName: 'salary',
            description: 'CHECK salary > 0 equivalent',
            details: [
              {
                operatorName: 'minimum',
                specifiedAs: { minimum: 0 },
                reason: 'comparison failed',
                consideredValue: -99
...


db.employee.insertOne ({name : "Codd", age : "old", salary : 40000})

MongoServerError: Document failed validation
Additional information: {
  failingDocumentId: ObjectId('6845d041f9e37e21a1d4b0c9'),
...
            propertyName: 'age',
            description: 'INT4 equivalent',
            details: [
              {
                operatorName: 'bsonType',
                specifiedAs: { bsonType: 'int' },
                reason: 'type did not match',
                consideredValue: 'old',
                consideredType: 'string'
              }
...

The application receives all information about the violation as JSON that the exception-handling code can parse. Unlike many SQL databases that provide only the constraint name in a text message, this approach avoids exposing internal names to the application, enhancing logical data independence.

Remark: instead of storing age, it’s advisable to store the date of birth as a date type ({ bsonType: 'date' } with an acceptable range of values). Additionally, you can use sub-objects to include a currency alongside the salary: { salary: { amount: 40000, currency: "CHF" } }.

MongoDB schema validation is declarative, and Atlas UI or Compass can help you start with an existing collection by populating rules from sample data (the screenshot in the header of this post used rule generation from this example).

How to Perform a Disaster Recovery Switchover with Patroni for PostgreSQL

Patroni is a Python-based template for managing high availability PostgreSQL clusters. Originally a fork of the Governor project by Compose, Patroni has evolved significantly with many new features and active community development. It supports integration with various Distributed Configuration Stores (DCS) like etcd, Consul, and ZooKeeper, and provides simple setup and robust failover management. This blog […]