a curated list of database news from authoritative sources

March 16, 2025

At what level of concurrency do MySQL 5.7 and 8.0 become faster than 5.6?

Are MySQL 5.7 and 8.0 faster than 5.6? That depends a lot on the workload -- both the types of SQL and the amount of concurrency. Here I summarize results from sysbench on a larger server (48 cores) using 1, 4, 6, 8, 10, 20 and 40 clients to show how things change.

tl;dr

  • the workload here is microbenchmarks with a database cached by InnoDB
  • 5.7.44 is faster than 8.0.x at all concurrency levels on most microbenchmarks
  • for 5.6.51 vs 8.0.x
    • for point queries, 5.6.51 is faster at <= 8 clients
    • for range queries without aggregation 5.6.51 is always faster
    • for range queries with aggregation 5.6.51 is faster except at 40 clients
    • for writes, 5.6.51 is almost always faster at 10 or fewer clients (excluding update-index)
Performance summaries

For point queries:
  • 5.7.44 is always faster than 8.0
  • 8.0.28 suffers from bug 102037 - found by me with sysbench, fixed in 8.0.31
  • at what level of concurrency do most things get faster in 5.7 & 8.0 vs 5.6?
    • 5.7.44 becomes faster than 5.6.51 at 6+ clients
    • 8.0.x becomes faster than 5.6.51 at between 10 and 20 clients
    • Two of the microbenchmarks are always faster in 5.6.51 - both do non-covering queries on a secondary index
For range queries without aggregation
  • 5.7.44 is always faster than 8.0.x
  • 5.6.51 is always faster than 5.7.44 and 8.0.x
For range queries with aggregation
  • 5.7.44 is almost always faster than 8.0.x
  • 5.7.44 becomes faster than 5.6.51 at 6+ clients
  • 8.0.x becomes faster than 5.6.51 at 40 clients
For writes
  • For update-index
    • 5.7.44 and 8.0.x are always faster than 5.6.51 at 4+ clients
    • There is an odd drop from ~6X to ~3X for 8.0.32 and 8.0.39 at 20 clients
  • 5.7.44 is mostly faster than 8.0.x for 1 to 20 clients and they have similar results at 40 clients
  • 5.7.44 & 8.0.x are always faster than 5.6.51 at 20+ clients
Builds, configuration and hardware

I compiled MySQL from source for versions 5.6.51, 5.7.44, 8.0.28, 8.0.32, 8.0.39 and 8.0.41.

The server is an ax162-s from Hetzner with 48 cores (AMD EPYC 9454P), 128G RAM and AMD SMT disabled. It uses Ubuntu 22.04 and storage is ext4 with SW RAID 1 over 2 locally attached NVMe devices. More details on it are here. At list prices a similar server from Google Cloud costs 10X more than from Hetzner.

The configuration files are named my.cnf.cz11a_c32r128 and are here for 5.6.51, 5.7.44, 8.0.28, 8.0.32, 8.0.39 and 8.0.41.

Benchmark

I used sysbench and my usage is explained here. To save time I only run 27 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by InnoDB.

The tests run with 8 tables and 10M rows/table, using from 1 to 40 client threads. Read-heavy microbenchmarks run for 180 seconds and write-heavy microbenchmarks run for 300 seconds.

The command lines to run all tests are:
bash r.sh 8 10000000 180 300 md2 1 1 1
bash r.sh 8 10000000 180 300 md2 1 1 4
bash r.sh 8 10000000 180 300 md2 1 1 6
bash r.sh 8 10000000 180 300 md2 1 1 8
bash r.sh 8 10000000 180 300 md2 1 1 10
bash r.sh 8 10000000 180 300 md2 1 1 20
bash r.sh 8 10000000 180 300 md2 1 1 40

Results

For the results below I split the microbenchmarks into 4 groups: point queries, range queries without aggregation, range queries with aggregation, and writes. The spreadsheet with all data is here. Files with performance summaries for relative and absolute QPS are here. Values from iostat and vmstat per microbenchmark are here for 1 client, 4 clients, 6 clients, 8 clients, 10 clients, 20 clients and 40 clients. These help to explain why something is faster or slower because they show how much HW is used per query.

The relative QPS is the following where $version is >= 5.7.44.
(QPS for $version) / (QPS for MySQL 5.6.51)
The numbers in the spreadsheets are the relative QPS. When the relative QPS is > 1 then $version is faster than MySQL 5.6.51. When it is 3.0 then $version is 3X faster than the base case.

Results: charts 

Notes on the charts

  • the y-axis shows the relative QPS
  • the y-axis starts at 0.80 to make it easier to see differences
  • in some cases the y-axis truncates the good outliers, cases where the relative QPS is greater than 1.5. I do this to improve readability for values near 1.0. Regardless, the improvements are nice.
Results: point queries

Summary
  • 5.7.44 is always faster than 8.0
  • 8.0.28 suffers from bug 102037 - found by me with sysbench, fixed in 8.0.31
  • at what level of concurrency do most things get faster in 5.7 & 8.0 vs 5.6?
    • 5.7.44 becomes faster than 5.6.51 at 6+ clients
    • 8.0.x becomes faster than 5.6.51 at between 10 and 20 clients
    • Two of the microbenchmarks are always faster in 5.6.51 - both do non-covering queries on a secondary index
Results: range queries without aggregation

Summary
  • 5.7.44 is always faster than 8.0.x
  • 5.6.51 is always faster than 5.7.44 and 8.0.x
Results: range queries with aggregation

Summary
  • 5.7.44 is almost always faster than 8.0.x
  • 5.7.44 becomes faster than 5.6.51 at 6+ clients
  • 8.0.x becomes faster than 5.6.51 at 40 clients
Results: writes

The relative speedup for the update-index microbenchmark is frequently so large that it obscures the smaller changes on other microbenchmarks. So here I truncate the y-axis for some of the charts (for 6+ clients) and the section that follows has the charts without truncation.

Summary
  • For update-index
    • 5.7.44 and 8.0.x are always faster than 5.6.51 at 4+ clients
    • There is an odd drop from ~6X to ~3X for 8.0.32 and 8.0.39 at 20 clients but you can't see that on the charts in this section because of the truncation. It is visible in the next section. From vmstat I see an increase in CPU/operation (cpu/o) and context switches /operation (cs/o) at 20 clients but not at 40 clients.
  • 5.7.44 is mostly faster than 8.0.x for 1 to 20 clients and they have similar results at 40 clients
  • 5.7.44 & 8.0.x are always faster than 5.6.51 at 20+ clients
Results: charts for writes without truncation

The y-axis is truncated in the charts for writes in the previous section for 6+ clients. This section has those charts without truncation.

March 15, 2025

Postgres 17.4 vs sysbench on a large server, revisited

I recently shared results for Postgres vs sysbench on a large server. The results were mostly boring (it is rare for me to spot regressions in Postgres) but there was one microbenchmark where there was a problem. The problem microbenchmark does a range scan with aggregation and the alleged regression arrived in Postgres 11. With advice from Postgres experts it looked like the problem was an intermittent change in the query plan.

In this post I explain additional tests that I did. The alleged regression was still there, but like many things in DBMS-land it depends -- there is nuance. For now I assume the problem is from a change in the query plan and I will run more tests with more instrumentation to investigate that. Here the alleged regression might be ~5% and occurs only at the highest concurrency level (40 clients).

Postgres, both the DBMS and the community, is not a fan of query plan hints, and the sysbench tests that I use don't add hints to queries. Query plan hints are possible in Postgres via the pg_hint_plan extension. Query plan hints have been good to me with web-scale MySQL. For some of the web-scale workloads that I support, the SQL and schema don't change much and query plan hints have two benefits -- plan stability and CPU reduction. By CPU reduction I mean that the CPU overhead from the optimizer is reduced because it has less work to do.

tl;dr

  • There might be a regression for some range queries, but it is small here (~5%) and only occurs at the highest concurrency level (40 clients). I assume this is from a change in the query plan.
  • I have yet to explain the alleged regression
  • I like query plan hints

The problem microbenchmark

The microbenchmarks in question do a range scan with aggregation and have a name like read-only_range=$X where X has been 10, 100 and 1000. The value of X is the length of the range scan. These are run via oltp_read_only.lua and use these SQL statements.

Build, configuration, hardware, benchmark

These are described in my previous post. But a few things have changed for this report:

  • I only tested Postgres versions 10.23, 11.0 and 11.10
  • I repeated the benchmark for 1, 10, 20 and 40 client threads. Previously I only ran it for 40.
  • I ran the read-only_range=$X microbenchmark for X in 10, 100, 1000, 2000, 4000, 8000, 16000 and 32000. Previously I ran it for X in 10, 100, 10000.
What didn't change is that the tests are run with 8 tables and 10M rows/table, read-heavy microbenchmarks run for 180 seconds and write-heavy run for 300 seconds.

Results: overview

For the results below I split the microbenchmarks into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, one group has queries without aggregation and the other has queries with aggregation. The spreadsheet with all data and charts is here. It has a tab for 1, 10, 20 and 40 clients (named dop=$X for X in 1, 10, 20 and 40).

Files with performance summaries are here. These include summaries of results from vmstat and iostat for each microbenchmark which are here for 1 client, 10 clients, 20 clients and 40 clients.

The relative QPS is the following where $version is either 11.0 or 11.10.
(QPS for $version) / (QPS for Postgres 10.23)
The numbers in the spreadsheets are the relative QPS. When the relative QPS is > 1 then $version is faster than Postgres 10.23.  When it is 3.0 then $version is 3X faster than the base case.

Notes on the charts

  • the y-axis shows the relative QPS
  • the y-axis starts at 0.90 to make it easier to see differences
  • there are 4 charts per section, one for each of 1, 10, 20 and 40 clients
Results: point queries

Summary
  • Postgres 11.x is always faster than 10.23 and usually about 3% faster
Results: range queries without aggregation

Summary
  • There are no regressions
  • Postgres 11.0 & 11.10 get up to 11% more QPS than 10.23 for the range-covered and range-notcovered microbenchmarks
  • For the scan microbenchmark QPS is mostly unchanged between Postgres 10.23, 11.0 and 11.10 but in one case Postgres 11 was slightly slower (relative QPS for 11.0 was 0.99 at 20 clients).
Results: range queries with aggregation

I repeated the read-only microbenchmark using range queries of length 10, 100, 1000, 2000, 4000, 8000, 16000 and 32000.

Summary:
  • In general, the advantage for Postgres 11.0 & 11.10 vs 10.23 was largest for the longest range scans (16000 & 32000) and next largest for the shortest range scans (10 & 100).
  • The comparison for range scans of length 1000, 2000, 4000 and 8000 was interesting. Here the benefit for Postgres 11.0 & 11.10 was not as large and in one case (range=8000 at 40 clients) there was a small regression (~5%). Perhaps there is a change in the query plan.
Results: writes

Summary:
  • Postgres 11.0 & 11.10 were almost always faster than 10.23 and up to 1.75X faster
  • In one case (update-one microbenchmark at 20 clients) Postgres 11.0 & 11.10 were ~5% slower than 10.23. And this is odd because 11.0 and 11.10 were ~1.7X faster at 40 clients on the same microbenchmark. I can only wave my hands for this one. But I don't think this is a regression.
    • The update-one microbenchmark is run by oltp_update_non_index.lua (the name means it updates non-indexed columns), the SQL for the update is here and the schema is here.
    • From vmstat and iostat metrics for 20 clients and for 40 clients, looking at CPU/operation (cpu/o) and context switches/operation (cs/o):
      • For 20 clients these are slightly larger for 11.0 & 11.10 vs 10.23
      • For 40 clients these are significantly smaller for 11.0 & 11.10 vs 10.23
    • The microbenchmarks that aren't read-only are run for 600 seconds each and it is possible for performance variance to come from write debt (writeback, vacuum, etc) inherited from microbenchmarks that preceded update-one. The order in which the update microbenchmarks are run is here and is update-inlist, update-index, update-nonindex, update-one, update-zipf. From the dop=20 tab on the spreadsheet, throughput is 1.1X to 1.3X larger for the 3 update microbenchmarks that precede update-one so inherited write debt might explain this result.

March 14, 2025

Enterprise Readiness with Percona Monitoring and Management: A Look at PMM 3.0.0

Database disasters come with a hefty price tag. According to Information Technology Intelligence Consulting’s 2024 Hourly Cost of Downtime Report, 41% of enterprises face downtime costs ranging from $1 million to over $5 million per hour of outage. How can enterprises ensure their critical database infrastructure remains healthy and performant? This question keeps IT leaders […]

March 13, 2025

MongoDB Multi-Planner Optimizer and Plan Cache

Database systems utilize query planners to optimize data retrieval, primarily through two methods: Rule-Based Optimizers (RBO) and Cost-Based Optimizers (CBO).

  • Rule-Based Optimizer (RBO): This method employs predefined rules to select execution plans, resulting in stable but simplistic query plans. It often struggles with complex queries involving multiple joins, but with a document model the join ordering is solved upfront.

  • Cost-Based Optimizer (CBO): CBO analyzes data distribution statistics to evaluate potential plans and choose the most cost-effective option. Its reliance on possibly outdated statistics can lead to suboptimal decisions, so gathering statistics is crucial. Planning complex queries can take time, so it either relies on a shared plan cache or switches back to simple genetic algorithms when there are many joins.

MongoDB utilizes a document model that minimizes joins, focusing primarily on selecting the appropriate index. It defers execution decisions until runtime instead of relying on RBO or CBO to pick one. Through a shared plan cache and multi-planner mechanism, MongoDB dynamically evaluates and adjusts query plans for optimal execution.

Let's explain with a simple example and use the most challenging situation for a query planner: column skew, where a field has some unique values and a few popular ones. A business case I encountered was at a coffee capsule vendor. Online customers use their customer ID and order some capsules every three months on average. Shops have a special customer ID and record thousands of orders every day. For a query on customer ID plus some other criteria, the best index depends on the customer, but the application obeys the 'parse once, execute many' best practice, so the first execution determines the plan for the others. Tom Kyte told the story of a database that was slow when it rained (because the plan depended on whether the first user to parse the statement had come by bike or by car).

Initialize a collection

I'll build a simple example, a collection of ten million documents with two fields: a and b:

  • a is very selective for a few documents (with value less than 50) but the millions of remaining documents all have the same value of 50.
  • b is uniform, with one million distinct values (i % 1e6) and good selectivity: one value matches ten documents.

Here is how I generated this with a loop on i: a is generated with Math.min(i,50) and b is generated with i%1e6:

// Drop the collection if it already exists
db.franck.drop();
// Insert documents with different data distributions
const bulk = db.franck.initializeUnorderedBulkOp();
const num=1e7;
for (let i = 0; i < num; i++) {
    bulk.insert({ a: Math.min(i,50), b: i%1e6 });
}
const r = bulk.execute();

console.log(`Bulk Operation Summary: Inserted: ${r.insertedCount}, Matched: ${r.matchedCount}, Modified: ${r.modifiedCount}, Deleted: ${r.deletedCount}, Upserted: ${r.upsertedCount}`);

This inserted ten million documents:

test> console.log(`Bulk Operation Summary: Inserted: ${r.insertedCount}, Matched: ${r.matchedCount}, Modified: ${r.modifiedCount}, Deleted: ${r.deletedCount}, Upserted: ${r.upsertedCount}`);             
Bulk Operation Summary: Inserted: 10000000, Matched: 0, Modified: 0, Deleted: 0, Upserted: 0

Create Indexes

As the goal is to show two possible execution plans, I create two indexes, one on each field:

test> db.franck.createIndex({ a: 1 });
a_1
test> db.franck.createIndex({ b: 1 });
b_1

Verify Data

test> // "b" has a good selectivity

test> db.franck.countDocuments({ b: 42 });
10

test> // "a" has a better selectivity when a<50

test> db.franck.countDocuments({ a: 42         });
1

test> // "a" has a very bad selectivity when a=50 (popular value)

test> db.franck.countDocuments({ a: 50         });
9999950

My query will have a predicate on each column. I'll test { a: 42, b: 42 } which returns one document, and { a: 50, b: 42 } which returns ten documents.

Execution profile and plan cache

I'll look at the statistics executions with profiling and at the query plan cache, so I reset them in my lab:

db.franck.getPlanCache().clear();
db.franck.getPlanCache().list();
db.setProfilingLevel(0);
db.system.profile.drop();

First execution and plan cache

I set profiling, and I run a first execution that returns one document:

test> db.setProfilingLevel(2 );
{ was: 0, slowms: 0, sampleRate: 1, ok: 1 }
test> db.franck.find({ a: 42, b: 42 });
[ { _id: ObjectId('67d34e6128ca7c6f95a00acb'), a: 42, b: 42 } ]
test> db.setProfilingLevel(0);
{ was: 2, slowms: 0, sampleRate: 1, ok: 1 }

The initial execution of the MongoDB query planner generated all possible plans stored in the cache: one plan executes an IXSCAN on index a, another performs an IXSCAN on index b, and a third combines both indexes with an AND_SORTED operation:

test> db.franck.getPlanCache().list();

[
  {
    version: '1',
    queryHash: '20F294D4',
    planCacheKey: '2AAF6E88',
    isActive: false,
    works: Long('2'),
    timeOfCreation: ISODate('2025-03-13T21:43:31.189Z'),
    createdFromQuery: { query: { a: 42, b: 42 }, sort: {}, projection: {} },
    cachedPlan: {
      stage: 'FETCH',
      filter: { b: { '$eq': 42 } },
      inputStage: {
        stage: 'IXSCAN',
        keyPattern: { a: 1 },
        indexName: 'a_1',
        isMultiKey: false,
        multiKeyPaths: { a: [] },
        isUnique: false,
        isSparse: false,
        isPartial: false,
        indexVersion: 2,
        direction: 'forward',
        indexBounds: { a: [ '[42, 42]' ] }
      }
    },
    creationExecStats: [
      {
        nReturned: 1,
        executionTimeMillisEstimate: 0,
        totalKeysExamined: 1,
        totalDocsExamined: 1,
        executionStages: {
          stage: 'FETCH',
          filter: { b: { '$eq': 42 } },
          nReturned: 1,
          executionTimeMillisEstimate: 0,
          works: 2,
          advanced: 1,
          needTime: 0,
          needYield: 0,
          saveState: 0,
          restoreState: 0,
          isEOF: 1,
          docsExamined: 1,
          alreadyHasObj: 0,
          inputStage: {
            stage: 'IXSCAN',
            nReturned: 1,
            executionTimeMillisEstimate: 0,
            works: 2,
            advanced: 1,
            needTime: 0,
            needYield: 0,
            saveState: 0,
            restoreState: 0,
            isEOF: 1,
            keyPattern: { a: 1 },
            indexName: 'a_1',
            isMultiKey: false,
            multiKeyPaths: { a: [] },
            isUnique: false,
            isSparse: false,
            isPartial: false,
            indexVersion: 2,
            direction: 'forward',
            indexBounds: { a: [Array] },
            keysExamined: 1,
            seeks: 1,
            dupsTested: 0,
            dupsDropped: 0
          }
        }
      },
      {
        nReturned: 1,
        executionTimeMillisEstimate: 0,
        totalKeysExamined: 2,
        totalDocsExamined: 2,
        executionStages: {
          stage: 'FETCH',
          filter: { a: { '$eq': 42 } },
          nReturned: 1,
          executionTimeMillisEstimate: 0,
          works: 2,
          advanced: 1,
          needTime: 1,
          needYield: 0,
          saveState: 0,
          restoreState: 0,
          isEOF: 0,
          docsExamined: 2,
          alreadyHasObj: 0,
          inputStage: {
            stage: 'IXSCAN',
            nReturned: 2,
            executionTimeMillisEstimate: 0,
            works: 2,
            advanced: 2,
            needTime: 0,
            needYield: 0,
            saveState: 0,
            restoreState: 0,
            isEOF: 0,
            keyPattern: { b: 1 },
            indexName: 'b_1',
            isMultiKey: false,
            multiKeyPaths: { b: [] },
            isUnique: false,
            isSparse: false,
            isPartial: false,
            indexVersion: 2,
            direction: 'forward',
            indexBounds: { b: [Array] },
            keysExamined: 2,
            seeks: 1,
            dupsTested: 0,
            dupsDropped: 0
          }
        }
      },
      {
        nReturned: 1,
        executionTimeMillisEstimate: 0,
        totalKeysExamined: 2,
        totalDocsExamined: 1,
        executionStages: {
          stage: 'FETCH',
          filter: { '$and': [ [Object], [Object] ] },
          nReturned: 1,
          executionTimeMillisEstimate: 0,
          works: 2,
          advanced: 1,
          needTime: 1,
          needYield: 0,
          saveState: 0,
          restoreState: 0,
          isEOF: 0,
          docsExamined: 1,
          alreadyHasObj: 0,
          inputStage: {
            stage: 'AND_SORTED',
            nReturned: 1,
            executionTimeMillisEstimate: 0,
            works: 2,
            advanced: 1,
            needTime: 1,
            needYield: 0,
            saveState: 0,
            restoreState: 0,
            isEOF: 0,
            failedAnd_0: 0,
            failedAnd_1: 0,
            inputStages: [ [Object], [Object] ]
          }
        }
      }
    ],
    candidatePlanScores: [ 2.5002, 1.5002, 1.5001 ],
    indexFilterSet: false,
    estimatedSizeBytes: Long('4693'),
    host: '4b49f6ca6400:27017'
  }
]

Those plans were evaluated (creationExecStats) with { a: { '$eq': 42 } } and { b: { '$eq': 42 } } and scored. The best score goes to the most selective one, the index on a, which examines one key and fetches one document.
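
The same candidate-plan race can also be reproduced on demand without profiling. As a minimal sketch, assuming the collection and indexes created above, explain with the "allPlansExecution" verbosity makes the multi-planner evaluate every candidate plan for this query shape and report per-plan execution statistics:

// Hedged example: list all candidate plans considered for this query shape,
// with the winning plan, the rejected plans, and per-plan execution stats.
db.franck.find({ a: 42, b: 42 }).explain("allPlansExecution");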

Next executions with the same values

I run the same query nine more times:

db.setProfilingLevel(2 );
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.franck.find({ a: 42, b: 42 });
db.setProfilingLevel(0);

I extract some interesting information from the profiling of those ten executions:

test> db.system.profile.find({ ns: "test.franck" }).sort({ ts: 1 }).limit(100).forEach((doc) => { if (doc.execStats) console.log(`${doc.ts.toISOString().slice(11, 23)} works: ${doc.execStats.works.toString().padStart(5)} keys: ${doc.keysExamined.toString().padStart(5)} docs: ${doc.docsExamined.toString().padStart(5)} ret: ${doc.nreturned.toString().padStart(5)} ${doc.execStats.stage.padStart(12)} ${doc.planSummary.padStart(12)} exec(plan): ${doc.millis.toString().padStart(5)}ms (${doc.planningTimeMicros.toString().padStart(8)}us) query/plan ${doc.queryHash}/${doc.planCacheKey} ${doc.queryFramework} ${doc.fromPlanCache ? 'fromPlanCache ' : ''}${doc.fromMultiPlanner ? 'fromMultiPlanner ' : ''}${doc.replanned ? doc.replanReason : ''}`); });

21:43:31.191 works:     3 keys:     1 docs:     1 ret:     1        FETCH IXSCAN { a: 1 } exec(plan):     7ms (    7063us) query/plan 20F294D4/2AAF6E88 classic fromMultiPlanner                                                             
21:56:49.938 works:     3 keys:     1 docs:     1 ret:     1        FETCH IXSCAN { a: 1 } exec(plan):     9ms (    8899us) query/plan 20F294D4/2AAF6E88 classic fromMultiPlanner                                                             
21:56:49.982 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     1ms (    1264us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache                                                                
21:56:50.014 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     0ms (     233us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache                                                                
21:56:50.053 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     0ms (     240us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache                                                                
21:56:50.085 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     0ms (     235us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache                                                                
21:56:50.108 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     0ms (     254us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache                                                                
21:56:50.133 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     0ms (     230us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache                                                                
21:56:50.157 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     0ms (     246us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache                                                                
21:56:50.188 works:     2 keys:     1 docs:     1 ret:     1  CACHED_PLAN IXSCAN { a: 1 } exec(plan):     0ms (     246us) query/plan 20F294D4/2AAF6E88 classic fromPlanCache     

In the first two executions, all plans were evaluated using the Multi-Planner. From the third execution onward, only the winning plan from the Plan Cache was executed, resulting in decreased execution time, including planning time. The best index, which was fully executed, is the one on a.
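
The promotion of the cached entry can be confirmed by querying the plan cache directly. A minimal sketch, assuming a MongoDB version (4.4+) where PlanCache.list() accepts an aggregation pipeline; the fields projected here come from the list() output shown earlier:

// Hedged example: after repeated executions the entry should show isActive: true;
// its works value is what the server later uses when deciding whether to replan.
db.franck.getPlanCache().list([
  { $project: { planCacheKey: 1, isActive: 1, works: 1 } }
]);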

More executions with different values

I execute the same query shape but with a different value, for which I know that the index on a is not good:

db.setProfilingLevel(2);
db.franck.find({ a: 50, b<... (truncated)

Using Barman Cloud Utilities for Backups and WAL Archiving in PostgreSQL

In this brief blog post, we will talk about Barman cloud utilities, which greatly ease the process of storing backups on cloud platforms like GCP, AWS, Azure, etc. Backups are of paramount importance, and in PostgreSQL, we also need to retain the WAL files, which can be used for various purposes like incremental backups or […]

IO devices and latency

Take an interactive journey through the history of IO devices, and learn how IO device latency affects performance.

March 12, 2025

HAProxy-Patroni Setup Using Health Check Endpoints and Debugging

Patroni has a REST API that allows HAProxy and other kinds of load balancers to perform HTTP health checks. This blog post explains how HAProxy uses health check endpoints with Patroni and how to debug status issues. HAProxy and Patroni setup: Sample configuration: OPTIONS /primary: This is the primary health […]

March 11, 2025

Ready for the Enterprise: New Solution Updates from Percona

The latest releases from Percona bring targeted improvements that help organizations streamline deployments, strengthen security, and optimize performance in demanding enterprise environments. From Kubernetes-native database management to security-hardened monitoring, these updates ensure businesses can scale efficiently while maintaining high availability and compliance. Percona Everest: Simplified deployments for complex infrastructure Percona Everest, which automates the provisioning […]

March 10, 2025

Extending MySQL 5.7: Percona Releases Post-End-of-Life Source Code to the Community

Percona is excited to announce the community release of our MySQL 5.7 End-of-Life (EOL) source code for both Percona Server for MySQL (PS) version 5.7.44-52 and Percona XtraDB Cluster (PXC) version 5.7.44-31.65.5. This release is an accumulation of several customer-only releases we’ve done as part of our post-EOL support for MySQL 5.7, and we’re happy […]

Publish and Perish: Why Ponder Stibbons Left the Ivory Tower

(With apologies to Terry Pratchett)

Ponder Stibbons sat in the shadowy recesses of the Uncommon Room, clutching a lukewarm cup of tea and the last vestiges of his patience. Across from him, the Dean, wrapped in his usual self-satisfaction, puffed his pipe and surveyed the room as if it were all a great and glorious joke that he alone understood.

"Leaving?" the Dean spluttered, when Ponder finally managed to slide the conversation in that direction. "What do you mean, leaving? Where would you even go?"

"Industry," said Ponder, trying not to make it sound like a curse. "Databases. Big ones."

The Dean blinked. "But we have databases here. The Archchancellor’s hat alone contains centuries of magical indexing..."

"Yes, but in the real world, we use them to actually store and retrieve information, not just argue about what information is worth storing in the first place," Ponder snapped. "And I’ll be paid properly."

"Pah," scoffed the Dean. "The true academic mind isn’t in it for the money."

"Yes, I noticed that," Ponder said. "But, funnily enough, the people running the place seem to be. We have more suits than I can count, and none of them seem to know how to count. We used to do research, but now we do ‘strategic impact planning’ and ‘value-oriented magical development initiatives’."

The Dean made a valiant effort at looking wise. "Well, you have to admit, Stibbons, there’s no point in just doing magic for magic’s sake."

"Then what’s the point of this place?" Ponder snapped. "If we’re not here to push the boundaries of knowledge, then we’re just a very exclusive, very inefficient vocational school for people too dim-witted to teach themselves. The students don’t want to learn magic, they want to finish a course of study in three semesters so they can get a cushy post enchanting chairs for the Patrician’s office."

The Dean waved a hand vaguely. "That’s always been the case."

"It wasn’t always quite this blatant," Ponder retorted. "Back in the day, we had time to actually think about things, research things, discuss things. Now it’s all quotas, reviews, assessments! And if the students don’t like you, they file complaints! Whining course reviews! As if the study of magic were meant to be… comfortable! We used to throw fireballs at people! Now we have to ‘create a nurturing environment for knowledge acquisition’."

"Ah," said the Dean, puffing his pipe. "Well, times change."

"Not here, they don’t," said Ponder bitterly. "That’s the whole problem. The world outside moves forward while we shuffle our papers and complain that we don’t have enough funding to investigate basic transmutation spells."

The Dean chuckled. "You always were an idealist, Stibbons. Research takes time. Why, I myself have been preparing to write a paper on the comparative metaphysical mass of a thaum for—"

"Twenty years!" snapped Ponder. "And you’ve produced nothing!"

"Ah, but when I do…" The Dean waggled his fingers mysteriously.

"And even if you did, it wouldn’t get published," Ponder continued. "Do you know what happens to papers these days? You send them off, wait a year, and then get them back covered in cryptic, punishing comments from anonymous reviewers who seem to hate the very idea of scholarship."

"Ah yes, peer review!" said the Dean enthusiastically. "Ensuring the highest quality!"

"Ensuring nobody gets anything done!" Ponder snapped. "It’s not review, it’s ritualized academic hazing. Half the time the reviewers are just competitors trying to torpedo your work so their own gets published first. It’s like trying to hold a debate where the other side gets to set your chair on fire before you start speaking."

"Ah, well," the Dean said philosophically. "At least we all suffer equally."

"No, we don’t!" Ponder threw up his hands. "It’s a zero-sum game! Everyone hoards their ideas because collaboration just means giving someone else ammunition to shoot down your next grant proposal! We’re supposed to be discovering new knowledge, and instead we spend all our time writing meticulously worded rebuttals to complaints about our font choice."

"Well, I suppose that’s just how academia works," the Dean said, smiling in what he clearly thought was a reassuring way.

Ponder sighed. "You know what’s coming, don’t you? The golems are already doing spellwork. Automated spell matrices. Soon, students won’t even need to be here. They’ll just buy the knowledge in convenient pre-packaged modules."

"Nonsense," the Dean huffed. "You can’t replace a wizard with a machine."

"That’s exactly what I thought," said Ponder. "Until I have seen them in the wild. It turned out you very well can."

The Dean froze. "What?"

"Oh yes. A spell-engine that automates theorem derivation and spell stabilization. Does in minutes what takes us years. Accurate, tireless, and best of all, doesn’t insist on being called ‘Professor’."

The Dean was pale. "But… but the prestige, Stibbons! The robes! The titles! The long lunches!"

"Industry has long lunches too," said Ponder. "And they reimburse you in two days instead of two months."

The Dean looked positively ill. "And you’re going to… what? Spend your days solving real problems? Making a tangible difference?"

"Yes."

"But… how will you cope?" The Dean’s voice was faint. "Without the endless meetings? The unread grant proposals? The departmental infighting over whose name goes first on a paper no one will read?"

Ponder stood, dusted off his robes, and picked up his satchel. "I think I’ll manage."


(Written by a golem named ChatGPT with some prompting)