a curated list of database news from authoritative sources

June 29, 2026

Hybrid Search (Full-Text and Vector Similarity) in HorizonDB

You may have read about Hybrid search for Azure HorizonDB. It is presented as combining BM25 full‑text and vector similarity in a single query. But how are they actually combined? The execution plan answers that.

In this post, I use a small synthetic product catalog to ensure the entire demo is reproducible. The text is sufficiently realistic for BM25 queries, and the embeddings are deterministic synthetic vectors, allowing you to run the full script without needing an embedding model. If you have azure_openai.create_embeddings() configured, you can substitute the synthetic embedding function with actual embeddings.

I will test two separate queries, followed by three methods of combining BM25 with vector retrieval:

  • cascade (BM25 → vector)
  • cascade (vector → BM25)
  • hybrid (parallel + fusion)

These are not interchangeable; they represent distinct trade-offs between recall and performance.

Setup

This example uses:

  • pg_textsearch for BM25 full-text search
  • pgvector for the vector type and distance operator
  • pg_diskann for the vector index, when available

The extensions must be listed in azure.extensions to enable CREATE EXTENSION.

In addition, pg_textsearch must be loaded on startup:

Here is my HorizonDB configuration:

postgres=> select version();
                                                        version
-----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 17.9 (Azure HorizonDB (70f3b593ec7)(release)) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 13.2.0, 64-bit

postgres=> show azure.extensions;

             azure.extensions
------------------------------------------
 pg_diskann,vector,pg_textsearch,azure_ai

postgres=> show shared_preload_libraries;

                               shared_preload_libraries
---------------------------------------------------------------------------------------
 pg_textsearch,azure,orion_storage,pg_availability,pg_qs,pgms_stats,pgms_wait_sampling

I've set up the extension features, functions, data types, and operators using CREATE EXTENSIONS and created a schema for this demo:


create extension if not exists vector;
create extension if not exists pg_textsearch;
create extension if not exists pg_diskann;

drop schema if exists hybrid_demo cascade;
create schema hybrid_demo;

set search_path = hybrid_demo, public, pgfts;

For the demo: a deterministic embedding function

I should use azure_openai.create_embeddings(), as I mentioned in the previous blog post. However, for this demo, I opted to create a fake embedding function that maps text to a simple vector(16) based on product keywords. This isn't an embedding model, but it ensures that execution plans are reproducible without relying on a model. It also simplifies the concept of embeddings with a basic, small-dimension LIKE-style semantic vector. The purpose is to demonstrate a search query using both a real text search and a vector-based semantic search.

create or replace function demo_embedding(txt text)
returns vector(16)
language sql
immutable
parallel safe
as $$
select (
  '[' ||
  concat_ws(',',
    case when txt ~* '(mid|century)'       then '1.0' else '0.0' end,
    case when txt ~* '(modern|minimalist)' then '1.0' else '0.0' end,
    case when txt ~* '(chair|seat)'        then '1.0' else '0.0' end,
    case when txt ~* '(table|desk)'        then '1.0' else '0.0' end,
    case when txt ~* '(sofa|couch)'        then '1.0' else '0.0' end,
    case when txt ~* '(lamp|light)'        then '1.0' else '0.0' end,
    case when txt ~* '(wood|walnut|oak)'   then '1.0' else '0.0' end,
    case when txt ~* '(metal|steel)'       then '1.0' else '0.0' end,
    case when txt ~* '(leather)'           then '1.0' else '0.0' end,
    case when txt ~* '(fabric|linen)'      then '1.0' else '0.0' end,
    case when txt ~* '(industrial)'        then '1.0' else '0.0' end,
    case when txt ~* '(scandinavian)'      then '1.0' else '0.0' end,
    case when txt ~* '(office)'            then '1.0' else '0.0' end,
    case when txt ~* '(dining)'            then '1.0' else '0.0' end,
    case when txt ~* '(classic|vintage)'   then '1.0' else '0.0' end,
    case when txt ~* '(outdoor|garden)'    then '1.0' else '0.0' end
  ) ||
  ']'
)::vector;
$$;

Using a straightforward CASE that searches for specific words, I create the semantic vector for a text. This method employs vector similarity solely for demonstration, without utilizing a model. You can envision AI models doing the same but with thousands of dimensions driven by large language models (LLMs) rather than keywords.

Table with text and embeddings

I created a product catalog table with structured fields, text, and a single vector column:

create table products (
  product_id bigint primary key,
  category   text not null,
  brand      text not null,
  price      numeric(10,2) not null,
  title      text not null,
  description text not null,
  search_text text generated always as (
    title || ' ' || description
  ) stored,
  embedding vector(16) not null
);

I loaded a few hundred thousand rows. The data is synthetic but intentionally patterned: products have categories, materials, styles, and terms that are useful for both BM25 and vector search:

insert into products ( product_id, category, brand, price, title, description, embedding)
with generated as (
  select
    g as product_id,
    (array[ 'chair', 'table', 'sofa', 'lamp', 'desk', 'shelf' ])[1 + (g % 6)] as category,
    (array[ 'Contoso', 'Fabrikam', 'Northwind', 'AdventureWorks', 'Wingtip', 'Tailspin' ])[1 + (g % 6)] as brand,
    (array[ 'mid-century modern', 'industrial', 'scandinavian', 'classic', 'minimalist', 'outdoor' ])[1 + (g % 6)] as style,
    (array[ 'walnut wood', 'black metal', 'oak wood', 'leather', 'fabric', 'brushed steel' ])[1 + ((g / 7) % 6)] as material, (25 + (g % 500))::numeric(10,2) as price
  from generate_series(1, 200000) as g
)
select product_id, category, brand, price,
  initcap(style || ' ' || material || ' ' || category) as title,
  concat( style, ' ', category, ' by ', brand, ' with ', material, '. Designed for ',
    case
      when category in ('chair', 'sofa') then 'living room seating'
      when category in ('table', 'desk') then 'home office and dining'
      when category = 'lamp' then 'warm interior lighting'
      else 'storage and display'
    end,
    '. Product code ', product_id, '.'
  ) as description,
  demo_embedding(
    concat_ws(' ', style, material, category, brand)
  ) as embedding
from generated;

I checked a few rows:

postgres=>
 select product_id, title, category, brand, price, search_text, embedding
from products where price >= 42
order by product_id
limit 10;

 product_id |               title               | category |     brand      | price |                                                               search_text                                                               |             embedding
------------+-----------------------------------+----------+----------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------
         17 | Outdoor Oak Wood Shelf            | shelf    | Tailspin       | 42.00 | Outdoor Oak Wood Shelf outdoor shelf by Tailspin with oak wood. Designed for storage and display. Product code 17.                      | [0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1]
         18 | Mid-Century Modern Oak Wood Chair | chair    | Contoso        | 43.00 | Mid-Century Modern Oak Wood Chair mid-century modern chair by Contoso with oak wood. Designed for living room seating. Product code 18. | [1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]
         19 | Industrial Oak Wood Table         | table    | Fabrikam       | 44.00 | Industrial Oak Wood Table industrial table by Fabrikam with oak wood. Designed for home office and dining. Product code 19.             | [0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0]
         20 | Scandinavian Oak Wood Sofa        | sofa     | Northwind      | 45.00 | Scandinavian Oak Wood Sofa scandinavian sofa by Northwind with oak wood. Designed for living room seating. Product code 20.             | [0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0]
         21 | Classic Leather Lamp              | lamp     | AdventureWorks | 46.00 | Classic Leather Lamp classic lamp by AdventureWorks with leather. Designed for warm interior lighting. Product code 21.                 | [0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0]
         22 | Minimalist Leather Desk           | desk     | Wingtip        | 47.00 | Minimalist Leather Desk minimalist desk by Wingtip with leather. Designed for home office and dining. Product code 22.                  | [0,1,0,1,0,0,0,0,1,0,0,0,0,0,0,0]
         23 | Outdoor Leather Shelf             | shelf    | Tailspin       | 48.00 | Outdoor Leather Shelf outdoor shelf by Tailspin with leather. Designed for storage and display. Product code 23.                        | [0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1]
         24 | Mid-Century Modern Leather Chair  | chair    | Contoso        | 49.00 | Mid-Century Modern Leather Chair mid-century modern chair by Contoso with leather. Designed for living room seating. Product code 24.   | [1,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0]
         25 | Industrial Leather Table          | table    | Fabrikam       | 50.00 | Industrial Leather Table industrial table by Fabrikam with leather. Designed for home office and dining. Product code 25.               | [0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,0]
         26 | Scandinavian Leather Sofa         | sofa     | Northwind      | 51.00 | Scandinavian Leather Sofa scandinavian sofa by Northwind with leather. Designed for living room seating. Product code 26.               | [0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0]


(10 rows)

I tested some searches on search_text and embedding and started indexing those columns.

Create the indexes (BM25 and DiskANN)

First, I created the BM25 full-text index. HorizonDB’s BM25 full-text search brings BM25 ranking into PostgreSQL without a separate Elasticsearch/OpenSearch Search service. It uses the open-source extension pg_textsearch:


postgres=> create index products_bm25_idx
            on products
            using bm25 (search_text)
            with (text_config = 'english')
;

NOTICE:  BM25 index build started for relation products_bm25_idx
NOTICE:  Using text search configuration: english
NOTICE:  Using index options: k1=1.20, b=0.75
NOTICE:  parallel index build: launched 2 of 2 requested workers

NOTICE:  BM25 index build completed: 200000 documents, avg_length=16.17

CREATE INDEX

Then I created the vector index using DiskANN and cosine similarity:


postgres=> create index products_embedding_diskann_idx
            on products
            using diskann (embedding vector_cosine_ops)
;

CREATE INDEX

I gathered the statistics:


postgres=> vacuum analyze products;

ANALYZE

My data set is ready for queries.

Query 1: BM25 only

I started with a keyword search, using to_bm25query() to define the BM25 query. Ranking uses BM25 and is performed with the <@> operator. Top-k queries use this operator in ORDER BY ... LIMIT.

BM25 is implemented as an index-backed operator that must be bound to a specific index. This is why prepared statements require explicitly naming the index:


postgres=> prepare query1 (text, int) as
select
  p.product_id,
  p.title,
  p.category,
  p.brand,
  p.price
from products p
order by p.search_text <@> to_bm25query( $1 , 'products_bm25_idx' )
limit $2;

postgres=> explain (analyze, buffers, verbose, costs off) 
           execute query1 ('mid century modern wooden chair', 10)
;

                                                                 QUERY PLAN                                                          
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=0.268..0.442 rows=10 loops=1)
   Output: product_id, title, category, brand, price, ((search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query))
   Buffers: shared hit=582
   ->  Index Scan using products_bm25_idx on hybrid_demo.products p (actual time=0.267..0.440 rows=10 loops=1)
         Output: product_id, title, category, brand, price, (search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
         Order By: (p.search_text <@> 'products_bm25_idx:mid century modern wooden chair'::bm25query)
         Buffers: shared hit=582
 Query Identifier: -4837396746836655100
 Planning:
   Buffers: shared hit=1
 Planning Time: 0.116 ms
 Execution Time: 0.460 ms

(12 rows)

The Index Scan returns the Top-10 result ('rows=10') directly in ranking order (Order By).

Lexical retrieval is good for exact words, rare terms, product codes, and anything where the user expects the same token to appear in the document.

Query 2: ANN only

I further explored the semantic aspect by performing similarity search using the cosine distance operator (<=>) for vectors:

postgres=> prepare query2 (text, int) as
select
  p.product_id,
  p.title,
  p.category,
  p.brand,
  p.price,
  p.embedding <=> demo_embedding($1) as distance
from products p
order by p.embedding <=> demo_embedding($1)
limit $2;

postgres=> explain (analyze, buffers, verbose, costs off)
execute query2 ('modern wooden chair', 10);

                                                         QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------------------
 Limit (actual time=0.374..0.399 rows=10 loops=1)
   Output: product_id, title, category, brand, price, ((embedding <=> '[0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0]'::vector))
   Buffers: shared hit
                                    
                                    
                                    
                                    
                                

Skipping Percona Server for MySQL 8.4.9 and 9.7.0

Upstream MySQL published an out-of-schedule release this week with two high-severity CVE fixes. We’ve pulled those fixes into our next builds and are skipping the two versions we had already queued: Percona Server for MySQL 8.4.9 and 9.7.0. These fixes arrived through Oracle’s new monthly Critical Security Patch Updates (CSPUs), which Oracle announced begin May … Continued

The post Skipping Percona Server for MySQL 8.4.9 and 9.7.0 appeared first on Percona.

CedarDB Launches HTAP (10 Years Ago)

The term “HTAP” is the holy grail of database systems. It describes what every data engineer would love: Being able to do all your data Processing, no matter if it’s complex Analytics or fast-paced Transactional operations in a single Hybrid system.

Many system have tried to enable HTAP but have found that building a truly hybrid system is an impossible challenge. The keynote talk at Databricks Data + AI Summit 2026 highlighted that it is impossible to get a query on the analytical warehouse to execute in less than 1 second. As a new solution that solves this, the Databricks cofounder Reynold Xin announced “Databricks LTAP” and directly cited our paper on “Morsel-Driven Parallelism” as one of the “latest and coolest academic papers”:

With their LTAP offering, Databricks is actually able to offer sub-second transactional performance while maintaining its well-known analytical performance. That’s genuinely impressive! But if you look under the hood (e.g. by watching Databricks engineers talk about the technology behind LTAP), you will see that LTAP still uses the same classical separation between operational and analytical data. By their own description, LTAP does not run on one engine. It keeps a transactional engine and an analytical engine and unifies them at the storage layer. You can think of this being a really good, really fast zero-ETL system.

Zero-ETL is not enough, though. As long as you run two engines, you have two sources of truth, and there is a moment where data crosses from one to the other. Zero-ETL makes that gap small, but even if the name suggests otherwise, it cannot make it zero. The data still has to move from the system that wrote it to the system that reads it. When you really want HTAP, you really care about this gap being actually zero. Take fraud detection: a warehouse can flag suspicious activity only after the fact, but you want to catch it before the money moves. For zero read lag and a single source of truth, you have to re-think the entire system around one engine that runs both workloads natively.

Morsel-Driven Parallelism, a research paper from the research project behind CedarDB

It’s almost impossible to change an existing system designed for either transactions or analytics into a true hybrid system. Database researchers have known this for several decades already. We weren’t the first to attempt it, and we didn’t coin the term HTAP. Systems like SAP HANA and HyPer went after it before us but required your data to fit completely in main memory. Sadly, this didn’t work out. What made us reconsider that HTAP was back on the table was that fast SSDs became widely available. So ten years ago, we started Umbra, a research project with one goal: building a truly HTAP system. CedarDB is built on that foundation.

Since then, new developments in the database space focused only on analytics, leading to great analytical systems such as Databricks Lakehouse and Snowflake, and ClickHouse. It turns out the existing transactional systems, even regular PostgreSQL, scaled to even the most demanding AI workloads. What’s hard is making sure transactions and analytics don’t slow each other down when running at the same time.

To make this work well, you need to unify both the execution engine and the storage format without introducing new bottlenecks. For that, we built a hybrid column-row format as our data layer. It can support fast writes on hot data, automatically transforming between hot write-optimized and cold compressed data as needed, fully transparently as a single copy.

Not only that, we also built the foundations for fast analytical processing on modern hardware. You can find an overview of key techniques on our technology page, including morsel-driven parallelism, data-centric code generation, a cost-based optimizer with full subquery decorrelation, and a buffer manager designed to fully utilize fast SSDs.

Databricks LTAP is coming soon, CedarDB is in production today! Good to see the industry catching up to the problem. Come see what the answer looks like when it’s already running.

June 27, 2026

Azure AI on HorizonDB

The azure_ai extension on HorizonDB adds generative AI functions to PostgreSQL, allowing users to utilize Azure AI's generation, ranking, and embedding models. Here's a four-step example that demonstrates how to define default models, set up endpoints, register them, and use the in SQL queries.

1. Allow and install the extension

The azure_ai extension must be set in azure.extensions from the parameter group:

Once enabled, you can CREATE EXTENSION:

postgres=> SHOW azure.extensions;

             azure.extensions
------------------------------------------
 pg_diskann,vector,pg_textsearch,azure_ai

(1 row)

postgres=> CREATE EXTENSION IF NOT EXISTS azure_ai;

CREATE EXTENSION

The functions are available, but I don't have access to an AI model yet:

postgres=> SELECT azure_ai.generate(
            'Hello'
           );

ERROR:  Endpoint not found.
DETAIL:  Please set/register the model.

An AI Model Management feature is coming to HorizonDB, currently in private preview, which is basically a zero-setup mode for azure_ai, but for the moment, I will do it manually.

2. Deploy a model in Azure

I go to the Microsoft Foundry | Azure OpenAI, hit "Create" and select "Azure OpenAI":

I set my resource group and region:

I use the default network setting that allows all networks, including the internet, to access this resource.

Once created, the next step is to "go to resource" in order to deploy a model:

In the model catalog, I select the chat models, for LLM tasks and generation purposes:

I hit "Uset this model" and deploy gpt-4o-mini:

The URL and API key is displayed in the Home section:

The details for the available models is in the Deployments section:

I can also get the parameters from the Python sample:

endpoint = "https://frankpachot-ai.openai.azure.com/"
model_name = "gpt-4o-mini"
deployment = "gpt-4o-mini"

subscription_key = "<your-api-key>"
api_version = "2024-12-01-preview"

client = AzureOpenAI(
    api_version=api_version,
    azure_endpoint=endpoint,
    api_key=subscription_key,

This information will be used to register the model from PostgreSQL.

I'll use them directly, but in production, keys should be stored in a secrets management system instead of being hardcoded in SQL.

3. Register the model in PostgreSQL

In HorizonDB, I can add AI models with model_registry.model_add() which takes the following parameters:

parameter meaning
alias SQL name
endpoint where to call the model
deployment which model instance
model_name metadata / capability
api_version protocol version
auth_type how to authenticate
key credential

The deployment name must exactly match the one defined in Azure. This is not the model name but the deployment identifier.

Here is the registration with the information from the chat model I deployed:

postgres=> SELECT model_registry.model_add(
    'default-chat',                                 -- alias
    'https://frankpachot-ai.openai.azure.com/',     -- azure_endpoint
    'gpt-4o-mini',                                  -- deployment name
    'gpt-4o-mini',                                      -- model name
    '2024-12-01-preview',                           -- api_version
    'subscription-key',                             -- auth type
    'FR3Xcz5VXiHSbz8Eqeo5qXsyKqgxrFeYCSuqOv...'     -- api_key
);

                       model_add
--------------------------------------------------------
 Model 'default-chat' (gpt-4o-mini) added successfully.

(1 row)


PostgreSQL can now invoke the Azure OpenAI deployment.

With this solution, the database itself doesn’t host the model. It only contains the information needed to call it, such as the endpoint, deployment, and key, but the calls to Azure OpenAI are transparent to the users.

4. Use the model from SQL queries

Now, azure_ai.generate() can use the registered model for generative AI:

postgres=> SELECT azure_ai.generate(
    'Who is Slonik and how does he look like? context:'|| version()
  , 'default-chat'
);

generate     
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Slonik is a mascot for PostgreSQL, often depicted as a friendly, cartoonish elephant. He typically has a blue-gray body, large floppy ears, and a cheerful expression. Slonik embodies the PostgreSQL community's spirit and is often used in promotional materials and events related to PostgreSQL.
(1 row)

postgres=>

The model name default-chat is the default for this function. I can omit it.

Instead of employing the model for text generation, I can utilize it to validate my text with azure_ai.is_true().

Let's verify the correct names for our favorite database:

postgres=> select azure_ai.is_true( format (
           '%s is the right name for Slonik''s database', unnest
            )), string_agg(unnest,',') from unnest(ARRAY[
            'PostgreSQL','Postgres','PG','pgsql','postgresql',
            'POSTGRES','pgdb','postgres-db','SQL','psql',
            'postmaster','postgré','postgrès','posgress',
            'posgresql','postgrasql','postgray','postgrest',
            'postgrezql','postgrex','postgresesql','postgresequel',
            'Post-Ingres',' Post-Gres-Q-L','Postgres95','pg-sql',
            'pgserver','pg-database','HorizonDB',
            'slonik-db','elephant-db','the elephant'
             ]) group by 1 order by 1 desc;

agg
---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 t       | PostgreSQL,Postgres,postgresql,POSTGRES,the elephant
 f       | PG,pgsql,pgdb,postgres-db,SQL,psql,postmaster,postgré,postgrès,posgress,posgresql,postgrasql,postgray,postgrest,postgrezql,postgrex,postgresesql,postgresequel,Post-Ingres, Post-Gres-Q-L,Postgres95,pg-sql,pgserver,pg-database,HorizonDB,slonik-db,elephant-db

(2 rows)

Ok, according to gpt-4o-mini, "The Elephant" is a valid name for PostgreSQL.

The chat model can also be used to extract structured information from unstructured text. For example, from the version() banner I'm interested in the PostgreSQL compatibility version, and the name of the managed service:

postgres=> SELECT version();

                                                        version
-----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 17.9 (Azure HorizonDB (70f3b593ec7)(release)) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 13.2.0, 64-bit
(1 row)

postgres=> SELECT azure_ai.extract(
             version() , -- text from which to extract
             ARRAY['PostgreSQL compatibility','Cloud service name']
            );

                                    extract
-------------------------------------------------------------------------------
 {"Cloud service name": "Azure HorizonDB", "PostgreSQL compatibility": "17.9"}

(1 row)

In order to generate embeddings, I need to deploy a dedicated model:

With the same method I used for chat models, I got the endpoint, name, and key to access this embedding model:

postgres=> SELECT model_registry.model_add(
    'default-embedding',                            -- alias
    'https://franckpachot-ai.openai.azure.com/',     -- azure_endpoint
    'text-embedding-3-small',                       -- deployment name
    'text-embedding-3-small',                       -- model name
    '2024-12-01-preview',                           -- api_version
    'subscription-key',                             -- auth type
    'FR3Xcz5VXiHSbz8Eqeo5qXsyKqgxrFeYCSuqOv...'     -- api_key
);

                              model_add
---------------------------------------------------------------------
 Model 'default-embedding' (text-embedding-3-small) added successfully.

(1 row)

Using the alias default-embedding, which I defined when registering the model deployment, I can generate embeddings for specific text:

postgres=> SELECT azure_openai.create_embeddings(
    'default-embedding',
    'hello world'
);
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 create_embeddings                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
----------------------------------------------------------------
{-0.006729126,-0.03918457,0.03414917,0.028747559,-0.024841309,-0.041992188,-0.030288696,0.049316406,-0.013969421,-0.017669678,0.015396118,-0.026992798,-0.020980835,-0.027801514,0.008583069,0.03567505,-0.053619385,-0.0023059845,0.008773804,0.048034668,0.037078857,-0.009239197,-0.008781433,0.011428833,0.0140686035,-0.002161026,-0.037597656,0.04547119,0.0112838745,-0.03967285,0.0234375,-0.050628662,0.011985779,...}
(1 row)

postgres=>

Note that the azure_ai extension deploys functions in different namespaces. High‑level semantic operations, independent of the model vendor, such as generate(), is_true(), extract(), rank() are in the azure_ai schema. In contrast, create_embeddings, which returns vectors and is tied to the OpenAI API, is in the azure_openai schema.

I have also enabled pg_vector to use the vector data type and operators for generated embeddings. Here is an example using a prompt to find PostgreSQL settings related to shared buffer cache memory:

postgres=> CREATE EXTENSION vector;

CREATE EXTENSION

postgres=> WITH settings AS (
  SELECT name, short_desc, azure_openai.create_embeddings('default-embedding',
      row_to_json(pg_settings)::text
  )::vector AS embedding FROM pg_settings
) SELECT
  name, short_desc FROM settings
  ORDER BY embedding <=> azure_openai.create_embeddings('default-embedding',
    'Buffer cache shared memory allocated by PostgreSQL'
  )::vector LIMIT 5;

            name            |                                       short_desc
----------------------------+----------------------------------------------------------------------------------------
 effective_cache_size       | Sets the planner's assumption about the total size of the data caches.
 shared_memory_type         | Selects the shared memory implementation used for the main shared memory region.
 shared_buffers             | Sets the number of shared memory buffers used by the server.
 shared_memory_size         | Shows the size of the server's main shared memory area (rounded up to the nearest MB).
 dynamic_shared_memory_type | Selects the dynamic shared memory implementation used.

(5 rows)

I haven't set an index or stored embeddings here. This example simply demonstrates how it works in a stateless demo. The similarity search uses the cosine distance operator <=> to compare data from pg_settings with my prompt, returning the Top-5 matches. In a real application, you would generate the embeddings within an AI pipeline, then store and index them using pg_vector or DiskANN, rather than calling azure_openai.create_embeddings() on each query, except for the prompt.

AzureAI also provides functions for re-ranking the results of a similarity search. Ideally, a dedicated ranking model should be used. Here, I didn't deploy one and used the chat model for demonstration.:

postgres=> WITH
 settings AS (
  SELECT name, short_desc, row_to_json(pg_settings)::text AS doc,
      azure_openai.create_embeddings( 'default-embedding',
        row_to_json(pg_settings)::text
      )::vector AS embedding FROM pg_settings),
 candidates as (
  SELECT name, short_desc, doc FROM settings
  ORDER BY embedding <=> azure_openai.create_embeddings( 'default-embedding',
    'Buffer cache shared memory allocated by PostgreSQL'
  )::vector LIMIT 20 ),
 reranked AS ( SELECT * FROM azure_ai.rank(
    'Setting the buffer cache shared memory allocated by PostgreSQL',
    ARRAY (SELECT doc FROM candidates),  -- text to rank
    ARRAY (SELECT name FROM candidates), -- id of the item
    'default-chat'
  )
)
SELECT r.*, s.short_desc
 FROM settings s JOIN reranked r ON r.id = s.name
 ORDER BY r.rank LIMIT 5
;

             id             | rank | score |                                       short_desc
----------------------------+------+-------+----------------------------------------------------------------------------------------
 shared_buffers             |    1 |   0.9 | Sets the number of ... (truncated)
                                    

June 26, 2026

$lookup join strategies: understanding the trade-offs with flexible documents

In a previous post, I explored how MongoDB chooses between nested loop, indexed loop, and hash join strategies for $lookup. Here, I examine what occurs when $lookup runs on DocumentDB for PostgreSQL—an open-source extension implementing the MongoDB API on PostgreSQL.

The document model minimizes the need for joins by embedding related data directly within documents. However, when a join is necessary — such as for reference data that updates independently, many-to-many relationships, or dimensional lookups — the flexibility of embedding can complicate join optimization.

The goal isn't just to identify "which database is faster"—it's to understand why their behaviors differ, the trade-offs involved, and the options when join performance matters.

Relational databases tend to perform more joins because normalized schemas require them, but they also optimize joins more effectively thanks to scalar, well-typed columns. In contrast, document databases perform fewer joins thanks to embedding, but when they do, flexible field semantics—such as arrays—restrict the available join algorithms.

I've run all tests in Docker containers with default settings on the same machine. The timings are indicative, not benchmarks — they illustrate the relative cost of different approaches, not absolute performance under production conditions (caching, concurrency, hardware, and tuning would all change the numbers).

An example: fact and reference table

In a document database, you'd typically embed related data to avoid joins. But some data doesn't embed well:

  • Exchange rates change continuously. If you embed rate_to_usd inside each portfolio document, you'd need to update millions of documents every time a rate moves.
  • Portfolios reference a currency, and you want to compute USD valuations by looking up the current rate at query time.

This is a classic case where a $lookup join makes sense: a large fact collection (portfolios) joined to a small, frequently-updated reference collection (fxRates). The document model can't avoid this join without accepting stale embedded rates.

Schema and data generation

I created two collections:

  • portfolios: 5 million documents with a currency field (5 distinct values)
  • fxRates: 5 documents mapping each currency to its USD exchange rate

I used mongosh to create and load the collection with the following commands:

db.portfolios.drop();
db.fxRates.drop();

const currencies = ["USD", "EUR", "CHF", "GBP", "JPY"];
currencies.forEach(cur => {
  db.fxRates.insertOne({
    currency: cur,
    rate_to_usd: Math.random() * (1.5 - 0.5) + 0.5,
    last_updated: new Date()
  });
});

const totalPortfolios = 5e6;
let bulk = [];
for (let i = 1; i <= totalPortfolios; i++) {
  const currency = currencies[Math.floor(Math.random() * currencies.length)];
  bulk.push({
    portfolioId: i,
    clientId: Math.floor(Math.random() * 10000),
    valuation: Math.round(Math.random() * 1_000_000),
    currency: currency,
    asOfDate: new Date()
  });
  if (bulk.length === 10000) {
    db.portfolios.insertMany(bulk);
    bulk = [];
  }
}
if (bulk.length > 0) db.portfolios.insertMany(bulk);

db.fxRates.createIndex({ currency: 1 }, { unique: true });

The index on a five-document collection is not strictly necessary, but it's good practice and protects my lookup table from duplicates.

The Query with $lookup

This query fetches all portfolios, retrieves the foreign exchange rate for each currency, and converts the valuation to USD.


db.portfolios.aggregate([
  {$lookup: {
    from: "fxRates",
    localField: "currency",
    foreignField: "currency",
    as: "fx"
  }},
  {$unwind: "$fx"},
  {$project: {
    portfolioId: 1, valuation: 1, currency: 1,
    rate_to_usd: "$fx.rate_to_usd",
    valuation_usd: {$multiply: ["$valuation", "$fx.rate_to_usd"]}
  }}
])

MongoDB's $lookup combined with $unwind behaves like a LEFT OUTER JOIN followed by filtering out non-matching rows.

Why flexible documents make joins hard

In a relational database, portfolios.currency is a VARCHAR column. The optimizer knows it's a single scalar value per row. It can extract it, hash it, sort it, or probe an index with it — all with well-defined operators.

In a document database, currency might be:

  • A string: "USD"
  • An array: ["USD", "EUR"]
  • Missing entirely
  • A nested document

MongoDB's $lookup compatibility requires the following behavior:

  • If localField is an array ["USD", "EUR"], it matches any foreign document where foreignField equals "USD" OR "EUR" (or contains either, if it's also an array).
  • It's effectively an "any element matches any element" semantic.

This means that the join condition is not always a simple equality a = b, but may involve “any element matches” semantics evaluated at runtime. Instead, the matching logic must evaluate each document's field at runtime, determine whether it's a scalar or an array, and match accordingly.

The safest general approach is a lateral join — executing the inner query for each outer document and passing the current document's field value into the matching function. This is what both MongoDB and DocumentDB for PostgreSQL do.

What happens under the hood (DocumentDB for PostgreSQL)

I use the DocumentDB API in a SQL query rather than the MongoDB-compatible endpoint to view the PostgreSQL execution plan.

EXPLAIN (ANALYZE ON, BUFFERS ON, COSTS ON, VERBOSE ON)
SELECT document
FROM bson_aggregation_pipeline('test',
'{
  "aggregate": "portfolios",
  "pipeline": [
    {
      "$lookup": {
        "from": "fxRates",
        "localField": "currency",
        "foreignField": "currency",
        "as": "fx"
      }
    },
    {
      "$unwind": "$fx"
    },
    {
      "$project": {
        "portfolioId": 1,
        "valuation": 1,
        "currency": 1,
        "rate_to_usd": "$fx.rate_to_usd",
        "valuation_usd": {
          "$multiply": [
            "$valuation",
            "$fx.rate_to_usd"
          ]
        }
      }
    }
  ],
  "cursor": {}
}');

Since I joined a large collection with a small one and require all documents from both, I would anticipate a hash join. Instead, it uses a nested loop join:

Nested Loop  (actual time=579..64792 rows=5000000 loops=1)
  ->  Seq Scan on documents_11 collection  (rows=5000000 loops=1)
  ->  Seq Scan on documents_10 collection_0_1  (rows=1 loops=5000000)
        Filter: bson_dollar_lookup_join_filter(...)
        Rows Removed by Filter: 4
Execution Time: 87750 ms

The fxRates table (5 rows, fitting in a single 8kB block) is scanned 5 million times. PostgreSQL's cost-based optimizer knows the table is tiny and fits in cache, so a sequential scan is the right choice over an index scan — but the scan is still executed 5 million times because of the LATERAL pattern. The filter function bson_dollar_lookup_join_filter is evaluated 25 million times. This function handles array semantics — it extracts the field from the outer document, determines whether it's scalar or an array, and checks for matches in the inner document.

Because the inner side is marked as LATERAL, it depends on the current outer row. This prevents PostgreSQL from evaluating both sides independently, which is required for hash or merge joins. As a result, only a nested loop strategy is possible.

In MongoDB, the equivalent behavior is the IndexedLoopJoin strategy: for each outer document, probe the index on the foreign field. The algorithm and per-document cost are the same.

A note on MongoDB's Hash Join

MongoDB 8.0 can use hash join for $lookup when allowDiskUse: true, no compatible index on the foreign field, the foreign collection is small, and the SBE engine is active. Under these conditions, MongoDB builds an in-memory hash table from the foreign collection, correctly handling array semantics by storing per-element entries.

In tests with 5M portfolios and 5 fxRates, MongoDB's native HashJoin finished in ~14 seconds — the fastest of my tests. Without tweaks, it took 170 seconds — the worst.

To achieve 14 seconds, I dropped the index on the foreign field, enabled allowDiskUse, and set internalQueryFrameworkControl to trySbeEngine. The default trySbeRestricted mode doesn't push the $lookup and $unwind to SBE, since the optimization depends on feature flags that aren't enabled in this mode. With trySbeEngine, SBE handles the pipeline, using HashJoin:

// Setup for hash join
db.adminCommand({setParameter: 1, internalQueryFrameworkControl: "trySbeEngine"});
db.fxRates.dropIndex("currency_1");

// The query (same as all other tests)
db.portfolios.aggregate([
  {$lookup: {from: "fxRates", localField: "currency", foreignField: "currency", as: "fx"}},
  {$unwind: "$fx"},
  {$project: {portfolioId: 1, valuation: 1, currency: 1, rate_to_usd: "$fx.rate_to_usd", valuation_usd: {$multiply: ["$valuation", "$fx.rate_to_usd"]}}}
], {allowDiskUse: true}).explain("executionStats");

// Restore
db.fxRates.createIndex({currency: 1}, {unique: true});
db.adminCommand({setParameter: 1, internalQueryFrameworkControl: "trySbeRestricted"});

DocumentDB for PostgreSQL doesn't currently implement this optimization — it relies on PostgreSQL's native join strategies, which don't understand BSON array semantics. Under normal conditions, both MongoDB and DocumentDB use a Nested Loop join.

Attempting alternatives via the MongoDB API

Using _id as Join Key (~71s)

The documentDB extension has a special case when foreignField is _id — it uses direct object_id equality:

// Reshape fxRates to use currency as _id
db.fxRates.drop();
currencies.forEach(cur => {
  db.fxRates.insertOne({
    _id: cur,
    rate_to_usd: Math.random() * (1.5 - 0.5) + 0.5,
    last_updated: new Date()
  });
});

db.portfolios.aggregate([
  {$lookup: {from: "fxRates", localField: "currency", foreignField: "_id", as: "fx"}},
  {$unwind: "$fx"},
  {$project: {portfolioId:1, valuation:1, currency:1,
              rate_to_usd:"$fx.rate_to_usd",
              valuation_usd:{$multiply:["$valuation","$fx.rate_to_usd"]}}}
])

It uses an index scan with the join condition applied as an Index Cond, which is more efficient than a sequential scan with a Filter. It's slightly faster, taking 71 seconds instead of 88 seconds, yet it remains a nested loop with 5 million iterations:

Nested Loop  (actual time=17..48170 rows=5000000 loops=1)
  ->  Seq Scan on documents_11 collection  (rows=5000000 loops=1)
  ->  Index Scan using _id_ on documents_12  (rows=1 loops=5000000)
        Index Cond: (object_id = ANY (bson_dollar_lookup_extract_filter_array(...)))
Execution Time: 70578 ms

This is the same as MongoDB's IndexedLoopJoin — the _id field is guaranteed to be scalar, so the extension can use a direct equality lookup on the primary key. However, it doesn't change the join strategy.

Uncorrelated $lookup + $filter (~68s)

A minor enhancement involves reading all fxRates at once, using an empty pipeline and no join condition, attaching the data as an array, and then filtering locally:

db.portfolios.aggregate([
  {$lookup: {from: "fxRates", pipeline: [], as: "allFx"}},
  {$addFields: {
    fx: {$arrayElemAt: [{$filter: {
      input: "$allFx", as: "r",
      cond: {$eq: ["$$r.currency", "$currency"]}
    }}, 0]}
  }},
  {$project: {portfolioId:1, valuation:1, currency:1,
              rate_to_usd:"$fx.rate_to_usd",
              valuation_usd:{$multiply:["$valuation","$fx.rate_to_usd"]}}}
])

The execution plan shows a Nested Loop with a single loop:

Nested Loop  (actual time=17..20177 rows=5000000 loops=1)
  ->  Aggregate  (rows=1 loops=1)          -- reads fxRates ONCE
  ->  Seq Scan on documents_11  (rows=5000000 loops=1)
Execution Time: 67905 ms  (of which ~48s is $addFields+$project)

The join itself is fast — fxRates are aggregated once into a single array. But the per-document $filter + $arrayElemAt evaluates BSON expressions 5 million times. We traded "nested loop probe" for "per-row array scan in BSON space".

This is conceptually similar to the "nested loop with materialization" approach from the previous MongoDB article — reading the lookup collection once, but matching per-document in the projection.

Pipeline-Based $lookup — No Help

Using $lookup with pipeline and let doesn't enhance performance:

  {$lookup: {
    from: "fxRates",
    let: { cur: "$currency" },
    pipeline: [
      {$match: {$expr: {$eq: ["$currency", "$$cur"]}}}
    ],
    as: "fx"
  }},
  {$unwind: "$fx"},

The extension still creates a LATERAL join (all code paths set rightTree->lateral = true), and it introduces additional overhead due to variable resolution.

With the MongoDB-compatible API, no solution significantly improves the efficiency of the join. But on DocumentDB, the power of SQL opens new possibilities.

The PostgreSQL escape hatch: SQL with Hash Join

Since DocumentDB stores data in standard PostgreSQL tables, we can query the same collections with SQL—within the same transaction and with full ACID guarantees. The trade-off is that we lose flexible-document join semantics and assume scalar join keys.

Prerequisite: enabling Hash Join for the BSON type (a hack)

The bson type has a hash operator class (bson_hash_ops) used for GROUP BY and DISTINCT. But the = operator doesn't declare hash join support — it's missing HASHES and MERGES properties. This is likely intentional, since bson = bson comparison on full documents has different semantics than field-level equality. But for my investigation (comparing extracted scalar field values), it would work:

-- Requires superuser — this is a hack, not a supported configuration
-- If DocumentDB enables this in the future, it will be part of the extension
ALTER OPERATOR documentdb_core.= (documentdb_core.bson, documentdb_core.bson)
  SET (COMMUTATOR = OPERATOR(documentdb_core.=), HASHES, MERGES);

Without this, PostgreSQL cannot execute hash join for bson = bson conditions, even in custom SQL. However, note that the SQL hash join method, enabled by this hack, does not replicate MongoDB's "any element matches" behavior when joined fields include arrays.

The Query: CTE with Hash Join (~39s)

To utilize a SQL join, I first query the two collections within two common table expressions in the WITH clause, then join them in the main query:

WITH portfolios AS (
  SELECT document FROM documentdb_api.collection('test', 'portfolios')
),
fxRates AS (
  SELECT document FROM documentdb_api.collection('test', 'fxRates')
)
SELECT documentdb_api_internal.bson_dollar_project(
  documentdb_api_internal.bson_dollar_merge_documents_at_path(
    p.document, f.document, 'fx'),
  '{ "portfolioId" : 1, "valuation" : 1, "currency" : 1,
     "rate_to_usd" : "$fx.rate_to_usd",
     "valuation_usd" : { "$multiply" : ["$valuation", "$fx.rate_to_usd"] } }'::bson,
  '{}'::bson
)
FROM portfolios p
JOIN fxRates f
  ON documentdb_api_catalog.bson_expression_get(
       p.document, '{"": "$currency"}'::bson, true)
   = documentdb_api_catalog.bson_expression_get(
       f.document, '{"": "$currency"}'::bson, true);

With this query and the operator tweak enabling hash join, I have the following execution plan:

Hash Join  (actual time=7.4..34018 rows=5000000 loops=1)
  Hash Cond: (bson_expression_get(documents_11.document, '{"":"$currency"}'...)
            = bson_expression_get(documents_10.document, '{"":"$currency"}'...))
  ->  Seq Scan on documents_11  (rows=5000000 loops=1)
  ->  Hash  (rows=5 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on documents_10  (rows=5 loops=1)
Execution Time: 38664 ms

PostgreSQL creates a small 5-row hash table (9 kB) and probes it once per portfolio. It makes a single pass over both collections. Most of the remaining time is spent calling bson_expression_get 5 million times to retrieve the join key, along with bson_dollar_merge_documents_at_path and bson_dollar_project to generate the final output.

In the end, this query is only about twice as fast. It requires a complex workaround, breaks document semantics, and still spends most of its time evaluating BSON expressions.

Results Summary

I started Docker containers as:

# Start MongoDB (minimal setup for ACID transactions)
docker exec -it $(
 docker run -d -p 27017:27017 mongo --replSet rs0
) mongosh --eval "rs.initiate()"

# Start DocumentDB (PostgreSQL and MongoDB-compatible endpoints)
docker run -d -p 9712:9712 -p 10260:10260 \
 ghcr.io/documentdb/documentdb/documentdb-local:latest \
 --username "franck" --password "franck" --start-pg

Below is a summary of my experiments, run in Docker containers with default configurations, involving 5 million portfolios, 5 fxRates, and a unique index on fxRates.currency:

Approach MongoDB DocumentDB Strategy
$lookup localField/foreignField ~170s ~88s Nested Loop (lateral index/filter)
$lookup with foreignField: "_id" ~155s ~71s Nested Loop (index probe)
Uncorrelated $lookup + $filter ~22s ~68s Materialize once + per-doc filter
SQL CTE + Hash Join (operator tweak) ~39s Hash Join (forced)
HashJoin (SBE, internal tweak) ~14s Hash Join (forced)

MongoDB's native HashJoin via the Slot-Based Execution engine is fastest, handling hash table build/probe natively with per-element array support and avoiding BSON field extraction overhead, but will not be used without configuration tweaks. The DocumentDB SQL escape hatch uses PostgreSQL's optimizer for the same join strategy but incurs overhead with bson_expression_get on each row.

The other solutions are compatible with standard configurations and use appropriate data models and query code. Remember that the time here reflects reading five million documents, and the difference may be insignificant on small datasets.

The Trade-off: Flexibility vs. Optimization

These experiments show the trade-off clearly. Relational systems rely on joins due to normalization, but they can optimize them effectively thanks to typed scalar columns. Document databases avoid many joins, but when joins are needed, flexible semantics—like arrays—limit the available algorithms.

DocumentDB for PostgreSQL sits in the middle. It relies on PostgreSQL storage and execution while preserving MongoDB semantics. As a result, $lookup uses only a subset of SQL's join capabilities to preserve this flexibility. The SQL workaround shows that performance improves when you enforce scalar semantics, but this runs counter to the expectations of a document model, where any field in one document can be an array in another.

So the real question is not which system is faster, but which trade-off you choose: flexibility with embedded arrays or optimization for scalar values.

This was tested on MongoDB 8.0 and DocumentDB 0.112 on PostgreSQL 17.10, and both can improve in the future. Optimization is possible when the field is a known scalar. But if you have a fixed schema, do you still want a document database or switch to SQL? PostgreSQL can also gain optimizations that benefit DocumentDB queries. For example, the lateral join could be memoized in a future version.

If you're thinking about using DocumentDB for PostgreSQL — whether you're migrating from MongoDB or starting fresh — don't stop at the first slow query. Look into the causes, since the trade-off between speed and flexibility can differ. Check execution plans, and file an issue or start a discussion. More feedback from real workloads helps the contributors improve the extension. That's a major advantage of open source.

June 25, 2026

Running pgvector in production on Amazon Aurora PostgreSQL

Running pgvector on Amazon Aurora PostgreSQL gives you a production-grade vector store on a database you already know, backed by the operational tooling, high availability, and scaling behaviour of Amazon Aurora. Production traffic does introduce a predictable set of operational considerations: query latency as the corpus grows, recall on filtered vector searches, memory headroom during index builds, and connection behaviour under load. This post is scoped to the database operations that keep the RAG retrieval layer healthy. In this post, we cover the operational practices that keep a pgvector workload healthy once you depend on it: choosing the right index and distance function, scaling with quantization and partitioning, managing Hierarchical Navigable Small World (HNSW) churn, sizing for memory-resident operation, and the observability signals that catch problems early.

Oracle FDW on Azure Database for PostgreSQL

When migrating from Oracle Database to PostgreSQL, the Oracle Foreign Data Wrapper enables PostgreSQL to connect to Oracle and import data as if it were a local table. This extension is not available on all managed PostgreSQL services. For Azure Database for PostgreSQL, setup is straightforward since it runs community PostgreSQL. The only requirement is that extensions must first be enabled via the azure-specific parameter azure.extensions:

Oracle_FDW extension in PostgreSQL

I can create the extension as the administrator and utilize its features:

postgres=> \du
                                List of roles

   Role name    |                         Attributes
----------------+------------------------------------------------------------
 azure_pg_admin | Cannot login
 azuresu        | Superuser, Create role, Create DB, Replication, Bypass RLS
 franck         | Create role, Create DB, Bypass RLS
 replication    | Replication

postgres=> \connect - franck

I am now connected to the database "postgres" as user "franck".

postgres=> create extension oracle_fdw;

CREATE EXTENSION

postgres=> select oracle_diag();
                                                oracle_diag
------------------------------------------------------------------------------------------------------------
 oracle_fdw 2.8.0, PostgreSQL 18.4, Oracle client 23.26.0.0.0, ORACLE_HOME=/opt/oracle/oracle_instantclient

It's important to understand how PostgreSQL will connect to the Oracle Database: oracle_fdw is installed with the Oracle Instant Client libraries.

An Oracle Database in a lab

I exposed an Oracle database to the internet for lab purposes, so I did not need to configure the network. I set up using the Docker image.


# Start Oracle and expose port 1521 to all interfaces

sudo docker run -d --name ora -p 1521:1521 -e ORACLE_PASSWORD=franck gvenzl/oracle-free:slim

# Open port 1521 in the firewall (you may have to add a rule in the inbound security rules)

sudo firewall-cmd --zone=public --permanent --add-port=1521/tcp
sudo firewall-cmd --reload

# Starting Oracle Database takes time, wait for it to be up

echo "Waiting for Oracle" ; until docker logs ora | 
 grep "Completed: Pluggable database FREEPDB1 opened read write"
do echo -n "." ; sleep 1 ; done

# Load the SCOTT/TIGER schema example

docker exec -it ora bash -c '
TWO_TASK=//localhost:1521/FREEPDB1 sqlplus system/franck @?/rdbms/admin/utlsampl.sql
'

# Allow SCOTT to EXPLAIN PLAN

docker exec -it ora bash -c '
 echo "GRANT SELECT_CATALOG_ROLE TO SCOTT;" |  
 TWO_TASK=//localhost:1521/FREEPDB1 sqlplus system/franck
'

# Show the connection string from internet

echo "You can connect as: SCOTT/TIGER@//$(curl -s ifconfig.me):1521/FREEPDB1"


Foreign Data Wrapper server and user

Now, back to Azure Database for PostgreSQL, I can specify the Oracle Database server via the connection string, as well as the user and password I use for connection:


postgres=> create server oracle_server
  foreign data wrapper oracle_fdw
  options (dbserver '//141.145.212.208:1521/FREEPDB1');

CREATE SERVER

postgres=> create user mapping for current_user
  server oracle_server
  options (user 'SCOTT', password 'TIGER');

CREATE USER MAPPING

I import the tables I need to query:


postgres=> IMPORT FOREIGN SCHEMA "SCOTT" 
           FROM SERVER oracle_server INTO public
;

postgres=> \d

             List of relations

 Schema |   Name   |     Type      | Owner
--------+----------+---------------+--------
 public | bonus    | foreign table | franck
 public | dept     | foreign table | franck
 public | emp      | foreign table | franck
 public | salgrade | foreign table | franck

(4 rows)

postgres=> select * from dept;

 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON

(4 rows)

I can access the tables as if they were local, my PostgreSQL session being connected to the remote Oracle Database.

Performance and execution plan

The execution plan indicates whether the join has been pushed down, a decision made by the query planner:

postgres=> explain (verbose off)
           select * from dept join emp using(deptno)
;
                                                                                QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=30012.50..40190.00 rows=5000 width=206)
   Hash Cond: (dept.deptno = emp.deptno)
   ->  Foreign Scan on dept  (cost=10000.00..20000.00 rows=1000 width=92)
         Oracle query: SELECT /*62b5e8b0d88d7cfb*/ r1."DEPTNO", r1."DNAME", r1."LOC" FROM "SCOTT"."DEPT" r1
   ->  Hash  (cost=20000.00..20000.00 rows=1000 width=116)
         ->  Foreign Scan on emp  (cost=10000.00..20000.00 rows=1000 width=116)
               Oracle query: SELECT /*d6c3774a98386064*/ r2."EMPNO", r2."ENAME", r2."JOB", r2."MGR", r2."HIREDATE", r2."SAL", r2."COMM", r2."DEPTNO" FROM "SCOTT"."EMP" r2

(7 rows)

postgres=> explain (verbose off)
           select * from dept join emp using(deptno)
           where ename='KING'
;

                                                                                                                                 QUERY PLAN                 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan  (cost=10000.00..10250.00 rows=25 width=206)
   Oracle query: SELECT /*7a68726e19f9c7b7*/ r1."DEPTNO", r1."DNAME", r1."LOC", r2."EMPNO", r2."ENAME", r2."JOB", r2."MGR", r2."HIREDATE", r2."SAL", r2."COMM" FROM ("SCOTT"."DEPT" r1 INNER JOIN "SCOTT"."EMP" r2 ON (r1."DEPTNO" = r2."DEPTNO") AND (r2."ENAME" = 'KING'))
(2 rows)


When using the VERBOSE option, EXPLAIN executes an EXPLAIN PLAN on the remote server. (I granted SELECT_CATALOG_ROLE to prevent the no SELECT privilege on V$SQL error.)

postgres=> explain (verbose on, analyze, costs off)
           select * from dept join emp using(deptno)
           where ename='KING'
;
                                                                                                                                                                                                                                                                                                                                                  QUERY PLAN                 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan (actual time=11.621..11.632 rows=1.00 loops=1)
   Output: dept.deptno, dept.dname, dept.loc, emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, emp.sal, emp.comm
   Oracle query: SELECT /*7a68726e19f9c7b7*/ r1."DEPTNO", r1."DNAME", r1."LOC", r2."EMPNO", r2."ENAME", r2."JOB", r2."MGR", r2."HIREDATE", r2."SAL", r2."COMM" FROM ("SCOTT"."DEPT" r1 INNER JOIN "SCOTT"."EMP" r2 ON (r1."DEPTNO" = r2."DEPTNO") AND (r2."ENAME" = 'KING'))
   Oracle plan: SELECT STATEMENT
   Oracle plan:   HASH JOIN   (condition "R1"."DEPTNO"="R2"."DEPTNO")
   Oracle plan:     NESTED LOOPS
   Oracle plan:       NESTED LOOPS
   Oracle plan:         STATISTICS COLLECTOR
   Oracle plan:           TABLE ACCESS FULL EMP  (filter "R2"."ENAME"='KING')
   Oracle plan:         INDEX UNIQUE SCAN PK_DEPT (condition "R1"."DEPTNO"="R2"."DEPTNO")
   Oracle plan:       TABLE ACCESS BY INDEX ROWID DEPT
   Oracle plan:     TABLE ACCESS FULL DEPT
 Query Identifier: 3032555283609452836
 Planning Time: 35.288 ms
 Execution Time: 11.704 ms

(15 rows)

An EXPLAIN ANALYZE displays the timing of remote calls, which here is approximately 12 milliseconds:

This 12ms latency is expected because my PostgreSQL database is located in Amsterdam (Azure West Europe), while the Oracle Database is in Paris (OCI eu-paris-1).

Network troubleshooting

I can verify the location using the TCP/IP information in the Oracle Database listener.

docker exec -it ora bash -c 'curl -s ipinfo.io ; echo ; echo ; grep "PROGRAM=postgres" /opt/oracle/diag/tnslsnr/*/listener/trace/listener.log | tail -1'

{
  "ip": "141.145.212.208",
  "city": "Paris",
  "region": "Île-de-France",
  "country": "FR",
  "loc": "48.8534,2.3488",
  "org": "AS31898 Oracle Corporation",
  "postal": "75000",
  "timezone": "Europe/Paris",
  "readme": "https://ipinfo.io/missingauth"
}

24-JUN-2026 21:28:23:647 * (CONNECT_DATA=(SERVICE_NAME=FREEPDB1)(CID=(PROGRAM=postgres:?franck?postgres?152.67.90.237?35830??SELECT)(HOST=fea1c65ae6cf)(USER=azuredb))(CONNECTION_ID=VQdZFJxbFlngYwQBIQob2g==))(TARGET_LOCAL_INSTANCE=FREE) * (ADDRESS=(PROTOCOL=tcp)(HOST=51.124.207.22)(PORT=1024)) * establish * FREEPDB1 * 0

The IP address embedded in PROGRAM reflects the PostgreSQL client process name, not Oracle network metadata (I used psql in Zurich). The IP in HOST indicates the PostgreSQL client connecting to the Oracle listener, while the first IP from ipinfo.io shows the location of the Oracle Database host.

A latency of 12ms is expected between these two cloud regions:

If I need to troubleshoot, I can run tcpdump -i any port 1521 -nn on the Oracle Database host:

sudo tcpdump -i any port 1521 -nn -tttt -vvv

dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes

2026-06-24 22:01:44.583852 IP 51.124.207.22.1024 > 10.0.0.59.1521: Flags [S], seq 987172167, win 65535, options [mss 1440,sackOK,TS val 2030652947 ecr 0,nop,wscale 9], length 0
2026-06-24 22:01:44.583952 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [S], seq 987172167, win 65535, options [mss 1440,sackOK,TS val 2030652947 ecr 0,nop,wscale 9], length 0
2026-06-24 22:01:44.583960 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [S], seq 987172167, win 65535, options [mss 1440,sackOK,TS val 2030652947 ecr 0,nop,wscale 9], length 0
2026-06-24 22:01:44.583988 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [S.], seq 944233141, ack 987172168, win 65160, options [mss 1460,sackOK,TS val 2168564814 ecr 2030652947,nop,wscale 7], length 0
2026-06-24 22:01:44.583996 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [S.], seq 944233141, ack 987172168, win 65160, options [mss 1460,sackOK,TS val 2168564814 ecr 2030652947,nop,wscale 7], length 0
2026-06-24 22:01:44.584048 IP 10.0.0.59.1521 > 51.124.207.22.1024: Flags [S.], seq 944233141, ack 987172168, win 65160, options [mss 1460,sackOK,TS val 2168564814 ecr 2030652947,nop,wscale 7], length 0
2026-06-24 22:01:44.602184 IP 51.124.207.22.1024 > 10.0.0.59.1521: Flags [.], ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 0
2026-06-24 22:01:44.602200 IP 51.124.207.22.1024 > 10.0.0.59.1521: Flags [P.], seq 1:75, ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 74
2026-06-24 22:01:44.602201 IP 51.124.207.22.1024 > 10.0.0.59.1521: Flags [P.], seq 75:336, ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 261
2026-06-24 22:01:44.602241 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [.], ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 0
2026-06-24 22:01:44.602249 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [.], ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 0
2026-06-24 22:01:44.602252 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [P.], seq 1:75, ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 74
2026-06-24 22:01:44.602253 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [P.], seq 1:75, ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 74
2026-06-24 22:01:44.602254 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [P.], seq 75:336, ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 261
2026-06-24 22:01:44.602255 IP 51.124.207.22.1024 > 172.17.0.2.1521: Flags [P.], seq 75:336, ack 1, win 128, options [nop,nop,TS val 2030652960 ecr 2168564814], length 261
2026-06-24 22:01:44.602295 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [.], ack 75, win 509, options [nop,nop,TS val 2168564833 ecr 2030652960], length 0
2026-06-24 22:01:44.602303 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [.], ack 75, win 509, options [nop,nop,TS val 2168564833 ecr 2030652960], length 0
2026-06-24 22:01:44.602312 IP 10.0.0.59.1521 > 51.124.207.22.1024: Flags [.], ack 75, win 509, options [nop,nop,TS val 2168564833 ecr 2030652960], length 0
2026-06-24 22:01:44.602435 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [.], ack 336, win 507, options [nop,nop,TS val 2168564833 ecr 2030652960], length 0
2026-06-24 22:01:44.602446 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [.], ack 336, win 507, options [nop,nop,TS val 2168564833 ecr 2030652960], length 0
2026-06-24 22:01:44.602462 IP 10.0.0.59.1521 > 51.124.207.22.1024: Flags [.], ack 336, win 507, options [nop,nop,TS val 2168564833 ecr 2030652960], length 0
2026-06-24 22:01:44.619442 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [P.], seq 1:62, ack 336, win 507, options [nop,nop,TS val 2168564850 ecr 2030652960], length 61
2026-06-24 22:01:44.619470 IP 172.17.0.2.1521 > 51.124.207.22.1024: Flags [P.], seq 1:62, ack 336, win 507, options [nop,nop,TS val 2168564850 ecr 2030652960], length 61
2026-06-24 22:01:44.619493 IP 10.0.0.59.1521 > 51.124.207.22.1024: Flags [P.], seq 1:62, ack 336, win 507, options [nop,nop,TS val 2168564850 ecr 2030652960], length 61
2026-06-24 22:01:44.630863 IP 51.124.207.22.1024 > 10.0.0.59.1521: Flags [.], ack 62, win 128, options [nop,nop,TS val 2030652994 ecr 2168564850], length 0

Duplicate packets originating from 172.17.0.2 are from Docker networking and are captured on both the host and container bridges. This marks the start of a TCP connection from Azure Database for PostgreSQL to Oracle on port 1521, beginning with the TCP handshake. PostgreSQL then sends two TNS packets to Oracle, which acknowledges both. Oracle responds with a 61-byte payload. TCP flags include [S] (connection setup), [S.] (SYN+ACK), [.] (ACK), and [P.] (PSH + ACK). The absence of the U flag indicates Oracle Net isn't using TCP urgent data, resolving previous OOB issues with older clients.

Production connection string

Running the Oracle Database in a lab helps with troubleshooting, but the database you want to migrate is not publicly accessible, and the connection string may be a bit more complex. As the Oracle client is on a managed service, you cannot add files such as tnsnames.ora or an mTLS wallet. I tested with an Oracle Autonomous database.

You need to enable One-Way TLS (encryption without a wallet) by setting Mutual TLS (mTLS) authentication to not required, and add the IP address of the PostgreSQL database to the allow list:

I got this address from the listener log of my previous test.

Then I got the TLS connection string for the Oracle Autonomous database and used it to define the foreign data wrapper server:

create server oracle_autonomous
  foreign data wrapper oracle_fdw
  options (dbserver '(description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1521)(host=adb.eu-madrid-1.oraclecloud.com))(connect_data=(service_name=g230b6cc64a62e6_mad_tp.adb.oraclecloud.com))(security=(ssl_server_dn_match=yes)))');

create user mapping for current_user
  server oracle_autonomous
  options (user 'ADMIN', password '4ut0n0m0u5_Password');

After specifying the connection string and credentials, you can read or import tables from Oracle Autonomous into Azure Database for PostgreSQL.

Conclusion

Oracle FDW allows querying an Oracle database from PostgreSQL with minimal setup. On Azure Database for PostgreSQL, the only extra step is to enable the extension. Otherwise, it functions as standard PostgreSQL. This feature is useful for migrations to import or compare data between the two databases. Queries are straightforward, and the planner decides what runs remotely. Performance depends on pushdown to Oracle and network latency. Filters and joins may run remotely or locally, and if pushdown isn’t possible, large result sets may be transferred and joined locally. All this is visible from the execution plan.

Connecting to Oracle can sometimes be challenging. Make sure to run oracle_diag() and verify that both oracle_fdw and the Oracle client are current and compatible with your Oracle database, that the port is opened on the network between the two databases, and that you can connect with a connection string without additional files.

One Postgres cluster, many apps

Yo, I heard you liked databases, so we put logical databases in your database cluster so you can database while you database

June 24, 2026

Build a Spring Boot REST API with Amazon Aurora DSQL

In this post, you learn how to build a Spring Boot REST API that integrates with Aurora DSQL. You’ll configure the Aurora DSQL JDBC Connector for IAM authentication, implement optimistic concurrency control, and run the application across two regional nodes to observe active-active behavior.

5 Lessons at 50

Looking at my peak male physique, and my Keanu Reeves baby face, you would never suspect it, but I recently turned 50. As is the tradition, I thought about writing a post titled "50 Lessons at 50". Unfortunately, I don't have that kind of wisdom. The thing is, I still feel like I'm 18, same age as my son. Turns out this is the secret old guys have been hiding from us all along. You get older on the outside, but inside you still see yourself as the same young lad.

Still, fifty years should count for something. So what did I actually learn? How am I different mentally than my 18-year-old self. Here is my attempt to tally it up. 


1. Caution is warranted

I finally understand my parents. As you age, you accumulate battle scars, and the scars turn into habits. Anything that can go wrong will go wrong. You forget the cooktop on once, and suddenly you check it three times before leaving the house. True story. You stop diving head-first into a pile of leaves, because there might be a rake underneath. Another true story.

But the trap is overcorrecting. Common sense is good, but too much common sense is dangerous. It makes you overly cautious, predictable, and dull, and it talks you out of the leaf pile even when there is no rake. So, it is important to fight that entropy.


2. Realism helps you understand how the world runs

When I was younger I thought everyone was good, smart, rational, and reasonable. I was idealistic, and I was projecting my own motivations onto everyone else. As a professor, I rarely questioned the administrators. Surely everyone was trying to improve education for the students. Not really. In fact, many of the students were not particularly interested in improving their own education either.

This was a disappointing realization, but then it was also liberating. The world gets much easier to understand once you take off the rose-colored glasses and start watching the incentives instead of the stated intentions. You stop being surprised, and you stop being hurt. You get a model of the world that actually predicts the world. Recognizing where the incentives point doesn't obligate you to like them. You can decide to play the game, or refuse it, or even try to change it. 


3. Competence is bliss

I alluded to the Murphy's Law earlier. In 1949, an engineer named Edward Murphy got frustrated after a technician wired the sensors backwards during an Air Force rocket sled experiment. This led to the aphorism "if there is a way to do it wrong, someone will find it". People remember this as a line about bad luck, but it's actually about how easy it is to do things wrong, and how rare it is to do them right.

It took me a couple of decades to fully appreciate that competence is far scarcer than you'd expect. This is both bad news and good news. The bad news is obvious, and all of us are going to experience it many times, at the hands of incompetent parties. The good news is that competence stands out. When you meet genuinely competent people you notice immediately, and you appreciate them. They are reliable. They sweat the details and understand the tradeoffs. You start seeking them out, and when you find them, friendships form fast, because you both know how unusual the thing actually is.


4. Do what you like

I've always wanted an excuse to use the Bell Curve meme, so here it goes. 

Most people dramatically overestimate their ability to predict what will matter in ten years or even five years. The elaborate plans are mostly a waste of time. The best strategy is work on what you like, and what you find interesting and energizing, because curiosity, enthusiasm, and the craft compounds. The safest long-term strategy is often to become exceptionally good at something you genuinely enjoy.

"Find out who you are and do it on purpose." --Dolly Parton


5. Your attitude determines your success

I am not likely to turn into the old guy yelling at the cloud any time soon, as I am aware of the danger of learned pessimism and helplessness. So I care about this lesson more than the others. Your attitude determines your success, and the stories you tell yourself matter. Since our emotions drive our persistence and our willingness to keep going, it pays to keep a constructive narrative about setbacks and challenges.

I aim for cautious optimism. I am realistic about the risks, but still optimistic about the possibilities. I am jaded enough to see how the world works, but still idealistic enough to want to improve it. Years ago I wrote a post called Fool Yourself, about deliberately choosing the narrative that gives you energy and momentum. I still believe that. At 50 I'm a little wiser, certainly more cautious and aware of the challenges ahead, but I am still fooling myself into chasing things I find meaningful.

June 22, 2026

Vector Search with Filters: pgvector vs DiskANN on HorizonDB

In a previous post, I explained that using filtering with pgvector can decrease recall in approximate nearest neighbor (ANN) searches.

This post repeats the same experiment and dataset but compares two methods:

  • pgvector with HNSW, a popular PostgreSQL extension
  • HorizonDB with DiskANN, Microsoft’s vector index

The goal is to understand what happens when similarity search is combined with filtering.

Both setups were tested on a HorizonDB instance with 2 vCores and 16 GiB RAM in Azure public preview, where I activated the extensions by adding them to azure.extensions.

In real applications, SQL statements without a WHERE clause are rare, and typical queries combine a filter alongside a similarity search:

SELECT *
 FROM embeddings
 WHERE tenant_id = 42
 ORDER BY embedding <=> :vector_query
 LIMIT 10;

You don’t search the entire dataset. Instead, you focus on a specific subset, such as one tenant, recent data, or a category.

Experiment Setup

As in the previous post, I inserted 2 million rows of 512‑dimensional vectors with a simple metadata column (color, evenly distributed):

CREATE TABLE embeddings_table (
  id bigserial PRIMARY KEY,
  color text,
  embedding vector(512)
);

I generated random vectors to insert the embeddings, and three values for color:


create function random_embedding(dimensions int) returns vector as
$$
select
    array(
        select random()::real
        from generate_series(1, dimensions)
    )::vector
$$ language sql;

insert into embeddings_table (embedding,color)
 select random_embedding(512)
  ,(array['red', 'green', 'blue'])[generate_series(1, 2000000)%3+1]
;

This uniform distribution is intentional: it isolates the effect of filtering.

pgvector (HNSW)

Here's the HNSW index that I created in my previous post:

CREATE EXTENSION IF NOT EXISTS vector;

CREATE INDEX ON embeddings_table 
 USING hnsw (embedding vector_cosine_ops)
;

It can take some time. Here is the query I use to monitor progress:

SELECT
  command,
  phase,
  CASE WHEN blocks_total > 0 THEN round(100.0 * blocks_done / blocks_total, 1) ELSE 0 END AS blocks_pct,
  CASE WHEN tuples_total > 0 THEN round(100.0 * tuples_done / tuples_total, 1) ELSE 0 END AS tuples_pct,
  relid::regclass AS table_name,
  index_relid::regclass AS index_name
FROM pg_stat_progress_create_index
;

I might have raised the maintenance_work_mem as indicated by the NOTICE after index creation — it took more than one hour.


NOTICE: hnsw graph no longer fits into maintenance_work_mem after 195123 tuples

CREATE INDEX

Total execution time: 01:20:03.861

I generate another embedding for my query, which I'll store as a psql variable with \gset:

select random_embedding(512) as query
\gset

Without index - Seq Scan and Exact Nearest Neighbor Search

I checked the behavior before the index is created, without using a vector index:

--EXPLAIN (ANALYZE, BUFFERS, VERBOSE off, COSTS off)
SELECT id , color, embedding <=> :'query' nn_cosine
 FROM embeddings_table 
 WHERE color = 'red'
 ORDER BY embedding <=> :'query'
 LIMIT 15
;

Without a vector index, pgvector performs an exact nearest neighbors (ENN) search:

postgres=> SELECT id , color, embedding <=> :'query' nn_cosine
 FROM embeddings_table
 WHERE color = 'red'
 ORDER BY embedding <=> :'query'
 LIMIT 15
;

   id    | color |      nn_cosine
---------+-------+---------------------
 1370302 | red   |  0.2639440793770883
 1048027 | red   | 0.23483715071295053
  804529 | red   | 0.24173628441472828
 1013860 | red   | 0.25945607630683887
 1011571 | red   | 0.23552944123421315
 1514713 | red   |  0.2324248794732996
 1411876 | red   | 0.21627272961247324
 1743091 | red   | 0.25835400937319386
  494461 | red   | 0.23541082932461943
 2128600 | red   |  0.2531862056756401
 2118649 | red   | 0.27055299026445434
 2151097 | red   | 0.25835806753276225
 1988278 | red   |  0.2341769026974998
 1259950 | red   | 0.21806901991480443
 1553203 | red   | 0.24358174125683563
(15 rows)

The execution plan indicates that the color = 'red' filter was applied to the scan result, and the distance was computed from the filtered data to sort and return the top-15.


 Limit (actual time=4088.254..4164.226 rows=15 loops=1)
   Buffers: shared hit=5348519
   ->  Gather Merge (actual time=4088.252..4164.222 rows=15 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=5348519
         ->  Sort (actual time=4058.895..4058.897 rows=11 loops=3)
               Sort Key: ((embedding <=> '[0.51098114,0.8720485,0.8568168,...,0.110820755]'::vector))
               Sort Method: top-N heapsort  Memory: 26kB
               Buffers: shared hit=5348519
               Worker 0:  Sort Method: top-N heapsort  Memory: 26kB
               Worker 1:  Sort Method: top-N heapsort  Memory: 26kB
               ->  Parallel Seq Scan on embeddings_table (actual time=0.127..3949.246 rows=222222 loops=3)
                     Filter: (color = 'red'::text)
                     Rows Removed by Filter: 444445
                     Buffers: shared hit=5348445
 Planning Time: 0.078 ms
 Execution Time: 4164.252 ms
(18 rows)

This approach works when the filter is sufficiently selective to enable distance calculations and sorting on a small set of rows. However, with larger or expanding data sets, a quick neighbor search requires an index that can efficiently find neighbors.

With HNSW index - Approximate Nearest Neighbor Index Scan

Once the index is created, the query runs faster, but the result, compared to the previous execution, is incomplete:

postgres=> SELECT id , color, embedding <=> :'query' nn_cosine
 FROM embeddings_table
 WHERE color = 'red'
 ORDER BY embedding <=> :'query'
 LIMIT 15
;
   id    | color |      nn_cosine
---------+-------+---------------------
 1259950 | red   | 0.21806901991480443
  744973 | red   | 0.21565428393281894
  550210 | red   |  0.2085433809181183
 1860280 | red   | 0.24081963623331726
 1794061 | red   | 0.22253304456085543
 1056052 | red   |   0.232418043420476
 1247836 | red   | 0.22905966001837208
 1128376 | red   | 0.24222950580938118
  847906 | red   | 0.24276414827559645
 1782067 | red   |   0.233356563767256
 2023984 | red   | 0.24263831933162294
 1438981 | red   | 0.21360709574533177
(12 rows)

This is what happened:

  1. The HNSW index retrieves a fixed number of candidates
  2. PostgreSQL applies the WHERE filter afterward
  3. Only matching rows are returned, which can result in fewer rows than requested

For example, if the index examines about 40 candidates (ef_search = 40, the default) and the filter matches one third of the rows, we would expect about 13 matching rows on average. This is approximate because the index examines a local region of the vector space, where the metadata distribution may differ from the global distribution.

In this execution plan, 40 candidates were visited (12 returned, 28 filtered out), but only 12 matched the predicate, falling short of the query's LIMIT 15. These candidates are not necessarily the closest neighbors that satisfy the filter. The true nearest neighbors with color = 'red' might lie outside the explored region and were therefore not considered, because the region included other colors.

As a result, fewer than LIMIT rows may be returned, and even those may not be the actual nearest neighbors within the filtered subset. This explains the drop in recall, which results from post-filtering.

In the execution plan, this is evident as the condition on color appears as a Filter applied after the index scan:


 Limit (actual time=2.352..2.496 rows=12 loops=1)
   Buffers: shared hit=1775
   ->  Index Scan using i_hnsw on embeddings_table (actual time=2.351..2.493 rows=12 loops=1)
         Order By: (embedding <=> '[0.51098114,0.8720485,0.8568168,...,0.110820755]'::vector)
         Filter: (color = 'red'::text)
         Rows Removed by Filter: 28
         Buffers: shared hit=1775
 Planning:
   Buffers: shared hit=1
 Planning Time: 0.054 ms
 Execution

The execution plan details show what happened: 12 returned + 28 filtered = 40 candidates, which is ef_search.

HNSW applies filtering after traversal, meaning the graph is searched globally before rows are discarded. This breaks the assumption that the closest neighbors are among the candidates returned by the index, especially when using selective filters, as they often are not.

The mitigations include:

  • Increasing ef_search. For instance, based on our data distribution, setting hnsw.ef_search = 200 could provide enough candidates so that at least 15 remain after filtering.
  • Using partial indexes when filtering on a small set of discrete, known values, so each is indexing a specific subset.
  • Enabling iterative scans to expand the search scope when filtering removes too many candidates. This improves recall but requires multiple scans.

These measures improve recall, though they come with higher costs and require manual calibration because filtering is handled separately from the ANN search.

DiskANN on HorizonDB

Now I create a DiskANN index, designed for filtering during the search:

postgres=> CREATE EXTENSION IF NOT EXISTS pg_diskann;

CREATE EXTENSION

postgres=> CREATE INDEX i_diskann ON embeddings_table
           USING diskann (embedding vector_cosine_ops)
;

CREATE INDEX

Total execution time: 00:42:40.800

In HorizonDB documentation, it is explained that DiskANN enables advanced filtering for combined vector and metadata queries. Depending on selectivity and planner estimates, it can apply filters during traversal or fall back to post-filtering

With DiskANN index - Approximate Nearest Neighbor Index Scan

After creating the index, I run the same query:

postgres=> SELECT id , color, embedding <=> :'query' nn_cosine
 FROM embeddings_table
 WHERE color = 'red'
 ORDER BY embedding <=> :'query'
 LIMIT 15
;

NOTICE:  pg_diskann: Filter selectivity too high (0.3325), skipping filtered vector scan

   id    | color |      nn_cosine
---------+-------+---------------------
 1968337 | red   | 0.24740531343440575
  545257 | red   | 0.22958854707242105
  828484 | red   | 0.24054936837179086
  744973 | red   | 0.21565428393281894
  307819 | red   | 0.23955190885002464
  557251 | red   |  0.2158213914722017
 1860280 | red   | 0.24081963623331726
 1794061 | red   | 0.22253304456085543
 1048324 | red   | 0.24138402291179228
  494461 | red   | 0.23541082932461943
  851149 | red   | 0.26081572380948004
  832813 | red   |  0.2240000332741019
  745516 | red   | 0.24130828891293765
  233791 | red   |  0.2324244758968873
 1247836 | red   | 0.22905966001837208
(15 rows)

The note is interesting: pre-filtering was not used. Still, the result is complete because DiskANN dynamically adapts the number of candidates explored.

This means the filter was not applied during traversal, so the query falls back to post-filtering, as with pgvector. However, unlike HNSW, the number of candidates is not fixed: the engine explores more vectors until enough rows are returned.

In the execution plan, we see that 42 rows were filtered out and 15 were returned, meaning about 57 candidates were explored:


 Limit (actual time=6.607..10.821 rows=15 loops=1)
   Buffers: shared hit=5772
   ->  Index Scan using i_diskann on embeddings_table (actual time=6.606..10.817 rows=15 loops=1)
         Order By: (embedding <=> '[0.51098114,0.8720485,0.8568168,0.3119626,...,0.110820755]'::vector)
         Filter: (color = 'red'::text)
         Rows Removed by Filter: 42
         Buffers: shared hit=5772
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.104 ms
 Execution Time: 10.893 ms
(11 rows)

By varying the LIMIT, we can see that even without pre-filtering, the results are complete and remain fast for common top-n queries:

LIMIT Rows Removed by Filter rows= Buffers Execution Time
1 1 1 3779 4.215 ms
2 1 2 3790 6.431 ms
5 22 5 4613 8.153 ms
10 29 10 5310 9.841 ms
15 42 15 5722 10.893 ms
100 222 100 15364 39.989 ms
1000 1879 1000 91585 826.303 ms
10000 20202 10000 602407 27827.608 ms

This distinction is crucial: pgvector uses a fixed search budget (ef_search), whereas DiskANN adjusts the number of explored candidates. Even with post-filtering, it can provide enough rows without manual tuning and without incremental scans.

Highly selective query - Exact Nearest Neighbor

If the filter is highly selective, there is no need to use an approximate search:

postgres=> SELECT id , color, embedding <=> :'query' nn_cosine
 FROM embeddings_table
 WHERE color = 'red' and id = 494461
 ORDER BY embedding <=> :'query'
 LIMIT 15
;

NOTICE:  pg_diskann: Filter selectivity too low (0.0000), using filter-only scan

   id   | color |      nn_cosine
--------+-------+---------------------
 494461 | red   | 0.23541082932461943
(1 row)

The execution plan shows that it didn't use the vector index but instead used the B-tree index on id with an additional Sort for exact nearest-neighbor ordering:


 Limit (actual time=0.038..0.039 rows=1 loops=1)
   Buffers: shared hit=12
   ->  Sort (actual time=0.037..0.038 rows=1 loops=1)
         Sort Key: ((embedding <=> '[0.51098114,0.8720485,0.8568168,0.311962,...,0.110820755]'::vector))
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=12
         ->  Index Scan using embeddings_table_pkey on embeddings_table (actual time=0.033..0.034 rows=1 loops=1)
               Index Cond: (id = 494461)
               Filter: (color = 'red'::text)
               Buffers: shared hit=12
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.140 ms
 Execution Time: 0.053 ms
(14 rows)

DiskANN doesn’t even participate when the filtering is highly selective. The planner uses a B-tree index and bypasses ANN.

With DiskANN index - Pre-filtered Approximate Nearest Neighbor

When filters are selective enough to be applied during the scan but not enough to avoid ANN search, there's no NOTICE:

postgres=> SELECT id , color, embedding <=> :'query' nn_cosine
 FROM embeddings_table
 WHERE color = 'red' and id <= 494461
 ORDER BY embedding <=> :'query'
 LIMIT 15
;

   id   | color |      nn_cosine
--------+-------+---------------------
 418054 | red   | 0.21522295998337881
 233791 | red   |  0.2324244758968873
 377470 | red   | 0.23508242287803505
 196009 | red   |  0.2356772244494152
 225253 | red   | 0.24119501226879858
 233026 | red   | 0.23055577367239388
 308998 | red   | 0.21652541614338927
 291283 | red   |  0.2319492505072145
 459442 | red   |  0.2448930593940325
 296956 | red   | 0.21969539585363562
 223876 | red   | 0.24664282569002072
 385669 | red   | 0.21339531139470636
 390397 | red   |  0.2484855378229266
 296467 | red   | 0.22426404565684388
 220285 | red   |  0.2179154018451408
(15 rows)

This is where the execution plan displays a Custom Scan (DiskANNFilteredScan), enabled by PostgreSQL's extensibility. It performs pre-filtering (Filter(IndexScan)) and Approximate Nearest Neighbor search (Vector) to yield the precise number of candidates in the filtered set (Rows Retrieved: 15 count):


                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Limit (actual time=25.842..28.876 rows=15 loops=1)
   Buffers: shared hit=11905
   ->  Custom Scan (DiskANNFilteredScan) (actual time=25.841..28.872 rows=15 loops=1)
         Strategy: Filter(IndexScan) -> Vector
         Rows Retrieved: 15 count
         TIDs Collected: 303062 count
         Buffers: shared hit=11905
 Planning:
   Buffers: shared hit=2
 Planning Time: 0.091 ms
 Execution Time: 28.927 ms
(11 rows)

This plan is an ANN with pre-filtering, effective when both the column filter and the embedding filter are selective and need to be combined for efficient execution. The optimizer makes this decision dynamically.

TIDs Collected indicates the number of rows matching the indexable portion of the filter prior to the vector search. According to the cardinalities, the id condition was used for prefiltering:

postgres=> select count(*) FROM embeddings_table 
           WHERE id <= 494461;
 count
--------
 303063

This represents a selectivity of about 15% for the pre-filtered set. The remaining condition (color = 'red') is applied afterward, reducing the result further to about 5% of the table.

Why this matters

These experiments reveal a different execution model, not just performance differences. With pgvector and HNSW, search is drive... (truncated)

June 20, 2026

Our Collective Bike Shed Moment

In 1957, Parkinson postulated his "Law of Triviality" using a fictitious committee reviewing plans for a nuclear power plant. The reactor design gets 10 minutes because nobody understands it, so nobody argues. The bike shed gets 45 minutes because everyone has opinions about the paint color.

I feel like we are living this committee meeting at scale every day.

LLMs are already better engineers than most of us. They are better at formal methods, and better at reasoning under pressure than most people. They run at incredible speed and don't get tired. They improve continuously. But, some people keep moving the goalposts on LLMs. First they said LLMs couldn't code. Then they said they hallucinated too much. All of these barriers fell, but some people are still scoffing at these systems. What chutzpah!

If aliens landed in Central Park tomorrow, I don't think the reaction would be that different. With AI, an alien form of intelligence has already arrived in our laps, and we gave it a collective shrug and kept scrolling.

Psychologists call this the normalcy bias. We tend to assume things will continue roughly as they have, even when confronted with something that requires revising our picture of the future.

Douglas Adams introduces an adjacent phenomena, called the Somebody Else's Problem (SEP) field. The SEP field does not try to make something invisible directly, rather it makes your brain classify it as somebody else's problem, so it gets actively skipped. An alien spacecraft hovers over a crowded park unnoticed, because everyone has silently agreed it's someone else's problem.

The individual adaptation strategy makes sense of course. Learn the tools, and stay ahead of the curve to save your own ship. Why not? Stay afloat the next couple years. But what about the coming decades?

I am not saying everyone should engage with everything. I know about the circles of control, influence, and concern. But, unfortunately, even the people who should be debating this (technologists, researchers, economists, policymakers, ethicists, military planners, educators) are not engaging with this problem at the depth it deserves.

We are having the bike shed debate about the reactor, and unfortunately the reactor design doesn't get enough attention.

CPU-bound sysbench on a large server: Postgres 12 to 19 beta1

This has results from sysbench on a small server with Postgres versions 12 through 19 beta1. Sysbench is run with high concurrency (40 connections) and a cached database. The purpose is to search for changes in performance.

Postgres remains boring, it is hard to find performance regressions.

tl;dr for Postgres 17 to 19

  • there are no regressions
  • throughput on the read-only-count test improves by ~3X in 19 beta1 thanks to a better query plan

tl;dr for Postgres 12 to 19

  • there are few regressions, throughput might have dropped by up to 5% on a few range query tests
  • there are a few large improvements for read-only tests
  • there are many large improvements for write-heavy tests

Builds, configuration and hardware

I compiled Postgres from source for versions 12.22, 13.23, 14.23, 15.18, 16.14, 17.10, 18.4 and 19 beta1.

I used a 48-core server from Hetzner
  • an ax162s with an AMD EPYC 9454P 48-Core Processor with SMT disabled
  • 2 Intel D7-P5520 NVMe storage devices with RAID 1 (3.8T each) using ext4
  • 128G RAM
  • Ubuntu 24.04
Configuration files for Postgres:
  • the config file is named conf.diff.cx10a_c32r128 (x10a_c32r128) and is here for versions 12, 13, 14, 15, 16 and 17.
  • for Postgres 18 and 19 I used conf.diff.cx10b_c32r128 (x10b_c32r128) which is as close as possible to the Postgres 17 config and uses io_method=sync
Benchmark

I used sysbench and my usage is explained here. I now run 32 of the 42 microbenchmarks listed in that blog post. Most test only one type of SQL statement. Benchmarks are run with the database cached by Postgres.

The read-heavy microbenchmarks are run for 600 seconds and the write-heavy for 1200 seconds. The benchmark is run with 40 clients and 8 tables with 10M rows per table. The database is cached.

The purpose is to search for regressions from new CPU overhead and mutex contention. I use the small server with low concurrency to find regressions from new CPU overheads and then larger servers with high concurrency to find regressions from new CPU overheads and mutex contention.

The tests can be called microbenchmarks. They are very synthetic. But microbenchmarks also make it easy to understand which types of SQL statements have great or lousy performance. Performance testing benefits from a variety of workloads -- both more and less synthetic.

Results

The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. 

I provide charts below with relative QPS (rQPS). The relative QPS is the following:
(QPS for some version) / (QPS for base version)
When the relative QPS is > 1 then some version is faster than base version.  When it is < 1 then there might be a regression. Values from iostat and vmstat divided by QPS are also provided here. These can help to explain why something is faster or slower because it shows how much HW is used per request.

Here, base version is either Postgres 12.23 or 17.10 and some version is a more recent version. I use 12.23 as the base version to identify regressions over a long period of time. And then I use 17.10 as the base version to confirm there aren't recent, large regressions.

I describe performance changes (changes to relative QPS) in terms of basis points. Performance changes by one basis point when the difference in rQPS is 0.01. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

Results: point queries, version 17 to 19

Summary:
  • there are no regressins
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
1.00    0.99    hot-points
1.01    1.01    point-query
1.00    1.00    points-covered-pk
0.98    0.99    points-covered-si
1.01    1.00    points-notcovered-pk
1.00    1.00    points-notcovered-si
1.01    1.01    random-points_range=10
1.02    1.00    random-points_range=100
1.00    1.00    random-points_range=1000

Results: point queries, version 12 to 19

Summary
  • there are no regressions
  • throughput for the hot-points test improves by ~2X in versions 17.10, 18.4 and 19beta
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
1.00    0.90    0.97    1.03    2.34    2.35    2.31    hot-points
1.00    1.01    1.03    1.04    1.03    1.04    1.03    point-query
1.02    1.04    1.04    1.07    1.04    1.04    1.04    points-covered-pk
1.01    1.07    1.04    1.04    1.04    1.03    1.04    points-covered-si
0.98    1.01    1.03    1.02    1.00    1.01    1.00    points-notcovered-pk
0.99    1.03    1.03    1.01    1.02    1.02    1.01    points-notcovered-si
0.99    1.01    1.03    1.03    1.00    1.01    1.01    random-points_range=10
0.99    1.02    1.04    1.04    1.01    1.03    1.01    random-points_range=100
1.00    1.02    1.02    1.03    1.01    1.02    1.01    random-points_range=1000

Results: range queries without aggregation, version 17 to 19

Summary
  • there are no regressions
  • while 19 beta1 has a better result on the scan test, that test has more variance with Postgres so I am reluctant to judge this without more results
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
0.98    0.99    range-covered-pk
0.97    0.99    range-covered-si
0.99    0.99    range-notcovered-pk
1.02    1.01    range-notcovered-si
0.96    1.07    scan

Results: range queries without aggregation, version 12 to 19

Summary
  • there are no regressions
  • scan throughput has improved a lot from version 12 to 19
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
0.99    1.03    1.04    1.04    1.03    1.00    1.02    range-covered-pk
0.99    1.04    1.04    1.04    1.03    1.00    1.03    range-covered-si
1.00    1.00    1.00    0.99    1.00    0.99    0.99    range-notcovered-pk
1.00    1.01    1.01    0.99    1.00    1.02    1.01    range-notcovered-si
1.09    1.27    1.10    1.21    1.19    1.14    1.28    scan

Results: range queries with aggregation, version 17 to 19

Summary
  • there are no regressions
  • throughput on the read-only-count test is ~3X better thanks to a new query plan. This improvement was also visible on my small server
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
1.03    3.30    read-only-count
1.02    0.99    read-only-distinct
1.00    0.97    read-only-order
0.99    0.99    read-only_range=10
0.99    0.99    read-only_range=100
1.01    1.00    read-only_range=10000
1.03    1.01    read-only-simple
1.03    1.01    read-only-sum

Results: range queries with aggregation, version 12 to 19

Summary
  • there might be a few small regressions, but losing 5% throughput from version 12 to 19 isn't a big deal
  • throughput on the read-only-count test is ~3X better thanks to a new query plan. This improvement was also visible on my small server
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
1.01    0.95    0.96    0.97    0.93    0.95    3.06    read-only-count
1.00    0.98    0.98    0.98    0.96    0.98    0.95    read-only-distinct
1.00    0.98    0.98    1.00    0.99    0.99    0.97    read-only-order
0.99    1.00    1.01    1.00    1.01    0.99    1.00    read-only_range=10
0.99    1.00    1.00    1.00    1.01    1.00    0.99    read-only_range=100
1.00    0.97    1.02    1.03    1.04    1.05    1.03    read-only_range=10000
1.00    0.97    0.99    0.97    0.95    0.98    0.96    read-only-simple
1.00    0.96    0.97    0.97    0.94    0.97    0.95    read-only-sum

Results: writes, version 17 to 19

Summary
  • there are no regressions
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
0.99    0.99    delete
1.02    1.02    insert
1.00    0.98    read-write_range=10
0.99    0.99    read-write_range=100
1.01    1.03    update-index
1.01    0.98    update-inlist
0.98    1.01    update-nonindex
1.01    1.03    update-one
1.00    1.00    update-zipf
0.97    0.99    write-only

Results: writes, version 12 to 19

Summary
  • there are no regressions
  • many large improvements arrived in version 17 and remain in 19 beta1
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
0.99    1.11    1.13    1.10    1.28    1.27    1.27    delete
1.02    1.17    1.16    1.19    1.23    1.25    1.25    insert
1.00    1.20    1.22    1.20    1.24    1.24    1.22    read-write_range=10
0.99    1.04    1.05    1.04    1.06    1.05    1.04    read-write_range=100
0.98    1.08    1.05    0.94    1.84    1.85    1.90    update-index
1.00    1.07    1.06    1.05    1.12    1.13    1.10    update-inlist
1.01    1.07    1.07    0.86    1.87    1.84    1.88    update-nonindex
1.04    0.96    0.96    1.10    1.39    1.41    1.43    update-one
1.01    1.05    1.07    0.96    1.63    1.62    1.63    update-zipf
0.99    1.11    1.13    1.09    1.41    1.37    1.40    write-only