A curated list of database news from authoritative sources

August 16, 2024

August 15, 2024

Introducing Log Drains

Log Drains, for exporting product logs, is now available in Public Alpha.

August 14, 2024

August 13, 2024

Tinybird vs. ClickHouse®️: What's the difference?

Tinybird is a real-time data platform for user-facing analytics, built using ClickHouse. Here are the differences between Tinybird and other ClickHouse solutions, including self-hosted and managed.

Zero downtime migrations at petabyte scale

Data migrations are a critical part of the database lifecycle, and are sometimes necessary for version upgrades, sharding, or moving to a new platform. In many cases, migrations are painful and error-prone. In this article, we walk through how migrations are performed at PlanetScale, and offer advice on how to improve the migration experience.

Can You Do Both: Fast Scans and Fast Writes in a Single System?

Have you ever wondered why most existing database systems focus solely on either analytical or transactional performance? The data orientation within the file format and the internals of the storage engine are key reasons for this specialization. Current database systems are unable to balance transactional and analytical processing and are therefore forced to optimize for just one of the two workload types. Transaction-focused OLTP systems use row-based storage formats for quick updates and lookups, but these formats are inefficient for analytics-focused OLAP tasks. OLAP workloads require scanning many rows while typically accessing only a few columns, and row-based formats are simply not designed to handle this. OLAP systems use compressed columnar formats for fast scans, but these formats make updates complex and slow and often lack efficient point lookup capabilities. For instance, a simple table scan over a column store can be more than 5x faster than the same scan over data stored in a row-based format, due to the data movement characteristics of the format.
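The data-movement point can be illustrated with a minimal sketch. It is not taken from the article: the three-column table, the function names, and the "values touched" counter are all assumptions made for illustration, and the counter is a rough stand-in for data movement rather than a benchmark. It does not attempt to reproduce the 5x figure.

```python
from typing import Dict, List, Tuple

# Hypothetical three-column table, used only for illustration.
COLUMNS = ("id", "price", "quantity")
Row = Tuple[int, int, int]


def make_rows(n: int) -> List[Row]:
    """Row-oriented layout: the fields of each record are stored together."""
    return [(i, i * 2, i % 5) for i in range(n)]


def to_columns(rows: List[Row]) -> Dict[str, List[int]]:
    """Column-oriented layout: the values of each column are stored together."""
    return {name: [row[idx] for row in rows] for idx, name in enumerate(COLUMNS)}


def sum_price_row_store(rows: List[Row]) -> Tuple[int, int]:
    """Sum the 'price' column; return (total, number of field values touched)."""
    idx = COLUMNS.index("price")
    total = 0
    touched = 0
    for row in rows:
        touched += len(row)  # the whole record is fetched to reach one field
        total += row[idx]
    return total, touched


def sum_price_column_store(columns: Dict[str, List[int]]) -> Tuple[int, int]:
    """Sum the 'price' column; only that column's values are touched."""
    prices = columns["price"]
    return sum(prices), len(prices)


rows = make_rows(1000)
columns = to_columns(rows)
row_total, row_touched = sum_price_row_store(rows)
col_total, col_touched = sum_price_column_store(columns)
print(row_total, row_touched)  # same total, three values touched per record
print(col_total, col_touched)  # same total, one value touched per record
```

In this toy model the row layout touches every field of every record to answer a single-column query, while the column layout touches only the requested column. Real systems add compression, vectorized execution, and caching effects that this sketch ignores.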

August 12, 2024

July 31, 2024

Data Replication Design Spectrum

Consistent replication algorithms can be placed on a sliding scale based on how they handle replica failures. Across the three common points on this spectrum, the resource efficiency, availability, and latency are compared, providing guidance for how to choose an appropriate replication algorithm for a use case.

July 30, 2024

Faster backups with sharding

Sharding a database comes with many benefits: scalability, failure isolation, write throughput, and more. However, one of the lesser-known benefits is improved backup and restore performance.