December 12, 2023
December 11, 2023
8 example projects to master real-time data engineering
Supabase Studio: AI Assistant and User Impersonation
December 08, 2023
How design works at Supabase
Postgres Language Server: implementing the Parser
December 05, 2023
Why is Jepsen Written in Clojure?
People keep asking why Jepsen is written in Clojure, so I figure it’s worth having a referencable answer. I’ve programmed in something like twenty languages. Why choose a Weird Lisp?
Jepsen is built for testing concurrent systems–mostly databases. Because it tests concurrent systems, the language itself needs good support for concurrency. Clojure’s immutable, persistent data structures make it easier to write correct concurrent programs, and the language and runtime have excellent concurrency support: real threads, promises, futures, atoms, locks, queues, cyclic barriers, all of java.util.concurrent, etc. I also considered languages (like Haskell) with more rigorous control over side effects, but decided that Clojure’s less-dogmatic approach was preferable.
Because Jepsen tests databases, it needs broad client support. Almost every database has a JVM client, typically written in Java, and Clojure has decent Java interop.
Because testing is experimental work, I needed a language which was concise, adaptable, and well-suited to prototyping. Clojure is terse, and its syntactic flexibility–in particular, its macro system–work well for that. In particular the threading macros make chained transformations readable, and macros enable re-usable error handling and easy control of resource scopes. The Clojure REPL is really handy for exploring the data a test run produces.
Tests involve representing, transforming, and inspecting complex, nested data structures. Clojure’s data structures and standard library functions are possibly the best I’ve ever seen. I also print a lot of structures to the console and files: Clojure’s data syntax (EDN) is fantastic for this.
Because tests involve manipulating a decent, but not huge, chunk of data, I needed a language with “good enough” performance. Clojure’s certainly not the fastest language out there, but idiomatic Clojure is usually within an order of magnitude or two of Java, and I can shave off the difference where critical. The JVM has excellent profiling tools, and these work well with Clojure.
Jepsen’s (gosh) about a decade old now: I wanted a language with a mature core and emphasis on stability. Clojure is remarkably stable, both in terms of JVM target and the language itself. Libraries don’t “rot” anywhere near as quickly as in Scala or Ruby.
Clojure does have significant drawbacks. It has a small engineering community and no (broadly-accepted, successful) static typing system. Both of these would constrain a large team, but Jepsen’s maintained and used by only 1-3 people at a time. Working with JVM primitives can be frustrating without dropping to Java; I do this on occasion. Some aspects of the polymorphism system are lacking, but these can be worked around with libraries. The error messages are terrible. I have no apologetics for this. ;-)
I prototyped Jepsen in a few different languages before settling on Clojure. A decade in, I think it was a pretty good tradeoff.
Announcing foreign key constraints support
The challenges of supporting foreign key constraints
Supabase Beta November 2023
Launch Week X Community Meetups
Supabase Launch Week X Hackathon
December 01, 2023
What is HTAP?
Automatic CLI login
November 28, 2023
The Infamous ORDER BY LIMIT Query Optimizer Bug
Which is faster: LIMIT 1
or LIMIT 20
?
Presumably, fetching less rows is faster than fetching more rows.
But for 16 years (since 2007) the MySQL query optimizer has had a “bug”† that not only makes LIMIT 1
slower than LIMIT 20
but can also make the former a table scan, which tends to cause problems.
This happened last week where I work, and although MySQL DBAs are familiar with this bug, I’m writing this blog post for developers to more clearly illustrate and explain what’s going on and why because it’s really counterintuitive.