August 24, 2024
Obsession
In your professional and personal life, I don't believe there is a stronger motivation than having something in mind and the desire to do it. Yet the natural way to deal with a desire to do something is to justify why it's not possible.
"I want to read more books but nobody reads books these days so how could I."
"I want to write for a magazine but I have no experience writing professionally."
"I want to build a company someday but how could someone of my background."
Our official mentors, our managers, through a combination of well-intentioned defeatism and their own lack of accomplishment, among other things, are often unable to process big goals or guide you toward them.
I've been one of these managers myself. In fact I have, to my immense regret, tried too often to convince people to do what is practical rather than what they want to do. Or to do what I judged they were capable of doing rather than what they wanted to do.
In the best cases, my listener had the self-confidence to ignore me. They did what they wanted to do anyway. In the worst cases, again to my deep regret, I've been a well-intentioned part of derailing someone's career for years.
So I don't want to convince anyone of anything anymore. If I start trying to convince someone by accident, I try to catch myself. I try to avoid sentences like "I think you should …". Instead "Here is something that's worked for me: …" or "Here is what I've heard works well for other people: …".
Nobody wants to be convinced. But intelligent people will change their mind when exposed to new facts or different ideas. Being convinced is a battle of will. Changing one's mind is a purely personal decision.
There are certainly people with discipline who can grind on things they hate doing and eventually become experts at it. But more often I see people grind on things they hate only to become depressed and give up.
For most of us, our best hope is (healthy) obsession. And obsession, in the sense I'm talking about, does not come from something you are ambivalent about or hate. Obsession can only come when you're doing something you actually want to do.
For big goals or big changes, you need regular commitment weekly, monthly, yearly. Over the course of years. And only obsession makes that work not actually feel like work. Obsession is the only thing that makes discipline not feel like discipline.
That big goals take years to accomplish need not be scary. Obsession doesn't mean you can't pivot. There is quite a lot to gain by committing to something regularly over the course of years even if you decide to stop and commit from then on to something else. You will learn a good deal.
And healthy obsession to me is more specifically measurable on the order of weeks, not hours or days. Healthy obsession means you're still building healthy personal and professional relationships. You're still taking care of yourself, emotionally and physically.
I do not have high expectations for people in general. This seems healthy and reasonable. But as I meet more people and observe them over the years, I am only more convinced of the vast potential of individuals. Individuals are almost universally underestimated.
I think you can do almost anything you want to do. If you commit to doing it.
I'll end this with a personal story.
Until 11th grade, I hated school. I hated the rigidity. Being forced to be somewhere for hours and to follow so many rules. I skipped so many days of school I'm embarrassed by it. I'd never do homework at home. I never studied for tests. I got Bs and Cs in the second-tier classes. I was in the orchestra for 6 years and never practiced at home. I was not cool enough to be a "bad kid" but I did not understand the system and had no discipline whatsoever.
I found out at the end of 10th grade that I could actually afford college if I got into a good enough school that paid full needs-based tuition. It sounded significantly better than the only other option that seemed obvious, joining the military as a recruit. I realized and decided that if I wanted to get into a good school I needed to not half-ass things.
Somehow, I decided to only do things I could become obsessed with. And I decided to be obsessed in the way that I wanted, not to do what everyone else did (which I basically could not do since I had no discipline). If we covered a topic in class, I'd read news about it or watch movies about it. I'd get myself excited about the topic in every way I could.
It basically worked out. I ended high school in the top 10% of the class (up from top 40% or something). I got into a good liberal arts college that paid the entirety of my tuition. But I remained a basically lazy and undisciplined person. I never stayed up late studying for a test. I dropped out after a year and a half for family reasons.
But I've now spent the last 10 years in my spare time working on compiler projects, interpreter projects, parser projects, database projects, distributed systems projects. I've spent the last 6 years consistently publishing at least one blog post per month.
I didn't want to work the way everyone else worked. I wanted to be obsessed about what I worked on.
Obsession has made all of this into something I now barely register as doing. It's allowed me to continue adding activities like organizing book clubs and meetups to the list of things I'm up to. Up until basically this year I could have in good faith said I am a very lazy and undisciplined person. But obsession turned me into someone with discipline.
Obsession became about more than just the tech. It meant trying to fully understand the product, the users, the market. It meant thinking more carefully about product documentation, user interfaces, company messaging. Obsession meant reflecting on how I treat my coworkers, and how my coworkers feel treated by others in general. Obsession meant wanting an equitable and encouraging work environment for everyone.
And, as I said, it's about healthy obsession. I didn't really understand the "healthy" part until a few years ago. But I'm now convinced that the "healthy" part is as important as the "obsession" part. To go to the gym regularly. To play pickup volleyball. To cook excellent food. To read fiction and poetry and play music. To serve the community. To be friendly and encouraging to all people. To meet new people and build better genuine friendships.
And in the context of work, "healthy obsession" means understanding you can't do everything, even while you care about everything. It means accepting that you make mistakes and that you do your best; that you try to do better and learn from mistakes the next time.
It's got to be sustainable. And we can develop a healthy obsession while we have quite a bit of fun too. :)
August 20, 2024
What's the big deal about Deterministic Simulation Testing?
Bugs in distributed systems are hard to find, largely because systems interact in chaotic ways. And even once you've found a bug, it can be anywhere from simple to impossible to reproduce. It's about as far away as you can get from the ideal test environment: property testing a pure function.
But what if we could write our code in a way that lets us isolate the chaotic aspects of our distributed system during testing: run multiple systems communicating with each other on a single thread and control all randomness in each system? And property test this single-threaded version of the distributed system with controlled randomness, all the while injecting faults (a fancy term for unhappy-path behavior like errors and latency) we might see in the real world?
Crazy as it sounds, people actually do this. It's called Deterministic Simulation Testing (DST). And it has become more and more popular with startups like FoundationDB, Antithesis, TigerBeetle, Polar Signals, and WarpStream, as well as with folks like Tyler Neely and Pekka Enberg, who talk about and make use of this technique.
It has become so popular to talk about DST in my corner of the world that I worry it risks coming off as too magical and maybe a little hyped. It's worth getting a better understanding of both the benefits and the limitations.
Thank you to Alex Miller and Will Wilson for reviewing a version of this post.
Randomness and time
A big source of non-determinism in business logic is the use of random numbers—in your code or your transitive dependencies or your language runtime or your operating system.
Crucially, DST does not imply you can't have randomness! DST merely assumes that you have a global seed for all randomness in your program and that the simulator controls the seed. The seed may change across runs of the simulator.
Once you observe a bad state as a result of running the simulation on a random seed, you let the user enter that same seed again. This lets the user recreate the entire program run that led to the observed bad state, which makes debugging trivial.
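To make the replay property concrete outside of pseudocode, here is a minimal runnable Python sketch (the DST_SEED environment variable mirrors the pseudocode later in this post; it is not a standard name):
# replay.py
import os
import random
import time

# The same seed always reproduces the same sequence of "random" events,
# so a failing run can be replayed exactly.
seed = int(os.environ.get("DST_SEED", time.time_ns()))
rnd = random.Random(seed)
print("running with seed:", seed)

# Imagine this is the simulated workload; if an assertion fires,
# re-running with DST_SEED=<printed seed> replays the identical run.
events = [rnd.randint(0, 100) for _ in range(5)]
print(events)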
Another big source of non-determinism is being dependent on time. As with randomness, DST does not mean you can't depend on time. DST means you must be able to control the clock during the simulation.
To "control" randomness or time basically means you support dependency injection, or the old-school alternative to dependency injection called passing the dependency as an explicit parameter. Rather than referring to a global clock or a global seed, you need to be able to receive a clock or a seed from someone.
For example, we might separate the operation of an application into the language's main() entrypoint and an actual application start() entrypoint.
# app.pseudocode
def start(clock, seed):
  # lots of business logic that might depend on time or do random things

def main:
  clock = time.clock()
  seed = time.now()
  app.start(clock, seed)
The application entrypoint is where we must be able to swap out a real clock or real random seed for one controlled by our simulator:
# sim.pseudocode
import "app.pseudocode"

def main:
  sim_clock = make_sim_clock()
  sim_seed = os.env.DST_SEED or time.now()
  try:
    app.start(sim_clock, sim_seed)
  catch(e):
    print("Bad execution at seed: %s", sim_seed)
    throw e
Let's look at another example.
Converting an existing function
Let's say that we had a helper method that kept calling a function until it succeeded, with backoff.
# retry.pseudocode
class Backoff:
  def init:
    this.rnd = rnd.new(seed = time.now())
    this.tries = 0

  async def retry_backoff(f):
    while this.tries < 3:
      if f():
        return
      await time.sleep(this.rnd.gen())
      this.tries++
There is a single source of nondeterminism here and it's where we generate a seed. We could parameterize the seed, but since we want to call time.sleep() and since in DST we control the time, we can just parameterize time.
# retry.pseudocode
class Backoff:
  def init(this, time):
    this.time = time
    this.rnd = rnd.new(seed = this.time.now())
    this.tries = 0

  async def retry_backoff(this, f):
    while this.tries < 3:
      if f():
        return
      await this.time.sleep(this.rnd.gen())
      this.tries++
Now we can write a little simulator to test this:
# sim.pseudocode
import "retry.pseudocode"

# Control the seed so any failing run can be replayed.
seed = os.env.DST_SEED or time.now()
rnd = rnd.new(seed)

while true:
  # Fresh simulated time and backoff state for each iteration.
  sim_time = {
    now: 0
    sleep: (ms) => {
      await future.wait(ms)
    }
    tick: (ms) => now += ms
  }
  backoff = Backoff(sim_time)
  failures = 0
  f = () => {
    if rnd.rand() > 0.5:
      failures++
      return false
    return true
  }

  try:
    while sim_time.now < 60min:
      promise = backoff.retry_backoff(f)
      sim_time.tick(1ms)
      if promise.read():
        break
    assert_expect_failure_and_expected_time_elapse(sim_time, failures)
  catch(e):
    print("Found logical error with seed: %d", seed)
    throw e
This demonstrates a few critical aspects of DST. First, the simulator itself depends on randomness, but it allows the user to provide a seed so they can replay a simulation run that discovers a bug. The controlled randomness in the simulator is what lets us do property testing.
Second, the simulation workload must be written by the user. Even when you've got a platform like Antithesis that gives you an environment for DST, it's up to you to exercise the application.
Now let's get a little more complex.
A single thread and asynchronous IO
The determinism of multiple threads can only be controlled at the operating system or emulator or hypervisor layer. Realistically, that would require third-party systems like Antithesis or Hermit (which, don't get excited, is not actively developed and hasn't worked on any interesting program of mine) or rr.
These systems transparently transform multi-threaded code into single-threaded code. But also note that Hermit and rr have only limited ability to do fault injection, which, in addition to deterministic execution, is a goal of ours. And you can't run them on a Mac. Or on ARM.
But we can, and would like to, write a simulator without writing a new operating system or emulator or hypervisor, and without a third-party system. So we must limit ourselves to writing code that can be collapsed into a single thread. Significantly, since using blocking IO would mean an entire class of concurrency bugs could not be discovered while running the simulator in a single thread, we must also limit ourselves to asynchronous IO.
Single threaded and asynchronous IO. These are already two big limitations.
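To see what single-threaded execution with controlled time buys you, here is a minimal runnable Python sketch (not modeled on any particular library) of a simulated event loop, where virtual time advances deterministically instead of real time passing:
# simloop.py
import heapq

class SimLoop:
    def __init__(self):
        self.now = 0     # virtual time in milliseconds
        self.seq = 0     # tie-breaker for callbacks scheduled at the same time
        self.queue = []  # min-heap of (wake_time, seq, callback)

    def call_later(self, delay_ms, fn):
        self.seq += 1
        heapq.heappush(self.queue, (self.now + delay_ms, self.seq, fn))

    def run(self):
        # Callbacks run in an order decided entirely by the schedule:
        # no real threads, no real sleeping.
        while self.queue:
            self.now, _, fn = heapq.heappop(self.queue)
            fn()

loop = SimLoop()
loop.call_later(10, lambda: print("runs second, at t=10"))
loop.call_later(5, lambda: print("runs first, at t=5"))
loop.run()  # Virtual time jumps straight to 5, then 10.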
Some languages like Go are entirely built around transparent multi-threading and blocking IO. Polar Signals solved this for DST by compiling their application to WASM where it would run on a single thread. But that wasn't enough. Even on a single thread, the Go runtime intentionally schedules goroutines randomly. So Polar Signals forked the Go runtime to control this randomness with an environment variable. That's kind of crazy. Resonate took another approach that also looks cumbersome. I'm not going to attempt to describe it. Go seems like a difficult choice of language if you want to do DST.
Like Go, Rust has no builtin async IO. The most mature async IO library is tokio. The tokio folks attempted to provide a tokio-compatible simulator implementation with all sources of nondeterminism removed. From what I can tell, they did not at any point fully succeed. That repo has now been replaced with a "this is very experimental" tokio-rs project called turmoil that provides deterministic execution plus network fault injection. (But not disk fault injection. More on that later.) It isn't surprising that it is difficult to provide deterministic execution for an IO library that was not designed for it. tokio is a large project with many transitive dependencies. They must all be combed for non-determinism.
On the other hand, Pekka has already demonstrated for us how we might build a simpler Rust async IO library that is designed to be simulation tested. He modeled this on the TigerBeetle design King and I wrote about two years ago.
So let's sketch out a program that does buggy IO and let's look at how we can apply DST to it.
# readfile.pseudocode
def read_file(io, name, into_buffer):
  f = await io.open(name)
  read_buffer = [4096]u8{}
  while true:
    err, n_read = await f.read(&read_buffer)
    if err == io.EOF:
      into_buffer.copy_maybe_allocate(read_buffer[0:sizeof(read_buffer)])
      return
    if err:
      throw err
    into_buffer.copy_maybe_allocate(read_buffer[0:sizeof(read_buffer)])
In our simulator, we will provide a mocked out IO system and we will randomly inject various errors while asserting pre- and post-conditions.
# sim.pseudocode
import "readfile.pseudocode"

seed = os.env.DST_SEED ? int(os.env.DST_SEED) : time.now()
rnd = rnd.new(seed)

while true:
  sim_disk_data = rnd.rand_bytes(10MB)
  sim_fd = {
    pos: 0
    read: (fd, buf) => {
      # Simulate partial reads of anywhere from 0 to sizeof(buf) bytes.
      partial_read = rnd.rand_in_range_inclusive(0, sizeof(buf))
      memcpy(sim_disk_data, buf, fd.pos, partial_read)
      fd.pos += partial_read
      if fd.pos == sizeof(sim_disk_data):
        return sim_io.EOF, partial_read
      return nil, partial_read
    }
  }
  sim_io = {
    EOF: Error("eof")
    open: (filename) => sim_fd
  }

  out_buf = Vector<u8>.new()
  try:
    await read_file(sim_io, "somefile", out_buf)
    assert_bytes_equal(out_buf.data, sim_disk_data)
  catch (e):
    print("Found logical error with seed: %d", seed)
    throw e
And with this simulator we would have eventually caught our partial read bug! In our original program when we wrote:
into_buffer.copy_maybe_allocate(read_buffer[0:sizeof(read_buffer)])
We should have written:
into_buffer.copy_maybe_allocate(read_buffer[0:n_read])
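To make this class of bug concrete outside of pseudocode, here is a quick runnable Python illustration: a read may legally return fewer bytes than the buffer holds, so copying the whole buffer drags stale bytes along.
# partialread.py
import io

src = io.BytesIO(b"hello world")
buf = bytearray(4096)
n_read = src.readinto(buf)  # n_read is 11, not 4096

good = bytes(buf[:n_read])  # correct: copy only what was read
bad = bytes(buf)            # buggy: also copies 4085 stale zero bytes

assert good == b"hello world"
assert bad != b"hello world"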
Great! Let's get a little more complex.
A distributed system
I already mentioned in the beginning that the gist of deterministic simulation testing a distributed system is that you get all of the nodes in the system to run in the same process. This would be basically impossible if you wanted to test a system that involved your application plus Kafka plus Postgres plus Redis. But if your system is a self-contained distributed system, such as one that embeds a Raft library for high availability of your application, you can actually run multiple nodes in the same process!
For a system like this, our simulator might look like:
# sim.pseudocode
import "distsys-node.pseudocode"

seed = os.env.DST_SEED ? int(os.env.DST_SEED) : time.now()
rnd = rnd.new(seed)

while true:
  sim_fd = {
    send: (fd, buf) => {
      # Inject random failure.
      if rnd.rand() > .5:
        throw Error('bad write')
      # Inject random latency.
      if rnd.rand() > .5:
        await time.sleep(rnd.rand())
      n_written = assert_ok(os.fd.write(buf))
      return n_written
    },
    recv: (fd, buf) => {
      # Inject random failure.
      if rnd.rand() > .5:
        throw Error('bad read')
      # Inject random latency.
      if rnd.rand() > .5:
        await time.sleep(rnd.rand())
      return os.fd.read(buf)
    }
  }
  sim_io = {
    open: (filename) => {
      # Inject random failure.
      if rnd.rand() > .5:
        throw Error('bad open')
      # Inject random latency.
      if rnd.rand() > .5:
        await time.sleep(rnd.rand())
      return sim_fd
    }
  }

  all_ports = [6000, 6001, 6002]
  nodes = [
    await distsys-node.start(sim_io, all_ports[0], all_ports),
    await distsys-node.start(sim_io, all_ports[1], all_ports),
    await distsys-node.start(sim_io, all_ports[2], all_ports),
  ]
  history = []
  try:
    key = rnd.rand_bytes(10)
    value = rnd.rand_bytes(10)
    nodes[rnd.rand_in_range_inclusive(0, len(nodes) - 1)].insert(key, value)
    history.add((key, value))
    assert_valid_history(nodes, history)

    # Crash a process every so often.
    if rnd.rand() > 0.75:
      node = nodes[rnd.rand_in_range_inclusive(0, len(nodes) - 1)]
      node.restart()
  catch (e):
    print("Found logical error with seed: %d", seed)
    throw e
I'm completely hand-waving here to demonstrate the broader point and not any specific testing strategy for a specific distributed system. The important point is that these three nodes run in the same process, on different ports.
We control disk IO. We control network IO. We control how time elapses. We run a deterministic simulated workload against the three node system while injecting disk, network, and process faults.
And we are constantly checking for an invalid state. When we hit an invalid state, we can be sure the user can easily recreate it.
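For a sense of what a check like assert_valid_history might do, here is a minimal Python sketch under a simplifying, illustrative assumption: a single-writer workload against nodes that expose a hypothetical get() and guarantee read-your-writes. Real checkers, such as linearizability checkers, are far more involved.
# history.py
def assert_valid_history(nodes, history):
    # Reduce the history to the last value written to each key.
    last_write = {}
    for key, value in history:
        last_write[key] = value

    # Every node must return the last value written for every key.
    for node in nodes:
        for key, expected in last_write.items():
            actual = node.get(key)
            assert actual == expected, \
                f"got {actual!r} for {key!r}, expected {expected!r}"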
Other sources of non-determinism
Within some error margin, most CPU instructions and most CPU behavior are considered to be deterministic. There are, however, certain CPU instructions that are definitely not. Unfortunately that might include system calls. It might also include malloc. There is very little to trust.
If we ignore Antithesis, people doing DST seem not to worry about these smaller bits of nondeterminism. Yet it's generally agreed that DST is still worthwhile anyway. The intuition here is that every bit of non-determinism you can eliminate makes it that much easier to reproduce bugs when you find them.
Put another way: determinism, even among DST practitioners, remains a spectrum.
Considerations
As you may have noticed already from some of the pseudocode, DST is not a panacea.
Consideration 1: Edges
First, because you must swap out non-deterministic parts of your code, you are not actually testing the entirety of your code. You are certainly encouraged to keep the deterministic kernel large. But there will always be the non-deterministic edges.
Without a system like Antithesis which gives you an entire deterministic machine, you can't test your whole program.
But even with Antithesis you cannot test the integration between your system and external systems. You must mock out the external systems.
It's also worth noting that there are many layers at which you could inject simulation. You could do it at a high-level RPC and storage layer. This would be simpler and easier to understand. But then you'd be omitting testing of how you handle lower-level errors.
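Here is a sketch of what those two seams might look like, with illustrative interface names that are not from any particular codebase. Mocking the high-level seam is simpler; mocking the low-level one also lets the simulator exercise partial reads, failed writes, and so on.
# seams.py
from typing import Optional, Protocol

class Storage(Protocol):
    # High-level seam: simple to simulate, but skips testing how the
    # code below this interface handles low-level errors.
    def put(self, key: bytes, value: bytes) -> None: ...
    def get(self, key: bytes) -> Optional[bytes]: ...

class File(Protocol):
    # Low-level seam: more work to simulate, but the simulator can
    # inject partial reads, IO errors, failed syncs, and so on.
    def read(self, n: int) -> bytes: ...
    def write(self, buf: bytes) -> int: ...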
Consideration 2: Your workload(s)
DST is as dependent on the creativity and thoroughness of your workload as any other type of test or benchmark.
Just as you wouldn't depend on one single benchmark to qualify your application, you may not want to depend on a single simulated workload.
Or as Will Wilson put it for me:
The biggest challenge of DST in my experience is that tuning all the random distributions, the parameters of your system, the workload, the fault injection, etc. so that it produces interesting behavior is very challenging and very labor intensive. As with fuzzing or PBT, it's terrifyingly easy to build a DST system that appears to be doing a ton of testing, but actually never explores very much of the state space of your system. At FoundationDB, the vast majority of the work we put into the simulator was an iterative process of hunting for what wasn't being covered by our tests and then figuring out how to make the tests better. This process often resembles science more than it does engineering.
Unfortunately, unlike with fuzzing, mere branch coverage in your code is usually a pretty poor signal for the kinds of systems you want to test with DST. At Antithesis we handle this with Sometimes assertions, at FDB we did something pretty similar, and I assume TigerBeetle and others have their own version of this. But of course the ultimate figure of merit is whether your DST system is finding 100% of your bugs. It's quite difficult to get to the point that it does. The truly ambitious part of Antithesis isn't the hypervisor, but the fact that we also aim to solve the much harder "is my DST working?" problem with minimal human guidance or supervision.
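To build intuition for the Sometimes assertions Will mentions, here is a minimal sketch of the idea (this is not Antithesis's actual API): declare states that should be reached at least once across the whole test campaign, and flag any that never were.
# sometimes.py
from collections import Counter

declared = set()
hits = Counter()

def sometimes(name, condition):
    # Record that this interesting state is being watched, and whether
    # this particular run reached it.
    declared.add(name)
    if condition:
        hits[name] += 1

# Inside a workload you might write, for example:
# sometimes("write raced with leader election", saw_race)

def report():
    # A watched state with zero hits across every run means the workload
    # never explored that part of the state space.
    for name in sorted(declared):
        if hits[name] == 0:
            print("never reached:", name)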
Consideration 3: Your knowledge of what you mocked
When you mock out the behavior of disk or network IO, the benefits of DST are tied to your understanding of the spectrum of behavior that may happen in the real world.
What are all possible error conditions? What are the extreme latency bounds of the original method? What about corruption or misdirected IO?
The flipside here is that only in deterministic simulation testing can you configure these crazy scenarios to happen at a configurable regularity. You can kick off a set of runs that have especially high IO latency or especially high corrupt reads/writes. Joran and I wrote a year ago about how the TigerBeetle simulator does exactly this.
Consideration 4: Non-reproducible seeds as code changes
Critically, the reproducibility of DST only helps so long as your code doesn't change. As soon as your code changes, the seed may no longer even get you to the state where the bug was exhibited. So the real value of DST's reproducibility is that it can help you convert the seeded simulation run into an integration test that describes the precise scenario, even as the code changes.
Consideration 5: Time and compute
Because of Consideration 4, you need to keep rerunning the simulator, not just to keep finding new seeds and new histories, but because the new seeds and new histories may change every time you make changes to the code.
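In practice this means a harness that reruns the simulator indefinitely with fresh seeds and logs any failing seed for replay. A minimal sketch, assuming a hypothetical ./simulator binary that reads the DST_SEED environment variable as in the pseudocode above:
# rerun.py
import os
import subprocess
import time

while True:
    seed = str(time.time_ns())
    result = subprocess.run(
        ["./simulator"],
        env={**os.environ, "DST_SEED": seed},
    )
    if result.returncode != 0:
        print("failing seed:", seed)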
What about Jepsen?
Jepsen does limited process and network fault injection while testing for linearizability. It's a fantastic project.
However, it represents only a subset of what is possible with Deterministic Simulation Testing (if you actually put in the effort described above to get there).
But even more importantly, Jepsen has nothing to do with deterministic execution. If Jepsen finds a bug and your system can't do deterministic execution, you may or may not be able to reproduce that Jepsen bug.
Here's another Will Wilson quote for you on Jepsen and FoundationDB:
Anyway, we did [Deterministic Simulation Testing] for a while and found all of the bugs in the database. I know, I know, that’s an insane thing to say. It’s kind of true though. In the entire history of the company, I think we only ever had one or two bugs reported by a customer. Ever. Kyle Kingsbury aka “aphyr” didn’t even bother testing it with Jepsen, because he didn’t think he’d find anything.
Conclusion
The degree to which you can place faith in DST alone, and not in time spent in production, has limits. However, it certainly does no harm to employ DST. And, barring the considerations described above, it will likely make the kernel of your product significantly more stable. Furthermore, everyone who uses DST knows about these considerations. But I think it's worthwhile to list them out to help folks who do not know DST build an intuition for what it's excellent at.
Further reading:
- "Testing Distributed Systems w/ Deterministic Simulation" by Will Wilson
- (Mostly) Deterministic Simulation Testing in Go
- Magical Deterministic Simulator for distributed systems in Rust