a curated list of database news from authoritative sources

February 01, 2023

January 31, 2023

January 30, 2023

Lessons learned streaming building a Scheme-like interpreter in Go

I wanted to practice making coding videos so I did a four-part series on writing a basic Scheme-like language (minus macros and arrays and tons of stuff).

I picked this simple topic because I wanted a low-stakes way to learn what I did not know about making videos.

Here was the end result (nothing crazy):

$ go build
$ cat examples/fib.scm
(func fib (a)
      (if (< a 2)
          a
          (+ (fib (- a 1)) (fib (- a 2)))))

(fib 11)
$ ./livescheme examples/fib.scm
89

The code for the project is here.
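
For a rough sense of how little it takes to run the example above, here is a minimal sketch in Go. It is not the code from the series: the names node, fn, parse, eval and run are made up for illustration, and it assumes well-formed input and supports only integers, func, if, +, - and <.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// node is either an atom (symbol or integer literal) or a list of nodes.
// Hypothetical type for this sketch only.
type node struct {
	atom   string
	list   []node
	isList bool
}

// fn is a user-defined function: parameter names plus a body expression.
type fn struct {
	params []string
	body   node
}

// parse reads one expression from tokens, starting at *pos.
// It assumes balanced parentheses and panics otherwise.
func parse(tokens []string, pos *int) node {
	tok := tokens[*pos]
	*pos++
	if tok != "(" {
		return node{atom: tok}
	}
	n := node{isList: true}
	for tokens[*pos] != ")" {
		n.list = append(n.list, parse(tokens, pos))
	}
	*pos++ // consume ")"
	return n
}

// eval evaluates n with local variables in env and functions in funcs.
// Booleans are represented as 1 (true) and 0 (false).
func eval(n node, env map[string]int, funcs map[string]fn) int {
	if !n.isList {
		if v, err := strconv.Atoi(n.atom); err == nil {
			return v
		}
		return env[n.atom] // unknown symbols evaluate to 0
	}
	head, args := n.list[0].atom, n.list[1:]
	switch head {
	case "func": // (func name (params...) body)
		var params []string
		for _, p := range args[1].list {
			params = append(params, p.atom)
		}
		funcs[args[0].atom] = fn{params: params, body: args[2]}
		return 0
	case "if": // only the chosen branch is evaluated
		if eval(args[0], env, funcs) != 0 {
			return eval(args[1], env, funcs)
		}
		return eval(args[2], env, funcs)
	}
	// Everything else evaluates its arguments first.
	vals := make([]int, len(args))
	for i, a := range args {
		vals[i] = eval(a, env, funcs)
	}
	switch head {
	case "+":
		return vals[0] + vals[1]
	case "-":
		return vals[0] - vals[1]
	case "<":
		if vals[0] < vals[1] {
			return 1
		}
		return 0
	}
	// User-defined call: bind arguments to parameters in a fresh scope.
	f := funcs[head]
	local := map[string]int{}
	for i, p := range f.params {
		local[p] = vals[i]
	}
	return eval(f.body, local, funcs)
}

// run tokenizes src, evaluates each top-level expression in order,
// and returns the value of the last one.
func run(src string) int {
	src = strings.ReplaceAll(src, "(", " ( ")
	src = strings.ReplaceAll(src, ")", " ) ")
	tokens := strings.Fields(src)
	funcs := map[string]fn{}
	result, pos := 0, 0
	for pos < len(tokens) {
		result = eval(parse(tokens, &pos), map[string]int{}, funcs)
	}
	return result
}

func main() {
	src := `(func fib (a)
      (if (< a 2)
          a
          (+ (fib (- a 1)) (fib (- a 2)))))
(fib 11)`
	fmt.Println(run(src)) // prints 89
}
```

The sketch skips everything that makes a real interpreter interesting (error handling, closures, lexical scoping, non-integer values), but it shows the basic shape: tokenize, parse into a tree, then walk the tree.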

Video archives

Here are the four episodes! Each about an hour long. One per week for four weeks.

Live live

The videos were streamed to Twitch live.

I didn't prep for them because I wanted to show warts and all. The thought process.

But some things turned out to be tricky to explain without preparation (function calling conventions, mostly).

Overall hopefully the series was somewhat useful.

Full screen windows

In the first episode I didn't make sure that the terminal window was captured full screen, so some of my code went off the bottom of the video. That was dumb.

I even have a tmux status line at the bottom of the terminal app that I could have looked for to notice it was missing from the OBS view.

So I made sure to have the full window in view after the first episode.

Twitch moderation

Twitch Shield Mode is great. But the default setting prevents folks from commenting live until they've followed you for 2 weeks or something.

For someone starting a channel that doesn't make much sense. So in my first video I disabled it so folks could chat. And then some crypto scammer came in. Go figure.

After the first video I turned Shield Mode back on but set the minimum follow time to 10 minutes I think.

OBS Studio

I used OBS Studio to record. I was frustrated with it for a while because the video would lag so much when I tested out streaming. After playing around with Twitch Studio and giving up on it for being too simple, I messed with OBS video settings enough to get my video to not lag. Unfortunately I can't remember what settings I used.

Noise Gate / pop filter

The Noise Gate Filter is awesome. My mechanical keyboard sounded obnoxious before I turned it on. I was considering getting a pop filter but then discovered that the Noise Gate Filter is built in, you just have to turn it on.

Scenes

It also took me a while to understand OBS Scenes, but then I realized I can use them to have an intro graphic (without the mic on!), a main coding scene (focused on my terminal, with my webcam overlaid), and a "back soon" graphic if I needed it.

To get the mic off you have to disable the mic globally (it's on globally by default) and then add it as an input only to the scenes you want.

Storage and export to YouTube

Twitch doesn't store streams by default. You have to turn on Video on Demand.

Even when it's turned on the videos only seem to be stored for 1 week. Maybe that's configurable but I didn't see it.

In any case it's not a problem because you can set up a YouTube connection. Then after a stream is complete you find the stream video and click Export. It takes about a minute to upload each of the hour-long videos I did, though YouTube post-processing took a while longer after that.

Next?

I'm forced to take a break from recording these videos for the next two weeks since I'll be in Cape Town.

I haven't decided yet if I'll continue this series (not something I'm extremely excited about since everyone builds a Scheme-like language).

I'd like to have a project that I can keep contributing to over time, but I don't see very much value in doing that based on a Scheme or any Lisp-like language.

Maybe I'll do a basic JavaScript implementation next. Or another basic SQL database. Dunno.

January 27, 2023

January 26, 2023

MySQL scaling made easy

Learn about sharding, connection pooling, and more from PlanetScale Technical Solutions Architect Jonah Berquist.

January 25, 2023

January 23, 2023

For systems, research is development and development is research

The Conference on Innovative Data Systems Research (CIDR) 2023 is over, and as usual both the official program and the informal discussions have been great. CIDR encourages innovative, risky, and controversial ideas as well as honest exchanges. One intensely-discussed talk was the keynote by Hannes Mühleisen, who together with Mark Raasveldt is the brain behind DuckDB.

In the keynote, Hannes lamented the incentives of systems researchers in academia (e.g., papers over running code). He also criticized the often obscure topics database systems researchers work on while neglecting many practical and pressing problems (e.g., top-k algorithms rather than practically-important issues like strings). Michael Stonebraker has similar thoughts on the database systems community. I share many of these criticisms, but I'm more optimistic regarding what systems research in academia can do, and would therefore like to share my perspective.

Software is different: copying it is free. This has two implications: (1) most systems are somewhat unique, since otherwise one could have used an existing one, and (2) the cost of software is dominated by development effort. I argue that, together, these two observations mean that systems research and system development are two sides of the same coin.

Because developing complex systems is difficult, reinventing the wheel is not a good idea -- it's much better to stand on the proverbial shoulders of giants. Thus, developers should look at the existing literature to find out what others have done, and should experimentally compare existing approaches. Often there are no good solutions for some problems, requiring new inventions, which need to be written up to communicate them to others. Writing will not just allow communication, it will also improve conceptual clarity and understanding, leading to better software. Of course, all these activities (literature review, experiments, invention, writing) are indistinguishable from systems research.

On the other hand, doing systems research without integrating the new techniques into real systems can also lead to problems. Without being grounded by real systems, researchers risk wasting their time on intellectually-difficult, but practically-pointless problems. (And indeed much of what is published at the major database conferences falls into this trap.) Building real systems leads to a treasure trove of open problems. Publishing solutions to these often directly results in technological progress, better systems, and adoption by other systems.

To summarize: systems research is (or should be) indistinguishable from systems development. In principle, this methodology could work in both industry and academia. Both places have problematic incentives, but different ones. Industry often has a very short time horizon, which can lead to very incremental developments. Academic paper-counting incentives can lead to lots of papers without any impact on real systems.

Building systems in academia may not be the best strategy for maximizing the number of papers or citations, but it can lead to real-world impact, technological progress, and (in the long run even) academic accolades. The key is therefore to work with people who have shown how to overcome these systemic pathologies, and build systems over a long time horizon. There are many examples of such academic projects (e.g., PostgreSQL, C-Store/Vertica, H-Store/VoltDB, ShoreMT, Proteus, Quickstep, Peloton, KÙZU, AsterixDB, MonetDB, Vectorwise, DuckDB, Hyper, LeanStore, and Umbra).


An effective product manager

There are three specific activities I have loved in some product managers I've worked with (and missed in others).

tl;dr

  • Talk with customers and prospects
  • Develop and share a vision
  • Evangelize

Talk with customers and prospects

As a product manager, your superpower over engineering is to have spent time with customers and prospects. You should have (or develop) a good understanding of the market and your product's potential.

The only way you can do this is by spending time, over time, with customers and prospects. Understanding their workflows and their issues.

Develop and share a vision

Cynical folks will cringe at the word "vision" but it is a serious and necessary part of a successful organization.

As a product manager, you should establish and share a path for engineering to follow based on your understanding of customers, prospects, the market, and the company.

This is the "roadmap" and "prioritization". But prioritization is useless without a long-term vision.

The roadmap should represent (and broadly demonstrate) a concrete and meaningful goal. A goal that you can and should adjust over time as the company and market change.

Evangelize

In bigger organizations there might be dedicated evangelism teams. But product managers must drive this work.

Evangelism should fit the vision you've developed.

And in the absence of dedicated evangelism teams, product managers should be creating demos, writing blog posts, and testing the solution with customers and prospects.

Again, it's fine for dedicated teams outside of product management to do bits of that work. But it must be driven and led by the product manager.

It's hard

From what I have observed from the outside, being an effective product manager looks like a massively challenging task.

It's so easy to go without talking to customers, to get sucked into day-to-day issues and not create a vision, and to allow evangelism to happen ad hoc.

Then there's the fact you don't live in a vacuum. You may have a boss in product management. Your engineering peers may have competing priorities. You may have a hard time understanding the founders or CEO. In a large company, you may not even have a CEO.

My ideas, your ideas

These are my ideas based on my experience. You may have your own ideas. If mine help you, great! If they don't, great!