a curated list of database news from authoritative sources

January 20, 2018

January 10, 2018

First few hurdles writing a Scheme interpreter

I started working on BSDScheme last October, inspired to get back into language implementation after my coworker built bshift, a compiler for a C-like language. BSDScheme is an interpreter for a (currently small subset of) Scheme written in D. It implements a few substantial primitive functions (in under 1000 LoC!). It uses the same test framework bshift uses, btest. I'm going to expand here on some notes I wrote in a Reddit post about issues I faced during these first few months developing BSDScheme.

Before I get too far, here is a simple exponent function running in BSDScheme. It demonstrates a few of the basic builtin primitives and also integers being upgraded to D's std.bigint when an integer operation produces an integer unable to fit in 64 bits. (See the times and plus guards for details; see the examples directory for other examples.)

$ cat examples/exp.scm
(define (exp base pow)
  (if (= pow 0)
      1
      (* base (exp base (- pow 1)))))

(display (exp 2 64))
(newline)
$ ./bin/bsdscheme examples/exp.scm
18446744073709551616

The first big correction I made was to the way values are represented in memory. I originally implemented BSDScheme's value representation as a struct with a pointer to each possible value type. This design was simple to begin with but space-inefficient. I modelled a redesign after the Chicken Scheme data representation. It uses a struct with two fields, header and data. Both fields are word-size integers (currently hard-coded as 64 bits). The header stores type and length information and the data stores data.

In this representation, simple types (integers < 2^63, booleans, characters, etc.) take up only 128 bits. The integers, booleans, etc. are placed directly into the 64-bit data field. Other types (larger integers, strings, functions, etc.) use the data field to store a pointer to memory allocated on the heap. Getting the conversion of these complex types right was the trickiest part of this data representation effort... lots of void-pointer conversions.
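A minimal sketch of the header/data idea in Python, for illustration only. The bit layout (an 8-bit type tag above a 56-bit length), the tag numbers, and the list standing in for heap memory are all assumptions; they are not BSDScheme's or Chicken Scheme's actual layout.

```python
# Hypothetical layout: the top 8 bits of the header hold a type tag and the
# low 56 bits hold a length. These numbers are assumptions for illustration.
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1
TAG_SHIFT = 56
LENGTH_MASK = (1 << TAG_SHIFT) - 1

TAG_INTEGER, TAG_BOOLEAN, TAG_STRING, TAG_BIGINT = 1, 2, 3, 4

heap = []  # stands in for heap memory; a "pointer" is an index into this list


def make_header(tag, length=0):
    """Pack a type tag and a length into one 64-bit header word."""
    return ((tag << TAG_SHIFT) | (length & LENGTH_MASK)) & WORD_MASK


def header_tag(header):
    return header >> TAG_SHIFT


def header_length(header):
    return header & LENGTH_MASK


def make_integer(n):
    """Store small integers directly in the data word, big ones on the heap."""
    if -(1 << 63) <= n < (1 << 63):
        return (make_header(TAG_INTEGER), n & WORD_MASK)  # two's complement
    heap.append(n)
    return (make_header(TAG_BIGINT), len(heap) - 1)


def make_string(s):
    """Strings always live on the heap; the header records their length."""
    heap.append(s)
    return (make_header(TAG_STRING, len(s)), len(heap) - 1)


def unbox(value):
    """Turn a (header, data) pair back into a Python value."""
    header, data = value
    tag = header_tag(header)
    if tag == TAG_INTEGER:
        # Sign-extend the 64-bit two's complement data word
        return data - (1 << 64) if data >= (1 << 63) else data
    if tag == TAG_BOOLEAN:
        return data != 0
    return heap[data]  # every other tag stores a "pointer"


small = make_integer(2 ** 10)
big = make_integer(2 ** 64)
print(header_tag(small[0]) == TAG_INTEGER, unbox(small))  # True 1024
print(header_tag(big[0]) == TAG_BIGINT, unbox(big))       # True 18446744073709551616
```

Every value is the same two words regardless of type, which is what keeps the simple cases at 128 bits.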

The next big fix I made was to simplify the way generic functions dealt with their arguments. Originally I passed each function its arguments un-evaluated and left it up to each function to evaluate its arguments before operating on them. While there was nothing intrinsically wrong with this method, it was overly complicated and bug-prone. I refactored the builtin functions into two groups: normal functions and special functions. Normal function arguments are evaluated before sending the arguments S-expression to the function. Special functions receive the arguments S-expression verbatim so they can decide what / when to evaluate.
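The split between normal and special functions can be sketched roughly as follows. This is a Python illustration, not BSDScheme's D code: S-expressions are modelled as nested lists, symbols as strings, and the table names NORMAL_FUNCTIONS and SPECIAL_FUNCTIONS are made up for the example.

```python
import operator
from functools import reduce


def evaluate(expr, env):
    """Evaluate an S-expression modelled as nested Python lists."""
    if isinstance(expr, str):       # a symbol: look it up
        return env[expr]
    if not isinstance(expr, list):  # a literal such as an integer
        return expr
    name, args = expr[0], expr[1:]
    if name in SPECIAL_FUNCTIONS:
        # Special functions get the argument expressions verbatim
        return SPECIAL_FUNCTIONS[name](args, env)
    # Normal functions only ever see already-evaluated arguments
    evaluated = [evaluate(arg, env) for arg in args]
    return NORMAL_FUNCTIONS[name](evaluated)


def special_if(args, env):
    """Evaluate the test, then only the branch that was selected."""
    test, then_branch, else_branch = args
    chosen = then_branch if evaluate(test, env) else else_branch
    return evaluate(chosen, env)


def special_define(args, env):
    """Bind a name without evaluating the name itself."""
    name, value_expr = args
    env[name] = evaluate(value_expr, env)


NORMAL_FUNCTIONS = {
    "+": lambda args: sum(args),
    "*": lambda args: reduce(operator.mul, args, 1),
    "=": lambda args: all(a == args[0] for a in args),
}

SPECIAL_FUNCTIONS = {
    "if": special_if,
    "define": special_define,
}

env = {}
evaluate(["define", "x", ["+", 1, 2]], env)
# The else branch names an unbound symbol, but it is never evaluated
print(evaluate(["if", ["=", "x", 3], ["*", "x", 2], "unbound"], env))  # 6
```

With this shape, argument evaluation happens in exactly one place, so a normal function cannot forget to evaluate (or double-evaluate) its arguments.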

The last issue I'll talk about in this post was dealing with the AST representation. When I started out, the easiest way to get things working was to have an AST representation completely separate from the representation of BSDScheme values. This won't get you far in Scheme. In order to (eventually) support macros (and in the meantime support eval), the AST representation would have to make use of the value representation. This was the most complicated and confusing issue so far in BSDScheme. With the switch to recursive data structures, it was hard to know if an error occurred because I parsed incorrectly, or recursed over what I parsed incorrectly, or even if I was printing out what I parsed incorrectly. After some embarrassing pain, I got all the pieces in place after a month and it set me up to easily support converting my original interpret function into a generic eval function that I could expose to the language like any other special function.
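To illustrate why sharing one representation makes eval almost free, here is a small Python sketch. It assumes nested lists, ints, and strings (for symbols) as the value representation; the read and evaluate functions are hypothetical stand-ins, not BSDScheme's actual parser or interpreter.

```python
def read(source):
    """Parse one S-expression into nested lists, ints, and symbol strings."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos):
        token = tokens[pos]
        if token == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1  # skip the closing paren
        try:
            return int(token), pos + 1
        except ValueError:
            return token, pos + 1  # anything else is a symbol

    value, _ = parse(0)
    return value


def evaluate(value, env):
    """Interpret a value produced by read (or built at runtime)."""
    if isinstance(value, str):
        return env[value]
    if not isinstance(value, list):
        return value
    head, args = value[0], value[1:]
    if head == "quote":
        return args[0]  # hand the parsed structure back as data
    if head == "eval":
        # Evaluate the argument to get a value, then interpret that value
        return evaluate(evaluate(args[0], env), env)
    function = evaluate(head, env)
    return function(*[evaluate(arg, env) for arg in args])


env = {"+": lambda *ns: sum(ns), "list": lambda *items: list(items)}

print(read("(+ 1 (+ 2 3))"))                                # ['+', 1, ['+', 2, 3]]
print(evaluate(read("(eval (quote (+ 1 2)))"), env))        # 3
print(evaluate(read("(eval (list (quote +) 1 2))"), env))   # 3
```

The last line builds its expression at runtime with list and still evaluates it, because the interpreter never distinguishes parsed code from ordinary data.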

One frustrating side-effect of this AST conversion is that since the parsing stage builds out trees using the internal value representation, the parsing stage is tied to the interpreter. From what I can tell, this basically means I have to either revert to some intermediate AST representation or throw away the parser to support a compiler backend.

Next steps in BSDScheme include converting all the examples into tests, combining the needlessly split out lexing and parsing stage into a single read function that can be exposed into the language, fleshing out R7RS library support, and looking more into LLVM as a backend.

September 18, 2017

Custom Sharding With Vitess

Vitess supports a variety of predefined sharding algorithms that can suit different needs. This is achieved by associating a Vindex with your main sharding column. A Vindex essentially provides a mapping function that converts your column value to a keyspace_id. This keyspace_id is then used to decide the target shard. A full description of VSchema and Vindexes can be found here. However, such predefined vindexes will work only if you intend to shard your system using Vitess.
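The column value to keyspace_id to shard flow described above can be sketched in Python. This is only a conceptual illustration: the truncated SHA-256 hash is a stand-in rather than the function any real Vitess vindex uses, and the two-shard layout and the names toy_vindex and target_shard are assumptions made for the example.

```python
import hashlib


def toy_vindex(column_value):
    """Map a column value to an 8-byte keyspace_id.

    Stand-in mapping for illustration; not a real Vitess vindex.
    """
    return hashlib.sha256(str(column_value).encode()).digest()[:8]


# Hypothetical two-shard layout. Each shard owns a half-open range
# [start, end) of keyspace_ids; None means "no upper bound".
SHARDS = {
    "-80": (b"", bytes([0x80])),
    "80-": (bytes([0x80]), None),
}


def target_shard(keyspace_id):
    """Pick the shard whose key range contains the keyspace_id."""
    for name, (start, end) in SHARDS.items():
        if keyspace_id >= start and (end is None or keyspace_id < end):
            return name
    raise LookupError("no shard covers this keyspace_id")


for user_id in (1, 2, 3):
    keyspace_id = toy_vindex(user_id)
    print(user_id, keyspace_id.hex(), target_shard(keyspace_id))
```

The point of the indirection is that routing depends only on the keyspace_id, so the mapping function can be swapped without changing how shards are chosen.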

April 25, 2017

Vitess releases version 2.1

The Vitess project is proud to announce the release of version 2.1. This version comes packed with new features that improve usability, availability and resilience of the overall system. The release coincides with the Percona Live 2017 Conference, where project co-founder Sugu Sougoumarane will give the talk "Vitess beyond YouTube". He is joined by Robert Navarro from Stitch Labs who is going to describe how Stitch Labs uses Vitess in production.

March 21, 2017

March 11, 2017

Deploying FreeBSD on Linode unattended in minutes

I became a FreeBSD user over 2 years ago when I wanted to see what all the fuss was about. I swapped my y410p dual-booting Windows / Ubuntu with FreeBSD running Gnome 3. I learned a lot during the transition and came to appreciate FreeBSD as a user. I soon began running FreeBSD as my OS of choice on cloud servers I managed. So naturally, when I started working at Linode a year ago I wanted to run FreeBSD servers on Linode too.

Linode is a great platform for running random unofficial images because you have much control over the configuration. I followed existing guides closely and was soon able to get a number of operating systems running on Linodes by installing them manually: FreeBSD, OpenBSD, NetBSD, Minix3, and SmartOS to date.

Unofficial images come at a cost though. In particular, I became frustrated having to reinstall using the installer every time I managed to trash the disk. So over the past year, I spent time trying to understand the automated installation processes across different operating systems and Linux distributions.

Unattended installations are tough. The methods for doing them differ wildly. On RedHat, Fedora, and CentOS there is Kickstart. On Debian and Ubuntu there is preseeding. Gentoo, Arch, and FreeBSD don't particularly have a framework for unattended installs, but the entire installation process is well-documented and inherently scriptable (if you put in the effort). OpenBSD has autoinstall. Trying to understand each and every one of these potential installation methods was pretty defeating for getting started on a side-project.

A few weeks ago, I finally had the silly revelation that I didn't need to script the installation process -- at least initially. I only had to have working images available somewhere that could be copied to new Linodes. Some OSs / distributions may provide these images, but there is no guarantee that they exist or work. If I tested and hosted them for Linodes, anyone could easily run their own copy.

I began by running the installation process as normal for FreeBSD. After the disk had FreeBSD installed on it, I rebooted into Finnix, made a compressed disk image, and transferred it to an "image host" (another Linode in Fremont running an FTP server). Then I tested the reversal process manually to make sure a new Linode could grab the image, dd it to a disk, reboot and have a working filesystem and networking. (This transfer occurs over private networking to reduce bandwidth costs and thus limits Linode creation to the datacenter of the image host, Fremont.)
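The restore half of that process (decompress the image and write it block by block, much like piping a decompressor into dd) might look something like this in Python. It assumes a gzip-compressed image that has already been downloaded; the compression format, the function name write_image, and both paths are assumptions, since the post does not specify them.

```python
import gzip


def write_image(compressed_image_path, target_path, block_size=1024 * 1024):
    """Stream a gzip-compressed disk image to target_path in fixed-size blocks.

    Returns the number of decompressed bytes written. Both paths are
    hypothetical; on a real system the target would be the disk device.
    """
    bytes_written = 0
    with gzip.open(compressed_image_path, "rb") as image, \
            open(target_path, "wb") as target:
        while True:
            block = image.read(block_size)
            if not block:  # end of the compressed stream
                break
            target.write(block)
            bytes_written += len(block)
    return bytes_written
```

Streaming in blocks keeps memory use flat no matter how large the image is, which matters when the rescue environment has little RAM.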

Then it was time to script the process. I looked into the existing Linode API client wrappers and noticed none of them were documented. So I took a day to write and document a good part of a new Linode Python client.

I got to work and out came the linode-deploy-experimental script. To run this script, you'll need an API token. This script will allow you to deploy from the hosted images (which now include FreeBSD 11.0 and OpenBSD 6.0). Follow the example line in the git repo and you'll have a Linode running OpenBSD or FreeBSD in minutes.

Clearly there's a lot of work to do on both this script and on the images:

  • Fremont datacenter has the only image host.
  • The script does not change the default password: "password123". You'll want to change this immediately.
  • The script does not automatically grow the file system after install.
  • The TTY config for these images currently requires you to use Glish instead of Weblish.
  • And more.

Even if many of these issues do get sorted out (I assume they will), keep in mind that these are unofficial, unsupported images. Some things will probably never work: backups, password reset, etc. If you need help, you are probably limited to community support. You can also find me with any questions (peaton on OFTC). But for me this is at least a slight improvement on having to run through the install process every time I need a new FreeBSD Linode.

December 29, 2016

Walking through a basic Racket web service

Racket is an impressive language and ecosystem. Compared to Python, Racket (an evolution of Scheme R5RS) is three years younger. It is as concise and expressive as Python but with much more reasonable syntax and semantics. Racket is also faster in many cases due in part to:

Furthermore, the built-in web server libraries and database drivers for MySQL and PostgreSQL are fully asynchronous. This last bit drove me here from Play / Akka. (But strong reservations about the complexity of Scala and the ugliness of Play in Java helped too.)

With this motivation in mind, I'm going to break down the simple web service example provided in the Racket manuals. If you don't see the following code in the linked page immediately, scroll down a bit.

#lang web-server

(require web-server/http)

(provide interface-version stuffer start)

(define interface-version 'stateless)

(define stuffer
  (stuffer-chain
   serialize-stuffer
   (md5-stuffer (build-path (find-system-path 'home-dir) ".urls"))))

(define (start req)
  (response/xexpr
   `(html (body (h2 "Look ma, no state!")))))

First we notice the #lang declaration. Racket libraries love to make new "languages". These languages can include some entirely new syntax (like the Algol language implementation) or can simply include a summary collection of libraries and alternative program entrypoints (such as this web-server language provides). So the first thing we'll do to really understand this code is to throw out the custom language. And while we're at it, we'll throw out all typical imports provided by the default racket language and use the racket/base language instead. This will help us get a better understanding of the Racket libraries and the functions we're using from these libraries.

While we're throwing the language away, we notice the paragraphs just below that original example in the manual. They mention that the web-server language also imports a bunch of modules. We can discover which of these modules we actually need by searching in the Racket manual for functions we've used. For instance, searching for "response/xexpr" tells us it's in the web-server/http/xexpr module. We'll import the modules we need using the "prefix-in" form to make function-module connections explicit.

#lang racket/base

(require (prefix-in xexpr: web-server/http/xexpr)
         (prefix-in hash: web-server/stuffers/hash)
         (prefix-in stuffer: web-server/stuffers/stuffer)
         (prefix-in serialize: web-server/stuffers/serialize))

(provide interface-version stuffer start)

(define interface-version 'stateless)

(define stuffer
  (stuffer:stuffer-chain
   serialize:serialize-stuffer
   (hash:md5-stuffer (build-path (find-system-path 'home-dir) ".urls"))))

(define (start req)
  (xexpr:response/xexpr
   `(html (body (h2 "Look ma, no state!")))))

Now we've got something that is a little less magical. We can run this file by calling it: "racket server.rkt". But nothing happens. This is because the web-server language would start the service itself using the exported variables we provided. So we're going to have to figure out what underlying function calls "start" and call it ourselves. Unfortunately searching for "start" in the manual search field yields nothing relevant. So we Google "racket web server start". Down the page on the second search result we notice an example using the serve/servlet function to register the start function. This is our in.

#lang racket/base

(require (prefix-in xexpr: web-server/http/xexpr)
         (prefix-in hash: web-server/stuffers/hash)
         (prefix-in stuffer: web-server/stuffers/stuffer)
         (prefix-in serialize: web-server/stuffers/serialize)
         (prefix-in servlet-env: web-server/servlet-env))

(provide interface-version stuffer start)

(define interface-version 'stateless)

(define stuffer
  (stuffer:stuffer-chain
   serialize:serialize-stuffer
   (hash:md5-stuffer (build-path (find-system-path 'home-dir) ".urls"))))

(define (start req)
  (xexpr:response/xexpr
   `(html (body (h2 "Look ma, no state!")))))

(servlet-env:serve/servlet start)

Run this version and it works! We are directed to a browser with our HTML. But we should clean this code up a bit. We no longer need to export anything so we'll drop the provide line. We aren't even using the interface-version and stuffer code. Things seem to be fine without them, so we'll drop those too. Also, looking at the serve/servlet documentation we notice some other nice arguments we can tack on.

#lang racket/base

(require (prefix-in xexpr: web-server/http/xexpr)
         (prefix-in servlet-env: web-server/servlet-env))

(define (start req)
  (xexpr:response/xexpr
   `(html (body (h2 "Look ma, no state!")))))

(servlet-env:serve/servlet
 start
 #:servlet-path "/"
 #:servlet-regexp #rx""
 #:stateless? #t)

Ah, that's much cleaner. When you run this code, you will no longer be directed to the /servlets/standalone.rkt path but to the site root -- set by the #:servlet-path optional variable. Also, every other path you try to reach such as /foobar will successfully map to the start function -- set by the #:servlet-regexp optional variable. Finally, we also found the configuration to set the servlet stateless -- set by the optional variable #:stateless?.

But this is missing two things we could really use out of a simple web service. The first is routing. We do that by looking up the documentation for the web-server/dispatch module. We'll use this module to define some routes -- adding a 404 route to demonstrate the usage.

#lang racket/base

(require (prefix-in dispatch: web-server/dispatch)
         (prefix-in xexpr: web-server/http/xexpr)
         (prefix-in servlet: web-server/servlet-env))

(define (not-found-route request)
  (xexpr:response/xexpr
   `(html (body (h2 "Uh-oh! Page not found.")))))

(define (home-route request)
  (xexpr:response/xexpr
   `(html (body (h2 "Look ma, no state!!!!!!!!!")))))

(define-values (route-dispatch route-url)
  (dispatch:dispatch-rules
   [("") home-route]
   [else not-found-route]))

(servlet:serve/servlet
 route-dispatch
 #:servlet-path "/"
 #:servlet-regexp #rx""
 #:stateless? #t)

Run this version and check out the server root. Then try any other path. Looks good. The final missing piece to this simple web service is logging. Thankfully, the web-server/dispatchers/dispatch-log module has us covered with some request formatting functions. So we'll wrap the route-dispatch function and we'll print out the formatted request.

#lang racket/base

(require (prefix-in dispatch: web-server/dispatch)
         (prefix-in dispatch-log: web-server/dispatchers/dispatch-log)
         (prefix-in xexpr: web-server/http/xexpr)
         (prefix-in servlet: web-server/servlet-env))

(define (not-found-route request)
  (xexpr:response/xexpr
   `(html (body (h2 "Uh-oh! Page not found.")))))

(define (home-route request)
  (xexpr:response/xexpr
   `(html (body (h2 "Look ma, no state!!!!!!!!!")))))

(define-values (route-dispatch route-url)
  (dispatch:dispatch-rules
   [("") home-route]
   [else not-found-route]))

(define (route-dispatch/log-middleware req)
  (display (dispatch-log:apache-default-format req))
  (flush-output)
  (route-dispatch req))

(servlet:serve/servlet
 route-dispatch/log-middleware
 #:servlet-path "/"
 #:servlet-regexp #rx""
 #:stateless? #t)

Run this version and notice the logs displayed for each request. Now you've got a simple web service with routing and logging! I hope this gives you a taste for how easy it is to build simple web services in Racket without downloading any third-party libraries. Database drivers and HTML template libraries are also included and similarly well-documented. In the future I hope to add an example of a slightly more advanced web service.

I have had huge difficulty discovering the source of Racket libraries. These library sources are nearly impossible to Google and search on GitHub is insane. Best scenario, the official racket.org docs would link directly to the source of a function when the function is documented. Of course I could just download the Racket source and start grepping... but I'm only so interested.