February 01, 2019
Honest asymptotic complexity for search trees and tries
But that argument is very unfair: Either we consider a key comparison a O(1) operation, then a tree lookup is indeed in O(log n), but then a trie lookup is in O(1)! As the length of the key has to be bounded by a constant to get O(1) comparisons, the depth of the trie is bounded, too. Or the length of the key matters, then a trie lookup is indeed in O(k), but then a tree lookup is in O(k log n). We have to compare with the key on every level, and if we are unlucky we have to look at the whole key, which gives the factor k.
This of course makes tries much more attractive asymptotically. Note that we ignore wall-clock times here, which are much more difficult to predict, but in many if not most cases tries are indeed much faster than search trees.
I believe the reason why textbooks get away with this unfair comparison is that they all present balanced search trees with integer keys, while tries are traditionally introduced with string examples. If they had used string keys for balanced search trees instead, it would have been clear that the key length matters:
The trie examines every key byte at most once, while the search tree can examine every key byte log n times. Thus, the asymptotic complexity of tries is actually better than that of balanced search trees.
January 22, 2019
Transparency and communication on small teams
I saw a post on dev.to that talks about dysfunctional teams. This is a response that focuses specifically on how to prevent burnout from overworking. This is aimed at senior/lead engineers and engineering/project managers -- because everyone in a leadership role is responsible for the health of the team and the company.
In an otherwise good company with hard-working, ethical employees, overworking happens because of imperfect communication. If either of those premises doesn't hold, you have more serious issues and have no need for this post.
The primary subjects of poor communication are:
- Capacity/capabilities
- Priorities
- Results
If any member of the team (or worse, the entire team) is not honestly reporting on their capacity and capabilities, this will drive them to overwork to make up for what they couldn't accomplish during work hours.
If any member of the team (or worse, the entire team) is not honestly and publicly reporting on what they understand to be the priorities, they will end up needing to work overtime if true priorities become apparent too late.
And if any member of the team (or worse, the entire team) is not honestly and publicly reporting on what they accomplished, they will end up needing to work overtime if discrepancies become apparent too late.
Solution
Put a sprint process in place and schedule at least one meeting every sprint. Discover every political, technical, and structural stakeholder and find a time they can attend this meeting. At this meeting you will cover at a high level (perhaps with some demos) what was accomplished in the sprint and what you intend to accomplish in the next sprint.
If any stakeholder cannot make this meeting, find a time to sync up with him/her separately.
Your sprints should not last more than two weeks because any longer is too long to go before talking to/reviewing with your stakeholders.
Finally, publish a report on what you accomplished this sprint (and also what you did not accomplish!) and what you plan to accomplish the next sprint. For example, I send an email to the engineering organization with two docs at the end of each sprint: 1) a review doc listing tasks accomplished/not accomplished and 2) a list of tasks planned for the next sprint. This gives your stakeholders (and anyone else interested) an opportunity to review the contents of the meeting at their leisure.
Doing this can be difficult and embarrassing at first. Hard-working, ethical employees never want to be seen as not accomplishing their share of work. But the most important thing for the mid-to-long-term health of these employees is to get them reporting honestly.
This helps make it clear where these employees can legitimately improve (i.e. receive more training) and where it's necessary to hire more or different employees. You'll likely need to put pressure on every team member to report honestly and to do so without fear.
And as a result of doing this, you've done everything you can as a senior/lead member of a small team to push responsibility for your team's work up to your stakeholders. This is the best position to be in.
tldr; don't let your folks overwork unnecessarily when you could be reporting more frequently/honestly on understood priorities and accomplishments achieved/not achieved https://t.co/PeTe2Bq0Xz
— Phil Eaton (@phil_eaton) January 22, 2019
January 20, 2019
Windows
It has been six years since I last used Windows for any remotely serious software development. I've used Ubuntu, Arch, or FreeBSD since. But eventually I spent so much time working around common workplace tasks that I decided to put Windows 10 Pro on my work laptop.
Windows Subsystem for Linux
Introduced in 2016, this technology allows Windows to run unmodified Linux binaries; the core feat is syscall translation.
It works nearly flawlessly. This means I can do all my Go, Node, PostgreSQL development on Windows without a virtual machine using bash, tmux, git, emacs, etc.
I've seen a few minor exceptions over the course of regular software development in WSL.
More generally, Linux programs are heavily file-oriented. And Windows I/O is not designed well for that. In the worst cases (installing/adding Node packages) it can take minutes to do operations that would take Linux seconds.
Vagrant
Vagrant-Windows interoperability is abysmal.
As noted above, you cannot manage Hyper-V from vagrant within WSL. So you're stuck using Powershell. Even then, managing synced files from vagrant is a nightmare. The default sync method requires you to sign in using your Windows Live username and password on every reboot. But Node package installation attempts some file operations that are not supported over the default synced, network filesystem.
When I switched to rsync, vagrant wouldn't reliably sync when the virtual machine went down and came back up.
After hours of trying to get some files synced with vagrant I gave up.
Hyper-V
Hyper-V's GUI is much more complex/feature-complete than VirtualBox's. It even provides an Ubuntu quick-install that I used to jump right in. I don't recommend using this though because it gives you no option but an 11GB hard disk. I didn't realize this until I went through an hour or two of post-install customization only to run out of space. Too lazy to boot into a live CD to grow the root filesystem, I reinstalled with a more suitable 64GB drive and went through the hour-long post-install customization process again.
Networking in Hyper-V is more complex/feature-complete than VirtualBox as well. To access a Hyper-V machine you must create a new virtual network interface manually and associate it. Static IP addresses appear to be controlled at the host networking level (e.g. Control Panel) instead of within the Hyper-V interface. This highlights how these virtual interfaces are first-class, but overcomplicates the process of getting started.
Ultimately I gave up on a static IP address and decided to reboot less frequently.
Performance-wise Hyper-V machines are exactly as expected: excellent.
Misc
Docker support on Windows needs work. It took me a while to understand how Docker interacts with the WSL filesystem and what I needed to do to allow Docker to mount. The complexity is similar on macOS when you want to mount privileged directories like /var, but the experience is worse on Windows.
Apparently Windows does have tiling window managers, but I have not tried one out yet.
Powershell, a language with real types, is pretty compelling. But I have not spent enough time with it to be efficient. And since WSL is mostly good enough I don't really plan to.
Windows doesn't allow you to delete any files that are "in use". This is kind of cool, except that the errors you get when trying to delete in-use files are useless. They are even more useless when you get a plain "could not delete directory" on trying to delete a directory with some in-use file inside it. I had to delete files within the directory by hand until I found the one that was in use.
Conclusion
If you have never run Linux or FreeBSD, don't use this post as an excuse not to. You should run Linux or FreeBSD for the experience. But if you've reached diminishing returns in your Linux/FreeBSD use, Windows as a development environment has come a long way. It may be the best platform available for software development, the profession.
Some notes on my experience having replaced Arch Linux with Windows on my work laptop https://t.co/8asxZmspwR
— Phil Eaton (@phil_eaton) January 20, 2019
Writing a lisp compiler from scratch in JavaScript: 2. user-defined functions and variables
Previously in compiler basics:
1. lisp to assembly
Next in compiler basics:
3. LLVM
4. LLVM conditionals and compiling fibonacci
5. LLVM system calls
6. an x86 upgrade
In this post we'll extend the compiler to support defining functions and variables. Additionally, we'll require the program's entrypoint to be within a main function.
The resulting code can be found here.
Function definition
The simplest function definition we need to support is for our main
function. This will look like this:
$ cat basic.lisp
(def main ()
(+ 1 2))
Where compiling and running it should produce a return code of 3:
$ node ulisp.js basic.lisp
$ ./build/a.out
$ echo $?
3
Parsing function definitions
The entire language is defined in S-expressions and we already parse S-expressions.
$ node
> const { parse } = require('./parser');
> JSON.stringify(parse('(def main () (+ 1 2))'));
'[[["def","main",[],["+",1,2]]],""]'
So we're done!
Code generation
There are three tricky parts to code generation once function definitions are introduced:
- Function definitions are not expressions (in assembly)
- Function calling conventions for the callee
- Variable scope
Function definitions
A function definition looks like a function call. So we'll need to keep a list of "primitive" functions that handle what looks like function calls differently.
function compile_define() {
// TODO
}
const primitive_functions = {
def: compile_define,
};
Then in our compile_call function we need to see if the function being "called" is in this list. If so, we allow the associated callback to handle compilation.
function compile_call(fun, args, destination) {
if (primitive_functions[fun]) {
primitive_functions[fun](args, destination);
return;
}
// Save param registers
args.map((_, i) => emit(1, `PUSH ${PARAM_REGISTERS[i]}`));
// Compile arguments and store as params
args.map((arg, i) => compile_expression(arg, PARAM_REGISTERS[i]));
emit(1, `CALL ${BUILTIN_FUNCTIONS[fun] || fun}`);
// Restore param registers
args.map((_, i) => emit(1, `POP ${PARAM_REGISTERS[args.length - i - 1]}`));
if (destination && destination !== 'RAX') {
emit(1, `MOV ${destination}, RAX`);
}
}
Now we can begin thinking about compile_define. It takes args, which will be a list of three elements containing the function's:
- name
- parameters
- and body
It does not use destination because we're treating function definitions as statements for now and not as expressions. If we were treating it as an expression, we might store the address of the function in the destination register. We keep destination around to keep the primitive function signatures consistent.
Based on how we called functions before and how we defined the hard-coded add function, we know what a function definition in assembly generally looks like. And we know the arguments to the function when called will be in RDI, RSI, and RDX.
function compile_define([name, parameters, body]) {
// Function name becomes a label we can CALL
emit(0, `${name}:`);
// Something to do with RDI, RSI, RDX and the parameters variable?
// We renamed compile_argument to compile_expression to be more general
compile_expression(body[0], 'RAX');
// Maybe some cleanup to do with RDI, RSI, RDX?
emit(1, 'RET\n');
}
Not a bad first sketch. But how do we match up RDI, RSI, RDX and the user-defined parameters variable names? For example, in the following:
(def plus-two (a)
(+ a 2))
It's clear to us that a must match up to RDI. In order to do this we need to track all variables in a scope dictionary mapping the variable name to the register where it's stored.
Additionally, keeping track of scope can help us fail quickly in the following scenario:
(def plus-two (a)
(+ b 2))
The variable b is used but never defined. It has not been added to the scope dictionary. So our compiler can fail quickly, saying there is an undefined variable being referenced.
Taking this a step further, what if we want to catch the following too:
(def plus-two (a)
(plus a 2))
We're trying to call plus but it has not been defined. We should be able to fail quickly here too. But that means we'll need to track the scope of function names in addition to variables. We'll choose to track function names and variable names in the same scope dictionary.
This is the distinction between a lisp-1 and a lisp-2. We are a lisp-1 like Scheme because we have a single scope. Common Lisp is a lisp-2 because it stores function name scope separately from variable name scope.
Implementing scope
We need to revise every compile function to accept a scope dictionary (specifically: compile, compile_expression, compile_call, and compile_define). If a variable is referenced, we need to look up its location in the scope dictionary. If a variable is defined (e.g. a function name or a function parameter) we need to add a mapping to the scope dictionary.
Modifying compile_expression is easiest:
function compile_expression(arg, destination, scope) {
// Is a nested function call, compile it
if (Array.isArray(arg)) {
compile_call(arg[0], arg.slice(1), destination, scope);
return;
}
if (scope[arg] || Number.isInteger(arg)) {
emit(1, `MOV ${destination}, ${scope[arg] || arg}`);
} else {
throw new Error('Attempt to reference undefined variable or unsupported literal: ' + arg);
}
}
Next we modify compile_call:
function compile_call(fun, args, destination, scope) {
if (primitive_functions[fun]) {
primitive_functions[fun](args, destination, scope);
return;
}
// Save param registers
args.map((_, i) => emit(1, `PUSH ${PARAM_REGISTERS[i]}`));
// Compile arguments and store as params
args.map((arg, i) => compile_expression(arg, PARAM_REGISTERS[i], scope));
const validFunction = BUILTIN_FUNCTIONS[fun] || scope[fun];
if (validFunction) {
emit(1, `CALL ${validFunction}`);
} else {
throw new Error('Attempt to call undefined function: ' + fun);
}
// Restore param registers
args.map((_, i) => emit(1, `POP ${PARAM_REGISTERS[args.length - i - 1]}`));
if (destination && destination !== 'RAX') {
emit(1, `MOV ${destination}, RAX`);
}
}
And then compile_define, where we modify scope for the first time:
function compile_define([name, params, ...body], destination, scope) {
// Add this function to outer scope
scope[name] = name.replace('-', '_');
// Copy outer scope so parameter mappings aren't exposed in outer scope.
const childScope = { ...scope };
emit(0, `${scope[name]}:`);
params.forEach(function (param, i) {
const register = PARAM_REGISTERS[i];
// Store parameter mapped to associated register
childScope[param] = register;
});
// Pass childScope in for reference when body is compiled.
compile_expression(body[0], 'RAX', childScope);
emit(1, 'RET\n');
}
And finally we need to modify the entrypoint compile:
module.exports.compile = function (ast) {
emit_prefix();
// Pass in new, empty scope mapping
compile_call(ast[0], ast.slice(1), 'RAX', {});
emit_postfix();
}
And scope-wise we're pretty good!
Function calling convention: callee
We currently have a problem: we're using parameter registers to store local variables, which conflicts with how we store parameters for function calls made within the function itself.
Ideally we could store function local variables (including the parameters when we get them) separately from how we store function call parameters within the function.
Thankfully, according to the calling convention we've followed, we're given a set of registers that are callee-preserved. Of them we'll use RBX, RBP, and R12, in that order. This allows us to mess with them so long as we store and restore them within the function.
Applying the same storing/restoring strategy to local variables as we did for parameters, we get:
const LOCAL_REGISTERS = [
'RBX',
'RBP',
'R12',
];
function compile_define([name, params, ...body], destination, scope) {
// Add this function to outer scope
scope[name] = name.replace('-', '_');
// Copy outer scope so parameter mappings aren't exposed in outer scope.
const childScope = { ...scope };
emit(0, `${scope[name]}:`);
params.forEach(function (param, i) {
const register = PARAM_REGISTERS[i];
const local = LOCAL_REGISTERS[i];
emit(1, `PUSH ${local}`);
emit(1, `MOV ${local}, ${register}`);
// Store parameter mapped to associated local
childScope[param] = local;
});
// Pass childScope in for reference when body is compiled.
compile_expression(body[0], 'RAX', childScope);
params.forEach(function (param, i) {
// Backwards first
const local = LOCAL_REGISTERS[params.length - i - 1];
emit(1, `POP ${local}`);
});
emit(1, 'RET\n');
}
And we're set.
Cleanup
We've still got a few messes going on:
- emit_prefix wraps our entire body in _main, but we're requiring our own main now
- we're emitting to stdout instead of to a file
- multiple function definitions are treated as nonsense
Starting with the first, we rewrite emit_prefix and emit_postfix so that our _main just calls main.
function emit_prefix() {
emit(1, '.global _main\n');
emit(1, '.text\n');
emit(0, 'plus:');
emit(1, 'ADD RDI, RSI');
emit(1, 'MOV RAX, RDI');
emit(1, 'RET\n');
}
function emit_postfix() {
emit(0, '_main:');
emit(1, 'CALL main');
emit(1, 'MOV RDI, RAX'); // Set exit arg
emit(1, `MOV RAX, ${SYSCALL_MAP['exit']}`);
emit(1, 'SYSCALL');
}
Next, to deal with writing to a file instead of stdout, we need our emit function to write to a buffer. We'll let ulisp.js write that buffer to a file. Because we're incredibly lazy, we'll do this all globally.
let OUT = '';
function emit(depth, args) {
const indent = new Array(depth + 1).join(' ');
OUT += `${indent}${args}\n`;
}
...
module.exports.compile = function (ast) {
OUT = '';
emit_prefix();
compile_call(ast[0], ast.slice(1), 'RAX', {});
emit_postfix();
return OUT;
}
And modify ulisp.js:
const cp = require('child_process');
const fs = require('fs');
const { parse } = require('./parser');
const { compile } = require('./compiler');
function main (args) {
const input = fs.readFileSync(args[2]).toString();
const [ast] = parse(input);
const program = compile(ast);
try {
fs.mkdirSync('build');
} catch (e) {}
fs.writeFileSync('build/prog.s', program);
cp.execSync('gcc -mstackrealign -masm=intel -o build/a.out build/prog.s');
}
main(process.argv);
And we're finally ready to run a simple program.
A program!
$ cat test.lisp
(def main () (+ 1 2))
$ node ulisp.js test.lisp
$ ./build/a.out
$ echo $?
3
Hurray! Now let's try defining and calling a second function to validate parameter behavior.
$ cat test.lisp
(def plus-two (a)
(+ a 2))
(def main ()
(plus-two 3))
$ node ulisp.js test.lisp
$ ./build/a.out
./compiler.js:106
throw new Error('Attempt to call undefined function: ' + fun);
^
Error: Attempt to call undefined function: p2
...
We start getting some really weird errors. The reason is that our compiler doesn't know how to deal with sibling S-expressions.
So we'll introduce a new primitive function called begin that calls all its sibling functions and returns the value of the last call. Then we'll wrap the program in an implicit begin so we don't have to write it explicitly.
function compile_begin(body, destination, scope) {
body.forEach((expression) => compile_expression(expression, 'RAX', scope));
if (destination && destination !== 'RAX') {
emit(1, `MOV ${destination}, RAX`);
}
}
const primitive_functions = {
def: compile_define,
begin: compile_begin,
};
...
module.exports.compile = function (ast) {
OUT = '';
emit_prefix();
compile_call('begin', ast, 'RAX', {});
emit_postfix();
return OUT;
}
And we try our test program again. :)
$ cat test.lisp
(def plus-two (a)
(+ a 2))
(def main ()
(plus-two 3))
$ node ulisp.js test.lisp
$ ./build/a.out
$ echo $?
5
And that's all there is to it! Stay tuned for the next post on conditionals and tail-call optimization.
Part two on compiler basics using JavaScript: user-defined functions and variables https://t.co/XOam67HO8h
— Phil Eaton (@phil_eaton) January 20, 2019
December 27, 2018
Make small changes and solve the problems you have
Two frustrating things that can happen in an organization are 1) big changes and 2) changes that aren’t clearly associated with a known problem. It’s even worse in that order.
These situations tend to happen when a problem remains unaddressed for too long, or when there is not a strong enough emphasis on respect for all employees -- their experience, ideas, and feelings.
I try to avoid these issues in teams I run by starting early with a problem statement. Specifically when there’s a problem I’d like to solve, I’ll mention it in our fortnightly team retro. If there’s general agreement a problem exists, we begin looking for the least invasive/least effort way to fix the problem. More on that later.
If the problem is not well understood or widely-enough shared, I'll table the discussion until I can talk with more people to better articulate the problem. Or maybe there isn't a problem after all.
This process of clarifying and agreeing that a problem exists is the only appropriate first step when making a change. It is important to provide sufficient context to affected employees.
After the problem is understood I begin to suggest possible solutions -- soliciting feedback and alternatives. But making sure a problem is well understood is not the same thing as making sure that potential solutions could reasonably solve the problem. Throughout the discussion I repeatedly check that proposed solutions would actually address the problem.
From there I try to steer discussion of solutions to ones that are easiest to make and least invasive. Small changes are easier to make. There is little room for disagreement when there is little changing.
Making small changes among a small group of people is even easier. The few disagreements that you find when making small changes among a small group of people give you a chance to prove or improve the solution before introducing it to a larger group.
Communicating frequently and effectively should be a clear theme here.
At this point if there is a single most reasonable solution, I’ll pick it unless there is serious disagreement. Most of the time folks are amenable to the need for a solution to be chosen to solve a problem they agreed existed, even if they don’t love the solution.
If there is no clear solution or there is serious disagreement, go back a few paragraphs and start over: understand the problem and solicit feedback and alternative solutions. Or take the heat of serious disagreement.
This is a philosophy. It's difficult to prove its effectiveness one way or the other -- especially over the mid-to-long term. But the logic makes sense to me, it agrees with what I've read on management, and it has worked effectively in teams I've run so far.
Further reading:
- Peopleware: Productive Projects and Teams
- Managing Transitions: Making the Most of Change
- Thinking, Fast and Slow
- Site Reliability Engineering: How Google Runs Production Systems
Wrote a post expanding on a side of this: make small changes and solve the problems you have https://t.co/FXepELSHMx https://t.co/mVsT1KFhKc
— Phil Eaton (@phil_eaton) December 27, 2018
December 10, 2018
Introducing Vitess 3.0
November 20, 2018
Writing a lisp compiler from scratch in JavaScript: 1. lisp to assembly
Next in compiler basics:
2. user-defined functions and variables
3. LLVM
4. LLVM conditionals and compiling fibonacci
5. LLVM system calls
6. an x86 upgrade
In this post we'll write a simple compiler in JavaScript (on Node) without any third-party libraries. Our goal is to take an input program like (+ 1 (+ 2 3)) and produce an output assembly program that does these operations to produce 6 as the exit code. The resulting compiler can be found here.
We'll cover:
- Parsing
- Code generation
- Assembly basics
- Syscalls
And for now we'll omit:
- Programmable function definitions
- Non-symbol/-numeric data types
- More than 3 function arguments
- Lots of safety
- Lots of error messages
Parsing
We pick the S-expression syntax mentioned earlier because it's very easy to parse. Furthermore, our input language is so limited that we won't even break our parser into separate lexing/parsing stages.
Once you need to support string literals, comments, decimal literals, and other more complex literals, it becomes easier to use separate stages.
If you're curious about these separate stages of parsing, you may be interested in my post on writing a JSON parser. Or, check out my BSDScheme project for a fully-featured lexer and parser for Scheme.
The parser should produce an Abstract Syntax Tree (AST), a data structure representing the input program. Specifically, we want (+ 1 (+ 2 3)) to produce ['+', 1, ['+', 2, 3]] in JavaScript.
There are many different ways to go about parsing but the most intuitive to me is to have a function that accepts a program (a string) and returns a tuple containing the program parsed so far (an AST) and the rest of the program (a string) that hasn't been parsed.
That leaves us with a function skeleton that looks like this:
module.exports.parse = function parse(program) {
const tokens = [];
... logic to be added ...
return [tokens, ''];
};
The code that initially calls parse will thus have to deal with unwrapping the outermost tuple to get to the AST. For a more helpful compiler we could check that the entire program was actually parsed by failing if the second element of the return result is not the empty string.
Within the function we will iterate over each character and accumulate until we hit space, left or right parenthesis:
module.exports.parse = function parse(program) {
const tokens = [];
let currentToken = '';
for (let i = 0; i < program.length; i++) {
const char = program.charAt(i);
switch (char) {
case '(': // TODO
break;
case ')': // TODO
break;
case ' ':
tokens.push(+currentToken || currentToken);
currentToken = '';
break;
default:
currentToken += char;
break;
}
}
return [tokens, ''];
};
The recursive parts are always the most challenging. The right paren is easiest. We must push the current token and return all tokens with the rest of the program:
module.exports.parse = function parse(program) {
const tokens = [];
let currentToken = '';
for (let i = 0; i < program.length; i++) {
const char = program.charAt(i);
switch (char) {
case '(': // TODO
break;
case ')':
tokens.push(+currentToken || currentToken);
return [tokens, program.substring(i + 1)];
case ' ':
tokens.push(+currentToken || currentToken);
currentToken = '';
break;
default:
currentToken += char;
break;
}
}
return [tokens, ''];
};
Finally the left paren should recurse, add the parsed tokens to the list of sibling tokens, and force the loop to start at the new unparsed point.
module.exports.parse = function parse(program) {
const tokens = [];
let currentToken = '';
for (let i = 0; i < program.length; i++) {
const char = program.charAt(i);
switch (char) {
case '(': {
const [parsed, rest] = parse(program.substring(i + 1));
tokens.push(parsed);
program = rest;
i = 0;
break;
}
case ')':
tokens.push(+currentToken || currentToken);
return [tokens, program.substring(i + 1)];
case ' ':
tokens.push(+currentToken || currentToken);
currentToken = '';
break;
default:
currentToken += char;
break;
}
}
return [tokens, ''];
};
Assuming this is all in parser.js, let's try it out in the REPL:
$ node
> const { parse } = require('./parser');
undefined
> console.log(JSON.stringify(parse('(+ 3 (+ 1 2)')));
[[["+",3,["+",1,2]]],""]
Solid. We move on.
Assembly 101
Assembly is essentially the lowest-level programming language we can use. It is a human-readable, 1:1 representation of the binary instructions the CPU can interpret. Conversion from assembly to binary is done with an assembler; the reverse step is done with a disassembler. We'll use gcc for assembling since it deals with some oddities of assembly programming on macOS.
The primary data structures in assembly are registers (temporary variables stored by the CPU) and the program stack. Every function in a program has access to the same registers, but convention cordons off sections of the stack for each function, so it ends up being a slightly more durable store than registers. RAX, RDI, RDX, and RSI are a few registers available to us.
Now we only need to know a few instructions to compile our program (the rest of programming assembly is convention):
- MOV: store one register's content into another, or store a literal number into a register
- ADD: store the sum of two registers' contents in the first register
- PUSH: store a register's content on the stack
- POP: remove the top-most value from the stack and store it in a register
- CALL: enter a new section of the stack and start running the function
- RET: enter the calling function's stack and return to evaluating from the next instruction after the call
- SYSCALL: like CALL but where the function is handled by the kernel
Function calling convention
Assembly instructions are flexible enough that there is no language-defined way to make function calls. Therefore it is important to answer (at least) the following few questions:
- Where are parameters stored by the caller so that the callee has access to them?
- Where is the return value stored by the callee so the caller has access to it?
- What registers are saved by whom?
Without getting too far into the specifics, we'll assume the following answers for development on x86_64 macOS and Linux systems:
- Parameters are stored (in order) in the RDI, RSI, and RDX registers
- We won't support passing more than three arguments
- The return value is stored in the RAX register
- The RDI, RSI, and RDX registers are saved by the caller
Code generation
With assembly basics and the function call convention in mind, we've got enough to generate code from the parsed program's AST.
The skeleton of our compile code will look like this:
function emit(depth, code) {
const indent = new Array(depth + 1).join(' ');
console.log(indent + code);
}
function compile_argument(arg, destination) {
// If arg AST is a list, call compile_call on it
// Else must be a literal number, store in destination register
}
function compile_call(fun, args, destination) {
// Save param registers to the stack
// Compile arguments and store in param registers
// Call function
// Restore param registers from the stack
// Move result into destination if provided
}
function emit_prefix() {
// Assembly prefix
}
function emit_postfix() {
// Assembly postfix
}
module.exports.compile = function compile(ast) {
emit_prefix();
compile_call(ast[0], ast.slice(1));
emit_postfix();
};
From our pseudo-code in comments it is simple enough to fill in. Let's fill in everything but the prefix and postfix code.
function compile_argument(arg, destination) {
// If arg AST is a list, call compile_call on it
if (Array.isArray(arg)) {
compile_call(arg[0], arg.slice(1), destination);
return;
}
// Else must be a literal number, store in destination register
emit(1, `MOV ${destination}, ${arg}`);
}
const BUILTIN_FUNCTIONS = { '+': 'plus' };
const PARAM_REGISTERS = ['RDI', 'RSI', 'RDX'];

function compile_call(fun, args, destination) {
  // Save param registers to the stack
  args.forEach((_, i) => emit(1, `PUSH ${PARAM_REGISTERS[i]}`));

  // Compile arguments and store in param registers
  args.forEach((arg, i) => compile_argument(arg, PARAM_REGISTERS[i]));

  // Call function
  emit(1, `CALL ${BUILTIN_FUNCTIONS[fun] || fun}`);

  // Restore param registers from the stack
  args.forEach((_, i) => emit(1, `POP ${PARAM_REGISTERS[args.length - i - 1]}`));

  // Move result into destination if provided
  if (destination) {
    emit(1, `MOV ${destination}, RAX`);
  }

  emit(0, ''); // For nice formatting
}
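To see these pieces working together, here is a sketch reusing the definitions above (minus the blank formatting line), with emit collecting lines into an array instead of printing, compiling the call (+ 1 2) with RAX as the destination:

```javascript
// Sketch: the definitions above, with emit collecting lines into an array
// so we can inspect the assembly generated for (+ 1 2).
const lines = [];
function emit(depth, code) {
  lines.push(' '.repeat(depth) + code);
}

const BUILTIN_FUNCTIONS = { '+': 'plus' };
const PARAM_REGISTERS = ['RDI', 'RSI', 'RDX'];

function compile_argument(arg, destination) {
  // A list is a nested call; a number is moved into the destination register
  if (Array.isArray(arg)) {
    compile_call(arg[0], arg.slice(1), destination);
    return;
  }
  emit(1, `MOV ${destination}, ${arg}`);
}

function compile_call(fun, args, destination) {
  args.forEach((_, i) => emit(1, `PUSH ${PARAM_REGISTERS[i]}`));
  args.forEach((arg, i) => compile_argument(arg, PARAM_REGISTERS[i]));
  emit(1, `CALL ${BUILTIN_FUNCTIONS[fun] || fun}`);
  args.forEach((_, i) => emit(1, `POP ${PARAM_REGISTERS[args.length - i - 1]}`));
  if (destination) {
    emit(1, `MOV ${destination}, RAX`);
  }
}

compile_call('+', [1, 2], 'RAX');
console.log(lines.join('\n'));
// → PUSH RDI / PUSH RSI / MOV RDI, 1 / MOV RSI, 2 / CALL plus /
//   POP RSI / POP RDI / MOV RAX, RAX
```

Each instruction comes out indented one space by emit; note that the caller-saved registers are popped in reverse order of the pushes.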
In a better compiler, we would not make plus a built-in function. We'd emit code for the assembly instruction ADD. However, making plus a function makes code generation simpler and also allows us to see what function calls look like. We'll define the plus built-in function in the prefix code.
The prefix
Assembly programs consist of a few "sections" in memory. The most important are the text and data sections. text is a read-only section where the program instructions themselves are stored. The CPU is instructed to start interpreting from some location in this text section, and it will increment through instructions, evaluating each one until it reaches an instruction that tells it to jump to a different location to evaluate instructions (e.g. with CALL, RET, or JMP).
To denote the text section, we emit .text in our prefix before we emit our generated code.
The data section is for statically initialized values (e.g. global variables). We don't have any need for that right now, so we'll ignore it.
Here is a good read with more detail on these (and other) sections.
Additionally, we need to emit an entrypoint (we'll use _main) and add a notice (.global _main) so that the location of this entrypoint is visible externally. This is important because we let gcc handle the hairier parts of generating an executable file, and it needs access to the entrypoint.
So far, our prefix looks like this:
function emit_prefix() {
  emit(1, '.global _main\n');
  emit(1, '.text\n');

  // TODO: add built-in functions

  emit(0, '_main:');
}
The last part of our prefix needs to include the plus built-in function. For this, we add the first two parameter registers we agreed on (RDI and RSI) and store the result in RAX.
function emit_prefix() {
  emit(1, '.global _main\n');
  emit(1, '.text\n');

  emit(0, 'plus:');
  emit(1, 'ADD RDI, RSI');
  emit(1, 'MOV RAX, RDI');
  emit(1, 'RET\n');

  emit(0, '_main:');
}
And we're golden.
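In JavaScript terms, the built-in behaves like this rough, hypothetical analogue of those three instructions, just for intuition:

```javascript
// Rough JS analogue of the plus built-in, for intuition only:
// a arrives in RDI, b in RSI, and the result leaves in RAX.
function plus(rdi, rsi) {
  rdi = rdi + rsi; // ADD RDI, RSI
  const rax = rdi; // MOV RAX, RDI
  return rax;      // RET (the caller reads RAX)
}

console.log(plus(2, 3)); // 5
```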
The postfix
The job of the postfix will be simple: call exit with the value of RAX, since this will be the result of the last function called by the program.
exit is a syscall, so we'll use the SYSCALL instruction to call it. The x86_64 calling convention on macOS and Linux for SYSCALL defines parameters the same way CALL does. But we also need to tell SYSCALL what syscall to call. The convention is to set RAX to the integer representing the syscall on the current system. On Linux it will be 60; on macOS it is 0x2000001.
When I say "convention", I don't mean that you really have a choice as a programmer. The choice was arbitrary when the operating system and its standard libraries made it. But if you want to write a working program that uses syscalls or calls out to (say) glibc, you'll need to follow these conventions.
The postfix then looks like this:
const os = require('os');

const SYSCALL_MAP = os.platform() === 'darwin' ? {
  'exit': '0x2000001',
} : {
  'exit': 60,
};

function emit_postfix() {
  emit(1, 'MOV RDI, RAX'); // Set exit arg
  emit(1, `MOV RAX, ${SYSCALL_MAP['exit']}`); // Set syscall number
  emit(1, 'SYSCALL');
}
And we're set here too.
Putting it all together
We can finally write our JavaScript entrypoint and run our compiler against a sample program.
The entrypoint might look like this:
const { parse } = require('./parser');
const { compile } = require('./compiler');

function main(args) {
  const script = args[2];
  const [ast] = parse(script);
  compile(ast[0]);
}

main(process.argv);
And we can call it like so:
$ node ulisp.js '(+ 3 (+ 2 1))'
.global _main
.text
plus:
ADD RDI, RSI
MOV RAX, RDI
RET
_main:
PUSH RDI
PUSH RSI
MOV RDI, 3
PUSH RDI
PUSH RSI
MOV RDI, 2
MOV RSI, 1
CALL plus
POP RSI
POP RDI
MOV RSI, RAX
CALL plus
POP RSI
POP RDI
MOV RDI, RAX
MOV RAX, 0x2000001
SYSCALL
Generating an executable file from the output
If we redirect the previous output to an assembly file and call gcc on it, we can generate a program we can run. Then we can echo the $? variable to see the exit code of the previous process.
$ node ulisp.js '(+ 3 (+ 2 1))' > program.S
$ gcc -mstackrealign -masm=intel -o program program.S
$ ./program
$ echo $?
6
And we've got a working compiler! The full source of the compiler is available here.
Further reading
- x86_64 calling convention
- macOS assembly programming
- Destination-driven code generation
Finished that intro to compilers post :) lisp to assembly in Javascript https://t.co/0HDIn4Mv7a
— Phil Eaton (@phil_eaton) November 26, 2018