August 05, 2018
Vitess Weekly Digest - Aug 5 2018
August 04, 2018
btest: a language agnostic test runner
btest is a minimal, language-agnostic test runner originally written for testing compilers. Brian, a former co-worker from Linode, wrote the first implementation in Crystal (a compiled language clone of Ruby) for testing bshift, a compiler project. The tool accomplished exactly what I needed for my own language project, BSDScheme, and had very few dependencies. After some issues with Crystal support in containerized CI environments, and despite some incredible assistance from the Crystal community, we rewrote btest in D to simplify downstream use.
How it works
btest registers a command (or commands) to run and verifies the command output and status for different inputs. btest iterates over files in a directory to discover test groups and individual tests within. It supports a limited template language for easily adjusting a more-or-less similar set of tests. And it supports running test groups and individual tests themselves in parallel. All of this is managed via a simple YAML config.
btest.yaml
btest requires a project-level configuration file to declare the test directory, the command(s) to run per test, etc. Let's say we want to run tests against a Python program. We create a btest.yaml file with the following:
test_path: tests
runners:
  - name: Run tests with cpython
    run: python test.py
test_path is the directory in which tests are located. runners is an array of commands to run per test. We hard-code test.py as a project-level standard file name; each test case's code will be written to disk under that name, in an appropriate path, before the runner is invoked.
On multiple runners
Using multiple runners is helpful when we want to run all tests with different test commands or test command settings. For instance, we could run tests against cpython and pypy by adding another runner to the runners section.
test_path: tests
runners:
  - name: Run tests with cpython
    run: python test.py
  - name: Run tests with pypy
    run: pypy test.py
An example test config
Let's create a divide-by-zero.yaml file in the tests directory and add the following:
cases:
  - name: Should exit on divide by zero
    status: 1
    stdout: |
      Traceback (most recent call last):
        File "test.py", line 1, in <module>
          4 / 0
      ZeroDivisionError: division by zero
    denominator: 0
templates:
  - test.py: |
      4 / {{ denominator }}
In this example, name will be printed out when the test is run. status is the expected integer status returned by running the program. stdout is the entire expected output written by the program during execution. None of these three fields is required. If status or stdout is not provided, btest will skip checking it.

Any additional key-value pairs are treated as template variables and will be substituted where they are referenced in the templates section when the case is run. denominator is the only such variable we use in this example. When this first (and only) case is run, test.py will be written to disk containing 4 / 0.
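To make the substitution concrete, here is a minimal Python sketch of the kind of {{ var }} replacement btest's template language performs. btest itself is written in D, and this render function is purely illustrative, not btest's actual implementation:

```python
import re

# Hypothetical helper: substitute {{ name }} placeholders in a template
# with values from a dictionary, the way btest fills in test-case
# variables like `denominator`.
def render(template, variables):
    return re.sub(r'\{\{\s*(\w+)\s*\}\}',
                  lambda m: str(variables[m.group(1)]), template)

print(render('4 / {{ denominator }}', {'denominator': 0}))  # 4 / 0
```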
templates section
The templates section is a dictionary allowing us to specify files to be created with variable substitution. All files are created in the same directory per test case, so if we want to import code we can do so with relative paths.
Here is a simple example of a BSDScheme test that uses this feature.
Running btest
Run btest from the root directory (the directory above tests) and we'll see all the grouped test cases that btest registers and the result of each test:
$ btest
tests/divide-by-zero.yaml
[PASS] Should exit on divide by zero
1 of 1 tests passed for runner: Run tests with cpython
Use in CI environments
In the future we may provide pre-built release binaries. But in the meantime, the CI step involves downloading git and ldc and building/installing btest before calling it.
Circle CI
This is the config file I use for testing BSDScheme:
version: 2
jobs:
  build:
    docker:
      - image: dlanguage/ldc
    steps:
      - checkout
      - run:
          name: Install debian-packaged dependencies
          command: |
            apt update
            apt install -y git build-essential
            ln -s $(which ldc2) /usr/local/bin/ldc
      - run:
          name: Install btest
          command: |
            git clone https://github.com/briansteffens/btest
            cd btest
            make
            make install
      - run:
          name: Install bsdscheme
          command: |
            make
            make install
      - run:
          name: Run bsdscheme tests
          command: btest
Travis CI
This is the config Brian uses for testing BShift:
sudo: required
language: d

d:
  - ldc

script:
  # ldc gets installed as other names sometimes
  - sudo ln -s `which $DC` /usr/local/bin/ldc
  # bshift
  - make
  - sudo ln -s $PWD/bin/bshift /usr/local/bin/bshift
  - sudo ln -s $PWD/lib /usr/local/lib/bshift
  # nasm
  - sudo apt-get install -y nasm
  # basm
  - git clone https://github.com/briansteffens/basm
  - cd basm && cabal build && cd ..
  - sudo ln -s $PWD/basm/dist/build/basm/basm /usr/local/bin/basm
  # btest
  - git clone https://github.com/briansteffens/btest
  - cd btest && make && sudo make install && cd ..
  # run the tests
  - btest
July 22, 2018
When MySQL Goes Away
Handling MySQL errors in Go is not easy. There are a lot of MySQL server error codes, the Go MySQL driver has its own errors, Go's database/sql package has its own errors, and errors can bubble up from other packages, like net.OpError. Consequently, Go programs tend not to handle errors. Instead, they simply report them:
err := db.Query(...).Scan(&v)
if err != nil {
    return err
}
And then the error is logged or reported somewhere. This is as poor as it is common, and it's extremely common. A robust program handles the error: retry the query if possible; or report a more specific error; else, report the unhandled error. But robust MySQL error handling in Go requires very specific knowledge and experience that is beyond the reasonable purview of app developers.
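To illustrate the difference between reporting and handling, here is a language-neutral sketch in Python. The error classification and names here are hypothetical, not the Go driver's actual API; the point is the shape: retry transient errors with backoff, surface everything else.

```python
import time

# Hypothetical set of transient error messages; a real program would
# classify driver/server error codes rather than message strings.
TRANSIENT_ERRORS = {"server has gone away", "connection refused"}

def query_with_retry(run_query, retries=3, base_delay=0.01):
    """Handle the error when possible (retry); otherwise report it."""
    for attempt in range(retries):
        try:
            return run_query()
        except RuntimeError as e:
            if str(e) not in TRANSIENT_ERRORS or attempt == retries - 1:
                raise  # not transient (or out of retries): report it
            time.sleep(base_delay * 2 ** attempt)  # back off, then retry
```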
June 08, 2018
Propagation of Mistakes in Papers
I started searching, and there is a large number of papers that use the value 0.775351, but there is also a number of papers that use the value 0.77351. Judging by the number of Google hits for "Flajolet 0.77351" vs. "Flajolet 0.775351", the 0.77351 group seems to be somewhat larger, but both camps have a significant number of publications. Interestingly, not a single paper mentions both constants, and thus no paper explains what the correct constant should be.
In the end I repeated the constant computation as explained by Flajolet, and the correct value is 0.77351. We can even derive one digit more when using double arithmetic (i.e., 0.773516), but that makes no difference in practice. Thus, the original paper was correct.
But why do so many papers use the incorrect value 0.775351 then? My guess is that at some point somebody made a typo while writing a paper, introducing the superfluous digit 5, and that all other authors copied the constant from that paper without re-checking its value. I am not 100% sure what the origin of the mistake is. The incorrect value seems to appear first in the year 2007, showing up in multiple publications from that year. Judging by publication date, the source seems to be this paper (also, it did not cite any other papers with the incorrect value, as far as I know). And everybody else just copied the constant from somewhere else, propagating it from paper to paper.
If you find this web page because you are searching for the correct Flajolet/Martin bias correction constant, I can assure you that the original paper was correct, and that the value is 0.77351. But you do not have to trust me on this, you can just repeat the computation yourself.
May 18, 2018
Writing to be read
There is a common struggle in the writing and maintenance of documentation, checklists, emails, guides, etc. Each provides immense value; a document may be the key to an important process. The goal is to remove barriers -- to encourage understanding and correct application of what has been noted -- without requiring a change in the character of the reader. That is, expect reading to be difficult and people to be lazy. Don't make things harder for your reader than need be.
Ignoring imperfections in the ideas transcribed into writing, there are a few particular aesthetic approaches I take to (hopefully) make my notes more effective. These ideas have been influenced by readings on writing, psychology, and user experience. In particular, I recommend On Writing Well, Thinking Fast and Slow, and Nielsen Norman research.
Language correctness
Spelling and grammatical correctness are low-hanging fruit. They are easy to achieve. Use full sentences, use punctuation, and capitalize appropriately. But don't be an unreasonable grammar stickler; language is flexible and always changing. Don't give anyone the opportunity to take your work less seriously by screwing up the basics.
Structuring sentences and paragraphs
Keep your sentences short. And avoid run-on sentences; they are always difficult to parse. If you use more than two commas in a sentence (aside from in lists), the sentence is terrible. Split it up. Commas are often used superfluously. Don't do that.
Remember that if a comma separates two sentences, you can separate them into two sentences with a period instead. And if you ever have a list containing another list, separate the outer list with semicolons instead of commas to provide better differentiation.
Keep your paragraphs short too. In primary school you may have learned to use 5-8 sentences per paragraph. Don't do so needlessly. 3-5 sentences can be perfectly appropriate. As both sentences and paragraphs get longer, they appear more intimidating and can discourage readers from continuing.
Make your line height 120-145% the height of the font. Increase the spacing between lines in a paragraph to make the paragraph less dense and more friendly.
Keep contrast high. Don't put very gray (or colored) text on a white background.
Additionally, a number of studies suggest that limiting the width of text increases readability. For best results, limit the width such that 50-75 characters appear per line of text.
Don't put checklists in paragraphs
If a document describes concrete steps that should be followed exactly and can be reasonably summarized, don't hide the steps within paragraphs of text. Instead use an ordered or unordered list to clearly enumerate the expectations. You can't expect a checklist to be followed when it is hidden within the sentences of a paragraph.
Structuring sections
Any document (regardless of the type) longer than 3-5 paragraphs should be broken into sub-sections with summarizing headers to aid scanning. Use the HTML id attribute to allow a direct link to a particular section in a long page. If the page has more than two sections or vertically flows beyond a single screen, consider adding a table of contents at the top of the page to allow the reader to find the exact section she needs.
Don't put large headers immediately next to each other. It is disruptive to have multiple lines of large text.
I almost completely avoid Github Markdown's h1/# tag because it is just too large and jarring relative to the rest of the text. It is often best for the flow of a Github Markdown document to stick to only h3-h4/###-#### tags for headers, using the h2/## tag for the document title.
In summary
The aesthetic flow of a document can help or hurt the experience of a reader consuming it. Good aesthetic "sense" in this regard can be boiled down to a few methods that primarily revolve around simplifying structure and facilitating the rewarding feeling of progress as a reader reads.
Writing is difficult and takes time to evolve helpfully. The dividends are paid when process is better followed and questions are readily clarified in writing without further human intervention. It is incumbent on those writing and maintaining to organize effectively and see confusion of the reader as fault of the document, not fault of the reader. It is easier to change something yourself than to expect others to change to accommodate you.
May 06, 2018
Writing a simple JSON parser
Writing a JSON parser is one of the easiest ways to get familiar with parsing techniques. The format is extremely simple. It's defined recursively so you get a slight challenge compared to, say, parsing Brainfuck; and you probably already use JSON. Aside from that last point, parsing S-expressions for Scheme might be an even simpler task.
If you'd just like to see the code for the library, pj, check it out on Github.
What parsing is and (typically) is not
Parsing is often broken up into two stages: lexical analysis and syntactic analysis. Lexical analysis breaks source input into the simplest decomposable elements of a language, called "tokens". Syntactic analysis (often itself called "parsing") receives the list of tokens and tries to find patterns in them that match the language being parsed.
Parsing does not determine semantic viability of an input source. Semantic viability of an input source might include whether or not a variable is defined before being used, whether a function is called with the correct arguments, or whether a variable can be declared a second time in some scope.
There are, of course, always variations in how people choose to parse and apply semantic rules, but I am assuming a "traditional" approach to explain the core concepts.
The JSON library's interface
Ultimately, there should be a from_string method that accepts a JSON-encoded string and returns the equivalent Python dictionary.
For example:
assert_equal(from_string('{"foo": 1}'),
             {"foo": 1})
Lexical analysis
Lexical analysis breaks down an input string into tokens. Comments and whitespace are often discarded during lexical analysis so you are left with a simpler input you can search for grammatical matches during the syntactic analysis.
Assuming a simple lexical analyzer, you might iterate over all the characters in an input string (or stream) and break them apart into fundamental, non-recursively defined language constructs such as integers, strings, and boolean literals. In particular, strings must be part of the lexical analysis because you cannot throw away whitespace without knowing that it is not part of a string.
In a helpful lexer, you keep track of the whitespace and comments you've skipped, as well as the current line number and file you are in, so that you can refer back to them at any stage in errors produced by analysis of the source. The V8 JavaScript engine recently became able to reproduce the exact source code of a function. This, at the very least, would need the help of a lexer to make possible.
Implementing a JSON lexer
The gist of the JSON lexer will be to iterate over the input source and try to find patterns of strings, numbers, booleans, nulls, or JSON syntax like left brackets and left braces, ultimately returning each of these elements as a list.
Here is what the lexer should return for an example input:
assert_equal(lex('{"foo": [1, 2, {"bar": 2}]}'),
['{', 'foo', ':', '[', 1, ',', 2, ',', '{', 'bar', ':', 2, '}', ']', '}'])
Here is what this logic might begin to look like:
def lex(string):
    tokens = []

    while len(string):
        json_string, string = lex_string(string)
        if json_string is not None:
            tokens.append(json_string)
            continue

        # TODO: lex booleans, nulls, numbers

        if string[0] in JSON_WHITESPACE:
            string = string[1:]
        elif string[0] in JSON_SYNTAX:
            tokens.append(string[0])
            string = string[1:]
        else:
            raise Exception('Unexpected character: {}'.format(string[0]))

    return tokens
The goal here is to try to match strings, numbers, booleans, and nulls and add them to the list of tokens. If none of these match, check if the character is whitespace and throw it away if so. Otherwise store it as a token if it is part of JSON syntax (like left brackets). Finally throw an exception if the character/string didn't match any of these patterns.
Let's extend the core logic here a little bit to support all the types and add the function stubs.
def lex_string(string):
    return None, string

def lex_number(string):
    return None, string

def lex_bool(string):
    return None, string

def lex_null(string):
    return None, string

def lex(string):
    tokens = []

    while len(string):
        json_string, string = lex_string(string)
        if json_string is not None:
            tokens.append(json_string)
            continue

        json_number, string = lex_number(string)
        if json_number is not None:
            tokens.append(json_number)
            continue

        json_bool, string = lex_bool(string)
        if json_bool is not None:
            tokens.append(json_bool)
            continue

        json_null, string = lex_null(string)
        if json_null is not None:
            tokens.append(None)
            continue

        if string[0] in JSON_WHITESPACE:
            string = string[1:]
        elif string[0] in JSON_SYNTAX:
            tokens.append(string[0])
            string = string[1:]
        else:
            raise Exception('Unexpected character: {}'.format(string[0]))

    return tokens
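The skeleton above refers to several constants that aren't defined in the snippet. A plausible set of definitions follows; the exact spellings in pj may differ, but these are the natural values:

```python
# Assumed definitions for the names used by the lexer above.
JSON_QUOTE = '"'
JSON_WHITESPACE = [' ', '\t', '\b', '\n', '\r']
JSON_SYNTAX = ['{', '}', '[', ']', ',', ':']

# Lengths of the literal tokens, used later by lex_bool and lex_null.
TRUE_LEN = len('true')    # 4
FALSE_LEN = len('false')  # 5
NULL_LEN = len('null')    # 4
```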
Lexing strings
For the lex_string function, the gist will be to check if the first character is a quote. If it is, iterate over the input string until you find an ending quote. If you don't find an initial quote, return None and the original string. If you find an initial quote and an ending quote, return the string within the quotes and the rest of the unchecked input string.
def lex_string(string):
    json_string = ''

    if string[0] == JSON_QUOTE:
        string = string[1:]
    else:
        return None, string

    for c in string:
        if c == JSON_QUOTE:
            return json_string, string[len(json_string)+1:]
        else:
            json_string += c

    raise Exception('Expected end-of-string quote')
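To see how it behaves, here is a standalone copy of the function (with the assumed JSON_QUOTE constant) and a couple of example calls:

```python
JSON_QUOTE = '"'  # assumed constant

def lex_string(string):
    # Same function as above, repeated so this snippet runs standalone.
    json_string = ''
    if string[0] == JSON_QUOTE:
        string = string[1:]
    else:
        return None, string
    for c in string:
        if c == JSON_QUOTE:
            return json_string, string[len(json_string)+1:]
        json_string += c
    raise Exception('Expected end-of-string quote')

print(lex_string('"foo": 1'))  # ('foo', ': 1')
print(lex_string('123'))       # (None, '123')
```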
Lexing numbers
For the lex_number function, the gist will be to iterate over the input until you find a character that cannot be part of a number. (This is, of course, a gross simplification, but being more accurate will be left as an exercise to the reader.) After finding a character that cannot be part of a number, either return a float or int if the characters you've accumulated number more than 0. Otherwise return None and the original string input.
def lex_number(string):
    json_number = ''

    number_characters = [str(d) for d in range(0, 10)] + ['-', 'e', '.']

    for c in string:
        if c in number_characters:
            json_number += c
        else:
            break

    rest = string[len(json_number):]

    if not len(json_number):
        return None, string

    if '.' in json_number:
        return float(json_number), rest

    return int(json_number), rest
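Again, a standalone copy with a few example calls shows the behavior, including the None case when no number is found:

```python
def lex_number(string):
    # Same function as above, repeated so this snippet runs standalone.
    json_number = ''
    number_characters = [str(d) for d in range(0, 10)] + ['-', 'e', '.']
    for c in string:
        if c in number_characters:
            json_number += c
        else:
            break
    rest = string[len(json_number):]
    if not len(json_number):
        return None, string
    if '.' in json_number:
        return float(json_number), rest
    return int(json_number), rest

print(lex_number('123, 4'))  # (123, ', 4')
print(lex_number('1.5]'))    # (1.5, ']')
print(lex_number('abc'))     # (None, 'abc')
```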
Lexing booleans and nulls
Finding boolean and null values is a very simple string match.
def lex_bool(string):
    string_len = len(string)

    if string_len >= TRUE_LEN and \
       string[:TRUE_LEN] == 'true':
        return True, string[TRUE_LEN:]
    elif string_len >= FALSE_LEN and \
         string[:FALSE_LEN] == 'false':
        return False, string[FALSE_LEN:]

    return None, string

def lex_null(string):
    string_len = len(string)

    if string_len >= NULL_LEN and \
       string[:NULL_LEN] == 'null':
        return True, string[NULL_LEN:]

    return None, string
And now the lexer code is done! See the pj/lexer.py for the code as a whole.
Syntactic analysis
The syntax analyzer's (basic) job is to iterate over a one-dimensional list of tokens and match groups of tokens up to pieces of the language according to the definition of the language. If, at any point during syntactic analysis, the parser cannot match the current set of tokens up to a valid grammar of the language, the parser will fail and possibly give you useful information as to what you gave, where, and what it expected from you.
Implementing a JSON parser
The gist of the JSON parser will be to iterate over the tokens received after a call to lex and try to match the tokens to objects, lists, or plain values.
Here is what the parser should return for an example input:
tokens = lex('{"foo": [1, 2, {"bar": 2}]}')
assert_equal(tokens,
             ['{', 'foo', ':', '[', 1, ',', 2, ',', '{', 'bar', ':', 2, '}', ']', '}'])
assert_equal(parse(tokens),
             {'foo': [1, 2, {'bar': 2}]})
Here is what this logic might begin to look like:
def parse_array(tokens):
    return [], tokens

def parse_object(tokens):
    return {}, tokens

def parse(tokens):
    t = tokens[0]

    if t == JSON_LEFTBRACKET:
        return parse_array(tokens[1:])
    elif t == JSON_LEFTBRACE:
        return parse_object(tokens[1:])
    else:
        return t, tokens[1:]
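As with the lexer, the skeleton refers to syntax constants it doesn't define. Plausible definitions (the exact names in pj may differ) are:

```python
# Assumed definitions for the syntax tokens used by the parser above.
JSON_COMMA = ','
JSON_COLON = ':'
JSON_LEFTBRACKET = '['
JSON_RIGHTBRACKET = ']'
JSON_LEFTBRACE = '{'
JSON_RIGHTBRACE = '}'
```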
A key structural difference between this lexer and parser is that the lexer returns a one-dimensional array of tokens. Parsers are often defined recursively and return a recursive, tree-like object. Since JSON is a data serialization format instead of a language, the parser should produce objects in Python rather than a syntax tree on which you could perform more analysis (or code generation in the case of a compiler).
And, again, the benefit of having the lexical analysis happen independently of the parser is that both pieces of code are simpler and each is concerned with only specific elements.
Parsing arrays
Parsing arrays is a matter of parsing array members and expecting a comma token between them or a right bracket indicating the end of the array.
def parse_array(tokens):
    json_array = []

    t = tokens[0]
    if t == JSON_RIGHTBRACKET:
        return json_array, tokens[1:]

    while True:
        json, tokens = parse(tokens)
        json_array.append(json)

        t = tokens[0]
        if t == JSON_RIGHTBRACKET:
            return json_array, tokens[1:]
        elif t != JSON_COMMA:
            raise Exception('Expected comma after object in array')
        else:
            tokens = tokens[1:]

    raise Exception('Expected end-of-array bracket')
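Here is a standalone sketch exercising parse_array, with assumed constants and a pared-down parse that only dispatches nested arrays and plain values:

```python
JSON_COMMA = ','         # assumed constants
JSON_LEFTBRACKET = '['
JSON_RIGHTBRACKET = ']'

def parse(tokens):
    # Pared-down dispatcher: enough for arrays and plain values.
    t = tokens[0]
    if t == JSON_LEFTBRACKET:
        return parse_array(tokens[1:])
    return t, tokens[1:]

def parse_array(tokens):
    # Same function as above, repeated so this snippet runs standalone.
    json_array = []
    t = tokens[0]
    if t == JSON_RIGHTBRACKET:
        return json_array, tokens[1:]
    while True:
        json, tokens = parse(tokens)
        json_array.append(json)
        t = tokens[0]
        if t == JSON_RIGHTBRACKET:
            return json_array, tokens[1:]
        elif t != JSON_COMMA:
            raise Exception('Expected comma after object in array')
        else:
            tokens = tokens[1:]

print(parse_array([1, ',', 2, ']']))    # ([1, 2], [])
print(parse_array(['[', 1, ']', ']']))  # ([[1]], [])
```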
Parsing objects
Parsing objects is a matter of parsing a key-value pair internally separated by a colon and externally separated by a comma until you reach the end of the object.
def parse_object(tokens):
    json_object = {}

    t = tokens[0]
    if t == JSON_RIGHTBRACE:
        return json_object, tokens[1:]

    while True:
        json_key = tokens[0]
        if type(json_key) is str:
            tokens = tokens[1:]
        else:
            raise Exception('Expected string key, got: {}'.format(json_key))

        if tokens[0] != JSON_COLON:
            raise Exception('Expected colon after key in object, got: {}'.format(t))

        json_value, tokens = parse(tokens[1:])

        json_object[json_key] = json_value

        t = tokens[0]
        if t == JSON_RIGHTBRACE:
            return json_object, tokens[1:]
        elif t != JSON_COMMA:
            raise Exception('Expected comma after pair in object, got: {}'.format(t))

        tokens = tokens[1:]

    raise Exception('Expected end-of-object brace')
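And a standalone sketch exercising parse_object, again with assumed constants and a pared-down parse that only returns plain values:

```python
JSON_COMMA = ','       # assumed constants
JSON_COLON = ':'
JSON_RIGHTBRACE = '}'

def parse(tokens):
    # Pared-down dispatcher: enough for plain values as object values.
    return tokens[0], tokens[1:]

def parse_object(tokens):
    # Same function as above, repeated so this snippet runs standalone.
    json_object = {}
    t = tokens[0]
    if t == JSON_RIGHTBRACE:
        return json_object, tokens[1:]
    while True:
        json_key = tokens[0]
        if type(json_key) is str:
            tokens = tokens[1:]
        else:
            raise Exception('Expected string key, got: {}'.format(json_key))
        if tokens[0] != JSON_COLON:
            raise Exception('Expected colon after key in object, got: {}'.format(t))
        json_value, tokens = parse(tokens[1:])
        json_object[json_key] = json_value
        t = tokens[0]
        if t == JSON_RIGHTBRACE:
            return json_object, tokens[1:]
        elif t != JSON_COMMA:
            raise Exception('Expected comma after pair in object, got: {}'.format(t))
        tokens = tokens[1:]

print(parse_object(['bar', ':', 2, '}']))  # ({'bar': 2}, [])
```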
And now the parser code is done! See the pj/parser.py for the code as a whole.
Unifying the library
To provide the ideal interface, create the from_string function wrapping the lex and parse functions.
def from_string(string):
    tokens = lex(string)
    return parse(tokens)[0]
And the library is complete! (ish). Check out the project on Github for the full implementation including basic testing setup.
Appendix A: Single-step parsing
Some parsers choose to implement lexical and syntactic analysis in one stage. For some languages this can simplify the parsing stage entirely. Or, in more powerful languages like Common Lisp, it can allow you to dynamically extend the lexer and parser in one step with reader macros.
I wrote this library in Python to make it more accessible to a larger audience. However, many of the techniques used are more amenable to languages with pattern matching and support for monadic operations -- like Standard ML. If you are curious what this same code would look like in Standard ML, check out the JSON code in Ponyo.
I wrote a short post (and a corresponding Python library) explaining lexing and parsing with JSON https://t.co/3yEZlcU6i5 https://t.co/FbksvUO9aT #json #python
— Phil Eaton (@phil_eaton) May 6, 2018
April 28, 2018
Finishing up a FreeBSD experiment
I've been using FreeBSD as my daily driver at work since December. I've successfully done my job and I've learned a hell of a lot forcing myself on CURRENT... But there's been a number of issues with it that have made it difficult to keep using, so I replaced it with Arch Linux yesterday and I no longer have those issues. This is not the first time I've forced myself to run FreeBSD and it won't be the last.
The FreeBSD setup
I have a Dell Developer Edition. It employs full-disk encryption with ZFS. Not being a "disk-jockey" I cannot comment on how exhilarating an experience running ZFS is. It didn't cause me any trouble.
It has an Intel graphics card and the display server is X. I use the StumpWM window manager and the SLiM login manager. xscreensaver handles locking the screen, feh gives me background images, scrot gives me screenshots, and recordMyDesktop gives me video screen capture. This list should feel familiar to users of Arch Linux or other X-supported, bring-your-own-software operating systems/Linux distributions.
Software development
I primarily work on a web application with Node/PostgreSQL and React/SASS. I do all of this development locally on FreeBSD. I run other components of our system in a Vagrant-managed VirtualBox virtual machine.
Upgrading the system
Since I'm running CURRENT, I fetch the latest commit on Subversion and rebuild the FreeBSD system (kernel + user-land) each weekend to get the new hotness. This takes somewhere between 1-4 hours. I start the process Sunday morning and come back to it after lunch. After the system is compiled and installed, I update all the packages through the package manager and deal with fallout from incompatible kernel modules that send me in a crash/reboot loop on boot.
This is actually the part about running FreeBSD (CURRENT) I love the most. I've gotten more familiar with the development and distribution of kernel modules like the WiFi, Graphics, and VirtualBox drivers. I've learned a lot about the organization of the FreeBSD source code. And I've gotten some improvements merged into the FreeBSD Handbook on how to debug a core dump.
Issues with FreeBSD on my hardware
I installed CURRENT in December to get support for new Intel graphics drivers (which have since been backported to STABLE). The built-in Intel WiFi card is also new enough that it hadn't been backported to STABLE. My WiFi ultimately never got more than 2-4Mbps down on the same networks my Macbook Pro would get 120-250Mbps down. I even bought an older Realtek USB WiFi adapter and it fared no differently. My understanding is that this is because CURRENT turns on enough debug flags that the entire system is not really meant to be used except for by FreeBSD developers.
It would often end up taking 10-30 seconds for a git push to happen. It would take minutes to pull new Docker images, etc. This (like everything else) does not mean you cannot do work on FreeBSD CURRENT; it just makes it really annoying.
Appendix A - Headphones
I couldn't figure out the headphone jack at all. Configuring outputs via sysctl and device.hints is either really complicated or just presented really confusingly in the documentation. I posted a few times in #freebsd on Freenode and got eager assistance but ultimately couldn't get the headphone jack to produce anything without incredible distortion.
Of course Spotify has no FreeBSD client and I didn't want to try the Linux compatibility layer (which may have worked). I tried spoofing user agents for the Spotify web app in Chrome but couldn't find one that worked. (I still cannot get a working one on Linux either.) So I'd end up listening to Spotify on my phone, which would have been acceptable except that the studio headphones I decided I needed were immensely under-powered by my phone.
Appendix B - Yubikey
I couldn't figure out how to give myself non-root access to my Yubikey which I believe is the reason I ultimately wasn't able to make any use of it. Though admittedly I don't understand a whit of GPG/PGP or Yubikey itself.
Appendix C - bhyve
I really wanted to use bhyve as the hypervisor for my CentOS virtual machines instead of VirtualBox. So I spent 2-3 weekends trying to get it working as a backend for Vagrant. Unfortunately the best "supported" way of doing this is to manually mutate VirtualBox-based Vagrant boxes and that just repeatedly didn't work for me.
When I tried using bhyve directly I couldn't get networking right. Presumably this is because NAT doesn't work well with wireless interfaces... And I hadn't put in enough weekends to understand setting up proxy rules correctly.
Appendix D - Synaptics
It is my understanding that FreeBSD has its own custom Synaptics drivers and configuration interfaces. Whether that is the case or not, the documentation is a nightmare and while I would have loved to punt to a graphical interface to prevent from fat-palming the touchpad every 30 seconds, none of the graphical configuration tools seemed to work.
A few weeks ago I think I finally got the synaptics support on but I couldn't scroll or select text anymore. I also had to disable synaptics, restart X, enable synaptics, and restart X on each boot for it to successfully register the mouse. I meant to post in #freebsd on Freenode where I probably would have found a solution but :shrugs:.
Appendix E - Sleep
Well, sleep doesn't really work on any modern operating system.
FreeBSD is awesome
I enjoy picking on my setup, but it should be impressive that you can do real-world work on FreeBSD. If I had a 3-4 year old laptop instead of a 1-2 year old laptop, most of my issues would be solved.
Here are some reasons to like FreeBSD.
Less competition
This is kind of stupid. But it's easier to find work to do (e.g. docs to fix, bugs to report, ports to add/update, drivers to test) on FreeBSD. I'm really disappointed to be back on Linux because I like being closer to the community and knowing there are ways I can contribute and learn. It's difficult to find the right combination of fending/learning for yourself and achieving a certain level of productivity.
Package management (culture)
Rolling packages are really important to me as a developer. When I've run Ubuntu and Debian desktops in the past, I typically built 5-15 major (to my workflow) components from source myself. This is annoying. Rolling package systems are both easier to use and easier to contribute to... The latter point may be a coincidence.
In FreeBSD, packages are rolling and the base system (kernel + userland) is released every year or two if you run the recommended/supported "flavors" of FreeBSD (i.e. not CURRENT). If you're running CURRENT then everything is rolling.
Packages are binary, but you can build them from source if needed.
Source
FreeBSD has an older code base than Linux does but still manages to be much better organized. OpenBSD and Minix are even better organized but I don't consider them in the grouping as mainstream general-purpose operating systems like FreeBSD and Linux. Linux is an awful mess and is very intimidating, though I hope to get over that.
Old-school interfaces
There's no systemd, so starting X is as simple as startx (but you can enable the login manager service to have it launch on boot). You configure your network interfaces via ifconfig, wpa_supplicant, and dhclient.
Alternatives
PCBSD or TrueOS may be a good option for desktop users but something about the project turns me off (maybe it's the scroll-jacking website).
Picking Arch Linux
In any case, I decided it was time to stop waiting for git push to finish. I had run Gentoo at work for 3-4 months before I installed FreeBSD. But I still had nightmares of resolving dependencies during upgrades. I needed a binary package manager (not hard to find) and a rolling release system.
Installing Arch stinks
Many of my old coworkers at Linode run Arch Linux at home so I've looked into it a few times. It absolutely meets my rolling release and binary packaging needs. But I've been through the installation once before (and I've been through Gentoo's) and loathed the minutes-long effort required to set up full-disk encryption. Also, systemd? :(
How about Void Linux?
Void Linux looked promising and avoids systemd (which legitimately adds complexity and new tools to learn for desktop users with graphics and WiFi/DHCP networking). It has a rolling release system and binary packages, but overall didn't seem popular enough. I worried I'd be in the same boat as in Debian/Ubuntu building lots of packages myself.
What about Arch-based distros?
Eventually I realized Antergos and Manjaro are two (Distrowatch-rated) popular distributions that are based on Arch and would provide me with the installer I really wanted. I read more about Manjaro and found it was pretty divergent from Arch. That didn't sound appealing. Divergent distributions like Manjaro and Mint exist to cause trouble. Antergos, on the other hand, seemed to be a thin layer around Arch including a graphical installer and its own few package repositories. It seemed easy enough to remove after the installation was finished.
Antergos Linux
I ran the Antergos installer and, the first time around, my touchpad didn't work at all. I tried a USB mouse (which, to be honest, may have been broken anyway) but it didn't seem to be recognized. I rebooted and my touchpad worked.
I tried to configure WiFi using the graphical NetworkManager provided but it was super buggy. Menus kept expanding and contracting as I moused over items. And it ultimately never prompted me for a password to the locked networks around me. (It showed lock icons beside the locked networks.)
I spent half an hour trying to configure the WiFi manually. After I got it working (and "learned" all the fun new modern tools like ip, iw, dhcpcd, iwconfig, and systemd networking), the Antergos installer would crash at the last step with some error related to not being able to update itself.
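For the curious, the manual dance went roughly like this (wlp2s0 is a hypothetical interface name, and the SSID and passphrase are placeholders):

```shell
ip link                                  # find the wireless interface name
ip link set wlp2s0 up                    # bring the interface up
iw dev wlp2s0 scan | grep SSID           # scan for nearby networks
wpa_passphrase "MySSID" "passphrase" > /etc/wpa_supplicant.conf
wpa_supplicant -B -i wlp2s0 -c /etc/wpa_supplicant.conf
dhcpcd wlp2s0                            # grab a DHCP lease
```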
At this point I gave up. The Antergos installer was half-baked, buggy, and was getting me nowhere.
Anarchy Linux
Still loath to spend a few minutes configuring disk encryption manually, I interneted until I found Anarchy Linux (which used to be Arch Anywhere).
This installer seemed even more promising. It's a TUI installer, so there's no need for a mouse, and there are more desktop environments to pick from (including i3 and Sway) or to avoid entirely.
It was a little concerning that Anarchy Linux also intends to be its own divergent Arch-based distribution, but in the meantime it still offers support for installing vanilla Arch.
It worked.
Life on Arch
I copied over all my configs from my FreeBSD setup and they all worked. That's pretty nice (and speaks to the general compatibility of software between Linux and FreeBSD): StumpWM, SLiM, scrot, xscreensaver, feh, Emacs, Tmux, ssh, kubectl, font settings, keyboard bindings, etc.
Getting Powerline working was a little weird. The powerline and powerline-fonts packages don't seem to install patched fonts (e.g. Noto Sans for Powerline). I prefer to use these rather than the alternative of specifying multiple fonts as fallbacks, because I have font settings in multiple places (e.g. .Xresources, .emacs, etc.) and the syntax varies in each config. So ultimately I cloned the github.com/powerline/fonts repo and ran the install.sh script there to get the patched fonts.
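In case it helps anyone, the steps were just (assuming git and fontconfig are already installed):

```shell
git clone https://github.com/powerline/fonts.git
cd fonts
./install.sh                  # copies the patched fonts and refreshes the font cache
fc-list | grep -i powerline   # verify the patched fonts are now visible
```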
But hey, there's a Spotify client! It works! And the headphone jack just works after installing alsa-utils and running alsamixer. And my WiFi speed is 120Mbps-250Mbps down on all the right networks!
I can live with this.
Random background
Each time I join a new company, I try to use the change as an excuse to force myself to try different workflows and learn something new tangential to the work I actually do. I'd been a Vim and Ubuntu desktop user since high school. In 2015, I took a break from work on the East Coast to live in a school bus in Silver City, New Mexico. I swapped out my Ubuntu and Vim dev setup for FreeBSD and Emacs. I kept GNOME 3 because I liked the aesthetic. I spent 6 months with this setup, forcing myself to use it as my daily driver while doing full-stack, contract web development gigs.
In 2016, I joined Linode and took up the company Macbook Pro. I wasn't as comfortable at the time running Linux on my Macbook, but a determined coworker put Arch on his. I was still the only one running Emacs (everyone else used Vim or VS Code) for Python and React development.
I joined Capsule8 in late 2017 and put Gentoo on my Dell Developer Edition. Most people ran Ubuntu on the Dell or macOS. I'd never used Gentoo on a desktop before but liked the systemd-optional design and similarities to FreeBSD. I ran Gentoo for 3-4 months but was constantly breaking it during upgrades, and the monthly, full-system upgrades themselves took 1-2 days. I didn't have the chops or patience to deal with it.
So I used FreeBSD for 5 months and now I'm back on Linux.