The shell is at the heart of Unix. It's the glue that makes all the little Unix tools work together so well. Understanding it sheds light on many of Unix's important ideas, and writing our own is the best path to that understanding.
Earlier this year, at a place I worked, I decided to run a series of workshops on writing a Unix shell. A lot of questions had come up that writing a shell leads you to answer, as well as issues that suggested tenuous mental models of the shell and its scripting language.
A small sampling of those kinds of questions: how does `set -e` work? why doesn't ^C work? why do I need to use ^\ sometimes instead? why doesn't ^C work the way I'd expect on this bash `for` loop?

At this company, we had a regular Friday afternoon workshop/lecture series. I had previously tried to do an overview of Unix process relationships, but it felt too abstract. So, I tried to make it more concrete by getting everyone to actually implement a shell.
Initially, this was just a rough layout of what I thought I could cover in each session, and pointers to manpages. I never turned this into the full, DIY, self-paced tutorial I had hoped, but (in the spirit of release early, release often) I am opening up my work in progress at https://github.com/tokenrove/build-your-own-shell.
This isn't "finished", but if you're ambitious, you should be able to make something that passes all the tests. I decided it would be better to put it out there, even in rough form, than keep it sealed up. After all, a number of people enjoyed the workshop out of which this came.
(Caveat for macOS and *BSD users: there's still something wonky about the timing in the section that tests signals and job control; hopefully by the time you're reading this, I'll have it worked out, but if not, I apologize.)
In this post I'll reflect on some choices I made, and follow a few tangents that come up in the text but would be disruptive there.
I decided that, for this to be useful for self-study, it should contain an automated test suite. I love Tcl and expect, and had figured expect would be a natural tool for testing the interactive components of shells. I took a quick look at how other shells were testing themselves. Most were strictly non-interactive tests, using shell scripts and comparing with expected output. A nice exception here is fish, which indeed uses expect for its interactive tests.
This makes sense, but I wanted to focus on interactive shells: in part because so many tutorials ignored the considerations of interactive shells, but also because I felt people would enjoy themselves more if they could use the shell they were writing directly.
I started with some tests edited from the output of `autoexpect`, but this turned out to be too fragile. Something I noticed in the first workshop was that people really enjoyed customizing their prompt; this should be no surprise (prompt customization is a perennial time-wasting activity in any shell), but it meant I'd have to be careful about how I matched outputs in tests. In particular, I couldn't really depend on detecting and matching the prompt.
The other tricky thing is that I couldn't use any feature in the tests that hadn't been developed yet, so using conditionals or echoing `$?` wasn't possible in the early tests.
I considered writing a wrapper using ptrace(2) that would watch for all `fork`/`execve`/`wait` syscalls from the shell and its children, and print those in a form easily consumed by a test harness (this seemed easier to do than cleaning up the output of `strace`), but things like prompts that exec `git` every time, as well as `ptrace`'s noted stubbiness on macOS, prevented me from going further with this.
So that's where the workshop sat for a long time, until I finally decided to use a little test description language in place of expect scripts directly. So now a typical test might look like:
```
→ true || false || echo-rot13 foo⏎
≠ sbb
→ false || true && echo-rot13 foo⏎
← sbb
→ exit 42
☠ 42
```
For whatever reason, expressing things this way allowed me to finally write out all the tests I had intended to have, without focusing too much on the implementation of the test harness. Then I wrote some Tcl to interpret these files.
I decided to go with string matching in the output, which is not particularly robust, but is simple. Because of discrepancies between how different shells and TTY drivers draw things, it can be prone to matching the echoed input as the output if one isn't careful. There are also some timing issues; the script written by `autoexpect` suggests inserting a 100ms delay between each keystroke sent, but this makes the tests cripplingly slow; I'm still trying to find a tuning that is reliable across systems but speedy enough to be usable.
I decided that `bash` and `mksh` should pass all the tests, and `cat` should fail every test. There's nothing worse than a test that fails to actually test something. This reminds me of the admonishment "don't try to do what a corpse can do better": goals phrased in the negative (like "stop reading Hacker News") are hard to achieve — the dead (or `cat`) will always do them better than you. Positive goals (and tests) are more actionable.
There are still some timing issues on different platforms, but I don't regret making the simple choice for now.
Doing the workshop led me to think about minimizing shell builtins. One of the questions that comes up a lot is why `cd` needs to be a builtin; what doesn't come up until one is much deeper into pipelines and job control is what a pain builtins are, in how they interact with the rest of the shell's features. It would be nice to get rid of them.
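The `cd` case is easy to see concretely; a quick sketch (nothing here is specific to my workshop):

```shell
# Every external command runs in a child process, and chdir(2) only
# affects the calling process, so an external cd could never work.
before=$(pwd)
sh -c 'cd /tmp && pwd'    # the child moves to /tmp and says so
[ "$(pwd)" = "$before" ] && echo "parent unchanged"
```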
There are some commands which are builtins only to make them fast, like `echo`, `true`, and `false`. These usually have equivalents in `/bin` already.
Some builtins are required because they modify the shell's own environment: `cd`, `exit`, `fg`, `bg`, `jobs`, `exec`, `wait`, `ulimit`. (This is excluding really tricky, impractical things, like using shared memory, `process_vm_writev`, or `ptrace` to modify the shell from an outside process.)¹
To prove a point, you could take functional programming to an extreme and have an immutable shell where `cd` executes a new shell in the chosen directory, but some of the others are probably not possible in the presence of typical job control.²
If we take this line of thought further, we can try externalizing some of the shell's operators. Conditional execution is interesting. How about `&&` and `||`? Syntactically, we probably can't pull these off as external commands, but we could provide commands `and` and `or` which take commands to execute.
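Here's one possible shape for `and` (a sketch; I'm assuming each argument is a complete command handed to `sh -c`, so quoting falls on the caller):

```shell
#!/bin/sh
# and: run each argument as a command, stopping at the first failure
# and propagating its exit status.
for cmd in "$@"; do
    sh -c "$cmd" || exit $?
done
```

So `and 'test -f /etc/passwd' 'echo exists'` behaves like `test -f /etc/passwd && echo exists`; an `or` would instead stop at the first success.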
Implementing `if` is an obvious next step from `and` and `or`. Now we can implement `while`, although we'd have to be careful about how we handle the environment if we wanted to handle many typical uses of `while`.
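One possible external `if`, again assuming each argument is a command string passed to `sh -c`:

```shell
#!/bin/sh
# if COND THEN [ELSE]: run COND; on success run THEN, otherwise run
# ELSE (defaulting to a no-op when ELSE is omitted).
if sh -c "$1"; then
    exec sh -c "$2"
else
    exec sh -c "${3:-true}"
fi
```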
The `for` loop almost already exists in this form, as `xargs`. We would probably want to provide both a sequential version, where the environment for each iteration depends on the previous, and a parallel version where everything can run at the same time.
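For instance, using only standard `xargs` flags (`-n1` runs the command once per argument, `-P` sets how many invocations run at once):

```shell
# roughly: for f in a b c; do echo item "$f"; done
printf '%s\n' a b c | xargs -n1 echo item

# the parallel flavour: up to four iterations at once, order not guaranteed
printf '%s\n' a b c | xargs -n1 -P4 echo item
```

Note that neither carries environment changes between iterations, which is exactly the caveat above.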
Note that for most of these approaches to be practical, you need quoting and escaping mechanisms that aren't too cumbersome. There seems to be a close parallel with macro facilities in languages like Lisp.
At the extreme side of cumbersome quoting would be `case`, which you'd probably want to have take its input from a heredoc.
I was originally going to write a proof of concept of this (called "builtouts"), but researching it led me to the intriguing execline "shell", which has already done this, and explored this space rather nicely.
One thing that `execline` doesn't seem to do is implement something resembling real job control. If `bg` executes a command without waiting and then re-executes the shell with a suitable variable set (to the PGID of this job), the shell on each execution can check this variable to see what jobs are still alive; the `jobs` command can print the contents of this variable; the `fg` command just becomes `tcsetpgrp` and `wait` with the PGID of the current job. For an interactive shell, the tricky thing is probably making sure that `bg`'s children don't end up in an orphaned process group.
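The bookkeeping half of this scheme can be sketched even in portable shell (the `tcsetpgrp` half can't be; `JOBS` is a variable name I've invented, and I'm recording plain PIDs where the scheme above would record PGIDs):

```shell
# bg: launch without waiting, remembering the process id for later
sleep 300 &
JOBS="$JOBS $!"

# jobs: kill -0 delivers no signal, it only reports whether the
# process still exists
for pid in $JOBS; do
    kill -0 "$pid" 2>/dev/null && echo "running: $pid"
done
```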
A lot of these programs end up having to deal with quoting. Is there a way to take this further and handle quoting in its own program? For fixed-arity programs (like `if`), we can imagine an `unquote` helper that calls a subsidiary program with, first, the fixed remaining arguments, and then all of the original quoted argument, expanded, as the remaining arguments.
As glob(7) notes:

> Long ago, in UNIX V6, there was a program /etc/glob that would expand wildcard patterns. Soon afterward this became a shell built-in.
Luckily, the source is available in Diomidis Spinellis's unix-history-repo, and we can see that it does this same kind of chain loading, executing its first argument with the rest of its arguments expanded according to the globbing rules.
I especially enjoy the extremely primitive path search and shell script support.
(Found object engineering, often called cargo cult programming.)
Now we get to the inflammatory bits, for those who kept reading.
Stack Overflow modernized, but did not create, the practice of assembling Frankenstein programs from poorly understood and imitated examples, but I think no language has been more greatly affected by this than shell, as evidenced by the bizarre ready-made shell scripts one can encounter almost everywhere. Sometimes, the evolution of these patterns reminds me of semantic drift in languages.
A lot of constructs are poorly understood and misused. I'm not blaming people, though; part of the problem is that I can't easily point to a single, modern reference work that someone should read before writing shell scripts. And, since shell scripts often feel like "configuration" rather than "programming", I imagine people don't even think about learning shell as a programming language.
Writing a shell helps disabuse people of some common confusions, for example that:

- `if`, `while`, et cetera are something magical;
- `export FOO=x` repeatedly does something.

(Don't forget to use shellcheck and checkbashisms everywhere!)
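The `export` one is quick to demonstrate: the export attribute sticks to the variable, so later plain assignments are still passed to children without re-exporting:

```shell
export FOO=1
FOO=2                  # no re-export needed: FOO keeps its export attribute
sh -c 'echo "$FOO"'    # the child sees 2
```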
In the workshop, I cite the following motivations for writing a shell:
I've already touched on the first two, but the third is maybe less obvious. The shell remains a ubiquitous interface, decades after we imagined other modes of interaction would replace it. The field is ripe with opportunities for improvements.
There are a lot of people exploring this space in interesting ways, but I think there's room for so much more.
A lot of existing tutorials focus on the non-interactive case, and I think people will have more fun if they build a shell they can use interactively.
Aside from the interactive case, a lot of infrastructure is held together with shell scripts.
There's a commonly held belief that scripting languages like Perl, Ruby, and Python are complete replacements for shell scripting. My own experience is that these languages lack the expressive tools of the shell for working with pipelines, exit statuses, redirections, and so on, and the replacement code often suffers for it.
So, I feel there's still room for new tools in this space, too.
If you've been thinking about writing a shell for a while and haven't gotten around to it, why not try my workshop or any of the tutorials it links to?
1. POSIX avoids dictating exactly what must be a builtin, but does specify that the following commands must be found regardless of whether they are in `$PATH`:

   `alias bg cd command false fc fg getopts jobs kill newgrp pwd read true umask unalias wait`

   Most of these have something to do with the shell's internal state, but not all.
2. This is a kind of chain loading, sometimes called Bernstein chaining. There's a lovely discussion of this in Andy Chu's Shell has a Forth-like quality. (The entire oil shell blog is full of great stuff.)