Unix Tools Today

I learned Unix almost 30 years ago, while attending graduate school in the early 90s, from a now long-obsolete book entitled “Unix for the Impatient”.

Some of the tools and commands I learned back then have long since become irrelevant (ftp, telnet, cvs, biff — remember biff?). Others, although long in the tooth, continue to serve me well every day (emacs, tcsh, cc). And yet a third group seems to be more important than ever (such as tar, which is the basis for Docker images).

But what stands out to me is the group of new tools that did not even exist back then. Many of them are not isolated commands, but constitute entire, quite complex ecosystems by themselves (like git).

It is an amusing question for an idle afternoon: what would the table of contents look like for a contemporary book that wants to teach not only some commands, but “the Unix mindset”? Here are some ideas for new general-purpose tools (explicitly omitting strictly developer or devops tools, all languages, build systems, and deployment and orchestration systems).

  • zsh and fish
  • sudo
  • ssh and rsync
  • systemd (because it has subsumed and replaced old standbys like cron and ntp)
  • git and github
  • apt and the public repos
  • docker (and dockerhub)
  • busybox
  • markdown and markdown tools (e.g. pandoc)
  • jq and other JSON tools

All of these topics are new; most of them did not even exist until quite recently. Yet all of them still embody that intangible “Unix Philosophy”: relatively transparent tools that can be combined (and often scripted) to accomplish goals outside their original area of application.
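
For instance, a one-liner in that spirit (the file releases.json and its “name” field are hypothetical) chains a new tool with two old ones: jq extracts a field from structured data, while sort and head do what they have always done:

    # list the first five distinct names found in a JSON array of objects
    # (releases.json is assumed to hold an array of objects with a "name" key)
    jq -r '.[].name' releases.json | sort -u | head -5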

But not everything is new. Largely unchanged is the set of basic system and filesystem commands: ls, mv, cp, rm; cat, head, tail; ps, kill, top. But even here there are developments and new entrants: di deserves a space next to du and df. And I wonder whether, even for a local copy, rsync might now be preferable to cp -a.
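
A minimal sketch of the comparison (src and dest are hypothetical paths): rsync’s -a flag mirrors cp -a’s archive semantics, with the bonus that a second run copies only what has changed:

    # archive copy: permissions, timestamps, and symlinks preserved
    cp -a src dest

    # the rsync equivalent (trailing slash: copy src's contents into dest);
    # re-running it transfers only the files that have changed
    rsync -a src/ dest/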

I’m ambivalent about the set of tools that work on plain text files: sort, grep, awk, regexes. Using them, one can still easily put colleagues working with spreadsheets to shame. Nevertheless, these tools are becoming less relevant as the traditional, line-oriented text file is replaced by more structured formats, such as JSON.
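
A contrived illustration (sales.csv and its column layout are invented, with no header row): grouping and summing a column, a pivot table’s worth of spreadsheet work, is a single awk pass:

    # sum column 3 (amount) grouped by column 1 (region) in a CSV
    awk -F, '{ totals[$1] += $3 } END { for (r in totals) print r, totals[r] }' sales.csv | sort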

The arrival of “the cloud” also offers new solutions to old problems: sort, for instance, never had a problem sorting files that would not fit into memory (and quickly, too). This used to be a lifesaver, but it has become much less of one, because today it might be more expedient to spin up a high-memory virtual machine in whatever cloud you prefer, sort the file there, and take the machine down again.
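
For the record, GNU sort still exposes the knobs that made this work (huge.log is a hypothetical file): a bounded in-memory buffer, with temporary runs spilled to disk and merged at the end:

    # sort a file larger than RAM: cap the buffer at 1 GiB,
    # spill temporary runs to /var/tmp, and merge into the output
    sort -S 1G -T /var/tmp -o huge.sorted huge.log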

Finally, something else that has changed the way we work is the general availability of public repositories, starting with the repos for Debian apt packages. A “software installation” has become something dynamic: more a momentary, transient state than something essentially fixed. The list of course also includes github, dockerhub, and the various CDNs that provide third-party access to JavaScript libraries: a piece of common-good infrastructure that is starting to be taken as much for granted as public DNS servers.