11 Other programming languages
Having seen the design of scsh, we can now compare it to other approaches in some detail.
11.1 Functional languages
The design of scsh could be ported without much difficulty to any language that provides first-class procedures, garbage collection, and exceptions, such as COMMON LISP or ML. However, Scheme's syntactic extensibility (macros) plays an important role in making the shell features convenient to use. In this respect, Scheme and COMMON LISP are better choices than ML. Using the fork/pipe procedure with a series of closures involves more low-level detail than using scsh's (| pf1 ... pfn) process form with the closures implied. Good notations suppress unnecessary detail.
The payoff for using a language such as ML would come not with small shell scripts, but with larger programs, where the power provided by the module system and the static type checking would come into play.
11.2 Shells
Traditional Unix shells, such as sh, have no advantage at all as scripting languages.
Escaping the least common denominator trap
One of the attractions of scsh is that it is a Unix shell that isn't constrained by the limits of Unix's uniform "least common denominator" representation of data as a text string. Since the standard medium of interchange at the shell level is ASCII byte strings, shell programmers are forced to parse and reparse data, often with tools of limited power. For example, to determine the number of files in a directory, a shell programmer typically uses an expression of the form ls | wc -l. This traditional idiom is in fact buggy: Unix files are allowed to contain newlines in their names, which would defeat the simple wc parser. Scsh, on the other hand, gives the programmer direct access to the system calls, and employs a much richer set of data structures. Scsh's directory-files procedure returns a list of strings, directly taken from the system call. There is no possibility of a parsing error.
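The difference between counting list elements and counting lines of text can be illustrated outside scsh. The following is a minimal sketch in Python, not scsh code: os.listdir stands in for directory-files, and the helper names count_entries and count_entries_by_lines are hypothetical. It assumes a Unix file system that permits newlines in file names.

```python
import os
import tempfile


def count_entries(directory):
    """Count directory entries from the list the OS returns, with no text parsing."""
    # os.listdir returns one string per entry (excluding "." and ".."),
    # so a newline inside a name cannot be mistaken for a separator.
    return len(os.listdir(directory))


def count_entries_by_lines(directory):
    """Mimic the line-counting idiom: one name per line, then count newlines."""
    listing = "".join(name + "\n" for name in os.listdir(directory))
    return listing.count("\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # One ordinary name and one name that contains a newline.
        for name in ("plain.txt", "two\nlines.txt"):
            with open(os.path.join(tmp, name), "w"):
                pass
        print(count_entries(tmp))           # counts list elements
        print(count_entries_by_lines(tmp))  # counts lines of text
```

With the two files created above, the list-based count is 2, while the line-based count is 3, because the embedded newline is read as a record separator.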
As another example, consider the problem of determining if a file has its setuid bit set. The shell programmer must grep the text-string output of ls -l for the "s" character in the right position. Scsh gives the programmer direct access to the stat() system call, so that the question can be directly answered.
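As a rough illustration of answering the question from stat() data rather than from formatted text, here is a minimal Python sketch. It is an analogue, not scsh's own interface; the function names has_setuid_bit and is_setuid are hypothetical.

```python
import os
import stat


def has_setuid_bit(mode):
    """Return True if the set-user-ID bit is present in a numeric file mode."""
    return bool(mode & stat.S_ISUID)


def is_setuid(path):
    """Return True if the file at `path` has its setuid bit set."""
    # os.stat wraps the stat() system call; st_mode is an integer bit field,
    # so no ls -l style permission string has to be parsed.
    return has_setuid_bit(os.stat(path).st_mode)
```

For instance, has_setuid_bit(0o104755) is true and has_setuid_bit(0o100755) is false; the answer comes from a single bit test on the mode.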
Computation granularity and impedance matching
Sh and csh provide minimal computation facilities on the assumption that all real computation will happen in C programs invoked from the shell. This is a granularity assumption. As long as the individual units of computation are large, then the cost of starting up a separate program is amortised over the actual computation. However, when the user wants to do something simple (e.g., split an X $DISPLAY string at the colon, count the number of files in a directory, or lowercase a string), then the overhead of program invocation swamps the trivial computation being performed.
One advantage of using a real programming language for the shell language is that we can get a wider-range "impedance match" of computation to process overhead. Simple computations can be done in the shell; large-grain computations can still be spawned off to other programs if necessary.
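The granularity point can be sketched as follows, again in Python rather than scsh. The function names are hypothetical, the display string format "host:display.screen" is assumed, and lowercase_via_tr assumes a POSIX tr program on the PATH. The first two functions do their work in-process; the third pays for a fork and exec to perform the same trivial computation.

```python
import subprocess


def split_display(display):
    """Split an X display string such as "localhost:10.0" at the first colon."""
    host, _, rest = display.partition(":")
    return host, rest


def lowercase(text):
    """Lowercase a string in-process; no program is started."""
    return text.lower()


def lowercase_via_tr(text):
    """Lowercase ASCII letters by spawning tr, for comparison only."""
    # A whole process is created and torn down to transform a few bytes.
    result = subprocess.run(
        ["tr", "A-Z", "a-z"],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Both lowercase variants return the same string for ASCII input; only the spawned one carries process start-up cost, which is the overhead described above.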
11.3 New-generation scripting languages
A newer generation of scripting languages has been supplanting sh in Unix. Systems such as perl and tcl provide many of the advantages of scsh for programming shell scripts [perl, tcl]. However, they are still limited by weak linguistic features. Perl and tcl still deal with the world primarily in terms of strings, which is both inefficient and expressively limiting. Scsh makes the full range of Scheme data types available to the programmer: lists, records, floating point numbers, procedures, and so forth. Further, the abstraction mechanisms in perl and tcl are also much more limited than Scheme's lexically scoped, first-class procedures and lambda expressions. As convenient as tcl and perl are, they are in no sense full-fledged general systems-programming languages: you would not, for example, want to write an optimizing compiler in tcl. Scsh is Scheme, hence a powerful, full-featured general programming tool.
It is, however, instructive to consider the reasons for the popular success of tcl and perl. I would argue that good design is necessary but insufficient for a successful tool. Tcl and perl are successful because they are more than just competently designed; critically, they are also available on the Net in turn-key forms, with solid documentation. A potential user can just download and compile them. Scheme, on the other hand, has existed in multiple mutually incompatible implementations that are not widely portable, do not portably address systems issues, and are frequently poorly documented. A contentious and standards-cautious Scheme community has not standardised on a record datatype or exception facility for the language, features critical for systems programming. Scheme solves the hard problems, but punts the necessary, simpler ones. This has made Scheme an impractical systems tool, banishing it to the realm of pedagogical programming languages. Scsh, together with Scheme 48, fills in these lacunae. Its facilities may not be the ultimate solutions, but they are usable technology: clean, consistent, portable and documented.