7 Lexical issues
Scsh's lexical syntax is not fully R4RS-compliant in two ways:
In scsh, symbol case is preserved by read and is significant on symbol comparison. This means
(run (less Readme))
displays the right file.
``-'' and ``+'' are allowed to begin symbols. So the following are legitimate symbols:
-O2 -geometry +Wn
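The two departures above can be checked at a scsh prompt. The following is a minimal sketch, assuming an interactive scsh session; the results noted in the comments follow from the rules just stated and have not been taken from the manual itself.

```scheme
;; Case is preserved by read and is significant in comparisons,
;; so these two symbols are distinct in scsh.
(eq? 'Readme 'readme)       ; false in scsh; an R4RS reader would treat them as the same symbol
(symbol->string 'Readme)    ; the string keeps its capital R

;; A leading - or + does not force a numeric reading.
(symbol? '-O2)              ; true: a compiler flag reads as a symbol
(symbol? '+Wn)              ; true
(number? '-2)               ; true: ordinary signed numbers are still numbers
```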
Scsh also extends R4RS lexical syntax in the following ways:
``|'' and ``.'' are symbol constituents. This allows | for the pipe symbol, and .. for the parent-directory symbol. (Of course, ``.'' alone is not a symbol, but a dotted-pair marker.)
A symbol may begin with a digit. So the following are legitimate symbols:
9x15 80x36-3+440
Strings are allowed to contain the ANSI C escape sequences such as \n and \161.
#! is a comment read-macro similar to ;. This is important for writing shell scripts.
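As an illustration of the extensions listed above, here is a short sketch, again assuming an interactive scsh session. The particular commands (ls, wc) are only examples, and the ASCII character set is assumed for the octal escape.

```scheme
;; | and . are symbol constituents, and a symbol may begin with a digit.
(symbol? '|)                  ; true: usable as the pipe symbol
(symbol? '..)                 ; true: the parent-directory name
(symbol? '9x15)               ; true: e.g. an X font name
(symbol? '80x36-3+440)        ; true: e.g. an X geometry spec

;; ANSI C escapes inside strings.
(string-length "a\nb")        ; 3: \n is a single newline character
(string=? "\161" "q")         ; true under ASCII: octal 161 is the character q

;; The pipe symbol in a process form: count the entries of the parent directory.
(run (| (ls ..) (wc -l)))
```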
The lexical details of scsh are perhaps a bit contentious. Extending the symbol syntax remains backwards compatible with existing correct R4RS code. Since flags to Unix programs always begin with a dash, not extending the syntax would have required the user to explicitly quote every flag to a program, as in
(run (cc "-O" "-o" "-c" main.c))
This is unacceptably obfuscatory, so the change was made to cover these sorts of common Unix flags.
More serious was the decision to make symbols read case-sensitively, which introduces a true backwards incompatibility with R4RS Scheme. This was a genuine case of clashing world-views: Unix's tokens are case-sensitive; Scheme's are not.
It is also unfortunate that the single-dot token, ``.'', is both a fundamental Unix file name and a deep, primitive syntactic token in Scheme. As a result, the following will not parse correctly in scsh:
(run/strings (find . -name *.c -print))
You must instead quote the dot:
(run/strings (find "." -name *.c -print))
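The sketch below shows why the bare dot is a problem and one alternative to writing the string literal inline. It assumes an interactive scsh session; the variable name dir is hypothetical, and the unquote form relies on process-form arguments being implicitly backquoted.

```scheme
;; Inside a list, a lone dot is the dotted-pair marker, not a symbol.
(pair? '(find . -name))       ; true: this reads as a pair, not a three-element list
(cdr '(find . -name))         ; the symbol -name, not a list

;; Alternative to an inline "." literal: bind the directory and unquote it.
(let ((dir "."))              ; dir is a hypothetical name
  (run/strings (find ,dir -name *.c -print)))
```

Binding the directory to a variable can also be convenient when the starting point of the search is computed rather than fixed.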