September 18th, 2006


shiver wisdom

Olin Shivers wrote a paper about the regular-expression system for scsh. The wisdom of the preamble alone makes the document worth reading:
There's a problem with tool design in the free software and academic community. The tool designers are usually people who are building tools for some larger goal. For example, let's take the case of someone who wants to do web hacking in Scheme. His Scheme system doesn't have a sockets interface, so he sits down and hacks one up for his particular Scheme implementation. Now, socket API's are not what this programmer is interested in; he wants to get on with things and hack the exciting stuff -- his real interest is Web services. So he does a quick 80% job, which is adequate to get him up and running, and then he's on to his orignal goal.

Unfortunately, his quickly-built socket interface isn't general. It just covers the bits this particular hacker needed for his applications. So the next guy that comes along and needs a socket interface can't use this one. Not only does it lack coverage, but the deep structure wasn't thought out well enough to allow for quality extension. So he does his own 80% implementation. Five hackers later, five different, incompatible, ungeneral implementations had been built. No one can use each others code.

This is the current situation for socket libraries in Common Lisp, and I have my own personal 80% solution. I've heard more than one hacker say they're working on a 100% solution; I hope at least one comes to fruition.

There's more great stuff. The "Discussion and design notes" section provides rationale for the features present in the system, and also discusses features intentionally absent from the system, giving a detailed survey of existing regular expression engines in the process. For example:

SRE notation does not provide lazy repeat forms, of the kind found in perl. Lazy repeat forms have problems. In principle, a regexp doesn't specify a matching algorithm, it specifies a set of strings. Lazy submatches violate this principle. Providing lazy submatches forces a particular matching algorithm: an NDFA-based backtracking search engine -- you can't use a fast DFA no-backtracking engine. I chose to restrict the notation to keep it algorithm-independent.

It makes me wish all library authors:

  • surveyed the field of existing solutions as thoroughly
  • grasped the problem space beyond their immediate need
  • provided rationale for intentionally missing features

This is above the normal stuff that people should do just to achieve a reasonable baseline of quality, like fully document the public interface. To paraphrase Chris Rock, some library authors are proud of stuff they're just supposed to do. "This library is fully documented." What do you want, a cookie? You're not supposed to release undocumented code, you low-expectation-having motherfucker! (Not that I'm any better.)

corman lisp 3.0

I don't use Corman Lisp personally, but version 3 just came out. See the release notes for a detailed listing of updates. Here are a few that look good to me:
  • Math and numeric code performance [...] All the math code is in Lisp and Corman Lisp assembler.
  • Numerous ANSI compatibility improvements
  • Integration with .NET (Edi Weitz' RDNZL package is included)
  • Most Lisp kernel function are now redefined in Lisp (or Corman Lisp assembler). They play by all the threading and GC rules, and stability is improved.