haskeem web site!What's haskeem? haskeem is (pretty damn
close to) Scheme in Haskell... however, if you're looking for an
industrial-strength Scheme interpreter, this might not be it!
It doesn't implement the full Scheme number tower, it's not properly
tail-recursive, it doesn't have syntax-modifying macros, it doesn't
have call-with-current-continuation. If you're a real schemer, you're
probably snickering by now... that's ok! haskeem is a toy
that's letting me learn more about both Scheme and Haskell, it's a lot
of fun to work on, and that's enough for me! I arbitrarily called the
first version I put onto the web "0.5"; I'm now at 0.6.9, which is
available here, all under GPL. (But see
also my comments below on doing it yourself: you'll learn far more if
you go through Jonathan Tang's tutorial Write
Yourself a Scheme in 48 Hours and write all of this yourself,
rather than just grabbing what I've got here and looking at it.)
What does it have?
definelambdaquotequasiquote, unquote,
unquote-splicingset!beginand, orif, cond, case,
when, untillet, let*, letrec,
letrec*delay, forcedoguardapply, eval,
load
#(...) including
special forms vector-set!, vector-fill!,
and vector-resize!trace and dump-bindingsHASKEEM_INITreadline; this makes it a
very pleasant working environment.What bugs do I know about? At this point I know of two bugs in stuff that's at least partially implemented:
rational infinities don't compare correctly: (> 1/0 -1/0)
returns #f whereas it ought to return #t; however,
comparing each of those infinities to a finite quantity does return the
right value. This is a GHC bug, I've checked it with a pure Haskell
program. It also manifests in a couple of other ways, with the other
comparison operators and with min and max.
Hm. Upon further reflection and reading of documentation, I think what I
am trying to do is actually abusing the Rational datatype in
Haskell; it's not designed to deal with infinities. So I think I should call
this my bug, not GHC's.
Floating-point NaNs are not handled correctly for a couple of operators:
**, min, and max. This is also a GHC
bug. There is apparently a recent draft proposal to allow min, and
max to return numbers rather than NaNs if some of the passed-in
values are good numbers and others are NaNs; however, I'm not sure GHC is
trying to achieve this, because its results are inconsistent.
If you have questions or comments about the stuff here, please contact me at korg@korgwal.com.
Enjoy! -- Uwe Hollerbach
March 5, 2008
Not so much haskell hacking in the last few days. I did finish the
nested quasiquote etc stuff. As expected, it was
straightforward but a little tedious. Now it returns all the proper
results for the example cases from R6RS.
I have tweaked the powerseries examples a little bit. I verified
the Bessel functions, added Bessel functions of the second kind
I_n, and simplified the (ps-recip) function: it
had previously needed three terms where it ought to have needed only
two. After a bit of looking, I found the problem: it wasn't in
(ps-recip) at all, but in (ps-mul): I wasn't
being lazy enough in that routine.
February 24, 2008
Only a little bit of hacking today: I added a (new-symbol)
special form; I'm starting to think about doing macros, and I think that'll
be necessary. I avoid name clashes with user-defined symbols by simply
generating the internal symbol with a leading space: the rest of the system
is perfectly happy with such a symbol, but the parser won't pass the space;
so the user can't accidentally generate such a symbol. Clashes between
multiple internally-generated symbols are avoided by simply sticking a
global counter into the symbol name: every time a new internal symbol is
manufactured, the counter gets incremented.
I added a non-gnu (and non-editing) version of readline, for those machines which don't happen to have gnu readline installed. It's commented out inside haskeem.hs.
I think I added the necessary bits to the Bessel function power series in powerseries.scm; at any rate, it correctly computes the first (non-trivial) zero of the first five Bessel functions.
February 23, 2008
Again, not too much hacking of the interpreter; more scheme hacking
the past few days... that is a problem with writing an
interpreter, you then want to use it! :-) With a few hints from Bulat
Ziganshin on the haskell-cafe mailing list, I did add the ability to
catch keyboard interrupts; so it's possible to type control-C at a
long-running calculation and get back to the lisp>
prompt. Previously, that would have just killed the interpreter, which
is kinda nasty. Thanks, Bulat!
The scheme hacking consisted mainly of some playing with power series. A few days ago, I was re-reading Higher-Order Perl by Mark Jason Dominus, and I decided to re-implement his section 6.7 on power series. The code is here, if you want to play with it; be aware that the Bessel function series is incomplete. The equivalent of his last example, the head of the power series for tan(x), is
lisp> (stream-head ps-tan 10)
(0 1 0 1/3 0 2/15 0 17/315 0 62/2835)
February 20, 2008
Not much hacking of the interpreter, just adding several
functions to the library: random-number generation, plus networking:
I can now write a TCP client and server in haskeem.
Here's a trivial server:
(define listener (listen-on 2500))
(write-string stderr "listening on port 2500\n")
(let ((remote #f)
(host "")
(port -1)
(ret '()))
(do ((i 1 (+ i 1)))
((> i 10) (close-port listener))
(set! ret (accept listener))
(set! remote (car ret))
(set! host (cadr ret))
(set! port (caddr ret))
(write-string stderr "got connection from " host #\linefeed)
(write-string remote "hello, " host ", how are you?\n")
(write-string remote "The current local time is " (localtime) #\linefeed)
(close-port remote)))
And a trivial client to go with it:
(define host "localhost")
(define port 2500)
(when (positive? (length args))
(set! host (car args))
(set! args (cdr args)))
(when (positive? (length args))
(set! port (string->number (car args)))
(set! args (cdr args)))
(define remote (connect-to host port))
(do ((spew (string-join-by " "
"Connecting to host" host
"port" (number->string port))))
((not spew) (write-string "all done!\n") (close-port remote))
(write-string spew #\linefeed)
(set! spew (read-line remote)))
A super-high-performance web server is probably not in the cards, but I think I could probably build a small web spider on top of this functionality... :-)
February 17, 2008
The quasiquoting unquoting operators are mostly working. I have
implemented the unwrapping I had left yesterday, and haskeem is
passing most of the test cases shown in R6RS. There are two areas
where it is not yet working: neither nested quasiquotes nor the caboose of a
dotted-list are properly handled yet. I think both of these are
straightforward but a bit involved: doing nested quasiquotes is simply a
matter of keeping track of the nesting, but that involves threading a
nesting count through the whole machinery of the evaluator, and that touches
a fair amount of code. Doing dotted-lists right is, I think, analogous to
properly handling them in the parser, and I've already got that
working. It's made a little tricky by JTang's original design decision,
which I adopted as well, of using Haskell lists as the basic unit of list
structure:
(a . (b . (c . (d . ()))))
is exactly and explicitly equivalent, according to R6RS, to
(a b c d)
as are all of these (plus a couple more):
(a b . (c d))
(a b . (c . (d . ())))
(a b c . (d . ()))
and that equivalence is not entirely natural when using Haskell lists (indeed, the parser presented in JTang's tutorial doesn't handle this correctly). So there is a bit of work required to correctly handle dotted-lists.
time passes...
The dotted-list problem is solved: indeed, it turned out not to be
analogous to the stuff in the parser, it turned out to be the
parser. If you examine the examples above, you'll see there is a simple rule
for how to turn dotted-lists into lists: if the thing in the caboose
position is itself a proper list, then the overall dotted-list object is
actually a proper list. That rule, applied repeatedly, is sufficient to turn
all of the above examples into the "canonical" list (a b c d),
and there is (was) code in the parser to do just that.
However... in the presence of unquote, that comes out
wrong. The test case from R6RS which checks this is
`((foo ,(- 10 3)) ,@(cdr '(c)) . ,(car '(cons)))
which turns into
(quasiquote ((foo (unquote (- 10 3))) (unquote-splicing (cdr (quote (c)))) . (unquote (car (quote (cons))))))
Examine the caboose: it's (unquote (car (quote
(cons))))... which is a two-element proper list; so the parser
promotes the overall expression to a proper list, the unquote
is in the middle of the list by itself, and then it all falls apart in the
evaluator.
So I have added a bit of special-purpose pattern matching to the code in
the parser that checks for lists in the caboose: if it's a list, but the
first item in the list is a symbol, and that symbol is either
unquote or unquote-splicing, then the promotion
gets skipped. And then the test succeeds!
February 16, 2008
I've added support for vectors: #(...). At first I was
going to use the Haskell Data.Array datatype, but
documentation for that is somewhere between nonexistent and crappy,
and what little I could make out from the sources seemed to suggest
that this was more for native datatypes: the equivalent of C's
int, double, etc. My LispVal
doesn't seem to fit in there. Fortunately, the Data.IntMap
datatype seemed very well suited to what I wanted, so that was the
route I chose. The R6RS merely says that
Vectors are heterogeneous structures whose elements are indexed by integers. A vector typically occupies less space than a list of the same length, and the average time needed to access a randomly chosen element is typically less for the vector than for the list.
which fits very well with Data.IntMap: that is implemented
using PATRICIA trees, and many of the operations take
O(min(n,W)) time, where n is the length of the
vector and W is the word size of the computer: 32 or 64. I
don't know what the constant out front is, and I don't care too much about
that; for large vectors, it's bound to be better than O(n) for
a list.
I have also begun to implement more of the quoting constructs:
(quote foo) was already present; now I'm adding
(quasiquote foo), (unquote foo), and
(unquote-splicing foo) -- so far, only the easy bits: the
abbreviations
’<datum>for(quote <datum>)
‘<datum>for(quasiquote <datum>)
,<datum>for(unquote <datum>)
,@<datum>for(unquote-splicing <datum>)
At least with the browser I'm using right now, the quote characters are
very hard to distinguish: ’, the character used in the abbreviation
for (quote <datum>), is the right single quotation mark,
located on the right of my keyboard, and ‘, the character used in the
abbreviation for (quasiquote <datum>), is the left single
quotation mark, located on the left of my keyboard. Actually, the keyboard
characters should probably be called apostrophe and backtick, respectively.
time passes...
These are almost working: there is still one extra level of
parentheses in the result, for example ‘(1 2 ,(+ 3 4))
should return (1 2 7), but actually returns (1 2
(7)). I think I understand why, and what to do about it... but that's
for tomorrow.
February 11, 2008
I've fixed the glitch in writing numbers in scientific notation; that was pretty straightforward, although I don't know how efficient my solution is. But efficiency isn't the primary target here...
I've also added a rather specific new type of comment to the parser: it now understands and ignores "#!" lines at the top. So I can have actual directly-executable scheme scripts:
% ls -lF try
-rwxr-xr-x 1 uwe users 134 Feb 11 17:05 try*
% cat try
#!/home/uwe/tools/haskeem
(set! args (cons "It's alive! It's ALIIIIIVVVE!" args))
(write-string (string-join-by " " args) #\linefeed)
% ./try
It's alive! It's ALIIIIIVVVE!
% ./try meow
It's alive! It's ALIIIIIVVVE! meow
So now, if I need to, I can compute symbolic derivatives in a cron job :-)
February 10, 2008
I have just received an object lesson in just how lazy Haskell is... I
was working on implementing printing numbers in scientific notation,
n.nnEnn, and when I thought I had that working, I started doing a couple of
tests in haskeem. I entered a couple of numbers which worked
fine, and then I entered 145e-2 which promptly crashed the
interpreter. I looked at the write-number routine and couldn't find
anything, so I started sneaking up on the problem: I entered the number in
such a way that I wasn't going to print its value at all
(define returns the value it assigned to the variable, and the
REPL prints that, so that wouldn't work; I had to write (begin (define
var 145e-2) #t) to get the value into the interpreter without
printing it), then I started querying it: is it an integer? no; is it a
rational? no; is it a real? yes; is it infinite? and the interpreter crashed
again. After a little while, I tracked the problem to the parser: I allow
the parser to return numbers of the form aaaEbb as integers, if there is no
decimal point in the "aaa" part and the "bb" part is non-negative: ie, 123e4
is the integer number 1230000. It turned out that I was insufficiently
careful in how I had implemented that, and this particular number 145e-2
satisfied the first part of the test, no decimal point in the "aaa" portion,
but not the second part, the "bb" portion is non-negative. So the parse
wasn't fully evaluated until well after the interpreter had
returned from the parser and had already processed several other
commands. In fact, the "is it infinite?" check was the first one where the
actual value mattered, as opposed to just the format. Wow! I knew Haskell is
lazy, but only in the front of my mind, it seems; my reptilian brainstem was
still thinking "oh, it's returned from the parse, so it's finished with
that" (well, to the extent that my reptilian brainstem thinks about
programming at all... :-) )
That's all fixed now, and scientific notation works, with still one small
glitch: if a number has more nonzero digits in its tail than the specified
precision, rounding occurs, and that rounding can of course propagate up
toward the head: so for example 0.9999 gets printed as 1.00
with two digits after the decimal point. If that happens when formatting a
number in scientific notation, 0.9999 gets formatted as
9.999e-1 when enough digits of precision are available, but as
10.00e-1 rather than as 1.00e0 when rounding
occurs. The value is correct, just the format is slightly off. This is
purely my bug, nothing to do with GHC's native number-printing routines at
all, and I understand where and why it's happening, I just haven't fixed it
yet.
February 9, 2008
I'm in the process of adding an optional precision specifier to
(number->string): it does the equivalent of C's %.Nf
now, for integers and rationals; I'm still working on the same for
floating-point numbers, and I haven't started %.Ne at all. For
integers and floating-point numbers, this isn't so hugely necessary (although
it's still pretty useful); but if you're doing some big calculation involving
rational numbers, say computing the sum of the reciprocals of the first
hundred primes, it's very much nicer to know that that result is about
2.106342121472621094 than that it is the ratio of two several-hundred-digit
numbers:
9924938317306578104794915348371074960828795860429929737290618944808506634743370796528436363734050330051393733303456260639596308477098909255081265596725817362940507726427713984550226766935610334519674156561744081975868623/ 4711930799906184953162487834760260422020574773409675520188634839616415335845034221205289256705544681972439104097777157991804380284218315038719444943990492579030720635990538452312528339864352999310398481791730017201031090
I dunno about you, gentle reader, but that makes my eyeballs bleed...
To compute that, I actually used the exception-handling stuff in kind of a neat fashion. Here's a bit of code: first define an initial table of the lowest couple of primes; we have to have 3 in there, because 2 is kind of an odd prime...
(define primes '(2 3))
Then define a routine to check for primality. This is the part I think is
cool: I could have a moderately complicated test to determine if I can exit
the loop, and then another bit of code to determine what result the routine
should return, but using a guard to catch some exceptions, and
then within the loop just raise an exception with the return value of the test
ends up being cleaner; and it doesn't interfere at all with the actual
exception of running off the end of the table: that one isn't caught by the
guard.
(define (is-prime? n)
(guard (err ((boolean? err) err))
(do ((pl primes (cdr pl))
(cur 0))
((null? pl) (raise "exhausted primes list!"))
(set! cur (car pl))
(if (= cur n) (raise #t))
(if (zero? (remainder n cur)) (raise #f))
(if (> (* cur cur) n) (raise #t)))))
Then this next routine computes the next prime past the end of the table and adds it to the table, until the size of the primes table is at least n.
(define (grow-primes-table n)
(do ((j (length primes) (+ j 1)))
((>= (length primes) n) primes)
(do ((i (+ 2 (last primes)) (+ 2 i)))
((is-prime? i) (set! primes (append primes (list i)))))))
And finally this is the routine that computes the sum of the reciprocals of
the first N primes. It makes use of the fact that the
grow-primes-table routine above returns the list of primes after
it has finished growing it.
(define (sum-recprimes n)
(apply + (map / (list-head (grow-primes-table n) n))))
If you invoke this with
(number->string (sum-recprimes 100))
you get the eyeball-bleed-inducer above, whereas if you invoke it with
(number->string (sum-recprimes 100) 10 18)
you get the much nicer 2.106...
February 2, 2008
Major reworking of error handling. I figured out a couple of
problems with the error handling, in particular error messages
generated in non-interactive mode are now
printed. I had that very wrong before: I was using
runIOThrows, a wrapper around catchError,
inside Control.Exception.try. In hindsight, it's
perfectly obvious that if the inner catcher is handling all the
errors, then the outer catcher won't have much of anything to
do... d'oh! <palm to forehead>
This does not yet address the exception-handling at the scheme level, but I think I understand Haskell's error handling much better now, so hopefully I'll get the scheme-level exception handling working now, too.
time passes...
Yes! new special form guard, with syntax as described in
R6RS. Check this out: first define a small function...
; the first 'test', "(begin...)", is only to print out the value of
; the exception that was thrown: that 'test' always returns #f
(define (tryit test)
(guard
(err ((begin (write-string "exception is '")
(display err)
(write-string "'\n")
#f) #t)
((and (string? err) (string=? err "meh"))
(write-string "caught 'meh'\n")
"I am the 'meh'-catcher")
((and (string? err) (string=? err "barf"))
(write-string "caught 'barf'... yuk!\n")
-42)
((and (string? err) (string=? err "yahoo!"))
(write-string "a little excitable today, aren't we...\n")
"US$44.6billion")
((eqv? err '(1 2))
(write-string "does not compute\n")
"daleks rule! (until their heads blow off)"))
(raise test)))
and then run some tests with it:
lisp> (load "guard.scm")
(lambda (test) ...)
lisp> (tryit "meh")
exception is '"meh"'
caught 'meh'
"I am the 'meh'-catcher" <-- return value
lisp> (tryit "barf")
exception is '"barf"'
caught 'barf'... yuk!
-42 <-- return value
lisp> (tryit "yahoo!")
exception is '"yahoo!"'
a little excitable today, aren't we...
"US$44.6billion" <-- return value
lisp> (tryit '(1 2))
exception is '(1 2)'
does not compute
"daleks rule! (until their heads blow off)" <-- return value
lisp> (tryit '(1 3))
exception is '(1 3)'
user exception (1 3) <-- NOT caught! this got printed by
the outermost level, the 'P' in 'REPL'
Notice that we can throw arbitrary lisp objects as exceptions. W00t! Izzat cool, or what? This mechanism will also catch system-level exceptions, ie, "file not found" kinds of errors; however, those will just show up as string values, like "nil.dat: openFile: does not exist (No such file or directory)". So processing those is a little messier, although still doable.
February 1, 2008
Not so much Haskell hacking, but a few small tweaks in stdlib.scm, plus one very nice usability feature: connect the REPL with gnu readline. That makes it very much nicer to use interactively! I'm hereby declaring that it's no longer (almost) Scheme in Haskell, it's just Scheme in Haskell.
January 28, 2008
I have finished the do syntactic form, although it's
not yet in the self-tests (by now 800+); but a small separate script
using it gives the right results. The exception-handling stuff, which
I thought would be straightforward, is proving less so: I apparently
don't understand enough Haskell yet, and I am getting various
type-mismatch error messages which I haven't yet managed to
untangle. Oh well...
January 26, 2008
I have added a bunch of functions for manipulating files and directories,
reading the environment, and so on, plus miscellaneous cleanups; for example,
it was giving a parser error message if the user just entered a blank line at
the prompt. That was correct, as far as the parser goes -- it expects to see
at least one scheme expression or object -- but kinda unfriendly. Also, a
special form to display the currently-active bindings:
dump-bindings.
January 22, 2008
I have added delay and force, which allow the
definition of lazily-evaluated infinite datastructures such as streams. For
example, define the natural numbers as follows:
(define natural-numbers
(letrec ((next
(lambda (n)
(cons n (delay (next (+ n 1)))))))
(next 1)))
ie, the integer 1 plus a promise to evaluate the rest of the natural numbers if it should become necessary; then it's possible to do
(define (stream-head strm n)
(if (<= n 0)
'()
(cons (car strm) (stream-head (force (cdr strm)) (- n 1)))))
(stream-head natural-numbers 20)
and get
(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20)
Very cool!
On one level, this was pretty trivial: add a new Delay
constructor to the LispVal stuff in lispdata.hs, then in
evaluator.hs add code to just return a Delay object when a
(delay foo) special form is seen, plus code to actually
evaluate what was stored if a (force bar) special form is
seen. However, that by itself fails one of the tests in R6RS:
with just that, it's possible to make two-timing promises that evaluate to
different results depending on when they are forced; that's not so nice!
The solution, adding a variable into which to store the computed value
and just return that from then on, was straightforward, too, except for
making the variable name. It doesn't really matter what that name is, as
long as it doesn't clash with anything else, but that "doesn't clash with
anything else" was tricky. I tried for a while to use the State
monad, but I couldn't get that to work. Eventually, I decided to just stick
stuff into the same environment that's passed along when the
Delay object was created, using a global variable defined at
startup in the top-level environment. That worked fine.
January 19, 2008
I've been interested in the Haskell programming language for a while, but was never really able to get into it. I wrote a couple of tiny toy programs, but that was about it, until I recently came across Jonathan Tang's most excellent Haskell tutorial Write Yourself a Scheme in 48 Hours. I decided that that looked like a lot of fun, so I started on it. By now, I've spent a fair bit more than 48 hours on it, but I've also gone a ways beyond what JTang described. I started during the christmas 2007 holiday, and have, as of late January 2008, spent the idle hours of a couple of weeks at it. It's starting to be a neat little toy, so I'm putting it onto the web here for others to play with or possibly learn from.
Before I go any farther, a couple of points:
OK. With that out of the way, there are a couple of points where I am actually and intentionally critical of what JTang wrote, and where I think he should have done better:
The first is an actual programming issue: the parser JTang described, and the one you will write, should support some kind of comments. R6RS (the Revised6 Report on Scheme) describes a couple of different kinds, of which I have so far implemented only the "semicolon to line-end" flavor, but it is essential, once you write anything longer than a line or two, to be able to comment it. Perhaps you'll only ever write a single comment that says Look and marvel, world, at the brilliance of my lisp koan!... that's ok; but maybe you'll also write things like OMG! WTF is this stupid interpreter doing?! Fix this!!! and then later on you'll be able to find it again and fix it. Whatever. Comments are essential, so when you go through JTang's tutorial, pretend that he talked about comments, and add the ability to comment stuff into the parser early on. It's not hard, and you'll be grateful for it soon enough.
The second issue is less of a programming issue and more being
systematic. However much of a toy this is, it is an actual interpreter for an
actual language, and you have to test it as you go, but you also have to check
that your new hacks don't break anything old. So keep track of small tests,
collect them into one large file, and once your interpreter is able to execute
a file, set things up to run that self-test; then run it often. This is
something that I think JTang should have emphasized more. By now I'm up to 400
tests in the self-test for haskeem. These aren't all totally
distinct, a bunch of them are exploring corner cases of simple math functions,
but not only does that give me some confidence that it's doing the arithmetic
right, it also caused me to find a couple of small bugs in GHC, the Glasgow
Haskell Compiler, which I have duly reported. (GHC doesn't handle NaNs quite
correctly in all cases, for example 1.0 ** NaN should be NaN but is evaluated
as 1.0.)
All right, here's where you should stop if you want to do it yourself. Go write yourself a scheme in 48 hours, and come back when you're done with it!
So what's haskeem? It's (mostly) a sub-set of R6RS
Scheme; while writing it, I did retrieve the R6RS, and I went
through it, trying to find and eliminate differences. To the extent possible,
I did and do want it to be compatible, although I did do a couple of
small things differently; I'll describe them more below.
JTang's tutorial implements an interpreter with integer-only arithmetic
and only base-10 input and output of numbers, whereas scheme supports
integers, rational numbers, and real and complex floating-point numbers; all
of these can furthermore come in several different precisions, plus they can
be exact or inexact, and the integer quantities can be written in base-2,
base-8, base-10, and base-16. I have not implemented all of this!
It's important for writing large robust programs; it's not essential for
writing a small toy. But I have implemented parts of it: I began by adding
support for real floating-point numbers, since I wanted to add a bunch of math
functions, and having, for example, (sin x) always return only
one of -1, 0, or +1 just doesn't quite cut it; the numerical analyst
cringes!
This is the point where I made my one major excursion from R6RS
Scheme syntax: that document describes and allows only base-10 floating-point
numbers, but when I was adding floating-point numbers to the parser, I found
that I was re-using enough pieces of the integer-parsing code, which supports
multiple bases, that it would be pretty simple to add support for
floating-point numbers in the other bases. So I did. So it's possible to
specify the floating-point number #xdeadbeef.c0ffee in
haskeem (it's 3.7359285597539053e9), or to see that pi in base-16
is about #x3.243f6a8885a3.
I have also mostly finished adding support for rational numbers. There are
some small differences in the way I treat them versus how R6RS
treats them. For finite rationals, of course, it's all the same, but if I
understand the Report correctly, they are disallowing rational infinity and
NaN: +1/0, -1/0, and 0/0, whereas I think these are useful quantities for the
same reasons that floating-point infinities and NaNs are useful, so I allowed
them and have attempted to deal properly with them in
haskeem. That was actually a bit tricky, since Haskell's native
rational numbers don't play well with these quantities. Haskell will recognize
them and can read and write them, but doing arithmetic with them leads to
runtime exceptions, so I had to add a fair bit of special-case code to deal
with all the possibilities..
That's the next major area where I added stuff to what JTang showed. Try
typing this at his interpreter: (/ 1 0). If it does the same
thing on your computer as on mine, it'll give you a "division by zero" runtime
exception and dump you back to your shell prompt. Not so good! Another area
where I found the same thing was when implementing some of the file-IO
stuff. It's pretty user-hostile to signal end-of-file with an exception that
kills the interpreter, so adding exception-handling was pretty important. I'm
not sure I have this entirely right: although I've wrapped the entire main
REPL in a Haskell try, it is still not catching division by zero. (No, Haskell
experts, it's not that I'm using try from the prelude, it's
Control.Exception.try, and it's not catching divide by zero; I (sic) tried,
although in a small test program CE.try does catch divide by zero.)
However, haskeem as a whole is catching divide by zero, I just
used a different method to do it.
I have also added most of the core R6RS syntactic forms: I've
got and, apply, begin,
case, cond, define, if,
lambda, let, let*, letrec,
letrec*, or, quote, and
set!. (Incidentally, JTang implemented and and
or as functions wrapped around && and
||; this is not quite right, because and and
or are supposed to short-circuit as soon as the value of the
expression can be determined.) Still missing are let-values,
let*-values, do, when and
until (although these last two are pretty trivial), and
set-car! and set-cdr!; these last two seem to be
regarded with some mixed feelings by the R6RS, and I'm not at all
sure I'll ever put them into haskeem. In part this is due to a
design decision by JTang; more on this in a bit. Also missing is everything
related to quasiquote.
I have a mutant version of eval, which actually evaluates an
expression in the current function context rather than the top-level
context. It didn't occur to me to make it evaluate in the top-level context,
that's why I made this a special syntactic form rather than a regular library
function. In order to make this work with things like (map eval
some-list), I added a function to the standard library: (define
(eval x) (eval x)). (If you look, you'll see one of those comments I
mentioned next to this definition. :-) )
More related to the actual running of the interpreter, and less related to
the scheme language, is the load special form. I am not exactly
sure now why I made it a special syntactic form rather than a library function
(I should have added a comment there, too!), but it is. Perhaps it was for
easy access to the environment; I haven't added any facility for getting the
execution environment inside library functions.
Another special form that's related to the specifics of the interpreter is
trace, a very modest facility for observing how the interpreter
is doing stuff. If you examine the file fibo.scm, you'll see it in action; it
produces output like this:
lisp> (load "fibo.scm")
trace bad-fibo on
#t
lisp> (fibo 6)
trace fibo-acc on
trace: fibo-acc <- (6 1 0)
trace: fibo-acc <- (5 1 1)
trace: fibo-acc <- (4 2 1)
trace: fibo-acc <- (3 3 2)
trace: fibo-acc <- (2 5 3)
trace: fibo-acc <- (1 8 5)
8
lisp> (bad-fibo 6)
trace: bad-fibo <- (6)
trace: bad-fibo <- (5)
trace: bad-fibo <- (4)
trace: bad-fibo <- (3)
trace: bad-fibo <- (2)
trace: bad-fibo <- (1)
trace: bad-fibo <- (0)
trace: bad-fibo <- (1)
trace: bad-fibo <- (2)
trace: bad-fibo <- (1)
trace: bad-fibo <- (0)
trace: bad-fibo <- (3)
trace: bad-fibo <- (2)
trace: bad-fibo <- (1)
trace: bad-fibo <- (0)
trace: bad-fibo <- (1)
trace: bad-fibo <- (4)
trace: bad-fibo <- (3)
trace: bad-fibo <- (2)
trace: bad-fibo <- (1)
trace: bad-fibo <- (0)
trace: bad-fibo <- (1)
trace: bad-fibo <- (2)
trace: bad-fibo <- (1)
trace: bad-fibo <- (0)
8
The stuff in parentheses is the argument list being handed to each invocation
of each traced function (there can be any number being traced simultaneously).
This doesn't yet show indentation to represent who's calling whom, but it is
nonetheless already very useful in seeing what's going on. This example very
clearly shows the difference in recursive calling between the two versions of
the fibonacci calculation. I intend to use haskeem to work my way
through as much of Structure and Interpretation of Computer Programs
as I can, and I expect that this will help a lot.
So what's missing, what's next? I plan to improve this in a couple more areas, as I have time and ability. I think these are more or less listed in the order in which I'll attack them, and also in order of increasing difficulty:
catch and throw
(although the spellings of the names may be different; I'll have to see what
the R6RS says). At least the catch has to be a special
form rather than a regular function, because we don't want all of its
arguments evaluated beforehand: initially, only the thing to be tried is
evaluated, and only if it throws an error does the error-handler get called. I
think this will be fairly straightforward.delay and
force. These are the Scheme syntactic forms for doing lazy
evaluation, and allow infinite (non-cyclic) datastructures, which is quite
cool.haskeme at that point! :-) However, I don't
know how easy or hard this is; I expect it's a moderately significant
effort.haskeem
that would require it; for those occasions where I would need complex numbers,
I can probably do it explicitly as pairs of reals.
If you have questions or comments about the stuff here, please contact me at korg@korgwal.com.
Enjoy! -- Uwe Hollerbach
Page last modified at 2008/03/05 20:00 US/Pacific