Very often my code ends up having the following form: parse an input to create an intermediate data structure and evaluate the structure to produce an output. Strangely, many years ago, when my code was object oriented, that wasn’t the case. Or at least I wasn’t explicitly aware of it.

When you write an interpreter or a compiler, things always work out like that, but I see the same pattern in almost everything I produce: from financial backtesting to chart libraries. Sometimes when, out of laziness or stupidity, I forego the intermediate structure, I end up in the end having to retrofit it in. Simply processing input and generating output at the same time rarely cuts it. But it is tempting because you get going pretty fast and I’m tricked into it occasionally.

Hence the first thing that I find myself reasoning about is often the particular form of such intermediate structure. In this case it looks like the following:

type Env = (string * LispVal ref) list ref
and FuncRecord = { parms: string list; varargs: string option; body: LispVal list; closure: Env}
and LispVal =
    | Atom of string
    | List of LispVal list
    | DottedList of LispVal list * LispVal
    | Number of int
    | String of string
    | Bool of bool
    | PrimitiveFunc of (LispVal list -> LispVal)
    | Func of FuncRecord
    | Port of System.IO.FileStream

This LispVal structure has one constructor for each kind of expression (production) that is allowed in Scheme. Or at least that ones I support …

It is important that each one stores all the information that is necessary for the evaluator to evaluate the expression. No more, no less. Here is a brief description:

The only remaining code to address is:

type Env = (string * LispVal ref) list ref

This is the symbol table and it is ugly. It is not multithread safe either. But it works and it is close enough to the Haskell version so I decided to retain it. A proper code review would ‘strongly suggest’ rewriting the code to pass it around to each function instead of using ‘ref’ or using the state monad encapsulated in a computation expression. Any of these solutions is left as an exercise to the reader (use the testcases to validate that you get it right).

We could go in many different direction from here. We could talk about:

To keep with the top – down approach I’ve been espousing. I’ll probably talk about the evaluator next.