mdhender

super secret hq

LISP in Go

I’m keen on learning Go. I’ve kicked around the sample programs and understand the syntax. I can write code that compiles but I’m not yet to the point where I feel comfortable programming it. You could say that I know that grammar but not the idioms.

I’m going to change that by writing a good sized program in Go. I love LISP, so that’s a natural choice for me. Luckily for me, Leo Howell wrote a very nice series of articles on building a LISP interpreter in C. I’ve gone through that and think that I understand it well enough to rework it in Go. There’s enough to it that I should learn how to express myself in Go.

I’m keen on learning Go, so I am building Leo’s interpreter with it. It’s a good opportunity since there’s a lot more to it than the typical “Hello, Universe” examples. Luckily, I understand the interpreter well enough that the challenge will be expressing myself in Go. (Well, I think that I understand it; this will prove it.)

As a bonus, I watched Rob Pike’s lecture on lexing with goroutines, and that looks like a lot of fun to try.

I plan to follow Leo’s outline, so feel free to compare against the original.

Test Driven Development

Leo’s articles include a lot of useful tests, which is great. When I built it, I used them to add tests to it. That’s useful because I was able to make changes to the code and have confidence that I didn’t break something.

Go has native support for testing. I’m using their harness to define the first test case. It’s simple: verify that we have a package named “leo.”

The Go Environment

Let’s set up the Go environment for the program, which I’m naming “leo.”

$ env | grep GOPATH
GOPATH=/Users/quoha/go

$ mkdir -p $GOPATH/leo ; cd $GOPATH/leo

The Test

Note: I’m learning Go, so I’m not sure how best to organize my test cases. I’m taking the easy out and starting with everything in a single file. Later on, if I need to refactor the test cases, then I will.

That file is interpreter_test.go.

package leo

import (
	"testing"
)

func TestLeo(t *testing.T) {
}

Running it shows that we pass.

$ go test -v
=== RUN TestLeo
--- PASS: TestLeo (0.00 seconds)
PASS
ok  	github.com/quoha/leo	0.004s

Atoms

LISP supports many types of atoms. In this article, I’ll add integers, symbols, the empty list (NIL), and CONS cells (though we’ll call them “pairs”).

I should mention that these aren’t all atoms. Leo called them atoms to make the code simpler. It works. It works well enough to leave be until it doesn’t work.

It shouldn’t be too much work to add other types (like strings) later. (As a bonus, doing so will test out my plans to make atoms an interface so that it’s easy to embed/extend the interpreter (big dreams already).)

By the way, there’s a lovely article about creating “variant structs“ in Go. That’s a future refactor since I don’t have the wisdom to apply it yet.

Integers

We need two bits of data for an atom that is an integer: a type flag saying that it is an integer and the value of the integer.

The Test

Let’s verify that creating an atom from an integer creates the atom correctly. I’m going to start with a test case that I know will fail and then update the code to make it work.

Note: I’m starting AtomType with an upper-case letter, so I’m expecting that my package will export it. I know that’s controversial, but I’m planning on making the package extensible, so I’m starting out with everything exported.

package leo

import (
	"testing"
)

func TestLeo(t *testing.T) {
}

func TestMake_Int(t *testing.T) {
	i := 42
	a := Make_Int(i)
	if a.aType != AtomType_Integer {
		t.Errorf("Make_Int(%v).aType = %v, want %v", i, a.aType, AtomType_Integer)
	}
	if a.integer != 42 {
		t.Errorf("Make_Int(%v).integer = %v, want %v", i, a.integer, i)
	}
}

If you’re not used to TDD, then this is going to seem like a pointless waste of time. I believe that it’s helpful in the long run. If you don’t agree, just skip ahead.

Running the updated test does produce the expected error:

$ go test -v
# github.com/quoha/leo
./interpreter_test.go:12: undefined: Make_Int
./interpreter_test.go:13: undefined: AtomType_Integer
./interpreter_test.go:14: undefined: AtomType_Integer
FAIL	github.com/quoha/leo [build failed]

There are compiler failures because the package doesn’t have the function or constants that we’re using.

Let’s write the code to pass the tests.

The Code For Integers

Here’s interpreter.go. It creates an unadorned struct that holds our bits of data. I created a new type, AtomType as a sort of “enum” for what the atom holds. I know that I’ll be replacing that at some point in the future. Even so, my goal is to make the tests pass and this is the simplest solution makes that happen.

package leo
// based on the articles at www.lwh.jp/lisp

type AtomType int
const (
	AtomType_Integer AtomType = iota
)

type Atom struct {
	aType   AtomType
	integer int
}

func Make_Int(i int) Atom {
	return Atom{aType: AtomType_Integer, integer: i}
}

Proof of that is in re-running the test case:

$ go test -v
=== RUN TestLeo
--- PASS: TestLeo (0.00 seconds)
=== RUN TestMake_Int
--- PASS: TestMake_Int (0.00 seconds)
PASS
ok  	github.com/quoha/leo	0.004s

Symbols

Symbols have names. LISP allows pretty much any character in the name, so Go’s UTF-8 support will work nicely.

We’ll also need a new flag to say that the atom holds a symbol.

The Test

As before, I’m creating the test case before writing the code. I’m adding a test case for Go’s iota. It’s not needed (I’m not testing the compiler) and could be called bad style since I’m not testing the compiler. I put it in because I have a habit of sorting enum definitions. Done casually, that can break the iota constant generator. I put the test in as a defense against that.

package leo

import (
	"testing"
)

func TestLeo(t *testing.T) {
	if AtomType_Integer == AtomType_Symbol {
		t.Errorf("AtomType_Integer = AtomType_Symbol, want them to be different")
	}
}

func TestMake_Int(t *testing.T) {
	i := 42
	a := Make_Int(i)
	if a.aType != AtomType_Integer {
		t.Errorf("Make_Int(%v).aType = %v, want %v", i, a.aType, AtomType_Integer)
	}
	if a.integer != 42 {
		t.Errorf("Make_Int(%v).integer = %v, want %v", i, a.integer, i)
	}
}

func TestMake_Symbol(t *testing.T) {
	s := "foo"
	a := Make_Symbol(s)
	if a.aType != AtomType_Symbol {
		t.Errorf("Make_Int(%v).aType = %v, want %v", s, a.aType, AtomType_Symbol)
	}
	if a.symbol != s {
		t.Errorf("Make_Int(%v).symbol = %v, want %v", s, a.symbol, s)
	}
}

Running the test gives the expected compilation errors:

$ go test -v
# github.com/quoha/leo
./interpreter_test.go:8: undefined: AtomType_Symbol
./interpreter_test.go:26: undefined: Make_Symbol
./interpreter_test.go:27: undefined: AtomType_Symbol
./interpreter_test.go:28: undefined: AtomType_Symbol
FAIL	github.com/quoha/leo [build failed]

The Code For Symbols

The code for this is very similar to the code for integers. I’m adding a new constant to flag symbols plus a string to hold the name of the symbol.

package leo
// based on the articles at www.lwh.jp/lisp

type AtomType int
const (
	AtomType_Integer AtomType = iota
	AtomType_Symbol
)

type Atom struct {
	aType   AtomType
	integer int
	symbol  string
}

func Make_Int(i int) Atom {
	return Atom{aType: AtomType_Integer, integer: i}
}

func Make_Symbol(s string) Atom {
	return Atom{aType: AtomType_Symbol, symbol: s}
}

And that passes the tests:

$ go test -v
=== RUN TestLeo
--- PASS: TestLeo (0.00 seconds)
=== RUN TestMake_Int
--- PASS: TestMake_Int (0.00 seconds)
=== RUN TestMake_Symbol
--- PASS: TestMake_Symbol (0.00 seconds)
PASS
ok  	github.com/quoha/leo	0.004s

CONS Cells

Yes, CONS cells aren’t atoms. I’m storing them in the same struct as a matter of convenience. To avoid confusion (and endless diversions), we’re going to call them pairs.

Pairs hold a pair of atoms. Since we’re treating pairs as atoms, pairs can hold other pairs. That’s what enables us to build up lists.

We’ll also need a new flag to say that the atom holds a pair.

The Test

package leo

import (
	"testing"
)

func TestLeo(t *testing.T) {
	if AtomType_Integer == AtomType_Symbol {
		t.Errorf("AtomType_Integer = AtomType_Symbol, want them to be different")
	}
	if AtomType_Integer == AtomType_Pair {
		t.Errorf("AtomType_Integer = AtomType_Pair, want them to be different")
	}
}

func TestMake_Int(t *testing.T) {
	i := 42
	a := Make_Int(i)
	if a.aType != AtomType_Integer {
		t.Errorf("Make_Int(%v).aType = %v, want %v", i, a.aType, AtomType_Integer)
	}
	if a.integer != 42 {
		t.Errorf("Make_Int(%v).integer = %v, want %v", i, a.integer, i)
	}
}

func TestMake_Symbol(t *testing.T) {
	s := "foo"
	a := Make_Symbol(s)
	if a.aType != AtomType_Symbol {
		t.Errorf("Make_Int(%v).aType = %v, want %v", s, a.aType, AtomType_Symbol)
	}
	if a.symbol != s {
		t.Errorf("Make_Int(%v).symbol = %v, want %v", s, a.symbol, s)
	}
}

func TestMake_Pair(t *testing.T) {
	i := 42
	s := "foo"
	a := Make_Pair(Make_Symbol(s), Make_Int(i))
	if a.aType != AtomType_Pair {
		t.Errorf("Make_Pair(%v).aType = %v, want %v", s, a.aType, AtomType_Pair)
	}
}

Running the test gives the expected compilation errors:

$ go test -v
# github.com/quoha/leo
./interpreter_test.go:11: undefined: AtomType_Pair
./interpreter_test.go:41: undefined: Make_Pair
./interpreter_test.go:42: undefined: AtomType_Pair
./interpreter_test.go:43: undefined: AtomType_Pair
FAIL	github.com/quoha/leo [build failed]

The Code For Pairs

package leo
// based on the articles at www.lwh.jp/lisp

type AtomType int
const (
	AtomType_Integer AtomType = iota
	AtomType_Symbol
	AtomType_Pair
)

type Atom struct {
	aType   AtomType
	integer int
	symbol  string
	pair   *Pair
}

type Pair struct {
	first, rest Atom
}

func Make_Int(i int) Atom {
	return Atom{aType: AtomType_Integer, integer: i}
}

func Make_Pair(first Atom, rest Atom) Atom {
	p := new(Pair)
	p.first, p.rest = first, rest
	return Atom{aType: AtomType_Pair, pair: p}
}

func Make_Symbol(s string) Atom {
	return Atom{aType: AtomType_Symbol, symbol: s}
}

Coming from C, the Make_Pair function gives me the willies. In C, we’re returning locals and leaking memory. Go is garbage collected, though, so we’ll reclaim that memory. Go also returns a valid copy of those locals. (To be fair, C will copy a struct on the stack, too, so it’s not an issue there, either.)

This passes the tests:

$ go test -v
=== RUN TestLeo
--- PASS: TestLeo (0.00 seconds)
=== RUN TestMake_Int
--- PASS: TestMake_Int (0.00 seconds)
=== RUN TestMake_Symbol
--- PASS: TestMake_Symbol (0.00 seconds)
=== RUN TestMake_Pair
--- PASS: TestMake_Pair (0.00 seconds)
PASS
ok  	github.com/quoha/leo	0.004s

The Hidden Truth

My initial version of this failed the test badly:

$ go test -v
# github.com/quoha/leo
./interpreter.go:28: cannot assign to p.first
./interpreter.go:28: cannot assign to p.rest
./interpreter_test.go:41: cannot use Make_Symbol(s) (type Atom) as type *Atom in argument to Make_Pair
./interpreter_test.go:41: cannot use Make_Int(i) (type Atom) as type *Atom in argument to Make_Pair
./interpreter_test.go:42: a.aType undefined (type *Pair has no field or method aType)
./interpreter_test.go:43: a.aType undefined (type *Pair has no field or method aType)
./interpreter_test.go:43: too many errors

My C muscle memory was the cause. I’d created the variables as first, rest *Atom instead of just plain Atom. The compiler was nice enough to point out my mistake.

NIL - The Empty List

Last on our list is NIL.

package leo

import (
	"testing"
)

func TestLeo(t *testing.T) {
	if AtomType_Integer == AtomType_Symbol {
		t.Errorf("AtomType_Integer = AtomType_Symbol, want them to be different")
	}
	if AtomType_Integer == AtomType_Pair {
		t.Errorf("AtomType_Integer = AtomType_Pair, want them to be different")
	}
}

func TestMake_Int(t *testing.T) {
	i := 42
	a := Make_Int(i)
	if a.aType != AtomType_Integer {
		t.Errorf("Make_Int(%v).aType = %v, want %v", i, a.aType, AtomType_Integer)
	}
	if a.integer != 42 {
		t.Errorf("Make_Int(%v).integer = %v, want %v", i, a.integer, i)
	}
}

func TestMake_Symbol(t *testing.T) {
	s := "foo"
	a := Make_Symbol(s)
	if a.aType != AtomType_Symbol {
		t.Errorf("Make_Int(%v).aType = %v, want %v", s, a.aType, AtomType_Symbol)
	}
	if a.symbol != s {
		t.Errorf("Make_Int(%v).symbol = %v, want %v", s, a.symbol, s)
	}
}

func TestMake_Pair(t *testing.T) {
	i := 42
	s := "foo"
	a := Make_Pair(Make_Symbol(s), Make_Int(i))
	if a.aType != AtomType_Pair {
		t.Errorf("Make_Pair(%v).aType = %v, want %v", s, a.aType, AtomType_Pair)
	}
}

func TestMake_Nil(t *testing.T) {
	a := NIL;
	if a.aType != AtomType_Nil {
		t.Errorf("Make_Nil().aType = %v, want %v", a.aType, AtomType_Nil)
	}
}

Running the test gives the expected compilation errors:

$ go test -v
# github.com/quoha/leo
./interpreter_test.go:48: undefined: NIL
FAIL	github.com/quoha/leo [build failed]

The Code

Our implementation of NIL is simple. It’s a global to the package.

We need to add a new flag to indicate that the atom is NIL, just like we did for the other types. I put AtomType_Nil at the top of the constant list to give it the value of zero. In Go, variables are zeroed out when they’re created, so Atoms will automatically be initialized to the NIL value.

package leo
// based on the articles at www.lwh.jp/lisp

type AtomType int
const (
	AtomType_Nil     AtomType = iota
	AtomType_Integer
	AtomType_Symbol
	AtomType_Pair
)

type Atom struct {
	aType   AtomType
	integer int
	symbol  string
	pair   *Pair
}

type Pair struct {
	first, rest Atom
}

var NIL = Atom{}

func Make_Int(i int) Atom {
	return Atom{aType: AtomType_Integer, integer: i}
}

func Make_Pair(first Atom, rest Atom) Atom {
	p := new(Pair)
	p.first, p.rest = first, rest
	return Atom{aType: AtomType_Pair, pair: p}
}

func Make_Symbol(s string) Atom {
	return Atom{aType: AtomType_Symbol, symbol: s}
}

That passes all of the test cases:

$ go test -v
=== RUN TestLeo
--- PASS: TestLeo (0.00 seconds)
=== RUN TestMake_Int
--- PASS: TestMake_Int (0.00 seconds)
=== RUN TestMake_Symbol
--- PASS: TestMake_Symbol (0.00 seconds)
=== RUN TestMake_Pair
--- PASS: TestMake_Pair (0.00 seconds)
=== RUN TestMake_Nil
--- PASS: TestMake_Nil (0.00 seconds)
PASS
ok  	github.com/quoha/leo	0.004s

Next Up

In the next article, we’ll discuss why my implementation of NIL is flawed and we’ll also create a symbol table to hold our symbols.

Comments