mdhender

super secret hq

A Recursive Descent Parser, Part 1

An easy type of parser to build is the recursive descent parser. It gets its name from the fact that it starts at the top of the grammar and calls functions to accept parts of the grammar. It’s easy to write because there are simple rules for turning every production in the grammar into a function. Writing one can be tedious, though, especially if there are a lot of productions.

The type of parser that we’re going to write works for LL(1) grammars only, which means that we only need to look one symbol ahead in order to know which production to follow. It also means that no two productions start with the same symbol.

The parser is built from the top-down. That means that it starts with the first rule in the grammar and uses the next available symbol to determine which production rule to parse. It then calls a function to parse that production rule.

Here’s a sample parser for salc:

// grammar = Number Plus Number Equals.

void Grammar(void) {
  if (!Accept("[0-9]")) {
    Reject();
    return;
  }
  if (!Accept("+")) {
    Reject();
    return;
  }
  if (!Accept("[0-9]")) {
    Reject();
    return;
  }
  if (!Accept("=")) {
    Reject();
    return;
  }
  Valid();
}

For the momement, pretend that Accept(), Reject() and Valid() are defined elsewhere. This example looks at an input stream and rejects it if it isn’t accepted by the grammar. We’ll get to making it useful in a bit.

Notice that I changed the format of the grammar a bit. Terminals start with a capital letter and aren’t actually defined in the grammar. With this change, we assume that the terminals are defined by the scanner.

Comments