Course:CPSC312-2021/small programming language

From UBC Wiki

Redwood, a small programming language

Authors: Connor Cahoon, Sam Schweigel

What is the problem?

We would like to investigate the feasibility of using Haskell to write interpreters. We plan on creating an interpreter for a small imperative programming language.

The exact design is subject to change.

What is the something extra?

We are going to use megaparsec, a parser combinator library, for the frontend. The interpreter will have some modern features, like first-class functions and lexical scope. This will involve creating a few data structures to represent the language's syntax tree, as well as runtime values/closures.

What did we learn from doing this?

For a variety of reasons, Haskell turned out to be a good choice for writing an interpreter:

  • Algebraic data types and pattern matching make representing and manipulating syntax trees extremely easy.
  • Haskell makes it very compelling to use immutable data structures, but still provides efficient mutable references (we used IORef) for when they're needed. This was important when getting the interpreter to run fast enough to be used to write small games.
  • Parser combinators are an elegant, functional way to express the grammar of our language in Haskell, without having to use an external tool like a parser generator.
  • Monads and do notation let us abstract over some of the boilerplate involved with writing the interpreter. We put the environment for the currently executing function into a reader monad so that do notation would thread it through for us.

Overall, Haskell turned out to be a good choice for this project. Despite some irritating parts (liftIO was pretty annoying), we could iterate quickly, and parts of the interpreter usually functioned correctly on the first try after they typechecked. The whole thing turned out to be very succinct: under 1000 lines of code for quite a lot of functionality!

Links to code

https://github.com/sjsch/small-programming-language

Language components

program ::= stmt*

args ::= ident? (',' ident)*

block ::= '{' stmt* '}'

stmt ::= expr
       | ident '=' expr              Assignment, variable definition
       | 'if' expr block             If statement
         ('else if' expr block)?
         ('else' block)?
       | 'while' expr block          While loop
       | 'for' ident (',' ident)?    For loop (value only, and key-value version)
         'in' expr block
       | 'return' expr               Return statement
       | 'func' ident '(' args ')'   Function definition
         block

expr ::= number                Float literal
       | string                String lieral
       | array
       | dictionary
       | true | false          Boolean literal
       | expr binop expr       Arithmetic
       | ident                 Variable access
       | ident '(' args ')'    Function call
       | expr '.' ident        Dictionary access
       | expr '[' expr ']'     Array/dictionary indexing
       | '!' expr              Boolean not
       | '-' expr
       | 'null'

array ::= '[' expr? (',' expr)* ']'              Array literals
       
dictionary ::= '{' keyvalue (',' keyvalue)* '}'  Dictionary literals

keyvalue ::= ident ':' expr

binop ::= '+' | '-' | '*' | '/' | '<' | '>' | '==' | '&' | '|'