Implementing A Functional Language

This is the first in a series of posts on implementing a simple functional language. A typical functional language compiler goes through the following steps to turn the code you write into a binary you run:

your code your code ast ast your code->ast PARSE annotated ast annotated ast ast->annotated ast TYPECHECK core core annotated ast->core DESUGAR abstract IR abstract IR core->abstract IR COMPILE backend language backend language abstract IR->backend language CODEGEN

We will look at the transformation from core to abstract IR. In Haskell, this corresponds to the transformation from GHC Core to STG. This process is interesting because it is here that we define how the program is evaluated. We will look at several different designs for compilers, of increasing complexity, and consider problems such as laziness, sharing, garbage collection, strictness analysis and parallel evaluation.

This series is heavily based on a fantastic book of the same name written by Simon Peyton Jones and David Lester in 2000. All credit goes to them for the content: I’ve simply altered the presentation and updated it slightly. There are a number of differences:

The book is freely available in PDF form here. There’s also a broader overview of this area (again by Simon Peyton Jones) available here. I highly recommend both of them.

Structure

The structure of this series is as follows. However this is a work in progress, so I may change things as I complete each section.

  1. The Core Language
  2. Graph Reduction
  3. Template Instantiation
  4. The G-machine
  5. The Three Instruction Machine
  6. The Parallel G-machine
  7. Lambda Lifting
  8. The Spineless Tagless G-machine
  9. Appendix: Printing Core
  10. Appendix: Parsing Core

Each post will be a literate Haskell file, and will compile using Stackage lts-12.2. We’ll use several libraries from Hackage, so you’ll want to make sure you’re using the same versions. Compiling with Stack and lts-12.2 is the easiest way to do this.