-
Notifications
You must be signed in to change notification settings - Fork 82
Compiler Design
calvis edited this page Apr 28, 2012
·
2 revisions
Our compiler is divided into three parts, a front-end, a middle, and a back-end.
The front-end of the Harlan compiler is comprised of our parser, typechecker, and several passes that are responsible for making the code more uniform and simple.
-
parse-harlan
verifies the input program, and inserts tags around all scalar expressions (for example, the integer0
becomes(int 0)
). This reduces ambiguity in the typechecker. -
returnify
will insert a return around the last expression in each function body, if one does not exist already. This pass must run before typechecking, so that the typechecker can enforce that all returns are returning expressions of the same type. -
typecheck
does exactly what it sounds like. Every expression is tagged with its type, assuming that the program is statically well-typed. If not, the typechecker will error. -
expand-primitives
rewrites front-end primitives, such aswrite-pgm
(primitives that are too complex to deal with in passes later), orprint
andassert
that are called on complex data-types. -
remove-danger
adds safety-checks before vector-refs, to ensure safe memory access.
-
make-kernel-dimensions-explicit
adds another field to kernels, to describe their dimension. Usually, this is the length of the first argument. -
optimize-fuse-kernels
creates two-dimensional kernels where possible, and inlinesiota
as a kernel argument. -
lift-complex
unnests all complex expressions. -
remove-nested-kernels
turns all kernels nested inside another kernel into for-loops, leaving the outermost kernel intact. -
optimize-lift-lets
lifts let-bindings as high as possible in the program, possibly outside of kernels, for-loops, and while-loops. -
returnify-kernels
changes kernel syntax again. Instead of returning expressions, a "return value" argument is explicitly passed in as an argument, and the kernel repeatedly assigns to the return value instead. -
make-vector-refs-explicit
replaces all references to kernel arguments inside of kernel bodies with vector-refs to the source vector. (This could use an example). -
annotate-free-vars
walks the body of all kernels and adds an explicit annotation of all free-variables referenced inside of them. -
insert-let-regions
inserts a new primitive,let-region
, inside of each function body, for-loop, and while-loop. This indicates a scoped region of memory, where things can be allocated. -
infer-regions
infers all variables onto a region. This is an understatement. -
uglify-vectors
replacesmake-vector
with an explicitalloc
, replaceslength
with a region-reference to the field where a vector's length is stored, and adds an additional offset tovector-ref
to account for the new addition of the vector-length field. -
remove-let-regions
explicitly creates and destroys a region based on its lexical scope. -
flatten-lets
unnests lets. -
hoist-kernels
moves all kernels to a special module at the top of the program, replacing them with a newapply-kernel
primitive that applies a kernel's unique identifier to its arguments. -
generate-kernel-calls
turnsapply-kernel
into calls to runtime functions. -
compile-module
transforms the entire module into something more C-like and discards unnecessary annotations. -
convert-types
converts Harlan types into C-types.
Currently, harlan-format-c
is the only pass in the back-end of the compiler, and it is responsible for turning s-expressions into a string of C++ with OpenCL.