Roost Compiler Back End
- Assign: Tuesday, 2 April
- Checkpoint: Translation to IR complete and computation of storage offsets complete by Friday, 12 April
- Due: Code Generation and all parts complete by Friday, 19 April
- Reference:
Contents
- Plan
- Translation of ASTs to Intermediate Representation (IR)
- Machine Code Generation
- Tools
- Standard Extensions
Plan
By the end of this phase of the project, you will be able to run Roost programs generated by your compiler!
You will implement:
- a translator to convert your AST representation to an intermediate representation (IR) using three-address code (TAC); and
- a code generator to convert IR/TAC programs to x86 assembly code.
Code generators would seem no more complex than other parts of a compiler, but they are notorious for subtle bugs. Start early and design carefully before you start coding. Implement, test, and commit your code incrementally. Definitely start with the core language and consider extension only if you have the core language working and well tested. If you are debugging issues with the code generator, seek help with effective debugging techniques before burning valuable time on misinformed or haphazard bug hunts.
All parts of the Roost Compiler Requirements and Coding Guide apply to this project stage.
You are required to implement back end support for the Roost core language. If you have time and curiosity, implementing the standard extensions adds some interest in a few spots.
Stages/Checkpoints
The Back End project stage includes one intermediate checkpoint:
There is one intermediate checkpoint for this phase. Be sure to add comments to your code to document any special cases, describe tricky parts, and provide an overview of how any non-trivial class is designed.
- Due 12 April 2019:
-
IR complete: Your compiler must translate valid Roost programs to a three-address code (TAC) intermediate representation (IR) and support the
--show-ir
command-line option. YourREADME.md
must contain a section defining your TAC instruction set and noting any important design decisions. -
Code Generation progress: You should aim to have the code generation implementation underway with computation of storage offsets complete. Additional code generation progress is recommended but not required at this checkpoint.
-
Submit using the
ir
branch.
-
-
Due 19 April 2019:
-
Code Generation complete: Your compiler must translate valid Roost programs to x86 assembly code. Your
README.md
must document the design of your code generator. -
Submit using the
codegen
branch.
-
If you finish early, I suggest getting a head start on the next project phase: the machine-independent optimizer. Alternatively, if you are unhappy with the inefficient generated machine code, you could begin work on a variety of improvements to instruction selection or register allocation in the code generation. Talk to me.
Package Structure
Support for the Back End components of the compiler should be implemented in the following packages:
roost.ir
: intermediate representation (IR) / three-address code (TAC) definition and translation from ASTsroost.codegen
: machine code generation
Translation of ASTs to Intermediate Representation (IR)
To support optimization and machine code generation, your Roost compiler will use an intermediate representation (IR) for programs based on three-address code (TAC) to capture a computation that is equivalent to the Roost code. You are responsible for choosing the specific instructions for your own TAC instruction set. The TAC instruction set will probably include the standard kinds of instructions: unary and binary operations, data movement instructions, labels and branch instructions, calls, and returns. You will also need to add explicit instructions for run-time checks. In general, the most significant differences between your ASTs or Roost source code and a well-designed IR are the flattening (elimination) of nested expressions and the decomposition of control-flow constructs into branches and jumps. Use the sample TAC definitions from past tutorial exercises and the TAC from the Tiny compiler as inspiration.
You must clearly describe your TAC instruction set and its semantics
in README.md
. While the primary representation of TAC will be as
internal data structures of your Scala implementation of the Roost
compiler, you should also design and document a syntax for displaying
TAC instructions.
The compiler should perform all of the tasks from the front end. Next, it will translate the AST to three-address code.
Three-Address Code (TAC) Instruction Set Design
After deciding on a TAC instruction set, you will implement a
translation from your AST representation to TAC. Your translation
phase must convert high-level constructs such as if
and while
,
short-circuit conditional expressions, break
and continue
,
etc. into TAC using branch and jump instructions. As usual, you will
define this translation as a recursive function over ASTs. The
function is the basic unit for translation. (There is little
interesting translation work outside functions, at least for the core
language.)
Optionally, you may try to generate ``clean’’ TAC code with no consecutive labels, no unnecessary jumps, etc. These are optimizations — do not implement them unless you have completed the basic assignment with time to spare. Do not try to minimize the number of temporary names within a method, for the reasons we discussed in meetings.
Design Hints
Design suggestions:
- I strongly encourage implementing the
TAC instructions as a collection of case classes extending an
abstract
TACInstr
class or similar. - Create a
TACList
class or similar that stores a list of TAC instructions. After translating to TAC, each method declaration in your AST would then have aTACList
attached.- Note that the operands to most TAC instructions can be (a) program variables, (b) temporary variables, or (c) constants. Your TAC design should support this.
- You may wish to design your
TACList
andTACInstr
classes to support generic map/fold operations to make it easy for clients to traverse and process TAC lists (for printing, generating x86 code, etc.). -
To help understand generated code and debug TAC and x86 code generation, I encourage you to design your TAC representation (and x86 code generator) to support annotation strings on individual instruction objects that can be printed out as “comments” in your output later, as in:
label _if3: # true branch for if on line 3 ... t1 = x + 1 # line 11
You can even include a
TACComment
instruction form that acts as a no-op (no semantic effect) but generates a comment to be stored at a specific point in the target program.
- To translate the concatenation of strings
s + t
to TAC, the compiler should generate a builtin function call for the hidden builtin functionRoost.concat(s,t)
. - If implementing methods, the instruction generated for a method call
o.m(...)
must have the receiver objecto
as its first argument, followed by the explicit arguments.
Run-Time Checks
Your three-address code must also include appropriate run-time check instructions.
- For each array access
a[i]
(read or write), the compiler must insert two checks as explicit TAC instructions before the access instruction:- a null check to determine if
a
is not null; and - a bounds check to determine if the index is within bounds for the given array the access is within bounds.
- a null check to determine if
- For each array length lookup operation,
e.length
, the compiler must insert a null check for the result ofe
, before the instruction that retrieves the array length. - For each dynamic array allocation
new T[e]
, the compiler must insert a check to verify that the array size resulting frome
is non-negative. - If supporting functions as values, for each function call
e(...)
, the compiler must insert a null check for the result ofe
before the call. - For each field access
e.f
(or method calle.m()
if implementing methods), the compiler must insert a null check for the result ofe
before the code that accesses the field or calls the method. - For each division or mod operation, the compiler must insert a check for division by zero.
These checks should be represented by special-purpose TAC instructions
(e.g., NullCheck
and BoundsCheck
), not a sequence of regular TAC
instructions that cumulatively compute the relevant conditions. We
will translate the logic of the checks during machine code
generation. Keeping them as semantically meaningful checks at this
level enables more effective optimizations.
Show IR
In addition to all of the options from the previous assignments, your
compiler must support the command-line option --show-ir
that causes
the compiler to print a description of the three-address code for each
function (and method) in the program, labeled with its name. (For
method names, use a qualified name
, consisting of the struct name, a
.
, and the method name. For functions declared in let blocks, invent
a qualified naming scheme that captures every function definition
uniquely.) For readability, be sure to separate the code for different
functions by blank lines.
Machine Code Generation
Next, you will implement a code generator to translate your three-address code into x86 assembly code. Implement a straightforward code generator by translating each TAC instruction into a standard sequence of assembly instructions. Do not attempt to optimize. It is fine for now if the generated assembly code is (very) inefficient. It must match the semantics of the input program. Your translation must correctly handle all of the following:
Your compiler should run code generation as its final stage after
translating the AST to TAC. Given an input file file.roost
, your
compiler should produce an assembly file file.roost.s
.
Calling Conventions and ABI
For function calls, the code generator must generate the code to prepare arguments, call the function, and collect a return result. For each function body, the code generator must include any necessary prologue (setup) or epilogue (finish) code to prepare and cleanup a stack frame.
The x86_64 Linux Application Binary Interface (ABI) defines that:
- Registers
%rax
,%rcx
,%rdx
,%rdi
,%rsi
,%r8 - %r11
are caller-save. Assume that their contents may be destroyed by anycall
instruction. - Registers
%rbx
,%rbp
,%rsp
, and%r12 - %r15
are callee-save. Their existing contents must be preserved (via save/restore) by any function that uses them. - The first 6 arguments to a given call are passed in registers
%rdi
,%rsi
,%rdx
,%rcx
,%r8
, and%r9
. - Remaining arguments to a given call are pushed onto the stack, with the earliest argument closest to the top of the stack (at the lowest memory address).
- A return value is passed in the
%rax
register. - Stack frames must be 16-byte aligned at calls. The stack pointer will be properly aligned when your code is entered. Generated code should preserve the existing alignment (i.e., grow only by multiples of 16 bytes) at any call site.
You must follow the x86_64 Linux ABI when calling external code
(builtin library functions) and where your code is called by external
code (in main
).
You may adopt any calling convention in generated code that calls (or is called by) only other generated code. For example, you may find it easiest to store/find all arguments/parameters on the stack, such that all variables and parameters can be treated uniformly – as stack locations. Any consistent internal calling convention is acceptable for this project as long as you follow the ABI when interfacing with external code.
Data Storage and Access
Variables
For simplicity, allocate space for every TAC variable in the current stack frame at the beginning of the enclosing function. Deallocate this space before return. Following this pattern, every TAC instruction that uses a variable operand can emit a standard code pattern to load or store a value in this variable via a temporary register.
Objects/Structs
For an object allocation expression new S()
, your assembly code
should invoke the runtime support function __LIB_allocateObject(s)
,
which returns a reference to the start of the newly allocated and
zeroed object. The size s
of the allocated object must be chosen to
accommodate all of the fields of the allocated object.
For field accesses o.f
, the code generator must emit code to access
the memory location at address o
plus the constant offset for field
f
(as determined by the static type of o
).
If implementing methods/signatures, the representation requires an
additional 8 bytes to store a reference to the dynamic dispatch table
(vtable/method vector) for the object’s structure type. After
allocating space for an object, your code must set up this reference
and the use virtual table to resolve method invocations. For method
calls o.m(...)
, the code must perform method dispatch using this
reference from object o
to find the address of the relevant method
m
and perform an indirect call to the code at this address. For
method names in the generated assembly, use a naming scheme where a
method m
of a structure A
is named _A_m
.
Arrays and Strings
Arrays and strings will be stored in the heap. To create new arrays,
use the library function __LIB_allocateArray(n)
, which returns a
reference to the first element of a newly allocated and zeroed array
large enough to store n elements. All array elements have a size of
8 bytes, the size for all types in the language (booleans, integers,
and references). The array allocation function stores the array
length in the 8 bytes preceding the base address of the array returned
(i.e., at offset -8).
String constants should be allocated statically in the data segment
using the ASCII encoding, with one byte per character. Strings do not
have null terminators; instead, each string is preceded by a word
indicating the length of the string. As with array lengths, this
length should be stored at an offset of -8 from the address of the
string character array. The length and representation of a string
should treat escaped characters such as '\n'
as one single
character.
Code
Run-Time Check Implementation
The code generator must convert the TAC run-time check
instructions as sequences of assembly instructions
that perform those checks. If a run-time check fails, it should
print a message to report the error, then call __LIB_exit(n)
to
terminate the process with nonzero exit code n.
Calls to Builtin Functions
Calls to Roost
builtin functions (including the hidden concat
function that implements string +
) are translated into function call
sequences using the naming convention given above. For example
Roost.readi64()
should be converted into a call to the function
__LIB_readi64
in the assembly code. You must pass the arguments to
the library functions in registers according to the x86_64 Linux
ABI. The result value will be in %rax
as usual when the
builtin function returns. The code for these functions will be
available in the Roost runtime library.
Main Function
The assembly code must contain a global procedure named
__roost_main
. When a program is executed, the run-time library will
set up the command-line argument list as a valid str[]
array and
then call your __roost_main
procedure with a reference to that
string array in register %rdi
.
Design and Implementation Hints
The starter
branch provides a skeleton for a code generator that
produces much of the boilerplate assembly you need to get things
working. Feel free to use it (git pull origin starter
), then you
will need to change a few AST references and other things – basically
delete and replace anything that doesn’t type-check or doesn’t make
sense to you) to match features of your AST. I suggest splitting your
code generator into separate passes:
- Compute Offsets:
- the object offset for each declared field
- the stack frame offset for each TAC variable, including parameters, local variables, and temporary variables created during TAC generation.
Store this information in declaration nodes in the AST (structures, fields, methods, declared parameters/variables) and in TAC variables (temporaries). Extend your symbol table printer and TAC printer to show this information.
- Generation x86 Assembly Code:
- Generate all necessary data literals (string literals, dispatch tables/method vectors).
- Generate code for each function, code for error handlers, and code
for the
__roost_main
wrapper. - When calling generated from generated code, it is straightforward to pass arguments on the stack as a first simple implementation. It allows all variables, including parameters, local variables, and temporary variables to be represented uniformly: as a memory location accessed via an offset from the stack pointer. Later, you can spend time to optimize for passing arguments in register if desired.
Tools
Once you are working code generation, you will want to use an x86-64 GNU/Linux environment. Supporting macOS is feasible, but requires a couple workarounds. Your code must run on the x86-64 GNU/Linux environment.
CS 240 materials are a good reference for x86 assembly and debugging x86 code with GDB.
You may find it useful to look at the assembly code generated by a C
compiler to see how it accomplishes a given task in x86. You can do
this in gcc with the -S
option. For example, gcc -S a.c
generates
the assembly code file a.s
.
Assembling and Linking
To build an executable from your generated machine code you need to run the assembler and link with the Roost runtime library.
Given an input file file.roost
, your compiler will produce an
assembly file file.roost.s
. You can then use an assembler to convert
this assembly code into an object file file.roost.o
and the linker
to convert this object file into an executable binary
file.roost.bin
. While there are separate assembler and linker tools
available, it is simplest to use gcc
, the GNU compiler collection,
to do both steps:
gcc -g -o file.roost.bin file.roost.s runtime/libroost.a
The -g
flag causes gcc
to create executables with debugging
symbols for use in gdb
.
The runtime library libroost.a
can be built by running make linux
in the src/runtime/c/libroost
directory. The compiled runtime
library contains x86 machine code for the Roost builtin functions that
are defined in the language specification along with support for
conservative garbage collection.
To integrate the assembly and linking step directly into your compiler, you could include something like this:
import scala.sys.process._
val LIB = "runtime/libroost.a"
val ret = Seq("gcc", "-g", "-o", binFilePath, asmFilePath,
System.getenv("ROOST_HOME") + "/" + LIB).!
if (ret != 0) {
// Report assembler/linker error.
}
macOS (unsupported)
Ignore this unless you really want to run your Roost programs
directly on a Mac. Further tinkering may be necessary. All labels
must start with underscore _
. Generate a macOS-friendly runtime
library by running make osx
in src/runtime/c/libroost
. Also,
unlike GCC, the assembler in the LLVM toolchain (installed by
xcode-select --install
on macOS and aliased to gcc
) does not
accept absolute addresses as values. This means that if your code
generator produces something like this, which uses the absolute
address corresponding to the label __string_literal_13
:
movq $__string_literal_13, 8(%rsp)
It will need to produce something like this instead, which uses a relative address:
leaq __string_literal_13(%rip), %rax
movq %rax, 8(%rsp)
GDB
Use GDB to
help understand the execution of your generated code. To run
executable foo
from the current directory under GDB: gdb ./foo
.
Typically, set a break point at __roost_main
or other later point of
interest, type run
, then step through and inspect your program’s
data from that point on.
CS 240 materials are a good reference for x86 assembly and debugging x86 code with GDB.
Standard Extensions
If you have time and interest, here is a quick sketch of implementation notes for the Roost standard language extensions. Talk to me if you are planning to implement these and have questions.
- Parametric polymorphic types:
- Use type erasure to create a single version of each generic function that works for any concrete type parameters.
- Methods, signatures, subtyping:
- For translation to IR:
- Translate methods like functions, but at the receiver as the first argument.
- Implement dispatch tables as explored in tutorial.
- Include a distinct method call TAC instruction separate from function call. They have different semantics.
- For code generation:
- Generate code for methods as with functions, but assume the receiver/target object will be passed as a first argument.
- Create
.rodata
assembly data declarations for dispatch tables. - Augment objects with a header word that stores a pointer to the
relevant dispatch table. Initialize this header in the code
generated for
new
. - Generate code for method calls that explicitly completes the dispatch table lookup using the
- For translation to IR:
- Closures:
- Replace stack storage of execution frames with heap-allocated storage of execution frames for arguments and locals. (Only necessary for those that are captured in closures.)
- Keep explicit references from the current execution to its parent execution frame to support look ups of variables in the parent scope.
- Alternatively, do a source-to-source translation to capture this all explicitly using auto-generated Roost structs.