code Implement a TAC IR, lowering from ASTs to TAC, and code generation from TAC to x86.

Assign: Tuesday, 2 April
Checkpoint: Translation to IR complete and computation of storage offsets complete by Friday, 12 April
Due: Code Generation and all parts complete by Friday, 19 April
Reference:

Plan
- Stages/Checkpoints
- Package Structure
Translation of ASTs to Intermediate Representation (IR)
Machine Code Generation
Tools
- Assembling and Linking
- GDB
Standard Extensions

Plan

By the end of this phase of the project, you will be able to run Roost programs generated by your compiler!

You will implement:

a translator to convert your AST representation to an intermediate representation (IR) using three-address code (TAC); and
a code generator to convert IR/TAC programs to x86 assembly code.

Code generators would seem no more complex than other parts of a compiler, but they are notorious for subtle bugs. Start early and design carefully before you start coding. Implement, test, and commit your code incrementally. Definitely start with the core language and consider extension only if you have the core language working and well tested. If you are debugging issues with the code generator, seek help with effective debugging techniques before burning valuable time on misinformed or haphazard bug hunts.

All parts of the Roost Compiler Requirements and Coding Guide apply to this project stage.

You are required to implement back end support for the Roost core language. If you have time and curiosity, implementing the standard extensions adds some interest in a few spots.

Stages/Checkpoints

The Back End project stage includes one intermediate checkpoint:

There is one intermediate checkpoint for this phase. Be sure to add comments to your code to document any special cases, describe tricky parts, and provide an overview of how any non-trivial class is designed.

Due 12 April 2019:
- IR complete: Your compiler must translate valid Roost programs to a three-address code (TAC) intermediate representation (IR) and support the --show-ir command-line option. Your README.md must contain a section defining your TAC instruction set and noting any important design decisions.
- Code Generation progress: You should aim to have the code generation implementation underway with computation of storage offsets complete. Additional code generation progress is recommended but not required at this checkpoint.
- Submit using the ir branch.
Due 19 April 2019:
- Code Generation complete: Your compiler must translate valid Roost programs to x86 assembly code. Your README.md must document the design of your code generator.
- Submit using the codegen branch.

If you finish early, I suggest getting a head start on the next project phase: the machine-independent optimizer. Alternatively, if you are unhappy with the inefficient generated machine code, you could begin work on a variety of improvements to instruction selection or register allocation in the code generation. Talk to me.

Package Structure

Support for the Back End components of the compiler should be implemented in the following packages:

roost.ir: intermediate representation (IR) / three-address code (TAC) definition and translation from ASTs
roost.codegen: machine code generation

Translation of ASTs to Intermediate Representation (IR)

To support optimization and machine code generation, your Roost compiler will use an intermediate representation (IR) for programs based on three-address code (TAC) to capture a computation that is equivalent to the Roost code. You are responsible for choosing the specific instructions for your own TAC instruction set. The TAC instruction set will probably include the standard kinds of instructions: unary and binary operations, data movement instructions, labels and branch instructions, calls, and returns. You will also need to add explicit instructions for run-time checks. In general, the most significant differences between your ASTs or Roost source code and a well-designed IR are the flattening (elimination) of nested expressions and the decomposition of control-flow constructs into branches and jumps. Use the sample TAC definitions from past tutorial exercises and the TAC from the Tiny compiler as inspiration.

You must clearly describe your TAC instruction set and its semantics in README.md. While the primary representation of TAC will be as internal data structures of your Scala implementation of the Roost compiler, you should also design and document a syntax for displaying TAC instructions.

The compiler should perform all of the tasks from the front end. Next, it will translate the AST to three-address code.

Three-Address Code (TAC) Instruction Set Design

After deciding on a TAC instruction set, you will implement a translation from your AST representation to TAC. Your translation phase must convert high-level constructs such as if and while, short-circuit conditional expressions, break and continue, etc. into TAC using branch and jump instructions. As usual, you will define this translation as a recursive function over ASTs. The function is the basic unit for translation. (There is little interesting translation work outside functions, at least for the core language.)

Optionally, you may try to generate ``clean’’ TAC code with no consecutive labels, no unnecessary jumps, etc. These are optimizations — do not implement them unless you have completed the basic assignment with time to spare. Do not try to minimize the number of temporary names within a method, for the reasons we discussed in meetings.

Design Hints

Design suggestions:

I strongly encourage implementing the TAC instructions as a collection of case classes extending an abstract TACInstr class or similar.
Create a TACList class or similar that stores a list of TAC instructions. After translating to TAC, each method declaration in your AST would then have a TACList attached.
- Note that the operands to most TAC instructions can be (a) program variables, (b) temporary variables, or (c) constants. Your TAC design should support this.
- You may wish to design your TACList and TACInstr classes to support generic map/fold operations to make it easy for clients to traverse and process TAC lists (for printing, generating x86 code, etc.).
- To help understand generated code and debug TAC and x86 code generation, I encourage you to design your TAC representation (and x86 code generator) to support annotation strings on individual instruction objects that can be printed out as “comments” in your output later, as in:
```
label _if3: # true branch for if on line 3
...
t1 = x + 1  # line 11
```
  You can even include a TACComment instruction form that acts as a no-op (no semantic effect) but generates a comment to be stored at a specific point in the target program.
To translate the concatenation of strings s + t to TAC, the compiler should generate a builtin function call for the hidden builtin function Roost.concat(s,t).
If implementing methods, the instruction generated for a method call o.m(...) must have the receiver object o as its first argument, followed by the explicit arguments.

Run-Time Checks

Your three-address code must also include appropriate run-time check instructions.

For each array access a[i] (read or write), the compiler must insert two checks as explicit TAC instructions before the access instruction:
- a null check to determine if a is not null; and
- a bounds check to determine if the index is within bounds for the given array the access is within bounds.
For each array length lookup operation, e.length, the compiler must insert a null check for the result of e, before the instruction that retrieves the array length.
For each dynamic array allocation new T[e], the compiler must insert a check to verify that the array size resulting from e is non-negative.
If supporting functions as values, for each function call e(...), the compiler must insert a null check for the result of e before the call.
For each field access e.f (or method call e.m() if implementing methods), the compiler must insert a null check for the result of e before the code that accesses the field or calls the method.
For each division or mod operation, the compiler must insert a check for division by zero.

These checks should be represented by special-purpose TAC instructions (e.g., NullCheck and BoundsCheck), not a sequence of regular TAC instructions that cumulatively compute the relevant conditions. We will translate the logic of the checks during machine code generation. Keeping them as semantically meaningful checks at this level enables more effective optimizations.

Show IR

In addition to all of the options from the previous assignments, your compiler must support the command-line option --show-ir that causes the compiler to print a description of the three-address code for each function (and method) in the program, labeled with its name. (For method names, use a qualified name, consisting of the struct name, a ., and the method name. For functions declared in let blocks, invent a qualified naming scheme that captures every function definition uniquely.) For readability, be sure to separate the code for different functions by blank lines.

Machine Code Generation

Next, you will implement a code generator to translate your three-address code into x86 assembly code. Implement a straightforward code generator by translating each TAC instruction into a standard sequence of assembly instructions. Do not attempt to optimize. It is fine for now if the generated assembly code is (very) inefficient. It must match the semantics of the input program. Your translation must correctly handle all of the following:

Your compiler should run code generation as its final stage after translating the AST to TAC. Given an input file file.roost, your compiler should produce an assembly file file.roost.s.

Calling Conventions and ABI

For function calls, the code generator must generate the code to prepare arguments, call the function, and collect a return result. For each function body, the code generator must include any necessary prologue (setup) or epilogue (finish) code to prepare and cleanup a stack frame.

The x86_64 Linux Application Binary Interface (ABI) defines that:

Registers %rax, %rcx, %rdx, %rdi, %rsi, %r8 - %r11 are caller-save. Assume that their contents may be destroyed by any call instruction.
Registers %rbx, %rbp, %rsp, and %r12 - %r15 are callee-save. Their existing contents must be preserved (via save/restore) by any function that uses them.
The first 6 arguments to a given call are passed in registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9.
Remaining arguments to a given call are pushed onto the stack, with the earliest argument closest to the top of the stack (at the lowest memory address).
A return value is passed in the %rax register.
Stack frames must be 16-byte aligned at calls. The stack pointer will be properly aligned when your code is entered. Generated code should preserve the existing alignment (i.e., grow only by multiples of 16 bytes) at any call site.

You must follow the x86_64 Linux ABI when calling external code (builtin library functions) and where your code is called by external code (in main).

You may adopt any calling convention in generated code that calls (or is called by) only other generated code. For example, you may find it easiest to store/find all arguments/parameters on the stack, such that all variables and parameters can be treated uniformly – as stack locations. Any consistent internal calling convention is acceptable for this project as long as you follow the ABI when interfacing with external code.

Data Storage and Access

Variables

For simplicity, allocate space for every TAC variable in the current stack frame at the beginning of the enclosing function. Deallocate this space before return. Following this pattern, every TAC instruction that uses a variable operand can emit a standard code pattern to load or store a value in this variable via a temporary register.

Objects/Structs

For an object allocation expression new S(), your assembly code should invoke the runtime support function __LIB_allocateObject(s), which returns a reference to the start of the newly allocated and zeroed object. The size s of the allocated object must be chosen to accommodate all of the fields of the allocated object.

For field accesses o.f, the code generator must emit code to access the memory location at address o plus the constant offset for field f (as determined by the static type of o).

If implementing methods/signatures, the representation requires an additional 8 bytes to store a reference to the dynamic dispatch table (vtable/method vector) for the object’s structure type. After allocating space for an object, your code must set up this reference and the use virtual table to resolve method invocations. For method calls o.m(...), the code must perform method dispatch using this reference from object o to find the address of the relevant method m and perform an indirect call to the code at this address. For method names in the generated assembly, use a naming scheme where a method m of a structure A is named _A_m.

Arrays and Strings

Arrays and strings will be stored in the heap. To create new arrays, use the library function __LIB_allocateArray(n), which returns a reference to the first element of a newly allocated and zeroed array large enough to store n elements. All array elements have a size of 8 bytes, the size for all types in the language (booleans, integers, and references). The array allocation function stores the array length in the 8 bytes preceding the base address of the array returned (i.e., at offset -8).

String constants should be allocated statically in the data segment using the ASCII encoding, with one byte per character. Strings do not have null terminators; instead, each string is preceded by a word indicating the length of the string. As with array lengths, this length should be stored at an offset of -8 from the address of the string character array. The length and representation of a string should treat escaped characters such as '\n' as one single character.

Code

Run-Time Check Implementation

The code generator must convert the TAC run-time check instructions as sequences of assembly instructions that perform those checks. If a run-time check fails, it should print a message to report the error, then call __LIB_exit(n) to terminate the process with nonzero exit code n.

Calls to Builtin Functions

Calls to Roost builtin functions (including the hidden concat function that implements string +) are translated into function call sequences using the naming convention given above. For example Roost.readi64() should be converted into a call to the function __LIB_readi64 in the assembly code. You must pass the arguments to the library functions in registers according to the x86_64 Linux ABI. The result value will be in %rax as usual when the builtin function returns. The code for these functions will be available in the Roost runtime library.

Main Function

The assembly code must contain a global procedure named __roost_main. When a program is executed, the run-time library will set up the command-line argument list as a valid str[] array and then call your __roost_main procedure with a reference to that string array in register %rdi.

Design and Implementation Hints

The starter branch provides a skeleton for a code generator that produces much of the boilerplate assembly you need to get things working. Feel free to use it (git pull origin starter), then you will need to change a few AST references and other things – basically delete and replace anything that doesn’t type-check or doesn’t make sense to you) to match features of your AST. I suggest splitting your code generator into separate passes:

Compute Offsets:
- the object offset for each declared field
- the stack frame offset for each TAC variable, including parameters, local variables, and temporary variables created during TAC generation.
Store this information in declaration nodes in the AST (structures, fields, methods, declared parameters/variables) and in TAC variables (temporaries). Extend your symbol table printer and TAC printer to show this information.
Generation x86 Assembly Code:
- Generate all necessary data literals (string literals, dispatch tables/method vectors).
- Generate code for each function, code for error handlers, and code for the __roost_main wrapper.
- When calling generated from generated code, it is straightforward to pass arguments on the stack as a first simple implementation. It allows all variables, including parameters, local variables, and temporary variables to be represented uniformly: as a memory location accessed via an offset from the stack pointer. Later, you can spend time to optimize for passing arguments in register if desired.

Tools

Once you are working code generation, you will want to use an x86-64 GNU/Linux environment. Supporting macOS is feasible, but requires a couple workarounds. Your code must run on the x86-64 GNU/Linux environment.

CS 240 materials are a good reference for x86 assembly and debugging x86 code with GDB.

You may find it useful to look at the assembly code generated by a C compiler to see how it accomplishes a given task in x86. You can do this in gcc with the -S option. For example, gcc -S a.c generates the assembly code file a.s.

Assembling and Linking

To build an executable from your generated machine code you need to run the assembler and link with the Roost runtime library.

Given an input file file.roost, your compiler will produce an assembly file file.roost.s. You can then use an assembler to convert this assembly code into an object file file.roost.o and the linker to convert this object file into an executable binary file.roost.bin. While there are separate assembler and linker tools available, it is simplest to use gcc, the GNU compiler collection, to do both steps:

gcc -g -o file.roost.bin file.roost.s runtime/libroost.a

The -g flag causes gcc to create executables with debugging symbols for use in gdb.

The runtime library libroost.a can be built by running make linux in the src/runtime/c/libroost directory. The compiled runtime library contains x86 machine code for the Roost builtin functions that are defined in the language specification along with support for conservative garbage collection.

To integrate the assembly and linking step directly into your compiler, you could include something like this:

import scala.sys.process._

val LIB = "runtime/libroost.a"
val ret = Seq("gcc", "-g", "-o", binFilePath, asmFilePath,
              System.getenv("ROOST_HOME") + "/" + LIB).!
if (ret != 0) {
  // Report assembler/linker error.
}

macOS (unsupported)

Ignore this unless you really want to run your Roost programs directly on a Mac. Further tinkering may be necessary. All labels must start with underscore _. Generate a macOS-friendly runtime library by running make osx in src/runtime/c/libroost. Also, unlike GCC, the assembler in the LLVM toolchain (installed by xcode-select --install on macOS and aliased to gcc) does not accept absolute addresses as values. This means that if your code generator produces something like this, which uses the absolute address corresponding to the label __string_literal_13:

movq $__string_literal_13, 8(%rsp)

It will need to produce something like this instead, which uses a relative address:

leaq __string_literal_13(%rip), %rax
movq %rax, 8(%rsp)

GDB

Use GDB to help understand the execution of your generated code. To run executable foo from the current directory under GDB: gdb ./foo. Typically, set a break point at __roost_main or other later point of interest, type run, then step through and inspect your program’s data from that point on.