Roost Compiler Requirements and Coding Guide

Getting Started
Project Stage Requirements
- Lexer
- Front End
- Back End
- Optimizer
- Feature
Compiler Output Requirements
Programming Support
- Utility Code for Debug Messages
- Assertions
Testing Support
Documentation and Style
Submission and Evaluation
- Submit a Merge Request
- Code Review

Reference

Getting Started

These instructions assume you are using your CS account on one of the CS GNU/Linux workstations in the Microfocus. They can be adapted with minor changes to other environments if you are installing the tools yourself.

To set up CS 301 tools in your CS account¹ permanently² for all CS GNU/Linux workstations, log into one, open a terminal, and run:

source /home/cs301/live/env/init/account.sh

Clone Your Project Repository

If you have not previously used Git in your CS account, configure it with your identity by running these commands in a terminal:
```
git config --global user.name "Your Name"
git config --global user.email "your.email@wellesley.edu"
```
Visit the URL of your team GitLab repository. (You should have this in an email from GitLab.)
In the upper right, click Clone and choose the HTTPS URL (or SSH if you have configured SSH keys for GitLab in your CS account).
Run the following command, replacing <URL> with the URL you copied from GitLab and <DIR> with the path you want to use for your local project directory. (Use Shift-Control-V to paste in the terminal.)
```
git clone <URL> <DIR>
```
Alternatively, you can use IntelliJ to check out the project by choosing Check out from Version Control > Git, and using the same URL and directory described above.

Set Up the Project in IntelliJ

Launch IntelliJ with the command idea.sh & or, if you have launched it previously, select it from the Applications > Programming menu.

General IntelliJ Settings

If you have not used IntelliJ in your account yet:
- Choose Do not import settings.
- Accept the license, then choose Don’t send (data sharing).
- Choose a UI style, then Skip Remaining and Set Defaults.
Open the IntelliJ preferences. (File > Settings in the menubar or Configure > Settings in the lower right of the IntelliJ welcome window.)
- Choose Build, Execution, Deployment > Compiler.
  - Enable Build project automatically.
  - In Scala Compiler > Scala Compile Server:
    - Set JVM options to -server -Xss256m
    - Set JVM maximum heap size, MB to 2048
  - In Languages & Frameworks > Scala:
    - Optionally enable Show type info on mouse hover …

IntelliJ Plugins

The IntelliJ Scala plugin, needed for your project, has been preinstalled on the lab workstations.

As you open files that are not Scala or Java files, IntelliJ may offer to install related plugins with a bar across the top of the editor pane. Unfortunately, the plugins it offers for JFlex and Java CUP are not useful to us. You should dismiss these offers by clicking Ignore extension.

One exception is the Markdown support plugin from JetBrains, which lets you preview the rendered Markdown in one pane as you write in another. I find typing becomes a bit laggy using the plugin, so I prefer to write Markdown in my usual text editor instead, but you may appreciate it.

If installing plugins, be careful of the plugin selection process. When there are multiple plugins in the window that pops up when you click the Install plugins option, you must explicitly uncheck those plugins that you do not want to install.

Open and Configure the Project

From the File menu or the welcome window, choose Open and select your project directory.
When prompted to Import project from sbt, click OK.
Open the IntelliJ preferences again. In Build, Execution, Deployment > Compiler, enable Build project automatically.
Wait for the progress indicator near the middle/right end of the bottom status bar to finish.
From the Run menu or near the right hand end of the toolbar, choose Edit Configurations….
- In the upper left of the Run/Debug Configurations window, click the + and select sbt Task from the drop-down menu.
- Set the Name field to “Lexer/Parser Generators”.
- Set the Tasks: field to “generators”.
- Click OK.
Near the right end of the toolbar, between the hammer and the play button, you should now see Lexer/Parser Generators and the play button should be green. Click the green play button to run the Lexer/Parser Generators task.
- This should pop up an sbt shell window pane. (If not, click the sbt shell tab in the lower left.)
- The first time after you launch IntelliJ, this will be pretty slow. Subsequent runs will be faster.
- The sbt shell pane will show the output of running the generators. (More about that in the first two project stages!)
Click the build button (green hammer near right end of toolbar) to force a build for good measure.
Look for the Problems tab in the lower left/middle and open this window pane. This is where any (Scala) compiler errors will be reported. There should not be any errors right now.
Click the upper left (sideways) Project pane to view the project files.

Set Up Paths

To configure your current shell session to find your Roost compiler as roostc, use this command:

cd <your-compiler-project-path>
source env.sh

To set this up automatically in every new shell, use your favorite text editor to add the following line to the file .bash_profile in your home directory:

pushd <your-compiler-project-path> && { source env.sh ; popd ; }

Project Files

bin/: scripts, utilities
- roostc: wrapper to run the compiler
- test-roostc-status.py: basic testing script for source code from ./src/test/roost/<stage>/*.
lib/: jar files for external libraries and tools
src/: all code
- main/scala/: compiler source code
- test/roost/: Roost source files for testing
README.md: top-level documentation of your implementation

Building the Roost Compiler

To build from IntelliJ (after completing the project setup steps above):

All Scala files will build automatically, shortly after you make changes. To force a build (sometimes IntelliJ gets tired?), click the green hammer in the upper right toolbar.
To generate the lexer (with JFlex) or parser (with Java CUP) from your specifications, click the green play button next to your Lexer/Parser Generators task in the upper right toolbar.

To build from a shell:

Run sbt compile. Alternatively, start an sbt shell by running sbt, then run the command compile. This will run the lexer/parser generators and build all Scala files.

Running the Roost Compiler

The provided code includes a skeleton for the compiler’s command-line interface. Assuming you have completed the project setup steps above and the build has completed, the roostc wrapper for your compiler should be available. At the command line, your compiler can be invoked as follows:

roostc file.roost

Additional options can be seen with the -h or --help option:

roostc --help

The wrapper script roostc simply invokes the scala runtime with the right environment arguments to find and evaluate the main entrypoint in roost.Compiler. All command-line arguments to roostc are passed into your compiler’s main entrypoint.

Committing and Pushing Changes with Git

Your team’s work is hosted as a Git repository on GitLab, which is also where I will collect and review your work. Git will help you track changes and restore old versions if things go wrong. You have used Git or Mercurial on small assignments in CS 240, so this should be somewhat familiar, but this may be your first time working on a large, protracted software project with version control.

As you work, you should frequently:

Work together in pair/trio programming style with your team. This is the preferable mode of work.
Communicate with your team (if you have to work separately) to avoid conflicts (concurrent edits to the same parts of the same files) and other broken merges (edits that change something the other teammate is depending on).
git add and git commit cohesive sets of changes with a descriptive commit message.
git pull commits from, and git push commits to, your team repository.

You can perform Git operations on the command line or from within IntelliJ (see the VCS menu). (Or, if you use Emacs, check out Magit!)

More reference:

Git Documentation (← start here)
GitLab Documentation (← start here)
- GitLab Merge Requests for submitting project checkpoints and stages.
- GitLab Issues for tracking bugs, features, to-dos, etc.
You can practice Git skills with the Tutorial from CS 240, even though you’re not in that course currently.

Tracking Issues and To-do Items with GitLab

At some point, you may have a bug! At many times, you will have a wide range of tasks that need doing, such as debugging and fixing a problem, adding a new feature, redesigning and changing the implementation of an existing feature, updating documentation, etc. To help coordinate progress on these tasks and document the knowledge required (or discovered) to complete them, you may find it useful to use the Issue Tracker hosted with your repository on GitLab.

Project Stage Requirements

Specifications of each compiler stage are described separately in the corresponding project stage.

Lexer

Implement lexical analysis.

Front End

Implement parsing, ASTs, symbol tables, and type checking.

Back End

Implement a TAC IR, lowering from ASTs to TAC, and code generation from TAC to x86.

Optimizer

Implement several compiler optimizations.

Feature

Add a non-trivial new feature to your compiler.

assign: Friday, 19 April
checkpoint: Tuesday, 30 April
work: Tuesday, 7 May
work: Friday, 10 May
checkpoint: Tuesday, 14 May
presentations: Tuesday, 14 May
due: Tuesday, 21 May

Compiler Output Requirements

In addition to the specific requirements for each project stage, which may include outputs such as stage summaries enabled by command-line options or files generated by compilation, the following requirements for compiler output apply to the entire project.

Error Messages

Your compiler should detect and report the first lexical analysis error it encounters (if any). Your program must always report the first lexical error in the file; reporting later errors is not necessary. The compiler should print an error message, report its final status, and exit cleanly. The format and exact content of error messages is left to you. They must be informative and useful to the programmer in understanding and fixing the offending issue in the source code: it should be easy to fix the problem immediately after reading the message. It is highly recommended that you include a line and column number of a position in the input program source code where the error arises. This is helpful not just to your (for now imaginary) end users, but especially to you while you are testing and debugging your compiler.

Source code error reporting will be an important feature of your compiler for lexical errors in this stage and many other types of errors in future stages. One convenient way to organize error-reporting is by raising instances of subtypes of roost.error.CompilerError, an exception class. Whenever the program encounters an error in source code, the relevant component can raise an appropriate type of CompilerError exception. The top-level compiler logic can then catch and report any CompilerError in a single central location.
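
One possible shape for this pattern is sketched below. It is only an illustration: the CompilerError defined here is a stand-in for the starter code's roost.error.CompilerError, whose actual definition may differ, and the names ErrorSketch, LexicalError, checkChars, and compile are hypothetical.

```scala
// Hypothetical sketch; the real roost.error.CompilerError may be defined differently.
object ErrorSketch {
  // Base class for all source-code errors; carries a source position.
  abstract class CompilerError(val kind: String, val line: Int, val column: Int, val detail: String)
      extends Exception(s"$kind error at line $line, column $column: $detail")

  // One subtype per category of error (assumed name).
  class LexicalError(line: Int, column: Int, detail: String)
      extends CompilerError("Lexical", line, column, detail)

  // Stand-in for a compiler stage: rejects any character outside a tiny alphabet.
  def checkChars(source: String): Unit = {
    val lines = source.split("\n", -1)
    for ((text, row) <- lines.zipWithIndex; (ch, col) <- text.zipWithIndex) {
      if (!(ch.isLetterOrDigit || ch.isWhitespace))
        throw new LexicalError(row + 1, col + 1, s"unexpected character '$ch'")
    }
  }

  // Single central location that catches and reports any CompilerError.
  // Returns Right(()) on success or Left(message) for the first error raised.
  def compile(source: String): Either[String, Unit] =
    try { checkChars(source); Right(()) }
    catch { case e: CompilerError => Left(e.getMessage) }
}
```

Because every stage raises a subtype of the same exception class, the top-level logic needs only one catch clause, and the line and column travel with the error rather than being formatted at each raise site.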

Status Reporting

Regardless of whether your compiler prints other required information as indicated by command-line options, reports a compiler error, etc., it must clearly report the final status of compilation upon termination. Your compiler must do the following two things to report whether it accepted or rejected the source program:

The last line printed by your compiler must always be one of Accepted. or Rejected., formatted on its own line. The output of your compiler must contain nothing else after this line.
The exit code of the compiler process must be 0 if the compiler accepts the source program and nonzero if it rejects the source program. Scala’s built-in sys.exit(x) terminates the process and yields the given exit code, x.

These will be helpful for automating tests of your compiler.
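
A minimal sketch of these two requirements follows. The object and method names (StatusSketch, statusLine, exitCode, finish) are hypothetical, and the choice of 1 as the nonzero exit code is an assumption; any nonzero value satisfies the requirement.

```scala
// Hypothetical helper; names and the specific nonzero code are assumptions.
object StatusSketch {
  // Final status line for an accepted or rejected program.
  def statusLine(accepted: Boolean): String =
    if (accepted) "Accepted." else "Rejected."

  // Exit code: 0 for accepted, nonzero (here 1) for rejected.
  def exitCode(accepted: Boolean): Int =
    if (accepted) 0 else 1

  // Print the status as the very last line of output, then terminate the process.
  def finish(accepted: Boolean): Nothing = {
    println(statusLine(accepted))
    sys.exit(exitCode(accepted))
  }
}
```

Routing every exit path through one function like finish makes it harder to print something after the status line by accident.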

No Other Output

Excepting the outputs explicitly required by each stage (e.g., --show-tokens for the lexer), compiler error messages, and status reporting, your compiler should print no other output under normal operation. If you wish to show additional information for yourself while developing, testing, or debugging, try the provided mechanism for explicitly enabling extra informational messages.

Programming Support

The starter code for roost.Compiler demos a few system interaction features like working with buffered file IO and parsing command-line arguments (using the scopt library). Your compiler must implement at least the command-line options and status-reporting behavior, regardless of how. Successive stages will specify additional requirements of the same style. As long as you satisfy these specifications, you may replace or change any parts of the starter code.

Utility Code for Debug Messages

One feature of the provided code that you may find useful is support for controlling the printing of informational messages from within your compiler. As you develop your compiler, you may find it useful to display more information about incremental internal steps than is required (or allowed) by the output specification. The roost.Util object provides a method debug for printing such messages. This method has two useful features:

It uses printf-style formatting, which is more efficient than constructing strings through repeated concatenation with +.
By default, debug never prints its messages. The command-line flag -d (or --debug) can be used to enable the messages when needed. This helps avoid the tradeoff between cluttering the compiler output and constantly adding/removing/commenting/uncommenting code to print such messages.

Using the -d or --debug flag with no additional argument enables printing of all debug messages. Giving a comma-separated list of debug keys as an argument to the -d or --debug flag enables only the debug messages that are associated with this list of debug keys and those messages that are associated with no key at all. The first argument to debug is an option (None or Some(...)) indicating how the message is keyed. The second argument is a format string. Any remaining arguments are used to fill the % holes in the format string.

import roost.Util
Util.debug(None, "#1. See line %d Debug messages are enabled!", lineNumber)
Util.debug(Some("lex"), "#2. Debug messages are enabled for key 'lex'!")
Util.debug(Some("parse"), "#3. Debug messages are enabled for key 'parse'!")

Given the above Util.debug calls, running roostc

without -d/--debug does not allow any of the messages to print;
with -d/--debug (no argument) allows #1, #2, and #3 to print;
with -d parse/--debug parse allows #1 and #3 to print;
with -d lex,parse/--debug lex,parse allows #1, #2, and #3 to print.

This feature makes it attractive to leave your informational messages for all stages in place and enable only those that you need currently.
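
To make the key-filtering rule concrete, here is a rough sketch of how such a mechanism could be written, following the prose description of the flag above. It is not the provided roost.Util code: DebugSketch, enabled, parseKeys, and the explicit setting parameter are all hypothetical, and the real debug method takes only the key, format string, and arguments.

```scala
// Hypothetical re-creation of keyed debug filtering; not the provided roost.Util.
object DebugSketch {
  // setting = None:            no -d flag, nothing prints.
  // setting = Some(empty set): -d with no argument, every message prints.
  // setting = Some(keys):      -d key1,key2, unkeyed messages and listed keys print.
  def enabled(setting: Option[Set[String]], key: Option[String]): Boolean =
    setting match {
      case None                       => false
      case Some(keys) if keys.isEmpty => true
      case Some(keys)                 => key.forall(keys.contains)
    }

  // Parse the optional comma-separated argument of -d/--debug.
  def parseKeys(arg: String): Set[String] =
    arg.split(",").map(_.trim).filter(_.nonEmpty).toSet

  // printf-style message that stays silent unless enabled.
  def debug(setting: Option[Set[String]], key: Option[String], format: String, args: Any*): Unit =
    if (enabled(setting, key)) println(format.format(args: _*))
}
```

Here key.forall(keys.contains) is true for an unkeyed message (None) and for a keyed message whose key is in the enabled set, which matches the rule that unkeyed messages print whenever any debugging is enabled.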

Feel free to add other broadly useful functionality in the roost.Util object. You will likely import it in most files.

Assertions

You should make liberal use of Scala’s assertion facilities: use assert(condition, "message") to assert that specific Boolean conditions (e.g., preconditions, postconditions, invariants) are always true at run time, and otherwise intentionally crash with an exception after printing message. Use assertions to check for logic errors in your compiler code. Do not use assertions for reporting errors in user input, such as command-line flags or Roost source code. User input errors, such as Roost source code errors, are an expected and normal occurrence for the Roost compiler which must be handled by normal code in the compiler; they are not logic errors in your compiler.
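
The distinction might look like the following sketch. SourceError and parseDigit are hypothetical names used only for illustration; in your compiler, user-facing errors would go through whatever error type you have chosen.

```scala
// Hypothetical example contrasting user-input errors with internal invariants.
object AssertSketch {
  // Stand-in for a user-facing error type (assumed name).
  final case class SourceError(message: String) extends Exception(message)

  def parseDigit(ch: Char): Int = {
    // User input problem: an expected occurrence, reported as a normal error.
    if (!ch.isDigit) throw SourceError(s"expected a digit but found '$ch'")
    val value = ch.asDigit
    // Internal invariant: if this fails, the bug is in the compiler, not the input.
    assert(value >= 0 && value <= 9, s"digit value out of range: $value")
    value
  }
}
```

A failed assert here would signal a mistake in parseDigit itself, while SourceError is something a caller can catch and report to the programmer.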

Testing Support

You must test your lexer. You should develop a thorough test suite that tests all legal tokens and as many lexical errors as you can think of. We will test your lexer against our own test cases and those of your classmates, using both lexically well-formed and lexically ill-formed inputs.

The starter code provides a basic testing script in bin/test-roostc-status.py. For this stage, it expects test inputs in src/test/roost/lex/all/, where tests are divided into tests the compiler should accept and those it should reject. You should write dozens of tests for each stage, mixing both kinds to ensure your compiler accepts programs that it should and rejects programs that it should. Feel free to extend the script (make your own copy, in case I update the original) to perform more extensive testing.

As we get into later stages, we will discuss adding more types of tests.
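
If you extend your testing, one check worth automating is that each run obeys the status-reporting rules. The sketch below shows the idea as a pure function over a run's captured output and exit code. It is not part of the provided script; StatusCheckSketch, verdict, and passes are hypothetical names.

```scala
// Hypothetical checker for one compiler run; not part of the provided test script.
object StatusCheckSketch {
  // Right(true) = accepted, Right(false) = rejected,
  // Left(reason) = the run violated the status-reporting rules.
  def verdict(output: String, exitCode: Int): Either[String, Boolean] =
    output.linesIterator.toList.lastOption match {
      case Some("Accepted.") if exitCode == 0 => Right(true)
      case Some("Rejected.") if exitCode != 0 => Right(false)
      case Some("Accepted.") | Some("Rejected.") =>
        Left(s"status line disagrees with exit code $exitCode")
      case _ =>
        Left("last line is neither Accepted. nor Rejected.")
    }

  // A test passes when the verdict matches the expectation for that input.
  def passes(expectAccept: Boolean, output: String, exitCode: Int): Boolean =
    verdict(output, exitCode) == Right(expectAccept)
}
```

Treating a mismatch between the status line and the exit code as its own failure, rather than as a plain reject, helps catch bugs in the reporting code separately from bugs in the compiler stages.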

Documentation and Style

Follow the Scala Style Guide plus general rules of thumb for clean code, using your best judgment. Use assertions judiciously. Style matters more the larger the project gets.

Use Scaladoc header comments on classes and methods, especially for important parts of each stage. Use succinct inline comments to document steps of logic as needed when they are not abundantly clear from the code.
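
As a small illustration of the comment style, here is a sketch of Scaladoc headers on a class and a method. The Position class is hypothetical and is not part of the starter code.

```scala
// Hypothetical class used only to illustrate Scaladoc header comments.
object DocSketch {
  /** A position in a Roost source file.
    *
    * @param line   1-based line number
    * @param column 1-based column number
    */
  final case class Position(line: Int, column: Int) {
    /** Formats this position as `line:column` for use in error messages. */
    def show: String = s"$line:$column"
  }
}
```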

Maintain an up-to-date README.md. It should include:

documentation of how to build and run the compiler;
a high-level description of your compiler design and implementation;
documentation of any additional or non-standard features;
justification of important design choices;
a change log summarizing major changes in design or implementation (with dates);
any critical known issues in your design or implementation.

Keep your compiler’s command-line interface self-documentation (roostc -h/--help) up to date but succinct.

Submission and Evaluation

Commit and push your work to GitLab as you go. Each project stage except the lexer includes a final stage deadline, when all parts of the stage are due, plus multiple intermediate checkpoints, when individual features from the full stage are due. Please use clearly identifiable commit messages of the form x submission for each checkpoint or stage deadline x. After all stages and many checkpoints, I will test and review your code and provide feedback with the mechanisms described below. We can also schedule in-person code review sessions for more interactive feedback.

Submit a Merge Request

When you are ready to submit your work for a checkpoint or stage deadline, submit a merge request to the review branch on GitLab:

Make sure your work is committed and pushed to GitLab and ensure that this is the latest work committed.
Open the GitLab web page for your project and visit the Repositories > Branches section in the left side menu. Click the New branch button in the upper right. The Create from field should be master (or whatever branch contains your work for this step). Enter the prescribed Branch name for this checkpoint or stage deadline and click Create branch.
From the left bar, choose Merge Requests, then click New Merge Request. (Ignore the shortcut offers to create a merge request from your new branch.)
On the New Merge Request page, set the following:
- Source branch: the branch you just created
- Target branch: review (this is the branch representing what work I have reviewed)
Then click Compare branches and continue.
Edit the Title to include the checkpoint or stage (and any other relevant indicators you choose).
Provide a Description of what work you are submitting for review. Note any key items that I should pay attention to in my review.
Set the Assignee: Ben Wood (@bpw).
Feel free to use labels if it helps you stay organized, but do not use any of the other settings below that.
Review the Commits and Changes tabs at the bottom of the page and ensure that the set of changes you are submitting for review is what you intend.
Finally, click Submit merge request to submit the code to me for review.

Unless all members of your team are experienced with Git and branching, I suggest doing all development on master and creating branches only for submission, as described above.

Optional: If your team is experienced with Git branching and prefers to use a “feature branch” workflow with one feature branch per checkpoint, feel free to do so, but please manage your branches cleanly. Specifically, once you have submitted a checkpoint or stage by initiating a merge request on its branch, do not commit new work for other features into that branch. Continue development elsewhere.

Code Review

Your work will be evaluated on the basis of:

Completeness: Your compiler must implement all the required features for all language forms.
Correctness: Your compiler must pass my suite of tests. I will evaluate your compiler on a private test suite plus all submitted tests of all teams.
Efficiency and Scalability: Your compiler must employ appropriate data structures and algorithms that are effective from a big-O perspective and scale well to handle large programs.
Design: Your compiler must make effective use of relevant foundations and be organized logically and clearly. (Moderate to big-picture view.)
Style: See above.
Documentation: See above.

These guidelines apply to the entire project.

¹ Request a Wellesley CS account if needed (on campus only).
² If you prefer to use the CS 301 tools only in the current shell session, then run the command source /home/cs301/live/env/init/session.sh instead; run this command in each shell session where you want to use the CS 301 tools. If you need to remove the permanent account configuration, edit ~/.bash_profile to remove the lines marked as related to CS 301.