Project
Roost Compiler Requirements and Coding Guide
- Getting Started
- Project Stage Requirements
- Compiler Output Requirements
- Programming Support
- Testing Support
- Documentation and Style
- Submission and Evaluation
Getting Started
These instructions assume you are using your CS account on one of the CS GNU/Linux workstations in the Microfocus. They can be adapted mildly to other environments if you’re trying to install the tools yourself.
To set up CS 301 tools in your CS account [1] permanently [2] for all CS GNU/Linux workstations, log into one, open a terminal, and run:
source /home/cs301/live/env/init/account.sh
Clone Your Project Repository
- If you have not previously used Git in your CS account, configure it with your identity by running these commands in a terminal:
git config --global user.name "Your Name"
git config --global user.email "your.email@wellesley.edu"
- Visit the URL of your team GitLab repository. (You should have this in an email from GitLab.)
- In the upper right, click Clone and choose the HTTPS URL (or SSH if you have configured SSH keys for GitLab in your CS account).
- Run the following command, replacing <URL> with the URL you copied from GitLab and <DIR> with the path you want to use for your local project directory. (Use Shift-Control-V to paste in the terminal.)
git clone <URL> <DIR>
Alternatively, you can use IntelliJ to check out the project by choosing Check out from Version Control > Git, and using the same URL and directory described above.
Set Up the Project in IntelliJ
Launch IntelliJ with the command idea.sh & or, if you have launched it previously, select it from the Applications > Programming menu.
General IntelliJ Settings
- If you have not used IntelliJ in your account yet:
- Choose Do not import settings.
- Accept the license, then choose Don’t send (data sharing).
- Choose a UI style, then Skip Remaining and Set Defaults.
- Open the IntelliJ preferences. (File > Settings in the menubar or
Configure > Settings in the lower right of the IntelliJ welcome
window.)
  - Choose Build, Execution, Deployment > Compiler.
    - Enable Build project automatically.
    - In Scala Compiler > Scala Compile Server:
      - Set JVM options to -server -Xss256m
      - Set JVM maximum heap size, MB to 2048
  - In Languages & Frameworks > Scala:
    - Optionally enable Show type info on mouse hover …
IntelliJ Plugins
The IntelliJ Scala plugin, needed for your project, has been preinstalled on the lab workstations.
As you open files that are not Scala or Java files, IntelliJ may offer to install related plugins with a bar across the top of the editor pane. Unfortunately, the plugins it offers for JFlex and Java CUP are not useful to us. You should dismiss these offers by clicking Ignore extension.
One exception is the Markdown support plugin from JetBrains, which lets you preview the rendered Markdown in one pane as you write in another. I find typing becomes a bit laggy using the plugin, so I prefer to write Markdown in my usual text editor instead, but you may appreciate it.
If installing plugins, be careful of the plugin selection process. When there are multiple plugins in the window that pops up when you click the Install plugins option, you must explicitly uncheck those plugins that you do not want to install.
Open and Configure the Project
- From the File menu or the welcome window, choose Open and select your project directory.
- When prompted to Import project from sbt, click OK.
- Open the IntelliJ preferences again. In Build, Execution, Deployment > Compiler, enable Build project automatically.
- Wait for the progress indicator near the middle/right end of the bottom status bar to finish.
- From the Run menu or near the right hand end of the toolbar,
choose Edit Configurations….
- In the upper left of the Run/Debug Configurations window, click the + and select sbt Task from the drop-down menu.
- Set the Name field to “Lexer/Parser Generators”.
- Set the Tasks: field to “generators”.
- Click OK.
- Near the right end of the toolbar, between the hammer and the play
button, you should now see Lexer/Parser Generators and the play
button should be green. Click the green play button to run the
Lexer/Parser Generators task.
- This should pop up an sbt shell window pane. (If not, click the sbt shell tab in the lower left.)
- The first time after you launch IntelliJ, this will be pretty slow. Subsequent runs will be faster.
- The sbt shell pane will show the output of running the generators. (More about that in the first two project stages!)
- Click the build button (green hammer near right end of toolbar) to force a build for good measure.
- Look for the Problems tab in the lower left/middle and open this window pane. This is where any (Scala) compiler errors will be reported. There should not be any errors right now.
- Click the upper left (sideways) Project pane to view the project files.
Set Up Paths
To configure your current shell session to find your Roost compiler as roostc, use these commands:
cd <your-compiler-project-path>
source env.sh
To set this up automatically in every new shell, use your favorite text editor to add the following line to the file .bash_profile in your home directory:
pushd <your-compiler-project-path> && { source env.sh ; popd ; }
Project Files
- bin/: scripts, utilities
  - roostc: wrapper to run the compiler
  - test-roostc-status.py: basic testing script for source code from ./src/test/roost/<stage>/*
- lib/: jar files for external libraries and tools
- src/: all code
  - main/scala/: compiler source code
  - test/roost/: Roost source files for testing
- README.md: top-level documentation of your implementation
Building the Roost Compiler
To build from IntelliJ (after completing the project setup steps above):
- All Scala files will build automatically, shortly after you make changes. To force a build (sometimes IntelliJ gets tired?), click the green hammer in the upper right toolbar.
- To generate the lexer (with JFlex) or parser (with Java CUP) from your specifications, click the green play button next to your Lexer/Parser Generators task in the upper right toolbar.
To build from a shell:
- Run sbt compile. Alternatively, start an sbt shell by running sbt, then run the command compile. This will run the lexer/parser generators and build all Scala files.
Running the Roost Compiler
The provided code includes a skeleton for the compiler’s command-line interface. Assuming you have completed the project setup steps above and the build has completed, the roostc wrapper for your compiler should be available. At the command line, your compiler can be invoked as follows:
roostc file.roost
Additional options can be seen with the -h or --help option:
roostc --help
The wrapper script roostc simply invokes the scala runtime with the right environment arguments to find and evaluate the main entrypoint in roost.Compiler. All command-line arguments to roostc are passed into your compiler’s main entrypoint.
Committing and Pushing Changes with Git
Your team’s work is hosted as a Git repository on GitLab, which is also where I will collect and review your work. Git will help you track changes and restore old versions if things go wrong. You have also used Git or Mercurial on small assignments in CS 240, so this should be somewhat familiar, but this may be your first time working on a large, protracted software project with version control.
As you work, you should frequently:
- Work together in pair/trio programming style with your team. This is the preferable mode of work.
- Communicate with your team (if you have to work separately) to avoid conflicts (concurrent edits to the same parts of the same files) and other broken merges (edits that change something the other teammate is depending on).
- git add and git commit cohesive sets of changes with a descriptive commit message.
- git pull commits from, and git push commits to, your team repository.
You can perform Git operations on the command line or from within IntelliJ (see the VCS menu). (Or, if you use Emacs, check out Magit!)
More reference:
- Git Documentation (← start here)
- GitLab Documentation (← start here)
- GitLab Merge Requests for submitting project checkpoints and stages.
- GitLab Issues for tracking bugs, features, to-dos, etc.
- You can practice Git skills with the Tutorial from CS 240, even though you’re not in that course currently.
Tracking Issues and To-do Items with GitLab
At some point, you may have a bug! At many times, you will have a wide range of tasks that need doing, such as debugging and fixing a problem, adding a new feature, redesigning and changing the implementation of an existing feature, updating documentation, etc. To help coordinate progress on these tasks and document the knowledge required (or discovered) to complete them, you may find it useful to use the Issue Tracker hosted with your repository on GitLab.
Project Stage Requirements
Specifications of each compiler stage are described separately in the corresponding project stage.
Lexer
Implement lexical analysis.
- assign: Tuesday, 12 February
- checkpoint: Friday, 15 February
- due: Tuesday, 19 February
Front End
Implement parsing, ASTs, symbol tables, and type checking.
- assign: Tuesday, 19 February
- work: Friday, 22 February
- checkpoint: Friday, 1 March
- work: Friday, 1 March
- checkpoint: Friday, 8 March
- work: Friday, 8 March
- checkpoint: Friday, 15 March
- work: Friday, 15 March
- checkpoint: Tuesday, 19 March
- due: Tuesday, 19 March
Back End
Implement a TAC IR, lowering from ASTs to TAC, and code generation from TAC to x86.
- assign: Tuesday, 2 April
- work: Friday, 5 April
- checkpoint: Friday, 12 April
- work: Friday, 12 April
- due: Friday, 19 April
Optimizer
Implement several compiler optimizations.
- assign: Friday, 19 April
- work: Friday, 26 April
- due: Friday, 3 May
Feature
Add a non-trivial new feature to your compiler.
- assign: Friday, 19 April
- checkpoint: Tuesday, 30 April
- work: Tuesday, 7 May
- work: Friday, 10 May
- checkpoint: Tuesday, 14 May
- presentations: Tuesday, 14 May
- due: Tuesday, 21 May
Compiler Output Requirements
In addition to the specific requirements for each project stage, which may include outputs such as stage summaries enabled by command-line options or files generated by compilation, the following requirements for compiler output apply to the entire project.
Error Messages
Your compiler must detect and report the first lexical error it encounters in the source file (if any); reporting later errors is not necessary. The compiler should print an error message, report its final status, and exit cleanly. The format and exact content of error messages are left to you, but they must be informative and useful to the programmer in understanding and fixing the offending issue in the source code: it should be easy to fix the problem immediately after reading the message. It is highly recommended that you include the line and column number of the position in the input program source code where the error arises. This is helpful not just to your (for now imaginary) end users, but especially to you while you are testing and debugging your compiler.
Source code error reporting will be an important feature of your compiler for lexical errors in this stage and many other types of errors in future stages. One convenient way to organize error-reporting is by raising instances of subtypes of roost.error.CompilerError, an exception class. Whenever the program encounters an error in source code, the relevant component can raise an appropriate type of CompilerError exception. The top-level compiler logic can then catch and report any CompilerError in a single central location.
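The pattern described above might look roughly like the following sketch. Everything in it is a hypothetical stand-in: the local CompilerError class only imitates the role of roost.error.CompilerError (whose actual definition is not shown here), and LexicalError, lex, and compile are illustrative names, not part of the starter code.

```scala
// Hypothetical sketch: these names are illustrative stand-ins, not the
// starter code's actual definitions in roost.error.
object ErrorSketch {
  /** Base class for errors in the source program (stand-in for roost.error.CompilerError). */
  abstract class CompilerError(val line: Int, val column: Int, val detail: String)
      extends Exception(s"$line:$column: $detail")

  /** A lexical error, such as an illegal character. */
  class LexicalError(line: Int, column: Int, detail: String)
      extends CompilerError(line, column, "lexical error: " + detail)

  /** Pretend lexer: raises on the first character outside a tiny allowed set. */
  def lex(source: String): Unit = {
    var line = 1
    var column = 1
    for (c <- source) {
      if (c == '\n') { line += 1; column = 1 }
      else {
        val allowed = c.isLetterOrDigit || c.isWhitespace || "+-*/=;(){}".indexOf(c) >= 0
        if (!allowed) throw new LexicalError(line, column, s"illegal character '$c'")
        column += 1
      }
    }
  }

  /** Central handler: the single place where any CompilerError is caught. */
  def compile(source: String): Option[String] =
    try { lex(source); None }
    catch { case e: CompilerError => Some(e.getMessage) }
}
```

Because every component raises a subtype of the same exception class, later stages can add their own error types without changing the top-level handler.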
Status Reporting
Regardless of whether your compiler prints other required information as indicated by command-line options, reports a compiler error, etc., it must clearly report the final status of compilation upon termination. Your compiler must do the following two things to report whether it accepted or rejected the source program:
- The last line printed by your compiler must always be one of “Accepted.” or “Rejected.”, formatted on its own line. The output of your compiler must contain nothing else after this line.
- The exit code of the compiler process must be 0 if the compiler accepts the source program and nonzero if it rejects the source program. Scala’s built-in sys.exit(x) terminates the process and yields the given exit code, x.
These will be helpful for automating tests of your compiler.
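As a minimal sketch of the two requirements above, the final status line and the exit code can be derived together from a single accept/reject decision. The names StatusSketch, finalStatus, and finish are hypothetical, and the choice of 1 as the rejecting exit code is an assumption; the requirement is only that it be nonzero.

```scala
// Hypothetical sketch; not part of the starter code.
object StatusSketch {
  /** Maps the compilation outcome to the final output line and exit code. */
  def finalStatus(accepted: Boolean): (String, Int) =
    if (accepted) ("Accepted.", 0)
    else ("Rejected.", 1) // any nonzero code would satisfy the requirement

  /** Prints the status as the very last line of output, then terminates. */
  def finish(accepted: Boolean): Nothing = {
    val (line, code) = finalStatus(accepted)
    println(line)
    sys.exit(code)
  }
}
```

Routing every exit path through one function like finish makes it harder to accidentally print something after the status line.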
No Other Output
Excepting the outputs explicitly required by each stage (e.g., --show-tokens for the lexer), compiler error messages, and status reporting, your compiler should print no other output under normal operation. If you wish to show additional information for yourself while developing, testing, or debugging, try the provided mechanism for explicitly enabling extra informational messages.
Programming Support
The starter code for roost.Compiler demos a few system interaction features like working with buffered file IO and parsing command-line arguments (using the scopt library). Your compiler must implement at least the command-line options and status-reporting behavior, however you choose to implement them. Successive stages will specify additional requirements of the same style. As long as you satisfy these specifications, you may replace or change any parts of the starter code.
Utility Code for Debug Messages
One feature of the provided code that you may find useful is support for controlling the printing of informational messages from within your compiler. As you develop your compiler, you may find it useful to display more information about incremental internal steps than is required (or allowed) by the output specification. The roost.Util object provides a method debug for printing such messages. This method has two useful features:
- It uses printf-style formatting, which is more efficient than constructing strings through repeated concatenation with +.
- By default, debug never prints its messages. The command-line flag -d (or --debug) can be used to enable the messages when needed. This helps avoid the tradeoff between cluttering the compiler output and constantly adding/removing/commenting/uncommenting code to print such messages.
Using the -d or --debug flag with no additional argument enables printing of all debug messages. Giving a comma-separated list of debug keys as an argument to the -d or --debug flag enables only the debug messages that are associated with this list of debug keys and those messages that are associated with no key at all. The first argument to debug is an option (None or Some(...)) indicating how the message is keyed. The second argument is a format string. Any remaining arguments are used to fill the % holes in the format string.
import roost.Util
Util.debug(None, "#1. See line %d Debug messages are enabled!", lineNumber)
Util.debug(Some("lex"), "#2. Debug messages are enabled for key 'lex'!")
Util.debug(Some("parse"), "#3. Debug messages are enabled for key 'parse'!")
Given the above Util.debug calls, running roostc
- without -d/--debug does not allow any of the messages to print;
- with -d/--debug allows #1 to print;
- with -d parse/--debug parse allows #1 and #3 to print;
- with -d lex,parse/--debug lex,parse allows #1, #2, and #3 to print.
This feature makes it attractive to leave your informational messages for all stages in place and enable only those that you need currently.
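To make the key-filtering rule concrete, here is a small sketch of how such filtering could be written. It is not the actual roost.Util implementation: DebugSketch, shouldPrint, and render are hypothetical names, and the sketch models only two cases, namely no flag at all (None) and a flag with a list of keys (Some(keys)).

```scala
// Hypothetical sketch of key-filtered debug messages; not the roost.Util code.
object DebugSketch {
  /** enabled = None: flag absent; Some(keys): flag given with these keys. */
  def shouldPrint(enabled: Option[Set[String]], key: Option[String]): Boolean =
    enabled match {
      case None       => false                     // debugging is off
      case Some(keys) => key.forall(keys.contains) // unkeyed messages always pass
    }

  /** Formats the message printf-style only if it would actually be printed. */
  def render(enabled: Option[Set[String]], key: Option[String],
             format: String, args: Any*): Option[String] =
    if (shouldPrint(enabled, key)) Some(format.format(args: _*)) else None
}
```

Deferring the formatting until after the filter check means that disabled messages cost almost nothing, which is part of why leaving them in place is cheap.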
Feel free to add other broadly useful functionality in the roost.Util object. You will likely import it in most files.
Assertions
You should make liberal use of Scala’s assertion facilities: use assert(condition, "message") to assert that specific Boolean conditions (e.g., preconditions, postconditions, invariants) are always true at run time, and otherwise intentionally crash with an exception after printing message. Use assertions to check for logic errors in your compiler code. Do not use assertions for reporting errors in user input, such as command-line flags or Roost source code. User input errors, such as Roost source code errors, are an expected and normal occurrence for the Roost compiler which must be handled by normal code in the compiler; they are not logic errors in your compiler.
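The distinction can be illustrated with a short sketch. Both pieces are hypothetical and not from the starter code: ScopeDepth stands in for any internal data structure with an invariant, and checkSourcePath uses an illustrative rule (a source path must end in .roost) that is an assumption made only for this example.

```scala
// Hypothetical sketch contrasting assertions with user-input checks.
object AssertSketch {
  /** Tracks scope nesting depth inside the compiler. */
  final class ScopeDepth {
    private var depth = 0
    def current: Int = depth
    def enter(): Unit = { depth += 1 }
    def exit(): Unit = {
      // A violation here means the compiler itself is buggy, so crash loudly.
      assert(depth > 0, "exit() called without a matching enter()")
      depth -= 1
    }
  }

  /** User-input check: reported as an ordinary value, never via assert. */
  def checkSourcePath(path: String): Either[String, String] =
    if (path.endsWith(".roost")) Right(path)
    else Left(s"expected a .roost source file, got: $path")
}
```

A failed assert in exit() points at a mistake in the compiler's own logic, while a Left from checkSourcePath is an expected outcome that the compiler reports and handles normally.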
Testing Support
You must test your lexer. You should develop a thorough test suite that tests all legal tokens and as many lexical errors as you can think of. We will test your lexer against our own test cases and those of your classmates, using both lexically well-formed and lexically ill-formed inputs.
The starter code provides a basic testing script in bin/test-roostc-status.py. For this stage, it expects test inputs in src/test/roost/lex/all/, where tests are divided into tests the compiler should accept and those it should reject. You should write dozens of tests for each stage, mixing both kinds to ensure your compiler accepts programs that it should and rejects programs that it should. Feel free to extend the script (make your own copy, in case I update the original) to perform more extensive testing.
As we get into later stages, we will discuss adding more types of tests.
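If you extend your testing, one possible check is to compare each run's captured output and exit code against the expected outcome, using the status-reporting rules from earlier in this guide. The sketch below is an assumption about how such a check could be written; it does not describe what bin/test-roostc-status.py actually does, and all names in it are hypothetical.

```scala
// Hypothetical sketch of a per-test pass/fail check; not the provided script.
object TestCheckSketch {
  sealed trait Expected
  case object Accept extends Expected
  case object Reject extends Expected

  /** True if the final output line and exit code match the expected outcome. */
  def passes(expected: Expected, output: String, exitCode: Int): Boolean = {
    val lastLine = output.split("\r?\n").lastOption.getOrElse("")
    expected match {
      case Accept => lastLine == "Accepted." && exitCode == 0
      case Reject => lastLine == "Rejected." && exitCode != 0
    }
  }
}
```

Checking both signals together catches cases where the printed status and the exit code disagree.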
Documentation and Style
Follow the Scala Style Guide plus general rules of thumb for clean code, using your best judgment. Use assertions judiciously. Style matters more the larger the project gets.
Use Scaladoc header comments on classes and methods, especially for important parts of each stage. Use succinct inline comments to document steps of logic as needed when they are not abundantly clear from the code.
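For instance, a Scaladoc header comment on a small class might look like the following. The Position class is a hypothetical helper used only to show the comment style; it is not part of the starter code.

```scala
/** A position in a Roost source file (hypothetical helper for illustration).
  *
  * @param line   1-based line number
  * @param column 1-based column number
  */
final case class Position(line: Int, column: Int) {
  /** Renders the position as `line:column`, e.g. for use in error messages. */
  def show: String = s"$line:$column"
}
```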
Maintain an up-to-date README.md. It should include:
- documentation of how to build and run the compiler;
- a high-level description of your compiler design and implementation;
- documentation of any additional or non-standard features;
- justification of important design choices;
- a change log summarizing major changes in design or implementation (with dates);
- any critical known issues in your design or implementation.
Keep your compiler’s command-line interface self-documentation (roostc -h/--help) up to date but succinct.
Submission and Evaluation
Commit and push your work to GitLab as you go. Each project stage except the lexer includes a final stage deadline, when all parts of the stage are due, plus multiple intermediate checkpoints, when individual features from the full stage are due. Please use a clearly identifiable commit message of the form “x submission” for each checkpoint or stage deadline x. After all stages and many checkpoints, I will test and review your code and provide feedback with the mechanisms described below. We can also schedule in-person code review sessions for more interactive feedback.
Submit a Merge Request
When you are ready to submit your work for a checkpoint or stage deadline, submit a merge request to the review branch on GitLab:
- Make sure your work is committed and pushed to GitLab and ensure that this is the latest work committed.
- Open the GitLab web page for your project and visit the Repositories > Branches section in the left side menu. Click the New branch button in the upper right. The Create from field should be master (or whatever branch contains your work for this step). Enter the prescribed Branch name for this checkpoint or stage deadline and click Create branch.
- From the left bar, choose Merge Requests, then click New Merge Request. (Ignore the shortcut offers to create a merge request from your new branch.)
- On the New Merge Request page, set the following:
- Source branch: the branch you just created
- Target branch: review (this is the branch representing what work I have reviewed)
Then click Compare branches and continue.
- Edit the Title to include the checkpoint or stage (and any other relevant indicators you choose)
- Provide a Description of what work you are submitting for review. Note any key items that I should pay attention to in my review.
- Set the Assignee: Ben Wood (@bpw).
- Feel free to use labels if it helps you stay organized, but do not use any of the other settings below that.
- Review the Commits and Changes tabs at the bottom of the page and ensure that the set of changes you are submitting for review is what you intend.
- Finally, click Submit merge request to submit the code to me for review.
Unless all members of your team are experienced with Git and branching, I suggest doing all development on master and creating branches only for submission, as described above.
Optional: If your team is experienced with Git branching and prefers to use a “feature branch” workflow with one feature branch per checkpoint, feel free to do so, but please manage your branches cleanly. Specifically, once you have submitted a checkpoint or stage by initiating a merge request on its branch, do not commit new work for other features into that branch. Continue development elsewhere.
Code Review
Your work will be evaluated on the basis of:
- Completeness: Your compiler must implement all the required features for all language forms.
- Correctness: Your compiler must pass my suite of tests. I will evaluate your compiler on a private test suite plus all submitted tests of all teams.
- Efficiency and Scalability: Your compiler must employ appropriate data structures and algorithms that are effective from a big-O perspective and scale well to handle large programs.
- Design: Your compiler must make effective use of relevant foundations and be organized logically and clearly. (Moderate to big-picture view.)
- Style: See above.
- Documentation: See above.
These guidelines apply to the entire project.
[1] Request a Wellesley CS account if needed (on campus only).
[2] If you prefer to use the CS 301 tools only in the current shell session, then run the command source /home/cs301/live/env/init/session.sh instead; run this command in each shell session where you want to use the CS 301 tools. If you need to remove the permanent account configuration, edit ~/.bash_profile to remove the lines marked as related to CS 301.