code Add a non-trivial new feature to your compiler.

Contents

Plan

To wrap up the semester, you will extend your compiler with an interesting new program analysis (or other) feature of your choosing. Find something that excites you and talk to me to form a plan. Feel free to work individually, form new teams (of up to 3), or keep your existing teams.

When choosing topics and teams, aim for a non-trivial project, but keep time limits in mind:

  • You likely will still have some implementation work for the optimizer due a week before the end of classes.
  • By the presentations (to be scheduled, during reading period), you should have an initial prototype.
  • You probably know much of your compiler implementation fairly well now. Regrouping carries extra code-learning costs, but some projects may not depend much on your work so far.

The goal of the project is explore a program analysis or an additional compiler or runtime system feature we have not covered in the course. Your final products will include some combination of an implementation artifact (code) and a short paper.

Tracks

There are multiple tracks. You may focus on: building an implementation or theoretical artifact; doing empirical evaluation and comparison of existing program analysis/compiler tools; or writing a literature survey of research in a focused topic area. All options will involve some reading and the production of a paper.

Implementation Track

In this track, you will implement and experimentally evaluate a new-to-you compiler/runtime feature in your compiler or another platform. As your final product, you will submit an implementation, experimental data, and a technical paper on these two.

The experimental evaluation will focus on performance, accuracy, or some other measure of efficacy if relevant for the feature in question.

The paper should be organized like a technical research paper in a major compilers/programming languages research conference or journal. Naturally, your contribution most likely will be smaller scale and not as novel! (Actual research papers in the area often involve months or years of work by multiple people.) The paper outline should roughly match the following:

  1. Introduction: Briefly describe the work and its context. Give an overview of what the paper will communicate.
  2. Background or motivation: describe the foundations of this feature, citing relevant references, and provide examples of its application and benefits.
  3. Design/Theory: Describe how the feature works, capturing the core theoretical or design model (not the details of exactly how you implemented it with the tools at hand).
  4. Implementation: Describe any how you implemented the feature.
  5. Evaluation: describe experiments performed and results observed. Discuss.
  6. Conclusions.

We will work together to refine the focus of your implementation and paper. Many projects will implement a well-known feature. In this case, the focus will be on the implementation and evaluation sections. Some projects may implement less-studied features or recent ideas in research. In this case, there will be more balance among the sections.

Empirical Track

In this track, you will conduct substantial evaluation of existing program analysis tools rather than implementing your own. As your final product, you will submit experiment infrastructure, experimental evaluation data, and a technical paper documenting on these two.

Since you will use existing tools rather than developing new tools, the scope of your evaluation will be much larger than in the implementation track. You will evaluate at least two similar tools on a wide range of applications to compare their performance, accuracy, or other relevant characteristics.

  1. Introduction: Briefly describe the work and its context. Give an overview of what the paper will communicate.
  2. Background or motivation: describe the foundations of the feature under evaluation, citing relevant references, and provide examples of its application and benefits.
  3. Design and Implementation: Describe key design differences about the tools/systems you will evaluate, including how they work, capturing the core theoretical or design model (not the details of exactly how they implement it).
  4. Evaluation: Describe experiments performed and results observed. Discuss.
  5. Case Studies: Provide additional detail about the use of the tools on one or more selected benchmark applications.
  6. Conclusions.

We will work together to refine the focus of your evaluation and how that will drive the structure of the paper.

Survey Track

In this track, you will conduct a survey of research literature on a relevant topic in program analysis, programming tools, or programming language implementation. You will submit a survey paper as your final product. Compared to the implementation option, this paper will be longer.

We have read some survey papers as part of tutorials (on language support for information flow control, dynamic optimization, garbage collection, also the Cardelli papers on types to an extent). This is the style to target, but it is not assumed that you will have dozens upon dozens of references.

Aim to produce a technical overview of the topic we agree on, focusing on a few key papers/contributions, how they relate to one another, what trade-offs may exist, and so on. Target fellow technically sophisticated 301 students as your audience and aim to teach about the topic. Include concrete examples to illustrate key concepts, but do not give an exhaustive accounting of all technical components of all papers.

The organization of a survey paper is very much dictated by the shape of the topic you explore. It will likely begin with an overview of the topic and choose a foundational central set of concepts to define carefully. From there, it can explore different approaches to a central problem or different refinements to a central technique. We can arrange the organization together as your reading progresses. Your survey should cover at least 5 key papers on contributions to your topic.

Format

Paper

Please prepare your paper using the ACM Article (acmart) template. Available as LaTeX (also on Overleaf) or Word.

Please submit your paper as a PDF.

Presentation

Each feature team will give a presentation of their topic area and implementation/survey work.

Content/Structure: Presentations should:

  • Assume the audience is composed of computer scientists who understands the fundamentals of compiler design and construction.
  • Introduce/teach basic background of the specific topic area in which your feature project focuses.
  • Give an overview of the design and implementation of your feature in the context of a Roost compiler (for implementation projects) or the key features of techniques you studied (for survey projects). Don’t show us Scala code from your implementation, snippets from papers you read, or big formal systems. Do show us self-constructed examples of how your implemented/studied feature works, the code it generates, the basic algorithms/formal underpinnings, etc. Use diagrams. Many of these will be useful for your paper as well.
  • For implementation projects, describe how you have evaluated or will evaluated your implementation, including initial results if they are available.
  • For survey projects, describe prior evaluation of the techniques you studied and characterize trade-offs between alternative approaches where relevant.
  • Include demos if useful in conveying the ideas.

Time: For a team of n members, target a presentation of roughly 5 + 3n minutes.

Media: A whiteboard and projection for slides/demos will be available. Please let me know at least 1 day in advance if you plan to project from one of the Linux machines that we are moving.

Topic Ideas

I do not assume that you know what all of these are. The ambition level varies widely. Some ideas have a high bar for “making it work at all.” Others have several potential (and reasonable) stopping points along the way. I am happy to talk about any and all of these (and your own ideas too).

Program Analysis and Tools

Feature projects in this category will extend your Roost compiler/runtime system (or another compiler/analysis system for a widely used language) to report additional useful information to programmers.

  • Implement a performance profiler to measure how much each basic block contributes to the running time of a program.
  • Track “blame” for runtime errors like null pointer dereferences. Where did that null come from? What code did it flow through (without being checked) to get here?
  • Detect constant or stationary fields. Or recast this as a language feature for explicit definitions of constants, plus optimizations that can constant-fold uses of constant fields.
  • Implement dynamic or static information flow control/tracking to track whether “confidential” data every flows to “public” outputs.
  • Function escape analysis: determine struct/objects allocated in a function that never escape the function, i.e., they are never passed to other functions or returned from this function and references to them are never stored in fields of escaping objects or elements of escaping arrays. Use the analysis to convert non-escaping heap allocations to stack allocations or registers.
  • Many more ideas – talk to me!

Language Extensions

Add a language feature (other than the standard extensions) to Roost and implement it in the compiler and runtime.

  • Exceptions and exception handling.
  • More extensive type inference for the core Roost language.
  • A Roost type system extension to support distinguishing nullable and non-nullable types and checking their proper usage.
  • Threads, tasks, or other constructs for concurrent or parallel programs.
  • A Roost type system that supports tracking integer ranges

More Optimizations

Improve your compiler or runtime system to produce more efficient code or execute code more efficiently. (Browse EC 9 – 11 or 13 for more ideas.)

  • Implement simple offline trace-based compilation/optimizations for Roost.
  • Add partial redundancy elimination or loop optimizations to your optimizer and evaluate the performance effects.
  • Implement low-level instruction selection with tiling and intelligent register allocation for basic blocks and evaluate the performance effects.
  • Implement inlining or specialization and evaluate the performance effects.
  • Implement conservative type-based alias analysis and use it for elimination of redundant field or array accesses. Evaluate the performance effects.
  • Generate SSA, compute dominators, implement one or two SSA-based optimizations, and evaluate the performance effects.
  • Generate SIMD/vector instructions for suitable array codes.

Runtime Systems and Back Ends

Build more sophisticated runtime systems or target a different language/architecture.

  • Implement a garbage collector. Start with a copying or mark-sweep algorithm, then implement a generational collector if you are really into it.
  • Implement a Roost TAC interpreter with a very simple just-in-time compiler.
  • Implement a code generator targeting WebAssembly or JavaScript to run Roost programs in the browser.
  • Implement a code generator targeting LLVM bitcode and compare the performance of your optimizer and backend to the LLVM optimizer and backend.

Schedule and Products

Optimizer-dependent

If you have elected to complete the optimizer, follow the schedule below for your feature project. If you have elected to skip the optimizer, get started on this ASAP.

  • Week of 22-26 April: Start exploring potential topics and discuss interests and options with me. This will help me suggest starting points for literature searches on topics relevant to the feature you want to explore.
  • Tuesday 30 April: Submit a project proposal including:
    • one or more paragraphs describing the topic area
    • what you plan to build
    • how you plan to evaluate it
    • a preliminary work plan, including a schedule for implementation and writing

    Putting in work on a nice proposal will pay off later as it can become the basis/outline that can expand to become your paper.

  • Monday 13 May, morning: Linux machines move from Microfocus to E101.
  • Monday 13 May or Tuesday 14 May TBD: Initial prototype and presentations. Food.
  • Tuesday 21 May: End of exam period, implementation and paper due.

Submit

If you are doing any implementation, please add, commit, and push all parts (including the paper PDF) to your project repository. If you are doing a survey (no implementation), just send the paper PDF by email.