Checkpoint: Aim to complete the map, AST, and runtime sections by 11:00pm Thursday 5 November
Due: 11:00pm Monday 9 November
Starter files:
- fork wellesleycs251 / cs251-smile and add bpw as admin
Submission:
- Estimate the time you spent on each section in time.txt.
- Commit and push your completed work to your Bitbucket fork.
Relevant reading:
- SML References
- For CS 235 concurrent enrollees: Ocaml vs. SML:
  - SML/Ocaml syntax and feature comparison
  - SML vs. Ocaml syntax
Tools:
- Mercurial
- Emacs
- SML/NJ
Collaboration:
- All problems fall under the group collaboration policy. You may collaborate with 0 or 1 other students to produce one solution stored in one repository. Groups should act as individuals usually do with regards to discussing ideas. The group’s writing and code must be their own. Group members are expected to contribute equally and accomplish a large majority of the work (90%) together in person.

In this assignment, you will implement an interpreter for SMiLe, a programming language designed just for this course. SMiLe shares its syntax and evaluation semantics with a subset of Standard ML, but does dynamic type-checking instead of static type-checking.

MAP (Environment) Abstract Data Type (15 points)

A MAP signature is provided and documented in map.sig. Maps map keys to values. We provide an Env structure in env.sml that implements maps (environments) as association lists, with some additional operations. Your first task is to implement and discuss an alternative representation of maps (environments).

Implement a FunMap structure in map.sml, ascribing the FUNMAP signature, which also supports defining a map from a given function. Your implement should represent maps purely as function closures. The ideas to implement these are quite similar to those for the SET ADTs we (partly) implemented in class. A map-representing function returns an option when called on a key: NONE if the given key is unbound in the map represented by that function, SOME v if the given key is bound to value v in the map represented by that function. Implement all bindings described in FUNMAP.
Test your implementation to make sure it functions as expected on a range of cases.
One of the bindings in the MAP signature could be implemented externally by clients without knowing the concrete type of maps. Which one? Is there an advantage to implementing it as a primitive ADT operation besides saving clients the trouble of writing it?
Consider the trade-offs between list representations, function representations, and at least one other (hypothetical) representation for sets and for maps. Examples of other representations include hash tables, heaps, tries, etc. You do not need to implement your extra representation. Answer the following questions in answers.txt:
- What distinguishes function-representable but non-list-representable sets and maps from list-representable sets and maps? Give examples and explain the fundamental distinction that separates sets/maps that can be represented with, e.g., lists and those that cannot (but can be represented by functions).
- What is the fundamental operation over collections that is not possible to implement for function representations of general sets and maps? (Operations like toString and toList are impossible because of this limitation.)
- Which representation (lists or functions) is more space-efficient for use cases where insertion of duplicate set elements or binding of duplicate map keys (shadowing) is common? How much does your answer depend on implementation choices vs. fundamental limits of the representation?
For the last couple points on this problem, convert map.sml to include zero uses of the fn ... => .. syntax. Use exactly one fun binding per binding in FUNMAP. (Hint: exploit currying and partial application.)

Testing [updated 3 November]: Test your implementation by building maps and performing bind, unbind and lookup operations. For example, you might try these and reason about what they should yield:

- val m = FunMap.empty;
- FunMap.lookup "x" m;
- val m1 = FunMap.bind ("x", 7) m;
- FunMap.lookup "x" m1;
- val m2 = FunMap.bind ("y", 8) m1;
- FunMap.lookup "x" m2;
- FunMap.lookup "y" m2;
- FunMap.lookup "x" (FunMap.unbind "x" m2);
- FunMap.lookup 73 (FunMap.define (fn n => SOME (n div 2)));
- FunMap.lookup "x" (FunMap.make [("y", 4),("x", 5),("x",2)];

SMiLe

In this part, you will complete the implementation of an interpreter for SMiLe, a dynamically-typed programming language that is otherwise a syntactic and semantic subset of Standard ML. This section will guide you through some exercises to complete the interpreter.

Work incrementally. Test and commit often.

Since the SMiLe implementation is a non-trivial piece of code, you are encouraged to test and hg commit your changes often. Then if you break something and have trouble debugging it, you can more easily recover a recent working version rather than starting from scratch.

The SMiLe Language

The SMiLe language is similar to Standard ML. It uses a subset of the SML syntax and a subset of the SML semantics, with the exception that it performs dynamic type-checking instead of static type-checking.

Concrete Syntax

You will work mainly with the abstract syntax of SMiLe programs represented as tree data structures. However, to ease the job of testing (and increase the amount of fun possible once your interpreter is working), we have provided a concrete syntax and parser that converts from this syntax to the representation you will use inside the representation.

We use an Extended Backus Normal Form (EBNF) grammar to describe the concrete syntax of SMiLe programs. Nonterminals are written as <nonterm>. Terminals are any symbol other than angle-bracketed names, |, ::=, +, *, or curly brackets. Whenever any of these symbols is required as a terminal in the grammar, it is surrounded by quotes to distinguish it from the a symbol with meaning in the syntax of grammars themselves. The productions of each nonterminal are listed with:

<nonterm> ::= ... production 1 ...
            | ... production 2 ...
            | ...

The “Extended” form adds the ability to write regular expressions as part of the grammar productions to succinctly express several possible productions in just one. (Take CS 235 and CS 301 for more about regular expressions, concrete syntax, grammars, parsing, etc.) The one extension of BNF form that we will use is the postfix + and * symbols, which indicate “one or more of the preceding thing, in sequence” and “zero or more of the preceding thing, in sequence”, respectively. We use curly braces {} to group multiple items together and apply +. For example, ({<expr>,}+ <expr>) describes possible tuple expressions in the language, i.e., all strings of the form: a left parenthesis followed by one or more expressions (each followed by a comma), followed by one expression (without a comma), followed by a right parenthesis.

Below is an EBNF grammar for SMiLe. This is not the smallest possible grammar, but it matches what we have chosen for our abstratc syntax representation. It is also an ambiguous grammar, so precedence and associativity rules are needed when understanding SMiLe programs. Since the details of the concrete syntax are not the focus of this assignment, we omit those extra rules (and the more complicated grammar actually used by our parser) in this presentation. You will work with unambiguous Abstract Syntax Trees in your work implementing SMiLe. (If you are curious about the details of parsing concrete syntax, take CS 235 and CS 301 and feel free to take a peek inside our parser implementation.)

<program>  ::= <binding> <program>
             | <binding>
<binding>  ::= val <pattern> = <expr>
             | fun <var> <funcases>
<funcases> ::= <pattern> = <expr> "|" <funcases>
             | <pattern> = <expr>
<pattern>  ::= _ | () | true | false | <int> | <string>
             | ({<pattern>,}+ <pattern>) | <tag> <pattern> | (<pattern>)
   <expr>  ::= () | true | false | <int> | <string>
             | <unop> <expr> | <expr> <binop> <expr>
             | ({<expr>,}+ <expr>) | [{<expr>,}* <expr>] | [] | <tag> <expr> | (<expr>)
             | if <expr> then <expr> else <expr>
             | fn <cases> | <expr> <expr> | <var>
             | let <program> in <expr> end
             | case <expr> of <cases>
    <unop> ::= not | ~
   <binop> ::= "+" | - | "*" | div | mod | ^
             | = | <> | < | <= | > | >= | ::
             | andalso | orelse
   <cases> ::= <pattern> => <expr> "|" <cases>
             | <pattern> => <expr>

Abstract Syntax

The abstract syntax is defined by Abstract Syntax Trees very similar to the grammar above. You will explore the abstract syntax as part of the first problem below.

Evaluation Semantics

The evaluation semantics of SMiLe is the same as that of SML (for the subset of the language SMiLe includes), with one difference: SMiLe performs dynamic type-checking.

Differences with SML

Some notable differences with the syntax of Standard ML:

By intention:

Type-related syntax is not supported.
As a result, datatypes and constructors are never declared. Just use constructors (also called “tags” in our implementation) as if already declared.

Minor bugs that may be resolved soon:

~2 is not currently parsed as an int, but as negation applied to 2.
Literal negative numbers (~2) cannot be used in patterns.

Many other SML syntax features are available, including:

Multi-branch pattern-matching and curried form in function bindings.
Left-associative function application (curried functions can be called without parenthesization).
Special list syntax with [] and ::.
All bindings support full patterns.

Provided Code and Infrastructure

We provide significant code to get you started.

ast.sml: the definition of Abstract Syntax Trees (ASTs) for the SMiLe langauge, annotated with their correspondence to the concrete syntax, plus functions to pretty-print SMiLe ASTs in the concrete syntax
parser/parser.sig: the signature of the Parser structure, which translates programs from strings to ASTs.
- Ignore the remainder of the parser implementation in parser/..., unless you are curious. (Take CS 301 for more about this!)
static.sig and static.sml: static operations over SMiLe ASTs, including desugaring and static scope checking
runtime.sig and runtime.sml: evaluation of SMiLe programs and ASTs

You will edit static.sml and runtime.sml in your tasks below.

Implementing a language using a similar implementation language or metalanguage has benefits and problems. The most visible problem is muddled communication or thinking about whether you are working in or discussing the implementation language (SML) or the implemented language (SMiLe). On the other hand, being forced to keep them straight helps you learn to think about the distinction more clearly. Finally, implementing many of the evaluation rules of SML as evaluation rules for SMiLe will help you understand SML more clearly. (I’m looking at you, pattern-matching!)

Running the SMiLe Implementation

There are a couple ways to run the SMiLe implementation we build.

Command-Line Wrapper

In the top-level directory, run ./smile file.sml. The executable script smile is a thin wrapper that will evaluate the given file as a SMiLe program. Its output includes much information from SML/NJ, the implementation of our implementation language, ML. The output of the SMiLe interpreter is bounded by (-: and :-):

[10:31 PM] bpw@queets: ~/251/smile$ ./smile examples/pat.sml
Standard ML of New Jersey v110.78 [built: Sun Apr 26 01:06:11 2015]
[scanning ./sources.cm]
... much additional SML/NJ compilation manager output elided here ...
[autoloading done]
val it = () : unit
[autoloading]
[autoloading done]
(-:
  bind x = 10
:-)
val it = - : Runtime.environment
-
[10:31 PM] bpw@queets: ~/251/smile$

Much like the SML/NJ REPL prints out val bindings that are evaluated at the top level, the SMiLe interpreter prints out bindings using a similar notation (with bind instead of val to make them more distinguishable from any SML bindings that may appear in the same SML REPL).

Options: The smile wrapper has a few options for modes of operation other than evaluating one file:

--parse or -p: Parse the file and print the AST, but do not invoke any Static or Runtime actions.
--desugar or -d: Parse and Static.desugar the file and print the desugared AST, but do not invoke any other Static or Runtime actions.
--test or -t: Run the provided test suite.

In the SML REPL

A more flexible alternative to the smile wrapper is to open an SML/NJ REPL in the top-level directory and call functions in the SMiLe implementation directly. To load the SMiLe implementation, run CM.make "sources.cm"; at the REPL.

Standard ML of New Jersey v110.78 [built: Sun Apr 26 01:06:11 2015]
- CM.make "sources.cm";
[autoloading]
... much additional SML/NJ compilation manager output elided here ...
[New bindings added.]
val it = true : bool
-

This runs the SML/NJ Compilation Manager, which will generate a lot of output. If everything builds correctly, the result will be true (seen here bound to SML/NJ’s default REPL variable, it). If all is not well, you will see error messages and a false result.

As you make changes, fix errors, and so on, you may rerun CM.make "sources.cm"; in the same REPL session without closing and reopening as you do with use. (Unless you also use use, CM, the Compilation Manager, will get all the bindings right.)

To run a SMiLe program stored in a file once the SMiLe implementation is loaded:

- Runtime.run_file "examples/pat.sml";
(-:
  bind x = 10
:-)
val it = - : Runtime.environment
-

As before, output of the SMiLe implementation is bounded by (-: and :-).

There is no SMiLe REPL, per se. You can run a SMiLe program given on standard input, but it is a little awkward. (Type Control-D to end your input, but this will also close standard input and terminate the SML/NJ REPL.)

- Runtime.run TextIO.stdIn;
[autoloading]
[autoloading done]
val x = if 5 < 4 then 8 else 2*5 div 1
^D
(-:
  bind x = 10
:-)
val it = - : Runtime.environment
- 

Process sml finished

You are encouraged to run whole files – this also pushes you toward writing down more of your tests.

Testing Guide

(Expanded 3 November)

Before you have completed most of the runtime.sml additions, your interpreter will be unable to handle the provided examples/* programs written in concrete syntax. Until then, we suggest testing incrementally: just call functions in your SMiLe implementation directly in the SML REPL.

Visibility

Once the SMiLe implementation is loaded (with CM.make "sources.cm";), you may call any visible function directly from the REPL. Check the .sig files for what’s visible. If a function you wish to test or call is not visible and you want to test it, make it visible. Either:

add its type binding to the relevant .sig file; or
temporarily change the ascription of signature to module to use transparent ascription (struct Runtime : RUNTIME instead of struct Runtime :> RUNTIME). This will make all bindings in Runtime visible externally.

`open`

To avoid having to type prefixes for all functions/constructors in a structure, you may open the structure to import all its visible bindings into the current environment. For example open Ast now allows the use of all AST constructors without the prefix Ast. This is particularly useful when constructing ASTs or Runtime.values. Be warned: bindings introduced by open will shadow any preexisting bindings in the current environment if they have the same name.

Constructing Inputs

When testing functions that require ASTs as input, you may find it faster to write (or use preexisting) concrete syntax programs, parse them, and then copy the pieces of the AST you want to use. For this task, Parser.parse_file is quite useful, as is the ./smile --parse script. To pretty-print an AST structure p in concrete syntax, use print (Ast.show p) (or related Ast.show... functions).

Recall that, to see the full depth or breadth of an AST or other nested structure, you may need to direct the SML/NJ REPL to print more complete outputs. For example:

val _ = Control.Print.printDepth := 100;
val _ = Control.Print.printLength := 100;
val _ = Control.Print.stringDepth := 1000;

As you go, collecting your tests in an SML file is quite useful. tests.sml holds high-level tests. Feel free to create other files that will automatically run you own more detailed tests.

Testing the Completed Interpreter

Once you have more parts of the interpreter complete, you will be able to run full examples using the concrete syntax. Test.run (); will run a provided test suite, running all programs in the example directory and checking for all expected bindings.

Other Helpful Tricks

Type a binding name at the REPL to see its type quickly:

- Runtime.eq;
val it = fn : Runtime.value * Runtime.value -> bool

SMiLe Errors

When an error occurs in a SMiLe program (e.g., a SMiLe dynamic type-checking error), an error message will be printed starting with :-( if you use the various ..._error functions provided in runtime.sml. If an unexpected error occurs in the SMiLe implementation itself, you will see an ML exception instead.

Abstract Syntax Trees and Desugaring (15 points)

In this part you will explore the ML representation of SMiLe programs as Abstract Syntax Trees (ASTs) and desugaring of SMiLe programs. You will read ast.sml and edit static.sml.

Read ast.sml to familiarize yourself with the Abstract Syntax Tree representation of SMiLe programs. ML datatypes and pattern-matching are beautifully suited to this sort of task. One ML type corresponds to one non-terminal in the (simplest) SMiLe grammar; each constructor/tag of this type corresponds to one production of that non-terminal. The keyword and supports mutually-recursive datatypes or functions.
- Why do you think SMiLe tuple expressions are represented with an ML list of SMiLe expressions instead of an ML tuple?
- Write the ML representation of the AST for examples/fact.sml manually. Do the translation from concrete syntax to AST yourself without the help of the parser. Once you have translated it, check your answer by running ./smile --parse examples/fact.sml to see the AST the parser produces.
Early Desugaring: The provided parser for SMiLe’s concrete syntax does some desugaring before constructing ASTs.
- Examine the provided SMiLe file examples/lists.sml.
- Find two or three syntactic constructs (at least one in fun bindings and one in constructors/tags) used in these programs that are not representable directly by our ASTs. Describe how the parser appears to rewrite each syntax to fit our AST representation.
You may find it useful to examine the AST generated by the parser for lists.sml. You can do this from the SML REPL as described elsewhere, or you can just run ./smile --parse examples/lists.sml in the top-level repository directory. If you use the SML REPL, you can call Parser.parse_file ... to get an AST and then pretty-print it with print (Ast.show ...) to show the AST in concrete syntax and inspect signficant differences with the original program. You may find that the result is no longer quite legal SML/SMiLe syntax due to a trick the parser has used to generate variable names for internal use inside the interpreter.
AST Desugaring: The Static structure in static.sml implements functions for additional desugaring of ASTs by replacing all occurrences of AST nodes with combinations of other nodes that have equivalent behavior. Read the implementations of the desugarfunctions. Our interpreter implementation will not implement IfThenElse expressions.
- Change the desugar_expr function to rewrite IfThenElse expressions to equivalent Case expressions.
- Change the existing desugaring of AndAlso and OrElse AST nodes to work with the interpreter as well.

SMiLe Runtime Evaluator (45 points)

In this part you will explore, modify, and extend the SMiLe interpreter in runtime.sml.

Warmup: Read runtime.sml up through the definition of the environment type. Answer the following questions succinctly:
- How do values built by the ClosureValue constructor differ from the closures we have drawn in environment diagrams? Speculate on reasons for these differences.
- Why is the function name position a var option instead of just a var?
Implement eq : value * value -> bool to compare the structure of two SmiLe values recursively. Use ML pattern-matching for a concise implementation.
- Two non-closure SMiLe values are equivalent if and only if they have the same structure (same ML constructor) and their pieces are equivalent.
- eq returns false if any of its arguments is a closure.
Test your function on several cases.
Read eval_expr: Read several cases from eval_expr to get a feel for how the SMiLe evaluator works.
Implement Tuple Expressions in eval_expr: Evaluating a SMiLe tuple expression evaluates each SMiLe subexpression in order to create a SMiLe tuple of values. Try to accomplish this in just one extra line of ML code. (Hint: think higher-order.)
Implement Case Expressions in eval_expr: Implement evaluation of SMiLe Case expressions. A Case expression carries an expression, e, to evaluate and match against a (pattern * expr) list. All told, this task will require adding 1 line directly in the Case branch of eval_expr and roughly 25-45 lines split between the eval_cases and match functions (stubs are provided). To accomplish pattern-matching for SMiLe, you implement two ML helper functions that currently just raise an Unimplemented exception:
- match is a recursive function that checks to see if a SMiLe value matches a single SMiLe pattern. The match function takes four arguments in curried form:
  1. The function to call to add new bindings to a SMiLe environment.
  2. The SMiLe value to match against.
  3. The SMiLe pattern to try.
  4. The SMiLe environment to extend.
  If the SMiLe value and SMiLe pattern match (as defined below), match returns SOME of an environment that is the initial environment (as passed to match), extended with all the bindings the SMiLe pattern-match introduces. If there is no match, match returns NONE. The implementation should use ML pattern-matching to ML-pattern-match simultaneously on an ML constructor for a SMiLe value and an ML constructor for a SMiLe pattern.
  
  Given a SMiLe value v and a SMiLe pattern p, either p matches v or not. If it does, the match returns an environment extended with new var * value bindings. The order in which bindings are added in a match does not matter. Assume that one pattern binds each possible variable at most once. In other words, no pattern will include VarPattern "x" more than once, so shadowing and binding order are non-issues.
  
  The rules for SMiLe pattern matching should be unsurprising:
  - Wildcard matches everything and produces no bindings.
  - VarPattern s matches any value v and binds s to v.
  - UnitPattern matches only UnitValue and produces no bindings.
  - BoolPattern false matches only BoolValue false and produces no bindings (and similarly for true).
  - IntPattern 17 matches only IntValue 17 and produces no bindings (and similarly for other integers).
  - StringPattern "smile" matches only StringValue "smile" and produces no bindings (and similarly for other strings).
  - TuplePattern ps matches a value of the form TupleValue vs if ps and vs have the same length and each element of ps matches the corresponding element of vs. The match introduces all bindings introduced by all elementwise submatches.
  - TagPattern (s1,p) matches TaggedValue(s2,v) if s1 and s2 are the same ML string (via ML =) and p matches v. The match introduces all bindings introduced by the match of p and v. We call the strings s1 and s2 the constructor name or tag name.
  - Nothing else matches.
  Make sure to use the bind function provided as an argument when you need to introduce new bindings into an environment. This allows the interpreter to print top-level bindings as they are introduced (while introducing other bindings silently in subexpressions).
- eval_cases is a function that takes three arguments:
  - v : value, the SMiLe value to match against.
  - cases : (pattern * expr) list, the ordered list of pairs of SMiLe pattern and corresponding SMiLe result expression.
  - env : environment, the environment to extend with match bindings and use to evaluate the chosen result expression.
  [Updated Saturday 31 October to clarify how to call match.] eval_cases iterates through the list of cases in order, applying match to v and each SMiLe pattern until it finds one that matches. (When calling match here, give Env.bind as its first argument.) For the first SMiLe pattern that matches v, eval_cases evaluates the associated SMiLe result expression from the list, using environment env, extended with all bindings introduced by the matching SMiLe pattern. If no pattern matches, call match_error to report the error.
[Updated Tuesday 3 November with more directed testing suggestions.] Test match and eval_cases by calling these functions directly in the SML/NJ REPL on ML values of the appropriate argument types. You may need to make some helpers visible as described in the general testing guide. Here is an example test of match, expressed as a SML code in a file. To use this on the REPL, add semicolons.
```
 open Ast
 open Runtime
 (* Match a tuple value against a tuple pattern of same width.
    Should produce an environment with x=7, y=8.
    Using a pattern binding to assert that match returns SOME environment *)
 val (SOME env) = Runtime.match Env.bind (TupleValue [IntValue 7, IntValue 8]) (TuplePattern [VarPattern "x", VarPattern "y"]) Env.empty
 (* Print the resulting environment. *)
 val s = Runtime.show_env env
```
With both eval_cases and match implemented (in addition to the Case expression branch of eval_expr, you can also test the evaluation of SMiLe case expressions. (See also general testing guide and a quick way to get ASTs for case expressions from concrete syntax files, such as examples/pat.sml.) For example:
```
 - Runtime.eval_expr
     (Case (Tag ("::",
                 Tuple [Int 1,
                        Tag ("::",
                             Tuple [Int 2,
                                    Tag ("[]", Unit)])]),
            [(TagPattern ("[]", Unit),
              Int 0),
             (TagPattern ("::", (TuplePattern [VarPattern "x", VarPattern "xs"])),
              Plus (Var "x", 7))]))
     Env.empty;
 val it = IntValue 8 : Runtime.value
```
Once you have implemented eval_binding (the next step, much shorter than match and eval_cases), you will be able to start testing SMiLe programs written in the concrete syntax. A SMiLe test program is provided in examples/pat.sml, but it is not exhaustive.
Implement eval_binding: Implement Val and Fun bindings in eval_binding.

eval_binding takes a function for adding bindings to environments, a SMiLe binding, and the current SMiLe environment. It evaluates the SMiLe binding, and returns the SMiLe environment extended by the binding. This requires 5-10 lines of ML code, most for Val bindings.

SMiLe Val bindings evaluate their SMiLe expression, then match their SMiLe pattern against the result value. If the result matches the pattern, any bindings from the match are added to the current environment. Make use of the functions you created for pattern-matching in Case expressions.

SMiLe Fun bindings introduce a binding from the function’s name to a SMiLe closure value for the function.

Make sure to use the bind function provided as an argument when you need to introduce new bindings into an environment. This allows the interpreter to print top-level bindings as they are introduced (while introducing other bindings silently in subexpressions).
Read and describe Let expressions and the top-level eval

Describe what our provided code for Let expressions does in eval_expr. Note that it is very similar to code that appears in the eval function, which evaluates the top-level bindings of a program. Describe this as well, and how it relates to Let expressions. Write your descriptions in comments in the code.
Implement Function Application (Function Call) Expressions: Review the evaluation rule for function application and implement it in the Apply case of eval_expr. If something other than a ClosureValue is encountered where you expect the closure, use the function type_error to report it.

Recall that SMiLe functions all support multi-branch definitions. Rather than a single argument and a single body expression, SMiLe functions have lists of pairs of pattern and associated expr, matching the multi-branch function defintions familiar from ML. This means that evaluating a function call requires pattern-matching to choose the right branch and binding variables in the pattern to parts of the argument value. This should feel very similar to Case expressions!

An important distinction with our earlier graphical representation of closures is that SMiLe closures in this implementation carry an environment that does not bind the original name of the function. (That would require mutation and we can get away without it!) However, a function’s name is carried separately in the closure. To support recursive functions, you must ensure that the function’s name (if any) is bound in the environment where you evaluate its body. (Use Env.bind.)

At this point, you may run and check the results of all examples by calling Test.run(); in the REPL. This tests all of the example programs in the examples directory, which should all work at this point. If they do not, you have some debugging to do. You should also write some of your own tests as ours are by no means exhaustive. Often you will be able to repurpose older SML code for this, removing use of any types/exceptions to work with SMiLe.

Static Checking (20 points)

Implement the freevars function in static.sml. Its job is to find all the unbound (or “free”) variables used in a SMiLe program AST. Your implementation will traverse the AST, tracking a simple static environment of type (Ast.var, unit) Env.t that tracks the set of variables that will be bound in the current scope. As you descend through a binding, you will grow this environment, binding any bound variables to the unit value to indicate they are bound in the current scope. When the functions reach lists of cases (pattern/expression pairs), they should consider all possible cases.

Your implementation will follow a shape that is fairly similar to the eval functions (and their supporthing helpers) in runtime.sml, but instead of evaluating and following just one path through the AST, freevars will perform one full tree traversal, visiting every part of the tree once.

You will want to define ML helper functions to work over bindings and exprs found in programs. Use the ML form for mutually recursive functions:

fun freevars prog = ...
and freevars_binding b env = ...
and freevars_cases cases env = ...
and freevars_expr e env = ...

Optional extension: While checking for free variables, also check for conflicting bindings in a single pattern. Patterns should bind any given variable at most once. A variable cannot appear twice in a pattern. For example, the pattern (x,x) is illegal. (As an AST, this would be TuplePattern [VarPattern "x", VarPattern "x"].) Raise an exception if you find conflicting bindings in a pattern.

A Simpler Language? (5 points)

In this part, you will explore one last instance of syntactic sugar.

You have already investigated several examples of desugaring in our implementation of the SMiLe language. If all instances of an expression form can be rewritten to other forms in the language to produce an equivalent program, we know that this form is “just syntax;” it does not contribute any “interesting” semantics to the language beyond the semantics of the other existing expression forms. We will soon take this to the extreme in class, showing that it is possible to implement any computation using no more than single-argument functions, function application, and literally nothing else. For now, we content ourselves with writing off a couple more essential expression forms in our language as mere syntactic sugar.

At this point we have seen that there are many similarities between SMiLe/SML let expresssions, case expressions, and function application expressions. Evaluation of each of these involves:

evaluating one or more argument expressions, pattern-matching against the result, and introducing bindings to parts (or the entirety) of the result:
- in function calls: the argument expression
- in let expressions: the binding expressions
- in case expressions: the target expression
and evaluating one “body expression” in some environment extended with these bindings:
- in function calls: evaluate the body expression in the definition environment extended by the match
- in let expressions: evaluate the body expression in the current environment extended by the matches
- in case expressions: evaluate the expression following the matching pattern in the current environment extended by the match

Currently, the interpreter exploits these similarities to share significant code among the evaluation rules for these three expression forms. Nonetheless, the interpreter would be simpler if we had fewer evaluation rules.

Your final task is to implement two of the three (let, case, or function application) as sugar in terms of the third.

Save this for last; make a copy before you start.

Before attempting the final desugaring:

Complete all earlier stages of the interpreter.
hg cp runtime.sml runtime-sugar.sml
hg cp static.sml static-sugar.sml
Do your work for this part in the *-sugar.sml files and compile using the provided sources-sugar.cm instead of the original sources.cm.

After making the copy, modify the relevant branches in Static.desugar_expr to introduce the new rewriting and modify Runtime.eval_expr to raise SmileSugarError if the newly-eliminated expression forms are ever encountered at runtime.

Optional Extensions

If you are looking for more, consider the following extensions, or come chat for other ideas.

Closure Size Optimization

Adapt your implementation of helper functions for freevars (functions that find free variables) to optimize the space used for function closures. Our current implementation of closures keeps around the entire environment, even the parts that the closure will never need. By running freevars_expr on the body of a function starting from a static environment that contains only variables that will be bound as part of the function application evaluation rule, you can detect the free variables of the closure – those variables that may be referenced when no binding is in scope within the give expression. These are the only variables the closure could ever need from its environment. Thus, we might save space by storing an environment in the closure that binds only these variables. Do this.

Types

For a big challenge, add an ML datatype typ describing types of SMiLe expressions. Add a typecheck function that takes a SMiLe expr, a static environment mapping variables to types, and a proposed typ for the expr, and decides whether expr may have this type. Emulate the SML type system as closely as possible.

Interpretive Dance

Contents