Interpretive Dance
- Checkpoint: Aim to complete the map, AST, and runtime sections by 11:00pm Thursday 5 November
- Due: 11:00pm Monday 9 November
- Starter files:
- fork wellesleycs251 / cs251-smile and add bpw as admin
- Submission:
- Estimate the time you spent on each section in
time.txt
. - Commit and push your completed work to your Bitbucket fork.
- Estimate the time you spent on each section in
- Relevant reading:
- SML References
- For CS 235 concurrent enrollees: Ocaml vs. SML:
- Tools:
- Collaboration:
- All problems fall under the group collaboration policy. You may collaborate with 0 or 1 other students to produce one solution stored in one repository. Groups should act as individuals usually do with regards to discussing ideas. The group’s writing and code must be their own. Group members are expected to contribute equally and accomplish a large majority of the work (90%) together in person.
In this assignment, you will implement an interpreter for SMiLe, a programming language designed just for this course. SMiLe shares its syntax and evaluation semantics with a subset of Standard ML, but does dynamic type-checking instead of static type-checking.
Contents
MAP (Environment) Abstract Data Type (15 points)
A MAP
signature is provided and documented in map.sig
. Maps map keys to values. We provide an Env
structure in env.sml
that implements maps (environments) as association lists, with some additional operations. Your first task is to implement and discuss an alternative representation of maps (environments).
-
Implement a
FunMap
structure inmap.sml
, ascribing theFUNMAP
signature, which also supports defining a map from a given function. Your implement should represent maps purely as function closures. The ideas to implement these are quite similar to those for theSET
ADTs we (partly) implemented in class. A map-representing function returns an option when called on a key:NONE
if the given key is unbound in the map represented by that function,SOME v
if the given key is bound to valuev
in the map represented by that function. Implement all bindings described inFUNMAP
. -
Test your implementation to make sure it functions as expected on a range of cases.
-
One of the bindings in the
MAP
signature could be implemented externally by clients without knowing the concrete type of maps. Which one? Is there an advantage to implementing it as a primitive ADT operation besides saving clients the trouble of writing it? -
Consider the trade-offs between list representations, function representations, and at least one other (hypothetical) representation for sets and for maps. Examples of other representations include hash tables, heaps, tries, etc. You do not need to implement your extra representation. Answer the following questions in
answers.txt
:- What distinguishes function-representable but non-list-representable sets and maps from list-representable sets and maps? Give examples and explain the fundamental distinction that separates sets/maps that can be represented with, e.g., lists and those that cannot (but can be represented by functions).
- What is the fundamental operation over collections that is not possible to implement for function representations of general sets and maps? (Operations like
toString
andtoList
are impossible because of this limitation.) - Which representation (lists or functions) is more space-efficient for use cases where insertion of duplicate set elements or binding of duplicate map keys (shadowing) is common? How much does your answer depend on implementation choices vs. fundamental limits of the representation?
-
For the last couple points on this problem, convert
map.sml
to include zero uses of thefn ... => ..
syntax. Use exactly onefun
binding per binding inFUNMAP
. (Hint: exploit currying and partial application.)
Testing [updated 3 November]: Test your implementation by building maps and performing bind, unbind and lookup operations. For example, you might try these and reason about what they should yield:
- val m = FunMap.empty;
- FunMap.lookup "x" m;
- val m1 = FunMap.bind ("x", 7) m;
- FunMap.lookup "x" m1;
- val m2 = FunMap.bind ("y", 8) m1;
- FunMap.lookup "x" m2;
- FunMap.lookup "y" m2;
- FunMap.lookup "x" (FunMap.unbind "x" m2);
- FunMap.lookup 73 (FunMap.define (fn n => SOME (n div 2)));
- FunMap.lookup "x" (FunMap.make [("y", 4),("x", 5),("x",2)];
SMiLe
In this part, you will complete the implementation of an interpreter for SMiLe, a dynamically-typed programming language that is otherwise a syntactic and semantic subset of Standard ML. This section will guide you through some exercises to complete the interpreter.
Since the SMiLe implementation is a non-trivial piece of code, you are encouraged to test and hg commit
your changes often. Then if you break something and have trouble debugging it, you can more easily recover a recent working version rather than starting from scratch.
The SMiLe Language
The SMiLe language is similar to Standard ML. It uses a subset of the SML syntax and a subset of the SML semantics, with the exception that it performs dynamic type-checking instead of static type-checking.
Concrete Syntax
You will work mainly with the abstract syntax of SMiLe programs represented as tree data structures. However, to ease the job of testing (and increase the amount of fun possible once your interpreter is working), we have provided a concrete syntax and parser that converts from this syntax to the representation you will use inside the representation.
We use an Extended Backus Normal Form (EBNF) grammar to describe the concrete syntax of SMiLe programs. Nonterminals are written as <nonterm>
. Terminals are any symbol other than angle-bracketed names, |
, ::=
, +
, *
, or curly brackets. Whenever any of these symbols is required as a terminal in the grammar, it is surrounded by quotes to distinguish it from the a symbol with meaning in the syntax of grammars themselves. The productions of each nonterminal are listed with:
<nonterm> ::= ... production 1 ...
| ... production 2 ...
| ...
The “Extended” form adds the ability to write regular expressions as part of the grammar productions to succinctly express several possible productions in just one. (Take CS 235 and CS 301 for more about regular expressions, concrete syntax, grammars, parsing, etc.) The one extension of BNF form that we will use is the postfix +
and *
symbols, which indicate “one or more of the preceding thing, in sequence” and “zero or more of the preceding thing, in sequence”, respectively. We use curly braces {}
to group multiple items together and apply +
. For example, ({<expr>,}+ <expr>)
describes possible tuple expressions in the language, i.e., all strings of the form: a left parenthesis followed by one or more expressions (each followed by a comma), followed by one expression (without a comma), followed by a right parenthesis.
Below is an EBNF grammar for SMiLe. This is not the smallest possible grammar, but it matches what we have chosen for our abstratc syntax representation. It is also an ambiguous grammar, so precedence and associativity rules are needed when understanding SMiLe programs. Since the details of the concrete syntax are not the focus of this assignment, we omit those extra rules (and the more complicated grammar actually used by our parser) in this presentation. You will work with unambiguous Abstract Syntax Trees in your work implementing SMiLe. (If you are curious about the details of parsing concrete syntax, take CS 235 and CS 301 and feel free to take a peek inside our parser implementation.)
<program> ::= <binding> <program>
| <binding>
<binding> ::= val <pattern> = <expr>
| fun <var> <funcases>
<funcases> ::= <pattern> = <expr> "|" <funcases>
| <pattern> = <expr>
<pattern> ::= _ | () | true | false | <int> | <string>
| ({<pattern>,}+ <pattern>) | <tag> <pattern> | (<pattern>)
<expr> ::= () | true | false | <int> | <string>
| <unop> <expr> | <expr> <binop> <expr>
| ({<expr>,}+ <expr>) | [{<expr>,}* <expr>] | [] | <tag> <expr> | (<expr>)
| if <expr> then <expr> else <expr>
| fn <cases> | <expr> <expr> | <var>
| let <program> in <expr> end
| case <expr> of <cases>
<unop> ::= not | ~
<binop> ::= "+" | - | "*" | div | mod | ^
| = | <> | < | <= | > | >= | ::
| andalso | orelse
<cases> ::= <pattern> => <expr> "|" <cases>
| <pattern> => <expr>
Abstract Syntax
The abstract syntax is defined by Abstract Syntax Trees very similar to the grammar above. You will explore the abstract syntax as part of the first problem below.
Evaluation Semantics
The evaluation semantics of SMiLe is the same as that of SML (for the subset of the language SMiLe includes), with one difference: SMiLe performs dynamic type-checking.
Differences with SML
Some notable differences with the syntax of Standard ML:
By intention:
- Type-related syntax is not supported.
- As a result, datatypes and constructors are never declared. Just use constructors (also called “tags” in our implementation) as if already declared.
Minor bugs that may be resolved soon:
~2
is not currently parsed as an int, but as negation applied to 2.- Literal negative numbers (
~2
) cannot be used in patterns.
Many other SML syntax features are available, including:
- Multi-branch pattern-matching and curried form in function bindings.
- Left-associative function application (curried functions can be called without parenthesization).
- Special list syntax with
[]
and::
. - All bindings support full patterns.
Provided Code and Infrastructure
We provide significant code to get you started.
ast.sml
: the definition of Abstract Syntax Trees (ASTs) for the SMiLe langauge, annotated with their correspondence to the concrete syntax, plus functions to pretty-print SMiLe ASTs in the concrete syntaxparser/parser.sig
: the signature of theParser
structure, which translates programs from strings to ASTs.- Ignore the remainder of the parser implementation in
parser/...
, unless you are curious. (Take CS 301 for more about this!)
- Ignore the remainder of the parser implementation in
static.sig
andstatic.sml
: static operations over SMiLe ASTs, including desugaring and static scope checkingruntime.sig
andruntime.sml
: evaluation of SMiLe programs and ASTs
You will edit static.sml
and runtime.sml
in your tasks below.
Implementing a language using a similar implementation language or metalanguage has benefits and problems. The most visible problem is muddled communication or thinking about whether you are working in or discussing the implementation language (SML) or the implemented language (SMiLe). On the other hand, being forced to keep them straight helps you learn to think about the distinction more clearly. Finally, implementing many of the evaluation rules of SML as evaluation rules for SMiLe will help you understand SML more clearly. (I’m looking at you, pattern-matching!)
Running the SMiLe Implementation
There are a couple ways to run the SMiLe implementation we build.
Command-Line Wrapper
In the top-level directory, run ./smile file.sml
. The executable script smile
is a thin wrapper that will evaluate the given file as a SMiLe program. Its output includes much information from SML/NJ, the implementation of our implementation language, ML. The output of the SMiLe interpreter is bounded by (-:
and :-)
:
[10:31 PM] bpw@queets: ~/251/smile$ ./smile examples/pat.sml
Standard ML of New Jersey v110.78 [built: Sun Apr 26 01:06:11 2015]
[scanning ./sources.cm]
... much additional SML/NJ compilation manager output elided here ...
[autoloading done]
val it = () : unit
[autoloading]
[autoloading done]
(-:
bind x = 10
:-)
val it = - : Runtime.environment
-
[10:31 PM] bpw@queets: ~/251/smile$
Much like the SML/NJ REPL prints out val
bindings that are evaluated at the top level, the SMiLe interpreter prints out bindings using a similar notation (with bind
instead of val
to make them more distinguishable from any SML bindings that may appear in the same SML REPL).
Options: The smile
wrapper has a few options for modes of operation other than evaluating one file:
--parse
or-p
: Parse the file and print the AST, but do not invoke anyStatic
orRuntime
actions.--desugar
or-d
: Parse andStatic.desugar
the file and print the desugared AST, but do not invoke any otherStatic
orRuntime
actions.--test
or-t
: Run the provided test suite.
In the SML REPL
A more flexible alternative to the smile
wrapper is to open an SML/NJ REPL in the top-level directory and call functions in the SMiLe implementation directly. To load the SMiLe implementation, run CM.make "sources.cm";
at the REPL.
Standard ML of New Jersey v110.78 [built: Sun Apr 26 01:06:11 2015]
- CM.make "sources.cm";
[autoloading]
... much additional SML/NJ compilation manager output elided here ...
[New bindings added.]
val it = true : bool
-
This runs the SML/NJ Compilation Manager, which will generate a lot of output. If everything builds correctly, the result will be true
(seen here bound to SML/NJ’s default REPL variable, it
). If all is not well, you will see error messages and a false
result.
As you make changes, fix errors, and so on, you may rerun CM.make "sources.cm";
in the same REPL session without closing and reopening as you do with use
. (Unless you also use use
, CM
, the Compilation Manager, will get all the bindings right.)
To run a SMiLe program stored in a file once the SMiLe implementation is loaded:
- Runtime.run_file "examples/pat.sml";
(-:
bind x = 10
:-)
val it = - : Runtime.environment
-
As before, output of the SMiLe implementation is bounded by (-:
and :-)
.
There is no SMiLe REPL, per se. You can run a SMiLe program given on standard input, but it is a little awkward. (Type Control-D to end your input, but this will also close standard input and terminate the SML/NJ REPL.)
- Runtime.run TextIO.stdIn;
[autoloading]
[autoloading done]
val x = if 5 < 4 then 8 else 2*5 div 1
^D
(-:
bind x = 10
:-)
val it = - : Runtime.environment
-
Process sml finished
You are encouraged to run whole files – this also pushes you toward writing down more of your tests.
Testing Guide
(Expanded 3 November)
Before you have completed most of the runtime.sml
additions, your interpreter will be unable to handle the provided examples/*
programs written in concrete syntax. Until then, we suggest testing incrementally: just call functions in your SMiLe implementation directly in the SML REPL.
Visibility
Once the SMiLe implementation is loaded (with CM.make "sources.cm";)
, you may call any visible function directly from the REPL. Check the .sig
files for what’s visible. If a function you wish to test or call is not visible and you want to test it, make it visible. Either:
- add its type binding to the relevant
.sig
file; or - temporarily change the ascription of signature to module to use transparent ascription (
struct Runtime : RUNTIME
instead ofstruct Runtime :> RUNTIME
). This will make all bindings inRuntime
visible externally.
open
To avoid having to type prefixes for all functions/constructors in a structure, you may open
the structure to import all its visible bindings into the current environment. For example open Ast
now allows the use of all AST constructors without the prefix Ast.
This is particularly useful when constructing ASTs or Runtime.value
s. Be warned: bindings introduced by open
will shadow any preexisting bindings in the current environment if they have the same name.
Constructing Inputs
When testing functions that require ASTs as input, you may find it faster to write (or use preexisting) concrete syntax programs, parse them, and then copy the pieces of the AST you want to use. For this task, Parser.parse_file
is quite useful, as is the ./smile --parse
script. To pretty-print an AST structure p
in concrete syntax, use print (Ast.show p)
(or related Ast.show...
functions).
Recall that, to see the full depth or breadth of an AST or other nested structure, you may need to direct the SML/NJ REPL to print more complete outputs. For example:
val _ = Control.Print.printDepth := 100;
val _ = Control.Print.printLength := 100;
val _ = Control.Print.stringDepth := 1000;
As you go, collecting your tests in an SML file is quite useful. tests.sml
holds high-level tests. Feel free to create other files that will automatically run you own more detailed tests.
Testing the Completed Interpreter
Once you have more parts of the interpreter complete, you will be able to run full examples using the concrete syntax. Test.run ();
will run a provided test suite, running all programs in the example
directory and checking for all expected bindings.
Other Helpful Tricks
Type a binding name at the REPL to see its type quickly:
- Runtime.eq;
val it = fn : Runtime.value * Runtime.value -> bool
SMiLe Errors
When an error occurs in a SMiLe program (e.g., a SMiLe dynamic type-checking error), an error message will be printed starting with :-(
if you use the various ..._error
functions provided in runtime.sml
. If an unexpected error occurs in the SMiLe implementation itself, you will see an ML exception instead.
Abstract Syntax Trees and Desugaring (15 points)
In this part you will explore the ML representation of SMiLe programs as Abstract Syntax Trees (ASTs) and desugaring of SMiLe programs. You will read ast.sml
and edit static.sml
.
-
Read
ast.sml
to familiarize yourself with the Abstract Syntax Tree representation of SMiLe programs. ML datatypes and pattern-matching are beautifully suited to this sort of task. One ML type corresponds to one non-terminal in the (simplest) SMiLe grammar; each constructor/tag of this type corresponds to one production of that non-terminal. The keywordand
supports mutually-recursive datatypes or functions.- Why do you think SMiLe tuple expressions are represented with an ML list of SMiLe expressions instead of an ML tuple?
- Write the ML representation of the AST for
examples/fact.sml
manually. Do the translation from concrete syntax to AST yourself without the help of the parser. Once you have translated it, check your answer by running./smile --parse examples/fact.sml
to see the AST the parser produces.
-
Early Desugaring: The provided parser for SMiLe’s concrete syntax does some desugaring before constructing ASTs.
- Examine the provided SMiLe file
examples/lists.sml
. - Find two or three syntactic constructs (at least one in
fun
bindings and one in constructors/tags) used in these programs that are not representable directly by our ASTs. Describe how the parser appears to rewrite each syntax to fit our AST representation.
You may find it useful to examine the AST generated by the parser for
lists.sml
. You can do this from the SML REPL as described elsewhere, or you can just run./smile --parse examples/lists.sml
in the top-level repository directory. If you use the SML REPL, you can callParser.parse_file ...
to get an AST and then pretty-print it withprint (Ast.show ...)
to show the AST in concrete syntax and inspect signficant differences with the original program. You may find that the result is no longer quite legal SML/SMiLe syntax due to a trick the parser has used to generate variable names for internal use inside the interpreter. - Examine the provided SMiLe file
-
AST Desugaring: The
Static
structure instatic.sml
implements functions for additional desugaring of ASTs by replacing all occurrences of AST nodes with combinations of other nodes that have equivalent behavior. Read the implementations of thedesugar
functions. Our interpreter implementation will not implementIfThenElse
expressions.- Change the
desugar_expr
function to rewriteIfThenElse
expressions to equivalentCase
expressions. - Change the existing desugaring of
AndAlso
andOrElse
AST nodes to work with the interpreter as well.
- Change the
SMiLe Runtime Evaluator (45 points)
In this part you will explore, modify, and extend the SMiLe interpreter in runtime.sml
.
-
Warmup: Read
runtime.sml
up through the definition of theenvironment
type. Answer the following questions succinctly:- How do values built by the
ClosureValue
constructor differ from the closures we have drawn in environment diagrams? Speculate on reasons for these differences. - Why is the function name position a
var option
instead of just avar
?
- How do values built by the
-
Implement
eq : value * value -> bool
to compare the structure of two SmiLevalue
s recursively. Use ML pattern-matching for a concise implementation.- Two non-closure SMiLe
value
s are equivalent if and only if they have the same structure (same ML constructor) and their pieces are equivalent. eq
returns false if any of its arguments is a closure.
Test your function on several cases.
- Two non-closure SMiLe
-
Read
eval_expr
: Read several cases fromeval_expr
to get a feel for how the SMiLe evaluator works. -
Implement
Tuple
Expressions ineval_expr
: Evaluating a SMiLe tuple expression evaluates each SMiLe subexpression in order to create a SMiLe tuple of values. Try to accomplish this in just one extra line of ML code. (Hint: think higher-order.) -
Implement
Case
Expressions ineval_expr
: Implement evaluation of SMiLeCase
expressions. ACase
expression carries an expression,e
, to evaluate and match against a(pattern * expr) list
. All told, this task will require adding 1 line directly in theCase
branch ofeval_expr
and roughly 25-45 lines split between theeval_cases
andmatch
functions (stubs are provided). To accomplish pattern-matching for SMiLe, you implement two ML helper functions that currently just raise anUnimplemented
exception:-
match
is a recursive function that checks to see if a SMiLevalue
matches a single SMiLepattern
. Thematch
function takes four arguments in curried form:- The function to call to add new bindings to a SMiLe environment.
- The SMiLe
value
to match against. - The SMiLe
pattern
to try. - The SMiLe
environment
to extend.
If the SMiLe
value
and SMiLepattern
match (as defined below),match
returnsSOME
of anenvironment
that is the initial environment (as passed tomatch
), extended with all the bindings the SMiLe pattern-match introduces. If there is no match,match
returnsNONE
. The implementation should use ML pattern-matching to ML-pattern-match simultaneously on an ML constructor for a SMiLevalue
and an ML constructor for a SMiLepattern
.Given a SMiLe
value
v
and a SMiLepattern
p
, eitherp
matchesv
or not. If it does, the match returns anenvironment
extended with newvar * value
bindings. The order in which bindings are added in a match does not matter. Assume that one pattern binds each possible variable at most once. In other words, no pattern will includeVarPattern "x"
more than once, so shadowing and binding order are non-issues.The rules for SMiLe pattern matching should be unsurprising:
Wildcard
matches everything and produces no bindings.VarPattern s
matches any valuev
and bindss
tov
.UnitPattern
matches onlyUnitValue
and produces no bindings.BoolPattern false
matches onlyBoolValue false
and produces no bindings (and similarly fortrue
).IntPattern 17
matches onlyIntValue 17
and produces no bindings (and similarly for other integers).StringPattern "smile"
matches onlyStringValue "smile"
and produces no bindings (and similarly for other strings).TuplePattern ps
matches a value of the formTupleValue vs
ifps
andvs
have the same length and each element ofps
matches the corresponding element ofvs
. The match introduces all bindings introduced by all elementwise submatches.TagPattern (s1,p)
matchesTaggedValue(s2,v)
ifs1
ands2
are the same ML string (via ML=
) andp
matchesv
. The match introduces all bindings introduced by the match ofp
andv
. We call the stringss1
ands2
the constructor name or tag name.- Nothing else matches.
Make sure to use the
bind
function provided as an argument when you need to introduce new bindings into an environment. This allows the interpreter to print top-level bindings as they are introduced (while introducing other bindings silently in subexpressions). -
eval_cases
is a function that takes three arguments:v : value
, the SMiLe value to match against.cases : (pattern * expr) list
, the ordered list of pairs of SMiLe pattern and corresponding SMiLe result expression.env : environment
, the environment to extend with match bindings and use to evaluate the chosen result expression.
[Updated Saturday 31 October to clarify how to call
match
.]eval_cases
iterates through the list ofcases
in order, applyingmatch
tov
and each SMiLe pattern until it finds one that matches. (When callingmatch
here, giveEnv.bind
as its first argument.) For the first SMiLe pattern that matchesv
,eval_cases
evaluates the associated SMiLe result expression from the list, using environmentenv
, extended with all bindings introduced by the matching SMiLe pattern. If no pattern matches, callmatch_error
to report the error.
[Updated Tuesday 3 November with more directed testing suggestions.] Test
match
andeval_cases
by calling these functions directly in the SML/NJ REPL on ML values of the appropriate argument types. You may need to make some helpers visible as described in the general testing guide. Here is an example test ofmatch
, expressed as a SML code in a file. To use this on the REPL, add semicolons.open Ast open Runtime (* Match a tuple value against a tuple pattern of same width. Should produce an environment with x=7, y=8. Using a pattern binding to assert that match returns SOME environment *) val (SOME env) = Runtime.match Env.bind (TupleValue [IntValue 7, IntValue 8]) (TuplePattern [VarPattern "x", VarPattern "y"]) Env.empty (* Print the resulting environment. *) val s = Runtime.show_env env
With both
eval_cases
andmatch
implemented (in addition to theCase
expression branch ofeval_expr
, you can also test the evaluation of SMiLe case expressions. (See also general testing guide and a quick way to get ASTs for case expressions from concrete syntax files, such asexamples/pat.sml
.) For example:- Runtime.eval_expr (Case (Tag ("::", Tuple [Int 1, Tag ("::", Tuple [Int 2, Tag ("[]", Unit)])]), [(TagPattern ("[]", Unit), Int 0), (TagPattern ("::", (TuplePattern [VarPattern "x", VarPattern "xs"])), Plus (Var "x", 7))])) Env.empty; val it = IntValue 8 : Runtime.value
Once you have implemented
eval_binding
(the next step, much shorter thanmatch
andeval_cases
), you will be able to start testing SMiLe programs written in the concrete syntax. A SMiLe test program is provided inexamples/pat.sml
, but it is not exhaustive. -
-
Implement
eval_binding
: ImplementVal
andFun
bindings ineval_binding
.eval_binding
takes a function for adding bindings to environments, a SMiLebinding
, and the current SMiLeenvironment
. It evaluates the SMiLe binding, and returns the SMiLe environment extended by the binding. This requires 5-10 lines of ML code, most forVal
bindings.SMiLe
Val
bindings evaluate their SMiLe expression, then match their SMiLe pattern against the result value. If the result matches the pattern, any bindings from the match are added to the current environment. Make use of the functions you created for pattern-matching inCase
expressions.SMiLe
Fun
bindings introduce a binding from the function’s name to a SMiLe closure value for the function.Make sure to use the
bind
function provided as an argument when you need to introduce new bindings into an environment. This allows the interpreter to print top-level bindings as they are introduced (while introducing other bindings silently in subexpressions). -
Read and describe
Let
expressions and the top-leveleval
Describe what our provided code for
Let
expressions does ineval_expr
. Note that it is very similar to code that appears in theeval
function, which evaluates the top-level bindings of a program. Describe this as well, and how it relates toLet
expressions. Write your descriptions in comments in the code. -
Implement Function Application (Function Call) Expressions: Review the evaluation rule for function application and implement it in the
Apply
case ofeval_expr
. If something other than aClosureValue
is encountered where you expect the closure, use the functiontype_error
to report it.Recall that SMiLe functions all support multi-branch definitions. Rather than a single argument and a single body expression, SMiLe functions have lists of pairs of
pattern
and associatedexpr
, matching the multi-branch function defintions familiar from ML. This means that evaluating a function call requires pattern-matching to choose the right branch and binding variables in the pattern to parts of the argument value. This should feel very similar toCase
expressions!An important distinction with our earlier graphical representation of closures is that SMiLe closures in this implementation carry an environment that does not bind the original name of the function. (That would require mutation and we can get away without it!) However, a function’s name is carried separately in the closure. To support recursive functions, you must ensure that the function’s name (if any) is bound in the environment where you evaluate its body. (Use
Env.bind
.)At this point, you may run and check the results of all examples by calling
Test.run();
in the REPL. This tests all of the example programs in theexamples
directory, which should all work at this point. If they do not, you have some debugging to do. You should also write some of your own tests as ours are by no means exhaustive. Often you will be able to repurpose older SML code for this, removing use of any types/exceptions to work with SMiLe.
Static Checking (20 points)
Implement the freevars
function in static.sml
. Its job is to find all the unbound (or “free”) variables used in a SMiLe program
AST. Your implementation will traverse the AST, tracking a simple static environment of type (Ast.var, unit) Env.t
that tracks the set of variables that will be bound in the current scope. As you descend through a binding, you will grow this environment, binding any bound variables to the unit value to indicate they are bound in the current scope. When the functions reach lists of cases (pattern/expression pairs), they should consider all possible cases.
Your implementation will follow a shape that is fairly similar to the eval
functions (and their supporthing helpers) in runtime.sml
, but instead of evaluating and following just one path through the AST, freevars
will perform one full tree traversal, visiting every part of the tree once.
You will want to define ML helper functions to work over binding
s and expr
s found in program
s. Use the ML form for mutually recursive functions:
fun freevars prog = ...
and freevars_binding b env = ...
and freevars_cases cases env = ...
and freevars_expr e env = ...
Optional extension: While checking for free variables, also check for conflicting bindings in a single pattern. Patterns should bind any given variable at most once. A variable cannot appear twice in a pattern. For example, the pattern (x,x)
is illegal. (As an AST, this would be TuplePattern [VarPattern "x", VarPattern "x"]
.) Raise an exception if you find conflicting bindings in a pattern.
A Simpler Language? (5 points)
In this part, you will explore one last instance of syntactic sugar.
You have already investigated several examples of desugaring in our implementation of the SMiLe language. If all instances of an expression form can be rewritten to other forms in the language to produce an equivalent program, we know that this form is “just syntax;” it does not contribute any “interesting” semantics to the language beyond the semantics of the other existing expression forms. We will soon take this to the extreme in class, showing that it is possible to implement any computation using no more than single-argument functions, function application, and literally nothing else. For now, we content ourselves with writing off a couple more essential expression forms in our language as mere syntactic sugar.
At this point we have seen that there are many similarities between SMiLe/SML let
expresssions, case
expressions, and function application expressions. Evaluation of each of these involves:
- evaluating one or more argument expressions, pattern-matching against the result, and introducing bindings to parts (or the entirety) of the result:
- in function calls: the argument expression
- in let expressions: the binding expressions
- in case expressions: the target expression
- and evaluating one “body expression” in some environment extended with these bindings:
- in function calls: evaluate the body expression in the definition environment extended by the match
- in let expressions: evaluate the body expression in the current environment extended by the matches
- in case expressions: evaluate the expression following the matching pattern in the current environment extended by the match
Currently, the interpreter exploits these similarities to share significant code among the evaluation rules for these three expression forms. Nonetheless, the interpreter would be simpler if we had fewer evaluation rules.
Your final task is to implement two of the three (let
, case
, or function application) as sugar in terms of the third.
Before attempting the final desugaring:
- Complete all earlier stages of the interpreter.
hg cp runtime.sml runtime-sugar.sml
hg cp static.sml static-sugar.sml
- Do your work for this part in the
*-sugar.sml
files and compile using the providedsources-sugar.cm
instead of the originalsources.cm
.
After making the copy, modify the relevant branches in Static.desugar_expr
to introduce the new rewriting and modify Runtime.eval_expr
to raise SmileSugarError
if the newly-eliminated expression forms are ever encountered at runtime.
Optional Extensions
If you are looking for more, consider the following extensions, or come chat for other ideas.
Closure Size Optimization
Adapt your implementation of helper functions for freevars
(functions that find free variables) to optimize the space used for function closures. Our current implementation of closures keeps around the entire environment, even the parts that the closure will never need. By running freevars_expr
on the body of a function starting from a static environment that contains only variables that will be bound as part of the function application evaluation rule, you can detect the free variables of the closure – those variables that may be referenced when no binding is in scope within the give expression. These are the only variables the closure could ever need from its environment. Thus, we might save space by storing an environment in the closure that binds only these variables. Do this.
Types
For a big challenge, add an ML datatype typ
describing types of SMiLe expressions. Add a typecheck
function that takes a SMiLe expr
, a static environment mapping variables to types, and a proposed typ
for the expr
, and decides whether expr
may have this type. Emulate the SML type system as closely as possible.