Interpretive Dance
- Checkpoint: Aim to complete the map, AST, and runtime sections by 11:00pm Thursday 5 November
- Due: 11:00pm Monday 9 November
- Starter files:
- fork wellesleycs251 / cs251-smile and add bpw as admin
- Submission:
- Estimate the time you spent on each section in
time.txt. - Commit and push your completed work to your Bitbucket fork.
- Estimate the time you spent on each section in
- Relevant reading:
- SML References
- For CS 235 concurrent enrollees: Ocaml vs. SML:
- Tools:
- Collaboration:
- All problems fall under the group collaboration policy. You may collaborate with 0 or 1 other students to produce one solution stored in one repository. Groups should act as individuals usually do with regards to discussing ideas. The group’s writing and code must be their own. Group members are expected to contribute equally and accomplish a large majority of the work (90%) together in person.
In this assignment, you will implement an interpreter for SMiLe, a programming language designed just for this course. SMiLe shares its syntax and evaluation semantics with a subset of Standard ML, but does dynamic type-checking instead of static type-checking.
Contents
MAP (Environment) Abstract Data Type (15 points)
A MAP signature is provided and documented in map.sig. Maps map keys to values. We provide an Env structure in env.sml that implements maps (environments) as association lists, with some additional operations. Your first task is to implement and discuss an alternative representation of maps (environments).
-
Implement a
FunMapstructure inmap.sml, ascribing theFUNMAPsignature, which also supports defining a map from a given function. Your implement should represent maps purely as function closures. The ideas to implement these are quite similar to those for theSETADTs we (partly) implemented in class. A map-representing function returns an option when called on a key:NONEif the given key is unbound in the map represented by that function,SOME vif the given key is bound to valuevin the map represented by that function. Implement all bindings described inFUNMAP. -
Test your implementation to make sure it functions as expected on a range of cases.
-
One of the bindings in the
MAPsignature could be implemented externally by clients without knowing the concrete type of maps. Which one? Is there an advantage to implementing it as a primitive ADT operation besides saving clients the trouble of writing it? -
Consider the trade-offs between list representations, function representations, and at least one other (hypothetical) representation for sets and for maps. Examples of other representations include hash tables, heaps, tries, etc. You do not need to implement your extra representation. Answer the following questions in
answers.txt:- What distinguishes function-representable but non-list-representable sets and maps from list-representable sets and maps? Give examples and explain the fundamental distinction that separates sets/maps that can be represented with, e.g., lists and those that cannot (but can be represented by functions).
- What is the fundamental operation over collections that is not possible to implement for function representations of general sets and maps? (Operations like
toStringandtoListare impossible because of this limitation.) - Which representation (lists or functions) is more space-efficient for use cases where insertion of duplicate set elements or binding of duplicate map keys (shadowing) is common? How much does your answer depend on implementation choices vs. fundamental limits of the representation?
-
For the last couple points on this problem, convert
map.smlto include zero uses of thefn ... => ..syntax. Use exactly onefunbinding per binding inFUNMAP. (Hint: exploit currying and partial application.)
Testing [updated 3 November]: Test your implementation by building maps and performing bind, unbind and lookup operations. For example, you might try these and reason about what they should yield:
- val m = FunMap.empty;
- FunMap.lookup "x" m;
- val m1 = FunMap.bind ("x", 7) m;
- FunMap.lookup "x" m1;
- val m2 = FunMap.bind ("y", 8) m1;
- FunMap.lookup "x" m2;
- FunMap.lookup "y" m2;
- FunMap.lookup "x" (FunMap.unbind "x" m2);
- FunMap.lookup 73 (FunMap.define (fn n => SOME (n div 2)));
- FunMap.lookup "x" (FunMap.make [("y", 4),("x", 5),("x",2)];SMiLe
In this part, you will complete the implementation of an interpreter for SMiLe, a dynamically-typed programming language that is otherwise a syntactic and semantic subset of Standard ML. This section will guide you through some exercises to complete the interpreter.
Since the SMiLe implementation is a non-trivial piece of code, you are encouraged to test and hg commit your changes often. Then if you break something and have trouble debugging it, you can more easily recover a recent working version rather than starting from scratch.
The SMiLe Language
The SMiLe language is similar to Standard ML. It uses a subset of the SML syntax and a subset of the SML semantics, with the exception that it performs dynamic type-checking instead of static type-checking.
Concrete Syntax
You will work mainly with the abstract syntax of SMiLe programs represented as tree data structures. However, to ease the job of testing (and increase the amount of fun possible once your interpreter is working), we have provided a concrete syntax and parser that converts from this syntax to the representation you will use inside the representation.
We use an Extended Backus Normal Form (EBNF) grammar to describe the concrete syntax of SMiLe programs. Nonterminals are written as <nonterm>. Terminals are any symbol other than angle-bracketed names, |, ::=, +, *, or curly brackets. Whenever any of these symbols is required as a terminal in the grammar, it is surrounded by quotes to distinguish it from the a symbol with meaning in the syntax of grammars themselves. The productions of each nonterminal are listed with:
<nonterm> ::= ... production 1 ...
| ... production 2 ...
| ...The “Extended” form adds the ability to write regular expressions as part of the grammar productions to succinctly express several possible productions in just one. (Take CS 235 and CS 301 for more about regular expressions, concrete syntax, grammars, parsing, etc.) The one extension of BNF form that we will use is the postfix + and * symbols, which indicate “one or more of the preceding thing, in sequence” and “zero or more of the preceding thing, in sequence”, respectively. We use curly braces {} to group multiple items together and apply +. For example, ({<expr>,}+ <expr>) describes possible tuple expressions in the language, i.e., all strings of the form: a left parenthesis followed by one or more expressions (each followed by a comma), followed by one expression (without a comma), followed by a right parenthesis.
Below is an EBNF grammar for SMiLe. This is not the smallest possible grammar, but it matches what we have chosen for our abstratc syntax representation. It is also an ambiguous grammar, so precedence and associativity rules are needed when understanding SMiLe programs. Since the details of the concrete syntax are not the focus of this assignment, we omit those extra rules (and the more complicated grammar actually used by our parser) in this presentation. You will work with unambiguous Abstract Syntax Trees in your work implementing SMiLe. (If you are curious about the details of parsing concrete syntax, take CS 235 and CS 301 and feel free to take a peek inside our parser implementation.)
<program> ::= <binding> <program>
| <binding>
<binding> ::= val <pattern> = <expr>
| fun <var> <funcases>
<funcases> ::= <pattern> = <expr> "|" <funcases>
| <pattern> = <expr>
<pattern> ::= _ | () | true | false | <int> | <string>
| ({<pattern>,}+ <pattern>) | <tag> <pattern> | (<pattern>)
<expr> ::= () | true | false | <int> | <string>
| <unop> <expr> | <expr> <binop> <expr>
| ({<expr>,}+ <expr>) | [{<expr>,}* <expr>] | [] | <tag> <expr> | (<expr>)
| if <expr> then <expr> else <expr>
| fn <cases> | <expr> <expr> | <var>
| let <program> in <expr> end
| case <expr> of <cases>
<unop> ::= not | ~
<binop> ::= "+" | - | "*" | div | mod | ^
| = | <> | < | <= | > | >= | ::
| andalso | orelse
<cases> ::= <pattern> => <expr> "|" <cases>
| <pattern> => <expr>Abstract Syntax
The abstract syntax is defined by Abstract Syntax Trees very similar to the grammar above. You will explore the abstract syntax as part of the first problem below.
Evaluation Semantics
The evaluation semantics of SMiLe is the same as that of SML (for the subset of the language SMiLe includes), with one difference: SMiLe performs dynamic type-checking.
Differences with SML
Some notable differences with the syntax of Standard ML:
By intention:
- Type-related syntax is not supported.
- As a result, datatypes and constructors are never declared. Just use constructors (also called “tags” in our implementation) as if already declared.
Minor bugs that may be resolved soon:
~2is not currently parsed as an int, but as negation applied to 2.- Literal negative numbers (
~2) cannot be used in patterns.
Many other SML syntax features are available, including:
- Multi-branch pattern-matching and curried form in function bindings.
- Left-associative function application (curried functions can be called without parenthesization).
- Special list syntax with
[]and::. - All bindings support full patterns.
Provided Code and Infrastructure
We provide significant code to get you started.
ast.sml: the definition of Abstract Syntax Trees (ASTs) for the SMiLe langauge, annotated with their correspondence to the concrete syntax, plus functions to pretty-print SMiLe ASTs in the concrete syntaxparser/parser.sig: the signature of theParserstructure, which translates programs from strings to ASTs.- Ignore the remainder of the parser implementation in
parser/..., unless you are curious. (Take CS 301 for more about this!)
- Ignore the remainder of the parser implementation in
static.sigandstatic.sml: static operations over SMiLe ASTs, including desugaring and static scope checkingruntime.sigandruntime.sml: evaluation of SMiLe programs and ASTs
You will edit static.sml and runtime.sml in your tasks below.
Implementing a language using a similar implementation language or metalanguage has benefits and problems. The most visible problem is muddled communication or thinking about whether you are working in or discussing the implementation language (SML) or the implemented language (SMiLe). On the other hand, being forced to keep them straight helps you learn to think about the distinction more clearly. Finally, implementing many of the evaluation rules of SML as evaluation rules for SMiLe will help you understand SML more clearly. (I’m looking at you, pattern-matching!)
Running the SMiLe Implementation
There are a couple ways to run the SMiLe implementation we build.
Command-Line Wrapper
In the top-level directory, run ./smile file.sml. The executable script smile is a thin wrapper that will evaluate the given file as a SMiLe program. Its output includes much information from SML/NJ, the implementation of our implementation language, ML. The output of the SMiLe interpreter is bounded by (-: and :-):
[10:31 PM] bpw@queets: ~/251/smile$ ./smile examples/pat.sml
Standard ML of New Jersey v110.78 [built: Sun Apr 26 01:06:11 2015]
[scanning ./sources.cm]
... much additional SML/NJ compilation manager output elided here ...
[autoloading done]
val it = () : unit
[autoloading]
[autoloading done]
(-:
bind x = 10
:-)
val it = - : Runtime.environment
-
[10:31 PM] bpw@queets: ~/251/smile$Much like the SML/NJ REPL prints out val bindings that are evaluated at the top level, the SMiLe interpreter prints out bindings using a similar notation (with bind instead of val to make them more distinguishable from any SML bindings that may appear in the same SML REPL).
Options: The smile wrapper has a few options for modes of operation other than evaluating one file:
--parseor-p: Parse the file and print the AST, but do not invoke anyStaticorRuntimeactions.--desugaror-d: Parse andStatic.desugarthe file and print the desugared AST, but do not invoke any otherStaticorRuntimeactions.--testor-t: Run the provided test suite.
In the SML REPL
A more flexible alternative to the smile wrapper is to open an SML/NJ REPL in the top-level directory and call functions in the SMiLe implementation directly. To load the SMiLe implementation, run CM.make "sources.cm"; at the REPL.
Standard ML of New Jersey v110.78 [built: Sun Apr 26 01:06:11 2015]
- CM.make "sources.cm";
[autoloading]
... much additional SML/NJ compilation manager output elided here ...
[New bindings added.]
val it = true : bool
-This runs the SML/NJ Compilation Manager, which will generate a lot of output. If everything builds correctly, the result will be true (seen here bound to SML/NJ’s default REPL variable, it). If all is not well, you will see error messages and a false result.
As you make changes, fix errors, and so on, you may rerun CM.make "sources.cm"; in the same REPL session without closing and reopening as you do with use. (Unless you also use use, CM, the Compilation Manager, will get all the bindings right.)
To run a SMiLe program stored in a file once the SMiLe implementation is loaded:
- Runtime.run_file "examples/pat.sml";
(-:
bind x = 10
:-)
val it = - : Runtime.environment
-As before, output of the SMiLe implementation is bounded by (-: and :-).
There is no SMiLe REPL, per se. You can run a SMiLe program given on standard input, but it is a little awkward. (Type Control-D to end your input, but this will also close standard input and terminate the SML/NJ REPL.)
- Runtime.run TextIO.stdIn;
[autoloading]
[autoloading done]
val x = if 5 < 4 then 8 else 2*5 div 1
^D
(-:
bind x = 10
:-)
val it = - : Runtime.environment
-
Process sml finishedYou are encouraged to run whole files – this also pushes you toward writing down more of your tests.
Testing Guide
(Expanded 3 November)
Before you have completed most of the runtime.sml additions, your interpreter will be unable to handle the provided examples/* programs written in concrete syntax. Until then, we suggest testing incrementally: just call functions in your SMiLe implementation directly in the SML REPL.
Visibility
Once the SMiLe implementation is loaded (with CM.make "sources.cm";), you may call any visible function directly from the REPL. Check the .sig files for what’s visible. If a function you wish to test or call is not visible and you want to test it, make it visible. Either:
- add its type binding to the relevant
.sigfile; or - temporarily change the ascription of signature to module to use transparent ascription (
struct Runtime : RUNTIMEinstead ofstruct Runtime :> RUNTIME). This will make all bindings inRuntimevisible externally.
open
To avoid having to type prefixes for all functions/constructors in a structure, you may open the structure to import all its visible bindings into the current environment. For example open Ast now allows the use of all AST constructors without the prefix Ast. This is particularly useful when constructing ASTs or Runtime.values. Be warned: bindings introduced by open will shadow any preexisting bindings in the current environment if they have the same name.
Constructing Inputs
When testing functions that require ASTs as input, you may find it faster to write (or use preexisting) concrete syntax programs, parse them, and then copy the pieces of the AST you want to use. For this task, Parser.parse_file is quite useful, as is the ./smile --parse script. To pretty-print an AST structure p in concrete syntax, use print (Ast.show p) (or related Ast.show... functions).
Recall that, to see the full depth or breadth of an AST or other nested structure, you may need to direct the SML/NJ REPL to print more complete outputs. For example:
val _ = Control.Print.printDepth := 100;
val _ = Control.Print.printLength := 100;
val _ = Control.Print.stringDepth := 1000;As you go, collecting your tests in an SML file is quite useful. tests.sml holds high-level tests. Feel free to create other files that will automatically run you own more detailed tests.
Testing the Completed Interpreter
Once you have more parts of the interpreter complete, you will be able to run full examples using the concrete syntax. Test.run (); will run a provided test suite, running all programs in the example directory and checking for all expected bindings.
Other Helpful Tricks
Type a binding name at the REPL to see its type quickly:
- Runtime.eq;
val it = fn : Runtime.value * Runtime.value -> boolSMiLe Errors
When an error occurs in a SMiLe program (e.g., a SMiLe dynamic type-checking error), an error message will be printed starting with :-( if you use the various ..._error functions provided in runtime.sml. If an unexpected error occurs in the SMiLe implementation itself, you will see an ML exception instead.
Abstract Syntax Trees and Desugaring (15 points)
In this part you will explore the ML representation of SMiLe programs as Abstract Syntax Trees (ASTs) and desugaring of SMiLe programs. You will read ast.sml and edit static.sml.
-
Read
ast.smlto familiarize yourself with the Abstract Syntax Tree representation of SMiLe programs. ML datatypes and pattern-matching are beautifully suited to this sort of task. One ML type corresponds to one non-terminal in the (simplest) SMiLe grammar; each constructor/tag of this type corresponds to one production of that non-terminal. The keywordandsupports mutually-recursive datatypes or functions.- Why do you think SMiLe tuple expressions are represented with an ML list of SMiLe expressions instead of an ML tuple?
- Write the ML representation of the AST for
examples/fact.smlmanually. Do the translation from concrete syntax to AST yourself without the help of the parser. Once you have translated it, check your answer by running./smile --parse examples/fact.smlto see the AST the parser produces.
-
Early Desugaring: The provided parser for SMiLe’s concrete syntax does some desugaring before constructing ASTs.
- Examine the provided SMiLe file
examples/lists.sml. - Find two or three syntactic constructs (at least one in
funbindings and one in constructors/tags) used in these programs that are not representable directly by our ASTs. Describe how the parser appears to rewrite each syntax to fit our AST representation.
You may find it useful to examine the AST generated by the parser for
lists.sml. You can do this from the SML REPL as described elsewhere, or you can just run./smile --parse examples/lists.smlin the top-level repository directory. If you use the SML REPL, you can callParser.parse_file ...to get an AST and then pretty-print it withprint (Ast.show ...)to show the AST in concrete syntax and inspect signficant differences with the original program. You may find that the result is no longer quite legal SML/SMiLe syntax due to a trick the parser has used to generate variable names for internal use inside the interpreter. - Examine the provided SMiLe file
-
AST Desugaring: The
Staticstructure instatic.smlimplements functions for additional desugaring of ASTs by replacing all occurrences of AST nodes with combinations of other nodes that have equivalent behavior. Read the implementations of thedesugarfunctions. Our interpreter implementation will not implementIfThenElseexpressions.- Change the
desugar_exprfunction to rewriteIfThenElseexpressions to equivalentCaseexpressions. - Change the existing desugaring of
AndAlsoandOrElseAST nodes to work with the interpreter as well.
- Change the
SMiLe Runtime Evaluator (45 points)
In this part you will explore, modify, and extend the SMiLe interpreter in runtime.sml.
-
Warmup: Read
runtime.smlup through the definition of theenvironmenttype. Answer the following questions succinctly:- How do values built by the
ClosureValueconstructor differ from the closures we have drawn in environment diagrams? Speculate on reasons for these differences. - Why is the function name position a
var optioninstead of just avar?
- How do values built by the
-
Implement
eq : value * value -> boolto compare the structure of two SmiLevalues recursively. Use ML pattern-matching for a concise implementation.- Two non-closure SMiLe
values are equivalent if and only if they have the same structure (same ML constructor) and their pieces are equivalent. eqreturns false if any of its arguments is a closure.
Test your function on several cases.
- Two non-closure SMiLe
-
Read
eval_expr: Read several cases fromeval_exprto get a feel for how the SMiLe evaluator works. -
Implement
TupleExpressions ineval_expr: Evaluating a SMiLe tuple expression evaluates each SMiLe subexpression in order to create a SMiLe tuple of values. Try to accomplish this in just one extra line of ML code. (Hint: think higher-order.) -
Implement
CaseExpressions ineval_expr: Implement evaluation of SMiLeCaseexpressions. ACaseexpression carries an expression,e, to evaluate and match against a(pattern * expr) list. All told, this task will require adding 1 line directly in theCasebranch ofeval_exprand roughly 25-45 lines split between theeval_casesandmatchfunctions (stubs are provided). To accomplish pattern-matching for SMiLe, you implement two ML helper functions that currently just raise anUnimplementedexception:-
matchis a recursive function that checks to see if a SMiLevaluematches a single SMiLepattern. Thematchfunction takes four arguments in curried form:- The function to call to add new bindings to a SMiLe environment.
- The SMiLe
valueto match against. - The SMiLe
patternto try. - The SMiLe
environmentto extend.
If the SMiLe
valueand SMiLepatternmatch (as defined below),matchreturnsSOMEof anenvironmentthat is the initial environment (as passed tomatch), extended with all the bindings the SMiLe pattern-match introduces. If there is no match,matchreturnsNONE. The implementation should use ML pattern-matching to ML-pattern-match simultaneously on an ML constructor for a SMiLevalueand an ML constructor for a SMiLepattern.Given a SMiLe
valuevand a SMiLepatternp, eitherpmatchesvor not. If it does, the match returns anenvironmentextended with newvar * valuebindings. The order in which bindings are added in a match does not matter. Assume that one pattern binds each possible variable at most once. In other words, no pattern will includeVarPattern "x"more than once, so shadowing and binding order are non-issues.The rules for SMiLe pattern matching should be unsurprising:
Wildcardmatches everything and produces no bindings.VarPattern smatches any valuevand bindsstov.UnitPatternmatches onlyUnitValueand produces no bindings.BoolPattern falsematches onlyBoolValue falseand produces no bindings (and similarly fortrue).IntPattern 17matches onlyIntValue 17and produces no bindings (and similarly for other integers).StringPattern "smile"matches onlyStringValue "smile"and produces no bindings (and similarly for other strings).TuplePattern psmatches a value of the formTupleValue vsifpsandvshave the same length and each element ofpsmatches the corresponding element ofvs. The match introduces all bindings introduced by all elementwise submatches.TagPattern (s1,p)matchesTaggedValue(s2,v)ifs1ands2are the same ML string (via ML=) andpmatchesv. The match introduces all bindings introduced by the match ofpandv. We call the stringss1ands2the constructor name or tag name.- Nothing else matches.
Make sure to use the
bindfunction provided as an argument when you need to introduce new bindings into an environment. This allows the interpreter to print top-level bindings as they are introduced (while introducing other bindings silently in subexpressions). -
eval_casesis a function that takes three arguments:v : value, the SMiLe value to match against.cases : (pattern * expr) list, the ordered list of pairs of SMiLe pattern and corresponding SMiLe result expression.env : environment, the environment to extend with match bindings and use to evaluate the chosen result expression.
[Updated Saturday 31 October to clarify how to call
match.]eval_casesiterates through the list ofcasesin order, applyingmatchtovand each SMiLe pattern until it finds one that matches. (When callingmatchhere, giveEnv.bindas its first argument.) For the first SMiLe pattern that matchesv,eval_casesevaluates the associated SMiLe result expression from the list, using environmentenv, extended with all bindings introduced by the matching SMiLe pattern. If no pattern matches, callmatch_errorto report the error.
[Updated Tuesday 3 November with more directed testing suggestions.] Test
matchandeval_casesby calling these functions directly in the SML/NJ REPL on ML values of the appropriate argument types. You may need to make some helpers visible as described in the general testing guide. Here is an example test ofmatch, expressed as a SML code in a file. To use this on the REPL, add semicolons.open Ast open Runtime (* Match a tuple value against a tuple pattern of same width. Should produce an environment with x=7, y=8. Using a pattern binding to assert that match returns SOME environment *) val (SOME env) = Runtime.match Env.bind (TupleValue [IntValue 7, IntValue 8]) (TuplePattern [VarPattern "x", VarPattern "y"]) Env.empty (* Print the resulting environment. *) val s = Runtime.show_env envWith both
eval_casesandmatchimplemented (in addition to theCaseexpression branch ofeval_expr, you can also test the evaluation of SMiLe case expressions. (See also general testing guide and a quick way to get ASTs for case expressions from concrete syntax files, such asexamples/pat.sml.) For example:- Runtime.eval_expr (Case (Tag ("::", Tuple [Int 1, Tag ("::", Tuple [Int 2, Tag ("[]", Unit)])]), [(TagPattern ("[]", Unit), Int 0), (TagPattern ("::", (TuplePattern [VarPattern "x", VarPattern "xs"])), Plus (Var "x", 7))])) Env.empty; val it = IntValue 8 : Runtime.valueOnce you have implemented
eval_binding(the next step, much shorter thanmatchandeval_cases), you will be able to start testing SMiLe programs written in the concrete syntax. A SMiLe test program is provided inexamples/pat.sml, but it is not exhaustive. -
-
Implement
eval_binding: ImplementValandFunbindings ineval_binding.eval_bindingtakes a function for adding bindings to environments, a SMiLebinding, and the current SMiLeenvironment. It evaluates the SMiLe binding, and returns the SMiLe environment extended by the binding. This requires 5-10 lines of ML code, most forValbindings.SMiLe
Valbindings evaluate their SMiLe expression, then match their SMiLe pattern against the result value. If the result matches the pattern, any bindings from the match are added to the current environment. Make use of the functions you created for pattern-matching inCaseexpressions.SMiLe
Funbindings introduce a binding from the function’s name to a SMiLe closure value for the function.Make sure to use the
bindfunction provided as an argument when you need to introduce new bindings into an environment. This allows the interpreter to print top-level bindings as they are introduced (while introducing other bindings silently in subexpressions). -
Read and describe
Letexpressions and the top-levelevalDescribe what our provided code for
Letexpressions does ineval_expr. Note that it is very similar to code that appears in theevalfunction, which evaluates the top-level bindings of a program. Describe this as well, and how it relates toLetexpressions. Write your descriptions in comments in the code. -
Implement Function Application (Function Call) Expressions: Review the evaluation rule for function application and implement it in the
Applycase ofeval_expr. If something other than aClosureValueis encountered where you expect the closure, use the functiontype_errorto report it.Recall that SMiLe functions all support multi-branch definitions. Rather than a single argument and a single body expression, SMiLe functions have lists of pairs of
patternand associatedexpr, matching the multi-branch function defintions familiar from ML. This means that evaluating a function call requires pattern-matching to choose the right branch and binding variables in the pattern to parts of the argument value. This should feel very similar toCaseexpressions!An important distinction with our earlier graphical representation of closures is that SMiLe closures in this implementation carry an environment that does not bind the original name of the function. (That would require mutation and we can get away without it!) However, a function’s name is carried separately in the closure. To support recursive functions, you must ensure that the function’s name (if any) is bound in the environment where you evaluate its body. (Use
Env.bind.)At this point, you may run and check the results of all examples by calling
Test.run();in the REPL. This tests all of the example programs in theexamplesdirectory, which should all work at this point. If they do not, you have some debugging to do. You should also write some of your own tests as ours are by no means exhaustive. Often you will be able to repurpose older SML code for this, removing use of any types/exceptions to work with SMiLe.
Static Checking (20 points)
Implement the freevars function in static.sml. Its job is to find all the unbound (or “free”) variables used in a SMiLe program AST. Your implementation will traverse the AST, tracking a simple static environment of type (Ast.var, unit) Env.t that tracks the set of variables that will be bound in the current scope. As you descend through a binding, you will grow this environment, binding any bound variables to the unit value to indicate they are bound in the current scope. When the functions reach lists of cases (pattern/expression pairs), they should consider all possible cases.
Your implementation will follow a shape that is fairly similar to the eval functions (and their supporthing helpers) in runtime.sml, but instead of evaluating and following just one path through the AST, freevars will perform one full tree traversal, visiting every part of the tree once.
You will want to define ML helper functions to work over bindings and exprs found in programs. Use the ML form for mutually recursive functions:
fun freevars prog = ...
and freevars_binding b env = ...
and freevars_cases cases env = ...
and freevars_expr e env = ...Optional extension: While checking for free variables, also check for conflicting bindings in a single pattern. Patterns should bind any given variable at most once. A variable cannot appear twice in a pattern. For example, the pattern (x,x) is illegal. (As an AST, this would be TuplePattern [VarPattern "x", VarPattern "x"].) Raise an exception if you find conflicting bindings in a pattern.
A Simpler Language? (5 points)
In this part, you will explore one last instance of syntactic sugar.
You have already investigated several examples of desugaring in our implementation of the SMiLe language. If all instances of an expression form can be rewritten to other forms in the language to produce an equivalent program, we know that this form is “just syntax;” it does not contribute any “interesting” semantics to the language beyond the semantics of the other existing expression forms. We will soon take this to the extreme in class, showing that it is possible to implement any computation using no more than single-argument functions, function application, and literally nothing else. For now, we content ourselves with writing off a couple more essential expression forms in our language as mere syntactic sugar.
At this point we have seen that there are many similarities between SMiLe/SML let expresssions, case expressions, and function application expressions. Evaluation of each of these involves:
- evaluating one or more argument expressions, pattern-matching against the result, and introducing bindings to parts (or the entirety) of the result:
- in function calls: the argument expression
- in let expressions: the binding expressions
- in case expressions: the target expression
- and evaluating one “body expression” in some environment extended with these bindings:
- in function calls: evaluate the body expression in the definition environment extended by the match
- in let expressions: evaluate the body expression in the current environment extended by the matches
- in case expressions: evaluate the expression following the matching pattern in the current environment extended by the match
Currently, the interpreter exploits these similarities to share significant code among the evaluation rules for these three expression forms. Nonetheless, the interpreter would be simpler if we had fewer evaluation rules.
Your final task is to implement two of the three (let, case, or function application) as sugar in terms of the third.
Before attempting the final desugaring:
- Complete all earlier stages of the interpreter.
hg cp runtime.sml runtime-sugar.smlhg cp static.sml static-sugar.sml- Do your work for this part in the
*-sugar.smlfiles and compile using the providedsources-sugar.cminstead of the originalsources.cm.
After making the copy, modify the relevant branches in Static.desugar_expr to introduce the new rewriting and modify Runtime.eval_expr to raise SmileSugarError if the newly-eliminated expression forms are ever encountered at runtime.
Optional Extensions
If you are looking for more, consider the following extensions, or come chat for other ideas.
Closure Size Optimization
Adapt your implementation of helper functions for freevars (functions that find free variables) to optimize the space used for function closures. Our current implementation of closures keeps around the entire environment, even the parts that the closure will never need. By running freevars_expr on the body of a function starting from a static environment that contains only variables that will be bound as part of the function application evaluation rule, you can detect the free variables of the closure – those variables that may be referenced when no binding is in scope within the give expression. These are the only variables the closure could ever need from its environment. Thus, we might save space by storing an environment in the closure that binds only these variables. Do this.
Types
For a big challenge, add an ML datatype typ describing types of SMiLe expressions. Add a typecheck function that takes a SMiLe expr, a static environment mapping variables to types, and a proposed typ for the expr, and decides whether expr may have this type. Emulate the SML type system as closely as possible.