PS9: Open to Interpretation
- Dueish: Tue. Dec. 12
- Notes:
- This is the last fully required pset, which covers material through the lecture on Fri. Dec 08. It will be followed by a PS10 that has one required solo problem (also due Tue Dec 12) and a some completely optional, extra credit problems.
- Although formally “due” on Tue., Dec. 12, this (and all your other as-yet-incomplete CS251 work) should actually be completed by the end of finals period. (If you need longer than that, consult with Lyn.)
- This pset has 140 points total: 28 for one solo problem and 112 for five regular problems.
- The problems needn’t be done in order. Feel free to jump around.
- Submission:
- In your yourFullName CS251 Fall 2017 Folder, create a Google Doc named yourFullName CS251 PS9.
- At the top of your yourFullName CS251 PS9 doc, include your name, problem set number, date of submission, and an approximation of how long each problem part took.
- For all parts of all problems, include all answers (including SML code) in your PS9 google doc. Format code using a fixed-width font, like Consolas or Courier New. You can use a small font size if that helps.
-
In Problem 1–6, you will modify the following code files from starter files populating your
~wx/cs251/sml/ps9
directory on your wx Virtual Machine:- Problem 1:
LazySequence.sml
- Problem 2:
IntexArgChecker.sml
- Problem 3:
BindexToPostFix.sml
- Problem 4:
Simprex.sml
andSimprexEnvInterp.sml
- Problem 5:
ValexPS9.sml
andValexEnvInterpPS9.sml
- Problem 6:
ValexPS9.sml
andValexEnvInterpPS9.sml
Drop a copy of your entire
ps9
directory into your~/cs251/drop/ps09
drop folder oncs.wellesley.edu
. - Problem 1:
-
For Problems 1–6, you should include all modified functions from the files listed above in your Google Doc (so that I can comment on them). In the Google Doc, you need only include the functions you modified, not all functions.
-
Don’t forget to include the following written (noncode) answers in your solutions:
- The requested Intex program in Problem 2c
- The result of hand translating the Bindex program in Problem 3a into PostFix (use s-expression syntax).
- The results of evaluating the sample Simprex programs in Problem 4a.
Starting this Problem Set
Problems 1–6 all involve starter files in the ~wx/cs251/sml/ps9
directory in your wx Virtual Machine.
To create this directory, execute the following two commands in a wx VM shell:
cd ~/cs251/sml
git add -A
git commit -m "my local changes"
git pull origin master --rebase
Several students have encountered problems when executing this steps. If you have problems, please contact lyn.
1. Solo Problem: Lazy Sequences (28 points)
This is a solo problem. This means you must complete it entirely on your own without help from any other person and without consulting resources other than course materials or online documentation. You may ask Lyn for clarification, but not for help.
Background
This is a module/ADT problem that shares aspects of the FunSet
and OperationTreeSet
problems from PS8. It involves a data structure implementation involving multiple tree nodes, where some of the tree nodes have functional components.
The problem involves a list-like data structure called a sequence that has the following signature:
signature SEQUENCE = sig
(* The type of a sequence *)
type ''a t
(* An empty sequence *)
val empty : ''a t
(* Create a sequence of (hi - lo) values fcn(lo), fcn(lo+1), ..., fcn(hi-1) *)
val segment: int -> int -> (int -> ''a) -> ''a t
(* Convert a lengnth-n list into a sequence of n values *)
val fromList: ''a list -> ''a t
(* Concatenate two sequences: given a length-m sequence s and length-n sequence t,
returns a single length-m+n sequence that has all values of s followed by all
values of t *)
val concat : ''a t -> ''a t -> ''a t
(* Return the length of a sequence = number of values in it *)
val length : ''a t -> int
(* Return the nth value of a sequence (0-indexed).
Raise an IndexOutOfBounds exception for an out-of-bounds index. *)
val get : int -> ''a t -> ''a
(* Return a sequence that results from applying f to each elt *)
val map : (''a -> ''b) -> ''a t -> ''b t
(* Return a list with all elements in the sequence. The ith element of the resulting
list should be the ith element of the given sequence. *)
val toList : ''a t -> ''a list
end
It would be straightforward to implement a sequence as a list. But this implementation can have issues.
As a concrete example, consider the following inefficient ways to increment and double a number:
(* A slow incrementing function *)
fun linearIncrement n =
let fun loop num ans =
if num = 0 then ans else loop (num-1) (ans+1)
in loop n 1
end
(* A very slow doubling function *)
fun quadraticDouble n =
let fun loop num ans =
if num = 0 then ans else loop (num-1) (linearIncrement ans)
in loop n n
end
For example, quadraticDouble 10000
returns fairly quickly, but quadraticDouble 100000
takes many seconds.
Now suppose we want to create the sequence
val reallyBigSeq = segment ~200000000 200000000 quadraticDouble
There are two big problems if we try to represent this as a list. First, the list would need to contain 400 million elements. Second, calculating each entry in the list (especially for large inputs) would take a very long time. So the list implementation is bad in terms of both space and time.
An alternative implementation is not to caclulate and store each element when the sequence is created, but to calculate each element on the fly only when the get
operation is called to return it. This is the basis of the LazySequence
structure that you will implement in this problem. It is based on the following datatype:
datatype ''a t = Segment of int * int * (int -> ''a) (* lo, hi, fcn *)
| ThunkList of (unit -> ''a) list
| Concat of ''a t * ''a t
In programming languages, the term “lazy” means “not computed until necessary”. In this context, an element of a sequence will not be calculated until it needs to be known.
The Segment
constructor
In the case of reallyBigSeq
, the sequence can be represented via the sum-of-product value
Segment(~200000000,200000000,quadraticDouble)
This simply keeps track of the information for constructing the sequence. Now if we wish to get the value of this sequence at index 20000017, we can apply quadraticDouble
to (200000017 + ~200000000) = 17
and return the value quadraticDouble(17)
. This way, we only “pay” for calculations done at the indices on which we actually perform get
. This leads to the following implementation of the segment
function:
fun segment lo hi fcn = Segment(lo,hi,fcn)
Thunks
Lazy data structures often employ something called a thunk
, which is simply a function of zero arguments. Wrapping a calculation in a function of zero arguments is a way to delay it until later ratger than performing it now. For example, consider testThunk
:
val testThunk = fn () => quadraticDouble 100000
Even though calculating quadraticDouble 100000
takes many seconds, defining testThunk
is immediate, because no calculation is performed until testThunk
is called, as in testThunk()
. This is called “dethunking”, which can be defined as follows:
fun dethunk thunk = thunk()
So testThunk()
can equivalently be written dethunk(testThunk)
Note that every SML function necessarily takes exactly one argument, so the notion that thunks take “zero” arguments is loose terminology. In fact a thunk takes the “unit value”, written ()
, which is the only value of type unit
and can be viewed as a tuple with zero components.
The upside of a thunk is that the delayed computation is never performed if the thunk is never dethunked. The downside is that the calculation is performed every time the thunk is dethunked, so computations can be repeated unnecessarily.
Note that the type of testThunk
is unit -> int
, where unit
specifies that the ()
value must be supplied to the thunk in order to extract its value. In general, the type of a thunk for a value of type ''a
is unit -> ''a
, so we use this as the definition of a type abbreviation named thunkTy
:
type ''a thunkTy = unit -> ''a
The ThunkList
constructor
Whereas the Segment
constructor supports “regular” lazy sequences whose values are determined by some function over a range of integers, the ThunkList
constructor is intended to support more “irregular” sequences in which there is not a simple correspondence between sequence values and a range of integers. For example, consider these sequences:
val numSeq = fromList [300005, 20074, 4000017, 1002]
val quadSeq = map quadraticDouble numSeq
Suppose we want all the applications of quadraticDouble
to be delayed until the sequence elements are extracted. E.g., get 1 quadSeq
should calculate quadraticDouble 20074
when the get
is performed, but not before that. If the elements at indices 0 and 2 are never accessed via get
, then the time-intensive calculations quadraticDouble 300005
and quadraticDouble 4000017
are never performed.
The ThunkList
constructor takes a list of thunks:
ThunkList of ''a thunkTy list
Using ThunkList
, the quadSeq
sequence can be represented as something like:
ThunkList[ fn () => quadraticDouble 300005,
fn () => quadraticDouble 20074,
fn () => quadraticDouble 4000017,
fn () => quadraticDouble 1002 ]
Based on this idea, we can use ThunkList
to represent all “irregular” sequences constructed from lists via fromList
:
fun fromList xs = ThunkList (List.map (fn x => (fn () => x)) xs)
This means that in numSeq
, all the numbers will be thunked, and so will be equivalent to:
ThunkList[ fn () => 300005,
fn () => 20074,
fn () => 4000017,
fn () => 1002 ]]
The Concat
constructor
The final constructor in the ''a t
sum-of-product sequence datatype is the Concat
constructor:
Concat of ''a t * ''a t
This is simply used to glue together two sequences:
fun concat seq1 seq2 = Concat(seq1,seq2)
So in the end, any instance of the ''a t
sum-of-product sequence datatype is a binary tree of Concat
nodes whose leaves are either Segment
nodes or ThunkList
nodes.
Your task
The starter file for this problem is LazySequence.sml
in the ~wx/cs251/sml/ps9
directory in your wx Virtual Machine.
LazySequence.sml
contains the SEQUENCE
signature from above and the following partial implementation of the LazySequence
structure:
structure LazySequence :> SEQUENCE = struct
type ''a thunkTy = unit -> ''a
datatype ''a t = Segment of int * int * (int -> ''a) (* lo, hi, fcn *)
| ThunkList of ''a thunkTy list
| Concat of ''a t * ''a t
fun segment lo hi fcn = Segment(lo,hi,fcn)
fun fromList xs = ThunkList (List.map (fn x => (fn () => x)) xs)
fun concat seq1 seq2 = Concat(seq1,seq2)
fun dethunk thunk = thunk()
val empty = ThunkList []
fun length seq = 0 (* replace this stub *)
fun get n seq = raise Unimplemented (* replace this stub *)
fun map f seq = empty (* replace this stub *)
fun toList seq = [] (* replace this stub *)
end
The segment
, fromList
, concat
functions and empty
value have already been defined for you. Your task is to implement the remaining four functions: length
, get
, map
, and toList
. Pay attention to the following notes in your implementation.
Notes:
-
Be careful to use the explicitly qualified
List.map
andList.length
functions when operating on a list, but unqualifiedmap
andlength
operators when opreating on a sequence. Any attempt to use the unqualifiedmap
andlength
on a list will result in a type error. -
Do not use the
toList
function in the implementations oflength
,get
, andmap
. Similarly, theget
function should not be used in the implementation oftoList
. However, you may use thelength
function in the implementation ofget
. -
The
length
function should always return an integer and should never raise an exception. -
The
get
function should raise anIndexOutOfBounds
exception for a sequences
when called on a index outside the range 0 andlength(s) - 1
. In the implementation of theget
function, full credit will be awarded only if the out-of-bounds length check is peformed only one (not separately for each of the three constructors). You may use thelength
function in the implementation ofget
. -
The
map
function should always return a sequence and should never raise an exception, even if the mapped function might raise an error on one of the values. For example, inval mapSeq2 = map (fn n => 20 div n) (fromList [3, 0, 5, 2, 4])
no exception should be raised when defining
mapSeq2
or evaluatingget 0 mapSeq2
,get 2 mapSeq2
,get 3 mapSeq2
, orget 4 mapSeq2
. Butget 1 mapSeq2
should raise a divide-by-zero exception. -
The
map
function should preserve the “shape” of its input sum-of-product constructor tree. So it should return aConcat
output for aConcat
input, aSegment
output for aSegment
input, and aThunkList
output for aThunkList
input. -
In the implementation of the
map
function, it will be necessary to combine two function values into one. What is the most natural way to do this? (Hint: it’s something we’ve studied before!) -
If calculating one or more values of a sequence
s
raises an exception,toList s
should raise the exception associated with the value having the lowest index. -
The
LazySequence
structure is followed by numerous testing functions and test cases. (The expected resuls of all the test cases appear in a comment at the end of the fileLazySequence.sml
.) Here we illustrate the four functions that are used for testingget
in the context of these two example sequences:val mapSeq1 = map (fn n => 20 div n) (fromList [3, 5, 2, 4]) val mapSeq2 = map (fn n => 20 div n) (fromList [3, 0, 5, 2, 4])
-
(testGet seq)
returns a list of (index, value-at-index) pairs for each valid index inseq
. If calculating at least one of the values raises an exception,testGet
raises the exception encountered for the lowest index at which an exception is raised byget
.- testGet mapSeq1; val it = [(0,6),(1,4),(2,10),(3,5)] : (int * int) list - testGet mapSeq2; uncaught exception Div [divide by zero] raised at: /tmp/emacs-region28242nLJ:129.31-129.34 /tmp/emacs-region28242nLJ:104.29
-
(testGetRange seq lo hi)
returns a list of (index, value-at-index) pairs for each index inseq
in the rangelo
(inclusive) tohi
(exclusive). If calculating at least one of the values raises an exception,testGet
raises the exception encountered for the lowest index at which an exception is raised byget
.- testGetRange mapSeq1 1 3; val it = [(1,4),(2,10)] : (int * int) list - testGetRange mapSeq1 3 6; uncaught exception IndexOutOfBounds raised at: /tmp/emacs-region28242nLJ:68.16-68.34 /tmp/emacs-region28242nLJ:108.29 - testGetRange mapSeq2 2 5; val it = [(2,4),(3,10),(4,5)] : (int * int) list - testGetRange mapSeq2 0 3; uncaught exception Div [divide by zero] raised at: /tmp/emacs-region28242nLJ:129.31-129.34 /tmp/emacs-region28242nLJ:108.29
-
(testGetHandleExceptions eltToString seq)
returns a list of (index, string-for-value-or-exception-at-index) pairs for each valid index inseq
. TheeltToString
function must be supplied to turn each sequence value into a string. If an exception is encountered, the exception string is used in place of a value at that index.- testGetHandleException Int.toString mapSeq1; val it = [(0,"6"),(1,"4"),(2,"10"),(3,"5")] : (int * string) list - testGetHandleException Int.toString mapSeq2; val it = [(0,"6"),(1,"Error: Div -- divide by zero"),(2,"4"),(3,"10"),(4,"5")] : (int * string) list
-
(testGetRangeHandleExceptions eltToString seq lo hi)
returns a list of (index, string-for-value-or-exception-at-index) pairs for each index inseq
in the rangelo
(inclusive) tohi
(exclusive). TheeltToString
function must be supplied to turn each sequence value into a string. If an exception is encountered, the exception string is used in place of a value at that index.- testGetRangeHandleException Int.toString mapSeq1 ~1 6; val it = [(~1,"Error: IndexOutOfBounds -- ~1"),(0,"6"),(1,"4"),(2,"10"),(3,"5"), (4,"Error: IndexOutOfBounds -- 4"),(5,"Error: IndexOutOfBounds -- 5")] : (int * string) list - testGetRangeHandleException Int.toString mapSeq2 ~1 6; val it = [(~1,"Error: IndexOutOfBounds -- ~1"),(0,"6"), (1,"Error: Div -- divide by zero"),(2,"4"),(3,"10"),(4,"5"), (5,"Error: IndexOutOfBounds -- 5")] : (int * string) list
-
-
Here is a FAQ associated with this problem:
Q1: What is the structure of a lazy sequence?
A1: Just as an
OperationTreeSet
is a tree made out of six kinds of nodes, aLazySequence
is a tree made out of the three kinds of nodes in this datatype:datatype ''a t = Segment of int * int * (int -> ''a) (* lo, hi, fcn *) | ThunkList of (unit -> ''a) list | Concat of ''a t * ''a t
The
Segment
,ThunkList
, andConcat
constructors represent three completely different ways to represent a lazy sequence.-
A
Segment
represents a length-n lazy sequence as a single unary (single-argument) function that is applied to every index in a range with n indices. -
A
ThunkList
represents a length-n lazy sequence as a list of n nullary (zero-argument) functions (known as thunks). -
A
Concat
represents a length-(m+n) lazy sequence as the combination of a length-m lazy sequence and a length-n lazy sequence.
Q2: Can lazy sequences have infinite length?
A2: No. Based on the answer to Q1, all lazy sequences have a finite number of elements.
Q3: Why do I get the error
data constructor Segment used without argument in pattern
?A3: Be sure to wrap pattens in parens. E.g., write
(Segment(lo,hi,func))
, notSegment(lo,hi,func)
Q4: When I send
LazySequences.sml
to the*sml*
buffer, I don’t get an interpreter prompt in the*sml*
buffer. Why?A4: Look for an infinite loop in your code that is causing it to hang.
Q5: I get an
IndexOutOfBounds
exception when testing my code that prevents me from seeing the results of test cases. What can I do?A5: Try commenting out the test cases involving plain
testGet
(as opposed totestGetHandleException
andtestGetRangeHandleException
). If that stops the uncaughtIndexOutOfBounds
exception, it indicates a bug in yourget
function. -
2. Static Argument Checking in Intex (12 points)
As described in the lecture slides on Intex, it is possible to statically determine (i.e., without running the program) whether an Intex program contains a argument reference with an out-of-bound index. In this problem you will flesh out skeleton implementations of the two approaches for static argument checking sketched in the slides. The skeleton implementations are in the file ps9/IntexArgChecker.sml
.
-
Top-down checker (5 points): In the top-down approach, the
checkTopDown
function on a program passes the number of arguments to thecheckExpTopDown
function on expressions, which recursively passes it down the subexpressions of the abstract syntax tree. When anArg
expression is reached, the index is examined to determine a boolean that indicates whether it is in bounds. The boolean on all subexpressions are combined in the upward phase of recursion to determine a boolean for the whole tree. Complete the following skeleton to flesh out this approach.(* val checkTopDown: pgm -> bool *) fun checkTopDown (Intex(numargs,body)) = checkExpTopDown numargs body (* val checkExpTopDown : int -> Intex.exp -> bool *) and checkExpTopDown numargs (Int i) = raise Unimplemented | checkExpTopDown numargs (Arg index) = raise Unimplemented | checkExpTopDown numargs (BinApp(_,exp1,exp2)) = raise Unimplemented
Uncomment the definition of
topDownChecks
to test your implementation. -
Bottom-up checker (5 points): In the bottom-up approach, the
checkExpBottomUp
function returns a pair(min, max)
of the minimum and maximum argument indices that appear in an expression. If an expression contains no argument indices, it should return the(valOf Int.maxInt, valOf Int.minInt)
, which is a pair of SML’s (1) maximum integer and (2) minumum integer. The functionsInt.min
andInt.max
are useful for combining such pairs. (Note thatInt.maxInt
is the identity forInt.min
, andInt.minInt
is the identity forInt.max
.)The
checkBottomUp
function on programs returnstrue
if all argument references are positive and less than or equal to the programs number of arguments, andfalse
otherwise. Complete the following skeleton to flesh out this approach.(* checkBottomUp: pgm -> bool *) fun checkBottomUp (Intex(numargs,body)) = let val (min,max) = checkExpBottomUp body in 0 < min andalso max <= numargs end (* val checkExpBottomUp : Intex.exp -> int * int *) and checkExpBottomUp (Int i) = (valOf Int.maxInt, valOf Int.minInt) | checkExpBottomUp (Arg index) = raise Unimplemented | checkExpBottomUp (BinApp(_,exp1,exp2)) = raise Unimplemented
Notes
-
Uncomment the definition of
bottomUpChecks
to test your implementation. -
Do not use
Int.min
/Int.max
to find the minimum-so-far and maximum-so-far elements in the tuple returned bycheckExpBottomUp
. Instead use pattern matching! Why? Because they return the wrong value for the tuple returned for an integer. (Thiabout this!)
-
-
Static vs dynamic checking (2 points): A static argument index checker flags programs that might dynamically raise an index-out-of-bounds error, but does not guarantee they will dynamically raise such an error. Give an example of an Intex program for which
checkTopDown
andcheckBottomUp
returnsfalse
but does not raise an argument-out-of-bounds error when run on particular values for the correct number of arguments. Hint: such a program can raise a different error; what other kinds of expression errors are there?
3. Bindex To PostFix (20 points)
Background
In lecture, we studied the following intexToPostFix
function that automatically translates Intex programs to equivalent PostFix programs:
fun intexToPostFix (Intex.Intex(numargs, exp)) =
PostFix.PostFix(numargs, expToCmds exp 0)
(* 0 is a depth argument that statically tracks
how many values are on stack above the arguments *)
and expToCmds (Intex.Int i) depth = [PostFix.Int i]
| expToCmds (Intex.Arg index) depth = [PostFix.Int (index + depth), PostFix.Nget]
(* specified argument is on stack at index + depth *)
| expToCmds (Intex.BinApp(binop,exp1,exp2)) depth =
(expToCmds exp1 depth) (* 1st operand is at same depth as whole binapp *)
@ (expToCmds exp2 (depth + 1)) (* for 2nd operand, add 1 to depth
to account for 1st operand *)
@ [PostFix.Arithop (binopToArithop binop)]
and binopToArithop Intex.Add = PostFix.Add
| binopToArithop Intex.Sub = PostFix.Sub
| binopToArithop Intex.Mul = PostFix.Mul
| binopToArithop Intex.Div = PostFix.Div
| binopToArithop Intex.Rem = PostFix.Rem
For example, given the Intex program intexP2
expressed as an Intex.pgm
datatype instance
val intexP2 = Intex(4, BinApp(Mul,
BinApp(Sub, Arg 1, Arg 2),
BinApp(Div, Arg 3, Arg 4)))
the intexToPostFix
translation is as follows:
- intexToPostFix(intexP2);
val it =
PostFix
(4,
[Int 1,Nget,Int 3,Nget,Arithop Sub,Int 4,Nget,Int 6,Nget,Arithop Div,
Arithop Mul]) : PostFix.pgm
With the helper function translateString
shown below, the input and output of translation can be expressed as strings in s-expression format:
fun translateString intexPgmString =
PostFix.pgmToString (intexToPostFix (Intex.stringToPgm intexPgmString))
- IntexToPostFix.translateString("(intex 4 (* (- ($ 1) ($ 2)) (/ ($ 3) ($ 4))))");
val it = "(postfix 4 1 nget 3 nget sub 4 nget 6 nget div mul)" : string
There are two key aspects to the translation from Intex to PostFix:
-
The
expToCmds
function translates every Intex expression to a sequence of PostFix commands that, when executed, will push the integer value of that expression onto the stack. In the case of translating a binary application expression, this sequence is composed by appending the sequences for the two operands followed by the appropriate arithmetic operator command. -
The trickiest part of
expToCommands
is knowing how many integers are on the stack above the program arguments, so that references to Intex arguments by index can be appropriately translated. This is handled by adepth
argument toexpToCommands
that keeps track of this number. Note that the first operand of a binary application has the same depth as the whole binary application expression, but the second operand has a depth one greater than the first to account for the integer value of the first operand pushed onto the stack.
Your Task
In this problem, you will think about and implement a similar translator from Bindex programs to PostFix programs.
-
Hand translating a sample Bindex expression (6 points) To get a better sense for the issues involved, give the PostFix program (in s-expression notation) that results from translating by hand the following Bindex program:
(bindex (a b) (- (bind c (+ a b) (* b c)) a))
Run your resulting program in PostFix to verify that it works as expected. In particular, running the PostFix program that results from hand translating the above Bindex program on the arguments
1 6
should return41
. If it doesn’t, there’s a bug in your translation strategy. -
Fleshing out the translator (14 points): The file
ps9/BindexToPostFix.sml
contains a skeleton of the Bindex to PostFix translator in which all Bindex expressions translate to a command sequence consisting of the single integer command42
:fun makeArgEnv args = Env.make args (Utils.range 1 ((length args) + 1)) (* returned env associates each arg name with its 1-based index *) fun envPushAll env = Env.map (fn index => index + 1) env (* add one to the index for each name in env *) fun envPush name env = Env.bind name 1 (envPushAll env) fun bindexToPostFix (Bindex.Bindex(args, body)) = PostFix.PostFix(length args, expToCmds body (makeArgEnv args)) (* In expToCmds, env statically tracks the depth of each named variable value on the stack *) and expToCmds (Bindex.Int i) env = [PostFix.Int 42] (* replace this stub *) | expToCmds (Bindex.Var name) env = [PostFix.Int 42] (* replace this stub *) | expToCmds (Bindex.BinApp(binop, rand1, rand2)) env = [PostFix.Int 42] (* replace this stub *) | expToCmds (Bindex.Bind(name, defn, body)) env = [PostFix.Int 42] (* replace this stub *) and binopToArithop Bindex.Add = PostFix.Add | binopToArithop Bindex.Sub = PostFix.Sub | binopToArithop Bindex.Mul = PostFix.Mul | binopToArithop Bindex.Div = PostFix.Div | binopToArithop Bindex.Rem = PostFix.Rem
When loaded, the
BindexToPostFix.sml
file performs numerous tests of the translator. Initially, all of these fail. Your task is to redefine the clauses ofexpToCmds
so that all the test cases succeed.
Notes:
-
The
expToCmds
function in the Bindex to PostFix translator uses an environment to track the depth of each variable name in the program. Recall that an environment (defined by theEnv
structure inutils/Env.sml
) maps names to values. In this case, it maps names in a Bindex program to the current stack depth of the value associated with that name. Carefully study the helper functionsmakeArgEnv
,envPushAll
, andenvPush
to understand what they do; all are helpful in the contex of this problem. -
As in the Intex to PostFix translator, in the Bindex to PostFix translator, each Bindex expression should translate to a sequence of PostFix commands that, when executed, pushes exactly one value — the integer value of that expression — onto the stack. More formally, translating a Bindex expression to PostFix should obey the following invariant:
A Bindex expression E should translate to a sequence of commands, that, when executed on a stack S, should result in a stack v :: S, where v is the result of evaluating E relative to an environment represented by S.
-
For simplicity, you may assume there are no unbound variables in the Bindex program you are translating. With this assumption, you may use the
Option.valOf
function to extract the valuev
from aSOME(v)
value. -
The
BindexToPostFix
structure contains atestTranslator
function that takes the name of a file containing a Bindex program and a list of argument lists and (1) displays the Bindex program as an s-expression string, (2) displays the PostFix program resulting from the translation as an s-expression string and (3) tests the behavior of both programs on each of the argument lists in the list of argument lists. If the behaviors match, it prints an approprate message; if they don’t match, it reports an an error with the two different result values. For instance, in the initial skelton, the behavioral tests fail on the program in../bindex/avg.bdx
:Testing Bindex program file ../bindex/avg.bdx Bindex program input to translation: (bindex (a b) (/ (+ a b) 2)) PostFix program output of translation: (postfix 2 42) Testing args [3,7]: *** ERROR IN TRANSLATION *** Bindex result: 5 PostFix result: 42 Testing args [15,5]: *** ERROR IN TRANSLATION *** Bindex result: 10 PostFix result: 42
However, when the translator is working, the following will be displayed:
Testing Bindex program file ../bindex/avg.bdx Bindex program input to translation: (bindex (a b) (/ (+ a b) 2)) PostFix program output of translation: (postfix 2 1 nget 3 nget add 2 div) Testing args [3,7]: both programs return 5 Testing args [15,5]: both programs return 10
-
You have succeeded when all of the test cases succeed when you load the
.sml
file.
4. Simprex (35 points)
Background
Rae Q. Cerf of Toy-Languages-я-Us likes the sigma
construct from the Bindex lecture, but she wants something more general. In addition to expressing sums, she would also like to express numeric functions like factorial and exponentiation that are easily definable via simple recursion. The functions that Rae wants to define all have the following form:
f(n) = z, if n ≤ 0
f(n) = c(n, f(n-1)), if n > 0
Here, z
is an integer that defines the value of f
for any non-positive integer value and and c
is a binary combining function that combines n
and the value of f(n−1)
for any positive n
. Expanding the definition yields:
f(n) = c(n, c(n-1, c(n−2, ... c(2, c(1, z)))))
For example, to define the factorial function, Rae uses:
z_fact = 1
c_fact(i, a) = i*a
To define the exponentation function bn, Rae uses:
z_expt = 1
c_expt(i, a) = b*a
In this case, c_expt
ignores its first argument, but the fact that c_expt
is called n
times is important.
As another example, Rae defines the sum of the squares of the integers between 1 and n using
z_sos = 0
c_sos(i, a) = (i*i) + a
The examples considered so far don’t distinguish right-to-left vs. left-to-right evaluation, A simple example of that is
z_sub = 0
c_sub(i, a) = i-a
For example, when n
is 4, the result is the value of 4 - (3 - (2 - (1 - 0))), which is 2, and not the value of 1 - (2 - (3 - (4 - 0))), which is -2.
Another example that distinguishes right-to-left vs. left-to-right evaluation is using Horner’s method to calculate the value of the polynomial x^(n-1) + 2*(x^(n-2)) + 3*(x^(n-3)) + ... + (n-1)*x + n
:
z_horner = 0
c_horner(i, a) = i + x*a
For example, when n
is 4 and x
is 5, the result is (4 + 5*(3 + 5*(2 + 5*(1 + 5*0))))
= 194, which is the value of 5^3 + 2*5^2 + 3*5^1 + 4
= (125 + 50 + 15 + 4) = 194.
Rae designs an extension to Bindex named Simprex that adds a new simprec
construct for expressing her simple recursions:
(simprec E_zero (Id_num Id_ans E_combine ) E_arg)
This simprec
expression evaluates E_arg
to the integer value n_arg
and E_zero
to the integer value n_zero
. If n_arg
≤ 0, it returns n_zero
. Otherwise, it returns the value
c(n_arg, c(n_arg−1, c(n_arg−2, ... c(2, c(1, nzero)))))
where c
is the binary combining function specified by (Id_num Id_ans E_combine)
. This denotes a two-argument function whose two formal parameters are named Id_num
and Id_ans
and whose body is E_combine
. The Id_num
parameter ranges over the numbers from n_arg
down to 1, while the Id_ans
parameter ranges over the answers built up by c
starting at n_zero
. The scope of Id_num
and Id_ans
includes only E_combine
; it does not include E_zero
or E_arg
.
Rae’s simprec
expression is closely related to the notion of primitive recursive functions defined in the theory of computation.
Here are some sample Simprex programs:
;; Program that calculuates the factorial of n
(simprex (n) (simprec 1 (i a (* i a)) n))
;; Exponentiation program raising base b to power p
(simprex (b p) (simprec 1 (i ans (* b ans)) p))
;; Program summing the squares of the numbers from 1 to hi
(simprex (hi) (simprec 0 (i sumSoFar (+ (* i i) sumSoFar)) hi))
;; Program distingishing right-to-left and left-to-right evaluation via subtraction
(simprex (n) (simprec 0 (i ans (- i ans)) n))
;; Calculating x^(n-1) + 2*(x^(n-2)) + 3*(x^(n-3)) + ... + (n-1)*x + n via Horner's method
(simprex (n x) (simprec 0 (i a (+ i (* x a)) n)))
Your Tasks
After completing her design, Rae is called away to work on another language design problem. Toy-Languages-я-Us is impressed with your CS251 background, and has hired you to implement the Simprex language, starting with a version of the Sigmex implementation. (Sigmex is the name of the language that results from extending Bindex with the sigma
construct from lecture.) Your first week on the job, you are asked to complete the following tasks that Rae has specified in a memo she has written about finishing her project.
-
(15 points) Rae’s memo contains the following Simprex test programs. Give the results of running each of the programs on the argument 3. Show your work so that you may get partial credit if your answer is incorrect.
;; program 1 (2 points) (simprex (a) (simprec 0 (b c (+ 2 c)) a)) ;; program 2 (3 points) (simprex (x) (simprec 0 (n sum (+ n (* x sum))) 4)) ;; program 3 (4 points) (simprex (y) (simprec 0 (a b (+ b (sigma c 1 a (* a c)))) y)) ;; program 4 (6 points) (simprex (n) (simprec (simprec (* n (- n 3)) (q r r) (* n n)) (c d (+ d (simprec 0 (x sum (+ sum (- (* 2 x) 1))) c))) (simprec -5 (a b (+ 1 b)) (* n n))))
-
(20 points)
Rae has created a skeleton implementation of Simprex by modifying the files for the Sigmex (= Bindex +
sigma
) implementation to contain stubs for thesimprec
construct. Her modified files, which are namedSimprex.sml
andSimprexEnvInterp.sml
, can be found in theps9
folder.Finish Rae’s implementation of the Simprex language by completing the following four tasks, which Rae has listed in her memo:
-
(2 points) Extend the definition of
sexpToExp
inSimprex.sml
to correctly parsesimprec
expressions. -
(2 points) Extend the definition of
expToSexp
inSimprex.sml
to correctly unparsesimprec
expressions. -
(5 points) Extend the definition of
freeVarsExp
inSimprex.sml
to correctly determine the free variables of asimprec
expression. -
(11 points) Extend the definition of
eval
inSimprexEnvInterp.sml
to correctly evaluatesimprec
expressions using the environment model.
Notes:
-
This problem is similar the the Sigmex (= Bindex +
sigma
) language extension we implemented in lecture. You may wish to study the solution filesbindex/SigmexSolns.sml
andbindex/SigmexEnvInterp.sml
as part of doing this problem. -
In
Simprec.sml
, theexp
type is defined to be:and exp = Int of int (* integer literal with value *) | Var of var (* variable reference *) | BinApp of binop * exp * exp (* binary operator application with rator, rands *) | Bind of var * exp * exp (* bind name to value of defn in body *) | Sigma of var * exp * exp * exp (* E_lo, E_hi, E_body *) | Simprec of exp * var * var * exp * exp (* zeroExp * numVar * ansVar * combExp * argExp *)
The s-expression notation
(simprec E_zero (I_num I_ans E_combine) E_arg)
is represented in SML asSimprec (<exp for E-zero>, <string for I_num>, <string for I_ans>, <exp for E_combine>, <exp for E_arg>)
For example, the expression
(simprec 1 (x a (* x a)) n)
is represented in SML asSimprec(Int 1, "x", "a", BinApp(Mul, Var "x", Var "a"), Var "n")
-
You can test your modifications of
sexpToExp
,expToExp
, andfreeVarsExp
viause "Simprex.sml"
after you have first uncommented the test cases at the end ofSimprex.sml
. This will display the results of various test cases forfreeVarsExp
before displaying theSimprex
structure. It will also testsexpToExp
andexpToSexp
. If you uncomment the test cases before you have correctly modifiedsexpToExp
andexpToSexp
, any attempt touse "Simprec.sml"
oruse "SimprecEnvInterp.sml"
will fail with aSyntaxError
. -
You can test your modifications of
eval
viause "SimprexEnvInterp.sml"
after you have first uncommented the test cases at the end ofSimprexEnvInterp.sml
. This will display the results of various test cases foreval
before displaying theSimprexEnvInterp
structure.
-
5. Extending Valex with New Primitive Operators (15 points)
The Valex language implementation is designed to make it easy to add new primitive operators to the language. In this problem, you are asked to extend Valex with some new primitive operators.
The files ValexPS9.sml
and ValexEnvInterpPS9.sml
in the ps9
folder contain a copy of the Valex language implementation we study in class.
In this problem, you will add each of the following four primitive operators to ValexPS9.sml
:
-
(2 points)
(abs n)
: Returns the absolute value of the integer n. E.g.valex> (abs -17) 17 valex> (abs 42) 42
-
(3 points)
(sqrt n)
: If n is a non-negative integer, returns the integer square root n. The integer square root of a non-negative integer is the largest integer i such that i2 ≤ n. Signals an error if n is negative. E.g.valex> (sqrt 25) 5 valex> (sqrt 35) 5 valex> (sqrt 36) 6 valex> (sqrt 37) 6
Hint: The
Real
andMath
structures contain helpful functions for this problem. -
(4 points)
(range lo hi)
: Assumelo
andhi
are integers. Iflo
<hi
, returns a list of integers fromlo
(inclusive) tohi
(exclusive). Otherwise, returns the empty list. For example:valex> (range 1 20) (list 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19) valex> (range 3 8) (list 3 4 5 6 7) valex> (range 8 3) #e
Hint: The
Utils.range
function is helpful here. -
(6 points)
(rot n xs)
Assume n is an integer and xs is a list. If xs is empty, returns the empty list. If n is a nonnegative integer, returns a list that results from moving the first n elements of the list to the end of the list. If n is greater than the length L of the list, uses (n mod L) instead. If n is a negative integer, raises an error. For example:valex> (rot 2 (explode "abcdefg")) (list 'c' 'd' 'e' 'f' 'g' 'a' 'b') valex> (rot 6 (explode "abcdefg")) (list 'g' 'a' 'b' 'c' 'd' 'e' 'f') valex> (rot 0 (explode "abcdefg")) (list 'a' 'b' 'c' 'd' 'e' 'f' 'g') valex> (rot 17 (explode "abcdefg")) (list 'd' 'e' 'f' 'g' 'a' 'b' 'c') (* same as rot 3 *) valex> (rot 17 (list)) #e valex> (rot -3 (explode "abcdefg")) Eval Error: negative rotation ~3 in rot
NOTES:
-
All four primitives above can be added to Valex by adding new
Primop
entries to theprimops
list inps9/ValexPS9.sml
right below the spot with this SML comment:(* Put your new primops from PS9 here *)
Study the other primitives to see how primitives are declared.
-
Be careful to change the Valex implementation in the
ps9
directory, not the one in thevalex
directory. -
You should only change
ValexPS9.sml
and notValexEnvInterpPS9.sml
-
To test your implementation after editing
ValexPS9.sml
, first loadValexEnvInterpPS9.sml
(notValexPS9.sml
) into a*sml*
buffer, and then test your new primitives interactively in a Valex read-eval-print loop (REPL) launched by invokingValexEnvInterp.repl()
.
6. Desugaring classify
(30 points)
The classify
construct
You are a summer programming intern at Sweetshop Coding, Inc. Your supervisor, Dexter Rose, has been studying the syntactic sugar for Valex and is very impressed by the cond
construct. He decides that it would be neat to extend Valex with a related classify
construct that classifies an integer relative to a collection of ranges. For instance, using his construct, Dexter can write the following grade classification program:
; Classify Program 1
(valex (grade)
(classify grade
((90 100) 'A')
((80 89) 'B')
((70 79) 'C')
((60 69) 'D')
(otherwise 'F')))
This program takes an integer grade value and returns a character indicating which range the grade falls in.
In general, the classify
construct has the following form:
(classify E_disc
((Elo_1 Ehi_1) Ebody_1)
...
((Elo_n Ehi_n) Ebody_n)
(otherwise E_default))
The evaluation of classify should proceed as follows.
First the discriminant expression E_disc
should be evaluated to the value V_disc
, which is expected to be an integer. Then V_disc
should be matched against each of the clauses ((Elo_i Ehi_i) Ebody_i)
from top to bottom until one matches. The value matches a clause if it lies in the range between Vlo_i
and Vhi_i
, inclusive, where Vlo_i
is the integer value of Elo_i
, and Vhi_i
is the integer value of Ehi_i
. When the first matching clause is found, the value of the associated expression Ebody_i
is returned. If none of the clauses matches V_disc
, the value of the default expression E_default
is returned.
Here are a few more examples of the classify
construct:
; Classify program 2
(valex (a b c d)
(classify (* c d)
((a (- (/ (+ a b) 2) 1)) (* a c))
(((+ (/ (+ a b) 2) 1) b) (* b d))
(otherwise (- d c))))
Program 2 emphasizes that any of the subexpressions of classify may be arbitrary expressions that require evaluation. In particular, the upper and lower bound expressions need not be integer literals. For instance, here are some examples of the resulting value returned by Program 2 for some sample inputs:
a | b | c | d | result |
---|---|---|---|---|
10 | 20 | 3 | 4 | 30 |
10 | 20 | 3 | 6 | 120 |
10 | 20 | 3 | 5 | 2 |
; Classify program 3
(valex (a)
(classify a
((0 9) a)
(((/ 20 a) 20) (+ a 1))
(otherwise (/ 100 (- a 5)))))
Program 3 emphasizes that (1) ranges may overlap (in which case the first matching range is chosen) and (2) expressions in clauses after the matching one are not evaluated. For instance, here are here are some examples of the resulting value returned by Program 3 for some sample inputs:
a | result |
---|---|
0 | 0 |
5 | 5 |
10 | 11 |
20 | 21 |
25 | 5 |
30 | 4 |
Your Tasks
Dexter has asked you to implement the classify construct in Valex as syntactic sugar. You will do this in three phases.
-
(9 points) In your PS9 Google Doc, write down how you would like each of the three example programs above to be desugared into equivalent Valex programs that do not have
classify
. Your programs may include other sugar constructs, such as&&
. -
(9 points) In your PS9 Google Doc, write down incremental desugaring rules that desugar
classify
into other Valex constructs. These rules should be expressed as rewrite rules on s-expressions. For example, here are the desugaring rules for Valex’s sugared constructs exceptquote
:(&& E_rand1 E_rand2) ↝ (if E_rand1 E_rand2 #f) (|| E_rand1 E_rand2) ↝ (if E_rand1 #t E_rand2) (cond (else E_default)) ↝ E_default (cond (E_test E_then) ...) ↝ (if E_test E_then (cond ...)) (list) ↝ #e (list E_head ...) ↝ (prep E_head (list ...)) (bindseq () E_body) ↝ E_body (bindseq ((Id E_defn) ...) E_body) ↝ (bind Id E_defn (bindseq (...) E_body)) (bindpar ((Id_1 E_defn_1) ... (Id_n E_defn_n)) E_body) ↝ (bind Id_list (* fresh variable name *) (list E_defn_1 ... E_defn_n) (* eval defns in parallel *) (bindseq ((Id_1 (nth 1 Id_list)) ... (Id_n (nth n Id_list))) E_body))
Note that
...
means zero or more occurrences of the preceding kind of syntactic entity.NOTES:
-
Your desugaring rules may rewrite an s-expression into an s-expression that includes other sugar constructs, such as
&&
. -
Your desugaring rules should be written in such a way that
E_disc
is evaulated only once. To guarantee this, you will need to name the value ofE_disc
with a “fresh” variable (one that does not appear elsewhere in the program). -
Your desugaring rules should be written in such a way that the
Elo
andEhi
expressions in each clause are evaluated at most once. -
You should treat differently the cases where
E_disc
is an identifier and when it is not an identifier. -
Your rules should be designed to give the same output for the three sample programs as in part a.
-
-
(12 points) In this part, you will implement and test your desugaring rules from the previous part. You will modify the very same
ps9/ValexPS9.sml
from Problem 5 in this part. You should make your changes to thedesugarRules
function inps9/ValexPS9.sml
.NOTES:
-
Be careful to change the Valex implementation in the
ps9
directory, not the one in thevalex
directory. -
Use
Utils.fresh
to create a fresh variable. -
To test your implementation after editing
ValexPS9.sml
, first loadValexEnvInterpPS9.sml
(notValexPS9.sml
) into a*sml*
buffer, and then pay attention to the following:-
In the
ps9
directory, the filesclassify1.vlx
,classify2.vlx
, andclassify3.vlx
contain the three exampleclassify
programs from above. The filesort3.vlx
contains a test file that not usingclassify
that returns a sorted list of its 3 arguments. -
Your implementation should give the same output for the three sample programs as in part a. For seeing the output of the desugaring of your
classify
construct for examples, use one of the following approaches:-
To desugar a file, use
Valex.desugarFile
to print the result of desugaring the program in the file. For example, supposedsort3.vlx
contains this program:(valex (a b c) (cond ((&& (<= a b) (<= b c)) (list a b c)) ((&& (<= a c) (<= c b)) (list a c b)) ((&& (<= b a) (<= a c)) (list b a c)) ((&& (<= b c) (<= c a)) (list c b a)) ((&& (<= c a) (<= a b)) (list c a b)) (else (list c b a))))
Then here is the result of using
Valex.desugarFile
on this program:- Valex.desugarFile "sort3.vlx"; (valex (a b c) (if (if (<= a b) (<= b c) #f) (prep a (prep b (prep c #e))) (if (if (<= a c) (<= c b) #f) (prep a (prep c (prep b #e))) (if (if (<= b a) (<= a c) #f) (prep b (prep a (prep c #e))) (if (if (<= b c) (<= c a) #f) (prep c (prep b (prep a #e))) (if (if (<= c a) (<= a b) #f) (prep c (prep a (prep b #e))) (prep c (prep b (prep a #e))) ) ) ) ) ) ) val it = () : unit
-
To desugar an expression, one approach is to invoke the
Valex.desugarString
function on a string representing the expression you want to desugar. For example:- Valex.desugarString "(&& (|| a b) (|| c d))" (if (if a #t b) (if c #t d) #f) - : unit = () - Valex.desugarString "(list 1 2 3)" (prep 1 (prep 2 (prep 3 #e))) - : unit = ()
-
To desugar an expression, another approach is to use
ValexEnvInterp.repl()
to launch a read-eval-print loop and use the#desugar
directive to desugar an expression. For example:- ValexEnvInterp.repl(); valex> (#desugar (&& (|| a b) (|| c d))) (if (if a #t b) (if c #t d) #f) valex> (#desugar (list 1 2 3)) (prep 1 (prep 2 (prep 3 #e)))
-
-
There are several ways to test the evaluation of your desugared classify construct:
-
To run a program, one approach is to invoke
ValexEnvInterp.runFile
on the filename and a list of integer arguments. For example:- ValexEnvInterp.runFile "sort3.vlx" [23, 42, 17]; val it = List [Int 17,Int 23,Int 42] : Valex.value
-
To run a program, another approach is to use
ValexEnvInterp.repl()
to launch a read-eval-print loop and use the#run
directive to run a fileon arguments. For example:- ValexEnvInterp.repl(); valex> (#run sort3.vlx 23 42 17) (list 17 23 42)
-
To evaluate an expression, use
ValexEnvInterp.repl()
to launch a read-eval-print loop and enter the expression. To evaluate an expression with free variables, use the#args
directive to bind the variables. For example:- ValexEnvInterp.repl(); valex> (#args (a 2) (b 3)) valex> (bindpar ((a (+ a b)) (b (* a b))) (list a b)) (list 5 6) valex> (bindseq ((a (+ a b)) (b (* a b))) (list a b)) (list 5 15)
-
-
-