• Due: Monday, May 16
  • Notes:
    • This problem set contains some completely optional extra credit problems. You may do any subset of these problems (including the empty subset!)
    • More problems are under development and will be added when finished.
    • In addition to doing any of these problems, you can also submit extra credit problems from previous assignments.
    • Any points from extra credit problems will simply be added to the point total of your problem set component of your course grade.
    • Of course, partial credit will be awarded where appropriate for incomplete problems.
    • Instructions for many of these problems are less explicit than previous problem set problems. Please email questions if there are apects that are unclear.
  • Submission:
    • In the yourFullName CS251 Spring 2016 Folder, create a Google Doc named yourFullName CS251 PS9.
    • For each problem and subproblem, please indicate at the beginning of each writeup approximately how long that problem took you to solve and write up.

1. Safe Transformations (16 points)

A transformation that rewrites one expression to another is said to be safe if performing the transformation anywhere in a program will not change the behavior of the program . For each of the following transformations, indicate whether it is safe in (i) Hofl and (ii) Hoilec (a version of Hofl with explicit cells). Assume both languages are statically scoped and call-by-value. For each transformation you specify as unsafe, give an example whose behavior is changed by the transformation. Changes in behavior include:

  • the program returns different values before and after the transformation;
  • the program loops infinitely or signals an error before the transformation, but returns a value after the transformation;
  • the program returns a value before the transformation, but loops infinitely or signals an error after the transformation.

For the purposes of this problem, all errors are considered equivalent and programs that loop infinitely are considered equivalent to those that signal an error. In particular, the behavior of an expression should not be considered different if it signals an error both before and after the transformation, but the error messages are different.

In each expression, Id stands for a variable reference and E stands for an expression. You may assume that all subexpressions of an application are evaluated in left-to-right order.

  1. (+ Id Id)(* 2 Id)

  2. (+ E E)(* 2 E)

  3. (+ E1 E2)(+ E2 E1)

  4. (+ E1 E2)(bind x E1 (+ x E2))

  5. (+ E1 E2)(bindpar ((x (fun () E1)) (y (fun () E2 ))) (+ (x) (y)))

  6. (if #t E1 E2)E1

  7. (if E1 E2 E2)E2

  8. (if (if E1 E2 E3) E4 E5)(if E1 (if E2 E4 E5) (if E3 E4 E5))

2 Partial Evaluation (30 points)

Avoiding Magic Constants

It is good programming style to avoid “magic constants” in code by explicitly calculating certain constants from others. For instance, consider the following two Bindex programs for converting years to seconds:

; Program 1
(bindex (years)
  (* 31536000 years))

; Program 2
(bindex (years)
  (bind seconds-per-minute 60 (bind minutes-per-hour 60
    (bind hours-per-day 24
      (bind days-per-year 365 ; ignore leap years
        (bind seconds-per-year (* seconds-per-minute (* minutes-per-hour (* hours-per-day
          days-per-year))) (* seconds-per-year years)))))))

The first program uses the magic constant 31536000, which is the number of seconds in a year. (It is worth noting that this number is approximately π × 107. So a century is approximately π × 109 seconds, which means that π seconds is approximately one nano-century!)

The second program shows how this constant is calculated from simpler constants. By showing the process by which seconds-per-year is calculated, the second program is a more robust and well-documented software artifact. Calculated constants also have the advantage that they are easier to modify. Although the numbers in the above program aren’t going to change, there are many so-called “constants” built into a program that change over its lifetime. For instance, the size of word of computer memory, the price of a first-class stamp, and the rate for a certain tax bracket are all numbers that could be hard-wired into programs but which might need to be updated in future version of the software.

However, magic constants can have performance advantages. In the above programs, the pro- gram with the magic constant performs one multiplication, while the other program performs four multiplications. If performance is critical, the programmer might avoid the clearer style and instead opt for magic constants.

Partial Evaluation

Is there a way to get the best of both approaches? Yes! We can write our program in the clearer style, and then automatically transform it to the more efficient style via a process known as partial evaluation. Partial evaluation transforms an input program into a residual program that has the same meaning by performing computation steps that would otherwise be performed when running the program. Any computation steps that can be performed during partial evaluation are steps that do not need to be performed when the residual program is run later. In most cases, the residual program has better run-time performance than the original program.

For instance, we can use partial evaluation to systematically derive the first program above from the second. We begin via a step known as constant propagation, in which we substitute the four constants at the top of the second program into their references to yield:

(bindex (years)
  (bind seconds-per-minute 60
    (bind minutes-per-hour 60 (bind hours-per-day 24
      (bind days-per-year 365 ; ignore leap years
        (bind seconds-per-year (* 60 (* 60 (* 24 365)))
          (* seconds-per-year years)))))))

Next, we eliminate the now-unnecessary first four bindings via a step known as dead code removal:

(bindex (years)
  (bind seconds-per-year (* 60 (* 60 (* 24 365)))
    (* seconds-per-year years)))

We can now perform the three multiplications involving manifest integers in a step known as constant folding:

(bindex (years)
  (bind seconds-per-year 31536000
    (* seconds-per-year years)))

Finally, another round of constant propagation and dead code removal yields the first program:

~~
(bindex (years)
(* 31536000 years))
~~~

It is not possible to eliminate bindings whose definition ultimately depends on the program parameters. Nevertheless, it is often possible to partially simplify such definitions. For example, consider:

(bindex (a)
  (bind b (* 3 4)
    (bind c (+ a (- 15 b)) (bind d (/ c b)
      (* d c)))))

The transformation techniques described above can simplify this program to:

(bindex (a)
  (bind c (+ a 3)
    (bind d (/ c 12) (* d c))))

In this example, (+ a (- 15 b)) cannot be replaced by a number (because the value of a is unknown), but it can be simplified to the residual expression (+ a 3). Similarly, (/ c b) is transformed to the residual expression (/ c 12) and (bind b ...) is transformed to the residual expression (bind c (+ a 3) (bind d (/ c 12) (* d c))).

Setup

Begin this problem by performing the following steps:

  1. Copy the entire contents of the cs251-download directory on cs.wellesleley.edu to your local cs251-download directory using the following command. This may overwrite some of your existing files in this directory, but that’s OK. (If you think it’s not OK, then save versions of files you care about elsewhere first!) Of course, as usual, here and elsewhere you should replace username gdome by your own account name:

    scp -r gdome@cs.wellesley.edu:/home/cs251/download/* /home/wx/cs251-download
  2. If you don’t already have one, create a *sml* SML interpreter buffer within Emacs.

  3. In the *sml* buffer, execute the following SML command to change the default directory:

    - Posix.FileSys.chdir "/home/wx/cs251-download/ps9";
  4. Make a copy of the file ps9/BindexPartialEvalSkeleton.sml named ps9/BindexPartialEval.sml.

Your Task

In this problem, your task is to write a function partialEval that performs partial evaluation on a Bindex program. Given a Bindex program, partialEval should return another Bindex program that has the same meaning as the original program, but which also satisfies the following properties:

  1. The program should not contain any bind expressions in which a variable is bound to an integer literal.

  2. The program should not contain any binary applications in which an arithmetic operator is applied to two integer literals. There are two exceptions to this property: the program may contain binary applications of the form (/ n 0) or (% n 0), since performing these applications would cause an error in the partial evaluation process.

It is possible to write separate functions that perform the constant propagation, constant folding, and dead-code elimination steps, but it is tricky to get them to work together to perform all simplifications. It turns out that it is much more straightforward to perform all three kinds of simplification at the same time in a single walk over the expression tree.

By analogy with BindexEnvInterp.run and BindexEnvInterp.eval, partial evaluation can be peformed by a pair of functions:

  • val partialEval: Bindex.pgm -> Bindex.pgm

    Returns a partially evaluated version of the given Bindex program.

  • val peval: Bindex.exp -> int Env.env -> Bindex.exp:

    Given a Bindex expression exp and a partial evaluation environment env, returns the partially evaluated version of exp. The partial evaluation environment contains name/value bindings for names whose integer values are known.

Your goal is to implement simplification by fleshing out the above two function definitions in BindexPartialEval.sml.

Note that there is a correspondence between run/eval in BindexEnvInterp and partialEval/peval. peval is effectively a version of eval that evaluates as much of an expression as it can based on the “partial” environment information it is given. Because bindings for some names may be missing in the environment, peval cannot always evaluate every expression to the integer it denotes and in some cases must instead return a residual expression that will determine the value when the program is executed. Because of this, peval must always return an expression rather than an integer; even in the case where it can determine the value of an expression, that value must be expressed as an integer literal node (tagged with the Int constructor), not an integer.

Notes:

  • Perform use "load-peval.ml" to load the partial evaluator.

  • Use BindexEnvInterp.binopToFun as part of applying an operator to two integers.

  • Divisions and remainders whose second operands are zero must be left in the program. Such programs will encounter divide-by-zero errors when they are later executed. For example,

    (bindex (a)
      (bind b (* 3 4)
        (bind c (/ b (- 12 b)) (* c b))))

    should be transformed to:

    (bindex (a)
      (bind c (/ 12 0)
        (* c 12)))
  • In some cases it would be possible to perform more aggressive simplification if you took advantage of algebraic properties like the associativity and commutativity of addition and multiplication. To simplify this problem, you should not use any algebraic properties of the arithmetic operators. For example, you should not transform (+ 1 (+ a 2)) into (+ 3 a), but should leave it as is. You should not even perform “obvious” simplifications like (+ 0 a)a, (* 1 a)a, and (* 0 a)0. Although the first two of these simplification are valid, the last is unsafe in the sense that it can change the meaning of a program. For instance, (* 0 (/ a b)) cannot be simplified to 0, because it does not preserve the meaning of the program in the case where b is 0 (in which case evaluating the expression should give an error).

  • You can use the testPartialEval function (which takes a string representation of a Bindex program) to test your partial evaluator. For example:

    - testPartialEval "(bindex () (+ 1 2))";
    (bindex () 3)
    val it = () : unit
    
    - testPartialEval "(bindex (a)\
    =                 \  (bind b (* 3 4)\
    =                 \    (bind c (/ b (- 12 b)) (* c b))))";
    (bindex (a) (bind c (/ 12 0) (* c 12)))
    val it = () : unit
    
    - testPartialEval "(bindex (a)\
    =                 \  (+ (* (+ 1 2) a)\
    =                 \     (+ (* 3 4)\
    =                 \        (+ (* 0 a)\
    =                 \           (+ (* 1 a)\
    =                 \              (+ 0 a))))))";
    (bindex (a) (+ (* 3 a) (+ 12 (+ (* 0 a) (+ (* 1 a) (+ 0 a))))))
    val it = () : unit

    The last two examples show the SML conventions for entering multiline strings. Each nonterminal line must end with a backslash character, and the next line must begin with a backslash character.

  • You can also test your partial evaluator by evaluating test(). This applies your partial evaluator to all the test entries in the list testEntries in the file BindexPartialEvalTest.sml. The entries in this list are by no means exhaustive. You are strongly encouraged to add more entries to this list.

    - test();
    Add-2-3 -- OK!
    Simple1 -- OK!
    Years -- OK!
    Residuals1 -- OK!
    Residuals2 -- OK!
    Residuals3 -- OK!

3. Compex (35 points)

Background

In this problem, you will define a new language Compex that extends Bindex with a comp construct:

(comp E_num E_pos E_zero E_neg)

This comp expression evaluates E_num to the integer value n_num and then compares this with zero:

  • if n_num is greater than zero, the value of the expression E_pos is returned.
  • if n_num is equal to zero, the value of the expression E_zero is returned.
  • if n_num is less than zero, the value of the expression E_neg is returned.

At most one of E_pos, E_zero, or E_neg will be evaluated.

Here are some examples of programs using the comp construct:

; ps9/pgm1.cpx
;
; An absolute value program 
(compex (n) (comp n n 0 (- 0 n)))

; ps9/pgm2.cpx
;
; Given program arguments a and b:
; * return the square of their differences if a > b
; * return twice a if a = b
; * return the product of a and the difference if a < b
; Store this in ps9/abs.cpx
(compex (a b) 
  (bind diff (- a b) 
    (comp diff (* diff diff) (* 2 a) (* a diff))))

; ps9/pgm3.cpx
;
; Return the max of a, b, and c
(compex (a b c)
  (comp (- a b)
        (comp (- a c) a a c)
        (comp (- a c) a a c)
        (comp (- b c) b b c)))

; ps9/pgm4.cpx
;
; Test the interaction of comp with bind and itself
(compex (p q r)
  (+ (bind s (- p q)
       (bind t (- q r)
         (* (comp s (+ p q) (+ p r) (+ q r))
            (comp t (- p q) (- p r) (- q r)))))
     (comp (- r q)
       (bind u (+ p q) (comp u u (* 2 u) (* 3 u)))
       (bind v (+ p r) (comp v (* 4 v) (* 5 v) (* 6 v)))
       (bind w (+ q r) (comp w (* 7 w) (* 8 w) (* 9 w))))))

; ps9/pgm5.cpx
;
; If x > y > z, returns the sum of x, y, and z. 
; If x = y = z, returns x
; If x < y < z, returns the product of x, y, and z.
; Otherwise gives a divide-by-zero error. 
; (tests that only one branch is evaluated)
(compex (x y z)
  (comp (- x y)
        (comp (- y z) (+ x (+ y z)) (/ y 0) (/ z 0))
        (comp (- y z) (/ y 0) x (/ z 0)
        (comp (- y z) (/ y 0) (/ z 0) (* x (* y z))))))

Setup

Begin this problem by performing the following steps:

  1. Copy the entire contents of the cs251-download directory on cs.wellesleley.edu to your local cs251-download directory using the following command. This may overwrite some of your existing files in this directory, but that’s OK. (If you think it’s not OK, then save versions of files you care about elsewhere first!) Of course, as usual, here and elsewhere you should replace username gdome by your own account name:

    scp -r gdome@cs.wellesley.edu:/home/cs251/download/* /home/wx/cs251-download
  2. If you don’t already have one, create a *sml* SML interpreter buffer within Emacs.

  3. In the *sml* buffer, execute the following SML command to change the default directory:

    - Posix.FileSys.chdir "/home/wx/cs251-download/ps9";
  4. Make copies of cs251-download/bindex/Bindex.sml, cs251-download/bindex/BindexEnvInterp.sml, and cs251-download/ps9/CompexToPostFixSkeleton.sml named (respectively) cs251-download/ps9/Compex.sml, cs251-download/ps9/CompexEnvInterp.sml, and cs251-download/ps9/CompexToPostFix.sml.

    [wx@wx cs251-download]$ cd /home/wx/cs251-development/ps9
    [wx@wx ps9]$ cp ../bindex/Bindex.sml Compex.sml
    [wx@wx ps9]$ cp ../bindex/BindexEnvInterp.sml CompexEnvInterp.sml
    [wx@wx ps9]$ cp CompexToPostFixSkeleton.sml CompexToPostFix.sml

Your Tasks

  1. (2 points) Change Compex.sml and CompexEnvInterp.sml to handle the fact that the language keyword is compex rather than bindex. (Hint: use M-x replace-string in Emacs, but you want use "../ps9/Compex.sml", not use "../bindex/Compex.sml".)

  2. (2 points) In Compex.sml extend the exp datatype to have a Comp constructor for comp expressions.

  3. (2 points) Extend the definition of sexpToExp in Compex.sml to correctly parse comp expressions.

  4. (2 points) Extend the definition of expToSexp in Compex.sml to correctly unparse comp expressions.

  5. (2 points) Extend the definition of freeVarsExp in Compex.sml to correctly determine the free variables of a comp expression.

  6. (5 points) Extend the definition of eval in CompexEnvInterp.sml to correctly evaluate comp expressions using the environment model. You should evaluate E_num exactly once. Use the #run functionality of the Compex REPL to test that your eval function works as expected. For example:

    - use "CompexEnInterp.sml"
    ... lots of output omitted ...
    
    - CompexEnvInterp.repl()
    
    compex> (#run pgm1.cpx -42)
    42
    
    compex> (#run pgm1.cpx 17)
    17
    
    compex> (#run pgm1.cpx 0)
    0
    
    compex> (#run pgm2.cpx 6 2)
    16
    
    compex> (#run pgm2.cpx 3 3)
    6
    
    compex> (#run pgm2.cpx 2 6)
    ~8
    
    compex> (#run pgm3.cpx 1 2 3)
    3
    
    compex> (#run pgm3.cpx 1 3 2)
    3
    
    compex> (#run pgm3.cpx 3 1 2)
    3
    
    compex> (#run pgm4.cpx 7 3 1)
    68
    
    compex> (#run pgm4.cpx 7 1 3)
    ~8
    
    compex> (#run pgm4.cpx 3 7 1)
    24
    
    compex> (#run pgm4.cpx 3 1 7)
    ~20
    
    compex> (#run pgm4.cpx 1 3 7)
    ~36
    
    compex> (#run pgm4.cpx 1 7 3)
    10
    
    compex> (#run pgm5.cpx 4 3 2)
    9
    
    compex> (#run pgm5.cpx 3 3 3)
    3
    
    compex> (#run pgm5.cpx 2 3 4)
    24
    
    compex> (#run pgm5.cpx 4 2 3)
    Error: Division by 0: 3
    
    compex> (#quit)
    Moriturus te saluto!
    val it = () : unit
  7. (20 points) In this part, you will extend the Compex to PostFix translator in CompexToPostFix.sml to correctly handle the translation of the comp construct from Compex to PostFix. Do this by adding a clause to the expToCmds function that correctly translates the comp construct.

    Notes:

    • This is a challenging translation. Think carefully about it in the context of some concrete examples.

    • The PostFix sel, exec, and executable sequence commands are all helpful in this translation.

    • Performing use "CompexToPostFix.sml" will not only load this file, but will test it on the Compex programs pgm1.cpx through pgm5.cpx.