code Closures as data structures, delayed evaluation and laziness with promises and streams

Contents

Time Info

In Fall 2019, students self-reported spending a median of 7.0 hours on this assignment.

Tasks

Bizarre syntax errors when programming with structures/signatures?

Check to see if you have accidentally deleted the end that matches a struct or sig keyword. Running git diff can help answer this question.

1. Representing Maps by Functions (map.sml, 20 points)

MAP and FUNMAP signatures are provided and documented in map.sig. Maps are data structures that represent a mapping keys to values. You previously implemented an Env structure that implements maps as association lists, with some additional operations. In this task you will implement a map data structure using only function closures.

  1. Implement a FunMap structure in map.sml, ascribing the FUNMAP signature, which also supports defining a map from a given function. Your implement should represent maps purely as function closures. The ideas to implement these are quite similar to those for the SET ADTs we (partly) implemented in class. A map-representing function returns an option when called on a key: NONE if the given key is unbound in the map represented by that function, SOME v if the given key is bound to value v in the map represented by that function. Implement all bindings described in FUNMAP.

    Test your implementation to make sure it functions as expected on a range of cases. Some starter ideas are included in test-map.sml:

    val m0 : (string, int) FunMap.t = FunMap.empty;
    val lookup_x_m0 = FunMap.lookup "x" m0;
    val m1 = FunMap.bind ("x", 7) m0;
    val lookup_x_m1 = FunMap.lookup "x" m1;
    val m2 = FunMap.bind ("y", 8) m1;
    val lookup_x_m2 = FunMap.lookup "x" m2;
    val lookup_y_m2 = FunMap.lookup "y" m2;
    val lookup_x_unbind_x_m2 = FunMap.lookup "x" (FunMap.unbind "x" m2);
    val lookup_73_halves = FunMap.lookup 73 (FunMap.define (fn n => SOME (n div 2)));
    val lookup_x_make_shadow = FunMap.lookup "x" (FunMap.make [("y", 4),("x", 5),("x",2)];
  2. One of the bindings in the MAP signature could be implemented externally by clients without knowing the concrete type of maps.

    • Which one? Re-implement it as a new function binding (with the same name and type) outside of the FunMap structure.

    • Does implementing this function as a primitive operation within the abstract data type (i.e., within the structure) confer any advantages versus writing a generic version outside? (Assume the external version is also provided by the library: in other words, “saving clients the trouble of writing it” is irrelevant to this question.)

  3. Consider the trade-offs between list representations, function representations, and at least one other (hypothetical) representation for sets and for maps. Examples of other representations include hash tables, heaps, tries, etc. You do not need to implement your extra representation. Answer the following questions in comments in map.sml:

    1. What distinguishes function-representable but non-list-representable sets and maps from list-representable sets and maps? Give examples and explain the fundamental distinction that separates sets/maps that can be represented with, e.g., lists and those that cannot (but can be represented by functions).
    2. What is the fundamental operation over collections that is not possible to implement for function representations of general sets and maps? (Operations like toString and toList are impossible because of this limitation.)
    3. Which representation (lists or functions) is more space-efficient for use cases where insertion of duplicate set elements or binding of duplicate map keys (shadowing) is common? How much does your answer depend on implementation choices vs. fundamental limits of the representation?
  4. Copy your completed map.sml to map-curry.sml:

    $ cp map.sml map-curry.sml
    $ git add map-curry.sml

    As a final challenge, convert map-curry.sml to eliminate all uses of fn ... => ... (anonymous function definition expressions). You must use no more than one fun binding in your FunMap structure per binding in the FUNMAP signature, with no named or anonymous helper functions. Think about how you can define FunMap functions in curried form and exploit partial application. If you are stumped here, do not worry too much, this is not central to the idea of implementing ADTs as functions, but it is a nice connection to make.

2. Stream Programming (stream.sml, 30 points)

Recall that streams are infinite data structures where the production of “the rest” of the data structure is deferred until needed. In the following examples, we use the following types to represent streams, as defined in this STREAM signature:

signature STREAM =
sig
    datatype 'a scons = Scons of 'a * (unit -> 'a scons)
    type 'a t = unit -> 'a scons
    type 'a stream = 'a t         (* another name for it *)
    ...
end

We represent a stream as a thunk (a function that takes () as its argument) that, when called, produces a pair of an element in the stream and another stream (a thunk representing the rest of the stream). We use a single-constructor datatype to allow a recursive type. You will complete the following functions in stream.sml (mostly the Stream structure).

  1. Write a function stake : 'a stream -> int -> 'a list * 'a stream that takes a stream and an int n and returns a pair of the stream’s first n values in order in a list and the rest of the stream. Build only one list, using only n cons operations, and requesting only n elements (and no more) from the stream. In other words, do not use @ (append) or rev (reverse) and do not expand the stream any farther than necessary. Given nats, a stream of the natural numbers starting at 0:

    - val take5 = stake nats 5;
    val take5 = ([0,1,2,3,4], fn) : int list * int stream
  2. Outside the Stream structure, write a stream funny_numbers that is like the stream of natural numbers 0,1,2,3,..., but every multiple of 5 is negated: 0,1,2,3,4,~5,6,7,8,9,~10,.... For example:

    - val (result,_) = stake funny_numbers 12;
    val result = [0,1,2,3,4,~5,6,7,8,9,~10,11] : int list

    Define funny_numbers outside of the Stream structure.

  3. Write a function smap : ('a -> 'b) -> 'a stream -> 'b stream that takes a mapping function f : 'a -> 'b and a stream, and creates a new result stream where every element is f applied to the corresponding element of the argument stream. Calling smap should not result in any expansion of streams.

    - val (result,_) = stake (smap (fn x => x*2) funny_numbers) 12;
    val result = [0,2,4,6,8,~10,12,14,16,18,~20,22] : int list
  4. Write a function sfilter : ('a -> bool) -> 'a stream -> 'a stream that takes a predicate function and a stream, and creates a new result stream that will contain only the elements of the original stream on which f returns true. Calling sfilter should not result in any expansion of streams.

    - val (result,_) = stake (sfilter (fn x => x mod 2 = 0) funny_numbers) 12;
    val result = [0,2,4,6,8,~10,12,14,16,18,~20,22] : int list

    Once you have everything else working: For full credit (i.e., to get the last 1 of several points), simplify your implementation to use exactly 1 fun binding, 1 val binding, and 0 anonymous function definitions (fn ... => ...). (Hint: partial application and currying.)

  5. Write a function scycle : 'a list -> 'b list -> ('a * 'b) stream that takes two lists and produces a stream where each element is a pair of elements, one from each list, cycling through the elements of each list in order. Do not assume that the lists are the same length. Do assume that they are non-empty.

    - val (result,_) = stake (scycle [1,2,3] ["a","b"]) 7;
    val result = [(1,"a"),(2,"b"),(3,"a"),(1,"b"),(2,"a"),(3,"b"),(1,"a")]
        : (int * string) list

    There are a variety of solutions. The original stream construction work and each step of stream expansion on the resulting stream should all be O(1) (i.e., not proportional to the length of either of the lists). The body of the sample solution contains only four lines that are longer than three characters. It never uses integers or list lengths!

  6. Outside the Stream structure, implement a binding primes : int stream that is a stream of all prime numbers in ascending order, starting from 2. To implement the stream, replace the binding of the function ssieve : int stream -> int stream function within the provided logic for primes. You may rewrite the binding of ssieve in any way so long as it type-checks. You may not change the surrounding code.

    - val (twenty_five_primes, _) = stake primes 25;
    val twenty_five_primes =
      [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97]
      : int list

    For algorithmic inspiration, see the Sieve of Eratosthenes. Think of a sieve as a composition of several filters. Streams and functional programming can make an implementation of this algorithm elegantly simple. The body of the sample solution for ssieve contains only two lines that are longer than three characters (not counting spaces) and reuses one of the stream functions you wer asked to define earlier.

3. Side Effects of Lazy Side Effects (side-effects.sml, 15 points)

Our implementation of promises works well in the absence of side effects.

structure Promise :> PROMISE =
struct
    datatype 'a promise = Thunk of unit -> 'a
                        | Value of 'a
    type 'a t = 'a promise ref
    fun delay th = ref (Thunk th)
    fun force p =
      case !p of
          Value v => v
        | Thunk th => let val v = th ()
                          val _ = p := Value v
                      in v end
end

If we wish to support true “at most once” evaluation of promised thunks in the presence of side effects, we need to consider error cases. Consider the following contrived example using our promise implementation from class:

fun test () =
    let
        val calls = ref 0
        val _ = print "Delaying...\n"
        val p = Promise.delay
                    (fn () =>
                        let val c = !calls
                            val _ = calls := c + 1
                            val _ = print ("This is call #" ^ Int.toString c ^ "\n")
                        in 7 div c end)
        val _ = print "... Delayed\n"
        val _ = print "Forcing...\n"
        val x = (Promise.force p) handle e => ~1
        val _ = print ("... Forced. !calls = " ^ Int.toString (!calls) ^ ". x = " ^ Int.toString x ^ "\n")
        val _ = print "Forcing again...\n"
        val y = (Promise.force p) handle e => ~1
        val _ = print ("... Forced. !calls = " ^ Int.toString (!calls) ^ ". y = " ^ Int.toString y ^ "\n")
    in
        {
          calls = !calls,
          x = x,
          y = y
        }
    end

Mixing laziness and side effects makes it difficult to predict the order in which side effects occur. This is not such a problem in the simple example above, but a new problem does arise.

  1. What printed output does evaluating test () produce? What result value does it produce? Does this violate (at least the spirit of) our definition for promises? Run the code to confirm your prediction. You do not need to submit an answer to this part.
  2. Improve the implementation of Promise to ensure that the thunk truly is called at most once. You will want to extend the implementation of the 'a Promise.t datatype and the Promise.force function to handle the case where the thunk has been called but resulted in an exception. When promises in such state are forced again, the force operation should immediately raise the same exception that was raised the first time the promise was forced, without calling the thunk again. When your implementation is complete, a call of test should print This is call #_ just once and should return the result value {calls=1, x=~1, y=~1}. It will help to recall that exceptions are just contructors of the special datatype exn.

Submission

Submit: The course staff will collect your work directly from your hosted repository. To submit your work:

  1. Test your source code files one last time. Make sure that, at a minimum, submitted source code is free of syntax errors and any other static errors (such as static type errors or name/scope errors). In other words: the code does not need to complete the correct computation when invoked, but it must be a valid program. We will not grade files that do not pass this bar.

  2. Make sure you have committed your latest changes.

    $ git add ...
    $ git commit ...
  3. Run the command cs251 sign to sign your work and respond to any assignment survey questions.

    $ cs251 sign
  4. Push your signature and your latest local commits to the hosted repository.

    $ git push

Confirm: All local changes have been submitted if the output of git status shows both:

  • Your branch is up to date with 'origin/master', meaning all local commits have been pushed
  • nothing to commit, meaning all local changes have been committed

Resubmit: If you realize you need to change something later, just repeat this process.