code More higher-order functions, plus a Forth interpreter

Contents

Tasks

1. Lexical Scope and Closures (20 points)

Draw lexical scope contours and the environments produced by evaluating the following sequence of Racket bindings as we did in class.

How to submit this part:

  • Submit a paper copy under Ben’s office door by the deadline; OR
  • Add and commit a scanned PDF of your drawing in your repository.
(define f2
    (lambda (g2)
        (let ([x2 3])
            (g2 2))))
(define x2 4)
(define h2
    (lambda (y2)
        (+ x2 y2)))
(define z2 (f2 h2))

2. Functional Idioms in the Large (10 points)

Read MapReduce: Simplified Data Processing on Large Clusters, by Dean and Ghemawat. The MapReduce system is a good example of how an idiom developed with a particular language or style of language can be applied even far outside that language. Consider that (a) this paper was published in 2004, when the system had already been in use for some time, (b) the scale of workloads at major software services companies has probably increased exponentially since, and (c) the MapReduce system itself is no longer used, but many of its lessons live on in successor systems.

For the following questions, write a paragraph or two in the file mapreduce.txt.

  1. The map and reduce functions used in MapReduce have names that are similar to familiar higher-order Racket functions. (fold is also sometimes called reduce.) Discuss similarities and differences between MapReduce’s programming model and the programming idioms we have used with familiar higher-order functions in Racket. What familiar Racket functions are similar to the primitives of the MapReduce framework? Is there an exact match?

  2. In class (possibly after the release of this assignment), we consider pros and cons of immutability vs. mutability in programming languages. How do internet-scale design constraints affect the (im)mutability discussion? Discuss how this choice is important for at least 2 design problems that MapReduce aims to solve.

3. Higher-Order Data-Crunching (25 points)

Write your answers to this part in datacrunch.rkt.

Functions in this part will use a corpus of documents. A corpus (for the purposes of this assignment) is a list of documents, where each document is a pair (cons cell) of a name and the contents (i.e., list of words of the document) of the document. Names and words may be any simple values, such as symbols, strings, or numbers.

A corpus may also be considered to be:

  • an association list from document name to document contents; or
  • a list of lists where the first element of each list is the document name and the remaining elements are the document words.

The following example corpus is written with a careful choice of quoted notation that helps illustrate the form as we will interpret it. As we have seen before, Racket may give a different display notation. Consider why.

(define corpus
  '((hw  . (hello world))
    (hw2 . (hi world))
    (wc  . (world champion of the hi fi world))))
  1. Evaluate corpus in DrRacket. How is it displayed? Why does DrRacket display it differently than the above? Why are both displays accurate and equivalent? Recall that '(A . B) denotes a cons cell holding 'A in the car and 'B in the cdr.

  2. Write a function count that takes a corpus of docs and a word and returns the total number of occurrences of word in the corpus. Use higher-order functions to your advantage. Minimize the number of intermediate data structures (lists / cons cells) that your solution produces. An ideal solution never creates a fresh cons cell.

    > (count corpus 'hi)
    2
    > (count corpus 'world)
    4
    > (count corpus 'hello)
    1
    > (count corpus 'hooligans)
    0
    > (count null 'hooligans)
    0
    > (count corpus 'hw)
    0
  3. Write a function grep that takes a corpus docs and a word and returns an association list mapping document name to the number of occurrences of word in that document. ONLY documents in docs with at least 1 occurrence of word should appear as keys in the association list. Use higher-order functions to your advantage. Minimize the number of intermediate data structures (lists / cons cells) that your solution produces that are not part of the final result. Order does not matter within the result.

    > (grep corpus 'hi)
    '((wc . 1)
    (hw2 . 1))
    > (grep corpus 'world)
    '((wc . 2)
    (hw2 . 1)
    (hw . 1))
    > (grep corpus 'hooligans)
    '()
    > (grep null 'world)
    '()
  4. For the purposes of this task, a pile is either:

    • an non-null atom (a number, boolean, symbol, etc.); or
    • a list (empty or non-empty) of piles.

    A pile:

    • never contains cons cells that are not lists;
    • may be arbitrarily deep; and
    • typically contains only one type of atom.

    The following example binds pile1 through pile5 to piles:

    (define pile1 (list 1))
    (define pile2 2)
    (define pile3 null)
    (define pile4 (list 2 3 4))
    (define pile5 (list
                    (list 1 2 3 4)
                    (list 2 (list 4 5) (list 9 2 1) 3)
                    (list 2 8)))

    A pile may be flattened into a single-dimensional list with elements in their original pile order by applying the built in Racket function flatten:

    > (flatten pile5)
    '(1 2 3 4 2 4 5 9 2 1 3 2 8)

    Your task is to write a function fold-pile that takes these parameters:

    • a combiner function combine;
    • an initial accumulator value init; and
    • a pile pile;

    and returns a result equivalent to the result of

    (foldl combine init (flatten pile))

    Implementation rules:

    • You may not invoke flatten or any equivalent function as part of your fold-pile implementation.
    • You must minimize the number of intermediate data structures (lists / cons cells) that your solution produces and discards as by-products. The best implementations of fold-pile do not create any cons cells that are not part of the final result value of fold-pile.

    Examples:

    > (fold-pile + 0 pile1)
    1
    > (fold-pile + 0 pile2)
    2
    > (fold-pile + 0 pile3)
    0
    > (fold-pile + 0 pile4)
    9
    > (fold-pile + 0 pile5)
    46

    Hints:

    • The sample solution adds 3 lines below the define line.
    • Only one of those 3 lines is longer than the define line.

4. A Forth Interpreter (45 points)

This part was canceled in Fall 2019.

Submission

Submit: The course staff will collect your work directly from your hosted repository. To submit your work:

  1. Make sure you have committed your latest changes.

    $ git add ...
    $ git commit ...
  2. Run the command cs251 sign to sign your work and respond to any assignment survey questions.

    $ cs251 sign
  3. Push your signature and your latest local commits to the hosted repository.

    $ git push

Confirm: All local changes have been submitted if the output of git status shows both:

  • Your branch is up to date with 'origin/master', meaning all local commits have been pushed
  • nothing to commit, meaning all local changes have been committed

Resubmit: If you realize you need to change something later, just repeat this process.