CS 251: PS7: Resistance is Futile. You Will Be SMLated.

Picard Borg image )

PS7 includes both regular problems on SML and two of solo problems in Racket. A third solo problem involving PostFix was cancelled.
Dueish: Tuesday, Nov. 21. Try to get as much of this pset done as you can before Thanksgiving break.
Do the SML parts first!
Notes:
- This pset has 93 total points:
  - It has two solo problems worth 33 points. All solo problems use Racket, not SML.
  - It has two regular problems worth 60 points on SML programming, through sum-of-product datatypes.
- It is strongly recommended that you complete the regular SML problems before the Racket solo problems so that you are better prepared for the SML material in class for the rest of the semester.
- Do not attempt the solo problems until you have studied the solutions from PS4 through PS6.
Submission:
- In your yourFullName CS251 Spring 2017 Folder, create a Google Doc named yourFullName CS251 PS7.
- At the top of your yourFullName CS251 PS7 doc, include your name, problem set number, date of submission, and an approximation of how long each problem part took.
- For all parts of all problems, include all answers (including Racket and SML code) in your PS7 Google Doc. Format Racket and SML code using a fixed-width font, like Consolas or Courier New. You can use a small font size if that helps.
- For Problems 1, 2, 3 (the solo problems):
  - Include the box-and-pointer diagram from 2a and your answer to 2b in your PS7 google doc.
  - Be sure that the function definitions you wrote for these problems in yourAccountName-ps7-solo.rkt and yourAccountName-ps7-solo-postfix.rkt also appear in your Google Doc (so that I can comment on them)
  - Drop a copy of your yourAccountName-ps7-solo.rkt and yourAccountName-ps7-solo-postfix.rkt in your ~/cs251/drop/ps07 drop folder on cs.wellesley.edu.
- For Problems 4 and 5:
  - Include the SML code from yourAccountName-ps7-holo.sml in your Google Doc (so that I can comment on them).
  - Include the SML code from yourAccountName-TTTreeFuns.sml in your Google Doc (so that I can comment on them).
  - Drop a copy of your ~wx/cs251/sml/ps7 folder in your ~/cs251/drop/ps07 drop folder on cs.wellesley.edu by executing the following (replacing gdome by your cs server username):
```
scp -r ~wx/cs251/sml/ps7 gdome@cs.wellesley.edu:/students/gdome/cs251/drop/ps07
```

1. Solo Problem: It’s a Factor (16 points)

This is a solo problem. This means you must complete it entirely on your own without help from any other person and without consulting resources other than course materials or online documentation. You may ask Lyn for clarification, but not for help.

Begin this problem by downloading this starter file ps7-solo-starter.rkt, which you should rename to yourAccountName-ps7-solo.rkt. This file contains provided definitions for Solo Problems 1, 2, and 3. Add your definitions for Problems 1, 2, and 3 to this file.

The following least-divisor-rec function correctly returns the least positive integer that evenly divides the given positive integer num.

(define (least-divisor-rec num) ;; Assume num is a postive integer
  (let ((limit (ceiling (sqrt num)))) ;; The largest divisor to be tested
    (define (search-for-divisor candidate)
      (if (> candidate limit)
          num
          (if (divisible-by? num candidate)
              candidate
              (search-for-divisor (+ candidate 2))))) 
    (if (divisible-by? num 2)
        2
        (search-for-divisor 3)))) 

(define (divisible-by? num divisor)
  (= (remainder num divisor) 0))

We can use map with range to test least-divisor-rec on many inputs:

> (map (λ (n) (list n (least-divisor-rec n))) (range 45 56))
'((45 3) (46 2) (47 47) (48 2) (49 7) (50 2) (51 3) (52 2) (53 53) (54 2) (55 5))

Using least-divisor-rec, we can define a function factors-rec that returns a list of all primes factors of a given positive integer, sorted from low to high:

(define (factors-rec num)
  (let ((factor (least-divisor-rec num)))
    (if (= factor num)
        (list factor)
        (cons factor (factors-rec (quotient num factor))))))

> (map (λ (n) (list n (factors-rec n))) (range 60 73))
'((60 (2 2 3 5))
  (61 (61))
  (62 (2 31))
  (63 (3 3 7))
  (64 (2 2 2 2 2 2))
  (65 (5 13))
  (66 (2 3 11))
  (67 (67))
  (68 (2 2 17))
  (69 (3 23))
  (70 (2 5 7))
  (71 (71))
  (72 (2 2 2 3 3)))

(3 points) Using factors-rec in conjunction with the higher-order forall? function defined in PS4 (it is one of your given helper functions), it is possible to give a very simple definition of the hamming? function from PS2. Flesh out this skeleton of hamming?:

(define (hamming? num)
  (and (integer? num)
       (> num 0)
       (or (= num 1)
           (forall?   ; put expression 1 here
                      ; put expression 2 here
                    ))
       ))

> (filter hamming? (range 0 101))
'(1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 75 80 81 90 96 100)

Recall that forall? is defined as follows:

(define (forall? pred xs)
  (if (null? xs)
      #t
      (and (pred (car xs))
           (forall? pred (cdr xs)))))

(4 points) Using the higher-order find function defined in PS4 (it is one of your given helper functions), it is possible to define a function least-divisor-find that searches through the same candidates to return the same answer as least-divisor-rec for every positive integer input. Flesh out this skeleton of least-divisor-find:
```
(define (least-divisor-find num)
  (find   ; put expression 1 here
          ; put expression 2 here 
          ; put expresion 3 here
        ))

> (map (λ (n) (list n (least-divisor-find n))) (range 45 56))
'((45 3) (46 2) (47 47) (48 2) (49 7) (50 2) (51 3) (52 2) (53 53) (54 2) (55 5))
```
Notes:
- Recall that find is defined as follows:
```
(define (find pred not-found xs)
  (if (null? xs)
      not-found
      (if (pred (car xs))
          (car xs)
          (find pred not-found (cdr xs)))))
```
- Your definition of least-divisor-find should should perform the divisible-by? test on exactly the same candidates as least-divisor-rec.
- For generating candidate divisors, it is helpful to know that the Racket range function (like the Python range function) takes an optional third step argument that is added to the current number to determine the next one. (The default step is 1.) For example:
```
> (range 1 20 3)
'(1 4 7 10 13 16 19)
```
- (least-divisor-find n) may take time proportional to the square root of n, even in cases where least-divisor-rec would return after a small number of steps (e.g., when n is an even number.) This is because it will create a list whose length is the square root of n even if it does not examine all the elements of that list.
- You are not allowed to use factors-rec in your definition of least-divisor-rec.

(4 points) Using the higher-order genlist-apply function we defined in class (it is one of your given helper functions), it is possible to define a function factors-genlist-apply that behaves like factors-rec for every positive integer input. Flesh out the following skeleton of factors-genlist-apply. You may use least-divisor-rec in your definition.

(define (factors-genlist-apply num)
  (map second
       (genlist-apply ; put expression 1 here
                      ; put expression 2 here 
                      ; put expression 3 here 
                      (list num 
                            ; put expression 4 here
                        ))))

> (map (λ (n) (list n (factors-genlist n))) (range 60 73))
'((60 (2 2 3 5))
  (61 (61))
  (62 (2 31))
  (63 (3 3 7))
  (64 (2 2 2 2 2 2))
  (65 (5 13))
  (66 (2 3 11))
  (67 (67))
  (68 (2 2 17))
  (69 (3 23))
  (70 (2 5 7))
  (71 (71))
  (72 (2 2 2 3 3)))

Recall that genlist is defined as:

(define (genlist-apply next done? keepDoneValue? seed)
  (if (apply done? seed)
      (if keepDoneValue? (list seed) null)
      (cons seed
            (genlist-apply next done? keepDoneValue? (apply next seed)))))

(5 points) Using the higher-order iterate-apply function we defined in class (it is one of your given helper functions), it is possible to define a function factors-iterate-apply that behaves like factors-rec for every positive integer input. Flesh out the following skeleton of factors-iterate-apply. You may use least-divisor-rec and reverse in your definition.

(define (factors-iterate-apply num)
  (iterate-apply ; put expression 1 here
                 ; put expression 2 here
                 ; put expression 3 here 
                 (list num null)))

> (map (λ (n) (list n (factors-iterate-apply n))) (range 60 73))
'((60 (2 2 3 5))
  (61 (61))
  (62 (2 31))
  (63 (3 3 7))
  (64 (2 2 2 2 2 2))
  (65 (5 13))
  (66 (2 3 11))
  (67 (67))
  (68 (2 2 17))
  (69 (3 23))
  (70 (2 5 7))
  (71 (71))
  (72 (2 2 2 3 3)))

Recall that iterate-apply is defined as:

(define (iterate-apply next done? finalize state)
  (if (apply done? state)
      (apply finalize state)
      (iterate-apply next done? finalize (apply next state))))

2. Solo Problem: Partial Reverses (17 points)

In this problem, you will extend the yourAccountName-ps7-solo.rkt you created in Problem 1. This problem concerns one of the provided functions in that file:

(5 points) Consider the following partial-reverses function. Draw a box-and-pointer diagram of the list that results from the invocation (partial-reverses '(1 2 3 4)). Use the style of diagram shown in PS3 Problem 2. Study the code carefully and be sure to accurately show all sharing between cons cells in your diagram.

(define (partial-reverses xs)
  (partial-reverses-tail xs '() '()))

(define (partial-reverses-tail ys rev list-rev)
  (if (null? ys)
      (cons rev list-rev)
      (partial-reverses-tail (rest ys)
                             (cons (first ys) rev)
                             (cons rev list-rev)
                             )))

Note: As an example of sharing in box-and-pointer diagrams, consider

(define numList '(7 2 4))

(define listOfNumLists (list (append numList (rest numList)) numList (rest numList)))

The result has 9 cons cells arranged as follows:

diagram showing list with sharing

However, if we just enter the printed representation '((7 2 4 2 4) (7 2 4) (2 4)) for listOfNumLists, that would create 13 cons cells:

diagram showing list without sharing

(2 points) How many cons cells would there be in the result of (partial-reverses (range 1 1001))?

(5 points) Complete the following definition of partial-reverses-iterate so that it uses iterate-apply to iteratively calculate the same result (including sharing) as partial-reverses:

(define (partial-reverses-iterate xs)
  (iterate-apply ; expression1 
                 ; expression2
                 ; expression3
                 ; expression4
                 ))

In your expressions, the only functions you may use are list, cons, first, rest, null, and null?.

Recall that iterate-apply is defined as follows:

(define (iterate-apply next done? finalize state)
  (if (apply done? state)
      (apply finalize state)
      (iterate-apply next done? finalize (apply next state))))

(5 Points) Complete the following definition of partial-reverses-foldl so that it uses foldl to iteratively calculate the same result (including sharing) as partial-reverses:
```
(define (partial-reverses-foldl xs)
  (foldl ; expression1
         ; expression2
         xs))
```
In your expressions, the only functions you may use are list, cons, first, and null.

3. Solo Problem: PostFix Tuples (17 points)

This problem was cancelled.

4. Higher-order List Operators in SML (30 points)

In this problem, you will revisit several of the higher-order list operators we have studied in Racket in the context of SML. Since you are already familiar with these operators, your focus in this problem is on SML syntax and type-checking, rather than on the operators themselves.

range, digitsToDecimal, cartesianProduct, partition (10 points). Translate the following Racket functions range, digits->decimal, cartesian-product, and partition into corresponding SML functions named range, digitsToDecimal, cartesianProduct, and partition functions:

(define (range lo hi)
  (if (<= hi lo)
      null
      (cons lo (range (+ 1 lo) hi))))

(define (digits->decimal digits)
  (foldl (λ (digit sum) (+ (* 10 sum) digit))
         0
         digits))

(define (cartesian-product xs ys)
  (foldr (λ (x subres) (append (map (λ (y) (cons x y)) ys)  subres))
         null 
         xs))  

(define (partition pred xs)
  (foldr (λ (x two-lists) 
           (if (pred x) 
               (list (cons x (first two-lists)) (second two-lists))
               (list (first two-lists) (cons x (second two-lists)))))
         (list '() '())
         xs))

For example:

val range = fn : int -> int -> int list
val digitsToDecimal = fn : int list -> int
val cartesianProduct = fn : 'a list -> 'b list -> ('a * 'b) list
val partition = fn : ('a -> bool) -> 'a list -> 'a list * 'a list

- range 0 10;
val it = [0,1,2,3,4,5,6,7,8,9] : int list

- range 3 8;
val it = [3,4,5,6,7] : int list

- range 5 5;
val it = [] : int list

- range 1 100;
val it = [1,2,3,4,5,6,7,8,9,10,11,12,...] : int list

- Control.Print.printLength := 100;

- val it =
  [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,
   29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,
   54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,
   79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99] : int list

- digitsToDecimal '(2 5 1)

- digitsToDecimal '(1 7 2 9)

- digitsToDecimal (range 0 10);
val it = 123456789 : int

- digitsToDecimal  []
val it = 0 : int

- cartesianProduct [1,2,3,4] ["a", "b", "c"];
val it =
  [(1,"a"),(1,"b"),(1,"c"),(2,"a"),(2,"b"),(2,"c"),(3,"a"),(3,"b"),(3,"c"),
   (4,"a"),(4,"b"),(4,"c")] : (int * string) list

- cartesianProduct ["a", "b", "c"] [1,2,3,4];
val it =
  [("a",1),("a",2),("a",3),("a",4),("b",1),("b",2),("b",3),("b",4),("c",1),
   ("c",2),("c",3),("c",4)] : (string * int) list

- partition (fn x => x mod 2 = 0) [4, 2, 7, 8, 5, 1, 9, 3, 6];
val it = ([4,2,8,6],[7,5,1,9,3]) : int list * int list

- partition (fn x => x < 4) [4, 2, 7, 8, 5, 1, 9, 3, 6];
val it = ([2,1,3],[4,7,8,5,9,6]) : int list * int list

- partition (fn x => x > 0) [4, 2, 7, 8, 5, 1, 9, 3, 6];
val it = ([4,2,7,8,5,1,9,3,6],[]) : int list * int list

Notes:

In this and the following parts of this problem, write all of your SML code in a new file named yourAccountName-ps7-holo.sml that is within the ~wx/cs251/sml/ps7 folder on your wx virtual machine.
You should do all your SML programming in Emacs within the wx virtual machine appliance.
If you haven’t done so already, read these notes on SML/NJ and Emacs SML Mode and follow the advice there. In particular, it is strongly recommended that you create an SML interpreter within a *sml* buffer in Emacs. Then you can use M-p and M-n to avoid retyping your test expressions.
In all of your SML programming, do not use #1, #2, etc. to extract tuple components or List.hd, List.tl, or List.null to decompose and test lists. Instead, use pattern matching on tuples and lists, as illustrated in examples from the SML lectures.
Because hyphens are not allowed in SML identifiers, you should translate all hyphens in Racket identifiers either to underscores (so-called ``snake case’’) or camel case. E.g., cartesian-product in Racket becomes cartesian_product or cartesianPrduct in SML. Here and below, other name changes are also required due to limitations in SML identifiers; e.g., -> in digits->decimal is converted to To.
Be careful with your explicit parentheses in SML. Many type errors in SML programs come from incorrectly parenthesizing expressions.

foldr, foldl, map, and List.filter are all built into SML:

val foldr = fn: ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
val foldl = fn: ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
val map = fn: ('a -> 'b) -> 'a list -> 'b list

- List.filter;
val it = fn : ('a -> bool) -> 'a list -> 'a list

Note that List.filter requires the explicit module prefix List., while the other functions do not. Go figure!

Racket’s append translates to SML’s infix @ operator, but when you want to pass it as an argument to a first-class function you write it as op@.
In this entire problem (not just this part) some instances of Racket’s cons will translate to SML’s infix list-prepending operator ::, while others will translate to the tupling notation (<exp1>, <exp2>) for pair creation. Reason about SML types to figure out which to use when. SML’s type checker will yell at you if you get it wrong.
In this entire problem (not just this part) some instances of Racket’s list will translate to SML’s lists while others will translate to SML’s tuples. Again, reason about SML types to figure out which to use when.

Control.Print.printLength controls how many list elements are displayed; after this number, ellipses are used. For example:

- Control.Print.printLength := 5;
val it = () : unit

- range 0 20;
val it = [0,1,2,3,4,...] : int list

- Control.Print.printLength := 20;
val it = () : unit

- range 0 20;
val it = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19] : int list

Another such control is Control.Print.printDepth, which controls printing in nested lists. You won’t need that here, but will in Problem 5.

allContainMultiple, keepBiggerThanNext, foldrTernop, and keepBiggerThanSomeFollowing (10 points) Translate the following Racket functions all-contain-multiple?, keep-bigger-than-next, foldr-ternop, and keep-bigger-than-some-following into corresponding SML functions named doAllContainMultple, keepBiggerThanNext, foldrTernop, and keepBiggerThanSomeFollowing:

(define (all-contain-multiple? m nss)
  (forall? (lambda (ns)
             (exists? (lambda (n) (divisible-by? n m))
                      ns))
            nss))

(define (keep-bigger-than-next nums)
  (if (null? nums)
      '()
       (map car (filter (λ (pair) (> (car pair) (cdr pair)))
                        (zip nums (rest nums))))))

(define (foldr-ternop ternop nullval xs)
  (if (null? xs) 
      nullval
      (ternop (first xs)
              (rest xs)
              (foldr-ternop ternop nullval (rest xs)))))

   
(define (keep-bigger-than-some-following nums)
  (foldr-ternop (λ (fstNum rstNums bigs)
                  (if (exists? (λ (n) (> fstNum n)) rstNums)
                      (cons fstNum bigs)
                      bigs))
                '()
                nums))

For example:

val doAllContainMultiple = fn : int -> int list list -> bool
val keepBiggerThanNext = fn : int list -> int list
val foldrTernop = fn : ('a * 'a list * 'b -> 'b) -> 'b -> 'a list -> 'b
val keepBiggerThanSomeFollowing = fn : int list -> int list

- doAllContainMultiple 5 [[17,10,2],[25],[3,8,5]];
val it = true : bool

- doAllContainMultiple 2 [[17,10,2],[25],[3,8,5]];
val it = false : bool

- doAllContainMultiple 3 [];
val it = true : bool

- keepBiggerThanNext [6, 1, 4, 7, 2, 5, 9, 8, 3];
val it = [6,7,9,8] : int list

- keepBiggerThanNext [9,7,8,6,4,5,3,1,2];
val it = [9,8,6,5,3] : int list

- keepBiggerThanNext (range 0 20);
val it = [] : int list

- keepBiggerThanSomeFollowing [6, 1, 4, 7, 2, 5, 9, 8, 3];
val it = [6,4,7,5,9,8] : int list

- keepBiggerThanSomeFollowing [9,7,8,6,4,5,3,1,2];
val it = [9,7,8,6,4,5,3] : int list

- keepBiggerThanSomeFollowing (range 0 20);
val it = [] : int list

Notes:

SML includes the following analogs of forall?, exists?, and zip that you should use in your definitions:

- List.all;
val it = fn : ('a -> bool) -> 'a list -> bool

- List.exists;
val it = fn : ('a -> bool) -> 'a list -> bool

- ListPair.zip;
val it = fn : 'a list * 'b list -> ('a * 'b) list

Because SML does not allow ? in identifiers, Racket names containing ? need to be be transformed, as in all-contain-multiple? to doAllContainMultiple.
The List. and ListPair. indicate that these functions come from modules. Here is documentation on the List module, and here is documentation on the ListPair module.
In keepBiggerThanNext and foldrTernop, rather than using List.null nums or nums = [] to check for an empty list and List.hd and List.tl to extract the parts of a list, you should instead use pattern patching to to distinguish empty and nonempty lists and find the parts of a nonempty list. Here’s a simple example that illustrates such pattern matching:
```
fun mapScale factor [] = []
  | mapScale factor (num::nums) = (factor * num) :: (mapScale factor nums)
```

genlist, partialSumsTable, iterate, and fibPairs (10 points). Translate the following Racket functions genlist-apply, partial-sums-table, iterate-apply, and fib-pairs functions into SML functions namded genlist, partialSumsTable, iterate, and fibPairs:

(define (genlist-apply next done? keepDoneValue? seed)
  (if (apply done? seed)
      (if keepDoneValue? (list seed) null)
      (cons seed (genlist-apply next done? keepDoneValue? (apply next seed)))))

(define (partial-sums-table ns)
  (genlist-apply (λ (nums ans)
                   (list (rest nums) (+ (first nums) ans)))
                 (λ (nums ans) (null? nums))
                 #t              
                 (list ns 0)))

(define (iterate-apply next done? finalize state)
  (if (apply done? state)
      (apply finalize state)
      (iterate-apply next done? finalize (apply next state))))

(define (fib-pairs threshold)
  ;; returns a list of pairs (a, b) used to calculate Fibonacci iteratively
  ;; until b is greater than or equal to the given threshold
  (iterate-apply (λ (a b pairs) (list b (+ a b) (cons (cons a b) pairs)))
                 (λ (a b pairs) (>= b threshold))
                 (λ (a b pairs) (reverse (cons (cons a b) pairs)))
                 '(0 1 ())))

For example:

val genlist = fn : ('a -> 'a) -> ('a -> bool) -> bool -> 'a -> 'a list
val pairs_genlist = fn : int -> (int * int) list
val iterate = fn : ('a -> 'a) -> ('a -> bool) -> ('a -> 'b) -> 'a -> 'b
val fibPairs = fn : int -> (int * int) list

- genlist (fn n => n * 2) (fn n => n > 1000) true 1;
val it = [1,2,4,8,16,32,64,128,256,512,1024] : int list

- genlist (fn n => n * 2) (fn n => n > 1000) false 1;
val it = [1,2,4,8,16,32,64,128,256,512] : int list

- partialSumsTable [7, 2, 5, 8, 4];
val it =
  [([7,2,5,8,4],0),([2,5,8,4],7),([5,8,4],9),([8,4],14),([4],22),([],26)]
  : (int list * int) list

- partialSumsTable (range 1 11);
val it =
  [([1,2,3,4,5,6,7,8,9,10],0),([2,3,4,5,6,7,8,9,10],1),([3,4,5,6,7,8,9,10],3),
   ([4,5,6,7,8,9,10],6),([5,6,7,8,9,10],10),([6,7,8,9,10],15),([7,8,9,10],21),
   ([8,9,10],28),([9,10],36),([10],45),([],55)] : (int list * int) list

(* Return the first sum of powers of 3 that's greater than 100 *)
- iterate (fn (power, sum) => (3*power, power+sum))
=         (fn (power, sum) => sum > 100)
=         (fn (power, sum) => sum)
=         (1, 0);
val it = 121 : int (* = 1 + 3 + 9 + 27 + 81 *)

- fibPairs 10;
val it = [(0,1),(1,1),(1,2),(2,3),(3,5),(5,8),(8,13)] : (int * int) list

- fibPairs 50;
val it = [(0,1),(1,1),(1,2),(2,3),(3,5),(5,8),(8,13),(13,21),(21,34),(34,55)]
  : (int * int) list

- fibPairs 100;
val it =
  [(0,1),(1,1),(1,2),(2,3),(3,5),(5,8),(8,13),(13,21),(21,34),(34,55),(55,89),
   (89,144)] : (int * int) list

Notes:

SML does not allow ? in identifiers, so translate done? to is_done or isDone and similarly with keepDoneValue?
Use pattern matching on tuples when translating the (λ (nums ans) ...) and (λ (a b pairs) ...) functions. Translate these to (fn (nums,ans) => ...) and (fn (a,b,pairs) => …). Because of SML’s built-in pattern matching, in SML it is unnecessary to have a separate function like Racket’s genlist-apply and iterate-apply (as distinct from genlist and iterate) in SML since the function arguments in SML’s genlist and iterate can already do pattern matching.

5. 2-3 Trees (30 points)

In this problem you will use SML to implement aspects of 2-3 trees, a particular search tree data structure that is guaranteed to be balanced.

Begin by skimming pages 1 through 7 of this handout on 2-3 trees. (I create this handout for CS230 in Fall, 2004, but we no longer teach 2-3 trees in that course).

In a 2-3 tree, there are three kinds of nodes: leaves, 2-nodes, and 3-nodes. Together, these can be expressed in SML as follows:

datatype TTTree = (* 2-3 tree of ints *)
    L (* Leaf *)
  | W of TTTree * int * TTTree (* tWo node *)
  | H of TTTree * int * TTTree * int * TTTree (* tHree node *)

For simplicity, we will only consider 2-3 trees of integers, though they can be generalized to handle any type of value.

For conciseness in constructing and displaying large trees, the three constructors have single-letter names. For example, below are the pictures of four sample 2-3 trees from the handout and how they would be expressed with these constructors:

pictures of four 2-3 trees

val t1 = W( W(W(L,1,L), 2, W(L,3,L)), 4, W(W(L,5,L), 6, W(L,7,L)))
val t2 = H( W(L,1,L), 2, H(L,3,L,4,L), 5, H(L,6,L,7,L))
val t3 = H( H(L,1,L,2,L), 3, W(L,4,L), 5, H(L,6,L,7,L))
val t4 = H( H(L,1,L,2,L), 3, H(L,4,L,5,L), 6, W(L,7,L))

As explained in the handout on 2-3 trees, a 2-3 tree is only valid it it satisfies two additional properties:

The ordering property:
- In a 2-node with left subtree l, value X, and right subtree r, (all values in l) ≤ X ≤ (all values in r).
- In a 3-node with left subtree l, left value X, middle subtree l, right value Y, and right subtree r, (all values in l) ≤ X ≤ (all values in m) ≤ Y ≤ (all values in r).
The height property (called path length invariant in the handout): in a 2-node or 3-node, the height of all subtrees must be exactly the same.

The height property guarantees that a valid 2-3 tree is balanced. Together, these two properties guarantee that a 2-3 tree is efficiently searchable.

Your ~wx/cs251/sml/ps7 folder includes the file TTTree.sml, which contains the TTTree dataytype defined above as well as numerous examples of valid and invalid 2-3 trees that are instances of this datatype. Note that t and vt are used for valid trees, io is used for invalidly ordered trees, and ih is used for invalid height trees.

tree name	elements	ordered?	satisfies the height property?
t1	[1,…,7]	Yes	Yes, with height 3
t2	[1,…,7]	Yes	Yes, with height 2
t3	[1,…,7]	Yes	Yes, with height 2
t4	[1,…,7]	Yes	Yes, with height 2
vt0	[]	Yes	Yes, with height 0
vt2	[1,2]	Yes	Yes, with height 1
vt17	[1,…,17]	Yes	Yes, with height 3
vt20	[1,…,20]	Yes	Yes, with height 4
vt44	[1,…,44]	Yes	Yes, with height 5
vt92	[1,…,92]	Yes	Yes, with height 6
io2	[1,2]	No	Yes, with height 1
io7	[1,…,7]	No	Yes, with height 2
io17	[1,…,17]	No	Yes, with height 3
io20	[1,…,20]	No	Yes, with height 4
io44	[1,…,44]	No	Yes, with height 5
io92	[1,…,92]	No	Yes, with height 6
ih2	[1,2]	Yes	No
ih7	[1,…,7]	Yes	No
ih17	[1,…,17]	Yes	No
ih20	[1,…,20]	Yes	No
ih44	[1,…,44]	Yes	No
ih92	[1,…,92]	Yes	No

These trees are used to define two lists:

val validTrees = [vt0, vt2, t2, t3, t4, t1, vt17, vt20, vt44, vt92]

val invalidTrees = [io2, io7, io17, io20, io44, io92, 
                    ih2, ih9, ih17, ih20, ih44, ih92]

In this problem you will (1) implement a validity test for 2-3 trees and (2) implement the effcient 2-3 insertion algorithm from the handout on 2-3 trees.

Your ~wx/cs251/sml/ps7 folder also includes the starter file TTTreeFuns.sml. In this file, flesh out the missing definitions as specified below. When you finish this problem, rename the file to yourAccountName-TTTreeFuns.sml before you submit it.

Be sure to perform execute the following in a shell in your wx appliance to get the ~wx/cs251/sml/ps7 folder:

cd ~/cs251/sml
git pull origin master

satisfiesOrderingProperty (7 points)

In this part, you will implement a function satisfiesOrderingProperty: TTTree -> bool that takes an instance of the TTTree datatype and returns true if it satisfies the 2-3 tree ordering property and false if it does not. An easy way to do this is to define two helper functions:
- elts: TTTree -> int list returns a list of all the elements of the tree in in-order — i.e.,
  - In a 2-node with left subtree l, value X, and right subtree r, all values in l are listed before X, which is listed before all elements in r.
  - In a 3-node with left subtree l, left value X, middle subtree l, right value Y, and right subtree r, all values in l are listed before X, which is listed before all values in m, which are listed before Y , which is listed before all values in r.
- isSorted: int list -> bool returns true if the list of integers is in sorted order and false otherwise.
For example:
```
- elts vt17;
val it = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17] : int list

- elts io17;
val it = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,17,16] : int list

- isSorted(elts vt17);
val it = true : bool

- isSorted(elts io17);
val it = false : bool
```
Using these two helper functions, satisfiesOrderingProperty can be implemented as:
```
fun satisfiesOrderingProperty t = isSorted(elts t)
```
For example:
```
- map satisfiesOrderingProperty validTrees;
val it = [true,true,true,true,true,true,true,true,true,true] : bool list

- map satisfiesOrderingProperty invalidTrees;
val it = [false,false,false,false,false,false,true,true,true,true,true,true] : bool list
```
Notes:
- Once (and only once) execute
```
use "TTTree.sml";
```
  to load the TTTree datatype and example trees. Then every time you change the file TTTreeFuns.sml, execute
```
use "TTTreeFuns.sml"
```
- If you haven’t done so already, read these notes on SML/NJ and Emacs SML Mode and follow the advice there. In particular, it is strongly recommended that you create an SML interpreter within a *sml* buffer in Emacs. Then you can use M-p and M-n to avoid retyping your test expressions.
- You may need to execute Control.Print.printLength := 100; in order to see all the list elements.
- Implement isSorted using the same zipping approach you used in PS3. You do zipping in SML via ListPair.zip from the ListPair module. List.all from the List module is also helpful.

satisfiesHeightProperty (5 points)

In this part, you will implement a function satisfiesHeightProperty: TTTree -> bool that takes an instance of the TTTree datatype and returns true if it satisfies the 2-3 tree height property and false if it does not. An easy way to do this is to define satisfiesHeightProperty as

fun satisfiesHeightProperty t = Option.isSome (height t)

where Option.isSome is a function from the Option module that determines if an option value is a SOME (vs. a NONE) and height is a function with type TTTree -> option int such that height t returns SOME h if t satisfies the height property with height h and otherwise returns NONE.

Implement the height function by fleshing out this skeleton:

(* height: TTTree -> int option *)
fun height L = (* return an appropriate option value here *)
  | height (W(l, _, r)) =
    (case (height(l), height(r)) of (* put appropriate pattern clauses here *))
  | (* handle the H case here, similarly to the W case *)

For example:

- map height validTrees;
val it = [SOME 0,SOME 1,SOME 2,SOME 2,SOME 2,SOME 3,SOME 3,SOME 4,SOME 5,SOME 6]
: int option list

- map height invalidTrees;
val it = [SOME 1,SOME 2,SOME 3,SOME 4,SOME 5,SOME 6,NONE,NONE,NONE,NONE,NONE,NONE]
: int option list

Notes:

It’s important to enclose the case expressions within height in parens to distinguish the pattern clauses of the case expression from those of the height function definition.

Once height is defined, satisfiesHeightProperty is defined:

- map satisfiesHeightProperty validTrees;
val it = [true,true,true,true,true,true,true,true,true,true] : bool list
  
- map satisfiesHeightProperty invalidTrees;
val it = [true,true,true,true,true,true,
          false,false,false,false,false,false] : bool list

Once both satisfiesOrderingProperty and satisfiesHeightProperty are defined, it is trivial to define a function isValid: TTTree -> bool that returns true if a given 2-3 tree is valid and false otherwise.

fun isValid t = satisfiesOrderingProperty t andalso satisfiesHeightProperty t

- map isValid validTrees;
val it = [true,true,true,true,true,true,true,true,true,true] : bool list

- map isValid invalidTrees;
val it = [false,false,false,false,false,false,
          false,false,false,false,false,false] : bool list

insert (18 points) In this part, you will implement the insertion algorithm for 2-3 trees as described on pages 3 to 5 of the handout on 2-3 trees. You will not implement the special cases described on page 6 .

The insertion algorithm is a recursive algorithm that uses the notion of a pseudo 2-node that is “kicked up” in certain cases and handled specially in the upward phase of the recursion. To distinguish an insertion result that is a regular 2-3 tree from a pseudo 2-node that is “kicked up”, we use the following datatype:

datatype InsResult =
      Tree of TTTree
    | KickedUp of TTTree * int * TTTree (* pseudo 2-node "kicked up" from below *)

You will implement 2-3 tree insertion via a pair of functions:

(insert v t) returns the new TTTree that results from inserting a given int v into a given tree t. insert has type int -> TTTree -> TTTree.
(ins v t) is a helper function that returns the instance of InsResult that results from inserting a given int v into a given tree t. ins has type int -> TTTree -> InsResult.

Your implementation should flesh out the following code skeleton by turning the pictures on pages 3 to 5 from the handout on 2-3 trees into SML code.

(* insert: int -> TTTree -> TTTree *)
fun insert v t =
  case ins v t of
      Tree t => t
    | KickedUp(l, w, r) => W(l, w, r)
and (* "and" glues together mutually recursive functions *)
(* ins: int -> TTTree -> InsResult *)
    ins v L = KickedUp (L, v, L) 
  | ins v (W(l, X, r)) =
    if v <= X
    then (case ins v l of
                Tree l' => Tree(W(l', X, r))
              | KickedUp (l', w, m)  => Tree(H(l', w, m, X, r)))
    else (* flesh this out *)
  | (* handle an H node similarly to the W node based on rules from the handout *)

Notes:

The ins function should only call ins recursively and should not call insert.

An easy way to test insert is to call the following listToTTTree function on a list of numbers generated by range. (These are already defined in your starter file.) Note that you may need to increase the print depth (via Control.Print.printDepth := 100) in order to see all the details of the printed trees:

fun listToTTTree xs =
  foldl (fn (x,t) => insert x t) L xs

fun range lo hi =
  if lo >= hi then [] else lo :: (range (lo + 1) hi)

- listToTTTree (range 1 8);
val it = W (W (W #,2,W #),4,W (W #,6,W #)) : TTTree

- listToTTTree (range 1 17);
val it = W (W (W #,4,W #),8,W (W #,12,W #)) : TTTree

- Control.Print.printDepth := 100;
val it = () : unit

- listToTTTree (range 1 7);
val it = H (W (L,1,L),2,W (L,3,L),4,H (L,5,L,6,L)) : TTTree

- listToTTTree (range 1 18);
val it =
  W
    (W (W (W (L,1,L),2,W (L,3,L)),4,W (W (L,5,L),6,W (L,7,L))),8,
     W
       (W (W (L,9,L),10,W (L,11,L)),12,
        H (W (L,13,L),14,W (L,15,L),16,W (L,17,L)))) : TTTree

- listToTTTree (range 1 45);
val it =
  W
    (W
       (W (W (W (L,1,L),2,W (L,3,L)),4,W (W (L,5,L),6,W (L,7,L))),8,
        W (W (W (L,9,L),10,W (L,11,L)),12,W (W (L,13,L),14,W (L,15,L)))),16,
     H
       (W (W (W (L,17,L),18,W (L,19,L)),20,W (W (L,21,L),22,W (L,23,L))),24,
        W (W (W (L,25,L),26,W (L,27,L)),28,W (W (L,29,L),30,W (L,31,L))),32,
        H
          (W (W (L,33,L),34,W (L,35,L)),36,W (W (L,37,L),38,W (L,39,L)),40,
           W (W (L,41,L),42,H (L,43,L,44,L))))) : TTTree

Unfortunately, inserting elements in sequential order as above tends to create 2-3 trees with almost all 2-nodes and very few 3-nodes. It turns out a better way is to mix up the order of insertion as is done in the following code for makeTTTree. This results in trees that have a good mix of 2-nodes and 3-nodes:

(* Return list that has all ints in nums, but those not divisible by 3
   come before those divisible by three *)
fun arrangeMod3 nums =
  let val (zeroMod3, notZeroMod3) = 
            List.partition (fn n => (n mod 3) = 0) nums
  in notZeroMod3 @ zeroMod3
  end

(* Make a 2-3 tree with elements from 1 up to and including size. 
   Use arrangeMod3 to mix up numbers and lead to more 3-nodes
   than we'd get with sorted integers lists *)
fun makeTTTree size =
  listToTTTree (arrangeMod3 (range 1 (size + 1)))

- makeTTTree 17;
val it =
  H
    (W (W (L,1,L),2,H (L,3,L,4,L)),5,W (H (L,6,L,7,L),8,H (L,9,L,10,L)),11,
     H (H (L,12,L,13,L),14,W (L,15,L),16,W (L,17,L))) : TTTree

- makeTTTree 44;
val it =
  W
    (W
       (W (W (W (L,1,L),2,H (L,3,L,4,L)),5,W (H (L,6,L,7,L),8,H (L,9,L,10,L))),
        11,
        W
          (W (H (L,12,L,13,L),14,H (L,15,L,16,L)),17,
           W (H (L,18,L,19,L),20,H (L,21,L,22,L)))),23,
     W
       (W
          (W (H (L,24,L,25,L),26,H (L,27,L,28,L)),29,
           W (H (L,30,L,31,L),32,H (L,33,L,34,L))),35,
        W
          (W (H (L,36,L,37,L),38,H (L,39,L,40,L)),41,
           W (W (L,42,L),43,W (L,44,L))))) : TTTree

Finally, to test your insertion implementation more extensively, try (testInsert 1000) with the following function, which will use makeTTTree to create 2-3 trees containing 0 elements to 1000 elements and warn you of any invalid trees.

fun testInsert upToSize =
  let val pairs = map (fn n => (n, makeTTTree n)) (range 0 (upToSize + 1))
      val (validPairs, invalidPairs) = List.partition (fn (_,t) => isValid t) 
                                                      pairs
      val wrongElts = List.filter (fn (n,t) => (elts t) <> (range 1 (n + 1))) 
                                  validPairs
  in if (null invalidPairs) andalso (null wrongElts) 
     then (print "Passed all test cases\n"; [])
     else if (not (null invalidPairs))
          then (print "There are invalid trees in the following cases\n"; 
                invalidPairs)
          else (print "The elements or element order is wrong in the following cases\n"; 
                wrongElts)
  end

- testInsert 1000;
Passed all test cases
val it = [] : (int * TTTree) list

Extra Credit: 2-3 Tree Deletion (20 points)

In SML, implement and test the 2-3 tree deletion algorithm presented in pages 8 through 11 of the handout on 2-3 trees.

Extra Credit: 2-3 Trees in Java (35 points)

Implement and test 2-3 tree insertion and deletion in Java. Compare the SML and Java implementations. Which features of which languages make the implementation easier and why?