CS 232

Assignment 3

Due: Tuesday, Oct. 3

This assignment explores iteration in Common LISP with dotimes and dolist, minimax search with alpha-beta pruning, and the application of minimax search to the game of tic-tac-toe.

Problem 1: Lab #4 Exercises

Complete Exercises 1-3 from Lab #4. Your solution code can be placed directly into the lab4.lisp code file. Drop off an electronic copy of this file by connecting to the subdirectory that contains your completed lab4.lisp and running submit:

   submit cs232 lab4 lab4.lisp

Also hand in a hardcopy of your lab4.lisp code file.

Problem 2: Minimax Search with Alpha-Beta Pruning

Part a: In this problem, you will first complete two exercises that give you practice with hand-simulating the minimax search process using alpha-beta pruning. These exercises are borrowed from Winston's Artificial Intelligence text, and refer to the figures shown below.

(1) Consider the game tree in figure E6.3. Explore the tree using the alpha-beta procedure. Strike out any static evaluations that do not need to be computed and indicate parts of the tree that are "cut off". Indicate the "winning" path or paths.

(2) Now consider the tree in figure E6.4, which is the mirror image of the game tree in figure E6.3. Again explore the tree using the alpha-beta procedure. Strike out any static evaluations that do not need to be computed and indicate parts of the tree that are "cut off". Indicate the "winning" path or paths.

You will observe that there is far more pruning of the tree in the second example. Why is this true? What is different about the arrangement of the values of the static evaluations in this case?

Part b: Indicate whether the following statement is true or false, and explain why:

Alpha-beta pruning guarantees that game playing programs can explore game trees twice as deeply, compared to using plain minimax search without alpha-beta pruning.

Problem 3: Applying Minimax Search to the Game of Tic-Tac-Toe

The following files in the ~cs232/download/ttt subdirectory contain code for using minimax search with and without alpha-beta pruning, to play the game of tic-tac-toe:

    minimaxTTT.lisp
    minimaxTTT.fas
    limit-depth.lisp

The minimaxTTT.fas file is a compiled version of minimaxTTT.lisp, which runs faster than the interpreted LISP code. Copy these files onto your own directory by connecting to your cs232 subdirectory and executing the following command:

    cp -R ~cs232/download/ttt .

Load the minimaxTTT.fas code file into clisp. You can then immediately play tic-tac-toe. You can play against the computer or have the computer play against itself. The top-level functions are play and self-play. The play function allows you to play against the computer. It has one required input that specifies whether to use straight minimax search (input value nil) or minimax search with alpha-beta pruning (input value t). The play function also has an optional argument that should be set to t if you want the computer to make the first move, and nil otherwise. The self-play function allows the computer to play against itself and has only a single required argument that specifies whether or not to use alpha-beta pruning. These functions can therefore be called in the following ways:

    (play t)               use alpha-beta pruning, user plays first
    (play t t)            use alpha-beta pruning, machine plays first
    (play nil)            do not prune, user plays first
    (play nil t)         do not prune, machine plays first
    (self-play t)       use alpha-beta pruning, computer plays against itself
    (self-play nil)     do not prune, computer plays against itself

Before modifying any code, experiment with playing the game as it is, with different inputs.

The efficiency with which the search process can find a good move depends greatly on the quality of the static evaluation of the board position and on the depth to which moves are considered (i.e. how many moves into the future are evaluated). The code currently generates the full game tree with paths to all of the possible final board configurations, and makes a decision about the next move based on an assessment of this full game tree. In addition, it uses a very simplistic method for generating a static evaluation score for a particular board configuration. The function static in minimaxTTT.lisp, whose definition is printed below, returns a static evaluation score that is 1 if a particular board configuration is a winning configuration for the current player, -1 if it is a winning configuration for the opponent, and 0 otherwise.

(defun static (pos player)
   (cond ((won? pos player) 1)
         ((won? pos (opposite player)) -1)
         (t 0)))

Part a: The current heuristic for computing a static evaluation score effectively weighs all paths to a winning configuration equally. It may be desirable to give more weight to shorter paths (quick wins) and less weight to longer paths (slower wins). Modify the static function so that it uses the measure shown below to determine the value to return when the input board configuration is a winning configuration for the player or opponent. static should return a positive value if the configuration is a winning configuration for the current player and a negative value if it is a winning configuration for the opponent:

1+(1/(number of moves needed to reach the winning configuration))

If the number of moves in the denominator of the second term is larger, this additional factor will be smaller, so that the absolute value of the static evaluation score will be smaller. A board configuration is stored as a list of 9 elements containing X's and O's for positions that have been played, and NIL for unplayed positions, for example, the input pos to the static function may be the following list:

   (O X NIL NIL O NIL X X O)

If this is a winning configuration, the quantity, "number of moves needed to reach the winning configuration" is just the sum of the non-NIL values in this list (i.e. 6 moves in this example).

Part b: Using your new version of the static function, compare the performance of the game-playing function with straight minimax search (no pruning) versus minimax search with alpha-beta pruning. In particular, compare the number of static evaluations that are performed in each case (this number is printed out as moves are made). If you want the computer to play itself, you can compare the results when executing (self-play nil) versus (self-play t).

Part c: In most game-playing situations, it is not feasible to generate the entire game tree when choosing the next move. The file limit-depth.lisp contains partial code to replace two of the functions in the minimaxTTT.lisp file, static and deep-enough, to allow a limited search depth. The intent of the new static function is to provide a more effective static evaluation score for an intermediate board configuration earlier in the game. The function deep-enough returns T (true) if the search has proceeded deeply enough, so that a static evaluation must now be performed on the terminal node. There is a global parameter max-depth that controls the depth of the search, which is currently set to 4. If the board position supplied as input to the new static function is not a winning configuration for either player or a "drawn" configuration (i.e. no further move can be made), then a score is computed that is the difference between the current player's "board-value" and the opponent's "board-value".

Your first task in this part of the problem is to define the board-value function, which should have two inputs corresponding to a board configuration (list of 9 elements as shown above) and current player (X or O). The board-value for the current player should be calculated as follows. There are 8 possible lines of 3 locations that would correspond to a win if occupied by the same player (3 horizontal lines, 3 vertical lines, 2 diagonal). The global variable ttt-lines in limit-depth.lisp is assigned to a list of 3-element lists that each specify the 3 indices of positions along a line on the tic-tac-toe board. For each of the 8 lines, a line-score should be calculated as follows: if there is an opponent along the line, then the line-score is 0. Otherwise, the line-score is the number of locations in the line occupied by the current player. The final board-value is the sum of the line-scores for the 8 lines. Your definition of the board-value function should use dolist at least once, and it is ok to define a helper function.

Compare the overall performance of the game-playing function with a limited search depth to that observed when the full game tree was explored. For example, how does the number of static evaluations compare? Does the computer seem to win less often with depth-limited search? To see an extreme example of the difference in behavior, try the following:

(setf max-depth 1)    ;; very limited search depth!
(play t)
;; choose the following sequence of moves - 0 8 2 1 -- note
;; that you are able to win in this case
(setf max-depth 10)    ;; search the full game tree
(play t)
;; try to start with the same sequence of moves - 0 and 8
;; You will not be able to win in this case!

Submission details: Hand in a hardcopy of your modified static function from Part a (in minimaxTTT.lisp) and any code that you added to limit-depth.lisp, as well as your answers to the questions throughout this handout. Drop off an electronic copy of your final minimaxTTT.lisp and limit-depth.lisp code files by connecting to your ttt subdirectory and running submit as follows:

    submit cs232 assign2ttt *.lisp