|
CS 232
Assignment 3 Due: Tuesday, Oct. 3 |
|
This assignment explores iteration in Common LISP with dotimes and
dolist, minimax search with alpha-beta pruning, and the application of
minimax search to the game of tic-tac-toe.
Complete Exercises 1-3 from Lab #4. Your solution code can be
placed directly into the lab4.lisp code file. Drop
off an electronic copy of this file by connecting to the subdirectory that contains
your completed lab4.lisp and running submit:
submit cs232 lab4 lab4.lisp
Also hand in a hardcopy of your lab4.lisp code file.
Part a: In this problem, you will first complete two exercises that give you practice with hand-simulating the minimax search process using alpha-beta pruning. These exercises are borrowed from Winston's Artificial Intelligence text, and refer to the figures shown below.
(1) Consider the game tree in figure E6.3. Explore the tree using the alpha-beta procedure. Strike out any static evaluations that do not need to be computed and indicate parts of the tree that are "cut off". Indicate the "winning" path or paths.
(2) Now consider the tree in figure E6.4, which is the mirror image of the game tree in figure E6.3. Again explore the tree using the alpha-beta procedure. Strike out any static evaluations that do not need to be computed and indicate parts of the tree that are "cut off". Indicate the "winning" path or paths.
You will observe that there is far more pruning of the tree in the second example. Why is this true? What is different about the arrangement of the values of the static evaluations in this case?
Part b: Indicate whether the following statement is true or false, and explain why:
Alpha-beta pruning guarantees that game playing programs can explore game trees twice as deeply, compared to using plain minimax search without alpha-beta pruning.
The following files in the ~cs232/download/ttt subdirectory contain code for
using minimax search with and without alpha-beta pruning, to play the game of tic-tac-toe:
minimaxTTT.lisp
minimaxTTT.fas
limit-depth.lisp
The minimaxTTT.fas file is a compiled version of minimaxTTT.lisp,
which runs faster than the interpreted LISP code. Copy these files onto your own directory by
connecting to your cs232 subdirectory and executing the following command:
cp -R ~cs232/download/ttt .
Load the minimaxTTT.fas code file into clisp. You can then
immediately play tic-tac-toe. You can play against the computer or have the computer play
against itself. The top-level functions are play and self-play. The
play function allows you to play against the computer. It has one required
input that specifies whether to use straight minimax search (input value
nil) or minimax search with alpha-beta pruning (input value
t). The play function also has an optional argument that
should be set to t if you want the computer to make the first move, and
nil otherwise. The self-play function allows the computer to play
against itself and has only a single required argument that specifies whether or not to use
alpha-beta pruning. These functions can therefore be called in the following ways:
(play t) use alpha-beta pruning,
user plays first
(play t t) use alpha-beta
pruning, machine plays first
(play nil) do not prune, user
plays first
(play nil t) do not prune,
machine plays first
(self-play t) use alpha-beta
pruning, computer plays against itself
(self-play nil) do not prune,
computer plays against itself
Before modifying any code, experiment with playing the game as it is, with different inputs.
The efficiency with which the search process can find a good move depends greatly on the
quality of the static evaluation of the board position and on the depth to which moves are
considered (i.e. how many moves into the future are evaluated). The code currently generates
the full game tree with paths to all of the possible final board configurations, and makes a decision
about the next move based on an assessment of this full game tree. In addition, it uses a
very simplistic method for generating a static evaluation score for a particular board
configuration. The function static in minimaxTTT.lisp, whose
definition is printed below,
returns a static evaluation score that is 1 if a particular board configuration is a winning
configuration for the current player, -1 if it is a winning configuration for the opponent,
and 0 otherwise.
(defun static (pos player)
(cond ((won? pos player) 1)
((won? pos (opposite player)) -1)
(t 0)))
Part a: The current heuristic for computing a static evaluation score
effectively weighs all paths to a winning configuration equally. It may be desirable to
give more weight to shorter paths (quick wins) and less weight to longer
paths (slower wins). Modify the static function so that it uses the measure
shown below to
determine the value to return when the input board configuration is a winning configuration
for the player or opponent. static should return a positive value if the
configuration is a winning configuration for the current player and a negative value if it is
a winning configuration for the opponent:
1+(1/(number of moves needed to reach the winning configuration))
If the number of moves in the denominator of the second term is larger, this
additional factor will
be smaller, so that the absolute value of the static evaluation score will be smaller. A
board configuration is stored as a list of 9 elements containing X's and O's for positions
that have been played, and NIL for unplayed positions, for example, the input
pos to the static function may be the following list:
(O X NIL NIL O NIL X X O)
If this is a winning configuration, the quantity, "number of moves needed to reach the
winning configuration" is just the sum of the non-NIL values in this list
(i.e. 6 moves in this example).
Part b: Using your new version of the static function, compare
the performance of the game-playing function with straight minimax search (no pruning) versus
minimax search with alpha-beta pruning. In particular, compare the number of static
evaluations that are performed in each case (this number is printed out as moves are made).
If you want the computer to play itself, you can compare the results when executing
(self-play nil) versus (self-play t).
Part c: In most game-playing situations, it is not feasible to generate the
entire game tree when choosing the next move. The file limit-depth.lisp contains
partial code to replace two of the functions in the minimaxTTT.lisp file,
static and deep-enough, to allow a limited search depth.
The intent of the new static function is to provide a
more effective static evaluation score for an intermediate board
configuration earlier in the game. The function deep-enough returns
T (true) if the search has proceeded deeply enough, so that a static evaluation
must now be performed on the terminal node. There is a global parameter
max-depth that controls the depth of the search, which is currently set to 4. If
the board position supplied as input to the new static function is not a winning
configuration for either player or a "drawn" configuration (i.e. no further move can be made),
then a score is computed that is the difference between the current player's "board-value"
and the opponent's "board-value".
Your first task in this part of the problem is to define the board-value function,
which should have two inputs corresponding to a board configuration (list of 9 elements as
shown above) and current player (X or O). The board-value for the current player should be
calculated as follows. There are 8 possible lines of 3 locations that would correspond to a
win if occupied by the same player (3 horizontal lines, 3 vertical lines, 2 diagonal). The
global variable ttt-lines in limit-depth.lisp is assigned to a list
of 3-element lists that each specify the 3 indices of positions along a line on the tic-tac-toe
board. For each of the 8 lines, a line-score should be calculated as follows: if there is an
opponent along the line, then the line-score is 0. Otherwise, the line-score is the number of
locations in the line occupied by the current player. The final board-value is the sum of the
line-scores for the 8 lines. Your definition of the board-value function should
use dolist at least once, and it is ok to define a helper function.
Compare the overall performance of the game-playing function with a limited search depth to that observed when the full game tree was explored. For example, how does the number of static evaluations compare? Does the computer seem to win less often with depth-limited search? To see an extreme example of the difference in behavior, try the following:
(setf max-depth 1) ;; very limited search depth!
(play t)
;; choose the following sequence of moves - 0 8 2 1 -- note
;; that you are able to win in this case
(setf max-depth 10) ;; search the full game tree
(play t)
;; try to start with the same sequence of moves - 0 and 8
;; You will not be able to win in this case!
Submission details: Hand in a hardcopy of your modified static
function from Part a (in minimaxTTT.lisp) and any code that you added to
limit-depth.lisp, as well as your answers to the questions throughout this
handout. Drop off an electronic copy of your
final minimaxTTT.lisp and limit-depth.lisp code files by connecting
to your ttt subdirectory and running
submit as follows:
submit cs232 assign2ttt *.lisp