Adversarial Search
22c:145 Artificial Intelligence
– Optimal decisions
– Minimax
– α-β pruning
– Expectiminimax
Textbook: Chapter 5. Case study: Deep Blue.
Game Tree Example: Tic-Tac-Toe
Formal definition of a game:
– Initial state
– Successor function: returns a list of (move, state) pairs
– Terminal test: determines when the game is over
  (terminal states: states where the game ends)
– Utility function (objective function or payoff function):
  gives a numeric value for terminal states
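The four components above can be rendered as a small Python sketch; the class and method names below are my own illustration (not from the slides), and the game is deliberately trivial:

```python
# A minimal rendering of the formal game definition: initial state,
# successor function, terminal test, and utility function.
# Illustrative sketch only; names are not from any particular library.

class TrivialGame:
    """One move: Max picks a number, the game ends, payoff = the number."""

    def initial_state(self):
        return None                                  # starting position

    def successors(self, state):
        """Successor function: list of (move, state) pairs."""
        if state is None:
            return [('pick-1', 1), ('pick-2', 2)]
        return []                                    # no moves after that

    def is_terminal(self, state):
        """Terminal test: the game is over once a number was picked."""
        return state is not None

    def utility(self, state):
        """Utility function: numeric payoff of a terminal state."""
        return state

game = TrivialGame()
print(game.successors(game.initial_state()))   # -> [('pick-1', 1), ('pick-2', 2)]
```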
[Figure: Tic-Tac-Toe game tree, alternating X's and O's turns, with terminal boards labeled loss or tie.]
Look-ahead based Tic-Tac-Toe game tree:
Each board in the game tree gets a unique value (utility: -1/0/+1) under optimal rational play, e.g., 0 for the top board. (Convince yourself.) What if our opponent does not play optimally?

[Figure: game tree with alternating X's and O's turns; bottom-most boards labeled "Win for O" or "Tie".]

Approach: Look first at the bottom of the tree and label the bottom-most boards. Then label the boards one level up, according to the result of the best possible move. … and so on, moving up layer by layer.

This procedure is termed the Minimax Algorithm
– Implemented as a depth-first search
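The layer-by-layer labeling can be written as a short depth-first recursion. A minimal sketch (my own, using a nested-list tree in place of actual boards):

```python
# Minimax as a depth-first search over an explicit game tree.
# Leaves are utility values; internal nodes are lists of children.
# Levels alternate: Max moves at the root, Min one level down, and so on.

def minimax(node, maximizing):
    if not isinstance(node, list):        # terminal state: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# A 2-ply tree: Max root over three Min nodes with leaves (3,9), (0,7), (2,6)
# (the leaf values used in the pruning example on these slides).
tree = [[3, 9], [0, 7], [2, 6]]
print(minimax(tree, True))   # -> 3, Max's best achievable payoff
```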
This issue has been studied for decades, but research keeps coming back to game tree search (or, most recently, game tree sampling).
Minimax Algorithm (cont'd)
Do DFS. Real games: use iterative deepening. (This gives an "anytime" approach.)
Starting DFS, left to right, do we need to know eval(H)?
What if payoff(Q) = 100 and payoff(R) = 200?

[Figure: Max root over three Min nodes with leaf values (3, 9), (0, 7), (2, 6). DFS left to right establishes root >= 3 after the first Min node; the second and third Min nodes are then bounded above (<= 0 and <= 2), so the leaves 7 and 6 are pruned. Prune!]
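The iterative-deepening idea mentioned above can be sketched as follows (a toy version, my own illustration; at the depth cutoff a real program would call an evaluation function, while this sketch just returns 0 for unexpanded nodes):

```python
# Depth-limited minimax wrapped in iterative deepening: search to depth
# 1, 2, 3, ... and keep the last completed answer ("anytime" behavior).

def depth_limited_minimax(node, maximizing, depth):
    if not isinstance(node, list):        # terminal: exact utility
        return node
    if depth == 0:                        # cutoff: an evaluation function
        return 0                          # belongs here; this toy returns 0
    values = [depth_limited_minimax(c, not maximizing, depth - 1) for c in node]
    return max(values) if maximizing else min(values)

tree = [[3, 9], [0, 7], [2, 6]]
best_so_far = None
for depth in range(1, 3):                 # deepen until time runs out
    best_so_far = depth_limited_minimax(tree, True, depth)
print(best_so_far)   # -> 3 once depth 2 (the full tree) is completed
```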
Minimax algorithm
– Perfect play for deterministic, 2-player games
– Max tries to maximize its score
– Min tries to minimize Max's score
– Goal: Max moves to the position with the highest minimax value,
  i.e., identify the best achievable payoff against best play
Properties of the minimax algorithm:
– Complete? Yes, if the tree is finite
– Optimal? Yes, against an optimal opponent
– Time complexity: O(b^m) (b = branching factor, m = maximum depth)
– Space complexity: O(bm) (depth-first exploration)
How big is this tree?
Look-ahead based Chess: White's and Black's turns alternate, and the game tree runs roughly 80 levels (plies) deep.
Approx. 10^120 positions > the number of atoms in the observable universe (10^80).
We can really only search a tiny, minuscule fraction of this tree!
Around 60 x 10^9 nodes for a 5-minute move: approx. a 1/10^70 fraction.
Resource limits
Can't go all the way to the "bottom":
– Cutoff search: stop the search at a depth limit
– Evaluation functions: estimate the utility of a position at the cutoff
Evaluation function: for chess, typically a linear weighted sum of features.
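A linear weighted sum Eval(s) = w1·f1(s) + … + wn·fn(s) might, for instance, use material balance as its features. The weights below are the classic textbook piece values, purely as an illustration, not any particular engine's actual terms:

```python
# Evaluation function sketch: a linear weighted sum of position features.

def evaluate(features, weights):
    """Eval(s) = w1*f1(s) + ... + wn*fn(s)."""
    return sum(w * f for w, f in zip(weights, features))

# Features: (pawn, knight, bishop, rook, queen) count differences,
# White minus Black; weights are the classic material values.
weights  = (1, 3, 3, 5, 9)
features = (1, 0, -1, 0, 0)          # White is up a pawn but down a bishop
print(evaluate(features, weights))   # -> -2, so Black is ahead
```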
Minimax Algorithm: Limitations
– Generally not feasible to traverse the entire tree
– Time limitations
Key improvements:
– Use an evaluation function instead of the utility (discussed earlier)
  • The evaluation function provides an estimate of the utility at a given position
– Alpha/beta pruning

α-β Pruning
Can we improve search by reducing the size of the game tree to be examined? Yes! Using alpha-beta pruning.
Principle: if a move is determined to be worse than another move already examined, then there is no need for further examination of the node.
Analysis shows that we will be able to search almost twice as deep. This really is what makes game tree search practically feasible. E.g., Deep Blue reached 14 plies using alpha-beta pruning; otherwise only 7 or 8 (a weak chess player). (A ply = a half move, i.e., one player's move.)
Alpha-Beta Pruning
Rules:
– α is the best (highest) value found so far along the path for Max
– β is the best (lowest) value found so far along the path for Min
– Search below a MIN node may be alpha-pruned if its β <= the α of some MAX ancestor
– Search below a MAX node may be beta-pruned if its α >= the β of some MIN ancestor
Note: the order of the children matters!
What gives the best pruning? Visit the most promising children (from the Min/Max perspective) first.
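These rules translate directly into code. A minimal sketch (mine, not Deep Blue's), run on the same (3,9)/(0,7)/(2,6) tree used in the earlier pruning example:

```python
# Depth-first minimax with alpha-beta pruning.
# alpha: best (highest) value found so far on the path for Max.
# beta:  best (lowest) value found so far on the path for Min.
# A subtree is abandoned as soon as alpha >= beta.

def alphabeta(node, maximizing, alpha=float('-inf'), beta=float('inf')):
    if not isinstance(node, list):            # terminal: return utility
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                 # some MIN ancestor already
                break                         # has something <= alpha: prune
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:                 # some MAX ancestor already
                break                         # has something >= beta: prune
        return value

# After the first Min node returns 3, the leaves 7 and 6 are never
# visited: their Min nodes are already bounded above by 0 and 2.
tree = [[3, 9], [0, 7], [2, 6]]
print(alphabeta(tree, True))   # -> 3, the same value as plain minimax
```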
Alpha-Beta Pruning Example
1. Search below a MIN node may be alpha-pruned if its beta value is <= the alpha value of some MAX ancestor.
2. Search below a MAX node may be beta-pruned if its alpha value is >= the beta value of some MIN ancestor.
α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for Max. If v is worse than α, Max will avoid it: prune that branch. Define β similarly for Min.

[Figure: worked example tree with leaf values 5, 0, 6, 1, 3, 2, 4, 7; intermediate values such as 5, 3, and 6 propagate upward, and subtrees are pruned where the α and β bounds cross.]
Properties of α-β Pruning
A few quick approximate numbers for chess:
Expectiminimax
When chance is involved:
[Figure: Backgammon board, points numbered 0-25.]
[Figure: Expectiminimax game tree for backgammon: MAX and MIN levels alternate with DICE (chance) levels. Each chance branch is a dice roll, with probability 1/36 for doubles (e.g., 1,1 or 6,6) and 1/18 for the other rolls (e.g., 1,2 or 6,5).]

Example: with outcome probabilities .9 and .1, a chance node over the values 2 and 3 is worth .9 * 2 + .1 * 3 = 2.1. Note that rescaling the leaf values can change the chosen move: with leaves 2, 3, 1, 4, move A1 wins (2.1 vs. 1.3); with leaves 20, 30, 1, 400, move A2 wins (21 vs. 40.9). A small chance at a high payoff wins out, but that is not necessarily the best thing to do!
Expectiminimax
Expectiminimax(n) =
– Utility(n), for n a terminal state
– max over s in Succ(n) of expectiminimax(s), for n a Max node
– min over s in Succ(n) of expectiminimax(s), for n a Min node
– sum over s in Succ(n) of P(s) * expectiminimax(s), for n a Chance node

Expectiminimax Example: a Max root over two chance nodes with outcome probabilities 0.67 and 0.33, each outcome resolved by a Min node. Left: 0 * 0.67 + 6 * 0.33 ≈ 2 (from min(3, 0) = 0 and min(6, 9) = 6). Right: 0 * 0.67 + 9 * 0.33 ≈ 3 (from min(3, 0) = 0 and min(9, 12) = 9). Root value ≈ 3.
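The four cases above, including the chance-node sum, can be sketched directly; the tagged-tuple tree encoding is my own, and the example reproduces the slides' 0.67/0.33 tree:

```python
# Expectiminimax: minimax extended with Chance nodes, which average
# their children's values weighted by probability.

def expectiminimax(node):
    kind, data = node
    if kind == 'leaf':
        return data                                        # Utility(n)
    if kind == 'max':
        return max(expectiminimax(c) for c in data)
    if kind == 'min':
        return min(expectiminimax(c) for c in data)
    # chance node: sum over successors of P(s) * expectiminimax(s)
    return sum(p * expectiminimax(c) for p, c in data)

leaf = lambda v: ('leaf', v)
tree = ('max', [
    ('chance', [(0.67, ('min', [leaf(3), leaf(0)])),       # 0 * 0.67
                (0.33, ('min', [leaf(6), leaf(9)]))]),     # + 6 * 0.33 ≈ 2
    ('chance', [(0.67, ('min', [leaf(3), leaf(0)])),       # 0 * 0.67
                (0.33, ('min', [leaf(9), leaf(12)]))]),    # + 9 * 0.33 ≈ 3
])
print(round(expectiminimax(tree), 2))   # -> 2.97, the root's value
```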
[Photos: chess-playing machines and programs from 1912, the 1970s, and 1997.]
Combinatorics of Chess
– Opening book
– Endgame: a database of all 5-piece endgames exists; a database of all 6-piece endgames is being built
– Middle game: positions evaluated (estimation):
  • 1 move by each player = 1,000
  • 2 moves by each player = 1,000,000
  • 3 moves by each player = 1,000,000,000

Positions with Smart Pruning (DB = Deep Blue):
  Search Depth (ply)    Positions
  2                     60
  4                     2,000
  6                     60,000
  8                     2,000,000
  10 (<1 second, DB)    60,000,000
  12                    2,000,000,000
  14 (5 minutes, DB)    60,000,000,000
  16                    2,000,000,000,000

How many lines of play does a grand master consider? Around 5 to 7.
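The table is consistent with the earlier claim that alpha-beta roughly doubles the reachable depth: with good move ordering the effective branching factor drops from b to about √b, and for the usual textbook chess estimate of b ≈ 35 that is roughly 6 moves per ply. A quick sanity check (my own arithmetic):

```python
import math

# With alpha-beta and good move ordering, the effective branching
# factor is about sqrt(b) instead of b.
b = 35                            # textbook estimate for chess
print(round(math.sqrt(b), 2))     # -> 5.92 moves per ply, effectively

# Consecutive rows of the table are 2 plies apart, so their ratios
# should be close to b itself (= effective factor squared):
positions = [60, 2_000, 60_000, 2_000_000, 60_000_000]
ratios = [positions[i + 1] / positions[i] for i in range(len(positions) - 1)]
print([round(r, 1) for r in ratios])   # each ratio is ~30-33, close to b = 35
```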
Deep Blue
Hardware lineage: Belle (Thompson, 1978), Cray Blitz (1993), Hitech (1985), Deep Blue (1987-1996).

Special-purpose and parallel hardware:
– 32 general processors
– 220 VLSI chess chips
– Overall: 200,000,000 positions per second; 5 minutes = depth 14
– Selective extensions: search deeper at unstable positions, down to depth 25!
– Parallel evaluation: allows more complicated evaluation functions
– Hardest part: coordinating the parallel search
– Interesting factoid: Deep Blue never quite played the same game twice, because of "noise" in its hardware!

Aside:
– 4-ply ≈ human novice
– 8-ply to 10-ply ≈ typical PC, human master
– 14-ply ≈ Deep Blue, Kasparov (+ depth 25 for "selective extensions")
Evolution of Deep Blue
Summary
Game playing was one of the first tasks undertaken in AI as soon as computers became programmable (e.g., Turing, Shannon, and Wiener tackled chess).

Game-playing research has spawned a number of interesting research ideas on search, data structures, databases, heuristics, evaluation functions, and other areas of computer science.

Game systems rely heavily on:
– Search techniques
– Heuristic functions
– Bounding and pruning techniques
– Knowledge databases on the game

For AI, the abstract nature of games makes them an appealing subject for study:
– the state of the game is easy to represent;
– agents are usually restricted to a small number of actions whose outcomes are defined by precise rules.
[Chart: case complexity — VLSI Verification (~10^150,500 cases; 0.5M-1M) vs. multi-agent systems combining reasoning, uncertainty & learning (~10^301,020 cases; 1M-5M).]