1
00:00:08,928 --> 00:00:12,580
SPEAKER 1: It was about 1963
when a noted philosopher here

2
00:00:12,580 --> 00:00:15,885
at MIT, named Hubert Dreyfus--

3
00:00:20,642 --> 00:00:30,330
Hubert Dreyfus wrote a paper in
about 1963 in which he had

4
00:00:30,330 --> 00:00:37,610
a heading titled, "Computers
Can't Play Chess." Of course,

5
00:00:37,610 --> 00:00:40,110
he was subsequently invited
over to the artificial

6
00:00:40,110 --> 00:00:41,670
intelligence laboratory
to play the

7
00:00:41,670 --> 00:00:43,420
Greenblatt chess machine.

8
00:00:43,420 --> 00:00:46,650
And, of course, he lost.

9
00:00:46,650 --> 00:00:52,200
Whereupon Seymour Papert wrote
a rebuttal to Dreyfus' famous

10
00:00:52,200 --> 00:00:56,240
paper, which had a subject
heading, "Dreyfus Can't Play

11
00:00:56,240 --> 00:00:59,840
Chess Either."

12
00:00:59,840 --> 00:01:02,390
But in a strange sense, Dreyfus
might have been right

13
00:01:02,390 --> 00:01:07,000
and would have been right if he
were to have said computers

14
00:01:07,000 --> 00:01:11,690
can't play chess the way
humans play chess yet.

15
00:01:11,690 --> 00:01:16,630
In any case, around about 1968
a chess master named David

16
00:01:16,630 --> 00:01:23,000
Levy bet noted founder of
artificial intelligence John

17
00:01:23,000 --> 00:01:25,440
McCarthy that no computer would
beat the world champion

18
00:01:25,440 --> 00:01:27,730
within 10 years.

19
00:01:27,730 --> 00:01:31,740
And five years later, McCarthy
gave up, because it had

20
00:01:31,740 --> 00:01:36,509
already become clear that no
computer would win in a way

21
00:01:36,509 --> 00:01:39,590
that McCarthy wanted it to win,
that is to say by playing

22
00:01:39,590 --> 00:01:42,960
chess the way humans
play chess.

23
00:01:42,960 --> 00:01:48,160
But then 20 years after that
in 1997, Deep Blue beat the

24
00:01:48,160 --> 00:01:52,160
world champion, and chess
suddenly became uninteresting.

25
00:01:54,910 --> 00:01:58,690
But we're going to talk about
games today, because there are

26
00:01:58,690 --> 00:02:01,720
elements of game-play that do
model some of the things that

27
00:02:01,720 --> 00:02:03,750
go on in our head.

28
00:02:03,750 --> 00:02:06,050
And if they don't model things
that go on in our head, they

29
00:02:06,050 --> 00:02:08,610
do model some kind
of intelligence.

30
00:02:08,610 --> 00:02:10,490
And if we're to have a general
understanding of what

31
00:02:10,490 --> 00:02:13,690
intelligence is all about, we
have to understand that kind

32
00:02:13,690 --> 00:02:16,000
of intelligence, too.

33
00:02:16,000 --> 00:02:18,790
So, we'll start out by talking
about various ways that we

34
00:02:18,790 --> 00:02:20,329
might design a computer
program to

35
00:02:20,329 --> 00:02:22,360
play a game like chess.

36
00:02:22,360 --> 00:02:26,930
And we'll conclude by talking
a little bit about what Deep

37
00:02:26,930 --> 00:02:32,460
Blue adds to the mix other
than tremendous speed.

38
00:02:32,460 --> 00:02:34,300
So, that's our agenda.

39
00:02:34,300 --> 00:02:37,320
By the end of the hour, you'll
understand and be able to

40
00:02:37,320 --> 00:02:40,290
write your own Deep Blue
if you feel like it.

41
00:02:40,290 --> 00:02:44,270
First, we want to talk about how
it might be possible for a

42
00:02:44,270 --> 00:02:45,590
computer to play chess.

43
00:02:45,590 --> 00:02:48,250
Let's talk about several
approaches

44
00:02:48,250 --> 00:02:50,220
that might be possible.

45
00:02:50,220 --> 00:02:53,860
Approach number one is that
the machine might make a

46
00:02:53,860 --> 00:02:56,980
description of the board the
same way a human would; talk

47
00:02:56,980 --> 00:03:00,000
about pawn structure, King
safety, whether it's a good

48
00:03:00,000 --> 00:03:01,960
time to castle, that
sort of thing.

49
00:03:01,960 --> 00:03:12,130
So, it would be analysis and
perhaps some strategy mixed up

50
00:03:12,130 --> 00:03:13,380
with some tactics.

51
00:03:15,870 --> 00:03:20,190
And all that would get mixed
up and, finally, result in

52
00:03:20,190 --> 00:03:22,710
some kind of move.

53
00:03:22,710 --> 00:03:27,450
If this is the game board, the
next thing to do would be

54
00:03:27,450 --> 00:03:29,400
determined by some process
like that.

55
00:03:29,400 --> 00:03:33,730
And the trouble is no one
knows how to do it.

56
00:03:33,730 --> 00:03:35,180
And so in that sense,
Dreyfus is right.

57
00:03:35,180 --> 00:03:38,829
None of the game-playing programs
today incorporate any of that

58
00:03:38,829 --> 00:03:41,350
kind of stuff.

59
00:03:41,350 --> 00:03:43,600
And since nobody knows
how to do that, we

60
00:03:43,600 --> 00:03:44,420
can't talk about it.

61
00:03:44,420 --> 00:03:45,610
So we can talk about
other ways, though,

62
00:03:45,610 --> 00:03:46,829
that we might try.

63
00:03:46,829 --> 00:03:48,450
For example, we can have
if-then rules.

64
00:03:55,820 --> 00:03:56,740
How would that work?

65
00:03:56,740 --> 00:03:57,840
That would work this way.

66
00:03:57,840 --> 00:04:01,770
You look at the board,
represented by this node here,

67
00:04:01,770 --> 00:04:09,970
and you say, well, if it's
possible to move the Queen

68
00:04:09,970 --> 00:04:13,390
pawn forward by one,
then do that.

69
00:04:13,390 --> 00:04:16,010
So, it doesn't do any
evaluation of the board.

70
00:04:16,010 --> 00:04:19,290
It doesn't try anything.

71
00:04:19,290 --> 00:04:22,360
It just says let me look at the
board and select a move on

72
00:04:22,360 --> 00:04:24,170
that basis.

73
00:04:24,170 --> 00:04:28,130
So, that would be a way
of approaching a game

74
00:04:28,130 --> 00:04:30,760
situation like this.

75
00:04:30,760 --> 00:04:32,430
Here's the situation.

76
00:04:32,430 --> 00:04:34,060
Here are the possible moves.

77
00:04:34,060 --> 00:04:36,130
And one is selected
on the basis of an

78
00:04:36,130 --> 00:04:40,180
if-then rule like so.

79
00:04:40,180 --> 00:04:42,195
And nobody can make a very
strong chess player

80
00:04:42,195 --> 00:04:43,350
that works like that.

81
00:04:43,350 --> 00:04:46,100
Curiously enough, someone has
made a pretty good checkers

82
00:04:46,100 --> 00:04:49,326
playing program that
works like that.

83
00:04:49,326 --> 00:04:53,110
It checks to see what moves are
available on the board,

84
00:04:53,110 --> 00:04:57,290
ranks them, and picks the
highest one available.

85
00:04:57,290 --> 00:04:58,920
But, in general, that's not
a very good approach.

86
00:04:58,920 --> 00:04:59,860
It's not very powerful.

87
00:04:59,860 --> 00:05:01,670
You couldn't make it--

88
00:05:01,670 --> 00:05:03,900
well, when I say, couldn't, it
means I can't think of any way

89
00:05:03,900 --> 00:05:05,530
that you could make a
strong chess playing

90
00:05:05,530 --> 00:05:06,780
program that way.

91
00:05:09,260 --> 00:05:19,055
So, the third way to do this is
to look ahead and evaluate.

92
00:05:24,090 --> 00:05:26,410
What that means is you
look ahead like so.

93
00:05:26,410 --> 00:05:29,790
You see all the possible
consequences of moves, and you

94
00:05:29,790 --> 00:05:33,740
say, which of these board
situations is best for me?

95
00:05:33,740 --> 00:05:37,210
So, that would be an approach
that comes in here like so and

96
00:05:37,210 --> 00:05:41,780
says, which one of those three
situations is best?

97
00:05:41,780 --> 00:05:45,990
And to do that, we have to have
some way of evaluating

98
00:05:45,990 --> 00:05:50,770
the situation deciding which
of those is best.

99
00:05:50,770 --> 00:05:53,710
Now, I want to do a little,
brief aside, because I want to

100
00:05:53,710 --> 00:05:56,670
talk about the mechanisms that
are popularly used to do that

101
00:05:56,670 --> 00:05:59,420
kind of evaluation.

102
00:05:59,420 --> 00:06:02,560
In the end, there are lots of
features of the chessboard.

103
00:06:02,560 --> 00:06:05,680
Let's call them f1,
f2, and so on.

104
00:06:08,830 --> 00:06:12,380
And we might form some function
of those features.

105
00:06:12,380 --> 00:06:16,190
And that, overall, is called
the static value.

106
00:06:16,190 --> 00:06:19,400
So, it's static because you're
not exploring any consequences

107
00:06:19,400 --> 00:06:20,250
of what might happen.

108
00:06:20,250 --> 00:06:22,525
You're just looking at the board
as it is, checking the

109
00:06:22,525 --> 00:06:25,080
King's safety, checking
the pawn structure.

110
00:06:25,080 --> 00:06:28,440
Each of those produces a number
fed into this function,

111
00:06:28,440 --> 00:06:30,380
out comes a value.

112
00:06:30,380 --> 00:06:33,960
And that is a value of the
board seen from your

113
00:06:33,960 --> 00:06:36,159
perspective.

114
00:06:36,159 --> 00:06:42,370
Now, normally, this function,
g, is reduced to a linear

115
00:06:42,370 --> 00:06:43,990
polynomial.

116
00:06:43,990 --> 00:06:47,330
So, in the end, the most popular
kind of way of forming

117
00:06:47,330 --> 00:06:52,120
a static value is to take f1,
multiply it times some

118
00:06:52,120 --> 00:06:57,290
constant, c1, add c2, multiply
it times f2.

119
00:07:02,560 --> 00:07:10,973
And that is a linear
scoring polynomial.

120
00:07:18,880 --> 00:07:21,610
So, we could use that function
to produce numbers from each

121
00:07:21,610 --> 00:07:24,150
of these things and then pick
the highest number.

122
00:07:24,150 --> 00:07:26,640
And that would be a way
of playing the game.
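
A minimal sketch of the linear scoring polynomial and the pick-the-highest step described above. The feature values, the weights, and the names `static_value` and `pick_best` are illustrative assumptions, not taken from any real chess program.

```python
def static_value(features, weights):
    """Linear scoring polynomial: c1*f1 + c2*f2 + ... + cn*fn."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    return sum(c * f for c, f in zip(weights, features))


def pick_best(boards, weights):
    """Return the name of the board with the highest static value."""
    return max(boards, key=lambda name: static_value(boards[name], weights))


# Hypothetical features per board: (king safety, pawn structure), made-up numbers.
boards = {"left": (1.0, 0.0), "middle": (0.5, 2.0), "right": (0.0, 1.0)}
weights = (3.0, 1.0)  # assumed constants c1, c2
best = pick_best(boards, weights)
```

Note that `pick_best` only needs to know which board scores highest, which is the weaker requirement the lecture points out next.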

123
00:07:26,640 --> 00:07:29,220
Actually, a scoring polynomial
is a little bit

124
00:07:29,220 --> 00:07:29,940
more than we need.

125
00:07:29,940 --> 00:07:33,909
Because all we really need is
a method that looks at those

126
00:07:33,909 --> 00:07:36,990
three boards and says,
I like this one best.

127
00:07:36,990 --> 00:07:38,490
It doesn't have to rank them.

128
00:07:38,490 --> 00:07:40,340
It doesn't have to give
them numbers.

129
00:07:40,340 --> 00:07:43,500
All it has to do is say which
one it likes best.

130
00:07:43,500 --> 00:07:45,690
So, one way of doing that is
to use a linear scoring

131
00:07:45,690 --> 00:07:46,240
polynomial.

132
00:07:46,240 --> 00:07:49,940
But it's not the only
way of doing that.

133
00:07:49,940 --> 00:07:53,980
So, that's number two
and number three.

134
00:07:53,980 --> 00:07:58,409
But now what else might we do?

135
00:07:58,409 --> 00:08:01,210
Well, if we reflect back on some
of the searches we talked

136
00:08:01,210 --> 00:08:04,320
about, what's the base case
against which everything else

137
00:08:04,320 --> 00:08:07,800
is compared, the way of
doing search that doesn't

138
00:08:07,800 --> 00:08:10,910
require any intelligence,
just brute force?

139
00:08:10,910 --> 00:08:13,630
We could use the British Museum
algorithm and simply

140
00:08:13,630 --> 00:08:17,770
evaluate the entire tree of
possibilities; I move, you

141
00:08:17,770 --> 00:08:20,540
move, I move, you move,
all the way down to--

142
00:08:23,370 --> 00:08:25,726
what?--

143
00:08:25,726 --> 00:08:28,800
maybe 100, 50 moves.

144
00:08:28,800 --> 00:08:29,770
You do 50 things.

145
00:08:29,770 --> 00:08:31,750
I do 50 things.

146
00:08:31,750 --> 00:08:35,500
So, before we can decide if
that's a good idea or not, we

147
00:08:35,500 --> 00:08:38,754
probably ought to develop
some vocabulary.

148
00:08:50,160 --> 00:08:58,000
So, consider this
tree of moves.

149
00:08:58,000 --> 00:09:02,530
There will be some
number of choices

150
00:09:02,530 --> 00:09:04,285
considered at each level.

151
00:09:04,285 --> 00:09:07,250
And there will be some
number of levels.

152
00:09:07,250 --> 00:09:09,910
So, the standard language for
this is that we call this the

153
00:09:09,910 --> 00:09:11,160
branching factor.

154
00:09:18,880 --> 00:09:23,340
And in this particular case,
b is equal to 3.

155
00:09:23,340 --> 00:09:30,250
This is the depth of the tree.

156
00:09:30,250 --> 00:09:34,280
And, in this case, d is two.

157
00:09:34,280 --> 00:09:37,750
So, now that produces a certain
number of terminal or

158
00:09:37,750 --> 00:09:39,000
leaf nodes.

159
00:09:44,060 --> 00:09:45,875
How many of those are there?

160
00:09:49,020 --> 00:09:50,170
Well, that's pretty simple
computation.

161
00:09:50,170 --> 00:09:51,840
It's just b to the d.

162
00:09:51,840 --> 00:09:55,330
Right, Christopher,
b to the d?

163
00:09:55,330 --> 00:10:01,660
So, if you have b to the d at
this level, you have one.

164
00:10:01,660 --> 00:10:04,670
b to the d at this level,
you have b.

165
00:10:04,670 --> 00:10:09,020
b to the d at this level, you
have b squared.

166
00:10:09,020 --> 00:10:14,030
So, b to the d, in this
particular case, is 9.
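
A small sketch of the counting argument above, assuming a uniform tree in which every node has exactly b children. The function names are hypothetical.

```python
def nodes_per_level(b, d):
    """Number of nodes at each level of a uniform tree, from the root (level 0) down to level d."""
    return [b ** level for level in range(d + 1)]


def leaf_count(b, d):
    """Number of leaf nodes, and so of static evaluations, at depth d."""
    return b ** d


levels = nodes_per_level(3, 2)   # the tree on the board: b = 3, d = 2
leaves = leaf_count(3, 2)
```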

167
00:10:17,090 --> 00:10:19,310
So, now we can use this
vocabulary that we've

168
00:10:19,310 --> 00:10:21,770
developed to talk about whether
it's reasonable to

169
00:10:21,770 --> 00:10:24,990
just do the British Museum
algorithm, be done with it,

170
00:10:24,990 --> 00:10:28,500
forget about chess,
and go home.

171
00:10:28,500 --> 00:10:29,750
Well, let's see.

172
00:10:32,450 --> 00:10:35,050
It's pretty deep down there.

173
00:10:35,050 --> 00:10:39,290
If we think about chess, and we
think about a standard game

174
00:10:39,290 --> 00:10:41,970
in which each person does
50 things, that

175
00:10:41,970 --> 00:10:45,080
gives a d of about 100.

176
00:10:45,080 --> 00:10:47,490
And if you think about the
branching factor in chess,

177
00:10:47,490 --> 00:10:50,430
it's generally presumed to be,
depending on the stage of the

178
00:10:50,430 --> 00:10:52,870
game and so on and so forth,
it varies, but it might

179
00:10:52,870 --> 00:10:55,940
average around 14 or 15.

180
00:10:55,940 --> 00:10:59,620
If it were just 10, that would
be 10 to the 100th.

181
00:10:59,620 --> 00:11:01,310
But it's a little more than
that, because the branching

182
00:11:01,310 --> 00:11:03,930
factor is more than 10.

183
00:11:03,930 --> 00:11:09,300
So, in the end, it looks like,
according to Claude Shannon,

184
00:11:09,300 --> 00:11:16,160
there are about 10 to the 120th
leaf nodes down there.

185
00:11:16,160 --> 00:11:18,930
And if you're going to go to a
British Museum treatment of

186
00:11:18,930 --> 00:11:21,990
this tree, you'd have to do
10 to the 120th static

187
00:11:21,990 --> 00:11:28,310
evaluations down there at the
bottom if you're going to see

188
00:11:28,310 --> 00:11:32,080
which one of the moves
is best at the top.

189
00:11:32,080 --> 00:11:33,850
Is that a reasonable number?

190
00:11:33,850 --> 00:11:38,622
It didn't used to seem
practicable.

191
00:11:38,622 --> 00:11:41,460
It used to seem impossible.

192
00:11:41,460 --> 00:11:43,400
But now we've got cloud
computing and everything.

193
00:11:43,400 --> 00:11:48,180
And maybe we could actually
do that, right?

194
00:11:48,180 --> 00:11:51,440
What do you think, Vanessa, can
you do that, get enough

195
00:11:51,440 --> 00:11:54,940
computers going in the cloud?

196
00:11:54,940 --> 00:11:55,385
No?

197
00:11:55,385 --> 00:11:57,150
You're not sure?

198
00:11:57,150 --> 00:11:59,350
Should we work it out?

199
00:11:59,350 --> 00:12:00,520
Let's work it out.

200
00:12:00,520 --> 00:12:04,170
I'll need some help, especially
from any of you who

201
00:12:04,170 --> 00:12:05,420
are studying cosmology.

202
00:12:07,700 --> 00:12:09,470
So, we'll start with
how many atoms are

203
00:12:09,470 --> 00:12:10,720
there in the universe?

204
00:12:13,580 --> 00:12:14,890
Volunteers?

205
00:12:14,890 --> 00:12:15,790
10 to the--

206
00:12:15,790 --> 00:12:16,792
SPEAKER 2: 10 to the 38th?

207
00:12:16,792 --> 00:12:19,300
SPEAKER 1: No, no, 10 to the
38th has been offered.

208
00:12:19,300 --> 00:12:22,376
That's way too low.

209
00:12:22,376 --> 00:12:25,760
The last time I looked, it was
about 10 to the 80th atoms in

210
00:12:25,760 --> 00:12:27,010
the universe.

211
00:12:33,940 --> 00:12:35,900
The next thing I'd like to know
is how many seconds are

212
00:12:35,900 --> 00:12:37,232
there in a year?

213
00:12:37,232 --> 00:12:41,200
It's a good number
to have memorized.

214
00:12:41,200 --> 00:12:53,350
That number is approximately
pi times 10 to the seventh.

215
00:12:53,350 --> 00:12:56,190
So, how many nanoseconds
in a second?

216
00:12:56,190 --> 00:13:03,410
That gives us 10 to the ninth.

217
00:13:03,410 --> 00:13:06,670
At last, how many years
are there in the

218
00:13:06,670 --> 00:13:07,920
history of the universe?

219
00:13:10,040 --> 00:13:12,480
SPEAKER 3: [INAUDIBLE].

220
00:13:12,480 --> 00:13:15,790
14.7 billion.

221
00:13:15,790 --> 00:13:18,150
SPEAKER 1: She offers something
on the order of 10

222
00:13:18,150 --> 00:13:21,960
billion, maybe 14 billion.

223
00:13:21,960 --> 00:13:26,130
But we'll say 10 billion to make
our calculation simple.

224
00:13:26,130 --> 00:13:31,630
That's 10 to the 10th years.

225
00:13:31,630 --> 00:13:38,300
If we add that up, 80, 90,
plus 16, that's 10 to the

226
00:13:38,300 --> 00:13:50,540
106th nanoseconds in the history
of the universe.

227
00:13:50,540 --> 00:13:52,580
Multiply it times the number
of atoms in the universe.

228
00:13:52,580 --> 00:13:56,900
So, if all of the atoms in the
universe were doing static

229
00:13:56,900 --> 00:14:00,740
evaluations at nanosecond speeds
since the beginning of

230
00:14:00,740 --> 00:14:06,640
the Big Bang, we'd still be 14
orders of magnitude short.

231
00:14:06,640 --> 00:14:08,120
So, it'd be a pretty
good cloud.

232
00:14:08,120 --> 00:14:11,395
It would have to harness
together lots of universes.

233
00:14:15,080 --> 00:14:16,660
So, the British Museum
algorithm is

234
00:14:16,660 --> 00:14:17,910
not going to work.
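
The back-of-the-envelope arithmetic above can be checked by adding exponents. This is only a sketch using the rough figures quoted in the lecture; the constant names are assumptions, and the factor of pi in the seconds-per-year figure is dropped, as it was on the board.

```python
import math

# Powers of ten quoted in the lecture; all are order-of-magnitude figures.
ATOMS_EXP = 80
SECONDS_PER_YEAR_EXP = 7    # pi * 10^7, with the factor of pi dropped
NS_PER_SECOND_EXP = 9
YEARS_EXP = 10
LEAF_NODES_EXP = 120        # Shannon's estimate of leaf nodes

# One static evaluation per atom per nanosecond since the Big Bang.
evaluations_exp = ATOMS_EXP + SECONDS_PER_YEAR_EXP + NS_PER_SECOND_EXP + YEARS_EXP
shortfall = LEAF_NODES_EXP - evaluations_exp

# Keeping the factor of pi shifts the result by about half an order of magnitude.
shortfall_with_pi = LEAF_NODES_EXP - (evaluations_exp + math.log10(math.pi))

print(evaluations_exp, shortfall)
```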

235
00:14:35,650 --> 00:14:37,700
No good.

236
00:14:37,700 --> 00:14:39,460
So, what we're going to have to
do is we're going to have

237
00:14:39,460 --> 00:14:43,090
to put some things together
and hope for the best.

238
00:14:43,090 --> 00:14:46,680
So, the fifth way is the way
we're actually going to do it.

239
00:14:46,680 --> 00:14:53,580
And what we're going to do is
we're going to look ahead, not

240
00:14:53,580 --> 00:14:55,410
just one level, but as
far as possible.

241
00:15:07,120 --> 00:15:11,460
We consider, not only the
situation that we've developed

242
00:15:11,460 --> 00:15:15,390
here, but we'll try to push that
out as far as we can and

243
00:15:15,390 --> 00:15:21,430
look at these static values of
the leaf nodes down here and

244
00:15:21,430 --> 00:15:24,970
somehow use that as a way
of playing the game.

245
00:15:24,970 --> 00:15:27,885
So, that is number five.

246
00:15:27,885 --> 00:15:30,830
And number four is going
all the way down there.

247
00:15:30,830 --> 00:15:34,850
And this, in the end, is
all that we can do.

248
00:15:34,850 --> 00:15:45,240
This idea was multiply invented,
most notably by Claude Shannon

249
00:15:45,240 --> 00:15:51,150
and also by Alan Turing, who,
I found out from a friend of

250
00:15:51,150 --> 00:15:56,460
mine, spent a lot of lunchtime
conversations talking with

251
00:15:56,460 --> 00:16:01,130
each other about how a computer
might play chess

252
00:16:01,130 --> 00:16:04,850
against the future when there
would be computers.

253
00:16:04,850 --> 00:16:08,300
So, Donald Michie and Alan
Turing also invented this over

254
00:16:08,300 --> 00:16:12,010
lunch while they were taking
some time off from cracking

255
00:16:12,010 --> 00:16:14,730
the German codes.

256
00:16:14,730 --> 00:16:17,710
Well, what is the method?

257
00:16:17,710 --> 00:16:20,290
I want to illustrate the method
with the simplest

258
00:16:20,290 --> 00:16:21,700
possible tree.

259
00:16:21,700 --> 00:16:24,600
So, we're going to have a
branching factor of 2 not 14.

260
00:16:24,600 --> 00:16:27,920
And we're going to have a
depth of 2 not something

261
00:16:27,920 --> 00:16:29,570
highly serious.

262
00:16:32,170 --> 00:16:34,360
Here's the game tree.

263
00:16:34,360 --> 00:16:35,510
And there are going
to be some numbers

264
00:16:35,510 --> 00:16:36,760
down here at the bottom.

265
00:16:39,430 --> 00:16:42,390
And these are going to be the
value of the board from the

266
00:16:42,390 --> 00:16:46,060
perspective of the player
at the top.

267
00:16:46,060 --> 00:16:48,210
Let us say that the player at
the top would like to drive

268
00:16:48,210 --> 00:16:52,330
the play as much as possible
toward the big numbers.

269
00:16:52,330 --> 00:16:54,750
So, we're going to call that
player the maximizing player.

270
00:16:58,440 --> 00:17:01,270
He would like to get over here
to the 8, because that's the

271
00:17:01,270 --> 00:17:02,940
biggest number.

272
00:17:02,940 --> 00:17:04,740
There's another player, his
opponent, which we'll call the

273
00:17:04,740 --> 00:17:06,440
minimizing player.

274
00:17:06,440 --> 00:17:10,108
And he's hoping that the play
will go down to the board

275
00:17:10,108 --> 00:17:11,950
situation that's as
small as possible.

276
00:17:11,950 --> 00:17:14,930
Because his view is the opposite
of the maximizing

277
00:17:14,930 --> 00:17:19,040
player, hence the
name minimax.

278
00:17:19,040 --> 00:17:20,770
But how does it work?

279
00:17:20,770 --> 00:17:24,520
Do you see which way the
play is going to go?

280
00:17:24,520 --> 00:17:27,990
How do you decide which way
the play is going to go?

281
00:17:27,990 --> 00:17:30,650
Well, it's not obvious
at a glance.

282
00:17:30,650 --> 00:17:33,230
Do you see which way
it's going to go?

283
00:17:33,230 --> 00:17:34,980
It's not obvious
at a glance.

284
00:17:34,980 --> 00:17:39,160
But if we do more than a glance,
if we look at the

285
00:17:39,160 --> 00:17:42,150
situation from the perspective
of the minimizing player here

286
00:17:42,150 --> 00:17:44,360
at the middle level, it's
pretty clear that if the

287
00:17:44,360 --> 00:17:48,570
minimizing player finds himself
in that situation,

288
00:17:48,570 --> 00:17:51,480
he's going to choose
to go that way.

289
00:17:51,480 --> 00:17:56,830
And so the value of this
situation, from the

290
00:17:56,830 --> 00:18:00,652
perspective of the minimizing
player, is 2.

291
00:18:00,652 --> 00:18:03,480
He'd never go over
there to the 7.

292
00:18:03,480 --> 00:18:07,200
Similarly, if the minimizing
player is over here with a

293
00:18:07,200 --> 00:18:09,700
choice between going toward
a 1 or toward an 8, he'll

294
00:18:09,700 --> 00:18:11,900
obviously go toward a 1.

295
00:18:11,900 --> 00:18:16,850
And so the value of that board
situation, from the

296
00:18:16,850 --> 00:18:20,340
perspective of the minimizing
player, is 1.

297
00:18:20,340 --> 00:18:22,550
Now, we've taken the scores down
here at the bottom of the

298
00:18:22,550 --> 00:18:25,710
tree, and we back them
up one level.

299
00:18:25,710 --> 00:18:28,840
And you see how we can
just keep doing this?

300
00:18:28,840 --> 00:18:32,160
Now the maximizing player can
see that if he goes to the

301
00:18:32,160 --> 00:18:34,605
left, he gets a score of 2.

302
00:18:34,605 --> 00:18:37,360
If he goes to the right, he
only gets a score of 1.

303
00:18:37,360 --> 00:18:39,800
So, he's going to
go to the left.

304
00:18:39,800 --> 00:18:42,980
So, overall, then, the
maximizing player is going to

305
00:18:42,980 --> 00:18:48,790
have a 2 as the perceived value
of that situation there

306
00:18:48,790 --> 00:18:50,740
at the top.

307
00:18:50,740 --> 00:18:51,790
That's the minimax algorithm.

308
00:18:51,790 --> 00:18:53,390
It's very simple.

309
00:18:53,390 --> 00:18:56,250
You go down to the bottom of the
tree, you compute static

310
00:18:56,250 --> 00:19:00,570
values, you back them up level
by level, and then you decide

311
00:19:00,570 --> 00:19:01,585
where to go.

312
00:19:01,585 --> 00:19:05,390
And in this particular
situation, the maximizer goes

313
00:19:05,390 --> 00:19:05,890
to the left.

314
00:19:05,890 --> 00:19:08,770
And the minimizer goes to the
left, too, so the play ends up

315
00:19:08,770 --> 00:19:13,680
here, far short of the 8 that
the maximizer wanted and less

316
00:19:13,680 --> 00:19:15,460
than the 1 that the
minimizer wanted.

317
00:19:15,460 --> 00:19:17,100
But this is an adversarial
game.

318
00:19:17,100 --> 00:19:18,230
You're competing with
each other.

319
00:19:18,230 --> 00:19:21,280
So, you don't expect to get
what you want, right?
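
A minimal sketch of the minimax procedure just walked through, run on the same tree (leaves 2, 7, 1, 8). Representing the tree as nested Python lists, with numbers as static values at the leaves, is an assumption made for illustration.

```python
def minimax(node, maximizing=True):
    """Back static values up the tree, alternating max and min by level."""
    if not isinstance(node, list):       # leaf: already a static value
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(children):
    """For the maximizer at the top: index of the best child and the backed-up scores."""
    scores = [minimax(child, maximizing=False) for child in children]
    return scores.index(max(scores)), scores


tree = [[2, 7], [1, 8]]          # branching factor 2, depth 2
value = minimax(tree)            # backed-up value at the top
move, scores = best_move(tree)   # move 0 is "go left"
```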

320
00:19:23,930 --> 00:19:25,660
So, maybe we ought to see if
we can make that work.

321
00:19:33,320 --> 00:19:34,100
There's a game tree.

322
00:19:34,100 --> 00:19:35,350
Do you see how it goes?

323
00:19:38,630 --> 00:19:42,730
Let's see if the system
can figure it out.

324
00:19:42,730 --> 00:19:46,350
There it goes, crawling its
way through the tree.

325
00:19:46,350 --> 00:19:49,310
This is a branching factor of
2, just like our sample, but

326
00:19:49,310 --> 00:19:51,540
now four levels.

327
00:19:51,540 --> 00:19:53,700
You can see that it's got quite
a lot of work to do.

328
00:19:53,700 --> 00:20:01,175
That's 2 to the fourth, one,
two, three, four, 2 to the

329
00:20:01,175 --> 00:20:06,790
fourth, 16 static evaluations
to do.

330
00:20:06,790 --> 00:20:07,850
So, it found the answer.

331
00:20:07,850 --> 00:20:09,120
But it's a lot of work.

332
00:20:09,120 --> 00:20:13,290
We could get a new tree and
restart it, maybe speed it up.

333
00:20:17,960 --> 00:20:22,310
There it goes down that
way, get a new tree.

334
00:20:22,310 --> 00:20:23,270
Those are just random numbers.

335
00:20:23,270 --> 00:20:25,360
So, each time it's going to find
a different path through

336
00:20:25,360 --> 00:20:30,330
the tree according to the
numbers that it's generated.

337
00:20:30,330 --> 00:20:32,070
Now, 16 isn't bad.

338
00:20:32,070 --> 00:20:34,620
But if you get down there around
10 levels deep and your

339
00:20:34,620 --> 00:20:36,850
branching factor is 14, well,
we know those numbers get

340
00:20:36,850 --> 00:20:39,290
pretty awful pretty bad, because
the number of static

341
00:20:39,290 --> 00:20:41,105
evaluations to do down
there at the bottom

342
00:20:41,105 --> 00:20:43,830
goes as b to the d.

343
00:20:43,830 --> 00:20:45,080
It's exponential.

344
00:20:47,260 --> 00:20:50,350
And time has shown, if you get
down about seven or eight

345
00:20:50,350 --> 00:20:51,845
levels, you're a jerk.

346
00:20:51,845 --> 00:20:54,450
And if you get down about 15
or 16 levels, you beat the

347
00:20:54,450 --> 00:20:55,900
world champion.

348
00:20:55,900 --> 00:20:58,630
So, you'd like to get as far
down in the tree as possible.

349
00:20:58,630 --> 00:21:02,480
Because when you get as far
down into the tree as

350
00:21:02,480 --> 00:21:04,510
possible, what happens is
that these crude

351
00:21:04,510 --> 00:21:09,720
measures of board quality
begin to clarify.

352
00:21:09,720 --> 00:21:11,910
And, in fact, when you get far
enough, the only thing that

353
00:21:11,910 --> 00:21:15,890
really counts is piece count,
one of those features.

354
00:21:15,890 --> 00:21:18,750
If you get far enough, piece
count and a few other things

355
00:21:18,750 --> 00:21:21,150
will give you a pretty good idea
of what to do if you get

356
00:21:21,150 --> 00:21:23,990
far enough.

357
00:21:23,990 --> 00:21:25,510
But getting far enough
can be a problem.

358
00:21:25,510 --> 00:21:27,400
So, we want to do everything
we can to

359
00:21:27,400 --> 00:21:28,935
get as far as possible.

360
00:21:28,935 --> 00:21:31,970
We want to pull out every trick
we can find to get as

361
00:21:31,970 --> 00:21:33,500
far as possible.

362
00:21:33,500 --> 00:21:38,450
Now, you remember when we talked
about branch and bound,

363
00:21:38,450 --> 00:21:39,955
we knew that there were some
things that we could do that

364
00:21:39,955 --> 00:21:43,330
would cut off whole portions
of the search tree.

365
00:21:43,330 --> 00:21:45,380
So, what we'd like to do is find
something analogous in

366
00:21:45,380 --> 00:21:48,270
this world of games, so we cut
off whole portions of this

367
00:21:48,270 --> 00:21:49,880
search tree, so we don't
have to look at

368
00:21:49,880 --> 00:21:52,180
those static values.

369
00:21:52,180 --> 00:21:55,330
What I want to do is I want to
come back and redo this thing.

370
00:21:55,330 --> 00:21:56,780
But this time, I'm going
to compute the static

371
00:21:56,780 --> 00:21:59,030
values one at a time.

372
00:21:59,030 --> 00:22:03,190
I've got the same structure
in the tree.

373
00:22:03,190 --> 00:22:06,110
And just as before, I'm going to
assume that the top player

374
00:22:06,110 --> 00:22:08,490
wants to go toward the maximum
values, and the next player

375
00:22:08,490 --> 00:22:10,380
wants to go toward the
minimum values.

376
00:22:10,380 --> 00:22:13,950
But none of the static values
have been computed yet.

377
00:22:13,950 --> 00:22:16,770
So, I better start
computing them.

378
00:22:16,770 --> 00:22:19,226
That's the first
one I find, 2.

379
00:22:19,226 --> 00:22:21,840
Now, as soon as I see that 2, as
soon as the minimizer sees

380
00:22:21,840 --> 00:22:25,580
that 2, the minimizer knows that
the value of this node

381
00:22:25,580 --> 00:22:27,390
can't be any greater than 2.

382
00:22:27,390 --> 00:22:30,020
Because he'll always choose to
go down this way if this

383
00:22:30,020 --> 00:22:32,390
branch produces a
bigger number.

384
00:22:32,390 --> 00:22:35,910
So, we can say that the
minimizer is assured already

385
00:22:35,910 --> 00:22:40,580
that the score there will be
equal to or less than 2.

386
00:22:40,580 --> 00:22:43,580
Now, we go over and compute
the next number.

387
00:22:43,580 --> 00:22:44,980
There's a 7.

388
00:22:44,980 --> 00:22:46,850
Now, I know this is exactly
equal to 2, because he'll

389
00:22:46,850 --> 00:22:49,570
never go down toward a 7.

390
00:22:49,570 --> 00:22:52,420
As soon as the minimizer says
equal to 2, the maximizer

391
00:22:52,420 --> 00:22:55,390
says, OK, I can do equal
to or greater than 2.

392
00:23:00,560 --> 00:23:06,010
One, minimizer says equal
to or less than 1.

393
00:23:06,010 --> 00:23:08,142
Now what?

394
00:23:08,142 --> 00:23:12,275
Did you compare those
two numbers?

395
00:23:12,275 --> 00:23:16,360
The maximizer knows that if he
goes down here, he can't do

396
00:23:16,360 --> 00:23:17,990
better than 1.

397
00:23:17,990 --> 00:23:23,510
He already knows if he goes
over here, he can get a 2.

398
00:23:23,510 --> 00:23:27,850
It's as if this branch
doesn't even exist.

399
00:23:27,850 --> 00:23:31,840
Because the maximizer would
never choose to go down there.

400
00:23:31,840 --> 00:23:33,160
So, you have to see that.

401
00:23:33,160 --> 00:23:38,330
This is the important essence
of the notion of the alpha-beta

402
00:23:38,330 --> 00:23:41,630
algorithm, which is a layering
on top of minimax that cuts

403
00:23:41,630 --> 00:23:44,870
off large sections of
the search tree.

404
00:23:44,870 --> 00:23:47,420
So, one more time.

405
00:23:47,420 --> 00:23:49,620
We've developed a situation so
we know that the maximizer

406
00:23:49,620 --> 00:23:54,720
gets a 2 going down to the left,
and he sees that if he

407
00:23:54,720 --> 00:23:58,000
goes down to the right, he
can't do better than 1.

408
00:23:58,000 --> 00:24:01,420
So, he says to himself, it's
as if that branch doesn't

409
00:24:01,420 --> 00:24:05,230
exist and the overall
score is 2.

410
00:24:05,230 --> 00:24:08,980
And it doesn't matter what
that static value is.

411
00:24:08,980 --> 00:24:13,350
It can be 8, as it was,
it can be plus 1,000.

412
00:24:13,350 --> 00:24:14,015
It doesn't matter.

413
00:24:14,015 --> 00:24:16,040
It can be minus 1,000.

414
00:24:16,040 --> 00:24:19,420
Or it could be plus infinity
or minus infinity.

415
00:24:19,420 --> 00:24:23,620
It doesn't matter, because
the maximizer will always

416
00:24:23,620 --> 00:24:26,470
go the other way.

417
00:24:26,470 --> 00:24:29,270
So, that's the alpha-beta
algorithm.

418
00:24:29,270 --> 00:24:32,300
Can you guess why it's called
the alpha-beta algorithm?

419
00:24:32,300 --> 00:24:34,210
Well, because in the algorithm
there are two parameters,

420
00:24:34,210 --> 00:24:37,080
alpha and beta.

421
00:24:37,080 --> 00:24:38,750
So, it's important to understand
that alpha-beta is

422
00:24:38,750 --> 00:24:41,810
not an alternative to minimax.

423
00:24:41,810 --> 00:24:44,230
It's minimax with a flourish.

424
00:24:44,230 --> 00:24:47,610
It's something layered on top
like we layered things on top

425
00:24:47,610 --> 00:24:49,605
of branch and bound to make
it more efficient.

426
00:24:49,605 --> 00:24:52,290
We layer stuff on top
of minimax to

427
00:24:52,290 --> 00:24:55,300
make it more efficient.
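A minimal Python sketch of the cut-off rule just described. The lecture shows no code, so the function name `alphabeta`, the nested-list tree encoding, and the `evaluated` list are all assumptions made for illustration. The small tree follows the easy example: 2 on the left, then 1 and the never-examined 8 on the right; the second left-hand leaf (7) is a placeholder, since this part of the transcript does not state it.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, evaluated=None):
    """Return the minimax value of `node`, skipping branches that cannot matter.

    A node is either a number (a static value) or a list of child nodes.
    `evaluated` optionally collects every static value actually computed.
    """
    if not isinstance(node, list):
        if evaluated is not None:
            evaluated.append(node)
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, evaluated)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)   # maximizer is guaranteed at least alpha
        else:
            best = min(best, value)
            beta = min(beta, best)     # minimizer is guaranteed at most beta
        if alpha >= beta:
            break                      # the remaining children might as well not exist
    return best

# Hypothetical encoding of the easy example; the 7 is an assumed placeholder.
simple_tree = [[2, 7], [1, 8]]
seen = []
score = alphabeta(simple_tree, True, evaluated=seen)
print(score, seen)  # 2 [2, 7, 1]
```

The 8 never shows up in `seen`, which is the point: whatever that static value is, the maximizer goes the other way.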

428
00:24:55,300 --> 00:24:57,250
And you say to me, well, that's
a pretty easy example.

429
00:24:57,250 --> 00:24:57,700
And it is.

430
00:24:57,700 --> 00:24:59,810
So, let's try a little
bit more complex one.

431
00:25:07,330 --> 00:25:09,550
This is just to see if I can
do it without screwing up.

432
00:25:12,220 --> 00:25:15,320
The reason I do one that's
complex is not just to show

433
00:25:15,320 --> 00:25:17,640
how tough I am in front
of a large audience.

434
00:25:17,640 --> 00:25:20,450
But, rather, there's certain
points of interest that only

435
00:25:20,450 --> 00:25:24,030
occur in a tree of depth
four or greater.

436
00:25:24,030 --> 00:25:26,010
That's the reason for
this example.

437
00:25:26,010 --> 00:25:28,120
But work with me and let's
see if we can work

438
00:25:28,120 --> 00:25:29,670
our way through it.

439
00:25:29,670 --> 00:25:34,810
What I'm going to do is I'll
circle the numbers that we

440
00:25:34,810 --> 00:25:36,790
actually have to compute.

441
00:25:36,790 --> 00:25:39,480
So, we actually have
to compute 8.

442
00:25:39,480 --> 00:25:42,430
As soon as we do that, the
minimizer knows that that node

443
00:25:42,430 --> 00:25:44,450
is going to have a score of
equal to or less than 8

444
00:25:44,450 --> 00:25:46,960
without looking at
anything else.

445
00:25:46,960 --> 00:25:50,020
Then, he looks at 7.

446
00:25:50,020 --> 00:25:51,516
So, that's equal to 7.

447
00:25:51,516 --> 00:25:54,910
Because the minimizer will
clearly go to the right.

448
00:25:54,910 --> 00:25:57,330
As soon as that is determined,
then the maximizer knows that

449
00:25:57,330 --> 00:26:00,580
the score here is equal
to or greater than 8.

450
00:26:00,580 --> 00:26:03,680
Now, we evaluate the 3.

451
00:26:03,680 --> 00:26:06,418
The minimizer knows equal
to or less than 3.

452
00:26:06,418 --> 00:26:09,286
SPEAKER 4: [INAUDIBLE].

453
00:26:09,286 --> 00:26:14,920
SPEAKER 1: Oh, sorry, the
minimizer at 7, yeah.

454
00:26:14,920 --> 00:26:17,930
OK, now what happens?

455
00:26:17,930 --> 00:26:20,240
Well, let's see, the maximizer
gets a 7 going that way.

456
00:26:20,240 --> 00:26:22,180
He can't do better than 3 going
that way, so we got

457
00:26:22,180 --> 00:26:24,980
another one of these
cut off situations.

458
00:26:24,980 --> 00:26:28,820
It's as if this branch
doesn't even exist.

459
00:26:28,820 --> 00:26:32,860
So, this static evaluation
need not be made.

460
00:26:32,860 --> 00:26:35,670
And now we know that that's not
merely equal to or greater

461
00:26:35,670 --> 00:26:37,850
than 7, but exactly
equal to 7.

462
00:26:37,850 --> 00:26:40,530
And we can push that
number back up.

463
00:26:40,530 --> 00:26:43,900
That becomes equal to
or less than 7.

464
00:26:43,900 --> 00:26:46,360
OK, are you with me so far?

465
00:26:46,360 --> 00:26:47,620
Let's get over to the other
side of the tree

466
00:26:47,620 --> 00:26:49,300
as quickly as possible.

467
00:26:49,300 --> 00:26:55,410
So, there's a 9, equal to or
less than 9, 8 equal to 8,

468
00:26:55,410 --> 00:27:00,820
push the 8 up equal
or greater than 8.

469
00:27:03,360 --> 00:27:06,740
The minimizer can go down
this way and get a 7.

470
00:27:06,740 --> 00:27:09,020
He'll certainly never go
that way where the

471
00:27:09,020 --> 00:27:11,780
maximizer can get an 8.

472
00:27:11,780 --> 00:27:13,706
Once again, we've
got a cut off.

473
00:27:13,706 --> 00:27:17,900
And if this branch didn't exist,
then that means that

474
00:27:17,900 --> 00:27:21,020
these static evaluations
don't have to be made.

475
00:27:21,020 --> 00:27:25,150
And this value is
now exactly 7.

476
00:27:25,150 --> 00:27:27,340
But there's one more
thing to note here.

477
00:27:27,340 --> 00:27:29,510
And that is that not only do
we not have to make these

478
00:27:29,510 --> 00:27:32,830
static evaluations down here,
but we don't even have to

479
00:27:32,830 --> 00:27:35,040
generate these moves.

480
00:27:35,040 --> 00:27:38,210
So, we save two ways, both on
static evaluation and on move

481
00:27:38,210 --> 00:27:40,390
generation.

482
00:27:40,390 --> 00:27:42,770
This is a real winner, this
alpha-beta thing, because it

483
00:27:42,770 --> 00:27:44,285
saves an enormous amount
of computation.

484
00:27:47,130 --> 00:27:47,930
Well, we're on the way now.

485
00:27:47,930 --> 00:27:50,470
The maximizer up here is
guaranteed equal to or

486
00:27:50,470 --> 00:27:51,220
greater than 7.

487
00:27:51,220 --> 00:27:53,990
Has anyone found the winning
move yet?

488
00:27:53,990 --> 00:27:56,050
Is it to the left?

489
00:27:56,050 --> 00:27:59,240
I know that we better keep
going, because we don't want to

490
00:27:59,240 --> 00:28:00,490
trust any oracles.

491
00:28:04,150 --> 00:28:05,090
So, let's see.

492
00:28:05,090 --> 00:28:05,780
There's a 1.

493
00:28:05,780 --> 00:28:06,700
We've calculated that.

494
00:28:06,700 --> 00:28:08,950
The minimizer can be guaranteed
equal to or less

495
00:28:08,950 --> 00:28:11,050
than 1 at that particular
point.

496
00:28:15,130 --> 00:28:17,040
Think about that for a while.

497
00:28:17,040 --> 00:28:19,470
At the top, the maximizer
knows he can go

498
00:28:19,470 --> 00:28:23,161
left and get a 7.

499
00:28:23,161 --> 00:28:28,610
The minimizer, if the play ever
gets here, can ensure

500
00:28:28,610 --> 00:28:30,860
that he's going to drive the
situation to a board

501
00:28:30,860 --> 00:28:33,240
number that's 1.

502
00:28:33,240 --> 00:28:35,150
So, the question is will
the maximizer ever

503
00:28:35,150 --> 00:28:37,080
permit that to happen?

504
00:28:37,080 --> 00:28:39,920
And the answer is surely not.

505
00:28:39,920 --> 00:28:42,090
So, over here in the development
of this side of

506
00:28:42,090 --> 00:28:44,870
the tree, we're always comparing
numbers at adjacent

507
00:28:44,870 --> 00:28:46,530
levels in the tree.

508
00:28:46,530 --> 00:28:48,780
But here's a situation where
we're comparing numbers that

509
00:28:48,780 --> 00:28:51,210
are separated from each
other in the tree.

510
00:28:51,210 --> 00:28:54,430
And we still concluded that no
further examination of this

511
00:28:54,430 --> 00:28:56,870
node makes any sense at all.

512
00:28:56,870 --> 00:28:58,120
This is called deep cut off.

513
00:29:05,590 --> 00:29:08,810
And that means that this whole
branch here might as well not

514
00:29:08,810 --> 00:29:14,150
exist, and we won't have to
compute that static value.

515
00:29:14,150 --> 00:29:15,530
All right?

516
00:29:15,530 --> 00:29:17,950
So, it looks--

517
00:29:17,950 --> 00:29:20,250
you have this stare of
disbelief, which

518
00:29:20,250 --> 00:29:21,660
is perfectly normal.

519
00:29:21,660 --> 00:29:23,510
I have to reconvince myself
every time that

520
00:29:23,510 --> 00:29:24,915
this actually works.

521
00:29:24,915 --> 00:29:28,170
But when you think your way
through it, it is clear that

522
00:29:28,170 --> 00:29:30,660
these computations that
I've x-ed out

523
00:29:30,660 --> 00:29:32,120
don't have to be made.

524
00:29:32,120 --> 00:29:34,510
So, let's carry on and see if we
can complete this equal to

525
00:29:34,510 --> 00:29:39,670
or less than 8, equal
to 8, equal to 8--

526
00:29:39,670 --> 00:29:42,360
because the other branch
doesn't even exist--

527
00:29:42,360 --> 00:29:46,760
equal to or less than 8.

528
00:29:46,760 --> 00:29:50,700
And we compare these two
numbers, do we keep going?

529
00:29:50,700 --> 00:29:52,020
Yes, we keep going.

530
00:29:52,020 --> 00:29:54,010
Because maybe the maximizer
can go to the right and

531
00:29:54,010 --> 00:29:56,870
actually get to that 8.

532
00:29:56,870 --> 00:29:59,990
So, we have to go over here
and keep working away.

533
00:29:59,990 --> 00:30:02,600
There's a 9, equal
to or less than 9,

534
00:30:02,600 --> 00:30:04,790
another 9 equal to 9.

535
00:30:04,790 --> 00:30:08,620
Push that number up equal
to or greater than 9.

536
00:30:11,360 --> 00:30:14,322
The minimizer gets an
8 going this way.

537
00:30:14,322 --> 00:30:16,840
The maximizer is assured of
getting a 9 going that way.

538
00:30:16,840 --> 00:30:18,860
So, once again, we've got
a cut off situation.

539
00:30:18,860 --> 00:30:21,392
It's as if this doesn't exist.

540
00:30:21,392 --> 00:30:24,540
Those static evaluations
are not made.

541
00:30:24,540 --> 00:30:28,000
This move generation is not made
and computation is saved.

542
00:30:32,010 --> 00:30:36,200
So, let's see if we can do
better on this very example

543
00:30:36,200 --> 00:30:38,342
using this alpha-beta idea.

544
00:30:38,342 --> 00:30:42,150
I'll slow it down a little bit
and change the search type to

545
00:30:42,150 --> 00:30:45,110
minimax with alpha-beta.

546
00:30:45,110 --> 00:30:47,540
We see two numbers on each of
those nodes now, guess what

547
00:30:47,540 --> 00:30:48,220
they're called.

548
00:30:48,220 --> 00:30:49,070
We already know.

549
00:30:49,070 --> 00:30:50,430
They're alpha and beta.

550
00:30:50,430 --> 00:30:53,270
So, what's going to happen is, as
the algorithm proceeds through

551
00:30:53,270 --> 00:30:55,710
the tree, those numbers are
going to shrink-wrap

552
00:30:55,710 --> 00:30:58,210
themselves around
the situation.

553
00:30:58,210 --> 00:30:59,460
So, we'll start that up.

554
00:31:04,770 --> 00:31:08,030
Two static evaluations
were not made.

555
00:31:08,030 --> 00:31:09,280
Let's try a new tree.

556
00:31:14,240 --> 00:31:16,496
Two different ones
were not made.

557
00:31:16,496 --> 00:31:25,300
A new tree, still again, two
different ones not made.

558
00:31:25,300 --> 00:31:29,180
Let's see what happens when we
use the classroom example, the

559
00:31:29,180 --> 00:31:29,960
one I did up there.

560
00:31:29,960 --> 00:31:32,900
Let's make sure that I
didn't screw it up.

561
00:31:32,900 --> 00:31:34,480
I'll slow that down to 1.

562
00:31:45,150 --> 00:31:48,280
2, same answer.

563
00:31:48,280 --> 00:31:50,380
So, you probably didn't realize
it at the start.

564
00:31:50,380 --> 00:31:51,530
Who could?

565
00:31:51,530 --> 00:31:56,040
In fact, the play goes down that
way, over this way, down

566
00:31:56,040 --> 00:31:59,710
that way, and ultimately to
the 8, which is not the

567
00:31:59,710 --> 00:32:00,390
biggest number.

568
00:32:00,390 --> 00:32:01,460
And it's not the smallest
number.

569
00:32:01,460 --> 00:32:03,866
It's the compromise number
that's arrived at by virtue of

570
00:32:03,866 --> 00:32:07,980
the fact that this is an
adversarial situation.

571
00:32:07,980 --> 00:32:12,120
So, you say to me, how much
energy, how much work do you

572
00:32:12,120 --> 00:32:14,820
actually save by doing this?

573
00:32:14,820 --> 00:32:34,440
Well, it is the case that in
the optimal situation, if

574
00:32:34,440 --> 00:32:37,615
everything is ordered right,
if God has come down and

575
00:32:37,615 --> 00:32:41,110
arranged your tree in just
the right way, then the

576
00:32:41,110 --> 00:32:44,980
approximate amount of work you
need to do, the approximate

577
00:32:44,980 --> 00:32:48,340
number of static evaluations
performed, is approximately

578
00:32:48,340 --> 00:32:54,610
equal to 2 times b
to the d over 2.

579
00:32:54,610 --> 00:32:55,870
We don't care about this 2.

580
00:32:55,870 --> 00:32:59,220
We care a whole lot
about that 2.

581
00:32:59,220 --> 00:33:01,760
That's the amount of
work that's done.

582
00:33:01,760 --> 00:33:06,050
It's b to the d over 2,
instead of b to the d.

583
00:33:06,050 --> 00:33:07,000
What's that mean?

584
00:33:07,000 --> 00:33:09,500
Suppose that without
this idea, I can

585
00:33:09,500 --> 00:33:12,080
go down seven levels.

586
00:33:12,080 --> 00:33:15,280
How far can I go down
with this idea?

587
00:33:15,280 --> 00:33:17,940
14 levels.

588
00:33:17,940 --> 00:33:18,910
So, it's the difference
between a

589
00:33:18,910 --> 00:33:21,340
jerk and a world champion.
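Written out, the spoken estimate looks roughly like this. The symbol s for the number of static evaluations and d' for the new reachable depth are notation assumed here; b is the branching factor and d the depth, as in the lecture.

```latex
\begin{align*}
s_{\text{minimax}} &= b^{d} \\
s_{\text{alpha-beta, best case}} &\approx 2\,b^{d/2} \\
2\,b^{d'/2} \approx b^{7} &\;\Longrightarrow\; d' \approx 14
  \quad \text{(ignoring the leading factor of 2)}
\end{align*}
```

The 2 in front is a constant and barely matters; the 2 in the exponent is what doubles the depth for the same budget.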

590
00:33:21,340 --> 00:33:24,880
So, that, however, is only in
the optimal case when God has

591
00:33:24,880 --> 00:33:26,710
arranged things just right.

592
00:33:26,710 --> 00:33:29,750
But in practical situations,
practical game situations, it

593
00:33:29,750 --> 00:33:32,560
appears to be the case,
experimentally, that the

594
00:33:32,560 --> 00:33:36,170
actual number is close to this
approximation for optimal

595
00:33:36,170 --> 00:33:37,760
arrangements.

596
00:33:37,760 --> 00:33:40,462
So, you'd never not want
to use alpha-beta.

597
00:33:40,462 --> 00:33:43,870
It saves an amazing
amount of time.

598
00:33:43,870 --> 00:33:46,700
You could look at
it another way.

599
00:33:46,700 --> 00:33:50,990
Suppose you go down the same
number of levels, how much

600
00:33:50,990 --> 00:33:52,240
less work do you have to do?

601
00:33:55,070 --> 00:33:55,760
Well, quite a bit.

602
00:33:55,760 --> 00:33:59,050
The square root [INAUDIBLE],
right?

603
00:33:59,050 --> 00:34:02,720
That's another way of looking
at how it works.

604
00:34:02,720 --> 00:34:06,710
So, we could go home at this
point except for one problem,

605
00:34:06,710 --> 00:34:11,469
and that is that we pretended
that the branching factor is

606
00:34:11,469 --> 00:34:13,560
always the same.

607
00:34:13,560 --> 00:34:17,909
But, in fact, the branching
factor will vary with the game

608
00:34:17,909 --> 00:34:21,510
state and will vary
with the game.

609
00:34:21,510 --> 00:34:23,989
So, you can calculate how much
computing you can do in two

610
00:34:23,989 --> 00:34:27,223
minutes, or however much time
you have for an average move.

611
00:34:27,223 --> 00:34:30,520
And then you could say,
how deep can I go?

612
00:34:30,520 --> 00:34:32,760
And you won't know for
sure, because it

613
00:34:32,760 --> 00:34:35,210
depends on the game.

614
00:34:35,210 --> 00:34:39,320
So, in the earlier days of
game-playing programs, the

615
00:34:39,320 --> 00:34:41,750
game-playing program left a
lot of computation on the

616
00:34:41,750 --> 00:34:45,670
table, because it would make a
decision in three seconds.

617
00:34:45,670 --> 00:34:49,170
And it might have made a much
different move if it used all

618
00:34:49,170 --> 00:34:51,520
the computation it
had available.

619
00:34:51,520 --> 00:34:54,969
Alternatively, it might be
grinding away, and after two

620
00:34:54,969 --> 00:34:56,880
minutes were consumed,

621
00:34:56,880 --> 00:35:00,410
it had no move and just
did something random.

622
00:35:02,920 --> 00:35:05,020
That's not very good.

623
00:35:05,020 --> 00:35:06,850
But that's what the early
game-playing programs did,

624
00:35:06,850 --> 00:35:11,980
because no one knew how
deep they could go.

625
00:35:11,980 --> 00:35:16,910
So, let's have a look at the
situation here and say, well,

626
00:35:16,910 --> 00:35:18,670
here's a game tree.

627
00:35:18,670 --> 00:35:20,290
It's a binary game tree.

628
00:35:20,290 --> 00:35:22,120
That's level 0.

629
00:35:22,120 --> 00:35:23,890
That's level 1.

630
00:35:23,890 --> 00:35:26,600
This is level d minus 1.

631
00:35:26,600 --> 00:35:28,610
And this is level d.

632
00:35:28,610 --> 00:35:32,050
So, down here you
have a situation

633
00:35:32,050 --> 00:35:33,380
that looks like this.

634
00:35:33,380 --> 00:35:37,050
And I left all the game
tree out in between.

635
00:35:37,050 --> 00:35:40,940
So, how many leaf nodes
are there down here?

636
00:35:40,940 --> 00:35:42,110
b to the d, right?

637
00:35:42,110 --> 00:35:45,280
Oh, I'm going to forget about
alpha-beta for a moment.

638
00:35:45,280 --> 00:35:47,760
As we did when we looked at
some of those optimal

639
00:35:47,760 --> 00:35:50,540
searches, we're going to add
these things one at a time.

640
00:35:50,540 --> 00:35:52,550
So, forget about alpha-beta,
assume we're just doing

641
00:35:52,550 --> 00:35:54,290
straight minimax.

642
00:35:54,290 --> 00:35:56,970
In that case, we would have to
calculate all the static

643
00:35:56,970 --> 00:35:58,610
values down here
at the bottom.

644
00:35:58,610 --> 00:36:03,160
And there are b to the d of those.

645
00:36:03,160 --> 00:36:06,760
How many are there at
this next level up?

646
00:36:06,760 --> 00:36:11,720
Well, that must be b
to the d minus 1.

647
00:36:11,720 --> 00:36:14,650
How many fewer nodes are there
at the second to the last, the

648
00:36:14,650 --> 00:36:19,390
penultimate level, relative
to the final level?

649
00:36:19,390 --> 00:36:23,010
Well, 1 over b, right?

650
00:36:23,010 --> 00:36:26,750
So, if I'm concerned about not
getting all the way through

651
00:36:26,750 --> 00:36:31,070
these calculations at the d
level, I can give myself an

652
00:36:31,070 --> 00:36:34,320
insurance policy by calculating
out what the

653
00:36:34,320 --> 00:36:40,590
answer would be if I only went
down to the d minus 1th level.

654
00:36:40,590 --> 00:36:43,540
Do you get that insurance
policy?

655
00:36:43,540 --> 00:36:46,510
Let's say the branching factor
is 10, how much does that

656
00:36:46,510 --> 00:36:48,920
insurance policy cost me?

657
00:36:48,920 --> 00:36:51,160
10% of my computation.

658
00:36:51,160 --> 00:36:53,690
Because I can do this
calculation and have a move in

659
00:36:53,690 --> 00:36:59,580
hand here at level d minus 1 for
only 1/10 of the amount of

660
00:36:59,580 --> 00:37:01,730
the computation that's required
to figure out what I

661
00:37:01,730 --> 00:37:06,000
would do if I go all the way
down to the base level.

662
00:37:06,000 --> 00:37:08,460
OK, is that clear?

663
00:37:08,460 --> 00:37:13,160
So this idea is extremely
important in its general form.

664
00:37:13,160 --> 00:37:16,600
But we haven't quite got there
yet, because what if the

665
00:37:16,600 --> 00:37:19,070
branching factor turns out to be
really big and we can't get

666
00:37:19,070 --> 00:37:22,130
through this level either?

667
00:37:22,130 --> 00:37:23,860
What should we do to
make sure that we

668
00:37:23,860 --> 00:37:26,215
still have a good move?

669
00:37:26,215 --> 00:37:27,610
SPEAKER 5: [INAUDIBLE].

670
00:37:27,610 --> 00:37:32,850
SPEAKER 1: Right, we can do
it at the d minus 2 level.

671
00:37:32,850 --> 00:37:37,120
So, that would be up here.

672
00:37:37,120 --> 00:37:40,806
And at that level, the amount
of computation would be b to

673
00:37:40,806 --> 00:37:42,056
the d minus 2.

674
00:37:44,800 --> 00:37:51,240
So, now we've added 10%
plus 10% of that.

675
00:37:51,240 --> 00:37:56,270
And our knee jerk is beginning
to form, right?

676
00:37:56,270 --> 00:37:58,180
What are we going to do in the
end to make sure that no

677
00:37:58,180 --> 00:38:00,458
matter what we've got a move?

678
00:38:00,458 --> 00:38:02,095
CHRISTOPHER: Start from
the very first--

679
00:38:02,095 --> 00:38:03,280
SPEAKER 1: Correct, what's
that, Christopher?

680
00:38:03,280 --> 00:38:04,250
CHRISTOPHER: Start from
the very first level?

681
00:38:04,250 --> 00:38:06,515
SPEAKER 1: Start from the very
first level and give our self

682
00:38:06,515 --> 00:38:11,330
an insurance policy for every
level we try to calculate.

683
00:38:11,330 --> 00:38:13,780
But that might be real costly.

684
00:38:13,780 --> 00:38:15,910
So, we better figure out if this
is going to be too big of

685
00:38:15,910 --> 00:38:18,220
an expense to bear.

686
00:38:18,220 --> 00:38:22,330
So, let's see, if we do what
Christopher suggests, then the

687
00:38:22,330 --> 00:38:25,860
amount of computation we need
in our insurance policy is

688
00:38:25,860 --> 00:38:28,460
going to be equal to 1--

689
00:38:28,460 --> 00:38:30,900
we're going to do it up here at
this level, 2, even though

690
00:38:30,900 --> 00:38:33,560
we don't need it, just to make
everything work out easy.

691
00:38:33,560 --> 00:38:37,720
1 plus b, that's getting our
insurance policy down here at

692
00:38:37,720 --> 00:38:39,600
this first level.

693
00:38:39,600 --> 00:38:44,460
And we're going to add b squared
all the way down to b

694
00:38:44,460 --> 00:38:46,820
to the d minus 1.

695
00:38:46,820 --> 00:38:49,280
That's how much we're going to
spend getting an insurance

696
00:38:49,280 --> 00:38:50,590
policy at every level.

697
00:38:54,020 --> 00:38:58,390
I wish I had some of that high
school algebra, right?

698
00:38:58,390 --> 00:39:01,812
Let's just do it for fun.

699
00:39:01,812 --> 00:39:04,660
Oh, unfortunate choice
of variable names.

700
00:39:04,660 --> 00:39:08,225
bs is equal to--

701
00:39:08,225 --> 00:39:10,165
oh, we're going to multiply
all those by b.

702
00:39:17,530 --> 00:39:29,520
Now, we'll subtract the first
one from the second one, which

703
00:39:29,520 --> 00:39:33,330
tells us that the amount of
calculation needed for our

704
00:39:33,330 --> 00:39:39,720
insurance policy is equal
to b to the d minus 1

705
00:39:39,720 --> 00:39:42,070
over b minus 1.

706
00:39:46,450 --> 00:39:49,580
Is that a big number?

707
00:39:49,580 --> 00:39:53,562
We could do a little algebra on
that and say that b to the

708
00:39:53,562 --> 00:39:54,430
d is a huge number.

709
00:39:54,430 --> 00:39:57,240
So, that minus one
doesn't count.

710
00:39:57,240 --> 00:39:59,620
And b is probably 10 to 15.

711
00:39:59,620 --> 00:40:03,830
So, b minus 1 is, essentially,
equal to b.

712
00:40:03,830 --> 00:40:08,441
So, that's approximately equal to
b to the d minus 1.

713
00:40:11,150 --> 00:40:15,340
So, with an approximation
factored in, the amount of

714
00:40:15,340 --> 00:40:17,140
computation needed to do
insurance policies at every

715
00:40:17,140 --> 00:40:19,870
level is not much different from
the amount of computation

716
00:40:19,870 --> 00:40:22,770
needed to get an insurance
policy at just one level, the

717
00:40:22,770 --> 00:40:24,910
penultimate one.
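The board algebra is only spoken in the transcript, so here is a reconstruction of it. S is the total cost of the insurance policies at levels 0 through d minus 1 (the lecture calls it s); everything else follows the spoken steps.

```latex
\begin{align*}
S   &= 1 + b + b^{2} + \cdots + b^{d-1} \\
bS  &= b + b^{2} + \cdots + b^{d-1} + b^{d} \\
bS - S &= b^{d} - 1 \\
S   &= \frac{b^{d} - 1}{b - 1} \;\approx\; \frac{b^{d}}{b} \;=\; b^{d-1}
\end{align*}
```

For b = 10, that puts the cost of all the shallower searches together at roughly a tenth of the cost of the final level, about the same as the penultimate level alone.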

718
00:40:24,910 --> 00:40:27,550
So, this idea is called
progressive deepening.

719
00:40:40,610 --> 00:40:43,610
And now we can visit our gold
star idea list and see how

720
00:40:43,610 --> 00:40:46,170
these things match
up with that.

721
00:40:46,170 --> 00:40:50,050
First of all, the dead horse
principle comes to the fore

722
00:40:50,050 --> 00:40:51,530
when we talk about alpha-beta.

723
00:40:51,530 --> 00:40:53,570
Because we know with alpha-beta
that we can get rid

724
00:40:53,570 --> 00:40:56,705
of a whole lot of the tree and
not do static evaluation, not

725
00:40:56,705 --> 00:40:59,000
even do move generation.

726
00:40:59,000 --> 00:41:01,120
That's the dead horse we
don't want to beat.

727
00:41:01,120 --> 00:41:03,530
There's no point in doing that
calculation, because it can't

728
00:41:03,530 --> 00:41:06,250
figure into the answer.

729
00:41:06,250 --> 00:41:12,830
The development of the
progressive deepening idea, I

730
00:41:12,830 --> 00:41:14,860
like to think of in terms of
the martial arts principle,

731
00:41:14,860 --> 00:41:17,600
we're using the enemy's
characteristics against them.

732
00:41:17,600 --> 00:41:20,690
Because of this exponential
blow-up, we have exactly the

733
00:41:20,690 --> 00:41:23,610
right characteristics to have
a move available at every

734
00:41:23,610 --> 00:41:26,260
level as an insurance policy
against not getting through to

735
00:41:26,260 --> 00:41:28,360
the next level.

736
00:41:28,360 --> 00:41:31,910
And, finally, this whole idea
of progressive deepening can

737
00:41:31,910 --> 00:41:34,440
be viewed as a prime example
of what we like to call

738
00:41:34,440 --> 00:41:37,670
anytime algorithms that always
have an answer ready to go as

739
00:41:37,670 --> 00:41:39,690
soon as an answer is demanded.

740
00:41:39,690 --> 00:41:43,400
So, as soon as that clock runs
out at two minutes, some

741
00:41:43,400 --> 00:41:44,250
answer is available.

742
00:41:44,250 --> 00:41:47,460
It'll be the best one that the
system can compute in the time

743
00:41:47,460 --> 00:41:49,480
available given the
characteristics of the game

744
00:41:49,480 --> 00:41:51,930
tree as it's developed so far.

745
00:41:51,930 --> 00:41:53,780
So, there are other kinds
of anytime algorithms.

746
00:41:53,780 --> 00:41:56,500
This is an example of one.

747
00:41:56,500 --> 00:42:01,500
That's how all game playing
programs work, minimax, plus

748
00:42:01,500 --> 00:42:04,290
alpha-beta, plus progressive
deepening.
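A minimal sketch of progressive deepening as an anytime algorithm, in Python. Everything concrete here is an assumption for illustration: the toy game (a pile of stones, take 1 to 3, whoever takes the last stone wins), the static value of 0 for unfinished positions, and the names `progressive_deepening`, `minimax`, and `OutOfTime`. It uses straight minimax, with no alpha-beta, to keep the deepening loop in view.

```python
import time

class OutOfTime(Exception):
    """Raised inside the search when the clock has run out."""

def moves(pile):
    # Legal moves in the toy game: take 1, 2, or 3 stones.
    return [take for take in (1, 2, 3) if take <= pile]

def minimax(pile, depth, maximizing, deadline):
    if time.monotonic() > deadline:
        raise OutOfTime
    if pile == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    if depth == 0:
        return 0  # crude static value: outcome not yet known
    values = [minimax(pile - take, depth - 1, not maximizing, deadline)
              for take in moves(pile)]
    return max(values) if maximizing else min(values)

def progressive_deepening(pile, seconds, max_depth):
    deadline = time.monotonic() + seconds
    best = moves(pile)[0]  # some legal move is always in hand
    for depth in range(1, max_depth + 1):
        try:
            scored = [(minimax(pile - take, depth - 1, False, deadline), take)
                      for take in moves(pile)]
        except OutOfTime:
            break          # keep the answer from the last completed level
        best = max(scored)[1]
    return best

print(progressive_deepening(10, seconds=5.0, max_depth=10))
```

Each completed level overwrites `best`, so whenever the clock runs out the function still returns the move from the deepest level it finished.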

749
00:42:04,290 --> 00:42:08,670
Christopher, is alpha-beta
an alternative to minimax?

750
00:42:08,670 --> 00:42:09,450
CHRISTOPHER: No.

751
00:42:09,450 --> 00:42:11,072
SPEAKER 1: No, it's not.

752
00:42:11,072 --> 00:42:13,100
It's something you layer
on top of minimax.

753
00:42:13,100 --> 00:42:15,980
Does alpha-beta give you a
different answer from minimax?

754
00:42:18,655 --> 00:42:20,920
CHRISTOPHER: No.

755
00:42:20,920 --> 00:42:21,600
No, it doesn't.

756
00:42:21,600 --> 00:42:23,105
SPEAKER 1: Let's see everybody
shake their head

757
00:42:23,105 --> 00:42:24,590
one way or the other.

758
00:42:24,590 --> 00:42:26,960
It does not give you an answer
different from minimax.

759
00:42:26,960 --> 00:42:27,770
That's right.

760
00:42:27,770 --> 00:42:29,430
It gives you exactly
the same answer,

761
00:42:29,430 --> 00:42:30,660
not a different answer.

762
00:42:30,660 --> 00:42:32,800
It's a speed-up.

763
00:42:32,800 --> 00:42:34,570
It's not an approximation.

764
00:42:34,570 --> 00:42:35,140
It's a speed-up.

765
00:42:35,140 --> 00:42:36,835
It cuts off lots of the tree.

766
00:42:36,835 --> 00:42:39,260
It's a dead horse principle
at work.

767
00:42:39,260 --> 00:42:40,618
You got a question,
Christopher?

768
00:42:40,618 --> 00:42:45,558
CHRISTOPHER: Yeah, since all
of the lines progressively

769
00:42:45,558 --> 00:42:50,498
[INAUDIBLE], is it possible to
keep a temporary value if the

770
00:42:50,498 --> 00:42:54,944
value [INAUDIBLE] each node of
the tree and then [INAUDIBLE]?

771
00:42:54,944 --> 00:42:56,920
SPEAKER 1: Oh, excellent
suggestion.

772
00:42:56,920 --> 00:42:58,930
In fact, Christopher
has just--

773
00:42:58,930 --> 00:43:01,510
I think, if I can jump ahead
a couple steps--

774
00:43:01,510 --> 00:43:04,600
Christopher has reinvented
a very important idea.

775
00:43:12,250 --> 00:43:14,295
Progressive deepening not only
ensures you have an answer at

776
00:43:14,295 --> 00:43:18,080
any time, it actually improves
the performance of alpha-beta

777
00:43:18,080 --> 00:43:21,090
when you layer alpha-beta
on top of it.

778
00:43:21,090 --> 00:43:25,050
Because these values that are
calculated at intermediate

779
00:43:25,050 --> 00:43:28,890
parts of the tree are used to
reorder the nodes under the

780
00:43:28,890 --> 00:43:33,190
tree so as to give you maximum
alpha-beta cut-off.
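A rough Python illustration of why ordering matters, using the depth-four classroom tree worked through earlier. The names and the nested-list encoding are assumptions, and the leaves that the board walk-through never evaluated are placeholder values. The sketch also cheats: it reorders using exact minimax values, which is the arranged-just-right case. In practice the ordering would come from the values found at the previous, shallower level.

```python
import math

# Leaves evaluated on the board match the transcript; skipped ones are placeholders.
CLASSROOM_TREE = [[[[8, 7], [3, 9]], [[9, 8], [2, 4]]],
                  [[[1, 8], [8, 9]], [[9, 9], [3, 4]]]]

def minimax(node, maximizing):
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def reorder(node, maximizing):
    """Sort every node's children best-first for the player to move."""
    if not isinstance(node, list):
        return node
    children = [reorder(child, not maximizing) for child in node]
    return sorted(children,
                  key=lambda child: minimax(child, not maximizing),
                  reverse=maximizing)

def alphabeta(node, maximizing, alpha, beta, evaluated):
    if not isinstance(node, list):
        evaluated.append(node)  # one static evaluation
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, evaluated)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break
    return best

as_drawn, reordered = [], []
value_drawn = alphabeta(CLASSROOM_TREE, True, -math.inf, math.inf, as_drawn)
value_sorted = alphabeta(reorder(CLASSROOM_TREE, True), True,
                         -math.inf, math.inf, reordered)
print(value_drawn, len(as_drawn))     # 8 10
print(value_sorted, len(reordered))   # 8 7
```

Same answer, 8, both times. With these placeholder values the tree as drawn needs 10 of the 16 static evaluations, and the reordered tree needs 7, close to 2 times b to the d over 2, which is 8 here.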

781
00:43:33,190 --> 00:43:34,650
I think that's what you
said, Christopher.

782
00:43:34,650 --> 00:43:39,380
But if it isn't, we'll talk
about your idea after class.

783
00:43:39,380 --> 00:43:42,170
So, this is what every game
playing program does.

784
00:43:42,170 --> 00:43:44,510
How is Deep Blue different?

785
00:43:44,510 --> 00:43:45,760
Not much.

786
00:43:51,830 --> 00:43:57,000
So, Deep Blue, as of 1997, did
about 200 million static

787
00:43:57,000 --> 00:43:59,042
evaluations per second.

788
00:43:59,042 --> 00:44:03,530
And it went down, using
alpha-beta,

789
00:44:03,530 --> 00:44:08,100
about 14, 15, 16 levels.

790
00:44:08,100 --> 00:44:21,800
So, Deep Blue was minimax,
plus alpha-beta, plus

791
00:44:21,800 --> 00:44:27,480
progressive deepening, plus
a whole lot of parallel

792
00:44:27,480 --> 00:44:42,080
computing, plus an opening book,
plus special purpose

793
00:44:42,080 --> 00:44:47,800
stuff for the end game, plus--

794
00:44:47,800 --> 00:44:49,470
perhaps the most important
thing--

795
00:45:04,210 --> 00:45:06,150
uneven tree development.

796
00:45:06,150 --> 00:45:08,880
So far, we've pretended that the
tree always goes up in an

797
00:45:08,880 --> 00:45:10,610
even way to a fixed level.

798
00:45:10,610 --> 00:45:13,310
But there's no particular reason
why that has to be so.

799
00:45:16,190 --> 00:45:19,870
Some situation down at the
bottom of the tree may be

800
00:45:19,870 --> 00:45:21,920
particularly dynamic.

801
00:45:21,920 --> 00:45:23,720
In the very next move, you might
be able to capture the

802
00:45:23,720 --> 00:45:25,780
opponent's Queen.

803
00:45:25,780 --> 00:45:27,750
So, in circumstances like that,
you want to blow out a

804
00:45:27,750 --> 00:45:29,600
little extra search.

805
00:45:29,600 --> 00:45:31,280
So, eventually, you get to
the idea that there's no

806
00:45:31,280 --> 00:45:33,810
particular reason to
have the search go

807
00:45:33,810 --> 00:45:35,880
down to a fixed level.

808
00:45:35,880 --> 00:45:38,920
But, instead, you can develop
the tree in a way that gives

809
00:45:38,920 --> 00:45:40,800
you the most confidence
that your

810
00:45:40,800 --> 00:45:43,330
backed-up numbers are correct.

811
00:45:43,330 --> 00:45:46,670
That's the most important of
these extra flourishes added

812
00:45:46,670 --> 00:45:51,370
by Deep Blue when it beat
Kasparov in 1997.

813
00:45:51,370 --> 00:45:53,890
And now we can come back
and say, well, you

814
00:45:53,890 --> 00:45:54,710
understand Deep Blue.

815
00:45:54,710 --> 00:45:56,430
But is this a model of
anything that goes

816
00:45:56,430 --> 00:45:58,950
on in our own heads?

817
00:45:58,950 --> 00:46:02,010
Is this a model of any kind
of human intelligence?

818
00:46:02,010 --> 00:46:05,210
Or is it a different kind
of intelligence?

819
00:46:05,210 --> 00:46:06,460
And the answer is
mixed, right?

820
00:46:06,460 --> 00:46:09,950
Because we are often in
situations where we are

821
00:46:09,950 --> 00:46:11,720
playing a game.

822
00:46:11,720 --> 00:46:13,470
We're competing with another
manufacturer.

823
00:46:13,470 --> 00:46:16,300
We have to think what the other
manufacturer will do in

824
00:46:16,300 --> 00:46:21,500
response to what we do
down several levels.

825
00:46:21,500 --> 00:46:26,230
On the other hand, is going
down 14 levels what human

826
00:46:26,230 --> 00:46:29,705
chess players do when they win
the world championship?

827
00:46:29,705 --> 00:46:33,620
It doesn't seem, even to them,
like that's even a remote

828
00:46:33,620 --> 00:46:35,570
possibility.

829
00:46:35,570 --> 00:46:37,740
They have to do something
different, because they don't

830
00:46:37,740 --> 00:46:41,180
have that kind of computational
horsepower.

831
00:46:41,180 --> 00:46:45,350
This is doing computation in the
same way that a bulldozer

832
00:46:45,350 --> 00:46:47,600
processes gravel.

833
00:46:47,600 --> 00:46:51,650
It's substituting raw power
for sophistication.

834
00:46:51,650 --> 00:46:54,790
So, when a human chess master
plays the game, they have a

835
00:46:54,790 --> 00:46:56,640
great deal of chess knowledge
in their head and they

836
00:46:56,640 --> 00:46:58,730
recognize patterns.

837
00:46:58,730 --> 00:47:00,910
There are famous experiments,
by the way, that demonstrate

838
00:47:00,910 --> 00:47:03,730
this in the following way.

839
00:47:03,730 --> 00:47:08,250
Show a chessboard to a chess
master and ask them to

840
00:47:08,250 --> 00:47:10,130
memorize it.

841
00:47:10,130 --> 00:47:12,950
They're very good at that, as
long as it's a legitimate

842
00:47:12,950 --> 00:47:14,180
chessboard.

843
00:47:14,180 --> 00:47:16,510
If the pieces are placed
randomly, they're no

844
00:47:16,510 --> 00:47:18,380
good at it at all.

845
00:47:18,380 --> 00:47:21,502
So, it's very clear that they've
developed a repertoire

846
00:47:21,502 --> 00:47:24,550
of chess knowledge that makes
it possible for them to

847
00:47:24,550 --> 00:47:28,150
recognize situations and play
the game much more like number

848
00:47:28,150 --> 00:47:29,940
1 up there.

849
00:47:29,940 --> 00:47:33,150
So, Deep Blue is manifesting
some kind of intelligence.

850
00:47:33,150 --> 00:47:34,360
But it's not our intelligence.

851
00:47:34,360 --> 00:47:36,800
It's bulldozer intelligence.

852
00:47:36,800 --> 00:47:38,330
So, it's important to understand
that kind of

853
00:47:38,330 --> 00:47:40,020
intelligence, too.

854
00:47:40,020 --> 00:47:42,290
But it's not necessarily the
same kind of intelligence that

855
00:47:42,290 --> 00:47:43,540
we have in our own head.

856
00:47:46,160 --> 00:47:47,570
So, that concludes what we're
going to do today.

857
00:47:47,570 --> 00:47:49,790
And, as you know, on Wednesday
we have a celebration of

858
00:47:49,790 --> 00:47:56,940
learning, which is familiar to
you if you've taken 3.091.

859
00:47:56,940 --> 00:48:00,440
And, therefore, I will
see you on Wednesday,

860
00:48:00,440 --> 00:48:01,690
all of you, I imagine.