1 00:00:08,928 --> 00:00:12,580 SPEAKER 1: It was about 1963 when a noted philosopher here 2 00:00:12,580 --> 00:00:15,885 at MIT, named Hubert Dreyfus-- 3 00:00:20,642 --> 00:00:30,330 Hubert Dreyfus wrote a paper in about 1963 in which he had 4 00:00:30,330 --> 00:00:37,610 a heading titled, "Computers Can't Play Chess." Of course, 5 00:00:37,610 --> 00:00:40,110 he was subsequently invited over to the artificial 6 00:00:40,110 --> 00:00:41,670 intelligence laboratory to play the 7 00:00:41,670 --> 00:00:43,420 Greenblatt chess machine. 8 00:00:43,420 --> 00:00:46,650 And, of course, he lost. 9 00:00:46,650 --> 00:00:52,200 Whereupon Seymour Papert wrote a rebuttal to Dreyfus' famous 10 00:00:52,200 --> 00:00:56,240 paper, which had a subject heading, "Dreyfus Can't Play 11 00:00:56,240 --> 00:00:59,840 Chess Either." 12 00:00:59,840 --> 00:01:02,390 But in a strange sense, Dreyfus might have been right 13 00:01:02,390 --> 00:01:07,000 and would have been right if he were to have said computers 14 00:01:07,000 --> 00:01:11,690 can't play chess the way humans play chess yet. 15 00:01:11,690 --> 00:01:16,630 In any case, around about 1968 a chess master named David 16 00:01:16,630 --> 00:01:23,000 Levy bet noted founder of artificial intelligence John 17 00:01:23,000 --> 00:01:25,440 McCarthy that no computer would beat the world champion 18 00:01:25,440 --> 00:01:27,730 within 10 years. 19 00:01:27,730 --> 00:01:31,740 And five years later, McCarthy gave up, because it had 20 00:01:31,740 --> 00:01:36,509 already become clear that no computer would win in a way 21 00:01:36,509 --> 00:01:39,590 that McCarthy wanted it to win, that is to say by playing 22 00:01:39,590 --> 00:01:42,960 chess the way humans play chess. 23 00:01:42,960 --> 00:01:48,160 But then 20 years after that in 1997, Deep Blue beat the 24 00:01:48,160 --> 00:01:52,160 world champion, and chess suddenly became uninteresting. 
25 00:01:54,910 --> 00:01:58,690 But we're going to talk about games today, because there are 26 00:01:58,690 --> 00:02:01,720 elements of game-play that do model some of the things that 27 00:02:01,720 --> 00:02:03,750 go on in our head. 28 00:02:03,750 --> 00:02:06,050 And if they don't model things that go on in our head, they 29 00:02:06,050 --> 00:02:08,610 do model some kind of intelligence. 30 00:02:08,610 --> 00:02:10,490 And if we're to have a general understanding of what 31 00:02:10,490 --> 00:02:13,690 intelligence is all about, we have to understand that kind 32 00:02:13,690 --> 00:02:16,000 of intelligence, too. 33 00:02:16,000 --> 00:02:18,790 So, we'll start out by talking about various ways that we 34 00:02:18,790 --> 00:02:20,329 might design a computer program to 35 00:02:20,329 --> 00:02:22,360 play a game like chess. 36 00:02:22,360 --> 00:02:26,930 And we'll conclude by talking a little bit about what Deep 37 00:02:26,930 --> 00:02:32,460 Blue adds to the mix other than tremendous speed. 38 00:02:32,460 --> 00:02:34,300 So, that's our agenda. 39 00:02:34,300 --> 00:02:37,320 By the end of the hour, you'll understand and be able to 40 00:02:37,320 --> 00:02:40,290 write your own Deep Blue if you feel like it. 41 00:02:40,290 --> 00:02:44,270 First, we want to talk about how it might be possible for a 42 00:02:44,270 --> 00:02:45,590 computer to play chess. 43 00:02:45,590 --> 00:02:48,250 Let's talk about several approaches 44 00:02:48,250 --> 00:02:50,220 that might be possible. 45 00:02:50,220 --> 00:02:53,860 Approach number one is that the machine might make a 46 00:02:53,860 --> 00:02:56,980 description of the board the same way a human would; talk 47 00:02:56,980 --> 00:03:00,000 about pawn structure, King safety, whether it's a good 48 00:03:00,000 --> 00:03:01,960 time to castle, that sort of thing. 
49 00:03:01,960 --> 00:03:12,130 So, it would be analysis and perhaps some strategy mixed up 50 00:03:12,130 --> 00:03:13,380 with some tactics. 51 00:03:15,870 --> 00:03:20,190 And all that would get mixed up and, finally, result in 52 00:03:20,190 --> 00:03:22,710 some kind of move. 53 00:03:22,710 --> 00:03:27,450 If this is the game board, the next thing to do would be 54 00:03:27,450 --> 00:03:29,400 determined by some process like that. 55 00:03:29,400 --> 00:03:33,730 And the trouble is no one knows how to do it. 56 00:03:33,730 --> 00:03:35,180 And so in that sense, Dreyfus is right. 57 00:03:35,180 --> 00:03:38,829 None of the game-playing programs today incorporate any of that 58 00:03:38,829 --> 00:03:41,350 kind of stuff. 59 00:03:41,350 --> 00:03:43,600 And since nobody knows how to do that, we 60 00:03:43,600 --> 00:03:44,420 can't talk about it. 61 00:03:44,420 --> 00:03:45,610 So we can talk about other ways, though, 62 00:03:45,610 --> 00:03:46,829 that we might try. 63 00:03:46,829 --> 00:03:48,450 For example, we can have if-then rules. 64 00:03:55,820 --> 00:03:56,740 How would that work? 65 00:03:56,740 --> 00:03:57,840 That would work this way. 66 00:03:57,840 --> 00:04:01,770 You look at the board, represented by this node here, 67 00:04:01,770 --> 00:04:09,970 and you say, well, if it's possible to move the Queen 68 00:04:09,970 --> 00:04:13,390 pawn forward by one, then do that. 69 00:04:13,390 --> 00:04:16,010 So, it doesn't do any evaluation of the board. 70 00:04:16,010 --> 00:04:19,290 It doesn't try anything. 71 00:04:19,290 --> 00:04:22,360 It just says let me look at the board and select a move on 72 00:04:22,360 --> 00:04:24,170 that basis. 73 00:04:24,170 --> 00:04:28,130 So, that would be a way of approaching a game 74 00:04:28,130 --> 00:04:30,760 situation like this. 75 00:04:30,760 --> 00:04:32,430 Here's the situation. 76 00:04:32,430 --> 00:04:34,060 Here are the possible moves. 
77 00:04:34,060 --> 00:04:36,130 And one is selected on the basis of an 78 00:04:36,130 --> 00:04:40,180 if-then rule like so. 79 00:04:40,180 --> 00:04:42,195 And nobody can make a very strong chess player 80 00:04:42,195 --> 00:04:43,350 that works like that. 81 00:04:43,350 --> 00:04:46,100 Curiously enough, someone has made a pretty good checkers 82 00:04:46,100 --> 00:04:49,326 playing program that works like that. 83 00:04:49,326 --> 00:04:53,110 It checks to see what moves are available on the board, 84 00:04:53,110 --> 00:04:57,290 ranks them, and picks the highest one available. 85 00:04:57,290 --> 00:04:58,920 But, in general, that's not a very good approach. 86 00:04:58,920 --> 00:04:59,860 It's not very powerful. 87 00:04:59,860 --> 00:05:01,670 You couldn't make it-- 88 00:05:01,670 --> 00:05:03,900 well, when I say, couldn't, it means I can't think of any way 89 00:05:03,900 --> 00:05:05,530 that you could make a strong chess playing 90 00:05:05,530 --> 00:05:06,780 program that way. 91 00:05:09,260 --> 00:05:19,055 So, the third way to do this is to look ahead and evaluate. 92 00:05:24,090 --> 00:05:26,410 What that means is you look ahead like so. 93 00:05:26,410 --> 00:05:29,790 You see all the possible consequences of moves, and you 94 00:05:29,790 --> 00:05:33,740 say, which of these board situations is best for me? 95 00:05:33,740 --> 00:05:37,210 So, that would be an approach that comes in here like so and 96 00:05:37,210 --> 00:05:41,780 says, which one of those three situations is best? 97 00:05:41,780 --> 00:05:45,990 And to do that, we have to have some way of evaluating 98 00:05:45,990 --> 00:05:50,770 the situation deciding which of those is best. 99 00:05:50,770 --> 00:05:53,710 Now, I want to do a little, brief aside, because I want to 100 00:05:53,710 --> 00:05:56,670 talk about the mechanisms that are popularly used to do that 101 00:05:56,670 --> 00:05:59,420 kind of evaluation. 
102 00:05:59,420 --> 00:06:02,560 In the end, there are lots of features of the chessboard. 103 00:06:02,560 --> 00:06:05,680 Let's call them f1, f2, and so on. 104 00:06:08,830 --> 00:06:12,380 And we might form some function of those features. 105 00:06:12,380 --> 00:06:16,190 And that, overall, is called the static value. 106 00:06:16,190 --> 00:06:19,400 So, it's static because you're not exploring any consequences 107 00:06:19,400 --> 00:06:20,250 of what might happen. 108 00:06:20,250 --> 00:06:22,525 You're just looking at the board as it is, checking the 109 00:06:22,525 --> 00:06:25,080 King's safety, checking the pawn structure. 110 00:06:25,080 --> 00:06:28,440 Each of those produces a number fed into this function, 111 00:06:28,440 --> 00:06:30,380 out comes a value. 112 00:06:30,380 --> 00:06:33,960 And that is a value of the board seen from your 113 00:06:33,960 --> 00:06:36,159 perspective. 114 00:06:36,159 --> 00:06:42,370 Now, normally, this function, g, is reduced to a linear 115 00:06:42,370 --> 00:06:43,990 polynomial. 116 00:06:43,990 --> 00:06:47,330 So, in the end, the most popular kind of way of forming 117 00:06:47,330 --> 00:06:52,120 a static value is to take f1, multiply it times some 118 00:06:52,120 --> 00:06:57,290 constant, c1, add c2, multiply it times f2. 119 00:07:02,560 --> 00:07:10,973 And that is a linear scoring polynomial. 120 00:07:18,880 --> 00:07:21,610 So, we could use that function to produce numbers from each 121 00:07:21,610 --> 00:07:24,150 of these things and then pick the highest number. 122 00:07:24,150 --> 00:07:26,640 And that would be a way of playing the game. 123 00:07:26,640 --> 00:07:29,220 Actually, a scoring polynomial is a little bit 124 00:07:29,220 --> 00:07:29,940 more than we need. 125 00:07:29,940 --> 00:07:33,909 Because all we really need is a method that looks at those 126 00:07:33,909 --> 00:07:36,990 three boards and says, I like this one best. 
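The linear scoring polynomial described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature values and the weights c1, c2 are made-up numbers, and the names `static_value` and `best_board` are hypothetical, not taken from any real chess program.

```python
from typing import Sequence


def static_value(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scoring polynomial: c1*f1 + c2*f2 + ... for one board."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    return sum(c * f for c, f in zip(weights, features))


def best_board(boards: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Index of the board with the highest static value (look ahead one level)."""
    return max(range(len(boards)), key=lambda i: static_value(boards[i], weights))


# Hypothetical example: two features per board (say, piece count and king safety).
weights = [1.0, 0.5]                  # assumed constants c1, c2
boards = [[2, 1], [3, -4], [1, 4]]    # assumed feature values f1, f2 for three boards
choice = best_board(boards, weights)  # static values are 2.5, 1.0, 3.0
```

Note that `best_board` only needs the argmax, which matches the remark that numbers are more than strictly required: any method that can name the preferred board would do.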
127 00:07:36,990 --> 00:07:38,490 It doesn't have to rank them. 128 00:07:38,490 --> 00:07:40,340 It doesn't have to give them numbers. 129 00:07:40,340 --> 00:07:43,500 All it has to do is say which one it likes best. 130 00:07:43,500 --> 00:07:45,690 So, one way of doing that is to use a linear scoring 131 00:07:45,690 --> 00:07:46,240 polynomial. 132 00:07:46,240 --> 00:07:49,940 But it's not the only way of doing that. 133 00:07:49,940 --> 00:07:53,980 So, that's number two and number three. 134 00:07:53,980 --> 00:07:58,409 But now what else might we do? 135 00:07:58,409 --> 00:08:01,210 Well, if we reflect back on some of the searches we talked 136 00:08:01,210 --> 00:08:04,320 about, what's the base case against which everything else 137 00:08:04,320 --> 00:08:07,800 is compared, the way of doing search that doesn't 138 00:08:07,800 --> 00:08:10,910 require any intelligence, just brute force? 139 00:08:10,910 --> 00:08:13,630 We could use the British Museum algorithm and simply 140 00:08:13,630 --> 00:08:17,770 evaluate the entire tree of possibilities; I move, you 141 00:08:17,770 --> 00:08:20,540 move, I move, you move, all the way down to-- 142 00:08:23,370 --> 00:08:25,726 what?-- 143 00:08:25,726 --> 00:08:28,800 maybe 100, 50 moves. 144 00:08:28,800 --> 00:08:29,770 You do 50 things. 145 00:08:29,770 --> 00:08:31,750 I do 50 things. 146 00:08:31,750 --> 00:08:35,500 So, before we can decide if that's a good idea or not, we 147 00:08:35,500 --> 00:08:38,754 probably ought to develop some vocabulary. 148 00:08:50,160 --> 00:08:58,000 So, consider this tree of moves. 149 00:08:58,000 --> 00:09:02,530 There will be some number of choices 150 00:09:02,530 --> 00:09:04,285 considered at each level. 151 00:09:04,285 --> 00:09:07,250 And there will be some number of levels. 152 00:09:07,250 --> 00:09:09,910 So, the standard language for this is we call this the 153 00:09:09,910 --> 00:09:11,160 branching factor. 
154 00:09:18,880 --> 00:09:23,340 And in this particular case, b is equal to 3. 155 00:09:23,340 --> 00:09:30,250 This is the depth of the tree. 156 00:09:30,250 --> 00:09:34,280 And, in this case, d is two. 157 00:09:34,280 --> 00:09:37,750 So, now that produces a certain number of terminal or 158 00:09:37,750 --> 00:09:39,000 leaf nodes. 159 00:09:44,060 --> 00:09:45,875 How many of those are there? 160 00:09:49,020 --> 00:09:50,170 Well, that's a pretty simple computation. 161 00:09:50,170 --> 00:09:51,840 It's just b to the d. 162 00:09:51,840 --> 00:09:55,330 Right, Christopher, b to the d? 163 00:09:55,330 --> 00:10:01,660 So, if you have b to the 0 at this level, you have one. 164 00:10:01,660 --> 00:10:04,670 b to the 1 at this level, you have b. 165 00:10:04,670 --> 00:10:09,020 b to the 2 at this level, you have b squared. 166 00:10:09,020 --> 00:10:14,030 So, b to the d, in this particular case, is 9. 167 00:10:17,090 --> 00:10:19,310 So, now we can use this vocabulary that we've 168 00:10:19,310 --> 00:10:21,770 developed to talk about whether it's reasonable to 169 00:10:21,770 --> 00:10:24,990 just do the British Museum algorithm, be done with it, 170 00:10:24,990 --> 00:10:28,500 forget about chess, and go home. 171 00:10:28,500 --> 00:10:29,750 Well, let's see. 172 00:10:32,450 --> 00:10:35,050 It's pretty deep down there. 173 00:10:35,050 --> 00:10:39,290 If we think about chess, and we think about a standard game 174 00:10:39,290 --> 00:10:41,970 in which each person does 50 things, that 175 00:10:41,970 --> 00:10:45,080 gives a d of about 100. 176 00:10:45,080 --> 00:10:47,490 And if you think about the branching factor in chess, 177 00:10:47,490 --> 00:10:50,430 it's generally presumed to be, depending on the stage of the 178 00:10:50,430 --> 00:10:52,870 game and so on and so forth, it varies, but it might 179 00:10:52,870 --> 00:10:55,940 average around 14 or 15. 
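The b-to-the-d count above is easy to check with a small Python sketch. The function names `nodes_per_level` and `leaf_count` are hypothetical helpers introduced only for illustration, and the tree is assumed to be uniform (the same branching factor at every node).

```python
def nodes_per_level(b: int, d: int) -> list[int]:
    """Number of nodes at levels 0..d of a uniform tree with branching factor b."""
    return [b ** level for level in range(d + 1)]


def leaf_count(b: int, d: int) -> int:
    """Terminal (leaf) nodes at depth d: b to the d."""
    return b ** d


# The blackboard example: b = 3, d = 2.
levels = nodes_per_level(3, 2)   # one node, then b, then b squared
leaves = leaf_count(3, 2)

# A chess-sized tree with a deliberately low branching factor of 10 and d = 100.
chess_lower_bound = leaf_count(10, 100)
```

Python integers are arbitrary precision, so the chess-sized count is computed exactly rather than overflowing.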
180 00:10:55,940 --> 00:10:59,620 If it were just 10, that would be 10 to the 100th. 181 00:10:59,620 --> 00:11:01,310 But it's a little more than that, because the branching 182 00:11:01,310 --> 00:11:03,930 factor is more than 10. 183 00:11:03,930 --> 00:11:09,300 So, in the end, it looks like, according to Claude Shannon, 184 00:11:09,300 --> 00:11:16,160 there are about 10 to the 120th leaf nodes down there. 185 00:11:16,160 --> 00:11:18,930 And if you're going to go to a British Museum treatment of 186 00:11:18,930 --> 00:11:21,990 this tree, you'd have to do 10 to the 120th static 187 00:11:21,990 --> 00:11:28,310 evaluations down there at the bottom if you're going to see 188 00:11:28,310 --> 00:11:32,080 which one of the moves is best at the top. 189 00:11:32,080 --> 00:11:33,850 Is that a reasonable number? 190 00:11:33,850 --> 00:11:38,622 It didn't used to seem practicable. 191 00:11:38,622 --> 00:11:41,460 It used to seem impossible. 192 00:11:41,460 --> 00:11:43,400 But now we've got cloud computing and everything. 193 00:11:43,400 --> 00:11:48,180 And maybe we could actually do that, right? 194 00:11:48,180 --> 00:11:51,440 What do you think, Vanessa, can you do that, get enough 195 00:11:51,440 --> 00:11:54,940 computers going in the cloud? 196 00:11:54,940 --> 00:11:55,385 No? 197 00:11:55,385 --> 00:11:57,150 You're not sure? 198 00:11:57,150 --> 00:11:59,350 Should we work it out? 199 00:11:59,350 --> 00:12:00,520 Let's work it out. 200 00:12:00,520 --> 00:12:04,170 I'll need some help, especially from any of you who 201 00:12:04,170 --> 00:12:05,420 are studying cosmology. 202 00:12:07,700 --> 00:12:09,470 So, we'll start with how many atoms are 203 00:12:09,470 --> 00:12:10,720 there in the universe? 204 00:12:13,580 --> 00:12:14,890 Volunteers? 205 00:12:14,890 --> 00:12:15,790 10 to the-- 206 00:12:15,790 --> 00:12:16,792 SPEAKER 2: 10 to the 38th? 207 00:12:16,792 --> 00:12:19,300 SPEAKER 1: No, no, 10 to the 38th has been offered. 
208 00:12:19,300 --> 00:12:22,376 That's way too low. 209 00:12:22,376 --> 00:12:25,760 The last time I looked, it was about 10 to the 80th atoms in 210 00:12:25,760 --> 00:12:27,010 the universe. 211 00:12:33,940 --> 00:12:35,900 The next thing I'd like to know is how many seconds are 212 00:12:35,900 --> 00:12:37,232 there in a year? 213 00:12:37,232 --> 00:12:41,200 It's a good number to have memorized. 214 00:12:41,200 --> 00:12:53,350 That number is approximately pi times 10 to the seventh. 215 00:12:53,350 --> 00:12:56,190 So, how many nanoseconds in a second? 216 00:12:56,190 --> 00:13:03,410 That gives us 10 to the ninth. 217 00:13:03,410 --> 00:13:06,670 At last, how many years are there in the 218 00:13:06,670 --> 00:13:07,920 history of the universe? 219 00:13:10,040 --> 00:13:12,480 SPEAKER 3: [INAUDIBLE]. 220 00:13:12,480 --> 00:13:15,790 14.7 billion. 221 00:13:15,790 --> 00:13:18,150 SPEAKER 1: She offers something on the order of 10 222 00:13:18,150 --> 00:13:21,960 billion, maybe 14 billion. 223 00:13:21,960 --> 00:13:26,130 But we'll say 10 billion to make our calculation simple. 224 00:13:26,130 --> 00:13:31,630 That's 10 to the 10th years. 225 00:13:31,630 --> 00:13:38,300 If we add that up, 80, 90, plus 16, that's 10 to the 226 00:13:38,300 --> 00:13:50,540 106th: the nanoseconds in the history of the universe 227 00:13:50,540 --> 00:13:52,580 multiplied times the number of atoms in the universe. 228 00:13:52,580 --> 00:13:56,900 So, if all of the atoms in the universe were doing static 229 00:13:56,900 --> 00:14:00,740 evaluations at nanosecond speeds since the beginning of 230 00:14:00,740 --> 00:14:06,640 the Big Bang, we'd still be 14 orders of magnitude short. 231 00:14:06,640 --> 00:14:08,120 So, it'd be a pretty good cloud. 232 00:14:08,120 --> 00:14:11,395 It would have to harness together lots of universes. 
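The back-of-the-envelope estimate above can be redone by adding exponents. The sketch below uses only the round figures quoted in the lecture; the constant and function names are hypothetical, and the optional `include_pi` flag is an added assumption that keeps the factor of pi the lecture rounds away.

```python
import math

# Powers of ten quoted in the lecture (round figures, not precise cosmology).
ATOMS_IN_UNIVERSE_EXP = 80
SECONDS_PER_YEAR_EXP = 7          # about pi * 10**7 seconds in a year
NANOSECONDS_PER_SECOND_EXP = 9
AGE_OF_UNIVERSE_YEARS_EXP = 10    # rounded down to 10 billion years
CHESS_LEAF_NODES_EXP = 120        # the leaf-node estimate attributed to Shannon


def evaluation_budget_exp(include_pi: bool = False) -> float:
    """log10 of the static evaluations possible if every atom did one per nanosecond."""
    total = (ATOMS_IN_UNIVERSE_EXP + SECONDS_PER_YEAR_EXP
             + NANOSECONDS_PER_SECOND_EXP + AGE_OF_UNIVERSE_YEARS_EXP)
    if include_pi:
        total += math.log10(math.pi)  # the factor of pi in seconds per year
    return total


def shortfall_orders_of_magnitude(include_pi: bool = False) -> float:
    """How many powers of ten the budget falls short of the leaf-node count."""
    return CHESS_LEAF_NODES_EXP - evaluation_budget_exp(include_pi)
```

With the lecture's rounding the budget is 10 to the 106th and the shortfall is 14 orders of magnitude; keeping the factor of pi moves the shortfall to roughly 13.5, which does not change the conclusion.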
233 00:14:15,080 --> 00:14:16,660 So, the British Museum algorithm is 234 00:14:16,660 --> 00:14:17,910 not going to work. 235 00:14:35,650 --> 00:14:37,700 No good. 236 00:14:37,700 --> 00:14:39,460 So, what we're going to have to do is we're going to have 237 00:14:39,460 --> 00:14:43,090 to put some things together and hope for the best. 238 00:14:43,090 --> 00:14:46,680 So, the fifth way is the way we're actually going to do it. 239 00:14:46,680 --> 00:14:53,580 And what we're going to do is we're going to look ahead, not 240 00:14:53,580 --> 00:14:55,410 just one level, but as far as possible. 241 00:15:07,120 --> 00:15:11,460 We consider, not only the situation that we've developed 242 00:15:11,460 --> 00:15:15,390 here, but we'll try to push that out as far as we can and 243 00:15:15,390 --> 00:15:21,430 look at these static values of the leaf nodes down here and 244 00:15:21,430 --> 00:15:24,970 somehow use that as a way of playing the game. 245 00:15:24,970 --> 00:15:27,885 So, that is number five. 246 00:15:27,885 --> 00:15:30,830 And number four is going all the way down there. 247 00:15:30,830 --> 00:15:34,850 And this, in the end, is all that we can do. 248 00:15:34,850 --> 00:15:45,240 This idea was multiply invented, most notably by Claude Shannon 249 00:15:45,240 --> 00:15:51,150 and also by Alan Turing, who, I found out from a friend of 250 00:15:51,150 --> 00:15:56,460 mine, spent a lot of lunchtime conversations talking with 251 00:15:56,460 --> 00:16:01,130 each other about how a computer might play chess 252 00:16:01,130 --> 00:16:04,850 against the future when there would be computers. 253 00:16:04,850 --> 00:16:08,300 So, Donald Michie and Alan Turing also invented this over 254 00:16:08,300 --> 00:16:12,010 lunch while they were taking some time off from cracking 255 00:16:12,010 --> 00:16:14,730 the German codes. 256 00:16:14,730 --> 00:16:17,710 Well, what is the method? 
257 00:16:17,710 --> 00:16:20,290 I want to illustrate the method with the simplest 258 00:16:20,290 --> 00:16:21,700 possible tree. 259 00:16:21,700 --> 00:16:24,600 So, we're going to have a branching factor of 2 not 14. 260 00:16:24,600 --> 00:16:27,920 And we're going to have a depth of 2 not something 261 00:16:27,920 --> 00:16:29,570 highly serious. 262 00:16:32,170 --> 00:16:34,360 Here's the game tree. 263 00:16:34,360 --> 00:16:35,510 And there are going to be some numbers 264 00:16:35,510 --> 00:16:36,760 down here at the bottom. 265 00:16:39,430 --> 00:16:42,390 And these are going to be the value of the board from the 266 00:16:42,390 --> 00:16:46,060 perspective of the player at the top. 267 00:16:46,060 --> 00:16:48,210 Let us say that the player at the top would like to drive 268 00:16:48,210 --> 00:16:52,330 the play as much as possible toward the big numbers. 269 00:16:52,330 --> 00:16:54,750 So, we're going to call that player the maximizing player. 270 00:16:58,440 --> 00:17:01,270 He would like to get over here to the 8, because that's the 271 00:17:01,270 --> 00:17:02,940 biggest number. 272 00:17:02,940 --> 00:17:04,740 There's another player, his opponent, which we'll call the 273 00:17:04,740 --> 00:17:06,440 minimizing player. 274 00:17:06,440 --> 00:17:10,108 And he's hoping that the play will go down to the board 275 00:17:10,108 --> 00:17:11,950 situation that's as small as possible. 276 00:17:11,950 --> 00:17:14,930 Because his view is the opposite of the maximizing 277 00:17:14,930 --> 00:17:19,040 player, hence the name minimax. 278 00:17:19,040 --> 00:17:20,770 But how does it work? 279 00:17:20,770 --> 00:17:24,520 Do you see which way the play is going to go? 280 00:17:24,520 --> 00:17:27,990 How do you decide which way the play is going to go? 281 00:17:27,990 --> 00:17:30,650 Well, it's not obvious at a glance. 282 00:17:30,650 --> 00:17:33,230 Do you see which way it's going to go? 
283 00:17:33,230 --> 00:17:34,980 It's not obvious to the glance. 284 00:17:34,980 --> 00:17:39,160 But if we do more than a glance, if we look at the 285 00:17:39,160 --> 00:17:42,150 situation from the perspective of the minimizing player here 286 00:17:42,150 --> 00:17:44,360 at the middle level, it's pretty clear that if the 287 00:17:44,360 --> 00:17:48,570 minimizing player finds himself in that situation, 288 00:17:48,570 --> 00:17:51,480 he's going to choose to go that way. 289 00:17:51,480 --> 00:17:56,830 And so the value of this situation, from the 290 00:17:56,830 --> 00:18:00,652 perspective of the minimizing player, is 2. 291 00:18:00,652 --> 00:18:03,480 He'd never go over there to the 7. 292 00:18:03,480 --> 00:18:07,200 Similarly, if the minimizing player is over here with a 293 00:18:07,200 --> 00:18:09,700 choice between going toward a 1 or toward an 8, he'll 294 00:18:09,700 --> 00:18:11,900 obviously go toward a 1. 295 00:18:11,900 --> 00:18:16,850 And so the value of that board situation, from the 296 00:18:16,850 --> 00:18:20,340 perspective of the minimizing player, is 1. 297 00:18:20,340 --> 00:18:22,550 Now, we've taken the scores down here at the bottom of the 298 00:18:22,550 --> 00:18:25,710 tree, and we back them up one level. 299 00:18:25,710 --> 00:18:28,840 And you see how we can just keep doing this? 300 00:18:28,840 --> 00:18:32,160 Now the maximizing player can see that if he goes to the 301 00:18:32,160 --> 00:18:34,605 left, he gets a score of 2. 302 00:18:34,605 --> 00:18:37,360 If he goes to the right, he only gets a score of 1. 303 00:18:37,360 --> 00:18:39,800 So, he's going to go to the left. 304 00:18:39,800 --> 00:18:42,980 So, overall, then, the maximizing player is going to 305 00:18:42,980 --> 00:18:48,790 have a 2 as the perceived value of that situation there 306 00:18:48,790 --> 00:18:50,740 at the top. 307 00:18:50,740 --> 00:18:51,790 That's the minimax algorithm. 
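The backing-up procedure just walked through can be written as a short recursive function. This is a minimal sketch, assuming the game tree is given as nested Python lists whose innermost numbers are static values seen from the maximizer's perspective; the name `minimax` and the list encoding are illustrative choices, not the lecture's demo program.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool = True) -> int:
    """Back static values up the tree, alternating max and min levels."""
    if not isinstance(node, list):
        return node  # leaf: the static value itself
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# The blackboard tree: branching factor 2, depth 2, leaves 2, 7, 1, 8.
tree = [[2, 7], [1, 8]]
value_at_top = minimax(tree)                       # maximizer moves first
left_value = minimax(tree[0], maximizing=False)    # minimizer's left node
right_value = minimax(tree[1], maximizing=False)   # minimizer's right node
```

The two minimizer nodes back up 2 and 1, and the maximizer then takes the larger of those, which is why the top of the tree is worth 2 and play goes left.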
308 00:18:51,790 --> 00:18:53,390 It's very simple. 309 00:18:53,390 --> 00:18:56,250 You go down to the bottom of the tree, you compute static 310 00:18:56,250 --> 00:19:00,570 values, you back them up level by level, and then you decide 311 00:19:00,570 --> 00:19:01,585 where to go. 312 00:19:01,585 --> 00:19:05,390 And in this particular situation, the maximizer goes 313 00:19:05,390 --> 00:19:05,890 to the left. 314 00:19:05,890 --> 00:19:08,770 And the minimizer goes to the left, too, so the play ends up 315 00:19:08,770 --> 00:19:13,680 here, far short of the 8 that the maximizer wanted and more 316 00:19:13,680 --> 00:19:15,460 than the 1 that the minimizer wanted. 317 00:19:15,460 --> 00:19:17,100 But this is an adversarial game. 318 00:19:17,100 --> 00:19:18,230 You're competing with each other. 319 00:19:18,230 --> 00:19:21,280 So, you don't expect to get what you want, right? 320 00:19:23,930 --> 00:19:25,660 So, maybe we ought to see if we can make that work. 321 00:19:33,320 --> 00:19:34,100 There's a game tree. 322 00:19:34,100 --> 00:19:35,350 Do you see how it goes? 323 00:19:38,630 --> 00:19:42,730 Let's see if the system can figure it out. 324 00:19:42,730 --> 00:19:46,350 There it goes, crawling its way through the tree. 325 00:19:46,350 --> 00:19:49,310 This is a branching factor of 2, just like our sample, but 326 00:19:49,310 --> 00:19:51,540 now four levels. 327 00:19:51,540 --> 00:19:53,700 You can see that it's got quite a lot of work to do. 328 00:19:53,700 --> 00:20:01,175 That's 2 to the fourth, one, two, three, four, 2 to the 329 00:20:01,175 --> 00:20:06,790 fourth, 16 static evaluations to do. 330 00:20:06,790 --> 00:20:07,850 So, it found the answer. 331 00:20:07,850 --> 00:20:09,120 But it's a lot of work. 332 00:20:09,120 --> 00:20:13,290 We could get a new tree and restart it, maybe speed it up. 333 00:20:17,960 --> 00:20:22,310 There it goes down that way, get a new tree. 
334 00:20:22,310 --> 00:20:23,270 Those are just random numbers. 335 00:20:23,270 --> 00:20:25,360 So, each time it's going to find a different path through 336 00:20:25,360 --> 00:20:30,330 the tree according to the numbers that it's generated. 337 00:20:30,330 --> 00:20:32,070 Now, 16 isn't bad. 338 00:20:32,070 --> 00:20:34,620 But if you get down there around 10 levels deep and your 339 00:20:34,620 --> 00:20:36,850 branching factor is 14, well, we know those numbers get 340 00:20:36,850 --> 00:20:39,290 pretty awful, pretty bad, because the number of static 341 00:20:39,290 --> 00:20:41,105 evaluations to do down there at the bottom 342 00:20:41,105 --> 00:20:43,830 goes as b to the d. 343 00:20:43,830 --> 00:20:45,080 It's exponential. 344 00:20:47,260 --> 00:20:50,350 And time has shown, if you get down about seven or eight 345 00:20:50,350 --> 00:20:51,845 levels, you're a jerk. 346 00:20:51,845 --> 00:20:54,450 And if you get down about 15 or 16 levels, you beat the 347 00:20:54,450 --> 00:20:55,900 world champion. 348 00:20:55,900 --> 00:20:58,630 So, you'd like to get as far down in the tree as possible. 349 00:20:58,630 --> 00:21:02,480 Because when you get as far down into the tree as 350 00:21:02,480 --> 00:21:04,510 possible, what happens is that these crude 351 00:21:04,510 --> 00:21:09,720 measures of board quality begin to clarify. 352 00:21:09,720 --> 00:21:11,910 And, in fact, when you get far enough, the only thing that 353 00:21:11,910 --> 00:21:15,890 really counts is piece count, one of those features. 354 00:21:15,890 --> 00:21:18,750 If you get far enough, piece count and a few other things 355 00:21:18,750 --> 00:21:21,150 will give you a pretty good idea of what to do if you get 356 00:21:21,150 --> 00:21:23,990 far enough. 357 00:21:23,990 --> 00:21:25,510 But getting far enough can be a problem. 358 00:21:25,510 --> 00:21:27,400 So, we want to do everything we can to 359 00:21:27,400 --> 00:21:28,935 get as far as possible. 
360 00:21:28,935 --> 00:21:31,970 We want to pull out every trick we can find to get as 361 00:21:31,970 --> 00:21:33,500 far as possible. 362 00:21:33,500 --> 00:21:38,450 Now, you remember when we talked about branch and bound, 363 00:21:38,450 --> 00:21:39,955 we knew that there were some things that we could do that 364 00:21:39,955 --> 00:21:43,330 would cut off whole portions of the search tree. 365 00:21:43,330 --> 00:21:45,380 So, what we'd like to do is find something analogous in 366 00:21:45,380 --> 00:21:48,270 this world of games, so we cut off whole portions of this 367 00:21:48,270 --> 00:21:49,880 search tree, so we don't have to look at 368 00:21:49,880 --> 00:21:52,180 those static values. 369 00:21:52,180 --> 00:21:55,330 What I want to do is I want to come back and redo this thing. 370 00:21:55,330 --> 00:21:56,780 But this time, I'm going to compute the static 371 00:21:56,780 --> 00:21:59,030 values one at a time. 372 00:21:59,030 --> 00:22:03,190 I've got the same structure in the tree. 373 00:22:03,190 --> 00:22:06,110 And just as before, I'm going to assume that the top player 374 00:22:06,110 --> 00:22:08,490 wants to go toward the maximum values, and the next player 375 00:22:08,490 --> 00:22:10,380 wants to go toward the minimum values. 376 00:22:10,380 --> 00:22:13,950 But none of the static values have been computed yet. 377 00:22:13,950 --> 00:22:16,770 So, I better start computing them. 378 00:22:16,770 --> 00:22:19,226 That's the first one I find, 2. 379 00:22:19,226 --> 00:22:21,840 Now, as soon as I see that 2, as soon as the minimizer sees 380 00:22:21,840 --> 00:22:25,580 that 2, the minimizer knows that the value of this node 381 00:22:25,580 --> 00:22:27,390 can't be any greater than 2. 382 00:22:27,390 --> 00:22:30,020 Because he'll always choose to go down this way if this 383 00:22:30,020 --> 00:22:32,390 branch produces a bigger number. 
384 00:22:32,390 --> 00:22:35,910 So, we can say that the minimizer is assured already 385 00:22:35,910 --> 00:22:40,580 that the score there will be equal to or less than 2. 386 00:22:40,580 --> 00:22:43,580 Now, we go over and compute the next number. 387 00:22:43,580 --> 00:22:44,980 There's a 7. 388 00:22:44,980 --> 00:22:46,850 Now, I know this is exactly equal to 2, because he'll 389 00:22:46,850 --> 00:22:49,570 never go down toward a 7. 390 00:22:49,570 --> 00:22:52,420 As soon as the minimizer says equal to 2, the maximizer 391 00:22:52,420 --> 00:22:55,390 says, OK, I can do equal to or greater than 2. 392 00:23:00,560 --> 00:23:06,010 One, minimizer says equal to or less than 1. 393 00:23:06,010 --> 00:23:08,142 Now what? 394 00:23:08,142 --> 00:23:12,275 Did you compare those two numbers? 395 00:23:12,275 --> 00:23:16,360 The maximizer knows that if he goes down here, he can't do 396 00:23:16,360 --> 00:23:17,990 better than 1. 397 00:23:17,990 --> 00:23:23,510 He already knows if he goes over here, he can get a 2. 398 00:23:23,510 --> 00:23:27,850 It's as if this branch doesn't even exist. 399 00:23:27,850 --> 00:23:31,840 Because the maximizer would never choose to go down there. 400 00:23:31,840 --> 00:23:33,160 So, you have to see that. 401 00:23:33,160 --> 00:23:38,330 This is the important essence of the notion of the alpha-beta 402 00:23:38,330 --> 00:23:41,630 algorithm, which is a layering on top of minimax that cuts 403 00:23:41,630 --> 00:23:44,870 off large sections of the search tree. 404 00:23:44,870 --> 00:23:47,420 So, one more time. 405 00:23:47,420 --> 00:23:49,620 We've developed a situation so we know that the maximizer 406 00:23:49,620 --> 00:23:54,720 gets a 2 going down to the left, and he sees that if he 407 00:23:54,720 --> 00:23:58,000 goes down to the right, he can't do better than 1. 
408 00:23:58,000 --> 00:24:01,420 So, he says to himself, it's as if that branch doesn't 409 00:24:01,420 --> 00:24:05,230 exist and the overall score is 2. 410 00:24:05,230 --> 00:24:08,980 And it doesn't matter what that static value is. 411 00:24:08,980 --> 00:24:13,350 It can be 8, as it was, it can be plus 1,000. 412 00:24:13,350 --> 00:24:14,015 It doesn't matter. 413 00:24:14,015 --> 00:24:16,040 It can be minus 1,000. 414 00:24:16,040 --> 00:24:19,420 Or it could be plus infinity or minus infinity. 415 00:24:19,420 --> 00:24:23,620 It doesn't matter, because the maximizer will always 416 00:24:23,620 --> 00:24:26,470 go the other way. 417 00:24:26,470 --> 00:24:29,270 So, that's the alpha-beta algorithm. 418 00:24:29,270 --> 00:24:32,300 Can you guess why it's called the alpha-beta algorithm? 419 00:24:32,300 --> 00:24:34,210 Well, because in the algorithm there are two parameters, 420 00:24:34,210 --> 00:24:37,080 alpha and beta. 421 00:24:37,080 --> 00:24:38,750 So, it's important to understand that alpha-beta is 422 00:24:38,750 --> 00:24:41,810 not an alternative to minimax. 423 00:24:41,810 --> 00:24:44,230 It's minimax with a flourish. 424 00:24:44,230 --> 00:24:47,610 It's something layered on top like we layered things on top 425 00:24:47,610 --> 00:24:49,605 of branch and bound to make it more efficient. 426 00:24:49,605 --> 00:24:52,290 We layer stuff on top of minimax to 427 00:24:52,290 --> 00:24:55,300 make it more efficient. 428 00:24:55,300 --> 00:24:57,250 As you say to me, well, that's a pretty easy example. 429 00:24:57,250 --> 00:24:57,700 And it is. 430 00:24:57,700 --> 00:24:59,810 So, let's try a little bit more complex one. 431 00:25:07,330 --> 00:25:09,550 This is just to see if I can do it without screwing up. 432 00:25:12,220 --> 00:25:15,320 The reason I do one that's complex is not just to show 433 00:25:15,320 --> 00:25:17,640 how tough I am in front of a large audience. 
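The cutoff worked through on the two-level tree (leaves 2, 7, 1, 8) can be sketched as minimax plus the two parameters alpha and beta. This is an illustrative version under stated assumptions: the tree is a nested Python list, the name `alphabeta` and the `visited` list (used only to record which static evaluations were actually made) are hypothetical, and the code is not the lecture's demo program.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, maximizing: bool = True,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[List[int]] = None) -> float:
    """Minimax with alpha-beta cutoffs; same answer, fewer static evaluations."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record that this static value was computed
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)  # maximizer is assured at least alpha
            if alpha >= beta:
                break                  # the minimizer above will never allow this
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)        # minimizer is assured at most beta
        if alpha >= beta:
            break                      # the maximizer above already has better
    return value


evaluated: List[int] = []
result = alphabeta([[2, 7], [1, 8]], visited=evaluated)
```

On this tree the fourth leaf is never evaluated, so replacing the 8 with plus or minus 1,000 leaves the result unchanged, as the lecture points out.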
434 00:25:17,640 --> 00:25:20,450 But, rather, there's certain points of interest that only 435 00:25:20,450 --> 00:25:24,030 occur in a tree of depth four or greater. 436 00:25:24,030 --> 00:25:26,010 That's the reason for this example. 437 00:25:26,010 --> 00:25:28,120 But work with me and let's see if we can work 438 00:25:28,120 --> 00:25:29,670 our way through it. 439 00:25:29,670 --> 00:25:34,810 What I'm going to do is I'll circle the numbers that we 440 00:25:34,810 --> 00:25:36,790 actually have to compute. 441 00:25:36,790 --> 00:25:39,480 So, we actually have to compute 8. 442 00:25:39,480 --> 00:25:42,430 As soon as we do that, the minimizer knows that that node 443 00:25:42,430 --> 00:25:44,450 is going to have a score of equal to or less than 8 444 00:25:44,450 --> 00:25:46,960 without looking at anything else. 445 00:25:46,960 --> 00:25:50,020 Then, he looks at 7. 446 00:25:50,020 --> 00:25:51,516 So, that's equal to 7. 447 00:25:51,516 --> 00:25:54,910 Because the minimizer will clearly go to the right. 448 00:25:54,910 --> 00:25:57,330 As soon as that is determined, then the maximizer knows that 449 00:25:57,330 --> 00:26:00,580 the score here is equal to or greater than 8. 450 00:26:00,580 --> 00:26:03,680 Now, we evaluate the 3. 451 00:26:03,680 --> 00:26:06,418 The minimizer knows equal to or less than 3. 452 00:26:06,418 --> 00:26:09,286 SPEAKER 4: [INAUDIBLE]. 453 00:26:09,286 --> 00:26:14,920 SPEAKER 1: Oh, sorry, the minimizer at 7, yeah. 454 00:26:14,920 --> 00:26:17,930 OK, now what happens? 455 00:26:17,930 --> 00:26:20,240 Well, let's see, the maximizer gets a 7 going that way. 456 00:26:20,240 --> 00:26:22,180 He can't do better than 3 going that way, so we got 457 00:26:22,180 --> 00:26:24,980 another one of these cut off situations. 458 00:26:24,980 --> 00:26:28,820 It's as if this branch doesn't even exist. 459 00:26:28,820 --> 00:26:32,860 So, this static evaluation need not be made. 
460 00:26:32,860 --> 00:26:35,670 And now we know that that's not merely equal to or greater 461 00:26:35,670 --> 00:26:37,850 than 7, but exactly equal to 7. 462 00:26:37,850 --> 00:26:40,530 And we can push that number back up. 463 00:26:40,530 --> 00:26:43,900 That becomes equal to or less than 7. 464 00:26:43,900 --> 00:26:46,360 OK, are you with me so far? 465 00:26:46,360 --> 00:26:47,620 Let's get over to the other side of the tree 466 00:26:47,620 --> 00:26:49,300 as quickly as possible. 467 00:26:49,300 --> 00:26:55,410 So, there's a 9, equal to or less than 9, 8 equal to 8, 468 00:26:55,410 --> 00:27:00,820 push the 8 up equal to or greater than 8. 469 00:27:03,360 --> 00:27:06,740 The minimizer can go down this way and get a 7. 470 00:27:06,740 --> 00:27:09,020 He'll certainly never go that way where the 471 00:27:09,020 --> 00:27:11,780 maximizer can get an 8. 472 00:27:11,780 --> 00:27:13,706 Once again, we've got a cut off. 473 00:27:13,706 --> 00:27:17,900 And if this branch didn't exist, then that means that 474 00:27:17,900 --> 00:27:21,020 these static evaluations don't have to be made. 475 00:27:21,020 --> 00:27:25,150 And this value is now exactly 7. 476 00:27:25,150 --> 00:27:27,340 But there's one more thing to note here. 477 00:27:27,340 --> 00:27:29,510 And that is that not only do we not have to make these 478 00:27:29,510 --> 00:27:32,830 static evaluations down here, but we don't even have to 479 00:27:32,830 --> 00:27:35,040 generate these moves. 480 00:27:35,040 --> 00:27:38,210 So, we save two ways, both on static evaluation and on move 481 00:27:38,210 --> 00:27:40,390 generation. 482 00:27:40,390 --> 00:27:42,770 This is a real winner, this alpha-beta thing, because it 483 00:27:42,770 --> 00:27:44,285 saves an enormous amount of computation. 484 00:27:47,130 --> 00:27:47,930 Well, we're on the way now. 485 00:27:47,930 --> 00:27:50,470 The maximizer up here is guaranteed equal to or 486 00:27:50,470 --> 00:27:51,220 greater than 7.
487 00:27:51,220 --> 00:27:53,990 Has anyone found the winning move yet? 488 00:27:53,990 --> 00:27:56,050 Is it to the left? 489 00:27:56,050 --> 00:27:59,240 I know that we better keep going, because we don't want to 490 00:27:59,240 --> 00:28:00,490 trust any oracles. 491 00:28:04,150 --> 00:28:05,090 So, let's see. 492 00:28:05,090 --> 00:28:05,780 There's a 1. 493 00:28:05,780 --> 00:28:06,700 We've calculated that. 494 00:28:06,700 --> 00:28:08,950 The minimizer can be guaranteed equal to or less 495 00:28:08,950 --> 00:28:11,050 than 1 at that particular point. 496 00:28:15,130 --> 00:28:17,040 Think about that for a while. 497 00:28:17,040 --> 00:28:19,470 At the top, the maximizer knows he can go 498 00:28:19,470 --> 00:28:23,161 left and get a 7. 499 00:28:23,161 --> 00:28:28,610 The minimizer, if the play ever gets here, can ensure 500 00:28:28,610 --> 00:28:30,860 that he's going to drive the situation to a board 501 00:28:30,860 --> 00:28:33,240 number that's 1. 502 00:28:33,240 --> 00:28:35,150 So, the question is will the maximizer ever 503 00:28:35,150 --> 00:28:37,080 permit that to happen? 504 00:28:37,080 --> 00:28:39,920 And the answer is surely not. 505 00:28:39,920 --> 00:28:42,090 So, over here in the development of this side of 506 00:28:42,090 --> 00:28:44,870 the tree, we're always comparing numbers at adjacent 507 00:28:44,870 --> 00:28:46,530 levels in the tree. 508 00:28:46,530 --> 00:28:48,780 But here's a situation where we're comparing numbers that 509 00:28:48,780 --> 00:28:51,210 are separated from each other in the tree. 510 00:28:51,210 --> 00:28:54,430 And we still concluded that no further examination of this 511 00:28:54,430 --> 00:28:56,870 node makes any sense at all. 512 00:28:56,870 --> 00:28:58,120 This is called deep cut off. 513 00:29:05,590 --> 00:29:08,810 And that means that this whole branch here might as well not 514 00:29:08,810 --> 00:29:14,150 exist, and we won't have to compute that static value.
515 00:29:14,150 --> 00:29:15,530 All right? 516 00:29:15,530 --> 00:29:17,950 So, it looks-- 517 00:29:17,950 --> 00:29:20,250 you have this stare of disbelief, which 518 00:29:20,250 --> 00:29:21,660 is perfectly normal. 519 00:29:21,660 --> 00:29:23,510 I have to reconvince myself every time that 520 00:29:23,510 --> 00:29:24,915 this actually works. 521 00:29:24,915 --> 00:29:28,170 But when you think your way through it, it is clear that 522 00:29:28,170 --> 00:29:30,660 these computations that I've x-ed out 523 00:29:30,660 --> 00:29:32,120 don't have to be made. 524 00:29:32,120 --> 00:29:34,510 So, let's carry on and see if we can complete this equal to 525 00:29:34,510 --> 00:29:39,670 or less than 8, equal to 8, equal to 8-- 526 00:29:39,670 --> 00:29:42,360 because the other branch doesn't even exist-- 527 00:29:42,360 --> 00:29:46,760 equal to or less than 8. 528 00:29:46,760 --> 00:29:50,700 And we compare these two numbers, do we keep going? 529 00:29:50,700 --> 00:29:52,020 Yes, we keep going. 530 00:29:52,020 --> 00:29:54,010 Because maybe the maximizer can go to the right and 531 00:29:54,010 --> 00:29:56,870 actually get to that 8. 532 00:29:56,870 --> 00:29:59,990 So, we have to go over here and keep working away. 533 00:29:59,990 --> 00:30:02,600 There's a 9, equal to or less than 9, 534 00:30:02,600 --> 00:30:04,790 another 9 equal to 9. 535 00:30:04,790 --> 00:30:08,620 Push that number up equal to or greater than 9. 536 00:30:11,360 --> 00:30:14,322 The minimizer gets an 8 going this way. 537 00:30:14,322 --> 00:30:16,840 The maximizer is assured of getting a 9 going that way. 538 00:30:16,840 --> 00:30:18,860 So, once again, we've got a cut off situation. 539 00:30:18,860 --> 00:30:21,392 It's as if this doesn't exist. 540 00:30:21,392 --> 00:30:24,540 Those static evaluations are not made. 541 00:30:24,540 --> 00:30:28,000 This move generation is not made and computation is saved.
542 00:30:32,010 --> 00:30:36,200 So, let's see if we can do better on this very example 543 00:30:36,200 --> 00:30:38,342 using this alpha-beta idea. 544 00:30:38,342 --> 00:30:42,150 I'll slow it down a little bit and change the search type to 545 00:30:42,150 --> 00:30:45,110 minimax with alpha-beta. 546 00:30:45,110 --> 00:30:47,540 We see two numbers on each of those nodes now, guess what 547 00:30:47,540 --> 00:30:48,220 they're called. 548 00:30:48,220 --> 00:30:49,070 We already know. 549 00:30:49,070 --> 00:30:50,430 They're alpha and beta. 550 00:30:50,430 --> 00:30:53,270 So, what's going to happen is, as the algorithm proceeds through 551 00:30:53,270 --> 00:30:55,710 the tree, those numbers are going to shrink-wrap 552 00:30:55,710 --> 00:30:58,210 themselves around the situation. 553 00:30:58,210 --> 00:30:59,460 So, we'll start that up. 554 00:31:04,770 --> 00:31:08,030 Two static evaluations were not made. 555 00:31:08,030 --> 00:31:09,280 Let's try a new tree. 556 00:31:14,240 --> 00:31:16,496 Two different ones were not made. 557 00:31:16,496 --> 00:31:25,300 A new tree, still again, two different ones not made. 558 00:31:25,300 --> 00:31:29,180 Let's see what happens when we use the classroom example, the 559 00:31:29,180 --> 00:31:29,960 one I did up there. 560 00:31:29,960 --> 00:31:32,900 Let's make sure that I didn't screw it up. 561 00:31:32,900 --> 00:31:34,480 I'll slow that down to 1. 562 00:31:45,150 --> 00:31:48,280 2, same answer. 563 00:31:48,280 --> 00:31:50,380 So, you probably didn't realize it at the start. 564 00:31:50,380 --> 00:31:51,530 Who could? 565 00:31:51,530 --> 00:31:56,040 In fact, the play goes down that way, over this way, down 566 00:31:56,040 --> 00:31:59,710 that way, and ultimately to the 8, which is not the 567 00:31:59,710 --> 00:32:00,390 biggest number. 568 00:32:00,390 --> 00:32:01,460 And it's not the smallest number.
569 00:32:01,460 --> 00:32:03,866 It's the compromise number that's arrived at by virtue of 570 00:32:03,866 --> 00:32:07,980 the fact that this is an adversarial situation. 571 00:32:07,980 --> 00:32:12,120 So, you say to me, how much energy, how much work do you 572 00:32:12,120 --> 00:32:14,820 actually save by doing this? 573 00:32:14,820 --> 00:32:34,440 Well, it is the case that in the optimal situation, if 574 00:32:34,440 --> 00:32:37,615 everything is ordered right, if God has come down and 575 00:32:37,615 --> 00:32:41,110 arranged your tree in just the right way, then the 576 00:32:41,110 --> 00:32:44,980 approximate amount of work you need to do, the approximate 577 00:32:44,980 --> 00:32:48,340 number of static evaluations performed, is approximately 578 00:32:48,340 --> 00:32:54,610 equal to 2 times b to the d over 2. 579 00:32:54,610 --> 00:32:55,870 We don't care about this 2. 580 00:32:55,870 --> 00:32:59,220 We care a whole lot about that 2. 581 00:32:59,220 --> 00:33:01,760 That's the amount of work that's done. 582 00:33:01,760 --> 00:33:06,050 It's b to the d over 2, instead of b to the d. 583 00:33:06,050 --> 00:33:07,000 What's that mean? 584 00:33:07,000 --> 00:33:09,500 Suppose that without this idea, I can 585 00:33:09,500 --> 00:33:12,080 go down seven levels. 586 00:33:12,080 --> 00:33:15,280 How far can I go down with this idea? 587 00:33:15,280 --> 00:33:17,940 14 levels. 588 00:33:17,940 --> 00:33:18,910 So, it's the difference between a 589 00:33:18,910 --> 00:33:21,340 jerk and a world champion. 590 00:33:21,340 --> 00:33:24,880 So, that, however, is only in the optimal case when God has 591 00:33:24,880 --> 00:33:26,710 arranged things just right.
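To put numbers on the seven-versus-14-levels claim, here is a small arithmetic sketch. The function names are made up for illustration, the branching factor of 10 is just a convenient value, and the alpha-beta figure uses the lecture's approximation for a perfectly ordered tree, restricted here to even depths so the arithmetic stays in whole numbers.

```python
def minimax_evaluations(b, d):
    """Static evaluations for plain minimax: one per leaf, b to the d."""
    return b ** d

def alphabeta_best_case_evaluations(b, d):
    """The lecture's approximation, 2 * b^(d/2), for an optimally ordered tree.

    Assumes d is even (an assumption of this sketch).
    """
    assert d % 2 == 0, "this sketch only handles even depths"
    return 2 * b ** (d // 2)

b = 10
print(minimax_evaluations(b, 7))               # plain minimax, 7 levels
print(alphabeta_best_case_evaluations(b, 14))  # alpha-beta, 14 levels
print(minimax_evaluations(b, 14))              # plain minimax, 14 levels
```

With b = 10, seven levels of plain minimax cost 10 million static evaluations, and 14 levels of well-ordered alpha-beta cost about 20 million, so the two are within the factor of 2 that "we don't care about". Fourteen levels of plain minimax would cost 100 trillion.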
592 00:33:26,710 --> 00:33:29,750 But in practical situations, practical game situations, it 593 00:33:29,750 --> 00:33:32,560 appears to be the case, experimentally, that the 594 00:33:32,560 --> 00:33:36,170 actual number is close to this approximation for optimal 595 00:33:36,170 --> 00:33:37,760 arrangements. 596 00:33:37,760 --> 00:33:40,462 So, you'd never not want to use alpha-beta. 597 00:33:40,462 --> 00:33:43,870 It saves an amazing amount of time. 598 00:33:43,870 --> 00:33:46,700 You could look at it another way. 599 00:33:46,700 --> 00:33:50,990 Suppose you go down the same number of levels, how much 600 00:33:50,990 --> 00:33:52,240 less work do you have to do? 601 00:33:55,070 --> 00:33:55,760 Well, quite a bit. 602 00:33:55,760 --> 00:33:59,050 The square root [INAUDIBLE], right? 603 00:33:59,050 --> 00:34:02,720 That's another way of looking at how it works. 604 00:34:02,720 --> 00:34:06,710 So, we could go home at this point except for one problem, 605 00:34:06,710 --> 00:34:11,469 and that is that we pretended that the branching factor is 606 00:34:11,469 --> 00:34:13,560 always the same. 607 00:34:13,560 --> 00:34:17,909 But, in fact, the branching factor will vary with the game 608 00:34:17,909 --> 00:34:21,510 state and will vary with the game. 609 00:34:21,510 --> 00:34:23,989 So, you can calculate how much computing you can do in two 610 00:34:23,989 --> 00:34:27,223 minutes, or however much time you have for an average move. 611 00:34:27,223 --> 00:34:30,520 And then you could say, how deep can I go? 612 00:34:30,520 --> 00:34:32,760 And you won't know for sure, because it 613 00:34:32,760 --> 00:34:35,210 depends on the game. 614 00:34:35,210 --> 00:34:39,320 So, in the earlier days of game-playing programs, the 615 00:34:39,320 --> 00:34:41,750 game-playing program left a lot of computation on the 616 00:34:41,750 --> 00:34:45,670 table, because it would make a decision in three seconds. 
617 00:34:45,670 --> 00:34:49,170 And it might have made a much different move if it used all 618 00:34:49,170 --> 00:34:51,520 the computation it had available. 619 00:34:51,520 --> 00:34:54,969 Alternatively, it might be grinding away, and after two 620 00:34:54,969 --> 00:34:56,880 minutes were consumed, 621 00:34:56,880 --> 00:35:00,410 it had no move and just did something random. 622 00:35:02,920 --> 00:35:05,020 That's not very good. 623 00:35:05,020 --> 00:35:06,850 But that's what the early game-playing programs did, 624 00:35:06,850 --> 00:35:11,980 because no one knew how deep they could go. 625 00:35:11,980 --> 00:35:16,910 So, let's have a look at the situation here and say, well, 626 00:35:16,910 --> 00:35:18,670 here's a game tree. 627 00:35:18,670 --> 00:35:20,290 It's a binary game tree. 628 00:35:20,290 --> 00:35:22,120 That's level 0. 629 00:35:22,120 --> 00:35:23,890 That's level 1. 630 00:35:23,890 --> 00:35:26,600 This is level d minus 1. 631 00:35:26,600 --> 00:35:28,610 And this is level d. 632 00:35:28,610 --> 00:35:32,050 So, down here you have a situation 633 00:35:32,050 --> 00:35:33,380 that looks like this. 634 00:35:33,380 --> 00:35:37,050 And I left all the game tree out in between. 635 00:35:37,050 --> 00:35:40,940 So, how many leaf nodes are there down here? 636 00:35:40,940 --> 00:35:42,110 b to the d, right? 637 00:35:42,110 --> 00:35:45,280 Oh, I'm going to forget about alpha-beta for a moment. 638 00:35:45,280 --> 00:35:47,760 As we did when we looked at some of those optimal 639 00:35:47,760 --> 00:35:50,540 searches, we're going to add these things one at a time. 640 00:35:50,540 --> 00:35:52,550 So, forget about alpha-beta, assume we're just doing 641 00:35:52,550 --> 00:35:54,290 straight minimax. 642 00:35:54,290 --> 00:35:56,970 In that case, we would have to calculate all the static 643 00:35:56,970 --> 00:35:58,610 values down here at the bottom. 644 00:35:58,610 --> 00:36:03,160 And there are b to the d of those.
645 00:36:03,160 --> 00:36:06,760 How many are there at this next level up? 646 00:36:06,760 --> 00:36:11,720 Well, that must be b to the d minus 1. 647 00:36:11,720 --> 00:36:14,650 How many fewer nodes are there at the second to the last, the 648 00:36:14,650 --> 00:36:19,390 penultimate level, relative to the final level? 649 00:36:19,390 --> 00:36:23,010 Well, 1 over b, right? 650 00:36:23,010 --> 00:36:26,750 So, if I'm concerned about not getting all the way through 651 00:36:26,750 --> 00:36:31,070 these calculations at the d level, I can give myself an 652 00:36:31,070 --> 00:36:34,320 insurance policy by calculating out what the 653 00:36:34,320 --> 00:36:40,590 answer would be if I only went down to the d minus 1th level. 654 00:36:40,590 --> 00:36:43,540 Do you get that insurance policy? 655 00:36:43,540 --> 00:36:46,510 Let's say the branching factor is 10, how much does that 656 00:36:46,510 --> 00:36:48,920 insurance policy cost me? 657 00:36:48,920 --> 00:36:51,160 10% of my computation. 658 00:36:51,160 --> 00:36:53,690 Because I can do this calculation and have a move in 659 00:36:53,690 --> 00:36:59,580 hand here at level d minus 1 for only 1/10 of the amount of 660 00:36:59,580 --> 00:37:01,730 the computation that's required to figure out what I 661 00:37:01,730 --> 00:37:06,000 would do if I go all the way down to the base level. 662 00:37:06,000 --> 00:37:08,460 OK, is that clear? 663 00:37:08,460 --> 00:37:13,160 So this idea is extremely important in its general form. 664 00:37:13,160 --> 00:37:16,600 But we haven't quite got there yet, because what if the 665 00:37:16,600 --> 00:37:19,070 branching factor turns out to be really big and we can't get 666 00:37:19,070 --> 00:37:22,130 through this level either? 667 00:37:22,130 --> 00:37:23,860 What should we do to make sure that we 668 00:37:23,860 --> 00:37:26,215 still have a good move? 669 00:37:26,215 --> 00:37:27,610 SPEAKER 5: [INAUDIBLE].
670 00:37:27,610 --> 00:37:32,850 SPEAKER 1: Right, we can do it at the d minus 2 level. 671 00:37:32,850 --> 00:37:37,120 So, that would be up here. 672 00:37:37,120 --> 00:37:40,806 And at that level, the amount of computation would be b to 673 00:37:40,806 --> 00:37:42,056 the d minus 2. 674 00:37:44,800 --> 00:37:51,240 So, now we've added 10% plus 10% of that. 675 00:37:51,240 --> 00:37:56,270 And our knee jerk is beginning to form, right? 676 00:37:56,270 --> 00:37:58,180 What are we going to do in the end to make sure that no 677 00:37:58,180 --> 00:38:00,458 matter what we've got a move? 678 00:38:00,458 --> 00:38:02,095 CHRISTOPHER: Start from the very first-- 679 00:38:02,095 --> 00:38:03,280 SPEAKER 1: Correct, what's that, Christopher? 680 00:38:03,280 --> 00:38:04,250 CHRISTOPHER: Start from the very first level? 681 00:38:04,250 --> 00:38:06,515 SPEAKER 1: Start from the very first level and give ourselves 682 00:38:06,515 --> 00:38:11,330 an insurance policy for every level we try to calculate. 683 00:38:11,330 --> 00:38:13,780 But that might be real costly. 684 00:38:13,780 --> 00:38:15,910 So, we better figure out if this is going to be too big of 685 00:38:15,910 --> 00:38:18,220 an expense to bear. 686 00:38:18,220 --> 00:38:22,330 So, let's see, if we do what Christopher suggests, then the 687 00:38:22,330 --> 00:38:25,860 amount of computation we need in our insurance policy is 688 00:38:25,860 --> 00:38:28,460 going to be equal to 1-- 689 00:38:28,460 --> 00:38:30,900 we're going to do it up here at this level, too, even though 690 00:38:30,900 --> 00:38:33,560 we don't need it, just to make everything work out easy. 691 00:38:33,560 --> 00:38:37,720 1 plus b, that's getting our insurance policy down here at 692 00:38:37,720 --> 00:38:39,600 this first level. 693 00:38:39,600 --> 00:38:44,460 And we're going to add b squared all the way down to b 694 00:38:44,460 --> 00:38:46,820 to the d minus 1.
695 00:38:46,820 --> 00:38:49,280 That's how much we're going to spend getting an insurance 696 00:38:49,280 --> 00:38:50,590 policy at every level. 697 00:38:54,020 --> 00:38:58,390 I wish I had some of that high school algebra, right? 698 00:38:58,390 --> 00:39:01,812 Let's just do it for fun. 699 00:39:01,812 --> 00:39:04,660 Oh, unfortunate choice of variable names. 700 00:39:04,660 --> 00:39:08,225 b times s is equal to-- 701 00:39:08,225 --> 00:39:10,165 oh, we're going to multiply all those by b. 702 00:39:17,530 --> 00:39:29,520 Now, we'll subtract the first one from the second one, which 703 00:39:29,520 --> 00:39:33,330 tells us that the amount of calculation needed for our 704 00:39:33,330 --> 00:39:39,720 insurance policy is equal to b to the d, minus 1, 705 00:39:39,720 --> 00:39:42,070 all over b minus 1. 706 00:39:46,450 --> 00:39:49,580 Is that a big number? 707 00:39:49,580 --> 00:39:53,562 We could do a little algebra on that and say that b to the 708 00:39:53,562 --> 00:39:54,430 d is a huge number. 709 00:39:54,430 --> 00:39:57,240 So, that minus one doesn't count. 710 00:39:57,240 --> 00:39:59,620 And b is probably 10 to 15. 711 00:39:59,620 --> 00:40:03,830 So, b minus 1 is, essentially, equal to b. 712 00:40:03,830 --> 00:40:08,441 So, that's approximately equal to b to the power d minus 1. 713 00:40:11,150 --> 00:40:15,340 So, with an approximation factored in, the amount of 714 00:40:15,340 --> 00:40:17,140 computation needed to do insurance policies at every 715 00:40:17,140 --> 00:40:19,870 level is not much different from the amount of computation 716 00:40:19,870 --> 00:40:22,770 needed to get an insurance policy at just one level, the 717 00:40:22,770 --> 00:40:24,910 penultimate one. 718 00:40:24,910 --> 00:40:27,550 So, this idea is called progressive deepening. 719 00:40:40,610 --> 00:40:43,610 And now we can visit our gold star idea list and see how 720 00:40:43,610 --> 00:40:46,170 these things match up with that.
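The board algebra is hard to follow in spoken form, so here is the same geometric-series argument written out. The symbol S for the total insurance cost is an assumption of this sketch (the transcript only hints at the variable name); b is the branching factor and d the depth of the full search, as in the lecture.

```latex
\begin{align*}
S       &= 1 + b + b^{2} + \cdots + b^{d-1} \\
bS      &= b + b^{2} + \cdots + b^{d-1} + b^{d} \\
bS - S  &= b^{d} - 1 \\
S       &= \frac{b^{d} - 1}{b - 1} \;\approx\; \frac{b^{d}}{b} \;=\; b^{d-1}
\end{align*}
```

The approximation drops the 1 next to the huge b^d and treats b - 1 as b. For b = 10 and d = 4, for example, S = 1 + 10 + 100 + 1000 = 1111, against b^(d-1) = 1000: insuring every level costs only slightly more than insuring the penultimate level alone.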
721 00:40:46,170 --> 00:40:50,050 First of all, the dead horse principle comes to the fore 722 00:40:50,050 --> 00:40:51,530 when we talk about alpha-beta. 723 00:40:51,530 --> 00:40:53,570 Because we know with alpha-beta that we can get rid 724 00:40:53,570 --> 00:40:56,705 of a whole lot of the tree and not do static evaluation, not 725 00:40:56,705 --> 00:40:59,000 even do move generation. 726 00:40:59,000 --> 00:41:01,120 That's the dead horse we don't want to beat. 727 00:41:01,120 --> 00:41:03,530 There's no point in doing that calculation, because it can't 728 00:41:03,530 --> 00:41:06,250 figure into the answer. 729 00:41:06,250 --> 00:41:12,830 The development of the progressive deepening idea, I 730 00:41:12,830 --> 00:41:14,860 like to think of in terms of the martial arts principle, 731 00:41:14,860 --> 00:41:17,600 we're using the enemy's characteristics against them. 732 00:41:17,600 --> 00:41:20,690 Because of this exponential blow-up, we have exactly the 733 00:41:20,690 --> 00:41:23,610 right characteristics to have a move available at every 734 00:41:23,610 --> 00:41:26,260 level as an insurance policy against not getting through to 735 00:41:26,260 --> 00:41:28,360 the next level. 736 00:41:28,360 --> 00:41:31,910 And, finally, this whole idea of progressive deepening can 737 00:41:31,910 --> 00:41:34,440 be viewed as a prime example of what we like to call 738 00:41:34,440 --> 00:41:37,670 anytime algorithms that always have an answer ready to go as 739 00:41:37,670 --> 00:41:39,690 soon as an answer is demanded. 740 00:41:39,690 --> 00:41:43,400 So, as soon as that clock runs out at two minutes, some 741 00:41:43,400 --> 00:41:44,250 answer is available. 742 00:41:44,250 --> 00:41:47,460 It'll be the best one that the system can compute in the time 743 00:41:47,460 --> 00:41:49,480 available given the characteristics of the game 744 00:41:49,480 --> 00:41:51,930 tree as it's developed so far. 
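A toy sketch of progressive deepening as an anytime algorithm follows. Everything about the setup is an assumption made for illustration: each node is a `(static_value, children)` pair, a fixed allowance of static evaluations stands in for the two-minute clock so the behaviour is repeatable, and plain minimax is used with alpha-beta left out to keep the sketch short.

```python
class OutOfBudget(Exception):
    """Raised when the evaluation allowance (our stand-in clock) runs out."""

def minimax(node, depth, maximizing, budget):
    """Minimax to a fixed depth; every static evaluation costs one unit."""
    score, kids = node
    if depth == 0 or not kids:
        if budget[0] <= 0:
            raise OutOfBudget
        budget[0] -= 1
        return score
    values = [minimax(k, depth - 1, not maximizing, budget) for k in kids]
    return max(values) if maximizing else min(values)

def progressive_deepening(root, max_depth, evaluations_allowed):
    """Search depth 1, then 2, ... and keep the move from the last
    depth that finished. Assumes the root has at least one move.

    Returns (index of best move or None, deepest completed depth).
    """
    budget = [evaluations_allowed]
    best_move, completed_depth = None, 0
    _, moves = root
    for depth in range(1, max_depth + 1):
        try:
            # The maximizer moves at the root, so the minimizer moves next.
            values = [minimax(m, depth - 1, False, budget) for m in moves]
        except OutOfBudget:
            break  # fall back on the insurance policy from the last level
        best_move = values.index(max(values))
        completed_depth = depth
    return best_move, completed_depth

# Hypothetical two-level tree: (static value, children).
tree = (0, [
    (3, [(2, []), (7, [])]),
    (5, [(1, []), (8, [])]),
])

print(progressive_deepening(tree, 2, 6))  # (0, 2): full search finished
print(progressive_deepening(tree, 2, 5))  # (1, 1): ran out during level 2
```

With an allowance of 6 evaluations both levels finish and the deeper search picks move 0. With 5, the second level is cut short and the level-1 answer, move 1, is still in hand, which is the insurance policy described above.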
745 00:41:51,930 --> 00:41:53,780 So, there are other kinds of anytime algorithms. 746 00:41:53,780 --> 00:41:56,500 This is an example of one. 747 00:41:56,500 --> 00:42:01,500 That's how all game playing programs work, minimax, plus 748 00:42:01,500 --> 00:42:04,290 alpha-beta, plus progressive deepening. 749 00:42:04,290 --> 00:42:08,670 Christopher, is alpha-beta an alternative to minimax? 750 00:42:08,670 --> 00:42:09,450 CHRISTOPHER: No. 751 00:42:09,450 --> 00:42:11,072 SPEAKER 1: No, it's not. 752 00:42:11,072 --> 00:42:13,100 It's something you layer on top of minimax. 753 00:42:13,100 --> 00:42:15,980 Does alpha-beta give you a different answer from minimax? 754 00:42:18,655 --> 00:42:20,920 CHRISTOPHER: No. 755 00:42:20,920 --> 00:42:21,600 No, it doesn't. 756 00:42:21,600 --> 00:42:23,105 SPEAKER 1: Let's see everybody shake their head 757 00:42:23,105 --> 00:42:24,590 one way or the other. 758 00:42:24,590 --> 00:42:26,960 It does not give you an answer different from minimax. 759 00:42:26,960 --> 00:42:27,770 That's right. 760 00:42:27,770 --> 00:42:29,430 It gives you exactly the same answer, 761 00:42:29,430 --> 00:42:30,660 not a different answer. 762 00:42:30,660 --> 00:42:32,800 It's a speed-up. 763 00:42:32,800 --> 00:42:34,570 It's not an approximation. 764 00:42:34,570 --> 00:42:35,140 It's a speed-up. 765 00:42:35,140 --> 00:42:36,835 It cuts off lots of the tree. 766 00:42:36,835 --> 00:42:39,260 It's a dead horse principle at work. 767 00:42:39,260 --> 00:42:40,618 You got a question, Christopher? 768 00:42:40,618 --> 00:42:45,558 CHRISTOPHER: Yeah, since all of the lines progressively 769 00:42:45,558 --> 00:42:50,498 [INAUDIBLE], is it possible to keep a temporary value if the 770 00:42:50,498 --> 00:42:54,944 value [INAUDIBLE] each node of the tree and then [INAUDIBLE]? 771 00:42:54,944 --> 00:42:56,920 SPEAKER 1: Oh, excellent suggestion.
772 00:42:56,920 --> 00:42:58,930 In fact, Christopher has just-- 773 00:42:58,930 --> 00:43:01,510 I think, if I can jump ahead a couple steps-- 774 00:43:01,510 --> 00:43:04,600 Christopher has reinvented a very important idea. 775 00:43:12,250 --> 00:43:14,295 Progressive deepening not only ensures you have an answer at 776 00:43:14,295 --> 00:43:18,080 any time, it actually improves the performance of alpha-beta 777 00:43:18,080 --> 00:43:21,090 when you layer alpha-beta on top of it. 778 00:43:21,090 --> 00:43:25,050 Because these values that are calculated at intermediate 779 00:43:25,050 --> 00:43:28,890 parts of the tree are used to reorder the nodes under the 780 00:43:28,890 --> 00:43:33,190 tree so as to give you maximum alpha-beta cut-off. 781 00:43:33,190 --> 00:43:34,650 I think that's what you said, Christopher. 782 00:43:34,650 --> 00:43:39,380 But if it isn't, we'll talk about your idea after class. 783 00:43:39,380 --> 00:43:42,170 So, this is what every game playing program does. 784 00:43:42,170 --> 00:43:44,510 How is Deep Blue different? 785 00:43:44,510 --> 00:43:45,760 Not much. 786 00:43:51,830 --> 00:43:57,000 So, Deep Blue, as of 1997, did about 200 million static 787 00:43:57,000 --> 00:43:59,042 evaluations per second. 788 00:43:59,042 --> 00:44:03,530 And it went down, using alpha-beta, 789 00:44:03,530 --> 00:44:08,100 about 14, 15, 16 levels. 790 00:44:08,100 --> 00:44:21,800 So, Deep Blue was minimax, plus alpha-beta, plus 791 00:44:21,800 --> 00:44:27,480 progressive deepening, plus a whole lot of parallel 792 00:44:27,480 --> 00:44:42,080 computing, plus an opening book, plus special purpose 793 00:44:42,080 --> 00:44:47,800 stuff for the end game, plus-- 794 00:44:47,800 --> 00:44:49,470 perhaps the most important thing-- 795 00:45:04,210 --> 00:45:06,150 uneven tree development. 
796 00:45:06,150 --> 00:45:08,880 So far, we've pretended that the tree always goes up in an 797 00:45:08,880 --> 00:45:10,610 even way to a fixed level. 798 00:45:10,610 --> 00:45:13,310 But there's no particular reason why that has to be so. 799 00:45:16,190 --> 00:45:19,870 Some situation down at the bottom of the tree may be 800 00:45:19,870 --> 00:45:21,920 particularly dynamic. 801 00:45:21,920 --> 00:45:23,720 In the very next move, you might be able to capture the 802 00:45:23,720 --> 00:45:25,780 opponent's Queen. 803 00:45:25,780 --> 00:45:27,750 So, in circumstances like that, you want to blow out a 804 00:45:27,750 --> 00:45:29,600 little extra search. 805 00:45:29,600 --> 00:45:31,280 So, eventually, you get to the idea that there's no 806 00:45:31,280 --> 00:45:33,810 particular reason to have the search go 807 00:45:33,810 --> 00:45:35,880 down to a fixed level. 808 00:45:35,880 --> 00:45:38,920 But, instead, you can develop the tree in a way that gives 809 00:45:38,920 --> 00:45:40,800 you the most confidence that your 810 00:45:40,800 --> 00:45:43,330 backed-up numbers are correct. 811 00:45:43,330 --> 00:45:46,670 That's the most important of these extra flourishes added 812 00:45:46,670 --> 00:45:51,370 by Deep Blue when it beat Kasparov in 1997. 813 00:45:51,370 --> 00:45:53,890 And now we can come back and say, well, you 814 00:45:53,890 --> 00:45:54,710 understand Deep Blue. 815 00:45:54,710 --> 00:45:56,430 But is this a model of anything that goes 816 00:45:56,430 --> 00:45:58,950 on in our own heads? 817 00:45:58,950 --> 00:46:02,010 Is this a model of any kind of human intelligence? 818 00:46:02,010 --> 00:46:05,210 Or is it a different kind of intelligence? 819 00:46:05,210 --> 00:46:06,460 And the answer is mixed, right? 820 00:46:06,460 --> 00:46:09,950 Because we are often in situations where we are 821 00:46:09,950 --> 00:46:11,720 playing a game. 822 00:46:11,720 --> 00:46:13,470 We're competing with another manufacturer. 
823 00:46:13,470 --> 00:46:16,300 We have to think what the other manufacturer will do in 824 00:46:16,300 --> 00:46:21,500 response to what we do down several levels. 825 00:46:21,500 --> 00:46:26,230 On the other hand, is going down 14 levels what human 826 00:46:26,230 --> 00:46:29,705 chess players do when they win the world championship? 827 00:46:29,705 --> 00:46:33,620 It doesn't seem, even to them, like that's even a remote 828 00:46:33,620 --> 00:46:35,570 possibility. 829 00:46:35,570 --> 00:46:37,740 They have to do something different, because they don't 830 00:46:37,740 --> 00:46:41,180 have that kind of computational horsepower. 831 00:46:41,180 --> 00:46:45,350 This is doing computation in the same way that a bulldozer 832 00:46:45,350 --> 00:46:47,600 processes gravel. 833 00:46:47,600 --> 00:46:51,650 It's substituting raw power for sophistication. 834 00:46:51,650 --> 00:46:54,790 So, when a human chess master plays the game, they have a 835 00:46:54,790 --> 00:46:56,640 great deal of chess knowledge in their head and they 836 00:46:56,640 --> 00:46:58,730 recognize patterns. 837 00:46:58,730 --> 00:47:00,910 There are famous experiments, by the way, that demonstrate 838 00:47:00,910 --> 00:47:03,730 this in the following way. 839 00:47:03,730 --> 00:47:08,250 Show a chessboard to a chess master and ask them to 840 00:47:08,250 --> 00:47:10,130 memorize it. 841 00:47:10,130 --> 00:47:12,950 They're very good at that, as long as it's a legitimate 842 00:47:12,950 --> 00:47:14,180 chessboard. 843 00:47:14,180 --> 00:47:16,510 If the pieces are placed randomly, they're no 844 00:47:16,510 --> 00:47:18,380 good at it at all. 845 00:47:18,380 --> 00:47:21,502 So, it's very clear that they've developed a repertoire 846 00:47:21,502 --> 00:47:24,550 of chess knowledge that makes it possible for them to 847 00:47:24,550 --> 00:47:28,150 recognize situations and play the game much more like number 848 00:47:28,150 --> 00:47:29,940 1 up there. 
849 00:47:29,940 --> 00:47:33,150 So, Deep Blue is manifesting some kind of intelligence. 850 00:47:33,150 --> 00:47:34,360 But it's not our intelligence. 851 00:47:34,360 --> 00:47:36,800 It's bulldozer intelligence. 852 00:47:36,800 --> 00:47:38,330 So, it's important to understand that kind of 853 00:47:38,330 --> 00:47:40,020 intelligence, too. 854 00:47:40,020 --> 00:47:42,290 But it's not necessarily the same kind of intelligence that 855 00:47:42,290 --> 00:47:43,540 we have in our own head. 856 00:47:46,160 --> 00:47:47,570 So, that concludes what we're going to do today. 857 00:47:47,570 --> 00:47:49,790 And, as you know, on Wednesday we have a celebration of 858 00:47:49,790 --> 00:47:56,940 learning, which is familiar to you if you take 3.091. 859 00:47:56,940 --> 00:48:00,440 And, therefore, I will see you on Wednesday, 860 00:48:00,440 --> 00:48:01,690 all of you, I imagine.