1 00:00:00,070 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,229 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,229 --> 00:00:17,854 at ocw.mit.edu. 8 00:00:21,066 --> 00:00:24,390 PROFESSOR: Today we're going to be talking about games. 9 00:00:24,390 --> 00:00:29,724 And if I know you guys as well as I hope I do, 10 00:00:29,724 --> 00:00:32,140 the main thing that you guys want to talk about with games 11 00:00:32,140 --> 00:00:34,330 is how to do that alpha-beta thing. 12 00:00:34,330 --> 00:00:36,300 Because it's pretty confusing. 13 00:00:36,300 --> 00:00:40,490 And it's easy to get lost in a corner or something. 14 00:00:40,490 --> 00:00:43,820 Whereas doing the regular minimax, in my experience, 15 00:00:43,820 --> 00:00:45,590 most 6034 students can do that. 16 00:00:45,590 --> 00:00:48,450 And they do it right pretty much all the time. 17 00:00:48,450 --> 00:00:51,560 However, we're going to focus on all the different components 18 00:00:51,560 --> 00:00:52,940 of games. 19 00:00:52,940 --> 00:00:56,910 And I put up two provocative silver star ideas 20 00:00:56,910 --> 00:01:00,370 up on the board, which will come into play here. 21 00:01:00,370 --> 00:01:02,840 The Snow White principle is a new name. 22 00:01:02,840 --> 00:01:05,489 And it has never been revealed until today. 23 00:01:05,489 --> 00:01:08,050 Because I made up the name recently. 24 00:01:08,050 --> 00:01:11,000 So you will be the first people to hear it and decide 25 00:01:11,000 --> 00:01:13,310 if it works better than the term "grandfather 26 00:01:13,310 --> 00:01:15,630 clause" for the thing that I'm trying to describe. 
27 00:01:15,630 --> 00:01:19,420 Because most grandfathers don't eat their children. 28 00:01:19,420 --> 00:01:25,300 So here we've got a beautiful game tree. 29 00:01:25,300 --> 00:01:29,970 It has nodes from A through R. This 30 00:01:29,970 --> 00:01:33,250 is our standard game tree from 6034. 31 00:01:33,250 --> 00:01:35,922 We've got a maximizer up at the top 32 00:01:35,922 --> 00:01:37,880 who's trying to get the highest score possible. 33 00:01:37,880 --> 00:01:41,770 The minimizer is her opponent. 34 00:01:41,770 --> 00:01:44,530 And the minimizer is trying to get to the lowest score 35 00:01:44,530 --> 00:01:45,930 possible. 36 00:01:45,930 --> 00:01:49,264 And it's really unclear who wins or loses at each point. 37 00:01:49,264 --> 00:01:51,680 They're just trying to get it to the highest or the lowest 38 00:01:51,680 --> 00:01:54,140 score. 39 00:01:54,140 --> 00:01:56,660 All right, so let's do a refresher. 40 00:01:56,660 --> 00:01:59,590 Hopefully the quiz didn't put people into such panic modes 41 00:01:59,590 --> 00:02:01,180 that they forgot Monday's lecture. 42 00:02:01,180 --> 00:02:06,130 So let's make sure that we can do the regular minimax 43 00:02:06,130 --> 00:02:09,639 algorithm on this tree and figure out the minimax 44 00:02:09,639 --> 00:02:11,270 value at A. 45 00:02:11,270 --> 00:02:13,900 So let's see how that works. 46 00:02:13,900 --> 00:02:19,010 All right, as you guys remember, the game search 47 00:02:19,010 --> 00:02:22,320 when using regular minimax is essentially 48 00:02:22,320 --> 00:02:24,550 a depth first search. 49 00:02:24,550 --> 00:02:28,080 And at each level, it chooses between all 50 00:02:28,080 --> 00:02:32,510 of the children whichever value that the parent wants. 51 00:02:32,510 --> 00:02:35,480 So here at F it would choose the maximum of K and L, 52 00:02:35,480 --> 00:02:36,330 for instance. 53 00:02:36,330 --> 00:02:37,570 But that's getting ahead of ourselves. 
54 00:02:37,570 --> 00:02:38,986 Because it's a depth first search. 55 00:02:38,986 --> 00:02:40,874 So we best start at the top. 56 00:02:40,874 --> 00:02:42,290 I'll help you guys up for a while. 57 00:02:42,290 --> 00:02:44,480 So we're doing A. We need the maximum of B, C, 58 00:02:44,480 --> 00:02:45,770 D, depth first search. 59 00:02:45,770 --> 00:02:49,760 We go to B. We're looking for the minimum of E and F. 60 00:02:49,760 --> 00:02:55,980 So having looked at E, our current minimum of E and F 61 00:02:55,980 --> 00:02:58,780 is just 2 for the moment. 62 00:02:58,780 --> 00:03:04,040 So this is going to be less than or equal to 2. 63 00:03:04,040 --> 00:03:06,690 All right, then we go down to F, which is a maximizer. 64 00:03:06,690 --> 00:03:09,850 And its children are K and L. So now I'm 65 00:03:09,850 --> 00:03:11,650 going to start making you guys do stuff. 66 00:03:11,650 --> 00:03:13,960 So what do you think? 67 00:03:13,960 --> 00:03:17,620 What is going to be the minimax value at F? 68 00:03:17,620 --> 00:03:20,659 The minimax value at F, what will that be? 69 00:03:20,659 --> 00:03:21,534 AUDIENCE: [INAUDIBLE] 70 00:03:25,150 --> 00:03:29,090 PROFESSOR: So that level is a maximizer-- max, min, max. 71 00:03:29,090 --> 00:03:30,280 F is a maximizer. 72 00:03:30,280 --> 00:03:32,420 K and L themselves are minimizers. 73 00:03:32,420 --> 00:03:34,200 But they're pretty impotent minimizers. 74 00:03:34,200 --> 00:03:35,590 Because they don't get to choose. 75 00:03:35,590 --> 00:03:39,950 They just have to do K or L. So the minimax value is three. 76 00:03:39,950 --> 00:03:42,660 And yeah, the path it would like to go is K. 77 00:03:42,660 --> 00:03:45,510 So we'll say that the minimax value here is 3. 78 00:03:45,510 --> 00:03:48,680 It's in fact exactly equal to 3. 79 00:03:48,680 --> 00:03:53,410 So if this is 3 and this is 2, then everyone, 80 00:03:53,410 --> 00:03:55,476 we know that the value of B is? 
81 00:03:55,476 --> 00:03:57,710 AUDIENCE: [INAUDIBLE] 82 00:03:57,710 --> 00:03:58,960 PROFESSOR: I hear 3 and 2. 83 00:03:58,960 --> 00:04:00,281 Which one is it? 84 00:04:00,281 --> 00:04:00,780 AUDIENCE: 2. 85 00:04:00,780 --> 00:04:03,150 PROFESSOR: 2, that's right. 86 00:04:03,150 --> 00:04:04,600 So the value here is 2. 87 00:04:04,600 --> 00:04:06,790 Great, let's go down into this branch. 88 00:04:06,790 --> 00:04:09,427 So C is going to be the minimum of G and 6. 89 00:04:09,427 --> 00:04:10,510 But we don't see that yet. 90 00:04:10,510 --> 00:04:12,218 Because we're doing a depth first search. 91 00:04:12,218 --> 00:04:13,840 It's going to be the minimum of G. 92 00:04:13,840 --> 00:04:16,540 Now we need the maximum of M and N. 93 00:04:16,540 --> 00:04:18,160 We're going to need the minimum. 94 00:04:18,160 --> 00:04:23,530 M is the minimum of Q and R. So let's switch sides. 95 00:04:23,530 --> 00:04:27,100 The minimum of Q and R is? 96 00:04:27,100 --> 00:04:28,305 AUDIENCE: 1. 97 00:04:28,305 --> 00:04:29,180 PROFESSOR: Let's see. 98 00:04:29,180 --> 00:04:30,570 That's right, it's 1. 99 00:04:30,570 --> 00:04:32,170 So M has a value of 1. 100 00:04:32,170 --> 00:04:34,240 But I'm going to stay over here. 101 00:04:34,240 --> 00:04:36,140 Because M has a value of 1. 102 00:04:36,140 --> 00:04:39,100 Knowing that, then we know that G has a value of? 103 00:04:39,100 --> 00:04:39,600 AUDIENCE: 7. 104 00:04:39,600 --> 00:04:40,600 PROFESSOR: That's right. 105 00:04:40,600 --> 00:04:41,930 7 is higher than 1. 106 00:04:41,930 --> 00:04:45,550 And since G is a 7, we now know going up to C 107 00:04:45,550 --> 00:04:47,850 that C has a value of? 108 00:04:47,850 --> 00:04:48,850 AUDIENCE: 6. 109 00:04:48,850 --> 00:04:50,830 PROFESSOR: Yes, C has a value of 6. 110 00:04:50,830 --> 00:04:52,710 That's the minimum of 6 and 7. 111 00:04:52,710 --> 00:04:54,600 So now I'm going to go back down. 
112 00:04:54,600 --> 00:04:57,370 Because we've done one of the other sub-trees. 113 00:04:57,370 --> 00:04:58,751 This is a 6. 114 00:04:58,751 --> 00:04:59,950 All right, great. 115 00:04:59,950 --> 00:05:03,900 Now we're going to go down to D. Hopefully it won't be too bad. 116 00:05:03,900 --> 00:05:05,760 These things usually aren't terrible. 117 00:05:05,760 --> 00:05:08,710 Because they're made to be pruned a lot in alpha-beta. 118 00:05:08,710 --> 00:05:12,350 So let's see, in D, we go down to I. And that's just a 1. 119 00:05:12,350 --> 00:05:17,040 We go down to J. And let's see, what's the minimax value of J? 120 00:05:17,040 --> 00:05:17,860 AUDIENCE: It's 20. 121 00:05:17,860 --> 00:05:18,860 PROFESSOR: That's right. 122 00:05:18,860 --> 00:05:21,120 20 is the maximum of 20 and 2. 123 00:05:21,120 --> 00:05:24,780 Great, so what's the minimax value of D? 124 00:05:24,780 --> 00:05:25,990 Everyone said it-- 1. 125 00:05:25,990 --> 00:05:27,946 All right, so what's the minimax value at A? 126 00:05:27,946 --> 00:05:29,574 AUDIENCE: 6. 127 00:05:29,574 --> 00:05:30,490 PROFESSOR: 6 is right. 128 00:05:30,490 --> 00:05:31,660 6 is higher than 2. 129 00:05:31,660 --> 00:05:32,740 It's higher than 1. 130 00:05:32,740 --> 00:05:34,020 Our value is 6. 131 00:05:34,020 --> 00:05:41,960 And our path is-- everyone-- A, C, H. That's it. 132 00:05:41,960 --> 00:05:46,329 Great, is everyone good with minimax? 133 00:05:46,329 --> 00:05:47,995 I know that usually a lot of people are. 134 00:05:47,995 --> 00:05:49,510 There's usually a few people who aren't. 135 00:05:49,510 --> 00:05:50,950 So if you're one of the people who 136 00:05:50,950 --> 00:05:54,150 would like some clarifications on minimax, raise your hand. 137 00:05:54,150 --> 00:05:58,680 There's probably a few other people who would like some too. 138 00:05:58,680 --> 00:06:00,536 OK. 
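The minimax walk just finished can be sketched in Python. This is only an illustration under stated assumptions: the tree shape and leaf values are reconstructed from what is said in the walkthrough, and the individual values of Q and R are placeholders, since the lecture only states that their minimum is 1. The names `TREE`, `LEAVES`, and `minimax` are made up for this sketch.

```python
# Internal nodes map to their children, left to right.
TREE = {
    "A": ["B", "C", "D"],   # maximizer
    "B": ["E", "F"],        # minimizer
    "C": ["G", "H"],        # minimizer
    "D": ["I", "J"],        # minimizer
    "F": ["K", "L"],        # maximizer
    "G": ["M", "N"],        # maximizer
    "J": ["O", "P"],        # maximizer
    "M": ["Q", "R"],        # minimizer
}

# Static values at the leaves. Q and R are placeholders (assumption):
# the walkthrough only gives min(Q, R) = 1.
LEAVES = {"E": 2, "K": 3, "L": 0, "Q": 1, "R": 5,
          "N": 7, "H": 6, "I": 1, "O": 2, "P": 20}


def minimax(node, maximizing):
    """Return (minimax value, path to the chosen leaf) for the subtree at node."""
    if node in LEAVES:
        return LEAVES[node], [node]
    # Depth first: fully resolve each child before moving to the next one.
    results = [minimax(child, not maximizing) for child in TREE[node]]
    pick = max if maximizing else min
    value, path = pick(results, key=lambda result: result[0])
    return value, [node] + path


value, path = minimax("A", True)
print(value, path)   # 6 ['A', 'C', 'H']
```

The intermediate values match the ones called out in the room: F is 3, B is 2, G is 7, C is 6, D is 1.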
139 00:06:00,536 --> 00:06:03,350 AUDIENCE: When you're doing this minimax, whatever values 140 00:06:03,350 --> 00:06:06,752 are not showing, you keep going down the tree 141 00:06:06,752 --> 00:06:08,452 and then just look at whether you're 142 00:06:08,452 --> 00:06:09,670 trying to find the minimax. 143 00:06:09,670 --> 00:06:12,380 And just whatever values you get you go back up one? 144 00:06:12,380 --> 00:06:14,710 PROFESSOR: Yes. 145 00:06:14,710 --> 00:06:20,340 The question was, when you go to do the minimax-- and let's 146 00:06:20,340 --> 00:06:21,724 say you got E was 2, and you know 147 00:06:21,724 --> 00:06:23,640 that B is going to be less than or equal to 2, 148 00:06:23,640 --> 00:06:25,160 but you don't know F yet. 149 00:06:25,160 --> 00:06:28,350 The question is, do you go down the tree, find the value at F, 150 00:06:28,350 --> 00:06:29,460 and then go back up? 151 00:06:29,460 --> 00:06:31,270 The answer is yes. 152 00:06:31,270 --> 00:06:33,300 By default, we use a depth first search. 153 00:06:33,300 --> 00:06:36,550 However, in non alpha-beta version, just regular minimax, 154 00:06:36,550 --> 00:06:40,900 it turns out it probably doesn't matter what you do. 155 00:06:40,900 --> 00:06:43,150 I suggested doing a depth first search to get yourself 156 00:06:43,150 --> 00:06:44,630 in the mindset of alpha-beta. 157 00:06:44,630 --> 00:06:49,290 Because order is very, very important in alpha-beta. 158 00:06:49,290 --> 00:06:52,150 But here, I don't know, you could do some weird bottom 159 00:06:52,150 --> 00:06:53,110 up search. 160 00:06:53,110 --> 00:06:56,010 Whatever you want, it's going to give you the right answer 161 00:06:56,010 --> 00:06:58,380 unless it asks what order they evaluated. 162 00:06:58,380 --> 00:06:59,280 But here's a hint. 163 00:06:59,280 --> 00:07:03,790 The order they're evaluated in is depth first search order. 
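A small Python sketch of that depth first order, assuming the tree shape reconstructed from the walkthrough (children listed left to right). The names `TREE` and `leaf_order` are hypothetical, used only for this illustration. Plain minimax with no pruning statically evaluates every leaf, so the order depends only on the shape of the tree and not on the leaf values.

```python
# Internal nodes map to their children, left to right.
# Any node that is not a key here is a leaf.
TREE = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": ["G", "H"],
    "D": ["I", "J"],
    "F": ["K", "L"],
    "G": ["M", "N"],
    "J": ["O", "P"],
    "M": ["Q", "R"],
}


def leaf_order(node):
    """List the leaves in the order a depth first search reaches them."""
    if node not in TREE:
        return [node]
    order = []
    for child in TREE[node]:
        order.extend(leaf_order(child))
    return order


print(", ".join(leaf_order("A")))   # E, K, L, Q, R, N, H, I, O, P
```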
164 00:07:03,790 --> 00:07:09,620 So without even doing anything, E, K, L, Q, R, N, H, I, O, P 165 00:07:09,620 --> 00:07:13,355 are the order of starting evaluation in this tree. 166 00:07:13,355 --> 00:07:14,230 AUDIENCE: [INAUDIBLE] 167 00:07:20,330 --> 00:07:23,950 PROFESSOR: So the question is, nodes like M and G 168 00:07:23,950 --> 00:07:26,130 we don't have to put values next to. 169 00:07:26,130 --> 00:07:28,420 Technically, if we were doing this very formally, 170 00:07:28,420 --> 00:07:30,560 and we couldn't remember, and I wasn't 171 00:07:30,560 --> 00:07:33,780 up there among the people, we would put 1 there. 172 00:07:33,780 --> 00:07:36,580 So at M, we would put a 1. 173 00:07:36,580 --> 00:07:37,829 But people remembered that. 174 00:07:37,829 --> 00:07:38,620 So we didn't do it. 175 00:07:38,620 --> 00:07:40,599 But then at G, we would put a 7. 176 00:07:40,599 --> 00:07:42,390 So if we were writing it out very formally, 177 00:07:42,390 --> 00:07:45,280 we would have a 1 and a 7. 178 00:07:45,280 --> 00:07:47,920 And at this D, we would have a 1. 179 00:07:47,920 --> 00:07:50,650 And then at the A, we would put a 6. 180 00:07:50,650 --> 00:07:53,140 And then that's the answer. 181 00:07:53,140 --> 00:07:55,510 Also, we've even put things like less than or greater 182 00:07:55,510 --> 00:07:57,070 than part way along the way. 183 00:07:57,070 --> 00:08:00,590 However, I believe that our alpha-beta search 184 00:08:00,590 --> 00:08:03,960 is going to definitely fulfill everyone's quota 185 00:08:03,960 --> 00:08:07,030 of pedantically putting lots of numbers next to nodes 186 00:08:07,030 --> 00:08:08,650 on a game tree. 187 00:08:08,650 --> 00:08:11,470 And so once you've done alpha-beta, 188 00:08:11,470 --> 00:08:13,870 if you can do it correctly, you'll 189 00:08:13,870 --> 00:08:16,260 think, oh, minimax, oh, those were the days. 190 00:08:16,260 --> 00:08:17,590 It's going to be easy. 
191 00:08:17,590 --> 00:08:22,170 Because alpha-beta is a little bit more complicated. 192 00:08:22,170 --> 00:08:24,490 There's a lot of things that trip people up here. 193 00:08:24,490 --> 00:08:26,270 For alpha-beta, however, I will erase 194 00:08:26,270 --> 00:08:28,750 some of these numbers for the moment. 195 00:08:28,750 --> 00:08:30,660 They're still right. 196 00:08:30,660 --> 00:08:33,700 But we do it a little differently. 197 00:08:33,700 --> 00:08:38,470 So what do alpha and beta add to this formula? 198 00:08:38,470 --> 00:08:40,710 Well, this is all sort of a winning formula, 199 00:08:40,710 --> 00:08:41,679 except for it's not. 200 00:08:41,679 --> 00:08:43,490 Because it takes too long. 201 00:08:43,490 --> 00:08:45,620 But it's a very nice formula. 202 00:08:45,620 --> 00:08:47,600 If I'm the maximizer, say, 203 00:08:47,600 --> 00:08:50,410 I would try to think, if I do this, what's he going to do? 204 00:08:50,410 --> 00:08:52,430 And then if he does that, what am I going to do? 205 00:08:52,430 --> 00:08:54,340 And then what is he going to do if I 206 00:08:54,340 --> 00:08:57,040 do that, et cetera, et cetera, all the way to the bottom. 207 00:08:57,040 --> 00:08:58,560 With alpha and beta, we add in what 208 00:08:58,560 --> 00:09:02,644 I like to call nuclear options. 209 00:09:02,644 --> 00:09:04,560 I'd liken this game of maximizer minimizer-- 210 00:09:04,560 --> 00:09:07,389 you can think of it as like the Cold War or the Peloponnesian 211 00:09:07,389 --> 00:09:09,680 War, except the Peloponnesian War didn't have nukes, so 212 00:09:09,680 --> 00:09:12,710 probably the Cold War. 213 00:09:12,710 --> 00:09:17,200 And in the Cold War, or any situation 214 00:09:17,200 --> 00:09:19,477 where you're up against an adversary who-- actually, 215 00:09:19,477 --> 00:09:21,560 this doesn't really work as well for the Cold War. 
216 00:09:21,560 --> 00:09:22,935 But in any situation where you're 217 00:09:22,935 --> 00:09:24,920 up against an adversary whose only goal in life 218 00:09:24,920 --> 00:09:27,260 is to destroy you, you always want 219 00:09:27,260 --> 00:09:29,480 to find out what the best thing you can possibly do 220 00:09:29,480 --> 00:09:32,850 is if they hit that button and send nukes in from Cuba, 221 00:09:32,850 --> 00:09:36,270 or if they send fighter pilots, or whatever is going on. 222 00:09:36,270 --> 00:09:38,600 So the idea of alpha and beta is that they 223 00:09:38,600 --> 00:09:45,030 are numbers that represent the fail-safe, the worst case. 224 00:09:45,030 --> 00:09:47,590 Because obviously in the Cold War, 225 00:09:47,590 --> 00:09:50,820 sending nukes was not a good plan. 226 00:09:50,820 --> 00:09:55,230 But presumably, us sending nukes would 227 00:09:55,230 --> 00:09:57,880 be better than just being attacked and killed. 228 00:09:57,880 --> 00:10:04,150 So the alpha and beta represent the worst possible outcome 229 00:10:04,150 --> 00:10:07,470 you'd be willing to accept for your side. 230 00:10:07,470 --> 00:10:10,080 Because right now, you know you're 231 00:10:10,080 --> 00:10:15,400 guaranteed to be able to force the conflict to that point 232 00:10:15,400 --> 00:10:16,840 or better. 233 00:10:16,840 --> 00:10:19,780 So the alpha is the nuclear option, the fail-safe, 234 00:10:19,780 --> 00:10:22,420 of the maximizer. 235 00:10:22,420 --> 00:10:27,890 Nuclear options-- alpha is maximizer's nuclear option. 236 00:10:34,980 --> 00:10:40,430 And beta is the minimizer's nuclear option. 237 00:10:46,994 --> 00:10:49,410 So we ask ourselves-- and people who were paying attention 238 00:10:49,410 --> 00:10:52,240 at lecture or wrote stuff down know the answer already-- 239 00:10:52,240 --> 00:10:55,240 what could we possibly set to start off? 
240 00:10:55,240 --> 00:10:57,230 Before we explore the tree and find anything, 241 00:10:57,230 --> 00:10:59,760 what will we set as our nuclear option, 242 00:10:59,760 --> 00:11:01,280 as our sort of fail-safe? 243 00:11:01,280 --> 00:11:04,490 We could always fall back on this number. 244 00:11:04,490 --> 00:11:06,630 So you could set 0. 245 00:11:06,630 --> 00:11:10,580 You could try to set some low number for the maximizer. 246 00:11:10,580 --> 00:11:12,960 Because if you set a high number for the maximizer 247 00:11:12,960 --> 00:11:16,150 as its fail-safe, it's going to be really snooty and just say, 248 00:11:16,150 --> 00:11:17,310 oh, I won't take this path. 249 00:11:17,310 --> 00:11:18,770 I already have a fail-safe that's 250 00:11:18,770 --> 00:11:19,970 better than all these paths. 251 00:11:19,970 --> 00:11:23,460 If you set, like, 100, you have no tree. 252 00:11:23,460 --> 00:11:26,990 Our default usually, in 6034, is to set negative infinity 253 00:11:26,990 --> 00:11:30,340 for alpha or negative some very large number if you're doing it 254 00:11:30,340 --> 00:11:32,770 in your lab. 255 00:11:32,770 --> 00:11:37,670 So if we set negative infinity as a default for alpha, 256 00:11:37,670 --> 00:11:41,600 that negative infinity is basically maximizer loses. 257 00:11:41,600 --> 00:11:43,020 So the maximizer goes in thinking, 258 00:11:43,020 --> 00:11:44,894 oh my god, if I don't look at this game tree, 259 00:11:44,894 --> 00:11:46,260 I automatically lose. 260 00:11:46,260 --> 00:11:51,070 He's willing to take the first path possibly presented. 261 00:11:51,070 --> 00:11:52,764 And that's why that negative infinity 262 00:11:52,764 --> 00:11:53,930 is a good default for alpha. 263 00:11:53,930 --> 00:11:56,096 Anyone have a good idea what a good default for beta 264 00:11:56,096 --> 00:11:57,710 is, or just remember? 265 00:11:57,710 --> 00:11:59,090 Positive infinity, that's right. 
266 00:11:59,090 --> 00:12:01,423 Because the minimizer comes in, and she's like, oh crap, 267 00:12:01,423 --> 00:12:05,660 the maximizer automatically wins if I don't look at this node 268 00:12:05,660 --> 00:12:06,320 here. 269 00:12:06,320 --> 00:12:08,278 That makes sure the maximizer and the minimizer 270 00:12:08,278 --> 00:12:12,240 both are willing to look at the first path they see every time. 271 00:12:12,240 --> 00:12:13,635 Because look on this tree. 272 00:12:13,635 --> 00:12:16,560 If 10 was alpha, the maximizer would just 273 00:12:16,560 --> 00:12:19,020 reject out of hand everything except for P. 274 00:12:19,020 --> 00:12:20,440 And then we wouldn't have a tree. 275 00:12:20,440 --> 00:12:24,750 The maximizer would lose, because he would be like, hmm, 276 00:12:24,750 --> 00:12:26,790 this test game is very interesting. 277 00:12:26,790 --> 00:12:29,120 However, I have another option-- pft, 278 00:12:29,120 --> 00:12:31,380 and then you throw over the table. 279 00:12:31,380 --> 00:12:33,976 That's 10 for me, because you have to pick up the pieces. 280 00:12:33,976 --> 00:12:34,850 I don't own this set. 281 00:12:34,850 --> 00:12:36,990 I don't know. 282 00:12:36,990 --> 00:12:40,440 So that is why we set negative infinity and positive infinity 283 00:12:40,440 --> 00:12:42,770 as the default for alpha and beta. 284 00:12:42,770 --> 00:12:44,240 So how do alpha and beta propagate? 285 00:12:44,240 --> 00:12:45,870 And what do they do? 286 00:12:45,870 --> 00:12:49,110 The main purpose of alpha and beta is that, as we said, 287 00:12:49,110 --> 00:12:52,130 alpha-- let's say we have some chart of values. 288 00:12:52,130 --> 00:12:55,700 Alpha, which starts at negative infinity, 289 00:12:55,700 --> 00:12:58,935 is the worst that the maximizer is willing to accept. 290 00:12:58,935 --> 00:13:01,060 Because they know they can get that much or better. 291 00:13:01,060 --> 00:13:03,800 It starts out, that's the worst thing you can have. 
292 00:13:03,800 --> 00:13:05,470 So it's not a problem. 293 00:13:05,470 --> 00:13:09,890 Infinity is the highest that the minimizer is willing to accept. 294 00:13:09,890 --> 00:13:11,430 That's beta. 295 00:13:11,430 --> 00:13:13,250 As you go along, though, the minimizer 296 00:13:13,250 --> 00:13:14,970 sees, oh, look at that. 297 00:13:14,970 --> 00:13:18,610 I can guarantee that at best the maximizer gets 100. 298 00:13:18,610 --> 00:13:20,700 Haha, beta is now 100. 299 00:13:20,700 --> 00:13:22,790 The maximizer says, oh yeah? 300 00:13:22,790 --> 00:13:27,490 Well, I can guarantee that the lowest you can get me to go to 301 00:13:27,490 --> 00:13:28,280 is 0. 302 00:13:28,280 --> 00:13:31,330 So it's going to be 0. 303 00:13:31,330 --> 00:13:36,990 And this keeps going on until maybe at 6-- note, 304 00:13:36,990 --> 00:13:38,670 not drawn to scale. 305 00:13:38,670 --> 00:13:41,940 Maybe at 6, the maximizer said, haha, 306 00:13:41,940 --> 00:13:44,030 you can't make me go lower than 6. 307 00:13:44,030 --> 00:13:47,650 And the minimizer says, aha, you can't make me go higher than 6. 308 00:13:47,650 --> 00:13:50,120 And then 6 is the answer. 309 00:13:50,120 --> 00:13:56,950 If you ever get to a point where beta gets lower than alpha, 310 00:13:56,950 --> 00:14:02,000 or alpha gets higher than beta, then you just say, screw this. 311 00:14:02,000 --> 00:14:05,570 I'm not even going to look at the remaining stuff. 312 00:14:05,570 --> 00:14:09,870 I'm going to just prune now and go somewhere else that's 313 00:14:09,870 --> 00:14:11,780 less pointless than this. 314 00:14:11,780 --> 00:14:15,220 Because if the alpha gets higher than the beta, what that's 315 00:14:15,220 --> 00:14:18,515 saying is the maximizer says, oh man, look at this, minimizer. 316 00:14:18,515 --> 00:14:22,450 The lowest you can make me go is, say, 50. 317 00:14:22,450 --> 00:14:24,300 And the minimizer says, that's strange. 
318 00:14:24,300 --> 00:14:28,760 Because the highest that you can make me go is 40. 319 00:14:28,760 --> 00:14:32,210 So something's generally amiss there. 320 00:14:32,210 --> 00:14:35,220 It usually means that one of the two of them 321 00:14:35,220 --> 00:14:37,890 doesn't even want to be exploring that branch at all. 322 00:14:37,890 --> 00:14:40,816 So you prune at that point. 323 00:14:40,816 --> 00:14:43,470 All right, so given that that's what we're looking for, 324 00:14:43,470 --> 00:14:46,650 how do we move the alphas and betas throughout the tree? 325 00:14:46,650 --> 00:14:48,400 There's a few different ways to draw them. 326 00:14:48,400 --> 00:14:52,830 And some of them I consider to be very busy. 327 00:14:52,830 --> 00:14:54,930 Probably in recitation and tutorial 328 00:14:54,930 --> 00:14:57,980 you will see a way that's busier and has more numbers. 329 00:14:57,980 --> 00:15:01,060 Technically, every node has both an alpha and a beta. 330 00:15:01,060 --> 00:15:04,560 However, the one that that node is paying attention to 331 00:15:04,560 --> 00:15:06,190 is the alpha, if it's a maximizer, 332 00:15:06,190 --> 00:15:08,360 and the beta if it's a minimizer. 333 00:15:08,360 --> 00:15:12,290 So I generally, for my purposes, only draw the alpha out 334 00:15:12,290 --> 00:15:16,330 for the maximizer and only draw the beta out for the minimizer. 335 00:15:16,330 --> 00:15:19,742 Very rarely, but it happens, they'll sometimes ask you, 336 00:15:19,742 --> 00:15:21,450 well, what's the beta of this node, which 337 00:15:21,450 --> 00:15:22,460 is a maximizer node? 338 00:15:22,460 --> 00:15:24,680 So it's good to know how it's derived. 339 00:15:24,680 --> 00:15:27,540 But I think that it wastes your time to write it out. 340 00:15:27,540 --> 00:15:28,520 That's my opinion. 341 00:15:28,520 --> 00:15:29,820 We'll see how it goes. 
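The alpha and beta bookkeeping described so far can be sketched in a few lines of Python. This is a hedged illustration, not the course's lab code. The helper name `should_prune` is hypothetical, and pruning on `alpha >= beta` is the common convention assumed here; the lecture itself only talks about the two values crossing.

```python
import math

# Defaults from the lecture: each side starts from its worst possible fallback.
alpha = -math.inf   # the maximizer's fail-safe: "maximizer loses"
beta = math.inf     # the minimizer's fail-safe: "maximizer wins"


def should_prune(alpha, beta):
    """True once the two guarantees meet or cross, so nothing further
    down this branch can change the result (assumed >= convention)."""
    return alpha >= beta


# The exchange from the lecture: the minimizer caps the score at 100,
# then the maximizer secures at least 0. The window is still open.
beta = min(beta, 100)
alpha = max(alpha, 0)
print(alpha, beta, should_prune(alpha, beta))   # 0 100 False

# The contradictory case: maximizer guaranteed 50, minimizer guaranteed 40.
print(should_prune(50, 40))                     # True
```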
342 00:15:29,820 --> 00:15:32,720 So the way that it works, the way that alpha and beta work, 343 00:15:32,720 --> 00:15:33,970 is the Snow White principle. 344 00:15:33,970 --> 00:15:37,100 So does everyone know the story of Snow White? 345 00:15:37,100 --> 00:15:38,860 So there's a beautiful princess. 346 00:15:38,860 --> 00:15:40,696 There's an evil queen stepmother. 347 00:15:40,696 --> 00:15:43,070 Mirror mirror on the wall, who's the fairest of them all, 348 00:15:43,070 --> 00:15:46,620 finds out that it's the stepdaughter. 349 00:15:46,620 --> 00:15:49,130 So much like in the real world, in Snow White, 350 00:15:49,130 --> 00:15:52,440 the stepdaughter, Snow White, had the beauty of her parents. 351 00:15:52,440 --> 00:15:54,070 She inherited those. 352 00:15:54,070 --> 00:15:56,560 However, much like in the real world, maybe 353 00:15:56,560 --> 00:16:00,810 or perhaps not, the stepmother had an even better plan. 354 00:16:00,810 --> 00:16:04,980 She hired a hunter to sort of hunt Snow White, 355 00:16:04,980 --> 00:16:07,560 pull out Snow White's heart, and feed it 356 00:16:07,560 --> 00:16:09,710 to her so that she could gain Snow White's 357 00:16:09,710 --> 00:16:11,590 beauty for herself. 358 00:16:11,590 --> 00:16:14,900 How many people knew that version of the story? 359 00:16:14,900 --> 00:16:15,590 A few people. 360 00:16:15,590 --> 00:16:17,298 That's the original version of the story. 361 00:16:17,298 --> 00:16:18,570 Disney didn't put that in. 362 00:16:18,570 --> 00:16:19,944 The hunter then brought the heart 363 00:16:19,944 --> 00:16:22,250 of a deer, which I think in Disney the hunter did kill 364 00:16:22,250 --> 00:16:24,130 a deer arbitrarily, but it was not 365 00:16:24,130 --> 00:16:26,260 explained that that's why he was doing it. 366 00:16:26,260 --> 00:16:29,540 So in alpha-beta, it's just like that. 
367 00:16:29,540 --> 00:16:32,760 By which I mean you start by inheriting the alpha 368 00:16:32,760 --> 00:16:34,400 and beta of your parents. 369 00:16:34,400 --> 00:16:38,360 But if you see something that you like amongst your children, 370 00:16:38,360 --> 00:16:42,810 you take it for yourself-- the Snow White principle. 371 00:16:42,810 --> 00:16:44,100 So let's see how that goes. 372 00:16:44,100 --> 00:16:48,200 Well, I told you guys that the default alpha was-- 373 00:16:48,200 --> 00:16:49,612 AUDIENCE: Negative infinity. 374 00:16:49,612 --> 00:16:50,820 PROFESSOR: Negative infinity. 375 00:16:50,820 --> 00:16:54,300 So here alpha is negative infinity. 376 00:16:54,300 --> 00:16:58,310 And I told you that the default beta was positive infinity. 377 00:16:58,310 --> 00:17:01,110 We're doing a depth first search here. 378 00:17:01,110 --> 00:17:03,590 All right, beta is infinity. 379 00:17:03,590 --> 00:17:09,099 All right, so we come here to E. Now, we could put an alpha. 380 00:17:09,099 --> 00:17:12,140 But I never put an alpha or a beta 381 00:17:12,140 --> 00:17:14,250 for one of the terminal nodes. 382 00:17:14,250 --> 00:17:17,329 Because it can't really do anything. 383 00:17:17,329 --> 00:17:19,060 It's just 2. 384 00:17:19,060 --> 00:17:22,352 So as we go down, we take the alpha and beta 385 00:17:22,352 --> 00:17:23,060 from our parents. 386 00:17:23,060 --> 00:17:25,830 But as we go up to a parent, if the parent likes 387 00:17:25,830 --> 00:17:28,450 what it sees in the child, it takes it instead. 388 00:17:28,450 --> 00:17:31,440 So I ask you all the question, would the minimizer 389 00:17:31,440 --> 00:17:35,490 prefer this 2 that it sees from its child or its own infinity 390 00:17:35,490 --> 00:17:36,630 for a beta? 391 00:17:36,630 --> 00:17:37,390 AUDIENCE: 2 392 00:17:37,390 --> 00:17:38,790 PROFESSOR: It likes the 2. 393 00:17:38,790 --> 00:17:40,190 That's absolutely right. 394 00:17:40,190 --> 00:17:44,060 So 2. 
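The Snow White principle amounts to two small rules, sketched below in Python. The function names `inherit` and `absorb` are invented for this illustration and are not from the course materials. The example replays the step just described, where minimizer B starts with its parent's values and then takes the 2 it sees at its child E.

```python
import math


def inherit(parent_alpha, parent_beta):
    """Going down the tree: a node starts with its parent's alpha and beta."""
    return parent_alpha, parent_beta


def absorb(is_maximizer, alpha, beta, child_value):
    """Coming back up: the node takes the child's value if it likes it better.
    A maximizer only ever raises alpha, a minimizer only ever lowers beta."""
    if is_maximizer:
        return max(alpha, child_value), beta
    return alpha, min(beta, child_value)


# A is the root, so it holds the defaults.
a_alpha, a_beta = -math.inf, math.inf

# B is a minimizer: it inherits from A, then sees E = 2 and prefers it to infinity.
b_alpha, b_beta = inherit(a_alpha, a_beta)
b_alpha, b_beta = absorb(False, b_alpha, b_beta, 2)
print(b_alpha, b_beta)   # -inf 2
```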
395 00:17:44,060 --> 00:17:48,575 All right, great, so now we go down to F. What is F's alpha? 396 00:17:51,854 --> 00:17:55,938 Who says negative infinity? 397 00:17:55,938 --> 00:17:58,220 Who says 2? 398 00:17:58,220 --> 00:18:01,690 No one-- oh, you guys are good. 399 00:18:01,690 --> 00:18:03,220 It's negative infinity. 400 00:18:03,220 --> 00:18:06,094 Technically, it also will have a beta of 2. 401 00:18:06,094 --> 00:18:07,260 But we're ignoring the beta. 402 00:18:07,260 --> 00:18:09,890 And the alphas that have been progressing downward 403 00:18:09,890 --> 00:18:11,962 from the parents-- negative infinity. 404 00:18:11,962 --> 00:18:14,170 That's why I called it the grandfather clause before. 405 00:18:14,170 --> 00:18:19,280 Because you would often look up to your grandparent to see what 406 00:18:19,280 --> 00:18:21,800 your default number is. 407 00:18:21,800 --> 00:18:23,830 So we get an alpha of negative infinity. 408 00:18:23,830 --> 00:18:26,134 We then go down to the K. It's a static evaluation. 409 00:18:26,134 --> 00:18:28,550 And now I'm going to start calling on people individually. 410 00:18:28,550 --> 00:18:30,450 So hopefully people paid attention 411 00:18:30,450 --> 00:18:34,010 to the mob, who were always correct. 412 00:18:34,010 --> 00:18:38,520 All right, so we go down to K. And we see a 3. 413 00:18:38,520 --> 00:18:40,520 F is a maximizer node. 414 00:18:40,520 --> 00:18:43,300 So what does F do now? 415 00:18:43,300 --> 00:18:45,450 AUDIENCE: Switches its alpha to 3. 416 00:18:45,450 --> 00:18:48,350 PROFESSOR: Yes, switches its alpha to 3, great. 417 00:18:55,430 --> 00:18:59,005 All right, so that's already quite good. 418 00:18:59,005 --> 00:19:00,075 It switches alpha to 3. 419 00:19:00,075 --> 00:19:01,330 It's very happy. 420 00:19:01,330 --> 00:19:03,526 It's got a 3 here. 421 00:19:03,526 --> 00:19:06,360 That's a nice value. 422 00:19:06,360 --> 00:19:16,736 So what does it do at L, the next node? 
423 00:19:16,736 --> 00:19:18,865 It's gone to K, went back up to F. Depth 424 00:19:18,865 --> 00:19:20,740 first search, the next one would be L, right? 425 00:19:23,274 --> 00:19:24,149 AUDIENCE: [INAUDIBLE] 426 00:19:27,376 --> 00:19:32,420 PROFESSOR: Well, technically F could take L's value of 0 427 00:19:32,420 --> 00:19:34,690 if it liked it better than 3. 428 00:19:34,690 --> 00:19:35,590 But it's a maximizer. 429 00:19:35,590 --> 00:19:37,535 So does it want to take that? 430 00:19:37,535 --> 00:19:38,410 AUDIENCE: [INAUDIBLE] 431 00:19:38,410 --> 00:19:40,310 PROFESSOR: OK, that technically would be correct. 432 00:19:40,310 --> 00:19:40,893 But I'm sorry. 433 00:19:40,893 --> 00:19:43,830 I burdened you with a trick question. 434 00:19:43,830 --> 00:19:47,180 In fact, we don't look at L at all. 435 00:19:51,780 --> 00:19:53,510 Does everyone see that? 436 00:19:53,510 --> 00:19:54,740 I'll explain. 437 00:19:54,740 --> 00:19:57,190 The alpha at F has reached 3. 438 00:19:57,190 --> 00:20:00,700 But the beta at B is 2. 439 00:20:00,700 --> 00:20:03,600 So B looks down and says, wait a minute. 440 00:20:03,600 --> 00:20:06,990 If I go down to F, my enemy's nuclear option, 441 00:20:06,990 --> 00:20:10,450 my enemy is the worst it can be for-- the best it 442 00:20:10,450 --> 00:20:11,960 can be for me is 3. 443 00:20:11,960 --> 00:20:14,320 F is trumpeting it around. 444 00:20:14,320 --> 00:20:16,436 I was thinking of eating his heart, or whatever, 445 00:20:16,436 --> 00:20:17,310 but I didn't want to. 446 00:20:17,310 --> 00:20:19,050 But it's going to be 3. 447 00:20:19,050 --> 00:20:24,146 It's going to be 3 or higher down there at F. 448 00:20:24,146 --> 00:20:25,270 There's no way I want that. 449 00:20:25,270 --> 00:20:29,310 I already have my own default escape plan. 450 00:20:29,310 --> 00:20:30,219 And that's 2. 
451 00:20:30,219 --> 00:20:32,260 That's going to be better than whatever comes out 452 00:20:32,260 --> 00:20:35,871 of that horrible F. So screw it. 453 00:20:35,871 --> 00:20:39,535 And we never look at L. Does everyone get that? 454 00:20:39,535 --> 00:20:43,520 That is the main principle of alpha-beta pruning. 455 00:20:43,520 --> 00:20:47,100 If you see an alpha that's higher than the beta above it-- 456 00:20:47,100 --> 00:20:50,020 as I said, if alpha goes up above the beta-- 457 00:20:50,020 --> 00:20:55,360 or if you see a beta, like if there's a beta down here, 458 00:20:55,360 --> 00:20:59,450 and it's lower than the alpha above it, prune it. 459 00:20:59,450 --> 00:21:02,450 Stop doing that. 460 00:21:02,450 --> 00:21:04,270 And the question is, who prunes? 461 00:21:04,270 --> 00:21:06,840 Who decides that you don't look at L? 462 00:21:06,840 --> 00:21:09,560 The person who is thinking not to look at L 463 00:21:09,560 --> 00:21:13,460 is always up higher by at least two levels. 464 00:21:13,460 --> 00:21:15,670 So up here, B is saying, hmm, I don't 465 00:21:15,670 --> 00:21:19,130 want to look at L. Because F is already so terrible for me 466 00:21:19,130 --> 00:21:22,890 that it's just beyond belief. 467 00:21:22,890 --> 00:21:25,920 If this is 100, it might be 100. 468 00:21:25,920 --> 00:21:29,350 Even if it's lower, I'm still going to get a three. 469 00:21:29,350 --> 00:21:33,330 There's a sanity check that I've written that I sort of came 470 00:21:33,330 --> 00:21:36,180 up with just in case you're not sure that you can skip it. 471 00:21:36,180 --> 00:21:38,060 Because on a lot of these tests, we ask you, 472 00:21:38,060 --> 00:21:40,884 which ones do you evaluate, which ones do you skip, right? 473 00:21:40,884 --> 00:21:42,675 Or we just say, which ones do you evaluate, 474 00:21:42,675 --> 00:21:44,760 and you don't write the ones that you skip. 475 00:21:44,760 --> 00:21:48,860 Here's my sanity test to see if you can skip it.
476 00:21:48,860 --> 00:21:52,080 Ask yourself, if that node that I'm 477 00:21:52,080 --> 00:21:54,040 about to skip contained a negative infinity 478 00:21:54,040 --> 00:21:56,550 or some arbitrarily small number, negative infinity 479 00:21:56,550 --> 00:22:01,260 being the minimizer wins, would it change anything? 480 00:22:01,260 --> 00:22:02,920 Now that I've answered that, if it 481 00:22:02,920 --> 00:22:07,350 contained a positive infinity, would it change anything? 482 00:22:07,350 --> 00:22:11,820 If the answer is no both times, then you're 483 00:22:11,820 --> 00:22:13,760 definitely correct in pruning it. 484 00:22:13,760 --> 00:22:14,850 So look at that 0. 485 00:22:14,850 --> 00:22:16,810 If it was a negative infinity, minimizer wins, 486 00:22:16,810 --> 00:22:17,705 what would happen? 487 00:22:17,705 --> 00:22:20,320 The maximizer would say, I'm not touching that with a 10 foot 488 00:22:20,320 --> 00:22:22,370 pole, choosing 3. 489 00:22:22,370 --> 00:22:26,380 The minimizer would say, screw that, I'll take E. 490 00:22:26,380 --> 00:22:28,005 Let's say it was a positive infinity. 491 00:22:28,005 --> 00:22:31,075 The maximizer would say, eureka, holy grail, I win. 492 00:22:31,075 --> 00:22:33,080 The minimizer would say, yeah, if I'm a moron, 493 00:22:33,080 --> 00:22:36,450 and go down to F, and then would go to E and take 2. 494 00:22:36,450 --> 00:22:41,640 So no matter what was there, the minimizer would go to E. 495 00:22:41,640 --> 00:22:44,100 And you could say, well, what if it was exactly 2? 496 00:22:44,100 --> 00:22:46,130 But still the maximizer would choose 497 00:22:46,130 --> 00:22:48,520 K. The minimizer would go to E. So there's 498 00:22:48,520 --> 00:22:49,680 no reason to go down there. 499 00:22:49,680 --> 00:22:51,410 We can just prune it off right now. 500 00:22:51,410 --> 00:22:54,070 Does everyone agree, everyone see 501 00:22:54,070 --> 00:22:55,720 what I'm talking about here?
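[Editor's sketch, not part of the lecture] The sanity test described above can be tried out on the B subtree alone. This is a minimal illustration, assuming only the values stated in the walkthrough (E = 2, K = 3); the function name `b_value` is hypothetical. It recomputes B's plain minimax value with L forced to negative infinity and then to positive infinity, and checks whether the result changes.

```python
import math

def b_value(l_value):
    """Plain minimax value of B for a hypothetical static value at L."""
    # F is a maximizer choosing between K (static value 3) and L.
    f_value = max(3, l_value)
    # B is a minimizer choosing between E (static value 2) and F.
    return min(2, f_value)

# The two extreme cases from the sanity test.
low = b_value(-math.inf)   # L is an automatic win for the minimizer
high = b_value(math.inf)   # L is an automatic win for the maximizer

# If B's value is the same in both cases, skipping L cannot change anything.
can_prune_l = (low == high)
print(low, high, can_prune_l)
```

In both cases B ends up taking E, which matches the argument that L never needs to be looked at.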
502 00:22:55,720 --> 00:23:00,225 Great, so we're now done with this branch. 503 00:23:00,225 --> 00:23:01,790 Because beta is 2. 504 00:23:01,790 --> 00:23:05,232 So now we're up at old grandpappy A. 505 00:23:05,232 --> 00:23:06,940 And he has an alpha of negative infinity. 506 00:23:06,940 --> 00:23:11,660 Everyone, what will he do? 507 00:23:11,660 --> 00:23:12,709 He'll take the 2. 508 00:23:12,709 --> 00:23:14,500 It's better than negative infinity for him. 509 00:23:14,500 --> 00:23:15,600 It's not wonderful. 510 00:23:15,600 --> 00:23:19,670 But certainly anything is better than an automatic loss. 511 00:23:19,670 --> 00:23:23,780 All right, now our highest node is a 2. 512 00:23:23,780 --> 00:23:27,380 So let's keep that in mind for our alpha. 513 00:23:27,380 --> 00:23:29,820 OK, so let's go over here. 514 00:23:29,820 --> 00:23:32,854 Let's see, so what will be the value at C? 515 00:23:32,854 --> 00:23:34,020 What will be the beta value? 516 00:23:37,366 --> 00:23:40,282 AUDIENCE: [INAUDIBLE] 517 00:23:42,712 --> 00:23:44,660 PROFESSOR: You go back to which one? 518 00:23:44,660 --> 00:23:47,090 To G. I'm not at G yet. 519 00:23:47,090 --> 00:23:49,800 I'm actually just starting the middle branch. 520 00:23:49,800 --> 00:23:52,280 So I'm going to C. And what's going to be its starting 521 00:23:52,280 --> 00:23:54,902 beta before I go down? 522 00:23:54,902 --> 00:23:56,170 AUDIENCE: Infinity. 523 00:23:56,170 --> 00:23:58,910 PROFESSOR: Infinity, that's right-- default value. 524 00:23:58,910 --> 00:24:00,570 It's easier than it seemed. 525 00:24:00,570 --> 00:24:03,690 All right, so yes, beta is equal to infinity. 526 00:24:06,370 --> 00:24:08,090 This should be better erased. 527 00:24:08,090 --> 00:24:09,570 I think it's confusing people. 528 00:24:09,570 --> 00:24:14,790 Great, OK, so beta is equal to infinity at C. 529 00:24:14,790 --> 00:24:17,110 Now we go down depth first search to G. 
What's 530 00:24:17,110 --> 00:24:18,460 going to be our alpha at G? 531 00:24:18,460 --> 00:24:19,620 AUDIENCE: Minus infinity. 532 00:24:19,620 --> 00:24:21,560 PROFESSOR: Ahh, it would seem so. 533 00:24:21,560 --> 00:24:26,110 However, take a look up at the great-grandpappy A. 534 00:24:26,110 --> 00:24:28,490 It seems to have changed to 2. 535 00:24:28,490 --> 00:24:29,930 So this time it's 2. 536 00:24:29,930 --> 00:24:31,680 Why is it 2 instead of negative infinity? 537 00:24:31,680 --> 00:24:34,873 Why can we let A be so noxious and not start with saying, 538 00:24:34,873 --> 00:24:36,710 oh, I automatically lose? 539 00:24:36,710 --> 00:24:40,040 Well, A knows that no matter how awful things get 540 00:24:40,040 --> 00:24:42,680 in that middle branch, he can just say, 541 00:24:42,680 --> 00:24:43,930 screw the whole middle branch. 542 00:24:43,930 --> 00:24:45,110 I'm going to B. 543 00:24:45,110 --> 00:24:47,796 That's something that the minimizer can't do. 544 00:24:47,796 --> 00:24:49,920 And we have to start at infinity for the minimizer. 545 00:24:49,920 --> 00:24:51,140 But the maximizer can. 546 00:24:51,140 --> 00:24:52,790 Because he has the choice at the top. 547 00:24:52,790 --> 00:24:53,850 Does everyone see that? 548 00:24:53,850 --> 00:24:56,440 He can just say, oh, I'm not even going to C. Yeah, 549 00:24:56,440 --> 00:24:57,330 shows you. 550 00:24:57,330 --> 00:24:59,320 I'm going to B and taking the 2. 551 00:24:59,320 --> 00:25:03,640 So therefore alpha is actually 2 at G. 552 00:25:03,640 --> 00:25:06,500 All right, great, so we've got an alpha that's 553 00:25:06,500 --> 00:25:10,790 2 at G. We're going to go down to M. It's a minimizer. 554 00:25:10,790 --> 00:25:14,407 All right, what's going to be our beta value at M? 555 00:25:14,407 --> 00:25:15,282 AUDIENCE: [INAUDIBLE] 556 00:25:19,230 --> 00:25:21,325 PROFESSOR: Or which is the beta default, minus 557 00:25:21,325 --> 00:25:22,200 or positive infinity?
558 00:25:22,200 --> 00:25:24,554 What would be the minimizer? 559 00:25:24,554 --> 00:25:25,345 AUDIENCE: Positive. 560 00:25:25,345 --> 00:25:27,180 PROFESSOR: Positive infinity, that's right. 561 00:25:27,180 --> 00:25:29,250 M is going to be a positive infinity for beta. 562 00:25:29,250 --> 00:25:34,670 Again, it picks it up from C. Great, 563 00:25:34,670 --> 00:25:39,030 now we get to some actual values. 564 00:25:39,030 --> 00:25:41,390 So we're at some actual values. 565 00:25:41,390 --> 00:25:45,520 We are at Q. So what's going to happen 566 00:25:45,520 --> 00:25:47,230 at M when M sees that Q is 1? 567 00:25:53,431 --> 00:25:55,040 AUDIENCE: [INAUDIBLE] 568 00:25:55,040 --> 00:25:56,040 PROFESSOR: What is beta? 569 00:25:56,040 --> 00:25:56,371 It says infinity. 570 00:25:56,371 --> 00:25:57,890 I'm sorry, it's hard to read. 571 00:25:57,890 --> 00:25:59,486 Beta is infinity at M. 572 00:25:59,486 --> 00:26:01,470 AUDIENCE: OK, so it's going to minimize, right? 573 00:26:01,470 --> 00:26:05,001 So it's going to be like, OK, [INAUDIBLE]. 574 00:26:05,001 --> 00:26:06,000 PROFESSOR: That's right. 575 00:26:06,000 --> 00:26:08,270 So they're going to put beta to 1. 576 00:26:08,270 --> 00:26:16,250 Because it sees Q. Great, so my next question is, 577 00:26:16,250 --> 00:26:19,124 what's going to happen at R? 578 00:26:19,124 --> 00:26:22,052 AUDIENCE: [INAUDIBLE] 579 00:26:28,030 --> 00:26:30,010 PROFESSOR: Very smart. 580 00:26:30,010 --> 00:26:31,397 You've detected my trap. 581 00:26:31,397 --> 00:26:32,855 The question is, does it look at R? 582 00:26:32,855 --> 00:26:35,142 The answer is, no. 583 00:26:35,142 --> 00:26:37,100 It doesn't look at R. Why doesn't it look at R? 584 00:26:37,100 --> 00:26:39,020 Does everyone see? 585 00:26:39,020 --> 00:26:42,330 Yeah, alpha is now greater than the beta below it. 586 00:26:42,330 --> 00:26:43,917 Beta has gotten lower than alpha. 
587 00:26:43,917 --> 00:26:46,000 This is the same thing I was talking about before, 588 00:26:46,000 --> 00:26:48,090 when we figured out that the alpha here is 2. 589 00:26:48,090 --> 00:26:52,470 The maximizer says, wait a minute. 590 00:26:52,470 --> 00:26:55,740 The maximizer G says, if I go to M, the best I'm getting out 591 00:26:55,740 --> 00:26:56,759 of this is 1. 592 00:26:56,759 --> 00:26:58,300 Because if this is negative infinity, 593 00:26:58,300 --> 00:26:59,680 the minimizer will choose it. 594 00:26:59,680 --> 00:27:01,555 If this is positive infinity, he'll choose 1. 595 00:27:01,555 --> 00:27:04,237 The best I'm going to get out of here is 1. 596 00:27:04,237 --> 00:27:06,320 If that's the case, I might as well have just gone 597 00:27:06,320 --> 00:27:09,600 to B and not even gone to C. So I'm not going to go to M. 598 00:27:09,600 --> 00:27:10,440 I'll go to N, maybe. 599 00:27:10,440 --> 00:27:13,370 Maybe N is better. 600 00:27:13,370 --> 00:27:15,300 Does everyone see that? 601 00:27:15,300 --> 00:27:20,270 Great, so let's say that the maximizer does go to N. 602 00:27:20,270 --> 00:27:23,379 So what's going to happen with this alpha? 603 00:27:23,379 --> 00:27:24,800 AUDIENCE: [INAUDIBLE] 604 00:27:24,800 --> 00:27:26,640 PROFESSOR: That's right, it's going to be 7. 605 00:27:26,640 --> 00:27:28,080 7 is better than 2. 606 00:27:28,080 --> 00:27:30,960 And the maximizer has control to get to that seven, 607 00:27:30,960 --> 00:27:36,940 at least if it gets to G. All right, now the minimizer at C-- 608 00:27:36,940 --> 00:27:38,580 we'll do everyone this time. 609 00:27:38,580 --> 00:27:43,798 The minimizer at C, seeing that 7, what does the minimizer do? 610 00:27:43,798 --> 00:27:44,298 Anyone? 611 00:27:48,200 --> 00:27:49,210 So it sees the 7. 612 00:27:49,210 --> 00:27:50,680 What does it do to its beta? 613 00:27:50,680 --> 00:27:55,330 It takes the 7-- better than infinity, anyway. 
614 00:27:55,330 --> 00:27:58,290 And yeah, then it checks H. And everybody, again, what 615 00:27:58,290 --> 00:28:00,100 happens at H? 616 00:28:00,100 --> 00:28:00,970 It takes the 6. 617 00:28:00,970 --> 00:28:01,865 It's lower than 7. 618 00:28:04,730 --> 00:28:07,747 All right, now we'll go back to having 619 00:28:07,747 --> 00:28:08,830 people do it on their own. 620 00:28:08,830 --> 00:28:10,246 Well, all the way back to the top, 621 00:28:10,246 --> 00:28:13,216 what does A do when it sees the 6 coming out of C? 622 00:28:13,216 --> 00:28:15,150 AUDIENCE: Changes to 6. 623 00:28:15,150 --> 00:28:17,330 PROFESSOR: Changes to 6, that's right. 624 00:28:17,330 --> 00:28:18,910 Alpha equals 6. 625 00:28:18,910 --> 00:28:24,620 Great-- homestretch, people, homestretch. 626 00:28:24,620 --> 00:28:31,650 So the minimizer, everyone, has a beta of infinity. 627 00:28:31,650 --> 00:28:35,070 And if I wasn't a static node, it would have an alpha of 6. 628 00:28:35,070 --> 00:28:36,090 But it is a static node. 629 00:28:36,090 --> 00:28:38,570 So it just has a value of 1. 630 00:28:38,570 --> 00:28:43,380 So since it has a value of 1, everyone, the beta becomes 1. 631 00:28:43,380 --> 00:28:45,264 And what next, everyone? 632 00:28:45,264 --> 00:28:45,930 AUDIENCE: Prune. 633 00:28:45,930 --> 00:28:49,960 PROFESSOR: Prune, that's right. 634 00:28:49,960 --> 00:28:51,110 Why prune? 635 00:28:51,110 --> 00:28:54,650 Well, this time it's A himself who can prune. 636 00:28:54,650 --> 00:28:56,723 A says, well darn, if I go to D, I'm 637 00:28:56,723 --> 00:28:59,360 going to get 1 or something even worse than 1. 638 00:28:59,360 --> 00:29:02,550 I might as well take my 6 while I have it, 639 00:29:02,550 --> 00:29:04,875 prune all the rest all the way down. 640 00:29:09,210 --> 00:29:10,620 Everyone see that? 641 00:29:10,620 --> 00:29:12,710 Everyone cool with that? 642 00:29:12,710 --> 00:29:16,060 It's not too bad if you take it one step at a time. 
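[Editor's sketch, not part of the lecture] The walkthrough above can be replayed in a few lines of code. This is a minimal sketch under stated assumptions: the tree shape is inferred from the discussion (in particular, O and P as children of J is an assumption), and the static values for R, O, and P are placeholders, since the lecture never reveals them and they are pruned anyway. The names `TREE`, `STATIC`, and `alpha_beta` are illustrative.

```python
import math

# Children of each internal node. A is a maximizer and levels alternate.
TREE = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": ["G", "H"],
    "D": ["I", "J"],
    "F": ["K", "L"],
    "G": ["M", "N"],
    "J": ["O", "P"],   # assumed placement of O and P
    "M": ["Q", "R"],
}

# Static values mentioned in the walkthrough; R, O, P are placeholders.
STATIC = {"E": 2, "K": 3, "L": 0, "Q": 1, "N": 7, "H": 6, "I": 1,
          "R": 0, "O": 0, "P": 0}

def alpha_beta(node, is_max, alpha, beta, visited):
    """Return the backed-up value of node, recording static evaluations."""
    if node not in TREE:
        visited.append(node)          # a static evaluation happens here
        return STATIC[node]
    for child in TREE[node]:
        # Alpha and beta are handed down unchanged to every descendant.
        value = alpha_beta(child, not is_max, alpha, beta, visited)
        if is_max:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:             # the remaining children are pruned
            break
    return alpha if is_max else beta

visited = []
root_value = alpha_beta("A", True, -math.inf, math.inf, visited)
print(root_value, visited)
```

Because every call passes its current alpha and beta straight down, a node such as G starts with A's alpha of 2 rather than negative infinity, which is the inheritance described in the walkthrough.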
643 00:29:16,060 --> 00:29:18,410 We did it. 644 00:29:18,410 --> 00:29:21,966 Our question is, which nodes are evaluated in order? 645 00:29:21,966 --> 00:29:35,170 Our answer is, everyone-- E, K, Q, N, H, I. OK, not so obvious, 646 00:29:35,170 --> 00:29:35,670 I guess. 647 00:29:35,670 --> 00:29:36,980 A few people followed me. 648 00:29:36,980 --> 00:29:40,590 But it is E, K, Q, N, H, I. It's just depth first order. 649 00:29:40,590 --> 00:29:43,220 And we pruned some of them away. 650 00:29:43,220 --> 00:29:47,260 Great, so that is alpha-beta. 651 00:29:47,260 --> 00:29:49,470 Any questions about that before I give some questions 652 00:29:49,470 --> 00:29:52,022 about progressive deepening? 653 00:29:52,022 --> 00:29:53,230 All right, we've got a bunch. 654 00:29:53,230 --> 00:29:55,182 So first question. 655 00:29:55,182 --> 00:30:02,480 AUDIENCE: [INAUDIBLE] nodes like F, B, C, and D? 656 00:30:02,480 --> 00:30:04,190 PROFESSOR: The question is, when asked 657 00:30:04,190 --> 00:30:07,900 for the order of evaluation, are we excluding F, B, C, and D? 658 00:30:07,900 --> 00:30:11,250 The answer is we're talking about here static evaluation. 659 00:30:11,250 --> 00:30:14,720 The static evaluator is a very important and interesting 660 00:30:14,720 --> 00:30:15,400 function. 661 00:30:15,400 --> 00:30:18,230 And I'll get back to something a few students have asked me 662 00:30:18,230 --> 00:30:21,484 about the static evaluator later and try to explain what it is. 663 00:30:21,484 --> 00:30:23,650 It's basically the thing that pops out those numbers 664 00:30:23,650 --> 00:30:25,015 at the bottom of the leaves. 665 00:30:25,015 --> 00:30:26,765 So when we ask, what is the order of nodes 666 00:30:26,765 --> 00:30:31,510 that were statically evaluated, we mean leaves only. 667 00:30:31,510 --> 00:30:32,580 That's a good question. 668 00:30:32,580 --> 00:30:33,620 Any other questions? 669 00:30:33,620 --> 00:30:35,430 Let's see, there was one up here before. 
670 00:30:35,430 --> 00:30:36,170 But it's gone. 671 00:30:36,170 --> 00:30:37,885 It might have been the same one. 672 00:30:37,885 --> 00:30:38,384 Question? 673 00:30:38,384 --> 00:30:39,717 AUDIENCE: So a similar question. 674 00:30:39,717 --> 00:30:42,578 When you say, static nodes, that just means the leaf nodes? 675 00:30:42,578 --> 00:30:43,750 PROFESSOR: Means the leaf nodes, that's right. 676 00:30:43,750 --> 00:30:46,041 The question is, does static nodes mean the leaf nodes. 677 00:30:46,041 --> 00:30:46,903 The answer is yes. 678 00:30:46,903 --> 00:30:48,319 AUDIENCE: And so static evaluation 679 00:30:48,319 --> 00:30:51,737 is when you compare the value of a static node to something? 680 00:30:51,737 --> 00:30:53,570 PROFESSOR: Static evaluation is when you get 681 00:30:53,570 --> 00:30:55,180 that number, the static node. 682 00:30:55,180 --> 00:30:56,480 Let me explain. 683 00:30:56,480 --> 00:30:59,140 Unless someone else has another question about alpha-beta, 684 00:30:59,140 --> 00:31:00,500 let me explain static values. 685 00:31:00,500 --> 00:31:02,030 Because I was about to do that. 686 00:31:02,030 --> 00:31:03,571 There is a question about alpha-beta. 687 00:31:03,571 --> 00:31:06,326 I'll come back to both of yours after I answer this. 688 00:31:06,326 --> 00:31:09,437 AUDIENCE: You were mentioning [INAUDIBLE]. 689 00:31:09,437 --> 00:31:11,385 And I'm a little bit confused. 690 00:31:11,385 --> 00:31:13,576 If you're looking at one node, and you're 691 00:31:13,576 --> 00:31:16,255 seeing either grab the value from the grandparent 692 00:31:16,255 --> 00:31:18,203 or grab it from the-- 693 00:31:18,203 --> 00:31:21,134 PROFESSOR: So it always starts-- the question is, what 694 00:31:21,134 --> 00:31:22,300 is the Snow White principle? 695 00:31:22,300 --> 00:31:23,380 How does it work? 
696 00:31:23,380 --> 00:31:26,760 Every node always starts off with taking 697 00:31:26,760 --> 00:31:29,865 the value of the same type, alpha or beta, 698 00:31:29,865 --> 00:31:30,740 from its grandparent. 699 00:31:30,740 --> 00:31:33,010 It always starts that way. 700 00:31:33,010 --> 00:31:34,427 Now, you say, why the grandparent? 701 00:31:34,427 --> 00:31:35,926 Wouldn't it take it from the parent? 702 00:31:35,926 --> 00:31:37,050 It actually does. 703 00:31:37,050 --> 00:31:39,430 But I'm not drawing out the alphas at all the minimizer 704 00:31:39,430 --> 00:31:39,700 levels. 705 00:31:39,700 --> 00:31:40,991 Because they don't do anything. 706 00:31:40,991 --> 00:31:44,200 They're only even there to pass them down. 707 00:31:44,200 --> 00:31:48,110 So all of the values pass down, down, down, down, down 708 00:31:48,110 --> 00:31:49,050 to begin. 709 00:31:49,050 --> 00:31:55,390 Every node, in fact, starts off with its grandparents 710 00:31:55,390 --> 00:31:57,920 with its parents' values, OK? 711 00:31:57,920 --> 00:32:00,730 But then when the node sees a child, 712 00:32:00,730 --> 00:32:02,780 it's completely done evaluating. 713 00:32:02,780 --> 00:32:05,360 It's finished. 714 00:32:05,360 --> 00:32:08,380 It can't be in the process. 715 00:32:08,380 --> 00:32:13,550 Let's say C. When C sees that G is completely 716 00:32:13,550 --> 00:32:15,780 done with all of its sub-branches 717 00:32:15,780 --> 00:32:18,760 and is ready to return a value, or if it's just 718 00:32:18,760 --> 00:32:22,120 a static evaluation, then it's automatically completely done. 719 00:32:22,120 --> 00:32:25,020 Because it has no children. 720 00:32:25,020 --> 00:32:29,550 A static value like K of 3 is automatically completely done. 721 00:32:29,550 --> 00:32:30,830 It's got a 3. 722 00:32:30,830 --> 00:32:33,360 Similarly, when we came back to G after going to N, 723 00:32:33,360 --> 00:32:36,140 and we knew that the value was 7, that was completely done. 
724 00:32:36,140 --> 00:32:37,730 The value was definitely 7. 725 00:32:37,730 --> 00:32:39,480 There was no other possibilities. 726 00:32:39,480 --> 00:32:40,830 AUDIENCE: That's after looking at the children, right? 727 00:32:40,830 --> 00:32:41,500 PROFESSOR: Yes. 728 00:32:41,500 --> 00:32:44,630 So once you're done with all the children of G, 729 00:32:44,630 --> 00:32:46,250 then G comes up and says, guess what? 730 00:32:46,250 --> 00:32:47,409 Guess what, guys? 731 00:32:47,409 --> 00:32:48,950 So technically before that, you would 732 00:32:48,950 --> 00:32:53,210 have said that G's alpha is greater than or equal to 1 733 00:32:53,210 --> 00:32:55,950 when we looked at Q. And then we looked at N. We'd say, 734 00:32:55,950 --> 00:32:57,640 it's equal exactly to 7. 735 00:32:57,640 --> 00:32:58,890 We're done here. 736 00:32:58,890 --> 00:33:02,170 And then at that point, when it's fresh and ripe 737 00:33:02,170 --> 00:33:04,890 and has all of its highest value or its best value, 738 00:33:04,890 --> 00:33:08,100 that's when the parent can eat its heart and gain that value 739 00:33:08,100 --> 00:33:08,980 itself. 740 00:33:08,980 --> 00:33:12,340 So that's when C says, for instance, oh man, 741 00:33:12,340 --> 00:33:14,550 I have an infinity. 742 00:33:14,550 --> 00:33:16,060 I really like that 7 better. 743 00:33:16,060 --> 00:33:17,390 And it takes the 7. 744 00:33:17,390 --> 00:33:19,840 But then it saw H. And it said, oh man, that's a 6. 745 00:33:19,840 --> 00:33:20,980 That's even better than 7. 746 00:33:20,980 --> 00:33:21,765 So it took the 6. 747 00:33:21,765 --> 00:33:24,510 AUDIENCE: So shouldn't the alpha take 7 then? 748 00:33:24,510 --> 00:33:25,780 PROFESSOR: So alpha takes 6. 749 00:33:25,780 --> 00:33:27,300 Because C is a minimizer. 750 00:33:27,300 --> 00:33:29,960 C took the 7 from G, but then right 751 00:33:29,960 --> 00:33:32,810 after that C saw H and took the 6. 752 00:33:32,810 --> 00:33:35,880 Because 6 is even lower than 7.
753 00:33:35,880 --> 00:33:38,160 And then alpha took the 6. 754 00:33:38,160 --> 00:33:40,260 Because 6 was higher than 2. 755 00:33:40,260 --> 00:33:42,824 AUDIENCE: So it's not going to look below the branch? 756 00:33:42,824 --> 00:33:45,240 PROFESSOR: Yeah, the problem is that the maximizer doesn't 757 00:33:45,240 --> 00:33:46,490 have control there. 758 00:33:46,490 --> 00:33:48,464 The minimizer has got control at C. 759 00:33:48,464 --> 00:33:49,880 And the minimizer is going to make 760 00:33:49,880 --> 00:33:52,020 sure it's as low as possible. 761 00:33:52,020 --> 00:33:56,470 The maximizer at A, his only control, or her only control, 762 00:33:56,470 --> 00:33:59,460 is the ability to send either way to B or C 763 00:33:59,460 --> 00:34:02,346 or D. And then at that point, at C, 764 00:34:02,346 --> 00:34:04,345 the minimizer gets to choose if we go to G or H. 765 00:34:04,345 --> 00:34:07,300 And it's never going to choose G. Because G is higher than H. 766 00:34:07,300 --> 00:34:09,829 All right, awesome, was there another question? 767 00:34:09,829 --> 00:34:12,830 All right, let's go back to static evaluations. 768 00:34:12,830 --> 00:34:16,139 When I first took this class, I had some weird thoughts 769 00:34:16,139 --> 00:34:17,139 about static evaluations. 770 00:34:17,139 --> 00:34:18,739 I heard some students ask me this. 771 00:34:18,739 --> 00:34:21,475 I almost got a question about it onto one of the tests, 772 00:34:21,475 --> 00:34:23,600 but it was edited to some other weird question that 773 00:34:23,600 --> 00:34:25,451 was m to the b to the d minus 1 or something 774 00:34:25,451 --> 00:34:26,659 like that at the last minute. 775 00:34:26,659 --> 00:34:28,909 So I'm going to pose you guys the actual question that 776 00:34:28,909 --> 00:34:31,520 would have been on one of the older tests, which 777 00:34:31,520 --> 00:34:32,790 is the following.
778 00:34:32,790 --> 00:34:34,873 I had a student who came to me and said, you know, 779 00:34:34,873 --> 00:34:37,489 [INAUDIBLE], when we do this alpha-beta pruning, and all 780 00:34:37,489 --> 00:34:40,560 this other stuff, we're trying to assume that we're really 781 00:34:40,560 --> 00:34:42,870 saving that much time by getting rid 782 00:34:42,870 --> 00:34:44,361 of a few static evaluations. 783 00:34:44,361 --> 00:34:46,110 In fact, when we do progressive deepening, 784 00:34:46,110 --> 00:34:48,401 we're always just counting, how many static evaluations 785 00:34:48,401 --> 00:34:49,290 do we have to do? 786 00:34:49,290 --> 00:34:51,586 And he said, I look at these static evaluations. 787 00:34:51,586 --> 00:34:52,710 And there's just a 3 there. 788 00:34:52,710 --> 00:34:56,520 It takes no time to do the static evaluation. 789 00:34:56,520 --> 00:34:57,890 It's on the board. 790 00:34:57,890 --> 00:34:59,920 It takes much longer to do the alpha-beta. 791 00:34:59,920 --> 00:35:02,770 It's faster by far to not do alpha-beta. 792 00:35:02,770 --> 00:35:04,680 So I then tried to explain to that student. 793 00:35:04,680 --> 00:35:07,050 I said, OK, we need to be clear about what 794 00:35:07,050 --> 00:35:08,340 static evaluations are. 795 00:35:08,340 --> 00:35:09,590 You guys get it easy. 796 00:35:09,590 --> 00:35:11,657 We put these numbers on the board. 797 00:35:11,657 --> 00:35:13,240 A static evaluation-- let's say you're 798 00:35:13,240 --> 00:35:17,010 playing a game like chess. 799 00:35:17,010 --> 00:35:19,050 Static evaluation takes a long time. 800 00:35:19,050 --> 00:35:21,910 When I was in 6170, Java [INAUDIBLE], 801 00:35:21,910 --> 00:35:24,150 the class that used to exist, we had a program 802 00:35:24,150 --> 00:35:28,440 called Anti-Chess where I used my 6034 skills to write the AI. 803 00:35:28,440 --> 00:35:31,890 And the static evaluator took a long time. 804 00:35:31,890 --> 00:35:32,970 And we were timed. 
805 00:35:32,970 --> 00:35:34,970 So getting the static evaluator faster, 806 00:35:34,970 --> 00:35:36,602 that was the most important thing. 807 00:35:36,602 --> 00:35:37,810 Why does it take a long time? 808 00:35:37,810 --> 00:35:39,875 Well, the static evaluator is an evaluation 809 00:35:39,875 --> 00:35:43,730 of the board position, the state of the game, 810 00:35:43,730 --> 00:35:45,519 at a snapshot of time. 811 00:35:45,519 --> 00:35:48,060 And that's not as easy as just saying, oh, here's the answer. 812 00:35:48,060 --> 00:35:51,500 Because in chess, first of all, not only 813 00:35:51,500 --> 00:35:53,780 did I have to look at how many pieces I had, 814 00:35:53,780 --> 00:35:56,520 what areas that I controlled. 815 00:35:56,520 --> 00:35:58,154 Also-- well, it was anti-chess. 816 00:35:58,154 --> 00:35:59,320 But that's notwithstanding. 817 00:35:59,320 --> 00:36:01,110 Let's pretend it's regular chess. 818 00:36:01,110 --> 00:36:03,900 I also had to look, if it was in regular chess-- 819 00:36:03,900 --> 00:36:05,650 and I still had to do this in anti-chess-- 820 00:36:05,650 --> 00:36:06,850 if my king was in check. 821 00:36:06,850 --> 00:36:08,350 And what that meant is I had to look 822 00:36:08,350 --> 00:36:11,040 at all of my opponent's moves, possible moves, 823 00:36:11,040 --> 00:36:13,120 to see if any one of them could take my king. 824 00:36:13,120 --> 00:36:15,921 Because in regular chess, it's illegal to put your king 825 00:36:15,921 --> 00:36:16,420 into check. 826 00:36:16,420 --> 00:36:18,830 So you better not even allow that move. 827 00:36:18,830 --> 00:36:20,500 And regardless, getting into checkmate 828 00:36:20,500 --> 00:36:22,670 is negative infinity for you. 829 00:36:22,670 --> 00:36:27,100 So it takes a really long time to do static evaluations, 830 00:36:27,100 --> 00:36:29,380 at least good ones, usually. 831 00:36:29,380 --> 00:36:30,400 You want to avoid them.
832 00:36:30,400 --> 00:36:32,740 Because they're not just some number on the page. 833 00:36:32,740 --> 00:36:34,910 They are some function you wrote that does 834 00:36:34,910 --> 00:36:38,230 a very careful analysis of the state of the game and says, 835 00:36:38,230 --> 00:36:41,770 I'm going to heuristically guess that my value is 836 00:36:41,770 --> 00:36:45,580 pi, or some other number, and then rates 837 00:36:45,580 --> 00:36:47,060 that compared to other states. 838 00:36:47,060 --> 00:36:49,530 Does that make sense to everyone? 839 00:36:49,530 --> 00:36:51,790 So the answer to the hypothetical question 840 00:36:51,790 --> 00:36:53,628 that might have been on the old test, when 841 00:36:53,628 --> 00:36:56,086 the person said, I've got this great idea where you do tons 842 00:36:56,086 --> 00:36:57,544 of static evaluation, and you don't 843 00:36:57,544 --> 00:37:00,620 have to do this long alpha-beta, is, don't do that. 844 00:37:00,620 --> 00:37:04,805 The static evaluations actually take a long time. 845 00:37:04,805 --> 00:37:06,680 Does that clear it up for people who asked me 846 00:37:06,680 --> 00:37:11,130 before about what is a static evaluation, why are the leaf 847 00:37:11,130 --> 00:37:12,030 nodes called static? 848 00:37:14,730 --> 00:37:17,550 And you might ask, why are some of these static just 849 00:37:17,550 --> 00:37:19,140 arbitrarily? 850 00:37:19,140 --> 00:37:23,030 The answer is, when you're running out of time to expand 851 00:37:23,030 --> 00:37:26,270 deeper, and you just need to stop that stage of the game-- 852 00:37:26,270 --> 00:37:27,744 maybe it's just getting too hairy, 853 00:37:27,744 --> 00:37:29,160 maybe it's spreading out too much, 854 00:37:29,160 --> 00:37:30,760 you have some heuristic that says, 855 00:37:30,760 --> 00:37:33,500 this is where I stop for now-- it's 856 00:37:33,500 --> 00:37:35,740 a heuristic guess of the value.
857 00:37:35,740 --> 00:37:37,990 It's kind of like those heuristic values in the search 858 00:37:37,990 --> 00:37:38,330 tree. 859 00:37:38,330 --> 00:37:39,704 It's a guess of how much work you 860 00:37:39,704 --> 00:37:41,340 have left to get to the goal. 861 00:37:41,340 --> 00:37:43,720 Here, you say, well, I wish I could go deeper. 862 00:37:43,720 --> 00:37:45,100 But I just don't have the time. 863 00:37:45,100 --> 00:37:48,250 So here's how I think I'm doing at this level. 864 00:37:48,250 --> 00:37:49,359 It's not always right. 865 00:37:49,359 --> 00:37:51,150 And that's going to lead us into the answer 866 00:37:51,150 --> 00:37:53,210 to one of the questions about progressive deepening. 867 00:37:53,210 --> 00:37:54,876 So I'll put up the progressive deepening 868 00:37:54,876 --> 00:37:56,050 question really quickly. 869 00:37:56,050 --> 00:38:00,610 So the question is this. 870 00:38:00,610 --> 00:38:02,610 Let me see, this is a maximizer-- yes. 871 00:38:02,610 --> 00:38:10,515 Suppose that we do progressive deepening on the tree that 872 00:38:10,515 --> 00:38:11,600 is only two levels deep. 873 00:38:11,600 --> 00:38:13,620 What is progressive deepening in a nutshell if you don't 874 00:38:13,620 --> 00:38:14,930 remember from the lecture? 875 00:38:14,930 --> 00:38:17,890 The idea is this. 876 00:38:17,890 --> 00:38:20,080 In this tree, it doesn't work. 877 00:38:20,080 --> 00:38:22,550 But in trees that actually branch like 2 to the n, 878 00:38:22,550 --> 00:38:26,280 it doesn't take that much time to do some of the top levels 879 00:38:26,280 --> 00:38:29,240 first and then move on to the bottom levels. 880 00:38:29,240 --> 00:38:31,250 Just do them one at a time. 881 00:38:31,250 --> 00:38:33,840 So let's say we only did it up through J. We only did 882 00:38:33,840 --> 00:38:35,890 the top two levels of the tree. 
883 00:38:35,890 --> 00:38:38,690 We'd like to reorder the tree so that alpha-beta 884 00:38:38,690 --> 00:38:43,250 can prune as much as it possibly can, at least we hope. 885 00:38:43,250 --> 00:38:47,240 So let's pretend that we had a psychic awesome genius 886 00:38:47,240 --> 00:38:50,520 friend who told us that the static values when we went up 887 00:38:50,520 --> 00:38:52,680 to two levels-- remember, when we go to two levels, 888 00:38:52,680 --> 00:38:55,440 F, G, and J have to get a static value, right? 889 00:38:55,440 --> 00:38:56,750 Because we're not going down. 890 00:38:56,750 --> 00:38:58,180 We do a static evaluation. 891 00:38:58,180 --> 00:39:02,410 They get the exact correct numbers-- 3, 7, and 20. 892 00:39:02,410 --> 00:39:03,573 Genius, brilliant. 893 00:39:03,573 --> 00:39:07,500 All right, so if that happens, what 894 00:39:07,500 --> 00:39:10,200 is the best way that we could reorder that tree? 895 00:39:10,200 --> 00:39:18,880 Oh yeah, so it's A, B, C, D with values of 2, 3, 7, 6, 1, 20. 896 00:39:18,880 --> 00:39:21,880 I'll draw that. 897 00:39:21,880 --> 00:39:23,390 This is the non-reordered tree. 898 00:39:29,890 --> 00:39:41,410 Let's see, so it's 2, 3, 7, 6, 1, 20. 899 00:39:41,410 --> 00:39:43,584 So what's the best way to reorder? 900 00:39:43,584 --> 00:39:45,250 Well, first of all, does anyone remember 901 00:39:45,250 --> 00:39:49,300 what Patrick said when he talked about progressive deepening? 902 00:39:49,300 --> 00:39:53,800 Usually no one does, so don't worry about it. 903 00:39:53,800 --> 00:39:55,880 Because at that time you guys didn't think, 904 00:39:55,880 --> 00:39:57,390 oh, I have to do this for the quiz. 905 00:39:57,390 --> 00:39:58,987 You were just thinking, oh man, we've 906 00:39:58,987 --> 00:40:01,070 already heard alpha-beta and all this other stuff. 907 00:40:01,070 --> 00:40:02,570 And this is just a small fact. 908 00:40:02,570 --> 00:40:04,650 But it's a very important fact. 
909 00:40:04,650 --> 00:40:06,720 And now you know you have to do it for the quiz. 910 00:40:06,720 --> 00:40:08,600 So you're probably going to remember it. 911 00:40:08,600 --> 00:40:13,510 The way you do it is you try to guess, and you say, 912 00:40:13,510 --> 00:40:16,030 which one of these is going to be a winner? 913 00:40:16,030 --> 00:40:19,530 Whichever one I think is going to be a winner at that level, 914 00:40:19,530 --> 00:40:21,360 I put first. 915 00:40:21,360 --> 00:40:22,500 Why is that the case? 916 00:40:22,500 --> 00:40:27,110 Well, something interesting you may have noticed here-- 917 00:40:27,110 --> 00:40:30,590 whenever you have a winner, like the middle node, 918 00:40:30,590 --> 00:40:34,060 or whenever you have whatever is the current best 919 00:40:34,060 --> 00:40:36,790 for your alpha, you sort of have to explore out 920 00:40:36,790 --> 00:40:39,160 a lot of that area. 921 00:40:39,160 --> 00:40:42,390 Like for instance, the left node was our current best at 2. 922 00:40:42,390 --> 00:40:44,250 The middle branch was our current best, 923 00:40:44,250 --> 00:40:45,130 at that time was 6. 924 00:40:45,130 --> 00:40:46,382 It was the total best. 925 00:40:46,382 --> 00:40:48,090 We had to explore a good number of nodes. 926 00:40:48,090 --> 00:40:50,370 But on the right, we just saw, oh, there's 1. 927 00:40:50,370 --> 00:40:50,930 We're done. 928 00:40:50,930 --> 00:40:53,070 We cut everything off. 929 00:40:53,070 --> 00:40:55,150 In other words, the branch that turns out 930 00:40:55,150 --> 00:40:57,052 to be the one that you take, you have 931 00:40:57,052 --> 00:40:58,760 to do a pretty good amount of exploration 932 00:40:58,760 --> 00:41:00,554 to prove that it's the right one. 933 00:41:00,554 --> 00:41:01,970 Whereas if it's the wrong one, you 934 00:41:01,970 --> 00:41:05,320 can sometimes with just one node say, this is wrong, done. 
935 00:41:05,320 --> 00:41:07,100 So therefore, if the one that turns out 936 00:41:07,100 --> 00:41:10,510 to be the eventual winner is first of all, 937 00:41:10,510 --> 00:41:13,290 then it's really easy to reject all the other branches. 938 00:41:13,290 --> 00:41:16,490 Do people see that sort of conceptually a little bit, 939 00:41:16,490 --> 00:41:19,250 that if you get the best node right away, 940 00:41:19,250 --> 00:41:22,390 you can just reject all the wrong ones pretty quickly? 941 00:41:22,390 --> 00:41:23,560 That's our goal. 942 00:41:23,560 --> 00:41:25,970 So how can we, quote, "get the right one," the best one 943 00:41:25,970 --> 00:41:27,040 right away? 944 00:41:27,040 --> 00:41:30,050 Well, here's how we do it. 945 00:41:30,050 --> 00:41:33,590 Let's say we're at B. Which one is the minimizer likely to pick 946 00:41:33,590 --> 00:41:35,800 assuming that our heuristic is good 947 00:41:35,800 --> 00:41:38,920 and that these guesses are pretty much close to the truth? 948 00:41:38,920 --> 00:41:41,261 It turns out they're perfect, so this is going to work. 949 00:41:41,261 --> 00:41:42,760 So which one will the minimizer pick 950 00:41:42,760 --> 00:41:44,801 if it has to choose between E and F, do we think? 951 00:41:44,801 --> 00:41:45,534 AUDIENCE: E. 952 00:41:45,534 --> 00:41:46,450 PROFESSOR: E, perfect. 953 00:41:46,450 --> 00:41:48,981 Which one will it pick between G and H? 954 00:41:48,981 --> 00:41:49,481 AUDIENCE: H. 955 00:41:49,481 --> 00:41:51,689 PROFESSOR: H. Which one will it pick between I and J? 956 00:41:51,689 --> 00:41:52,470 AUDIENCE: I. 957 00:41:52,470 --> 00:41:54,660 PROFESSOR: OK, so what we're saying 958 00:41:54,660 --> 00:41:56,790 is we think it's going to pick E. 959 00:41:56,790 --> 00:41:59,576 We think it's going to pick H. We think it's going to pick I. 960 00:41:59,576 --> 00:42:02,830 So first of all, we should put E before F, H before G, 961 00:42:02,830 --> 00:42:06,490 and I before J. 
Because we think it's going to pick those first. 962 00:42:06,490 --> 00:42:10,190 Those are probably our best ones to invalidate a poor branch. 963 00:42:10,190 --> 00:42:12,360 So now between 2, 6, and 1, which 964 00:42:12,360 --> 00:42:15,110 is what we think we're going to get, which one do we think 965 00:42:15,110 --> 00:42:16,691 the maximizer is going to take? 966 00:42:16,691 --> 00:42:17,190 AUDIENCE: 6. 967 00:42:17,190 --> 00:42:18,050 PROFESSOR: 6. 968 00:42:18,050 --> 00:42:21,940 Then if it couldn't take 6, what would be its next best choice? 969 00:42:21,940 --> 00:42:23,690 2, then 1. 970 00:42:23,690 --> 00:42:27,120 That's just our order-- simple as that. 971 00:42:27,120 --> 00:42:28,630 It couldn't be anything easier that 972 00:42:28,630 --> 00:42:31,130 involves really complex trees, a huge number of numbers, 973 00:42:31,130 --> 00:42:33,200 and reordering those trees. 974 00:42:33,200 --> 00:42:42,905 So C-- you guys told me C, B, D. You told me C, B, D, I think? 975 00:42:42,905 --> 00:42:44,780 Yeah, those are the ones the maximizer likes. 976 00:42:44,780 --> 00:42:49,740 And then the ones the minimizer likes you told me were H 977 00:42:49,740 --> 00:42:53,150 before G. Because H is smaller than G. 978 00:42:53,150 --> 00:43:03,710 You guys told me E before F. And you guys told me I before J. 979 00:43:03,710 --> 00:43:07,120 And you guys would be correct in all regards. 980 00:43:07,120 --> 00:43:12,490 We have 6, 7, 2, 3, 1, 20. 981 00:43:12,490 --> 00:43:15,450 All the minimizers choose from smallest to highest. 982 00:43:15,450 --> 00:43:19,840 The maximizer chooses from highest to lowest of the ones 983 00:43:19,840 --> 00:43:21,690 that the minimizers will take. 984 00:43:21,690 --> 00:43:24,320 And if we did that, you can see we would probably 985 00:43:24,320 --> 00:43:26,160 save some time. 986 00:43:26,160 --> 00:43:27,640 Let's see how much time. 987 00:43:27,640 --> 00:43:30,770 Let's say we looked at H first.
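The reordering rule just described can be written down in a few lines. This is a minimal sketch that uses the static values from the two-level pass (2, 3, 7, 6, 1, 20). The dictionary layout and the function name `reorder` are assumptions made for illustration, not anything from the course code.

```python
# Static values from the two-level pass, keyed by minimizer node.
tree = {
    "B": {"E": 2, "F": 3},
    "C": {"G": 7, "H": 6},
    "D": {"I": 1, "J": 20},
}


def reorder(tree):
    """Return [(min_node, [children...]), ...] in the order alpha-beta should visit."""
    ordered = []
    for node, children in tree.items():
        # Minimizer level: smallest static value first.
        kids = sorted(children, key=children.get)
        ordered.append((node, kids))
    # Maximizer level: the branch whose expected minimizer choice is largest goes first.
    ordered.sort(key=lambda item: min(tree[item[0]].values()), reverse=True)
    return ordered


order = reorder(tree)
print(order)
# [('C', ['H', 'G']), ('B', ['E', 'F']), ('D', ['I', 'J'])]

leaf_values = [tree[node][kid] for node, kids in order for kid in kids]
print(leaf_values)
# [6, 7, 2, 3, 1, 20]
```

The printed leaf order matches the 6, 7, 2, 3, 1, 20 written on the board.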
988 00:43:30,770 --> 00:43:34,010 Well, if we looked at H first, we 989 00:43:34,010 --> 00:43:39,060 would still have actually had to look at Q and N. However, 990 00:43:39,060 --> 00:43:43,340 we would not have had to look at K. Do people see why? 991 00:43:43,340 --> 00:43:44,970 If we already knew this branch was 6, 992 00:43:44,970 --> 00:43:48,160 as soon as we saw 2 for the beta here-- 2 is less than 6-- 993 00:43:48,160 --> 00:43:49,752 we could have pruned. 994 00:43:49,752 --> 00:43:51,710 We still would have had to look at I over here. 995 00:43:51,710 --> 00:43:53,480 Because you have to look at at least one 996 00:43:53,480 --> 00:43:55,760 thing in the new sub-branch. 997 00:43:55,760 --> 00:44:00,500 And it actually only would have saved us one node-- oops. 998 00:44:00,500 --> 00:44:04,740 So it winds up that in total, how many nodes would 999 00:44:04,740 --> 00:44:10,350 we have evaluated if we did that little scheme of reordering? 1000 00:44:10,350 --> 00:44:17,050 Well, we normally had to do six-- E, K, Q, N, H, I. 1001 00:44:17,050 --> 00:44:20,391 How many do we evaluate if we do this progressive deepening 1002 00:44:20,391 --> 00:44:20,890 scheme? 1003 00:44:20,890 --> 00:44:22,845 How many times do we run the static evaluator, 1004 00:44:22,845 --> 00:44:24,970 which, as you of course know, takes 1005 00:44:24,970 --> 00:44:26,620 a long time? 1006 00:44:26,620 --> 00:44:29,400 Anyone have a guess? 1007 00:44:29,400 --> 00:44:36,170 I told you the only one we don't evaluate is K. Raise your hand. 1008 00:44:36,170 --> 00:44:40,460 I won't make anyone give this one. 1009 00:44:40,460 --> 00:44:43,410 So I said the only one we save on is K. 1010 00:44:43,410 --> 00:44:49,180 So we still do E, Q, N, H, and I over here. 1011 00:44:49,180 --> 00:44:51,180 There's two possible answers that I will accept. 1012 00:44:51,180 --> 00:44:53,171 So you have a higher chance of guessing it. 1013 00:44:53,171 --> 00:44:53,670 Anyone?
1014 00:44:57,200 --> 00:44:59,405 Does everyone agree that we did six before? 1015 00:44:59,405 --> 00:45:01,210 If we didn't do any progressive deepening, 1016 00:45:01,210 --> 00:45:08,932 we just did E, K, Q, N, H, I. And now we're not doing K. OK, 1017 00:45:08,932 --> 00:45:09,890 people are saying five. 1018 00:45:09,890 --> 00:45:11,074 All right, good. 1019 00:45:11,074 --> 00:45:12,240 That's not the right answer. 1020 00:45:12,240 --> 00:45:15,680 But it at least shows that you can take away the one. 1021 00:45:15,680 --> 00:45:17,489 We did at least five over here. 1022 00:45:17,489 --> 00:45:19,030 There's two possible answers, though. 1023 00:45:19,030 --> 00:45:20,412 Because look over there. 1024 00:45:20,412 --> 00:45:22,120 In order to do the progressive deepening, 1025 00:45:22,120 --> 00:45:27,850 we had to do those static evaluations, right? 1026 00:45:27,850 --> 00:45:34,190 So we either did all those static evaluations 1027 00:45:34,190 --> 00:45:39,470 and these five-- E, Q, N, H, I-- static evaluations. 1028 00:45:39,470 --> 00:45:42,590 Because we didn't do the K. 1029 00:45:42,590 --> 00:45:46,010 Or we might have saved ourselves. 1030 00:45:46,010 --> 00:45:47,970 Because maybe we were smart and decided 1031 00:45:47,970 --> 00:45:51,899 to cache the static values when we were going down the tree. 1032 00:45:51,899 --> 00:45:54,440 It's an implementation detail that on this test when we asked 1033 00:45:54,440 --> 00:45:56,210 that question we didn't say. 1034 00:45:56,210 --> 00:45:58,400 What I mean by cache is when we did it here and saw 1035 00:45:58,400 --> 00:46:01,540 that E was a 2, and then here-- oh, we 1036 00:46:01,540 --> 00:46:03,990 have to do the static value at E. If we were smart, 1037 00:46:03,990 --> 00:46:06,520 we might have made a little hash table or something 1038 00:46:06,520 --> 00:46:09,985 and put down 2 so we didn't have to do a static evaluation at E.
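The "little hash table" idea can be sketched as follows, counting how often the static evaluator actually runs with and without a cache. The values for E through J are the ones from the two-level pass. The values for Q and N are placeholders, since they are not stated here, and the function name `count_evaluations` is made up for this illustration.

```python
# E..J use the static values from the two-level pass; Q and N are placeholders.
STATIC = {"E": 2, "F": 3, "G": 7, "H": 6, "I": 1, "J": 20, "Q": 0, "N": 0}


def count_evaluations(use_cache):
    calls = 0
    cache = {}

    def evaluate(state):
        nonlocal calls
        if use_cache and state in cache:
            return cache[state]   # reuse the stored value, no evaluator run
        calls += 1                # one run of the (expensive) static evaluator
        value = STATIC[state]
        cache[state] = value
        return value

    # Two-level pass: every node at the depth limit gets a static value.
    for state in ["E", "F", "G", "H", "I", "J"]:
        evaluate(state)
    # Final pass after reordering: H, Q, N, E, I are evaluated and K is pruned.
    for state in ["H", "Q", "N", "E", "I"]:
        evaluate(state)
    return calls


print(count_evaluations(use_cache=False))  # 11
print(count_evaluations(use_cache=True))   # 8
```

With the cache, E, H, and I are looked up instead of recomputed, which is where the three saved evaluations come from. Both totals are still more than the six evaluations of plain alpha-beta on this small tree.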
1039 00:46:09,985 --> 00:46:12,602 And if that happened, well, we save E, H, and I, 1040 00:46:12,602 --> 00:46:15,270 and we do three fewer. 1041 00:46:15,270 --> 00:46:16,660 Does everyone see that? 1042 00:46:16,660 --> 00:46:18,177 However, that's still more than six. 1043 00:46:18,177 --> 00:46:19,260 So it didn't save us time. 1044 00:46:19,260 --> 00:46:21,051 So you might say, oh, progressive deepening 1045 00:46:21,051 --> 00:46:22,250 is a waste of time. 1046 00:46:22,250 --> 00:46:23,670 But it's not. 1047 00:46:23,670 --> 00:46:28,050 Because this is a very, very small, not very branchy tree 1048 00:46:28,050 --> 00:46:30,750 that was made so that you guys could easily 1049 00:46:30,750 --> 00:46:34,060 do alpha-beta and take the quiz, and it wouldn't be bad. 1050 00:46:34,060 --> 00:46:40,120 If this was actually branching even double at each level, 1051 00:46:40,120 --> 00:46:44,310 it would have, what, 16 nodes down here at the bottom. 1052 00:46:44,310 --> 00:46:47,390 Then you would want to be doing that progressive deepening. 1053 00:46:47,390 --> 00:46:51,360 So now I ask you a conceptual riddle question. 1054 00:46:51,360 --> 00:46:53,180 It's not really that much of a riddle. 1055 00:46:53,180 --> 00:46:54,846 But we'll see if anyone wants to answer. 1056 00:46:54,846 --> 00:46:57,670 Again, I won't call on you for this. 1057 00:46:57,670 --> 00:46:59,730 According to this test, a student 1058 00:46:59,730 --> 00:47:01,730 named Steve says, OK, I know I have 1059 00:47:01,730 --> 00:47:05,320 to pay to do the progressive deepening here. 1060 00:47:05,320 --> 00:47:06,490 But let's ignore that. 1061 00:47:06,490 --> 00:47:09,320 Because it's small in a large tree, right? 1062 00:47:09,320 --> 00:47:11,320 It's not going to take that much. 1063 00:47:11,320 --> 00:47:13,720 Let's ignore the costs of the progressive deepening 1064 00:47:13,720 --> 00:47:16,450 and only look at how much we do here. 
1065 00:47:16,450 --> 00:47:18,920 He says, when it comes to performing the alpha-beta 1066 00:47:18,920 --> 00:47:21,220 on the final level, I'm guaranteed 1067 00:47:21,220 --> 00:47:23,600 to always prune at least as well or better 1068 00:47:23,600 --> 00:47:25,970 if I rearrange the nodes based on the best result 1069 00:47:25,970 --> 00:47:27,830 from progressive deepening. 1070 00:47:27,830 --> 00:47:28,759 Do you agree? 1071 00:47:31,876 --> 00:47:32,751 AUDIENCE: [INAUDIBLE] 1072 00:47:32,751 --> 00:47:34,250 PROFESSOR: Can I repeat it? 1073 00:47:34,250 --> 00:47:36,850 OK, the question is, ignoring the cost 1074 00:47:36,850 --> 00:47:39,170 that we pay progressively deepening here-- just 1075 00:47:39,170 --> 00:47:42,295 forget about it-- at the final step, at the final iteration, 1076 00:47:42,295 --> 00:47:45,590 the question is, am I guaranteed to do at least as 1077 00:47:45,590 --> 00:47:47,530 well or better in my alpha-beta pruning 1078 00:47:47,530 --> 00:47:51,137 when I reorder based on the best order 1079 00:47:51,137 --> 00:47:52,220 for progressive deepening? 1080 00:47:52,220 --> 00:47:53,935 Here certainly we did. 1081 00:47:53,935 --> 00:47:57,020 But the question is, is Steve guaranteed? 1082 00:47:57,020 --> 00:47:58,526 Answer? 1083 00:47:58,526 --> 00:47:59,675 AUDIENCE: [INAUDIBLE] 1084 00:47:59,675 --> 00:48:00,841 PROFESSOR: What did you say? 1085 00:48:00,841 --> 00:48:02,162 AUDIENCE: [INAUDIBLE] 1086 00:48:02,162 --> 00:48:03,870 PROFESSOR: That's the answer and the why, 1087 00:48:03,870 --> 00:48:05,150 which we asked to explain. 1088 00:48:05,150 --> 00:48:08,160 The answer we got is, doesn't that depend on the heuristic? 1089 00:48:08,160 --> 00:48:09,025 Perfectly correct. 1090 00:48:09,025 --> 00:48:11,130 The answer is, no, we're not guaranteed, 1091 00:48:11,130 --> 00:48:12,780 and it depends on the heuristic. 
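To see concretely why there is no guarantee, here is a small sketch on a made-up tree, not the one on the board: a maximizer over two minimizer nodes, X with leaves 3 and 5, and Y with leaves 2 and 9. The tree, the two sets of guesses, and all function names are assumptions chosen for illustration. The code reorders X and Y by a heuristic guess and counts how many leaves alpha-beta then evaluates.

```python
import math


def alphabeta(node, maximizing, alpha, beta, counter):
    """Minimax with alpha-beta pruning; counter[0] counts leaf evaluations."""
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, counter)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # the remaining children cannot change the result
    return best


def leaves_evaluated(tree):
    counter = [0]
    value = alphabeta(tree, True, -math.inf, math.inf, counter)
    return value, counter[0]


subtrees = {"X": [3, 5], "Y": [2, 9]}   # true leaf values under each minimizer


def reordered(guess):
    # Maximizer visits the branch with the highest guessed value first.
    return [subtrees[name] for name in sorted(subtrees, key=guess.get, reverse=True)]


good_guess = {"X": 3, "Y": 2}    # matches what the minimizers will really pick
bad_guess = {"X": 1, "Y": 10}    # badly misjudges both branches

print(leaves_evaluated(reordered(good_guess)))  # (3, 3)
print(leaves_evaluated(reordered(bad_guess)))   # (3, 4)
```

Both orders return the same minimax value, but the misleading guesses put the losing branch first, so the search evaluates one more leaf than it does with the accurate guesses.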
1092 00:48:12,780 --> 00:48:14,920 So if we were guaranteed, that would mean 1093 00:48:14,920 --> 00:48:17,310 our heuristic was godlike, like this heuristic. 1094 00:48:17,310 --> 00:48:18,960 If your heuristic already tells you 1095 00:48:18,960 --> 00:48:22,180 the correct answer no matter what, don't do game search. 1096 00:48:22,180 --> 00:48:25,340 Just go to the empty chess board, 1097 00:48:25,340 --> 00:48:28,170 put all the pieces in the front rows, 1098 00:48:28,170 --> 00:48:29,990 and run the static evaluator on that. 1099 00:48:29,990 --> 00:48:33,110 And it'll say, oh, it looks like with this game not 1100 00:48:33,110 --> 00:48:36,466 started that white is stupid, so black will win in 15 turns. 1101 00:48:36,466 --> 00:48:37,340 And then you're done. 1102 00:48:37,340 --> 00:48:38,690 And you don't do a search. 1103 00:48:38,690 --> 00:48:41,290 We know that our heuristic is flawed in some way. 1104 00:48:41,290 --> 00:48:42,790 It could be very flawed. 1105 00:48:42,790 --> 00:48:44,850 If it's flawed so badly that it tells us 1106 00:48:44,850 --> 00:48:47,320 a very bad result of what's actually going to happen, 1107 00:48:47,320 --> 00:48:49,646 even though we think the minimizer is going to go to H, 1108 00:48:49,646 --> 00:48:52,070 maybe it's wrong by a lot and it goes to G. 1109 00:48:52,070 --> 00:48:54,590 It could take us up an even worse path 1110 00:48:54,590 --> 00:48:56,180 and make us take longer. 1111 00:48:56,180 --> 00:48:56,972 Question? 1112 00:48:56,972 --> 00:48:58,506 AUDIENCE: If it's the heuristic, how 1113 00:48:58,506 --> 00:48:59,804 could you cache the values so you didn't 1114 00:48:59,804 --> 00:49:01,096 have to recalculate them later? 1115 00:49:01,096 --> 00:49:02,720 PROFESSOR: The question is, how can you 1116 00:49:02,720 --> 00:49:04,780 cache the values if it's a heuristic so you don't 1117 00:49:04,780 --> 00:49:06,390 have to recalculate them later?
1118 00:49:06,390 --> 00:49:09,770 The answer is, it wouldn't help if there weren't 1119 00:49:09,770 --> 00:49:11,780 these weird multi-level things where 1120 00:49:11,780 --> 00:49:14,860 we stop at E for some reason, even though it goes down 1121 00:49:14,860 --> 00:49:16,270 to five levels. 1122 00:49:16,270 --> 00:49:19,430 The way you could cache it is it is a heuristic. 1123 00:49:19,430 --> 00:49:20,635 But it's consistent. 1124 00:49:20,635 --> 00:49:22,960 And I don't mean consistent from search. 1125 00:49:22,960 --> 00:49:25,890 I mean it's a consistent heuristic in-- the game state 1126 00:49:25,890 --> 00:49:29,190 E is, let's say that's the state where I moved out 1127 00:49:29,190 --> 00:49:31,530 my knight as the maximizer, and the minimizer said, 1128 00:49:31,530 --> 00:49:33,270 you're doing the knight opening, really, 1129 00:49:33,270 --> 00:49:35,820 and then did a counterattack. 1130 00:49:35,820 --> 00:49:37,740 No matter how we get to E, or where 1131 00:49:37,740 --> 00:49:40,980 we go to get to E, that's always going to be state E. 1132 00:49:40,980 --> 00:49:43,430 It's always going to have the same heuristic value. 1133 00:49:43,430 --> 00:49:46,730 It's not like some guy who goes around and just randomly pulls 1134 00:49:46,730 --> 00:49:49,007 a number out of a hat. 1135 00:49:49,007 --> 00:49:50,840 We're going to have some value that gives us 1136 00:49:50,840 --> 00:49:52,510 points based on state E. And it's 1137 00:49:52,510 --> 00:49:54,670 going to be the same any time we go to state 1138 00:49:54,670 --> 00:49:57,020 E. Does that make sense? 1139 00:49:57,020 --> 00:49:57,980 It is a heuristic. 1140 00:49:57,980 --> 00:50:01,250 But it's always going to give the same value at E no matter 1141 00:50:01,250 --> 00:50:04,500 how you got to E. 1142 00:50:04,500 --> 00:50:05,705 But it could be really bad. 
1143 00:50:05,705 --> 00:50:07,330 In fact, you might consider a heuristic 1144 00:50:07,330 --> 00:50:09,830 that's the opposite of correct and always tells us the worst 1145 00:50:09,830 --> 00:50:11,112 move and claims it's the best. 1146 00:50:11,112 --> 00:50:13,070 That's the heuristic that the minimizer program 1147 00:50:13,070 --> 00:50:14,860 did to our computer, perhaps. 1148 00:50:14,860 --> 00:50:17,450 In that case, when we do progressive deepening and we 1149 00:50:17,450 --> 00:50:22,100 reorder, we'll probably get the worst pruning possible. 1150 00:50:22,100 --> 00:50:22,880 We might not. 1151 00:50:22,880 --> 00:50:23,830 But we may. 1152 00:50:23,830 --> 00:50:26,560 So in that case, you're not guaranteed. 1153 00:50:26,560 --> 00:50:28,450 I hope that's given a few clues. 1154 00:50:28,450 --> 00:50:30,090 In tutorial, you guys are going to see 1155 00:50:30,090 --> 00:50:32,500 some more interesting problems that 1156 00:50:32,500 --> 00:50:33,870 go into a few other details. 1157 00:50:33,870 --> 00:50:36,160 I at least plan on doing [INAUDIBLE] interesting game 1158 00:50:36,160 --> 00:50:41,649 problem from last year, which asked a bunch of varied things 1159 00:50:41,649 --> 00:50:43,440 that are a little bit different from these. 1160 00:50:43,440 --> 00:50:47,360 So it should be a lot of fun, hopefully, or at least useful, 1161 00:50:47,360 --> 00:50:49,910 to do the next quiz. 1162 00:50:49,910 --> 00:50:51,550 So have a great weekend. 1163 00:50:51,550 --> 00:50:54,711 Don't stress out too much about the quiz.