1 00:00:09,234 --> 00:00:10,220 PATRICK WINSTON: Today we're going to be 2 00:00:10,220 --> 00:00:11,030 talking about Search. 3 00:00:11,030 --> 00:00:13,220 I know you're going to turn blue with yet 4 00:00:13,220 --> 00:00:15,350 another lecture on Search. 5 00:00:15,350 --> 00:00:18,270 Those of you who are taking computer science subjects, 6 00:00:18,270 --> 00:00:20,360 you've probably seen it in 6.01. 7 00:00:20,360 --> 00:00:22,270 You'll see it again in a theory course. 8 00:00:22,270 --> 00:00:25,570 But we're going to do it for a little different purpose. 9 00:00:25,570 --> 00:00:29,480 I want you to develop some intuition about how various kinds 10 00:00:29,480 --> 00:00:31,080 of Search work. 11 00:00:31,080 --> 00:00:35,800 And I want to talk a little bit about Search as a model of 12 00:00:35,800 --> 00:00:38,080 what goes on in our heads. 13 00:00:38,080 --> 00:00:40,630 And toward the end, if there's time, I'd like to do a 14 00:00:40,630 --> 00:00:43,660 demonstration for you of something never before 15 00:00:43,660 --> 00:00:49,960 demonstrated to a 6.034 class, because it was only completed 16 00:00:49,960 --> 00:00:51,600 last spring. 17 00:00:51,600 --> 00:00:56,470 And some finishing touches were added by me this morning. 18 00:00:56,470 --> 00:00:58,450 Always dangerous, but we'll see what happens. 19 00:01:01,430 --> 00:01:03,870 There's Cambridge. 20 00:01:03,870 --> 00:01:05,150 You all recognize it, of course. 21 00:01:08,230 --> 00:01:11,550 You might want to get from some starting position s to 22 00:01:11,550 --> 00:01:14,280 some goal position g. 23 00:01:14,280 --> 00:01:18,490 So, you'll hire a cab and hope for the best. 24 00:01:18,490 --> 00:01:31,430 So, here's what might happen, not too hot. 25 00:01:31,430 --> 00:01:34,175 Let's move the starting position over here. 26 00:01:41,250 --> 00:01:43,245 I've had cab drivers like this in New York. 27 00:01:47,039 --> 00:01:48,460 But it's not a very good path. 
28 00:01:48,460 --> 00:01:50,400 It's the path of a thief. 29 00:01:50,400 --> 00:01:54,930 Let's change the way that the search is done to that of a 30 00:01:54,930 --> 00:01:56,975 beginner, an honest beginner. 31 00:02:01,690 --> 00:02:04,100 Not too bad. 32 00:02:04,100 --> 00:02:09,360 Now, let's have a look at how the Search would happen if the 33 00:02:09,360 --> 00:02:13,440 cab driver was a Ph.D. in physics 34 00:02:13,440 --> 00:02:14,690 after his third post-doc. 35 00:02:25,800 --> 00:02:26,750 These are not actually traversed. 36 00:02:26,750 --> 00:02:31,050 These are just things that the driver is thinking about, and 37 00:02:31,050 --> 00:02:34,840 that is the very best of all possible paths. 38 00:02:34,840 --> 00:02:36,480 So, the thief does a horrible job. 39 00:02:36,480 --> 00:02:38,110 The beginner does a pretty good job, but 40 00:02:38,110 --> 00:02:39,960 not an optimal job. 41 00:02:39,960 --> 00:02:44,790 This is the optimal job as produced by the Ph.D. in 42 00:02:44,790 --> 00:02:48,910 physics after his third post-doc. 43 00:02:48,910 --> 00:02:51,230 So, would you like to understand how those all work? 44 00:02:51,230 --> 00:02:52,480 The answer, of course, is yes. 45 00:02:56,660 --> 00:03:01,070 I'm going to talk to you about procedures that are different 46 00:03:01,070 --> 00:03:05,260 from the way that you just solved this problem. 47 00:03:05,260 --> 00:03:08,420 I imagine that if I said to you, please find a path from s 48 00:03:08,420 --> 00:03:12,020 to g, you would, within a few seconds, find 49 00:03:12,020 --> 00:03:13,070 a pretty good path-- 50 00:03:13,070 --> 00:03:15,710 not the optimal one, but a pretty good one-- 51 00:03:15,710 --> 00:03:18,050 using your eyes. 52 00:03:18,050 --> 00:03:19,940 And we're not going to tell you about how that works, 53 00:03:19,940 --> 00:03:22,540 because we don't know how that works. 
54 00:03:22,540 --> 00:03:27,030 But we do know that problem solving with the eyes is an 55 00:03:27,030 --> 00:03:29,260 important part of our total intelligence. 56 00:03:29,260 --> 00:03:31,260 And we'll never have a complete theory of human 57 00:03:31,260 --> 00:03:34,640 intelligence until we can understand the contributions 58 00:03:34,640 --> 00:03:38,990 of the human visual system to solving everyday problems like 59 00:03:38,990 --> 00:03:43,470 finding a pretty good path in that map. 60 00:03:43,470 --> 00:03:45,870 But, alas, we can't talk about that, because we don't know 61 00:03:45,870 --> 00:03:47,100 how to do it. 62 00:03:47,100 --> 00:03:48,820 We're working on it. 63 00:03:48,820 --> 00:03:51,450 But we don't know how to do it. 64 00:03:51,450 --> 00:03:53,880 So, I'm not going to use Cambridge in my illustrations. 65 00:03:53,880 --> 00:03:56,860 There's too much there to work through in an hour. 66 00:03:56,860 --> 00:03:59,620 So, we're going to use this map over here which has been 67 00:03:59,620 --> 00:04:02,850 designed to illustrate a few important points. 68 00:04:02,850 --> 00:04:06,520 You, too, can find a path through that graph pretty 69 00:04:06,520 --> 00:04:08,570 easily with your eyes. 70 00:04:08,570 --> 00:04:11,320 Our programs don't have eyes, and they don't have visually 71 00:04:11,320 --> 00:04:13,280 grounded algorithms, so they're going to have to do 72 00:04:13,280 --> 00:04:14,720 something else. 73 00:04:14,720 --> 00:04:17,180 And the very first kind of search we want to talk about 74 00:04:17,180 --> 00:04:19,980 is called the British Museum approach. 75 00:04:19,980 --> 00:04:23,100 This is a slur against at least the British Museum, if 76 00:04:23,100 --> 00:04:26,400 not the entire nation, because the way you do a British 77 00:04:26,400 --> 00:04:30,230 Museum search is you find every possible path. 
78 00:04:30,230 --> 00:04:33,310 So, it'll be helpful to have a diagram of all possible paths 79 00:04:33,310 --> 00:04:34,409 on the board. 80 00:04:34,409 --> 00:04:36,200 We're going to start with a British Museum search. 81 00:04:41,930 --> 00:04:45,130 From the starting position, it's clear, you can go from my 82 00:04:45,130 --> 00:04:47,405 s to either a or b. 83 00:04:52,400 --> 00:04:54,600 And already there's an important quiz point. 84 00:04:54,600 --> 00:04:58,260 Whenever we have these kinds of problems on a quiz, we ask 85 00:04:58,260 --> 00:05:02,720 you to develop the tree associated with a search in 86 00:05:02,720 --> 00:05:04,470 lexical order. 87 00:05:04,470 --> 00:05:11,600 So, the nodes there under s are listed alphabetically, 88 00:05:11,600 --> 00:05:14,710 just to have an orderly way of doing it. 89 00:05:14,710 --> 00:05:18,830 So, from a we can go to either b or d. 90 00:05:22,020 --> 00:05:24,940 And another convention of the subject, another thing you 91 00:05:24,940 --> 00:05:27,530 have to keep in mind in quizzes, is that we don't have 92 00:05:27,530 --> 00:05:30,180 these searches bite their own tail. 93 00:05:30,180 --> 00:05:33,400 So, I could have said that if I'm at a, I can 94 00:05:33,400 --> 00:05:35,570 also go back to s. 95 00:05:35,570 --> 00:05:40,730 But no path is ever allowed to bite itself, to go 96 00:05:40,730 --> 00:05:45,190 around and enter and get back to a place that's 97 00:05:45,190 --> 00:05:47,580 already on the path. 98 00:05:47,580 --> 00:05:52,750 Now if I go on to b first, that means that from b I can 99 00:05:52,750 --> 00:05:55,600 go to either a or c. 100 00:05:55,600 --> 00:05:57,020 This is getting fat pretty fast. 101 00:05:57,020 --> 00:05:59,570 But let's see, s, a, b. 102 00:05:59,570 --> 00:06:03,930 The only place I can go is c and then to e. 
103 00:06:03,930 --> 00:06:07,360 s, a, d, without biting my own tail and going back to a, the 104 00:06:07,360 --> 00:06:10,290 only place I can go is g. 105 00:06:10,290 --> 00:06:16,820 s, b, a, I can only go to d and then to g. 106 00:06:16,820 --> 00:06:21,040 And finally, s, b, c, I can only go to e. 107 00:06:21,040 --> 00:06:26,190 So, that is a complete set of paths as produced by any 108 00:06:26,190 --> 00:06:31,720 program that you will feel you'd like to write that finds 109 00:06:31,720 --> 00:06:33,470 all possible paths. 110 00:06:33,470 --> 00:06:36,780 I haven't been very precise about how to do that, because 111 00:06:36,780 --> 00:06:38,350 you don't have to be. 112 00:06:38,350 --> 00:06:40,780 You can't save much work by being clever, because you have 113 00:06:40,780 --> 00:06:43,570 to find everything. 114 00:06:43,570 --> 00:06:46,440 So, that's the British Museum expansion of the tree. 115 00:06:57,230 --> 00:06:58,409 So, what have I done? 116 00:06:58,409 --> 00:06:59,820 I've been playing around with a map. 117 00:06:59,820 --> 00:07:01,550 I showed you an example of a map. 118 00:07:01,550 --> 00:07:05,310 And pretty soon you're going to think that 119 00:07:05,310 --> 00:07:08,616 Search is about maps. 120 00:07:08,616 --> 00:07:14,820 So, before going even another tiny step, I want to emphasize 121 00:07:14,820 --> 00:07:18,030 that Search is not equal to maps. 122 00:07:18,030 --> 00:07:19,600 Search is about choice. 123 00:07:19,600 --> 00:07:22,690 And I happen to illustrate these searches with maps, 124 00:07:22,690 --> 00:07:25,390 because they are particularly cogent. 125 00:07:25,390 --> 00:07:26,790 But Search is not about maps. 126 00:07:26,790 --> 00:07:28,800 It's about the choices you make when you're trying to 127 00:07:28,800 --> 00:07:30,800 make decisions. 
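The British Museum expansion worked through on the board can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the course: the adjacency map `GRAPH` is an assumption reconstructed from the paths read out above, and the function name `british_museum` is made up for the example.

```python
# Hypothetical adjacency map, reconstructed from the paths named in the lecture.
GRAPH = {
    "s": ["a", "b"],
    "a": ["b", "d", "s"],
    "b": ["a", "c", "s"],
    "c": ["b", "e"],
    "d": ["a", "g"],
    "e": ["c"],
    "g": ["d"],
}


def british_museum(graph, start):
    """Return every path from start that cannot be extended any further."""
    complete = []

    def expand(path):
        # Children in lexical order; a path may never bite its own tail.
        children = [n for n in sorted(graph[path[-1]]) if n not in path]
        if not children:
            complete.append(path)
        for child in children:
            expand(path + [child])

    expand([start])
    return complete


for path in british_museum(GRAPH, "s"):
    print(" ".join(path))
```

On this small graph the sketch lists the same four paths as the tree on the board: s a b c e, s a d g, s b a d g, and s b c e.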
128 00:07:30,800 --> 00:07:34,470 These things I'm going to be talking to you about today are 129 00:07:34,470 --> 00:07:37,172 choices you make when you explore the map. 130 00:07:37,172 --> 00:07:39,630 You can make other kinds of choices when you're exploring 131 00:07:39,630 --> 00:07:40,630 other kinds of things. 132 00:07:40,630 --> 00:07:44,500 And, in fact, at the end, if there's time, I'll show you 133 00:07:44,500 --> 00:07:47,130 how you do searches when you're solving problems in a 134 00:07:47,130 --> 00:07:48,380 humanities class. 135 00:07:51,000 --> 00:07:52,409 That's the British Museum algorithm. 136 00:07:52,409 --> 00:07:53,950 Search is not about maps. 137 00:07:53,950 --> 00:07:58,240 Our first gold star idea, Search is about choice. 138 00:07:58,240 --> 00:08:00,660 But for our illustration, Search is about maps. 139 00:08:00,660 --> 00:08:03,530 So, the first kind of Search we want to talk about that's 140 00:08:03,530 --> 00:08:07,460 real is Depth-first Search. 141 00:08:10,412 --> 00:08:15,500 And the idea of Depth-first Search is that you barrel 142 00:08:15,500 --> 00:08:17,070 ahead in a single-minded way. 143 00:08:20,160 --> 00:08:27,832 So, from s, your choices are a or b. 144 00:08:27,832 --> 00:08:31,730 And you always go down the left branch by convention. 145 00:08:31,730 --> 00:08:35,350 So, from s, we go to a. 146 00:08:35,350 --> 00:08:36,919 From a we have two choices. 147 00:08:36,919 --> 00:08:46,990 We can go to either b or d following our lexical 148 00:08:46,990 --> 00:08:49,630 convention. 149 00:08:49,630 --> 00:08:52,660 After that, we can go to c. 150 00:08:52,660 --> 00:08:54,660 And after that we can go to e. 151 00:08:54,660 --> 00:08:58,871 And too bad for us, we're stuck. 152 00:08:58,871 --> 00:09:00,520 What are we going to do. 153 00:09:00,520 --> 00:09:04,790 We've got into a dead end, all is lost. 154 00:09:04,790 --> 00:09:06,640 But of course, all isn't lost. 
155 00:09:06,640 --> 00:09:11,180 Because we have the choice of backing up to the place where 156 00:09:11,180 --> 00:09:16,280 we last made a decision and choosing another branch. 157 00:09:16,280 --> 00:09:20,350 So, that process is called variously back-up or 158 00:09:20,350 --> 00:09:21,600 backtracking. 159 00:09:23,330 --> 00:09:26,670 At this point, we would say, ah, dead end. 160 00:09:26,670 --> 00:09:30,110 The first place we find when we back up the tree where we 161 00:09:30,110 --> 00:09:34,070 made a choice is when we chose b instead of d. 162 00:09:34,070 --> 00:09:36,420 So, we go back up there and take the other route. 163 00:09:39,060 --> 00:09:42,730 s, a, d now goes to g. 164 00:09:42,730 --> 00:09:45,220 And we're done. 165 00:09:45,220 --> 00:09:47,220 We're going to make up a little table here of things 166 00:09:47,220 --> 00:09:50,880 that we can embellish our basic searches with. 167 00:09:50,880 --> 00:09:53,420 And one of the things we can embellish our basic searches 168 00:09:53,420 --> 00:09:55,680 with is this backtracking idea. 169 00:10:01,700 --> 00:10:04,410 Now, backtrack is not relevant to the British Museum 170 00:10:04,410 --> 00:10:07,120 algorithm, because you've got to find everything. 171 00:10:07,120 --> 00:10:10,000 You can't quit when you've found one path. 172 00:10:10,000 --> 00:10:12,940 But you'd always want to use backtracking with Depth-first 173 00:10:12,940 --> 00:10:16,140 Search, because you may plunge on down and miss the path that 174 00:10:16,140 --> 00:10:18,050 gets to the goal. 175 00:10:18,050 --> 00:10:22,480 Now, you might ask me, is backtracking, therefore, 176 00:10:22,480 --> 00:10:25,330 always part of Depth-first Search? 177 00:10:25,330 --> 00:10:28,350 And you can read textbooks that do it either way. 178 00:10:28,350 --> 00:10:29,190 Count on it. 
179 00:10:29,190 --> 00:10:32,500 If we give you a Search problem on a quiz, we'll tell 180 00:10:32,500 --> 00:10:34,400 you whether or not your Search is supposed to use 181 00:10:34,400 --> 00:10:34,990 backtracking. 182 00:10:34,990 --> 00:10:38,200 We consider it to be an optional thing. 183 00:10:38,200 --> 00:10:40,320 You'd be pretty stupid not to use this optional thing when 184 00:10:40,320 --> 00:10:41,925 you're doing Depth-first Search. 185 00:10:41,925 --> 00:10:44,800 But we'll separate these ideas out and call 186 00:10:44,800 --> 00:10:47,300 it an optional add-on. 187 00:10:47,300 --> 00:10:51,600 so, that's Depth-first Search, very simple. 188 00:10:51,600 --> 00:10:55,670 Now, the natural companion to Depth-first Search will be 189 00:10:55,670 --> 00:11:02,454 Breadth-first Search, Breadth-first. 190 00:11:07,640 --> 00:11:11,620 And the way it works is you build up this tree level by 191 00:11:11,620 --> 00:11:16,160 level, and at some point, when you scan across a level, 192 00:11:16,160 --> 00:11:17,790 you'll find that you've completed a path 193 00:11:17,790 --> 00:11:20,100 that goes to the goal. 194 00:11:20,100 --> 00:11:27,310 So, level by level, s can go to either a or b. 195 00:11:27,310 --> 00:11:30,670 a can go either to b or d. 196 00:11:30,670 --> 00:11:34,620 And b can go to either a or c. 197 00:11:34,620 --> 00:11:35,480 So, you see what we're doing. 198 00:11:35,480 --> 00:11:38,030 We're going level by level. 199 00:11:38,030 --> 00:11:40,360 And we haven't hit a level with a goal in it yet, so 200 00:11:40,360 --> 00:11:42,510 we've got to keep going. 201 00:11:42,510 --> 00:11:44,800 Note that we're building up quite a bit of stuff here, 202 00:11:44,800 --> 00:11:49,360 quite a lot of growth in the size of the path set that 203 00:11:49,360 --> 00:11:51,540 we're keeping in mind. 
204 00:11:51,540 --> 00:11:56,850 At the next level, we have b going to c, d going to g, a 205 00:11:56,850 --> 00:12:01,030 going to d, and c going to e. 206 00:12:01,030 --> 00:12:05,460 And now, when we scan across, we do hit g. 207 00:12:05,460 --> 00:12:08,530 So, we found a path with Breadth-first Search, just as 208 00:12:08,530 --> 00:12:09,985 we found a path with Depth-first Search. 209 00:12:12,650 --> 00:12:14,360 Now, you might say, well, why didn't you just quit 210 00:12:14,360 --> 00:12:15,420 when you hit g? 211 00:12:15,420 --> 00:12:18,340 Implementation detail. 212 00:12:18,340 --> 00:12:20,830 We'll talk about a sample implementation. 213 00:12:20,830 --> 00:12:22,990 You can write it in any way you want. 214 00:12:22,990 --> 00:12:26,080 But now that we know what these searches are, let's 215 00:12:26,080 --> 00:12:31,560 speed things up a little bit here and do a couple searches 216 00:12:31,560 --> 00:12:34,530 that now have names. 217 00:12:34,530 --> 00:12:38,590 The first type will be Depth-first, boom. 218 00:12:38,590 --> 00:12:43,870 That's the one that produces the thief path. 219 00:12:43,870 --> 00:12:46,070 And then we can also do a Breadth-first Search, which we 220 00:12:46,070 --> 00:12:47,310 haven't tried yet. 221 00:12:47,310 --> 00:12:49,350 What do you suppose is going to happen? 222 00:12:49,350 --> 00:12:53,030 Is it going to be fast, slow, produce a good path, 223 00:12:53,030 --> 00:12:53,840 produce a bad path? 224 00:12:53,840 --> 00:12:56,095 I don't know, let's try it. 225 00:12:56,095 --> 00:12:58,410 I had to speed it up, you see, because it's doing an awful 226 00:12:58,410 --> 00:12:59,670 lot of Search. 227 00:12:59,670 --> 00:13:01,410 It's generating an awful lot of paths. 228 00:13:08,420 --> 00:13:10,280 Finally, you got a path. 229 00:13:10,280 --> 00:13:11,030 Is it the best path? 230 00:13:11,030 --> 00:13:11,680 I don't think so. 
231 00:13:11,680 --> 00:13:13,700 But we're not going to talk about optimal paths today. 232 00:13:13,700 --> 00:13:15,730 We're just going to talk about pretty good 233 00:13:15,730 --> 00:13:18,330 paths, heuristic paths. 234 00:13:18,330 --> 00:13:21,545 Let's move the starting position here in the middle. 235 00:13:21,545 --> 00:13:25,050 Do you think Breadth-first Search is going to be stupid? 236 00:13:25,050 --> 00:13:26,310 I think it's going to be pretty stupid. 237 00:13:26,310 --> 00:13:28,730 Let's see what happens. 238 00:13:28,730 --> 00:13:30,840 This searches a lot to the left, which you would never do 239 00:13:30,840 --> 00:13:31,795 with your eye. 240 00:13:31,795 --> 00:13:35,050 Let me slow that down just to demonstrate it. 241 00:13:35,050 --> 00:13:37,190 It finds a shorter path, because it's right there in 242 00:13:37,190 --> 00:13:38,470 the middle. 243 00:13:38,470 --> 00:13:41,810 But it spends a lot of its time looking off to the left. 244 00:13:41,810 --> 00:13:44,740 It's pretty stupid. 245 00:13:44,740 --> 00:13:46,810 But that's how it works. 246 00:13:46,810 --> 00:13:49,760 So, now that we've got two examples of searches on the 247 00:13:49,760 --> 00:13:52,460 table, I'd like to just write a little flow chart for how 248 00:13:52,460 --> 00:13:54,800 the search might work. 249 00:13:54,800 --> 00:13:58,710 Because if I do that, then it'll be easier for us to see 250 00:13:58,710 --> 00:14:00,600 what kind of small differences there are between the 251 00:14:00,600 --> 00:14:04,550 implementations of these various searches. 252 00:14:04,550 --> 00:14:07,500 So, what we're going to do is we're going to develop a 253 00:14:07,500 --> 00:14:11,020 waiting list, a queue, a line, whatever you'd 254 00:14:11,020 --> 00:14:11,180 like to call it. 255 00:14:11,180 --> 00:14:12,380 Let's call it a queue. 256 00:14:12,380 --> 00:14:14,590 We're going to develop a queue of paths that are under 257 00:14:14,590 --> 00:14:16,960 consideration. 
258 00:14:16,960 --> 00:14:20,120 So, the first step in our algorithm will be to 259 00:14:20,120 --> 00:14:26,242 initialize our queue. 260 00:14:29,410 --> 00:14:33,550 And I think what I'll do is I'll simulate Depth-first 261 00:14:33,550 --> 00:14:37,780 Search on this problem up there on the 262 00:14:37,780 --> 00:14:40,500 left using this algorithm. 263 00:14:40,500 --> 00:14:43,890 I need to have some way of representing my paths. 264 00:14:43,890 --> 00:14:46,960 And what I want to do is I'm going to betray my heritage as 265 00:14:46,960 --> 00:14:49,720 a Lisp programmer, because I'm just going to put these up as 266 00:14:49,720 --> 00:14:52,320 if they were Lisp s-expressions. 267 00:14:52,320 --> 00:14:55,140 To begin with, I just have one path. 268 00:14:55,140 --> 00:14:58,710 And it has only one node in it, s. 269 00:14:58,710 --> 00:15:01,400 That's the whole path. 270 00:15:01,400 --> 00:15:06,890 The next thing I do after I initialize the queue is I 271 00:15:06,890 --> 00:15:18,415 extend the first path on the queue. 272 00:15:24,750 --> 00:15:27,785 OK, when I extend s, I get two paths. 273 00:15:30,300 --> 00:15:38,670 I get s goes to a, and I get s goes to b. 274 00:15:38,670 --> 00:15:41,460 I take the first one off the front of the queue. 275 00:15:41,460 --> 00:15:44,320 And I put back the two that are produced by 276 00:15:44,320 --> 00:15:47,070 extending that path. 277 00:15:47,070 --> 00:15:50,130 Now, after I've extended the first path on the queue, I 278 00:15:50,130 --> 00:15:56,820 have to put those extended paths on to the queue. 279 00:15:56,820 --> 00:15:59,200 In here there's an explicit step where I've checked to see 280 00:15:59,200 --> 00:16:01,620 if that first path is a winner. 281 00:16:01,620 --> 00:16:03,710 If it's not, I extend it. 282 00:16:03,710 --> 00:16:07,950 And I have to put those paths onto the queue. 283 00:16:07,950 --> 00:16:10,180 So, I'll say that what I do is I enqueue. 
284 00:16:18,390 --> 00:16:19,950 Now, I've done one step. 285 00:16:19,950 --> 00:16:21,160 And let me do another step. 286 00:16:21,160 --> 00:16:24,190 I'm going to take this first path off. 287 00:16:24,190 --> 00:16:27,580 I'm going to extend that path. 288 00:16:27,580 --> 00:16:30,760 And where do I put these new paths on the queue if I'm 289 00:16:30,760 --> 00:16:34,320 doing Depth-first Search? 290 00:16:34,320 --> 00:16:38,390 Well, I want to work with the path that I've just generated. 291 00:16:38,390 --> 00:16:43,800 I'm taking this plunge down deep into the search tree. 292 00:16:43,800 --> 00:16:47,290 So, since I want to keep going down into the stuff that I 293 00:16:47,290 --> 00:16:51,440 just generated, where then do I want to put these two paths? 294 00:16:51,440 --> 00:16:52,360 At the end of the queue? 295 00:16:52,360 --> 00:16:54,430 I don't think so, because it'll be a long 296 00:16:54,430 --> 00:16:55,370 time getting there. 297 00:16:55,370 --> 00:16:56,620 I want to put them on the front of the queue. 298 00:17:00,920 --> 00:17:12,858 For Depth-first Search, I want to put them on the 299 00:17:12,858 --> 00:17:15,220 front of the queue. 300 00:17:15,220 --> 00:17:24,510 And that's why s, a, b goes here, and s, a, d, and then 301 00:17:24,510 --> 00:17:25,800 that's s, b. 302 00:17:29,770 --> 00:17:31,100 So, s, b is still there. 303 00:17:31,100 --> 00:17:33,550 That's still a valid possibility. 304 00:17:33,550 --> 00:17:35,830 But now I've stuck two paths in front of it, both of the 305 00:17:35,830 --> 00:17:39,060 ones I generated by taking a path off the front of the 306 00:17:39,060 --> 00:17:42,310 queue, discovering that it doesn't go to the goal, 307 00:17:42,310 --> 00:17:46,130 extending it and putting those back on the queue. 308 00:17:46,130 --> 00:17:49,210 I might as well complete this illustration here. 
309 00:17:49,210 --> 00:17:57,070 While I'm at it, I take the s, a, b off, s, a, b, and I can 310 00:17:57,070 --> 00:17:59,840 go only there to c. 311 00:17:59,840 --> 00:18:06,332 But, of course, I keep s, a, d and s, b on the queue. 312 00:18:09,170 --> 00:18:12,860 Now, I take the front off the queue again, and I get s, a, 313 00:18:12,860 --> 00:18:20,501 b, c, e, and not to forget s, a, d and s, b. 314 00:18:23,330 --> 00:18:25,340 I take the first one off the queue. 315 00:18:25,340 --> 00:18:27,900 It doesn't go to the goal. 316 00:18:27,900 --> 00:18:30,000 I try to extend it, but there's nothing there. 317 00:18:30,000 --> 00:18:32,080 I've reached a dead end. 318 00:18:32,080 --> 00:18:34,670 So, in this operation, all I'm doing is taking the front one 319 00:18:34,670 --> 00:18:36,620 off the queue and shortening the queue. 320 00:18:42,030 --> 00:18:43,860 We're almost home. 321 00:18:43,860 --> 00:18:45,870 I take s, a, d off the queue. 322 00:18:45,870 --> 00:18:50,160 And I get s, a, d, g. 323 00:18:50,160 --> 00:18:54,300 And, of course, I still have s, b. 324 00:18:54,300 --> 00:19:00,650 Now, the next time I visit the situation, buried in that 325 00:19:00,650 --> 00:19:04,150 first step, I discover a path that actually does get to the 326 00:19:04,150 --> 00:19:06,920 goal, and I'm done. 327 00:19:06,920 --> 00:19:10,810 So, each time around I visualize the queue. 328 00:19:10,810 --> 00:19:12,320 I check to see if I'm done. 329 00:19:12,320 --> 00:19:15,840 If not, I take the extensions and put them 330 00:19:15,840 --> 00:19:17,606 somewhere on the queue. 331 00:19:17,606 --> 00:19:25,000 And then I go back in. 332 00:19:25,000 --> 00:19:29,900 And then here there's a buried test which checks to see if 333 00:19:29,900 --> 00:19:32,280 we're done. 334 00:19:32,280 --> 00:19:36,650 That's how the Depth-first Search algorithm works. 
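The queue procedure just simulated on the board can be written down roughly as follows. This is a minimal sketch under stated assumptions, not the course's implementation: the adjacency map `GRAPH` is reconstructed from the paths named in the lecture, paths are Python lists rather than Lisp s-expressions, and the name `depth_first` is made up for the example.

```python
# Hypothetical adjacency map, reconstructed from the paths named in the lecture.
GRAPH = {
    "s": ["a", "b"],
    "a": ["b", "d", "s"],
    "b": ["a", "c", "s"],
    "c": ["b", "e"],
    "d": ["a", "g"],
    "e": ["c"],
    "g": ["d"],
}


def depth_first(graph, start, goal):
    """Depth-first search driven by a queue of partial paths."""
    queue = [[start]]  # initialize the queue with the one-node path
    while queue:
        path = queue.pop(0)  # take the first path off the front
        if path[-1] == goal:  # the test buried in the first step
            return path
        # Extend in lexical order, never revisiting a node already on the path.
        extensions = [path + [n] for n in sorted(graph[path[-1]]) if n not in path]
        queue = extensions + queue  # depth-first: new paths go on the FRONT
    return None  # queue exhausted, no path to the goal


print(depth_first(GRAPH, "s", "g"))
```

A dead end such as s, a, b, c, e produces no extensions, so the queue simply gets shorter and the next path in line, s, a, d, is picked up. That is the backtracking step, and it falls out of the queue discipline without any special code.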
335 00:19:36,650 --> 00:19:39,070 And now, would we have to start all over again if we did 336 00:19:39,070 --> 00:19:41,780 Breadth-first Search? 337 00:19:41,780 --> 00:19:42,310 Nope. 338 00:19:42,310 --> 00:19:44,220 Same algorithm. 339 00:19:44,220 --> 00:19:46,110 All the code we've got needs one line 340 00:19:46,110 --> 00:19:49,060 replaced, one line changed. 341 00:19:49,060 --> 00:19:51,080 What do I have to do different in order to get a 342 00:19:51,080 --> 00:19:52,800 Breadth-first Search out of this instead of 343 00:19:52,800 --> 00:19:54,580 a Depth-first Search? 344 00:19:54,580 --> 00:19:55,440 Tanya? 345 00:19:55,440 --> 00:19:56,930 TANYA: Change [INAUDIBLE] on the queue. 346 00:19:56,930 --> 00:19:57,940 PATRICK WINSTON: And where do I put it on the queue? 347 00:19:57,940 --> 00:19:58,840 She says to change it. 348 00:19:58,840 --> 00:19:59,720 TANYA: On the back? 349 00:19:59,720 --> 00:20:02,030 PATRICK WINSTON: Put it on the back. 350 00:20:02,030 --> 00:20:14,810 So, with Breadth-first Search all I have to do 351 00:20:14,810 --> 00:20:16,060 is put on the back. 352 00:20:21,290 --> 00:20:28,990 Now, if we were content with an inefficient search, and didn't 353 00:20:28,990 --> 00:20:31,350 care much about how good our path was, we'd be done. 354 00:20:31,350 --> 00:20:33,330 And we could go home. 355 00:20:33,330 --> 00:20:35,690 But we are a little concerned about the 356 00:20:35,690 --> 00:20:37,740 efficiency of our search. 357 00:20:37,740 --> 00:20:39,820 And we would like a pretty good path. 358 00:20:39,820 --> 00:20:41,630 So, we're going to have to stick around 359 00:20:41,630 --> 00:20:42,880 for a little while. 360 00:20:45,290 --> 00:20:49,770 Now, you may have noticed, up there in the development 361 00:20:49,770 --> 00:20:57,180 of the Breadth-first Search, that the algorithm is 362 00:20:57,180 --> 00:20:58,430 incredibly stupid. 363 00:21:01,300 --> 00:21:03,115 Why is the algorithm incredibly stupid? 
364 00:21:06,000 --> 00:21:07,200 Ty, what do you think? 365 00:21:07,200 --> 00:21:09,189 TY: It can't tell whether it's getting closer or further away 366 00:21:09,189 --> 00:21:09,540 from the goal. 367 00:21:09,540 --> 00:21:11,090 PATRICK WINSTON: It certainly can't tell whether it's 368 00:21:11,090 --> 00:21:13,330 getting closer or further away from the goal. 369 00:21:13,330 --> 00:21:16,090 And we're going to deal with that in a minute. 370 00:21:16,090 --> 00:21:17,640 But it's even stupider than that. 371 00:21:20,880 --> 00:21:21,540 Why is it stupid? 372 00:21:21,540 --> 00:21:22,474 What's your name? 373 00:21:22,474 --> 00:21:24,342 DYLAN: Dylan. 374 00:21:24,342 --> 00:21:26,677 It [? hits ?] the same nodes twice. 375 00:21:26,677 --> 00:21:30,340 PATRICK WINSTON: Dylan said it's extending paths that go 376 00:21:30,340 --> 00:21:33,660 to the same node more than once. 377 00:21:33,660 --> 00:21:34,970 Let's see what Dylan's talking about. 378 00:21:39,410 --> 00:21:42,522 Down here, it extends a. 379 00:21:42,522 --> 00:21:46,810 But it's already extended a up there. 380 00:21:46,810 --> 00:21:51,440 Down here, it extends a path that goes to b. 381 00:21:51,440 --> 00:21:54,900 And it's already extended a path that goes to b. 382 00:21:54,900 --> 00:22:02,170 Over here, it could extend a path that went through c, but 383 00:22:02,170 --> 00:22:05,800 it's already got a path that goes through c. 384 00:22:05,800 --> 00:22:09,450 So, all of these paths are duplicated. 385 00:22:09,450 --> 00:22:12,040 And we're still going through them. 386 00:22:12,040 --> 00:22:15,450 That's incredibly stupid. 387 00:22:15,450 --> 00:22:17,460 What we're going to do is we're going to amend our 388 00:22:17,460 --> 00:22:18,750 algorithm just a little bit. 389 00:22:23,940 --> 00:22:31,210 And we're not going to extend the first path on the queue 390 00:22:31,210 --> 00:22:49,440 unless final node never before extended. 
391 00:22:56,550 --> 00:22:58,380 What we're going to do is we're going to look to see if 392 00:22:58,380 --> 00:22:59,780 there-- we've got this path. 393 00:22:59,780 --> 00:23:00,930 And we're going to extend it. 394 00:23:00,930 --> 00:23:02,190 And it's got a final node. 395 00:23:02,190 --> 00:23:05,700 If we've ever extended a path that goes to that final node, 396 00:23:05,700 --> 00:23:07,580 and it was a final node on that path, then we're not 397 00:23:07,580 --> 00:23:09,630 going to do it again. 398 00:23:09,630 --> 00:23:16,520 We got to keep a list of places that have already been 399 00:23:16,520 --> 00:23:19,506 the last piece of a path that was extended. 400 00:23:19,506 --> 00:23:20,830 Everybody got that? 401 00:23:20,830 --> 00:23:24,130 It's a little awkward to say it, because it's the last node 402 00:23:24,130 --> 00:23:24,870 we care about. 403 00:23:24,870 --> 00:23:30,640 If a path terminates in a node, and if some other path 404 00:23:30,640 --> 00:23:33,480 previously terminated in that node and got extended-- 405 00:23:33,480 --> 00:23:34,570 we're not going to do it again. 406 00:23:34,570 --> 00:23:35,820 Because it's a waste of time. 407 00:23:39,690 --> 00:23:41,260 Now, let's see if this actually helps. 408 00:23:51,470 --> 00:23:53,200 Now, use the extended list. 409 00:23:53,200 --> 00:24:00,305 Let's see, well, gee, we got that place 410 00:24:00,305 --> 00:24:01,050 in the center there. 411 00:24:01,050 --> 00:24:02,790 Let's just repeat the previous search. 412 00:24:15,200 --> 00:24:17,200 Wow, it's taking a long time. 413 00:24:17,200 --> 00:24:22,610 But notice it put 103 paths back on the queue. 414 00:24:22,610 --> 00:24:26,200 Now, let's add a filter and try again. 415 00:24:39,540 --> 00:24:41,640 A lot less. 416 00:24:41,640 --> 00:24:46,950 So, let's speed this up, and we'll start way over here. 417 00:24:46,950 --> 00:24:49,390 You remember how tedious that search was. 
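The effect of the extended list can be checked on the small board graph with a sketch like the one below. Everything here is illustrative rather than taken from the course: the adjacency map `GRAPH` is reconstructed from the paths named in the lecture, and the function `search` with its flags `add_to_front` and `use_extended` is a hypothetical interface. The function also counts how many paths get put on the queue, so the two runs can be compared.

```python
# Hypothetical adjacency map, reconstructed from the paths named in the lecture.
GRAPH = {
    "s": ["a", "b"],
    "a": ["b", "d", "s"],
    "b": ["a", "c", "s"],
    "c": ["b", "e"],
    "d": ["a", "g"],
    "e": ["c"],
    "g": ["d"],
}


def search(graph, start, goal, add_to_front, use_extended):
    """Queue-based search; returns (path, number of paths put on the queue)."""
    queue = [[start]]
    extended = set()  # final nodes of paths that have already been extended
    enqueued = 0
    while queue:
        path = queue.pop(0)
        last = path[-1]
        if last == goal:
            return path, enqueued
        if use_extended:
            if last in extended:
                continue  # an earlier path ending here was already extended
            extended.add(last)
        new_paths = [path + [n] for n in sorted(graph[last]) if n not in path]
        enqueued += len(new_paths)
        # Front of the queue gives depth-first, back gives breadth-first.
        queue = new_paths + queue if add_to_front else queue + new_paths
    return None, enqueued


print(search(GRAPH, "s", "g", add_to_front=False, use_extended=False))
print(search(GRAPH, "s", "g", add_to_front=False, use_extended=True))
```

On this graph, breadth-first search puts 11 paths on the queue without the filter and 8 with it, and finds s, a, d, g either way. The saving is small here only because the graph is tiny; the counts from the Cambridge demonstration are not reproduced by this sketch.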
418 00:24:49,390 --> 00:24:55,240 And now we'll repeat it with this list, boom, there it is. 419 00:24:55,240 --> 00:24:58,050 That's all because we didn't do that silly thing of going 420 00:24:58,050 --> 00:25:01,610 back through the final node that's 421 00:25:01,610 --> 00:25:04,170 already been gone through. 422 00:25:04,170 --> 00:25:07,380 So, you would never not want to do this. 423 00:25:07,380 --> 00:25:09,220 We better list this as another option. 424 00:25:19,846 --> 00:25:22,420 It doesn't help with a British Museum algorithm, because 425 00:25:22,420 --> 00:25:24,670 nothing helps with the British Museum algorithm. 426 00:25:24,670 --> 00:25:25,950 Does it help with Depth-first? 427 00:25:25,950 --> 00:25:26,800 Yes. 428 00:25:26,800 --> 00:25:28,350 Does it help with Breadth-first? 429 00:25:28,350 --> 00:25:29,680 Yes. 430 00:25:29,680 --> 00:25:32,000 Do we do backtracking with Breadth-first? 431 00:25:32,000 --> 00:25:38,870 No, because backtracking can't do us any good. 432 00:25:38,870 --> 00:25:43,990 OK, we're almost, except that search that's starting in the 433 00:25:43,990 --> 00:25:46,090 middle is still pretty stupid. 434 00:25:46,090 --> 00:25:52,200 Both the Breadth-first version and the Depth-first version 435 00:25:52,200 --> 00:25:53,940 are going off to the left. 436 00:25:53,940 --> 00:25:58,320 And we would never do that with our eyes in any case. 437 00:25:58,320 --> 00:26:01,720 The next thing we want to do is we want to have ourselves a 438 00:26:01,720 --> 00:26:07,020 slightly more informed search by taking into consideration 439 00:26:07,020 --> 00:26:08,525 whether we seem to be getting anywhere. 440 00:26:13,490 --> 00:26:17,550 So, in general, it's a good thing to get closer to where 441 00:26:17,550 --> 00:26:19,890 we want to go. 
442 00:26:19,890 --> 00:26:24,200 In general, if we've got a choice of going to a node 443 00:26:24,200 --> 00:26:27,210 that's close to the goal or a node that's not so close to 444 00:26:27,210 --> 00:26:28,830 the goal, we'll always want to go to the one that's 445 00:26:28,830 --> 00:26:31,010 close to the goal. 446 00:26:31,010 --> 00:26:35,610 And as soon as we add that to what we're doing, we have 447 00:26:35,610 --> 00:26:47,290 another kind of Search, which goes by the 448 00:26:47,290 --> 00:26:48,540 name of Hill Climbing. 449 00:26:56,900 --> 00:26:59,830 And it's just like Depth-first Search, except instead of 450 00:26:59,830 --> 00:27:03,540 using lexical order to break ties, we're going to break 451 00:27:03,540 --> 00:27:06,290 ties according to which node is closer to the goal. 452 00:27:09,720 --> 00:27:11,260 I went to some trouble to talk to you about 453 00:27:11,260 --> 00:27:12,680 this enqueued list. 454 00:27:12,680 --> 00:27:14,020 And having gone to that trouble, I'm now 455 00:27:14,020 --> 00:27:15,860 going to ignore it. 456 00:27:15,860 --> 00:27:18,260 Not because it isn't a good idea, but because trying to 457 00:27:18,260 --> 00:27:20,490 keep track of everything in the example is 458 00:27:20,490 --> 00:27:21,310 confusing the example. 459 00:27:21,310 --> 00:27:24,350 It won't work out right in the small example and all that. 460 00:27:24,350 --> 00:27:28,270 Put the queueing thing aside, queued list aside, and think 461 00:27:28,270 --> 00:27:33,120 instead just about the value of going in the direction 462 00:27:33,120 --> 00:27:36,820 that's getting us closer to the goal. 463 00:27:36,820 --> 00:27:40,520 In Hill Climbing Search, just like a Depth-first Search, we 464 00:27:40,520 --> 00:27:42,790 have a and b. 465 00:27:42,790 --> 00:27:45,800 And we're still going to list them lexically underneath 466 00:27:45,800 --> 00:27:47,636 the parent node.
467 00:27:47,636 --> 00:27:53,630 But now which one is closer to the goal? 468 00:27:53,630 --> 00:27:56,620 Now, this time b is closer to the goal than a. 469 00:27:56,620 --> 00:27:59,340 So, instead of following the Depth-first course, which 470 00:27:59,340 --> 00:28:02,030 would take us down through a, we're going to go to the one 471 00:28:02,030 --> 00:28:04,890 that's closest which goes through b. 472 00:28:04,890 --> 00:28:08,140 And b can either go to a or c. 473 00:28:14,670 --> 00:28:19,040 b is six units away from the goal. a is about seven plus, 474 00:28:19,040 --> 00:28:20,620 not drawn exactly to scale. 475 00:28:20,620 --> 00:28:22,483 Use the numbers, not your eyes. 476 00:28:25,420 --> 00:28:26,740 Now where are we? 477 00:28:26,740 --> 00:28:28,990 It's symmetric, so a and c are both equally 478 00:28:28,990 --> 00:28:30,920 far from the goal. 479 00:28:30,920 --> 00:28:32,220 Now we're going to use the lexical order 480 00:28:32,220 --> 00:28:34,160 to break the tie. 481 00:28:34,160 --> 00:28:41,510 Now from s, b, a, we'll go to d. 482 00:28:41,510 --> 00:28:44,190 And now, which is closest to the goal? 483 00:28:44,190 --> 00:28:45,340 That's the only choice we have. 484 00:28:45,340 --> 00:28:49,050 So, now we have no choice but to go down to the goal. 485 00:28:49,050 --> 00:28:53,250 That's the Hill Climbing way of doing the search. 486 00:28:53,250 --> 00:28:54,820 And notice that this time there's no backtracking. 487 00:28:57,740 --> 00:28:59,270 It's not the optimal path. 488 00:28:59,270 --> 00:28:59,940 It's not the best path. 489 00:28:59,940 --> 00:29:02,010 But at least there's no backtracking. 490 00:29:02,010 --> 00:29:02,830 That's not always true. 491 00:29:02,830 --> 00:29:07,180 That's just an artifact of this particular example. 492 00:29:07,180 --> 00:29:10,910 Do you think Hill Climbing would produce a faster search? 493 00:29:10,910 --> 00:29:13,030 I think so.
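The worked example above (s, then b, then a, then d, then the goal) can be reproduced with a short recursive sketch. The assumptions are loud ones: the edges in `GRAPH` are a guess at the blackboard graph, and only the distance for b (six) and the rough value for a ("seven plus") come from the lecture; every other entry in `DISTANCE`, and the function name `hill_climb`, are made up for illustration.

```python
# Hypothetical graph; the edges are an assumption, not taken from the lecture.
GRAPH = {
    "S": ["A", "B"],
    "A": ["B", "D", "S"],
    "B": ["A", "C", "S"],
    "C": ["B", "E"],
    "D": ["A", "G"],
    "E": ["C"],
    "G": ["D"],
}

# Straight-line distance to the goal G. Only B (6) and A (about 7) are stated
# in the lecture; C equals A by the symmetry mentioned there; the rest are assumed.
DISTANCE = {"S": 11.0, "A": 7.4, "B": 6.0, "C": 7.4, "D": 4.0, "E": 5.0, "G": 0.0}


def hill_climb(path, goal):
    """Depth-first search that tries the child closest to the goal first."""
    node = path[-1]
    if node == goal:
        return path
    children = [n for n in GRAPH[node] if n not in path]
    # Closest to the goal first; lexical order breaks ties.
    children.sort(key=lambda n: (DISTANCE[n], n))
    for child in children:
        result = hill_climb(path + [child], goal)
        if result is not None:
            return result
    return None  # dead end: the caller backtracks to its next-best child


print(hill_climb(["S"], "G"))
```

With these assumed numbers the search never needs to backtrack, but the `for` loop is what would carry out the backtracking if a chosen child led to a dead end.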
494 00:29:13,030 --> 00:29:15,040 Let's see what happens when we add these 495 00:29:15,040 --> 00:29:17,570 things one at a time. 496 00:29:17,570 --> 00:29:22,790 First, let's turn off our extended list. 497 00:29:25,610 --> 00:29:27,010 We turned off our extended list. 498 00:29:27,010 --> 00:29:29,980 And we're going to do Depth-first again just for the 499 00:29:29,980 --> 00:29:33,040 sake of comparison. 500 00:29:33,040 --> 00:29:38,460 It produces a very roundabout path with 48 enqueueings. 501 00:29:38,460 --> 00:29:41,230 Now, let's switch over to Hill Climbing. 502 00:29:41,230 --> 00:29:41,780 And what do you think? 503 00:29:41,780 --> 00:29:44,630 Do you think it will produce a straighter path, fewer 504 00:29:44,630 --> 00:29:46,720 enqueueings? 505 00:29:46,720 --> 00:29:47,970 Boom. 506 00:29:49,930 --> 00:29:52,180 You wouldn't not want to do that, would you? 507 00:29:52,180 --> 00:29:55,450 If you've got some kind of heuristic that tells you that 508 00:29:55,450 --> 00:29:58,360 you're getting close to the goal, you should use it. 509 00:29:58,360 --> 00:30:03,330 Now, it's easy to modify my example over there so that 510 00:30:03,330 --> 00:30:05,550 getting close to the goal gets you trapped in a 511 00:30:05,550 --> 00:30:06,690 blind alley on e. 512 00:30:06,690 --> 00:30:08,730 That's easy to do. 513 00:30:08,730 --> 00:30:10,280 But that's just an artifact of the example. 514 00:30:10,280 --> 00:30:13,130 In general, you want to go along a path that gets you 515 00:30:13,130 --> 00:30:15,060 closer to the goal. 516 00:30:15,060 --> 00:30:16,040 So, that's 23. 517 00:30:16,040 --> 00:30:18,400 I don't know, let's see if using the extended list filter 518 00:30:18,400 --> 00:30:19,650 does any good. 519 00:30:21,870 --> 00:30:23,360 Yeah, still 23.
520 00:30:23,360 --> 00:30:30,670 So, in that particular case the extended list didn't 521 00:30:30,670 --> 00:30:33,190 actually do us any good, because we're driving so 522 00:30:33,190 --> 00:30:34,440 directly toward the goal. 523 00:30:36,960 --> 00:30:38,740 OK, that's that. 524 00:30:38,740 --> 00:30:42,410 Now, let's see, is there any analog to-- 525 00:30:42,410 --> 00:30:47,580 well, we might say that this is yet another way of 526 00:30:47,580 --> 00:30:49,750 distinguishing the searches. 527 00:30:49,750 --> 00:30:57,990 And that is, is it an informed search? 528 00:31:00,620 --> 00:31:03,540 Is it making use of any kind of heuristic information? 529 00:31:03,540 --> 00:31:05,640 Certainly, a British Museum is not, Depth is 530 00:31:05,640 --> 00:31:07,530 not, Breadth is not. 531 00:31:07,530 --> 00:31:09,980 And now let's consider what we got for Hill Climbing. 532 00:31:09,980 --> 00:31:11,240 Do we want to use backtracking? 533 00:31:11,240 --> 00:31:12,410 Sure. 534 00:31:12,410 --> 00:31:14,010 Do we want to use an enqueued list? 535 00:31:14,010 --> 00:31:15,650 Sure. 536 00:31:15,650 --> 00:31:18,210 And it is informed, because it's taking advantage of this 537 00:31:18,210 --> 00:31:18,800 extra information. 538 00:31:18,800 --> 00:31:20,500 It may not be in your problem. 539 00:31:20,500 --> 00:31:24,570 It's not often the case you've got this information in a map. 540 00:31:24,570 --> 00:31:28,040 Your problem may not have any heuristic measurement of 541 00:31:28,040 --> 00:31:29,325 distance to the goal. 542 00:31:29,325 --> 00:31:31,090 In which case, you can't do it. 543 00:31:31,090 --> 00:31:33,980 But if you've got it, you should use it. 544 00:31:33,980 --> 00:31:35,550 Oh, yeah, there's one more. 545 00:31:35,550 --> 00:31:40,020 And I've already given it away by having it on my chart. 546 00:31:40,020 --> 00:31:41,110 It's called Beam Search.
547 00:31:41,110 --> 00:31:46,210 And just as Hill Climbing is an analog of Depth-first 548 00:31:46,210 --> 00:31:51,770 Search, Beam Search is a complement or addition of an 549 00:31:51,770 --> 00:31:55,140 informing heuristic to Breadth-first Search. 550 00:31:55,140 --> 00:31:56,880 What you do is you start off just like 551 00:31:56,880 --> 00:31:58,130 Breadth-first Search. 552 00:32:01,190 --> 00:32:05,230 But you say I'm going to limit the number of paths I'm going 553 00:32:05,230 --> 00:32:09,550 to consider at any level to some small, fixed number, 554 00:32:09,550 --> 00:32:12,530 like, in this case, how about two. 555 00:32:12,530 --> 00:32:15,970 So, I'm going to say that I have a Beam of 556 00:32:15,970 --> 00:32:18,080 two for my Beam Search. 557 00:32:25,530 --> 00:32:29,080 Otherwise, I proceed just like Breadth-first 558 00:32:29,080 --> 00:32:30,330 Search, b, d, a, c. 559 00:32:36,230 --> 00:32:38,880 And now I've got that stupid thing where I'm duplicating my 560 00:32:38,880 --> 00:32:42,110 nodes, because I'm forgetting about the enqueued list. 561 00:32:42,110 --> 00:32:44,760 But to illustrate Beam Search, what I'm going to do now 562 00:32:44,760 --> 00:32:47,080 is I'm going to take all these paths I've got at the second 563 00:32:47,080 --> 00:32:49,190 level, and I'm only going to keep the best two. 564 00:32:49,190 --> 00:32:51,130 That's my beam width. 565 00:32:51,130 --> 00:32:54,220 And the best two are the two that get closest to the goal. 566 00:32:54,220 --> 00:32:58,430 So, those four, b, c, a, and d, which two get 567 00:32:58,430 --> 00:32:59,945 closest to the goal? 568 00:32:59,945 --> 00:33:02,160 Now, b and d. 569 00:33:02,160 --> 00:33:05,530 These guys are trimmed off. 570 00:33:05,530 --> 00:33:08,030 I'm only keeping two at every level. 571 00:33:08,030 --> 00:33:11,690 Now, going down from b and d, I have, at the next 572 00:33:11,690 --> 00:33:14,250 level, c and g.
573 00:33:14,250 --> 00:33:15,860 And now I've found the goal. 574 00:33:15,860 --> 00:33:17,790 So, I'm done. 575 00:33:17,790 --> 00:33:19,040 We could do that here, too. 576 00:33:22,750 --> 00:33:32,350 We could choose a Beam Search, not bad. 577 00:33:32,350 --> 00:33:34,940 Let's see, let's try this thing from the middle. 578 00:33:34,940 --> 00:33:37,530 Let's slow my speed down a little bit. 579 00:33:37,530 --> 00:33:39,853 Now, are we going to see anything going off to the left 580 00:33:39,853 --> 00:33:44,950 like we did with ordinary Breadth-first Search? 581 00:33:44,950 --> 00:33:47,640 No, because it's smart. 582 00:33:47,640 --> 00:33:49,780 It doesn't say, I want to go to a place that's further away 583 00:33:49,780 --> 00:33:51,030 from my goal. 584 00:33:53,870 --> 00:33:57,940 Now, let's see, maybe we can go back to our algorithm now 585 00:33:57,940 --> 00:34:07,620 and talk about that enqueueing mechanism and 586 00:34:07,620 --> 00:34:08,870 talk about Hill Climbing. 587 00:34:17,170 --> 00:34:21,060 Can I use the same basic search mechanism, just change 588 00:34:21,060 --> 00:34:22,199 that one line again? 589 00:34:22,199 --> 00:34:23,570 Yes. 590 00:34:23,570 --> 00:34:27,199 How do I add new paths to the queue this time? 591 00:34:27,199 --> 00:34:29,210 Well, it's very much like Depth-first, right? 592 00:34:29,210 --> 00:34:31,880 I want to add them to the front but 593 00:34:31,880 --> 00:34:32,960 with one little flourish. 594 00:34:32,960 --> 00:34:34,210 What's the flourish? 595 00:34:34,210 --> 00:34:36,460 [? Krishna, ?] what do you think? 596 00:34:36,460 --> 00:34:39,500 Remember, I want to use my heuristic information. 597 00:34:39,500 --> 00:34:41,900 So, I not only add them to the front, but amongst the ones 598 00:34:41,900 --> 00:34:43,705 I'm adding to the front, what do I do? 599 00:34:43,705 --> 00:34:44,980 AUDIENCE: Check the distance? 600 00:34:44,980 --> 00:34:45,850 PATRICK WINSTON: Check the distance.
601 00:34:45,850 --> 00:34:47,120 And how do you arrange them? 602 00:34:47,120 --> 00:34:47,958 AUDIENCE: [? You ?] 603 00:34:47,958 --> 00:34:49,219 [? keep the ?] minimum [? first. ?] 604 00:34:49,219 --> 00:34:51,570 PATRICK WINSTON: Yeah, you can put the minimum 605 00:34:51,570 --> 00:34:52,290 first if you like. 606 00:34:52,290 --> 00:34:53,739 But let's sort them. 607 00:34:53,739 --> 00:34:56,520 We'll sort them, that will keep everything straight. 608 00:34:56,520 --> 00:35:03,060 So Hill Climbing is front-sorted. 609 00:35:13,370 --> 00:35:14,580 And, finally, how about Beam? 610 00:35:14,580 --> 00:35:17,860 What do we do with Beam Search to add them to the queue? 611 00:35:17,860 --> 00:35:21,175 Well, it doesn't matter where we add them, because all we're 612 00:35:21,175 --> 00:35:25,840 going to do is we're going to keep the w best. 613 00:35:25,840 --> 00:35:30,270 So, with Beam, we'll just abbreviate that by 614 00:35:30,270 --> 00:35:35,070 saying keep w best. 615 00:35:41,330 --> 00:35:43,870 Now, you have some of the basic 616 00:35:43,870 --> 00:35:46,450 searches in your toolkit. 617 00:35:46,450 --> 00:35:48,800 There's one more that's sometimes talked about. 618 00:35:48,800 --> 00:35:53,710 We've got Depth, Breadth, Hill Climbing, and Beam, one more is 619 00:35:53,710 --> 00:35:57,720 Best, Best-first Search. 620 00:35:57,720 --> 00:36:04,250 It's a variant where you say, I've got this tree. 621 00:36:04,250 --> 00:36:08,270 It's got a bunch of paths that terminate in leaves. 622 00:36:08,270 --> 00:36:10,510 Let me just always work on the leaf node that's 623 00:36:10,510 --> 00:36:12,790 closest to the goal. 624 00:36:12,790 --> 00:36:15,510 It can skip around a little bit from one place to another. 625 00:36:15,510 --> 00:36:18,490 Because as it pursues one path, it may not do very well, 626 00:36:18,490 --> 00:36:20,350 and some other path, quite distant 627 00:36:20,350 --> 00:36:21,870 in the tree, will become the best one.
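The point that one basic mechanism covers all of these searches, with only the queue-update step changing, can be sketched as follows. Everything named here is an assumption for illustration: the graph, most of the distances, the method labels, and the way Beam is fitted into a single queue (trimming to the `w` best whenever the queue holds paths of one length only, that is, a complete level).

```python
# Hypothetical graph and distances; only B = 6 and A of about 7 are from the lecture.
GRAPH = {
    "S": ["A", "B"],
    "A": ["B", "D", "S"],
    "B": ["A", "C", "S"],
    "C": ["B", "E"],
    "D": ["A", "G"],
    "E": ["C"],
    "G": ["D"],
}
DISTANCE = {"S": 11.0, "A": 7.4, "B": 6.0, "C": 7.4, "D": 4.0, "E": 5.0, "G": 0.0}


def remaining(path):
    """Heuristic: straight-line distance from the end of the path to the goal."""
    return DISTANCE[path[-1]]


def search(start, goal, method, w=2):
    """Generic queue-based search; only the queue update differs by method."""
    queue = [[start]]
    while queue:
        path = queue.pop(0)
        node = path[-1]
        if node == goal:
            return path
        # Children in lexical order, never revisiting a node on this path.
        new_paths = [path + [n] for n in sorted(GRAPH[node]) if n not in path]
        if method == "depth":        # add to the front
            queue = new_paths + queue
        elif method == "breadth":    # add to the back
            queue = queue + new_paths
        elif method == "hill":       # add to the front, sorted by distance
            queue = sorted(new_paths, key=remaining) + queue
        elif method == "best":       # sort the whole queue by distance
            queue = sorted(queue + new_paths, key=remaining)
        elif method == "beam":       # breadth-first, keeping the w best per level
            queue = queue + new_paths
            if len({len(p) for p in queue}) == 1:
                queue = sorted(queue, key=remaining)[:w]
        else:
            raise ValueError(f"unknown method: {method}")
    return None


for name in ("depth", "breadth", "hill", "beam", "best"):
    print(name, search("S", "G", name))
```

Python's `sorted` is stable, so paths that tie on distance stay in lexical order, which matches the tie-breaking rule used on the blackboard.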
628 00:36:24,710 --> 00:36:28,930 We've actually seen an instance of that in the 629 00:36:28,930 --> 00:36:30,690 integration program. 630 00:36:30,690 --> 00:36:32,840 It's capable of skipping all over the place, because it's 631 00:36:32,840 --> 00:36:37,950 always taking the easiest problem in the search tree, in 632 00:36:37,950 --> 00:36:39,940 the and/or tree, working on that. 633 00:36:39,940 --> 00:36:43,300 That's Best-first Search. 634 00:36:43,300 --> 00:36:44,340 You can do these sorts of things in 635 00:36:44,340 --> 00:36:46,780 continuous spaces, too. 636 00:36:46,780 --> 00:36:47,830 And you've done the mathematics of 637 00:36:47,830 --> 00:36:52,580 that in 18.02 or something. 638 00:36:52,580 --> 00:36:56,190 But in continuous spaces, the Hill Climbing sometimes leads 639 00:36:56,190 --> 00:36:59,470 to problems or doesn't do very well. 640 00:36:59,470 --> 00:37:04,140 What kind of a problem can you encounter in a continuous 641 00:37:04,140 --> 00:37:08,210 space with Hill Climbing? 642 00:37:08,210 --> 00:37:11,726 Well, how would you do Hill Climbing in 643 00:37:11,726 --> 00:37:12,840 a continuous space? 644 00:37:12,840 --> 00:37:16,550 Let's say we're in the mountains, and a big 645 00:37:16,550 --> 00:37:18,500 fog has come up. 646 00:37:18,500 --> 00:37:20,470 We're trying to get to the top of the hill before 647 00:37:20,470 --> 00:37:22,720 we freeze to death. 648 00:37:22,720 --> 00:37:28,006 And we take a few steps north, a few steps east, west, and 649 00:37:28,006 --> 00:37:30,380 south using our compass. 650 00:37:30,380 --> 00:37:33,510 And we check to see which direction seems to be doing 651 00:37:33,510 --> 00:37:37,220 the best job of getting us moving upward. 652 00:37:37,220 --> 00:37:39,140 And that's our Hill Climbing approach, right? 653 00:37:39,140 --> 00:37:43,700 We have explored four directions we can go and pick 654 00:37:43,700 --> 00:37:45,180 the best one.
655 00:37:45,180 --> 00:37:47,940 And from there, we pick four, try all those, pick the best 656 00:37:47,940 --> 00:37:48,800 one, and away we go. 657 00:37:48,800 --> 00:37:51,180 We've got ourselves a Hill Climbing algorithm. 658 00:37:51,180 --> 00:37:53,210 What's wrong with it? 659 00:37:53,210 --> 00:37:54,710 Or what can be wrong with it? 660 00:37:54,710 --> 00:37:56,840 Sometimes it works just fine. 661 00:37:56,840 --> 00:37:57,330 Yes. 662 00:37:57,330 --> 00:37:59,605 SPEAKER 1: You might get stuck in a local maximum. 663 00:37:59,605 --> 00:38:02,320 PATRICK WINSTON: We might get stuck in a local maximum. 664 00:38:02,320 --> 00:38:10,590 So, problem letter a is that if this is your space, it may 665 00:38:10,590 --> 00:38:11,450 look like that. 666 00:38:11,450 --> 00:38:16,640 And you may get stuck on a local maximum. 667 00:38:16,640 --> 00:38:18,330 Is there any other kind of problem that can come up? 668 00:38:26,215 --> 00:38:28,820 Well, it all depends on what the space is like. 669 00:38:28,820 --> 00:38:30,645 Here's a problem where the space has local maxima. 670 00:38:33,230 --> 00:38:35,010 Now, a lot of people have been killed on Mt. 671 00:38:35,010 --> 00:38:38,000 Washington when the fog comes up. 672 00:38:38,000 --> 00:38:40,270 And they do freeze to death, why? 673 00:38:43,270 --> 00:38:47,440 The reason they freeze to death is the Hill Climbing 674 00:38:47,440 --> 00:38:49,080 fails them, and they can't get to the top 675 00:38:49,080 --> 00:38:50,960 to the ranger station. 676 00:38:50,960 --> 00:38:53,660 And the reason is that there are large lawns on the 677 00:38:53,660 --> 00:38:54,230 shoulders of Mt. 678 00:38:54,230 --> 00:38:54,580 Washington. 679 00:38:54,580 --> 00:38:57,140 It's quite flat. 680 00:38:57,140 --> 00:38:58,555 So, it's the telephone pole problem. 681 00:39:02,090 --> 00:39:06,120 That space looks like this. 682 00:39:06,120 --> 00:39:07,040 Well, this isn't what Mt. 
683 00:39:07,040 --> 00:39:07,910 Washington looks like. 684 00:39:07,910 --> 00:39:10,620 But it's the telephone pole problem. 685 00:39:10,620 --> 00:39:14,370 So, when you're wandering around here, the idea of 686 00:39:14,370 --> 00:39:17,720 trying a few directions and picking the one that's best 687 00:39:17,720 --> 00:39:20,210 doesn't help any, because it's flat. 688 00:39:20,210 --> 00:39:23,090 That can be a problem with Hill Climbing. 689 00:39:23,090 --> 00:39:26,160 Now, there's one more problem with Hill Climbing that most 690 00:39:26,160 --> 00:39:29,000 people don't know about. 691 00:39:29,000 --> 00:39:30,253 But it works like this. 692 00:39:34,600 --> 00:39:36,210 This is a particularly acute problem in 693 00:39:36,210 --> 00:39:37,280 high dimensional spaces. 694 00:39:37,280 --> 00:39:39,110 I'll illustrate it here just in two. 695 00:39:39,110 --> 00:39:44,370 And I'm going to switch from a regular kind of view to a 696 00:39:44,370 --> 00:39:46,570 contour map. 697 00:39:46,570 --> 00:39:54,950 So, my contour map is going to betray the presence of a sharp 698 00:39:54,950 --> 00:40:04,050 ridge along the 45 degree line. 699 00:40:04,050 --> 00:40:07,500 Now you see how you can get in trouble there. 700 00:40:07,500 --> 00:40:10,280 You get in trouble, because if you take a step in each 701 00:40:10,280 --> 00:40:13,350 direction, every direction takes you downhill. 702 00:40:13,350 --> 00:40:15,610 And you think you're at the top. 703 00:40:15,610 --> 00:40:19,826 So, suppose you're right here and you go north. 704 00:40:19,826 --> 00:40:24,180 That takes you down over a contour line. 705 00:40:24,180 --> 00:40:27,550 If you go south, that also takes you down 706 00:40:27,550 --> 00:40:29,000 over contour lines. 707 00:40:29,000 --> 00:40:33,870 Likewise, going west and east all appear to be taking you 708 00:40:33,870 --> 00:40:38,015 down, whereas, in fact, you're climbing a ridge.
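The ridge problem can be checked numerically with a small sketch. The surface `height` below is an invented example of a sharp ridge rising along the 45 degree line, not the contour map drawn in the lecture, and `compass_gains` and the step size are likewise assumptions.

```python
def height(x, y):
    """Made-up terrain: a narrow ridge along y = x that rises as x + y grows."""
    return (x + y) - 10.0 * (x - y) ** 2


def compass_gains(x, y, step=0.5):
    """Height gained by one step north, south, east, and west from (x, y)."""
    moves = {
        "north": (0.0, step),
        "south": (0.0, -step),
        "east": (step, 0.0),
        "west": (-step, 0.0),
    }
    here = height(x, y)
    return {name: height(x + dx, y + dy) - here for name, (dx, dy) in moves.items()}


gains = compass_gains(1.0, 1.0)
print(gains)                                   # every compass step loses height
print(height(1.5, 1.5) - height(1.0, 1.0))     # a step along the ridge gains height
```

Standing on the ridge at (1, 1), all four compass steps lead downhill, so a climber who only tests those directions concludes this is the top, even though moving diagonally along the ridge still climbs. With this particular surface the effect depends on the step being large relative to the ridge's width; a much smaller step would find a slight gain to the north or east.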
709 00:40:38,015 --> 00:40:42,270 And that contour line is the highest that I've shown. 710 00:40:42,270 --> 00:40:43,900 So, sometimes you can get fooled-- 711 00:40:43,900 --> 00:40:46,000 not stuck, but fooled-- into thinking you're at the top 712 00:40:46,000 --> 00:40:47,250 when you're actually not. 713 00:40:49,830 --> 00:40:54,410 Now, this is a model of something. 714 00:40:54,410 --> 00:40:58,130 This subject is about modeling intelligence. 715 00:40:58,130 --> 00:41:01,680 And this is a kind of algorithm you frequently need 716 00:41:01,680 --> 00:41:04,580 in order to build an intelligent system. 717 00:41:04,580 --> 00:41:07,360 But do we have any kind of Search happening in our heads? 718 00:41:11,650 --> 00:41:13,820 If we're going to model what goes on inside our heads, do 719 00:41:13,820 --> 00:41:19,850 we have to model any kind of searching in order to do the 720 00:41:19,850 --> 00:41:24,010 kinds of things that we humans do? 721 00:41:24,010 --> 00:41:24,720 I suppose so. 722 00:41:24,720 --> 00:41:28,250 Anytime we make a plan, we're actually evaluating a bunch of 723 00:41:28,250 --> 00:41:31,340 choices and seeing how they work. 724 00:41:31,340 --> 00:41:33,020 Let me see if I can illustrate it another way. 725 00:41:37,120 --> 00:41:42,260 This is a system that I showed you a little bit of last time. 726 00:41:42,260 --> 00:41:46,590 And, shoot, I might as well review one or two things here. 727 00:41:51,970 --> 00:41:53,640 I showed you a Macbeth story. 728 00:41:53,640 --> 00:41:56,220 This is the story I showed you. 729 00:41:56,220 --> 00:41:59,500 And if you had this in a humanities class, one of the simplest 730 00:41:59,500 --> 00:42:04,620 questions that might be asked is why did Macduff kill 731 00:42:04,620 --> 00:42:08,240 Macbeth down there at the bottom? 732 00:42:08,240 --> 00:42:10,960 Did I demonstrate the answering of questions last 733 00:42:10,960 --> 00:42:13,340 time, or just the development of the graph?
734 00:42:13,340 --> 00:42:14,000 I can't remember. 735 00:42:14,000 --> 00:42:16,210 But we'll do it again, anyway. 736 00:42:16,210 --> 00:42:19,895 This is somewhat stylized English. 737 00:42:19,895 --> 00:42:22,570 Just so you'll know, it doesn't have 738 00:42:22,570 --> 00:42:23,660 to be stylized English. 739 00:42:23,660 --> 00:42:27,330 This is English that's made available to the Genesis 740 00:42:27,330 --> 00:42:31,950 system by way of something called Story Workbench. 741 00:42:31,950 --> 00:42:32,800 There's no free lunch. 742 00:42:32,800 --> 00:42:35,740 Either you can use your human resources to rewrite the plot 743 00:42:35,740 --> 00:42:37,540 in third grade English. 744 00:42:37,540 --> 00:42:39,940 Or you can use your human resources to take a more 745 00:42:39,940 --> 00:42:43,240 natural, adult-type version of the story and decorate it with 746 00:42:43,240 --> 00:42:46,510 annotations that make it possible to absorb it. 747 00:42:46,510 --> 00:42:49,970 Just this summer, in a miracle of summer [? Europe, ?] 748 00:42:49,970 --> 00:42:50,313 [? Brit ?] 749 00:42:50,313 --> 00:42:50,656 [? van ?] 750 00:42:50,656 --> 00:42:51,280 [? Zijp-- ?] 751 00:42:51,280 --> 00:42:52,790 one of you-- 752 00:42:52,790 --> 00:42:54,630 connected these two systems together. 753 00:42:54,630 --> 00:42:57,030 So, we can now work with stories that are expressed in 754 00:42:57,030 --> 00:42:59,140 pretty natural English. 755 00:42:59,140 --> 00:43:02,890 Everything in our system is expressed in English, 756 00:43:02,890 --> 00:43:04,230 including common sense knowledge-- 757 00:43:04,230 --> 00:43:06,810 like if somebody kills you, you're dead-- 758 00:43:06,810 --> 00:43:10,880 but more importantly, for today's illustration, that 759 00:43:10,880 --> 00:43:14,920 reflective level knowledge, that knowledge about what 760 00:43:14,920 --> 00:43:17,180 revenge is. 761 00:43:17,180 --> 00:43:17,790 Here you are. 
762 00:43:17,790 --> 00:43:20,280 You're in the humanities class, and someone says, 763 00:43:20,280 --> 00:43:21,800 what's really going on in the story? 764 00:43:21,800 --> 00:43:24,420 Not the details of who kills whom, but is 765 00:43:24,420 --> 00:43:26,560 there a Pyrrhic victory? 766 00:43:26,560 --> 00:43:28,140 Does somebody have a success? 767 00:43:28,140 --> 00:43:30,120 Is there an act of revenge? 768 00:43:30,120 --> 00:43:33,380 These are all kinds of things you might be asked about in 769 00:43:33,380 --> 00:43:34,800 some kind of humanities class. 770 00:43:40,470 --> 00:43:42,810 So, let me fire up the Genesis system. 771 00:43:52,150 --> 00:43:53,400 Pray for internet connectivity. 772 00:43:58,730 --> 00:44:02,850 Launch the system on a read of that Macbeth story that I 773 00:44:02,850 --> 00:44:05,110 showed you just a moment ago. 774 00:44:05,110 --> 00:44:08,220 At the moment, it's absorbing information about background 775 00:44:08,220 --> 00:44:09,960 knowledge, and about reflective level knowledge, 776 00:44:09,960 --> 00:44:11,210 and all that sort of thing. 777 00:44:15,780 --> 00:44:18,010 It's building itself this thing we call 778 00:44:18,010 --> 00:44:20,100 an elaboration graph. 779 00:44:20,100 --> 00:44:21,320 It's not quite there yet. 780 00:44:21,320 --> 00:44:22,570 It's still reading background knowledge. 781 00:44:30,750 --> 00:44:32,140 Now it's reading Macbeth. 782 00:44:32,140 --> 00:44:36,100 It's building its elaboration graph, the same thing you saw 783 00:44:36,100 --> 00:44:39,060 last time, except not quite. 784 00:44:39,060 --> 00:44:41,570 Do you see that stuff down at the bottom? 785 00:44:41,570 --> 00:44:44,500 Those are higher level concepts that it's managed to 786 00:44:44,500 --> 00:44:46,930 find in the Macbeth story. 787 00:44:46,930 --> 00:44:48,180 So, it's found a revenge. 788 00:44:51,010 --> 00:44:53,510 How did it do that? 789 00:44:53,510 --> 00:44:55,560 It searched.
790 00:44:55,560 --> 00:44:58,810 It had a description of what a revenge is, and it looked to 791 00:44:58,810 --> 00:45:01,270 see if that pattern was exhibited in 792 00:45:01,270 --> 00:45:03,420 the elaboration graph. 793 00:45:03,420 --> 00:45:05,940 So, in a combination of things that were said explicitly and 794 00:45:05,940 --> 00:45:08,650 things that were produced by knee-jerk if/then rules, the 795 00:45:08,650 --> 00:45:11,280 elaboration graph was sufficiently instantiated that 796 00:45:11,280 --> 00:45:15,580 the revenge pattern could be found. 797 00:45:15,580 --> 00:45:19,030 That's interesting, Pyrrhic victory is a little harder. 798 00:45:19,030 --> 00:45:21,070 You'd probably get an A if you said, oh, there's a Pyrrhic 799 00:45:21,070 --> 00:45:24,220 victory in here. 800 00:45:24,220 --> 00:45:26,320 There it is. 801 00:45:26,320 --> 00:45:27,750 So, I'll blow that up a little bit so you can 802 00:45:27,750 --> 00:45:29,870 see what that is. 803 00:45:29,870 --> 00:45:32,220 You know what a Pyrrhic victory is. 804 00:45:32,220 --> 00:45:36,190 It's a situation where everything seems to be going 805 00:45:36,190 --> 00:45:41,910 good at first, and then not so hot. 806 00:45:41,910 --> 00:45:46,580 So, Macbeth wants to be King down here. 807 00:45:46,580 --> 00:45:48,975 And eventually that leads to becoming King. 808 00:45:48,975 --> 00:45:51,610 But too bad for Macbeth, because eventually he gets 809 00:45:51,610 --> 00:45:52,540 killed in consequence. 810 00:45:52,540 --> 00:45:54,650 So, it's a Pyrrhic victory. 811 00:45:54,650 --> 00:45:56,580 All that produced by Search programs that are looking 812 00:45:56,580 --> 00:45:58,670 through this graph. 813 00:45:58,670 --> 00:46:01,030 Now once you've got the capability of doing that, of 814 00:46:01,030 --> 00:46:04,470 course, then you can find all sorts of things. 815 00:46:04,470 --> 00:46:05,820 And you can report them in English.
816 00:46:05,820 --> 00:46:09,510 But, more interestingly, you can answer questions. 817 00:46:09,510 --> 00:46:12,060 Why did Macbeth-- 818 00:46:12,060 --> 00:46:13,520 it cares not a hoot about capitalization. 819 00:46:19,920 --> 00:46:22,160 ARTIFICIAL INTELLIGENCE: On a common sense level, it looks 820 00:46:22,160 --> 00:46:24,690 like Dr. Jekyll thinks Macduff killed Macbeth because Macbeth 821 00:46:24,690 --> 00:46:26,520 angered Macduff. On a reflective level, 822 00:46:26,520 --> 00:46:29,200 it looks like Dr. Jekyll thinks Macduff killed Macbeth 823 00:46:29,200 --> 00:46:32,720 as part of acts of mistake, Pyrrhic victory, and revenge. 824 00:46:32,720 --> 00:46:35,410 PATRICK WINSTON: Pretty corny speech output. 825 00:46:35,410 --> 00:46:36,620 But you see the point. 826 00:46:36,620 --> 00:46:40,726 How did it get the stuff on the common sense level? 827 00:46:40,726 --> 00:46:43,420 The same way all those programs that build goal trees 828 00:46:43,420 --> 00:46:46,600 report answers to questions. 829 00:46:46,600 --> 00:46:48,687 It's just looking locally around in the connections in 830 00:46:48,687 --> 00:46:50,410 the goal tree. 831 00:46:50,410 --> 00:46:52,880 How did it get the stuff on the reflective level? 832 00:46:52,880 --> 00:46:59,240 By reporting on the searches that produced information-- 833 00:46:59,240 --> 00:47:03,670 it does that by looking for higher level thoughts about 834 00:47:03,670 --> 00:47:07,540 its own thoughts and reporting in which of those higher level 835 00:47:07,540 --> 00:47:12,995 thoughts the incident we asked about actually occurs. 836 00:47:12,995 --> 00:47:17,020 So, let's see, just for fun, we might be interested in why 837 00:47:17,020 --> 00:47:20,660 Macbeth murdered Duncan. 838 00:47:20,660 --> 00:47:22,600 Wouldn't this be handy if you hadn't actually read the play, 839 00:47:22,600 --> 00:47:25,276 and here it is, you've got to write that paper?
840 00:47:25,276 --> 00:47:26,400 ARTIFICIAL INTELLIGENCE: On a common sense 841 00:47:26,400 --> 00:47:27,790 level, it looks like-- 842 00:47:27,790 --> 00:47:29,150 PATRICK WINSTON: I'll pull the plug on that, because that's 843 00:47:29,150 --> 00:47:30,400 just annoying. 844 00:47:32,768 --> 00:47:36,340 Yeah, pretty good, Macbeth wants to be King, and Duncan 845 00:47:36,340 --> 00:47:37,620 is the King. 846 00:47:37,620 --> 00:47:39,740 Let's see, why did Macbeth become King? 847 00:47:51,820 --> 00:47:53,450 Oh, it won't answer the question 848 00:47:53,450 --> 00:47:54,700 unless I spell it right. 849 00:48:01,670 --> 00:48:03,990 I wouldn't have been able to show that to you until last spring. 850 00:48:03,990 --> 00:48:07,480 In fact, I wouldn't have been able to show you this today 851 00:48:07,480 --> 00:48:10,760 until last week with a tweak this morning. 852 00:48:10,760 --> 00:48:15,090 Because we've just now connected the language output 853 00:48:15,090 --> 00:48:17,910 to, of course, [? Cass's ?] 854 00:48:17,910 --> 00:48:19,910 parser system, which is running in reverse, in order 855 00:48:19,910 --> 00:48:21,040 to generate that English. 856 00:48:21,040 --> 00:48:23,660 So, that's something that has never before been seen by any 857 00:48:23,660 --> 00:48:25,830 eyes but me. 858 00:48:25,830 --> 00:48:27,650 So, that will conclude what we have to do today.