1 00:00:08,982 --> 00:00:12,180 PROFESSOR: It was written about Route 66, which used to 2 00:00:12,180 --> 00:00:15,640 be the main highway between Chicago, Illinois and Los 3 00:00:15,640 --> 00:00:17,610 Angeles, California. 4 00:00:17,610 --> 00:00:19,860 Very famous highway because anybody who wanted to go 5 00:00:19,860 --> 00:00:23,550 across country always took Route 66 because it was the 6 00:00:23,550 --> 00:00:25,950 shortest way to go. 7 00:00:25,950 --> 00:00:29,620 And the question is, how do you find the shortest path? 8 00:00:29,620 --> 00:00:34,760 Not just any old path or a good path, but how do you find 9 00:00:34,760 --> 00:00:36,230 the very shortest path? 10 00:00:36,230 --> 00:00:37,764 And that'll be the subject that we're 11 00:00:37,764 --> 00:00:38,660 going to discuss today. 12 00:00:38,660 --> 00:00:43,560 But Route 66, I lament its passing, but it's been largely 13 00:00:43,560 --> 00:00:46,570 replaced by the interstate highway system that was 14 00:00:46,570 --> 00:00:48,350 created by President Eisenhower. 15 00:00:48,350 --> 00:00:49,600 Guess why? 16 00:00:52,560 --> 00:00:55,860 Let's see, maybe the ROTC people know. 17 00:00:55,860 --> 00:00:58,840 You know why Eisenhower created the 18 00:00:58,840 --> 00:01:01,800 interstate highway system? 19 00:01:01,800 --> 00:01:06,670 Well, in public affairs, of course, there's always a 20 00:01:06,670 --> 00:01:08,420 distinction to be made between the 21 00:01:08,420 --> 00:01:10,990 explanation and the reason. 22 00:01:10,990 --> 00:01:13,276 The explanation was-- 23 00:01:13,276 --> 00:01:16,463 AUDIENCE: To move weapons across the country? 24 00:01:16,463 --> 00:01:17,560 PROFESSOR: Well, to move nuclear 25 00:01:17,560 --> 00:01:18,700 weapons across the country. 26 00:01:18,700 --> 00:01:20,700 Let's put it in slightly more benign terms.
27 00:01:20,700 --> 00:01:24,240 Eisenhower had observed that the German army was able to 28 00:01:24,240 --> 00:01:28,950 move its troops rapidly, even though we bombed their 29 00:01:28,950 --> 00:01:32,910 railroads into oblivion, because of their Autobahn. 30 00:01:32,910 --> 00:01:36,473 So Eisenhower conceived that if there were ever an invasion 31 00:01:36,473 --> 00:01:39,600 of the United States, we too would want to be able to move 32 00:01:39,600 --> 00:01:42,300 our forces around on a highway system. 33 00:01:42,300 --> 00:01:46,340 And in consequence of that, we have a pretty good highway 34 00:01:46,340 --> 00:01:48,009 system and a pretty awful railroad system. 35 00:01:48,009 --> 00:01:50,060 It's interesting. 36 00:01:50,060 --> 00:01:53,410 I'm a beneficiary of that, in a funny way, because I'm from 37 00:01:53,410 --> 00:01:55,050 East Peoria, Illinois. 38 00:01:55,050 --> 00:01:57,600 And I was surrounded by the factories of Caterpillar Tractor 39 00:01:57,600 --> 00:02:00,725 Company, which made all of the tractors that built all of 40 00:02:00,725 --> 00:02:01,990 those roads. 41 00:02:01,990 --> 00:02:06,180 So my high school spent money like water from that huge tax 42 00:02:06,180 --> 00:02:09,270 base of all those factories. 43 00:02:09,270 --> 00:02:14,990 Anyway, today we want to find the very best path, instead of 44 00:02:14,990 --> 00:02:16,550 just a good path. 45 00:02:16,550 --> 00:02:26,050 And like the last time, we'll deal with, both, an example 46 00:02:26,050 --> 00:02:29,640 that we can set our program to work on. 47 00:02:29,640 --> 00:02:34,180 By the way, can you find the shortest path between S and G? 48 00:02:34,180 --> 00:02:36,885 Would you like to bet your life on the shortest path 49 00:02:36,885 --> 00:02:39,000 between S and G? 50 00:02:39,000 --> 00:02:39,920 Probably, not. 51 00:02:39,920 --> 00:02:42,600 With your eye, you can find a good path. 52 00:02:42,600 --> 00:02:46,020 But you can't find the best possible path.
53 00:02:46,020 --> 00:02:48,500 Today, what we're doing is probably not modeling any 54 00:02:48,500 --> 00:02:51,550 obvious property of what we have inside our heads. 55 00:02:51,550 --> 00:02:55,610 But being able to find the best path is part of the skill 56 00:02:55,610 --> 00:02:58,260 set that anybody who's had a course in artificial intelligence 57 00:02:58,260 --> 00:03:00,270 would be expected to have. 58 00:03:00,270 --> 00:03:02,010 So we're going to look at it, even though it's not like many 59 00:03:02,010 --> 00:03:03,170 of the things we do. 60 00:03:03,170 --> 00:03:04,660 A model of something that's, probably, 61 00:03:04,660 --> 00:03:07,520 going on in your head. 62 00:03:07,520 --> 00:03:10,710 So we're going to use, both, this example from Cambridge 63 00:03:10,710 --> 00:03:14,450 and our blackboard example. 64 00:03:14,450 --> 00:03:16,800 Let's see, we have to caution ourselves. 65 00:03:16,800 --> 00:03:20,320 Tanya, is search about maps? 66 00:03:20,320 --> 00:03:22,690 No, it's about what? 67 00:03:22,690 --> 00:03:27,829 Starts with a C. And the next letter is H. And it ends up 68 00:03:27,829 --> 00:03:28,750 being choice. 69 00:03:28,750 --> 00:03:30,460 So we're talking about choice. 70 00:03:30,460 --> 00:03:31,240 Not about maps. 71 00:03:31,240 --> 00:03:35,150 Even though our examples are drawn from maps because 72 00:03:35,150 --> 00:03:37,410 they're convenient, they're visual, and they help us understand 73 00:03:37,410 --> 00:03:42,190 the concepts behind the algorithms I'm talking about. 74 00:03:42,190 --> 00:03:46,390 So let's start off by looking at our classroom example. 75 00:03:46,390 --> 00:03:50,570 And I did something today that I neglected to do last time. 76 00:03:50,570 --> 00:03:52,030 And that's talk to you about what I meant 77 00:03:52,030 --> 00:03:53,810 by heuristic distance. 78 00:03:53,810 --> 00:03:57,350 It's those pink lines that I just drew on the map.
79 00:03:57,350 --> 00:04:00,130 We're talking about the distance as the crow would fly 80 00:04:00,130 --> 00:04:03,520 between two places, even though there's no road that 81 00:04:03,520 --> 00:04:06,200 goes between those two places. 82 00:04:06,200 --> 00:04:11,100 So in general, and we discussed last time, it's best 83 00:04:11,100 --> 00:04:14,950 to get yourself into a place that's close, as the crow 84 00:04:14,950 --> 00:04:17,170 flies, to your goal. 85 00:04:17,170 --> 00:04:19,740 And of course, that's a heuristic and it can get you 86 00:04:19,740 --> 00:04:23,020 in trouble because it's not always true. 87 00:04:23,020 --> 00:04:26,630 It would appear that being at node E is a good place to be 88 00:04:26,630 --> 00:04:31,040 because it's not very far from G. But in that particular case 89 00:04:31,040 --> 00:04:34,030 designed to illustrate the point, being close is, 90 00:04:34,030 --> 00:04:37,770 actually, not a good thing because it's a dead end. 91 00:04:37,770 --> 00:04:40,110 But in general, it's a good thing to be close. 92 00:04:40,110 --> 00:04:46,150 And we talked last time about hill climbing and beam search, 93 00:04:46,150 --> 00:04:48,580 being close was the objective of those kinds of searches. 94 00:04:48,580 --> 00:04:52,060 And at one point, in a beam search illustration, we had C, 95 00:04:52,060 --> 00:04:56,980 B, A, and D. We had paths terminating at all four of 96 00:04:56,980 --> 00:05:00,980 those nodes as candidates for the next round of search. 97 00:05:00,980 --> 00:05:04,660 And we decided on the basis of these airline distances to 98 00:05:04,660 --> 00:05:09,720 keep D and B, and reject A and C because they're further away 99 00:05:09,720 --> 00:05:12,060 as the crow flies. 
100 00:05:12,060 --> 00:05:14,340 Now, I repeat this even though many of you have had this 101 00:05:14,340 --> 00:05:16,790 fixed already in your tutorials because we're going 102 00:05:16,790 --> 00:05:22,450 to need this concept of heuristic distance today. 103 00:05:22,450 --> 00:05:28,250 And I wanted to be sure that that point has been clarified. 104 00:05:28,250 --> 00:05:33,030 So now, with this smaller map I imagine you can do, by eye, 105 00:05:33,030 --> 00:05:37,580 a determination of what the shortest path is. 106 00:05:37,580 --> 00:05:38,230 What is it, Juana? 107 00:05:38,230 --> 00:05:41,584 Can you help me out with that? 108 00:05:41,584 --> 00:05:43,800 AUDIENCE: S, A, D, G. 109 00:05:43,800 --> 00:05:47,820 PROFESSOR: S, A, D, G. And if you add up those distances, 110 00:05:47,820 --> 00:05:54,430 the distance is 11 along that path that goes from S, first 111 00:05:54,430 --> 00:05:59,640 to A, and then to D, and then from D to G. So Juana asserts 112 00:05:59,640 --> 00:06:00,860 that that is the best path. 113 00:06:00,860 --> 00:06:04,540 And we're going to treat Juana as an Oracle because we're 114 00:06:04,540 --> 00:06:07,830 going to follow, in our initial attempt to understand 115 00:06:07,830 --> 00:06:10,590 these algorithms, a very important 116 00:06:10,590 --> 00:06:12,020 principle of problem solving. 117 00:06:12,020 --> 00:06:14,770 And that is that, if you want to solve a problem, the 118 00:06:14,770 --> 00:06:17,810 easiest way is, usually, ask somebody who knows the answer. 119 00:06:17,810 --> 00:06:22,290 Or Google, which also, probably, knows the answer. 120 00:06:22,290 --> 00:06:25,290 So in this particular case, we believe that 121 00:06:25,290 --> 00:06:26,270 Juana knows the answer. 122 00:06:26,270 --> 00:06:30,170 And she said that the shortest path is S,A, D, G, and its 123 00:06:30,170 --> 00:06:32,310 path length is 11. 
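The arithmetic behind "S, A, D, G has length 11" can be sketched in a few lines of Python. This is only an illustration: the blackboard map is not part of the transcript, so the individual road lengths in `ROADS` are an assumption, inferred from the path totals quoted during the lecture (S-A is 3, S-A-D is 6, S-A-D-G is 11, and so on). The names `ROADS`, `road_length` and `path_length` are hypothetical.

```python
# Road lengths inferred from the path totals quoted in the lecture
# (an assumption; the blackboard map itself is not in the transcript).
ROADS = {
    ("S", "A"): 3, ("S", "B"): 5, ("A", "B"): 4, ("A", "D"): 3,
    ("B", "C"): 4, ("C", "E"): 6, ("D", "G"): 5,
}


def road_length(a, b):
    """Length of the two-way road between neighbouring nodes a and b."""
    return ROADS[(a, b)] if (a, b) in ROADS else ROADS[(b, a)]


def path_length(path):
    """Accumulated length of a path given as a sequence of node names."""
    return sum(road_length(a, b) for a, b in zip(path, path[1:]))


print(path_length("SADG"))  # adds 3 + 3 + 5
```

The same helper reproduces the other totals used in the lecture, for example `path_length("SBA")` for the path S, B, A.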
124 00:06:32,310 --> 00:06:36,680 But we don't trust her because we're applying to the same 125 00:06:36,680 --> 00:06:39,673 medical school and she may be trying to screw us. 126 00:06:39,673 --> 00:06:42,140 [LAUGHTER] 127 00:06:42,140 --> 00:06:44,380 PROFESSOR: So we're going to be very cautious about 128 00:06:44,380 --> 00:06:48,320 accepting her answer until we've checked it to make sure 129 00:06:48,320 --> 00:06:53,750 that she hasn't attempted to delude us. 130 00:06:53,750 --> 00:06:55,970 So how do we go about doing that? 131 00:06:55,970 --> 00:06:59,490 Well, one way to do that is to check to be sure that all 132 00:06:59,490 --> 00:07:03,380 other possible paths that we could develop end up being, 133 00:07:03,380 --> 00:07:07,700 for sure, longer than the one that Juana has told us about. 134 00:07:07,700 --> 00:07:16,910 So she's told us about S, A, D, G. And it has a total path 135 00:07:16,910 --> 00:07:18,610 length of 11. 136 00:07:18,610 --> 00:07:20,500 And now what I'm going to do is I'm just going to develop 137 00:07:20,500 --> 00:07:25,995 the rest of this tree-like diagram. 138 00:07:25,995 --> 00:07:28,380 But what I'm going to do is, I'm not going to do it in a 139 00:07:28,380 --> 00:07:30,270 British Museum or random way. 140 00:07:30,270 --> 00:07:34,830 What I'm going to do is, I'm going to look at the choice 141 00:07:34,830 --> 00:07:38,960 that corresponds to the shortest path that can be extended. 142 00:07:38,960 --> 00:07:42,159 So the shortest path that can be extended is 143 00:07:42,159 --> 00:07:44,620 this one right here. 144 00:07:44,620 --> 00:07:47,480 The one that just has the starting node in it. 145 00:07:47,480 --> 00:07:50,880 And I could have gone this other way to B. And if I go 146 00:07:50,880 --> 00:07:53,870 that other way to B, then the path length along 147 00:07:53,870 --> 00:07:57,240 that side is 5.
148 00:07:57,240 --> 00:08:02,760 And likewise, if I look at the path that terminates in A, 149 00:08:02,760 --> 00:08:06,950 that has a path length of 3. 150 00:08:06,950 --> 00:08:08,300 So now I've got two choices. 151 00:08:08,300 --> 00:08:11,710 A and B. I've got choices that extend 152 00:08:11,710 --> 00:08:14,170 beyond those two places. 153 00:08:14,170 --> 00:08:15,640 So I'm always going to extend the one that 154 00:08:15,640 --> 00:08:18,250 has the shorter length. 155 00:08:18,250 --> 00:08:20,450 So in this case, that would be the path that goes from S to 156 00:08:20,450 --> 00:08:22,380 A. 157 00:08:22,380 --> 00:08:25,480 So if I go from S to A, I don't have to go to D. I can also go 158 00:08:25,480 --> 00:08:33,429 to B. And if I go to B, then the accumulated path length is 159 00:08:33,429 --> 00:08:35,120 S, A, B. That's 7. 160 00:08:37,760 --> 00:08:43,120 And note that we're talking now about the path length, the 161 00:08:43,120 --> 00:08:46,430 accumulated path length, that we've traveled so far. 162 00:08:46,430 --> 00:08:48,120 Last time we were talking a lot about 163 00:08:48,120 --> 00:08:50,220 distances to the goal. 164 00:08:50,220 --> 00:08:53,340 Heuristic estimates of how far we are from the goal. 165 00:08:53,340 --> 00:08:55,500 Now we're doing exactly the opposite. 166 00:08:55,500 --> 00:08:58,570 We're not considering how far we've got to go. 167 00:08:58,570 --> 00:09:02,450 We're only thinking about how far we've gone so far. 168 00:09:02,450 --> 00:09:05,830 So now, repeating these steps again. 169 00:09:05,830 --> 00:09:07,660 I've got 7 and 5. 170 00:09:07,660 --> 00:09:10,790 So I'll go over and consider the choices that go through 171 00:09:10,790 --> 00:09:16,360 the B node on the path S, B. And that gives me S, B, A and 172 00:09:16,360 --> 00:09:21,440 S, B, C. And what are those path lengths? 173 00:09:21,440 --> 00:09:22,480 Well, let's see. 174 00:09:22,480 --> 00:09:27,200 S, B, A would be 9.
175 00:09:27,200 --> 00:09:29,620 And S, B, C would be 9. 176 00:09:29,620 --> 00:09:32,740 And now the shortest path is this one over here. 177 00:09:32,740 --> 00:09:34,430 So I extend that. 178 00:09:34,430 --> 00:09:40,280 I go S, A, B. S, A, B. The only place I can go is C. That 179 00:09:40,280 --> 00:09:42,150 adds another 4. 180 00:09:42,150 --> 00:09:43,280 So that's 11. 181 00:09:43,280 --> 00:09:46,690 And what do I know about that path? 182 00:09:46,690 --> 00:09:49,770 I don't have to take that any further, right? 183 00:09:49,770 --> 00:09:53,350 Because the path length, since I've gone on that path 184 00:09:53,350 --> 00:09:56,690 already, is equal to the path length that Juana has told me 185 00:09:56,690 --> 00:09:59,240 gets me to the goal. 186 00:09:59,240 --> 00:10:02,110 So it'll be foolhardy to carry on because, presuming that 187 00:10:02,110 --> 00:10:04,810 these lengths are all non-negative, I 188 00:10:04,810 --> 00:10:07,270 can't do any better. 189 00:10:07,270 --> 00:10:10,270 And I can't even do as well, unless I've got a link that 190 00:10:10,270 --> 00:10:13,440 has 0 length. 191 00:10:13,440 --> 00:10:15,820 So now that I have that idea, I can quickly finish up by 192 00:10:15,820 --> 00:10:17,570 saying, well, let me consider these two paths. 193 00:10:17,570 --> 00:10:23,982 S, B, A can only go to D. And if I go to D, that adds 3. 194 00:10:23,982 --> 00:10:27,550 9 plus 3 is 12. 195 00:10:27,550 --> 00:10:30,750 Nothing else can happen there because that's 12 and I've got 196 00:10:30,750 --> 00:10:33,470 a path to the goal that's 11. 197 00:10:33,470 --> 00:10:37,500 C, I can only go to E. It's a dead-end but I don't have to 198 00:10:37,500 --> 00:10:39,640 think about that because I know that the accumulated 199 00:10:39,640 --> 00:10:43,730 distance along this path is 6 plus 9. 200 00:10:43,730 --> 00:10:45,640 That's 15.
201 00:10:45,640 --> 00:10:50,740 So all of these need not be extended any further because 202 00:10:50,740 --> 00:10:54,090 their length, accumulated so far, is equal to or greater than 203 00:10:54,090 --> 00:10:55,500 the length of the path to the goal. 204 00:10:55,500 --> 00:10:57,380 So I've checked the Oracle. 205 00:10:57,380 --> 00:11:01,120 And although we're applying to the same medical school, Juana 206 00:11:01,120 --> 00:11:04,400 has told me the truth. 207 00:11:04,400 --> 00:11:07,835 So now, unfortunately, Juana's not always around. 208 00:11:07,835 --> 00:11:10,690 And I don't always have an Oracle. 209 00:11:10,690 --> 00:11:15,030 So I'm going to have to have some way of finding the 210 00:11:15,030 --> 00:11:17,440 shortest path without that Oracle 211 00:11:17,440 --> 00:11:19,210 that I can check against. 212 00:11:22,740 --> 00:11:23,190 Let's see. 213 00:11:23,190 --> 00:11:24,440 What can I do? 214 00:11:26,450 --> 00:11:29,430 Maybe I can do the same thing I just did. 215 00:11:29,430 --> 00:11:32,150 Always extend the shortest path so far and hope that I 216 00:11:32,150 --> 00:11:34,630 run into the goal at some point. 217 00:11:34,630 --> 00:11:36,800 And then I have to ask myself the question how much extra 218 00:11:36,800 --> 00:11:40,210 work did I need to do when I don't have the Oracle? 219 00:11:40,210 --> 00:11:43,760 Let's just try it and see what happens. 220 00:11:43,760 --> 00:11:45,400 You don't have that path to start. 221 00:11:45,400 --> 00:11:49,400 So I just have S. This distance is 0. 222 00:11:49,400 --> 00:11:54,590 I can go either to A or B. If I go to A, I've got 223 00:11:54,590 --> 00:11:56,080 a distance of 3. 224 00:11:56,080 --> 00:11:59,410 Here, I've got a distance of 5. 225 00:11:59,410 --> 00:12:04,550 I'll extend the path that goes S, A. That can either go to B 226 00:12:04,550 --> 00:12:10,360 or D. Going to B gives me 7 that way. 227 00:12:10,360 --> 00:12:13,200 S, A, D gives me 6.
228 00:12:13,200 --> 00:12:15,690 So looking across all of these and extending the shortest 229 00:12:15,690 --> 00:12:24,980 path so far takes me back over to S, B. So I extend those. 230 00:12:24,980 --> 00:12:30,510 S, B takes me to A or C. 231 00:12:30,510 --> 00:12:33,550 And those, in turn, have total accumulated path 232 00:12:33,550 --> 00:12:35,880 lengths of 9 and 9. 233 00:12:35,880 --> 00:12:39,390 Now the shortest one is S, A, D. You see the pattern. 234 00:12:39,390 --> 00:12:40,890 Now let's see. 235 00:12:40,890 --> 00:12:45,930 I haven't found the goal yet. 236 00:12:45,930 --> 00:12:48,720 So I can ask myself the question is any of the work 237 00:12:48,720 --> 00:12:52,190 that I've done so far wasted? 238 00:12:52,190 --> 00:12:54,710 No because all of the paths that I've got so far are 239 00:12:54,710 --> 00:12:57,750 shorter than the path of the goal because the goal 240 00:12:57,750 --> 00:12:59,700 hasn't shown up. 241 00:12:59,700 --> 00:13:02,230 So when I do my oracle checking after I found the 242 00:13:02,230 --> 00:13:06,220 goal, none of that work's going to be wasted. 243 00:13:06,220 --> 00:13:09,120 So in the end, I don't, actually, need the Oracle. 244 00:13:09,120 --> 00:13:12,760 I could just develop this graph by extending the 245 00:13:12,760 --> 00:13:16,390 shortest path, so far, until I hit the goal. 246 00:13:16,390 --> 00:13:19,030 And then, perhaps, do a little remaining checking to make 247 00:13:19,030 --> 00:13:22,610 sure that all the other paths extend with a length that's 248 00:13:22,610 --> 00:13:24,970 greater than the path of the goal. 249 00:13:24,970 --> 00:13:26,855 So if those words are confusing, let's carry on with 250 00:13:26,855 --> 00:13:30,000 the algorithm, and I think it'll be clearer. 251 00:13:30,000 --> 00:13:30,390 So let's see. 252 00:13:30,390 --> 00:13:33,490 We've got the 7, 6, and two 9s. 253 00:13:33,490 --> 00:13:35,530 We're going to extend the one that's 6. 
254 00:13:35,530 --> 00:13:36,900 That gets us to the goal. 255 00:13:36,900 --> 00:13:38,690 Boom, we've got it. 256 00:13:38,690 --> 00:13:41,230 And we've got a path length of 11. 257 00:13:41,230 --> 00:13:46,040 Note, though, we can't quit because we have to be sure 258 00:13:46,040 --> 00:13:49,300 that all other paths are longer than 11. 259 00:13:49,300 --> 00:13:51,530 So now we have to carry on with the same algorithm that 260 00:13:51,530 --> 00:13:52,580 we started with. 261 00:13:52,580 --> 00:13:55,630 The Oracle checking algorithm. 262 00:13:55,630 --> 00:13:59,460 And when we do that, we look for the shortest path, so 263 00:13:59,460 --> 00:14:01,650 far, that has not been extended. 264 00:14:01,650 --> 00:14:08,360 That's S, A, B. That goes to C. That's 11. 265 00:14:08,360 --> 00:14:10,010 So we're done there. 266 00:14:10,010 --> 00:14:13,600 A goes to D. That adds 3. 267 00:14:13,600 --> 00:14:15,630 That's 12. 268 00:14:15,630 --> 00:14:18,390 C goes to E. That adds 6. 269 00:14:18,390 --> 00:14:20,050 That's 15. 270 00:14:20,050 --> 00:14:22,020 And sure enough, we're done. 271 00:14:22,020 --> 00:14:23,270 OK? 272 00:14:26,296 --> 00:14:27,742 Elliot? 273 00:14:27,742 --> 00:14:29,810 AUDIENCE: Does it know that there isn't 274 00:14:29,810 --> 00:14:33,110 a chance that you could have a zero distance extension from 275 00:14:33,110 --> 00:14:35,460 the [INAUDIBLE]? 276 00:14:35,460 --> 00:14:39,302 PROFESSOR: The question is, does it know that there's no 277 00:14:39,302 --> 00:14:44,510 zero-distance link that's coming up. 278 00:14:44,510 --> 00:14:46,400 That's an implementation detail. 279 00:14:46,400 --> 00:14:51,010 This guarantees you'll find a path that's as short as any 280 00:14:51,010 --> 00:14:52,320 that you can possibly find. 281 00:14:52,320 --> 00:14:55,580 But there might be others if there are zero-length links.
282 00:14:55,580 --> 00:14:56,580 As long as they're non-negative 283 00:14:56,580 --> 00:14:58,020 lengths, we're safe. 284 00:14:58,020 --> 00:15:00,740 We've got a shortest path. 285 00:15:00,740 --> 00:15:01,670 So that was easy. 286 00:15:01,670 --> 00:15:04,560 And now we can repeat the exercise with our more 287 00:15:04,560 --> 00:15:06,490 complicated map of Cambridge. 288 00:15:10,260 --> 00:15:13,000 First of all, let's do depth first just to recall what that 289 00:15:13,000 --> 00:15:14,610 looks like. 290 00:15:14,610 --> 00:15:18,210 That is, certainly, not a short path. 291 00:15:18,210 --> 00:15:22,270 So let's try this idea, which, by the way, bears the label 292 00:15:22,270 --> 00:15:24,110 branch and bound. 293 00:15:24,110 --> 00:15:26,740 Let's try branch and bound on the same map. 294 00:15:30,226 --> 00:15:32,040 And there it goes. 295 00:15:32,040 --> 00:15:35,720 Each of those little flickers is trying another path. 296 00:15:35,720 --> 00:15:40,590 So you can see it's working its guts out to find the 297 00:15:40,590 --> 00:15:41,840 shortest path. 298 00:15:52,250 --> 00:15:57,870 It's almost there but it's almost a pathological case. 299 00:15:57,870 --> 00:16:01,410 Or it's almost doing British Museum. 300 00:16:01,410 --> 00:16:04,620 There, it's finally found the shortest path. 301 00:16:04,620 --> 00:16:06,190 Now there are some things we can ask about that. 302 00:16:06,190 --> 00:16:08,910 But first of all, before I ask anything about it, I'd like to 303 00:16:08,910 --> 00:16:11,250 get the flow chart up on the board because we're going to 304 00:16:11,250 --> 00:16:15,100 decorate that flow chart, a little bit, as we go. 305 00:16:15,100 --> 00:16:17,140 So the first thing we do is initialize queue. 306 00:16:24,100 --> 00:16:31,310 Then we're going to test first path on the queue. 307 00:16:34,380 --> 00:16:38,650 Then we might be happy because we might be done.
308 00:16:38,650 --> 00:16:41,030 We might have a shortest path to the goal. 309 00:16:41,030 --> 00:16:43,650 Actually, that's not quite true, is it? 310 00:16:43,650 --> 00:16:48,070 We can't really quit until every other path is it. 311 00:16:48,070 --> 00:16:49,320 Well, that's interesting. 312 00:16:55,760 --> 00:16:58,370 If the first element on the queue gets us all the way to 313 00:16:58,370 --> 00:17:02,500 the goal, and we sorted our queue by path length, are we 314 00:17:02,500 --> 00:17:06,680 through as soon as that first element on the queue gets us 315 00:17:06,680 --> 00:17:09,111 to the goal? 316 00:17:09,111 --> 00:17:11,640 Yeah, because every other path must have been 317 00:17:11,640 --> 00:17:14,230 sorted beyond it. 318 00:17:14,230 --> 00:17:18,839 And therefore, it can't offer us a shorter path to the goal. 319 00:17:18,839 --> 00:17:22,550 So if the first path is a path to the goal, we're done. 320 00:17:22,550 --> 00:17:24,190 Alas, it usually isn't. 321 00:17:24,190 --> 00:17:26,490 So we'll extend first path. 322 00:17:36,549 --> 00:17:39,790 We're going to put all those extensions back on the queue, 323 00:17:39,790 --> 00:17:41,290 and then we're going to sort them. 324 00:17:46,030 --> 00:17:50,340 So that's, pretty much, the same as we did last time. 325 00:17:50,340 --> 00:17:55,610 We're always going to put the elements back on the queue. 326 00:17:55,610 --> 00:17:57,150 We're going to look at the first element on the queue and 327 00:17:57,150 --> 00:17:58,360 see if it's a winner. 328 00:17:58,360 --> 00:17:59,430 If it is, we're done. 329 00:17:59,430 --> 00:18:02,520 If it's not, we're going to extend it. 330 00:18:02,520 --> 00:18:06,787 And then go back in here and try again. 331 00:18:09,710 --> 00:18:10,900 Well, sort of.
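The flow chart just described (initialize the queue, test the first path, extend it, put the extensions back, sort) can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the program used in the lecture demonstration: the `GRAPH` road lengths are inferred from the path totals quoted for the classroom example, a binary heap (`heapq`) stands in for "sort the queue by accumulated length", and the function name `branch_and_bound` is hypothetical.

```python
import heapq

# Classroom map; road lengths inferred from the totals quoted in the lecture
# (an assumption, since the blackboard drawing is not in the transcript).
GRAPH = {
    "S": {"A": 3, "B": 5},
    "A": {"S": 3, "B": 4, "D": 3},
    "B": {"S": 5, "A": 4, "C": 4},
    "C": {"B": 4, "E": 6},
    "D": {"A": 3, "G": 5},
    "E": {"C": 6},
    "G": {"D": 5},
}


def branch_and_bound(graph, start, goal):
    """Return (length, path, extensions) for a shortest path, or None."""
    queue = [(0, [start])]  # initialize queue with the zero-length path
    extensions = 0
    while queue:
        # The heap keeps the shortest accumulated path at the front.
        length, path = heapq.heappop(queue)
        if path[-1] == goal:
            # Every other queued path is at least this long, so we are done
            # (this relies on all road lengths being non-negative).
            return length, path, extensions
        extensions += 1
        for neighbour, distance in graph[path[-1]].items():
            if neighbour not in path:  # never loop back onto the same path
                heapq.heappush(queue, (length + distance, path + [neighbour]))
    return None


print(branch_and_bound(GRAPH, "S", "G")[:2])
```

Using a heap instead of re-sorting the whole queue after every extension is a design choice of this sketch; the order in which paths reach the front is the same either way, except possibly among paths of equal length.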
332 00:18:10,900 --> 00:18:15,600 But we noted that this did an awful lot of work because if 333 00:18:15,600 --> 00:18:20,970 we look at those statistics up there, it put 1,354 334 00:18:20,970 --> 00:18:22,210 paths onto the queue. 335 00:18:22,210 --> 00:18:24,210 That's the enqueueing part. 336 00:18:24,210 --> 00:18:28,690 And then it extended 835 paths that had come to the 337 00:18:28,690 --> 00:18:29,940 front of the queue. 338 00:18:32,496 --> 00:18:37,590 Now I'd like to give you an aside because it's easy to get 339 00:18:37,590 --> 00:18:41,890 confused about enqueueing and extending. 340 00:18:41,890 --> 00:18:45,110 In all of the searches we did last time, it would have been 341 00:18:45,110 --> 00:18:48,660 perfectly reasonable to keep a list of all the paths that we 342 00:18:48,660 --> 00:18:51,444 had put onto the queue. 343 00:18:51,444 --> 00:18:53,850 An enqueued list. 344 00:18:53,850 --> 00:19:00,540 And never add a path to our queue if it terminates in a 345 00:19:00,540 --> 00:19:05,820 node that some other path terminates in that has already 346 00:19:05,820 --> 00:19:07,110 gone to the queue. 347 00:19:07,110 --> 00:19:11,820 What I said last time was let us keep track of the things 348 00:19:11,820 --> 00:19:15,900 that have been extended and not extend them again. 349 00:19:15,900 --> 00:19:19,830 So you can either keep track of the nodes that have been 350 00:19:19,830 --> 00:19:21,980 extended and not extend them again. 351 00:19:21,980 --> 00:19:26,290 Or look at the paths with nodes that terminate, and 352 00:19:26,290 --> 00:19:28,570 blah, blah, blah and been put on the queue, the queued ones. 353 00:19:28,570 --> 00:19:31,380 And not put things back on the queue again. 354 00:19:31,380 --> 00:19:33,535 And I think, last time, I may have put a column in there 355 00:19:33,535 --> 00:19:34,190 that said enqueued. 356 00:19:34,190 --> 00:19:39,090 It should have been extended.
357 00:19:39,090 --> 00:19:42,840 Even though enqueued worked last time, only extended works 358 00:19:42,840 --> 00:19:46,690 this time because we want to be sure that anything we 359 00:19:46,690 --> 00:19:49,070 extend is a short path. 360 00:19:49,070 --> 00:19:54,370 So the enqueued idea doesn't work, at all, for these 361 00:19:54,370 --> 00:19:55,910 optimal paths. 362 00:19:55,910 --> 00:19:58,840 So now I want to come back over here off the side bar and 363 00:19:58,840 --> 00:20:01,945 say that we're keeping track of all of the nodes, all of 364 00:20:01,945 --> 00:20:07,040 the paths that end in nodes unless they have already been 365 00:20:07,040 --> 00:20:10,640 extended beyond that particular place. 366 00:20:10,640 --> 00:20:16,590 So we need to decorate our algorithm here and say test 367 00:20:16,590 --> 00:20:24,470 first path and extend the first path 368 00:20:24,470 --> 00:20:31,180 if not already extended. 369 00:20:36,160 --> 00:20:38,740 Because you can see that in the example I had, so far, we 370 00:20:38,740 --> 00:20:43,130 did that same silliness that we talked about last time. 371 00:20:43,130 --> 00:20:49,030 We extended paths that went through A more 372 00:20:49,030 --> 00:20:52,310 than once, like so. 373 00:20:52,310 --> 00:20:55,470 Would it ever make sense to extend this path? 374 00:20:55,470 --> 00:20:58,860 No because we've already extended a path that got there 375 00:20:58,860 --> 00:21:00,780 with less distance. 376 00:21:00,780 --> 00:21:03,900 Will it ever make sense to extend this path? 377 00:21:03,900 --> 00:21:07,460 No because we've already extended another path that 378 00:21:07,460 --> 00:21:11,000 gets to B by a shorter distance. 379 00:21:11,000 --> 00:21:15,040 So if we keep an extended list, we can add that to 380 00:21:15,040 --> 00:21:18,130 branch and bound to our advantage. 381 00:21:18,130 --> 00:21:22,570 So let's see how that would work on the classroom example.
382 00:21:22,570 --> 00:21:23,920 And then we'll do Cambridge. 383 00:21:23,920 --> 00:21:27,200 So this is branch and bound, plus an extended list. 384 00:21:33,414 --> 00:21:37,260 And I do mean extended. 385 00:21:37,260 --> 00:21:38,350 Not an enqueued list. 386 00:21:38,350 --> 00:21:39,600 An enqueued list won't work here. 387 00:21:42,050 --> 00:21:45,910 So let's see, I start off the same way as I did before. 388 00:21:45,910 --> 00:21:52,200 S goes to either A or B. That's a length of 3. 389 00:21:52,200 --> 00:21:53,980 That's a length of 5. 390 00:21:53,980 --> 00:22:05,100 So I extend A. That goes to either B or D. But B is as if 391 00:22:05,100 --> 00:22:07,690 it wasn't there at all. 392 00:22:07,690 --> 00:22:08,235 Oh, sorry. 393 00:22:08,235 --> 00:22:09,485 Hang on. 394 00:22:13,180 --> 00:22:13,990 B goes there. 395 00:22:13,990 --> 00:22:18,140 And those path lengths are 7 and 6. 396 00:22:18,140 --> 00:22:21,260 And now I look around on the board, and I say what is the 397 00:22:21,260 --> 00:22:22,730 shortest path so far? 398 00:22:22,730 --> 00:22:28,610 And it's B. So I extend that to get to A and C with path 399 00:22:28,610 --> 00:22:32,020 lengths of 9 and 9. 400 00:22:32,020 --> 00:22:33,820 And what's the shortest one next? 401 00:22:33,820 --> 00:22:38,690 It's D. And that goes to G. And the path length is 11. 402 00:22:38,690 --> 00:22:42,640 And what's the shortest one on the board? 403 00:22:42,640 --> 00:22:44,010 The one that has to be extended next. 404 00:22:44,010 --> 00:22:46,690 That's this one that gets to B. But I've already extended a 405 00:22:46,690 --> 00:22:49,230 path that gets to B. So I don't, 406 00:22:49,230 --> 00:22:51,570 actually, do that extension. 407 00:22:51,570 --> 00:22:53,801 So I've saved some work. 408 00:22:53,801 --> 00:22:56,260 But I've got to go over here and do these two now. 409 00:22:56,260 --> 00:22:56,610 But wait. 410 00:22:56,610 --> 00:22:59,770 I've already extended B. 
I've already extended A, so I don't 411 00:22:59,770 --> 00:23:01,930 have to do that one either. 412 00:23:01,930 --> 00:23:04,340 The only one I have to do is the one that goes to C. And 413 00:23:04,340 --> 00:23:07,950 that goes then to E with a path length of 15. 414 00:23:07,950 --> 00:23:09,670 And I'm done. 415 00:23:09,670 --> 00:23:12,760 So if you compare this one with a previous one, you can 416 00:23:12,760 --> 00:23:15,450 see that there might be vast areas of this tree that are 417 00:23:15,450 --> 00:23:19,250 pruned away and don't have to be examined at all. 418 00:23:19,250 --> 00:23:21,860 So now, just for the sake of illustrating that, I would 419 00:23:21,860 --> 00:23:24,740 like to keep track of just one of those statistics. 420 00:23:24,740 --> 00:23:27,260 The number of extensions. 421 00:23:27,260 --> 00:23:30,420 So for this particular example, case one, the number 422 00:23:30,420 --> 00:23:35,330 of extensions was 835. 423 00:23:35,330 --> 00:23:37,380 Why don't you see if you can guess to yourself what it 424 00:23:37,380 --> 00:23:42,360 would be if I use this concept of an extended list. 425 00:23:42,360 --> 00:23:44,670 See, I'm not going to extend anything I've already extended 426 00:23:44,670 --> 00:23:47,420 because it's guaranteed to have a longer path length than 427 00:23:47,420 --> 00:23:50,200 something that already got to that same place. 428 00:23:50,200 --> 00:23:53,790 So it makes no sense to do it. 429 00:23:53,790 --> 00:23:58,520 So let me change the type to branch-and-bound with an 430 00:23:58,520 --> 00:24:00,750 extended list. 431 00:24:00,750 --> 00:24:02,420 I'm going to turn the speed down a little bit so 432 00:24:02,420 --> 00:24:04,150 we can watch it. 433 00:24:04,150 --> 00:24:05,830 It might take the rest of the hour. 434 00:24:05,830 --> 00:24:07,080 Who knows? 435 00:24:13,160 --> 00:24:14,000 Still doing a lot of work. 436 00:24:14,000 --> 00:24:15,445 Still examining a lot of paths.
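The classroom walk-through with an extended list can be sketched in Python as follows. As before, this is an illustration under stated assumptions rather than the demonstration program: `GRAPH` uses road lengths inferred from the path totals quoted in the lecture, `heapq` stands in for the sorted queue, and `branch_and_bound_extended` is a hypothetical name. The check is made when a path reaches the front of the queue, not when it is put on the queue, which matches the point that an extended list works here and an enqueued list does not.

```python
import heapq

# Classroom map; road lengths inferred from the totals quoted in the lecture
# (an assumption, since the blackboard drawing is not in the transcript).
GRAPH = {
    "S": {"A": 3, "B": 5},
    "A": {"S": 3, "B": 4, "D": 3},
    "B": {"S": 5, "A": 4, "C": 4},
    "C": {"B": 4, "E": 6},
    "D": {"A": 3, "G": 5},
    "E": {"C": 6},
    "G": {"D": 5},
}


def branch_and_bound_extended(graph, start, goal):
    """Branch and bound that never extends two paths ending at the same node."""
    queue = [(0, [start])]
    extended = set()  # nodes from which some path has already been extended
    extensions = 0
    while queue:
        length, path = heapq.heappop(queue)  # shortest accumulated path first
        node = path[-1]
        if node == goal:
            return length, path, extensions
        if node in extended:
            # A path at least as short already reached this node and was
            # extended, so this one cannot lead to a shorter answer.
            continue
        extended.add(node)
        extensions += 1
        for neighbour, distance in graph[node].items():
            if neighbour not in path:
                heapq.heappush(queue, (length + distance, path + [neighbour]))
    return None


length, path, extensions = branch_and_bound_extended(GRAPH, "S", "G")
print(length, path, extensions)
```

On this small map the sketch extends from S, A, B, D and C once each, so the paths S, A, B and S, B, A reach the front of the queue and are dropped without being extended, as in the walk-through above.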
437 00:24:22,880 --> 00:24:24,130 Well, look at that. 438 00:24:26,450 --> 00:24:33,200 Instead of 835 extensions it only did 38. 439 00:24:33,200 --> 00:24:35,800 So that's a pretty substantial savings. 440 00:24:35,800 --> 00:24:39,360 And you would never not want to do that. 441 00:24:39,360 --> 00:24:42,370 So note that that's a layering on top of branch and bound. 442 00:24:42,370 --> 00:24:44,300 That's not a different algorithm. 443 00:24:44,300 --> 00:24:47,250 It's an adjustment, an improvement to the algorithm, and it makes 444 00:24:47,250 --> 00:24:48,500 it more efficient. 445 00:24:50,830 --> 00:24:55,490 So this whole thing is based on what I call 446 00:24:55,490 --> 00:24:57,130 the dead horse principle. 447 00:24:57,130 --> 00:25:00,980 As soon as we figure out that a path that goes to a 448 00:25:00,980 --> 00:25:04,640 particular place can't possibly be the winning path, 449 00:25:04,640 --> 00:25:08,830 we get rid of it, and don't bother extending it. 450 00:25:08,830 --> 00:25:12,330 It's a dead horse principle. 451 00:25:12,330 --> 00:25:18,830 But if we look at this example, what's the shortest 452 00:25:18,830 --> 00:25:21,350 possible length of a path that's already 453 00:25:21,350 --> 00:25:22,600 gone from S to B? 454 00:25:27,776 --> 00:25:29,204 What do you think, Tanya? 455 00:25:32,070 --> 00:25:34,110 Well, first of all, it can't be less than 5 because we've 456 00:25:34,110 --> 00:25:37,390 already gone that distance. 457 00:25:37,390 --> 00:25:40,300 So when I say what's the shortest length of any path 458 00:25:40,300 --> 00:25:44,090 that there could possibly be that goes from S to B, we know 459 00:25:44,090 --> 00:25:46,420 it's at least 5. 460 00:25:46,420 --> 00:25:50,680 But can we say something more about it?
461 00:25:50,680 --> 00:25:54,140 Especially, when we look at these airline distances, and 462 00:25:54,140 --> 00:25:57,420 note that this airline distance is 6, and that's a 463 00:25:57,420 --> 00:26:01,150 little more than 7, and that's a little more than 7. 464 00:26:01,150 --> 00:26:02,400 So what do you think? 465 00:26:04,680 --> 00:26:07,270 So it's gone from S to B, and the question is what's the 466 00:26:07,270 --> 00:26:13,370 shortest path that could possibly be that had started 467 00:26:13,370 --> 00:26:14,620 out going from S to B? 468 00:26:17,860 --> 00:26:19,780 11 right? 469 00:26:19,780 --> 00:26:23,690 Because we can't have a path that's shorter than the 470 00:26:23,690 --> 00:26:25,630 airline distance. 471 00:26:25,630 --> 00:26:29,450 If there were a straight line road from B to G, its length 472 00:26:29,450 --> 00:26:30,320 would be 6. 473 00:26:30,320 --> 00:26:31,630 But there isn't. 474 00:26:31,630 --> 00:26:37,740 So that gives us a lower bound on the distance that we have 475 00:26:37,740 --> 00:26:39,840 along that path. 476 00:26:39,840 --> 00:26:45,520 So we're using the accumulated distance, plus the airline 477 00:26:45,520 --> 00:26:49,980 distance, to give us a lower bound on the path that we've 478 00:26:49,980 --> 00:26:53,750 started off on that goes from S to B. 479 00:26:53,750 --> 00:26:57,730 Once again, let's solidify a little bit by simulating the 480 00:26:57,730 --> 00:27:01,320 search and seeing how it turns out. 481 00:27:01,320 --> 00:27:04,270 Now, just as I did last time, I'm going to forget that I've got 482 00:27:04,270 --> 00:27:06,510 an extended list. 483 00:27:06,510 --> 00:27:08,840 I don't want to carry both of those things around with me at 484 00:27:08,840 --> 00:27:10,270 the same time. 485 00:27:10,270 --> 00:27:12,620 So forget that we've got an extended list. 486 00:27:12,620 --> 00:27:16,020 We'll bring all those back together a little later.
487 00:27:16,020 --> 00:27:18,160 So we're going to forget what we just did there. 488 00:27:18,160 --> 00:27:20,340 And instead we're just going to use this concept of an 489 00:27:20,340 --> 00:27:22,705 airline distance and see what happens. 490 00:27:44,890 --> 00:27:46,910 As before we start with a starting node. 491 00:27:46,910 --> 00:27:48,640 We have two choices as always. 492 00:27:48,640 --> 00:27:52,160 We can go to A or B. And the accumulated distance, if we go 493 00:27:52,160 --> 00:27:54,580 to A, is 3. 494 00:27:54,580 --> 00:27:58,350 And the accumulated distance if we go to B is 5. 495 00:27:58,350 --> 00:28:01,280 But now we're going to add in the airline distances. 496 00:28:01,280 --> 00:28:08,140 So the airline distance from A to G is a little more than 7, 497 00:28:08,140 --> 00:28:12,060 which is 10 plus. 498 00:28:12,060 --> 00:28:16,400 The airline distance from B to G is exactly 6. 499 00:28:16,400 --> 00:28:19,010 So that gives us 11. 500 00:28:19,010 --> 00:28:22,070 And following the procedure we've all been using already 501 00:28:22,070 --> 00:28:25,120 so far, we're going to extend the path that seems to have 502 00:28:25,120 --> 00:28:26,560 the shortest potential. 503 00:28:26,560 --> 00:28:30,390 Now it's the shortest potential distance S to G. So 504 00:28:30,390 --> 00:28:32,970 that must be this one here. 505 00:28:32,970 --> 00:28:41,480 So from A we can go to B or D. The accumulated 506 00:28:41,480 --> 00:28:43,065 distance S, A, B, is 7. 507 00:28:45,840 --> 00:28:48,550 The airline distance is 6, so that's equal to 11. 508 00:28:53,700 --> 00:28:57,890 Standard arithmetic 13. 509 00:28:57,890 --> 00:29:05,280 The distance S, A, D. That is 6 plus a little more than 7. 510 00:29:10,900 --> 00:29:14,090 So what's the accumulated distance? 511 00:29:14,090 --> 00:29:18,810 S, A, D is 3 plus 3 is 6. 512 00:29:18,810 --> 00:29:22,177 AUDIENCE: [INAUDIBLE]. 513 00:29:22,177 --> 00:29:22,925 PROFESSOR: What?
514 00:29:22,925 --> 00:29:25,565 AUDIENCE: The airline distance from D would be 5. 515 00:29:25,565 --> 00:29:28,220 PROFESSOR: Would be 5, right. 516 00:29:28,220 --> 00:29:30,230 So airline distance, in this case, is the same as the 517 00:29:30,230 --> 00:29:31,310 actual distance. 518 00:29:31,310 --> 00:29:33,190 So the accumulated distance is 6. 519 00:29:33,190 --> 00:29:35,360 The actual distance is 5. 520 00:29:35,360 --> 00:29:36,610 So that's equal to 11. 521 00:29:39,720 --> 00:29:42,420 So now I've got two 11's on the board. 522 00:29:42,420 --> 00:29:45,740 And simulating what we'd ask you to do on a quiz, we don't 523 00:29:45,740 --> 00:29:47,520 know which of those is going to be better. 524 00:29:47,520 --> 00:29:49,300 They've got a tie score. 525 00:29:49,300 --> 00:29:50,810 So what we're going to do is we're going to choose the one 526 00:29:50,810 --> 00:29:52,510 that's lexically least. 527 00:29:52,510 --> 00:29:58,260 So B comes before D. So we'll expand B. And that can go to 528 00:29:58,260 --> 00:30:05,170 either A or C. And we have to calculate the best possible 529 00:30:05,170 --> 00:30:07,485 distance that goes along those paths. 530 00:30:07,485 --> 00:30:14,670 The accumulated distance S, B, A. S, B, A is 9. 531 00:30:14,670 --> 00:30:18,140 So that's 9 plus 7 plus. 532 00:30:18,140 --> 00:30:20,700 That's 16 plus. 533 00:30:20,700 --> 00:30:23,170 This has an accumulated distance of 9. 534 00:30:23,170 --> 00:30:25,190 Also plus 7 plus. 535 00:30:25,190 --> 00:30:29,060 Also 16 plus. 536 00:30:29,060 --> 00:30:30,000 Well, now let's see. 537 00:30:30,000 --> 00:30:33,940 Things are shaping up pretty well because this one has the 538 00:30:33,940 --> 00:30:37,000 lowest score so far. 539 00:30:37,000 --> 00:30:39,760 We extend that to G. And now the 540 00:30:39,760 --> 00:30:42,470 accumulated distance is 11. 541 00:30:42,470 --> 00:30:45,460 The airline distance is 0, so that's 11.
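The scores written on the board in this walkthrough are all the same calculation: accumulated distance along the path, plus the airline distance from the last node to G. A small sketch of that arithmetic follows. The link lengths are inferred from the path lengths read out in the lecture, the value 7.1 is only a stand-in for "a little more than 7", and the names ROADS, AIRLINE, and lower_bound are made up for this example.

```python
# Link lengths inferred from the path lengths read out in the lecture.
ROADS = {
    "S": {"A": 3, "B": 5},
    "A": {"S": 3, "B": 4, "D": 3},
    "B": {"S": 5, "A": 4},
    "D": {"A": 3, "G": 5},
    "G": {"D": 5},
}

# Airline (straight-line) distances to G. 7.1 is an assumed stand-in for
# "a little more than 7"; 6, 5, and 0 are the values stated in the lecture.
AIRLINE = {"A": 7.1, "B": 6.0, "D": 5.0, "G": 0.0}


def lower_bound(path):
    """Accumulated distance so far plus airline distance still to go."""
    accumulated = sum(ROADS[a][b] for a, b in zip(path, path[1:]))
    return accumulated + AIRLINE[path[-1]]


for path in (["S", "A"], ["S", "B"], ["S", "A", "B"], ["S", "A", "D"],
             ["S", "B", "A"], ["S", "A", "D", "G"]):
    print("-".join(path), lower_bound(path))
```

The two paths that tie at 11 (S-B and S-A-D) are the ones the lecture breaks lexically, and S-A-D-G also scores 11 because the airline distance at the goal is 0.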
542 00:30:45,460 --> 00:30:48,840 And that's smaller than everybody else. 543 00:30:48,840 --> 00:30:51,040 So we've got it. 544 00:30:51,040 --> 00:30:56,700 So now compare this one with our branch-and-bound graph. 545 00:30:56,700 --> 00:31:01,370 And you see, once again, we've done considerably less work. 546 00:31:01,370 --> 00:31:04,850 And that, in many practical cases, means that instead of 547 00:31:04,850 --> 00:31:08,300 taking more than the remaining lifetime of the universe to 548 00:31:08,300 --> 00:31:12,760 complete the calculation, it can happen in a few seconds. 549 00:31:12,760 --> 00:31:15,440 But let's see how it works on the example. 550 00:31:15,440 --> 00:31:18,370 So I'm not going to use the extended list. 551 00:31:18,370 --> 00:31:23,810 I'm just going to use this idea of using a lower bound on 552 00:31:23,810 --> 00:31:26,710 the distance remaining, the airline distance, 553 00:31:26,710 --> 00:31:27,960 and see what happens. 554 00:32:00,168 --> 00:32:02,980 So this time, the number of extensions is 70. 555 00:32:08,880 --> 00:32:12,710 So it didn't do quite as well working alone as the 556 00:32:12,710 --> 00:32:14,680 extended list did working alone. 557 00:32:14,680 --> 00:32:20,160 So we immediately conclude that the extended list is more 558 00:32:20,160 --> 00:32:23,490 useful than using one of these lower bound heuristics. 559 00:32:23,490 --> 00:32:25,570 By the way, this is called an admissible heuristic. 560 00:32:33,555 --> 00:32:37,830 If the heuristic estimate is guaranteed to be less than the 561 00:32:37,830 --> 00:32:41,110 actual distance, that's called an admissible heuristic. 562 00:32:41,110 --> 00:32:45,230 Admissible because you can use it for this kind of purpose. 563 00:32:45,230 --> 00:32:51,940 So it looks like the extended list is a more useful 564 00:32:51,940 --> 00:32:54,880 idea than the admissible idea. 565 00:32:54,880 --> 00:32:56,690 Right?
566 00:32:56,690 --> 00:32:58,410 What do you think about that, Brett? 567 00:32:58,410 --> 00:33:00,195 Am I hacking? 568 00:33:00,195 --> 00:33:00,945 Am I joking? 569 00:33:00,945 --> 00:33:02,580 AUDIENCE: I think you're judging prematurely. 570 00:33:02,580 --> 00:33:04,180 PROFESSOR: Why am I judging prematurely? 571 00:33:04,180 --> 00:33:05,390 What do you think it might depend on? 572 00:33:05,390 --> 00:33:08,055 AUDIENCE: The fact that we're using extensions and the 573 00:33:08,055 --> 00:33:09,436 extended list pretty much guarantees you can only extend 574 00:33:09,436 --> 00:33:12,054 each node once. 575 00:33:12,054 --> 00:33:15,945 PROFESSOR: Well, Brett has said something unintelligible that 576 00:33:15,945 --> 00:33:17,600 I can't think how to repeat. 577 00:33:17,600 --> 00:33:19,195 What he meant to say, though, was that-- 578 00:33:19,195 --> 00:33:22,360 [LAUGHTER] 579 00:33:22,360 --> 00:33:24,770 PROFESSOR: --in these cases, it almost always depends on 580 00:33:24,770 --> 00:33:26,930 the problem itself. 581 00:33:26,930 --> 00:33:29,670 If you change the problem, you may get a different result. 582 00:33:29,670 --> 00:33:31,370 So why don't we change the problem and see if we get a 583 00:33:31,370 --> 00:33:32,480 different result? 584 00:33:32,480 --> 00:33:36,040 So instead of starting on the extreme left, let's start in 585 00:33:36,040 --> 00:33:37,380 the middle and see what happens. 586 00:33:42,350 --> 00:33:45,880 So I'll readjust my starting position to be right there. 587 00:33:45,880 --> 00:33:48,320 Oops, that's the wrong adjustment. 588 00:33:54,130 --> 00:33:56,840 And we might as well start by getting our baseline 589 00:33:56,840 --> 00:33:58,315 branch-and-bound without anything. 590 00:34:01,040 --> 00:34:05,580 And for that one, maybe, we'll speed it up a little bit. 591 00:34:09,070 --> 00:34:12,000 So that gives us 57 extensions. 592 00:34:12,000 --> 00:34:13,250 It's an easier problem.
593 00:34:18,880 --> 00:34:22,080 So let's try it with the admissible heuristic. 594 00:34:27,960 --> 00:34:29,210 That went too fast. 595 00:34:33,920 --> 00:34:35,610 Wow, still pretty fast. 596 00:34:35,610 --> 00:34:36,860 Six extensions. 597 00:34:39,382 --> 00:34:43,139 What do you think this number's going to be? 598 00:34:43,139 --> 00:34:45,989 Closer to six or closer to 57? 599 00:34:45,989 --> 00:34:46,630 Better than six? 600 00:34:46,630 --> 00:34:48,370 Worse than six? 601 00:34:48,370 --> 00:34:50,889 Well, let's think. 602 00:34:50,889 --> 00:34:54,100 What we're going to do is we're going to just not repeat 603 00:34:54,100 --> 00:34:58,080 any movements through the same node again. 604 00:34:58,080 --> 00:34:59,360 But it's not going to do something very 605 00:34:59,360 --> 00:35:00,085 important for us. 606 00:35:00,085 --> 00:35:04,350 It's not going to keep us out of the left side because it 607 00:35:04,350 --> 00:35:08,930 has no idea of the remaining airline distance to the goal. 608 00:35:08,930 --> 00:35:19,380 So let's see if that's true. It sure is. 609 00:35:19,380 --> 00:35:20,155 Look at that. 610 00:35:20,155 --> 00:35:23,100 It is, foolishly, spending a lot of its time doing 611 00:35:23,100 --> 00:35:24,090 something we would never do. 612 00:35:24,090 --> 00:35:25,740 Namely, looking over there on the left side. 613 00:35:31,510 --> 00:35:33,370 So this time, the number of extensions is 35. 614 00:35:36,050 --> 00:35:38,860 So in case two, the admissible heuristic 615 00:35:38,860 --> 00:35:40,140 does very much better. 616 00:35:40,140 --> 00:35:43,920 In case one, the extension thing does much better. 617 00:35:43,920 --> 00:35:46,450 But wait a minute, would we ever not want to use both at 618 00:35:46,450 --> 00:35:47,700 the same time? 619 00:35:50,570 --> 00:35:53,660 We wouldn't want to use just one of these, right? 620 00:35:53,660 --> 00:35:57,230 They both have the possibility of doing us a lot of good.
621 00:35:57,230 --> 00:36:00,190 So maybe if we put them in harness together, we'll get 622 00:36:00,190 --> 00:36:02,170 something that's even better. 623 00:36:02,170 --> 00:36:04,350 And when we do that-- 624 00:36:04,350 --> 00:36:07,870 see here, the extended list is a layer on top 625 00:36:07,870 --> 00:36:08,890 of branch and bound. 626 00:36:08,890 --> 00:36:11,200 The admissible heuristic is another layer on top 627 00:36:11,200 --> 00:36:12,250 of branch and bound. 628 00:36:12,250 --> 00:36:17,490 If we put those together, we get something called A star. 629 00:36:17,490 --> 00:36:21,800 So A star is just branch and bound, plus an extended list, 630 00:36:21,800 --> 00:36:24,290 plus an admissible heuristic. 631 00:36:24,290 --> 00:36:26,750 So let's go back to our original problem and try A 632 00:36:26,750 --> 00:36:28,000 star on that. 633 00:36:38,760 --> 00:36:40,410 We're running this at a pretty slow speed because we're 634 00:36:40,410 --> 00:36:43,190 expecting it to be a lot more efficient than the original 635 00:36:43,190 --> 00:36:43,890 branch and bound. 636 00:36:43,890 --> 00:36:44,560 And sure enough it is. 637 00:36:44,560 --> 00:36:47,470 The number of extensions is 27. 638 00:36:47,470 --> 00:36:49,910 So look at that. 639 00:36:49,910 --> 00:36:52,950 A lot better than either of those working independently. 640 00:36:52,950 --> 00:36:54,750 Now I can stick the thing in the center and see what 641 00:36:54,750 --> 00:36:56,000 happens then. 642 00:37:05,450 --> 00:37:09,490 So in this particular case, the extended list didn't, 643 00:37:09,490 --> 00:37:12,090 actually, help us because our admissible heuristic was 644 00:37:12,090 --> 00:37:13,790 channeling us so tightly toward the 645 00:37:13,790 --> 00:37:16,360 goal it didn't matter. 646 00:37:16,360 --> 00:37:19,470 So it all depends on the nature of the space that 647 00:37:19,470 --> 00:37:22,160 you're trying to explore.
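A star as described here (branch and bound, plus an extended list, plus an admissible heuristic) might be sketched as follows on the small board example. This is a minimal sketch under stated assumptions, not the lecture's demonstration program. The link lengths are inferred from the path lengths read out earlier; the airline distances for B, D, and G are the ones stated in the lecture; 7.1 stands in for "a little more than 7" at A and C; and the values for S and E are placeholders chosen for this example, since the lecture never gives them. The names GRAPH, AIRLINE, and a_star are hypothetical.

```python
import heapq

# Link lengths inferred from the path lengths read out in the lecture.
GRAPH = {
    "S": {"A": 3, "B": 5},
    "A": {"S": 3, "B": 4, "D": 3},
    "B": {"S": 5, "A": 4, "C": 4},
    "C": {"B": 4, "E": 6},
    "D": {"A": 3, "G": 5},
    "E": {"C": 6},
    "G": {"D": 5},
}

# Airline distances to G. B, D, G are from the lecture; A and C use 7.1 for
# "a little more than 7"; S and E are placeholder underestimates (assumed).
AIRLINE = {"S": 10.0, "A": 7.1, "B": 6.0, "C": 7.1, "D": 5.0, "E": 9.0, "G": 0.0}


def a_star(graph, heuristic, start, goal):
    """Branch and bound + extended list + heuristic estimate of remaining distance."""
    # Queue entries: (accumulated + estimated remaining, accumulated, path).
    queue = [(heuristic[start], 0, [start])]
    extended = set()
    extensions = 0
    while queue:
        _, length, path = heapq.heappop(queue)  # lowest total estimate first
        node = path[-1]
        if node == goal:
            return length, path, extensions
        if node in extended:
            continue  # extended list: never extend the same node twice
        extended.add(node)
        extensions += 1
        for neighbor, cost in graph[node].items():
            if neighbor not in extended:
                new_length = length + cost
                estimate = new_length + heuristic[neighbor]
                heapq.heappush(queue, (estimate, new_length, path + [neighbor]))
    return None


result = a_star(GRAPH, AIRLINE, "S", "G")
print(result)
```

The heap plays the role of "keep track of the minimum" rather than sorting the whole queue; ties on the total estimate fall back to the shorter accumulated distance, which here happens to pick B before D just as the lexical rule does.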
648 00:37:22,160 --> 00:37:26,870 By the way, you know how the whole thing works, right? 649 00:37:26,870 --> 00:37:30,210 So what you want to do is you want to extend the 650 00:37:30,210 --> 00:37:32,120 first path and sort. 651 00:37:32,120 --> 00:37:35,120 But not just by accumulated distance. 652 00:37:35,120 --> 00:37:45,550 Sort by accumulated distance plus admissible heuristic. 653 00:37:53,010 --> 00:37:54,090 But what about the theoreticians? 654 00:37:54,090 --> 00:37:56,550 You must be complaining. 655 00:37:56,550 --> 00:37:58,440 Sort's expensive. 656 00:37:58,440 --> 00:38:00,740 Do we actually need to sort? 657 00:38:03,470 --> 00:38:04,960 No, we don't actually need to sort. 658 00:38:04,960 --> 00:38:05,900 What do we need to do? 659 00:38:05,900 --> 00:38:09,141 AUDIENCE: We just need to keep track of what's the minimum. 660 00:38:09,141 --> 00:38:09,946 PROFESSOR: We just need to keep track 661 00:38:09,946 --> 00:38:10,530 of what's the minimum. 662 00:38:10,530 --> 00:38:12,130 So we don't need to, actually, do that sort. 663 00:38:12,130 --> 00:38:16,670 That's an unnecessary computation. 664 00:38:16,670 --> 00:38:20,770 So instead, we can test, not the first path but the 665 00:38:20,770 --> 00:38:22,020 shortest path. 666 00:38:27,990 --> 00:38:28,620 And now you have it. 667 00:38:28,620 --> 00:38:31,430 Now you have the whole of A star. 668 00:38:31,430 --> 00:38:36,650 And now you can go home, but I don't think you should because 669 00:38:36,650 --> 00:38:42,090 I'm about to show you that this idea of admissibility, 670 00:38:42,090 --> 00:38:44,410 actually, leads to certain screw cases that we're very 671 00:38:44,410 --> 00:38:48,340 fond of asking about on exams. 672 00:38:48,340 --> 00:38:51,280 So it turns out that the admissible heuristic, in 673 00:38:51,280 --> 00:38:53,400 certain circumstances, could get you into trouble.
674 00:38:53,400 --> 00:38:56,940 It doesn't look like it could because, logically, nothing 675 00:38:56,940 --> 00:39:00,480 I've said seems strange or questionable. 676 00:39:00,480 --> 00:39:04,150 But that's because I've been working with a map. 677 00:39:04,150 --> 00:39:06,100 And it turns out that if you work with a map then 678 00:39:06,100 --> 00:39:08,500 admissibility is a perfectly sound way of 679 00:39:08,500 --> 00:39:11,710 doing an optimal search. 680 00:39:11,710 --> 00:39:15,650 But, Travis, is search just about maps? 681 00:39:15,650 --> 00:39:17,130 No, search is not just about maps. 682 00:39:17,130 --> 00:39:21,545 So we may have non-Euclidean arrangements that will cause 683 00:39:21,545 --> 00:39:22,710 us trouble. 684 00:39:22,710 --> 00:39:24,815 So I'd like to illustrate that with the following example. 685 00:39:32,710 --> 00:39:36,100 It's not going to be a large map or a large graph. 686 00:39:36,100 --> 00:39:43,866 S, then go up here to A or down here to B. Then they 687 00:39:43,866 --> 00:39:51,650 merge at C. And then they go out here to the goal, G. 688 00:39:51,650 --> 00:39:58,810 And the actual distances are 1, 1, 1, and 10. 689 00:39:58,810 --> 00:40:01,390 And over here, we'll make that 100. 690 00:40:01,390 --> 00:40:05,340 So it's a kind of oddly constructed map, but it's 691 00:40:05,340 --> 00:40:07,240 there because we need a pathological case to 692 00:40:07,240 --> 00:40:09,510 illustrate the idea. 693 00:40:09,510 --> 00:40:11,590 Now that's the actual distances. 694 00:40:11,590 --> 00:40:15,090 And if we did branch and bound with an extended list, 695 00:40:15,090 --> 00:40:17,250 everything would work just fine. 696 00:40:17,250 --> 00:40:17,715 But we're not. 697 00:40:17,715 --> 00:40:20,530 We're going to use an admissible heuristic. 698 00:40:20,530 --> 00:40:22,760 And we're going to say that this guy has an estimated 699 00:40:22,760 --> 00:40:25,480 distance to the goal of 100.
700 00:40:25,480 --> 00:40:27,830 This guy is 0. 701 00:40:27,830 --> 00:40:30,240 And this guy is 0. 702 00:40:30,240 --> 00:40:33,500 Now, 0 is always an underestimate of the actual 703 00:40:33,500 --> 00:40:34,600 distance to the goal, right? 704 00:40:34,600 --> 00:40:36,970 So I'm always free to use 0. 705 00:40:36,970 --> 00:40:39,350 Is that 100 OK? 706 00:40:39,350 --> 00:40:43,400 Yeah because the actual distance is 101, so it's less 707 00:40:43,400 --> 00:40:44,650 than the actual distance. 708 00:40:44,650 --> 00:40:48,950 So it's OK as an admissible heuristic. 709 00:40:48,950 --> 00:40:53,570 So these numbers that I put up here, together, constitute an 710 00:40:53,570 --> 00:40:57,300 admissible heuristic set of estimates to the goal. 711 00:40:57,300 --> 00:41:06,370 So now, let's just simulate A star and see what happens. 712 00:41:06,370 --> 00:41:10,170 So first of all, you start with S, and that can either go 713 00:41:10,170 --> 00:41:20,130 to A or B. The actual distance is 1 plus an estimate on the 714 00:41:20,130 --> 00:41:21,570 remaining distance. 715 00:41:21,570 --> 00:41:25,410 That gives us 1 plus 100. 716 00:41:25,410 --> 00:41:28,860 That's equal to 101. 717 00:41:28,860 --> 00:41:32,160 If we go to B instead, the actual distance is 1 plus the 718 00:41:32,160 --> 00:41:36,190 heuristic's distance is 0, so that's equal to 1. 719 00:41:36,190 --> 00:41:36,870 OK, good. 720 00:41:36,870 --> 00:41:39,140 So now we know that we always extend the 721 00:41:39,140 --> 00:41:42,020 shortest path so far. 722 00:41:42,020 --> 00:41:44,020 Did I goof this, or are you asking a question? 723 00:41:44,020 --> 00:41:45,270 AUDIENCE: [INAUDIBLE]? 724 00:41:49,145 --> 00:41:51,730 PROFESSOR: Yeah, when I say actual, it's the actual 725 00:41:51,730 --> 00:41:52,895 distance that you've traveled. 726 00:41:52,895 --> 00:41:55,650 AUDIENCE: But that's [INAUDIBLE]. 727 00:41:55,650 --> 00:41:57,250 PROFESSOR: So wait a second.
728 00:41:57,250 --> 00:42:00,123 If I go from S to A, the actual distance 729 00:42:00,123 --> 00:42:01,338 I've traveled is 1. 730 00:42:01,338 --> 00:42:03,282 AUDIENCE: I meant like, does the map-- 731 00:42:03,282 --> 00:42:06,780 PROFESSOR: So now I'm taking the sum of the actual 732 00:42:06,780 --> 00:42:08,735 distance, plus the estimated distance to go. 733 00:42:08,735 --> 00:42:09,474 AUDIENCE: All right. 734 00:42:09,474 --> 00:42:10,896 I'm just wondering if the original map has to be 735 00:42:10,896 --> 00:42:12,792 [INAUDIBLE]. 736 00:42:12,792 --> 00:42:17,270 PROFESSOR: See this is not a map. 737 00:42:17,270 --> 00:42:21,630 She was asking if the map has to be geometrically accurate. 738 00:42:21,630 --> 00:42:25,480 See, this could be a model of something that's not a map. 739 00:42:25,480 --> 00:42:28,230 And so, I'm free to put any numbers on those links that I 740 00:42:28,230 --> 00:42:31,520 want, including estimates, as long as they're underestimates 741 00:42:31,520 --> 00:42:35,250 of the distance along the links. 742 00:42:35,250 --> 00:42:38,150 So this tells me that my estimated distance 743 00:42:38,150 --> 00:42:40,810 here, so far, is 1. 744 00:42:40,810 --> 00:42:45,530 So I'll, surely, go down here to C. And if I go to C, then 745 00:42:45,530 --> 00:42:50,890 my accumulated distance is 11. 746 00:42:50,890 --> 00:42:53,190 And my estimate of the remaining distance is 0. 747 00:42:56,140 --> 00:42:57,505 So that's a total of 11. 748 00:43:00,400 --> 00:43:02,270 So now I'm following my heuristic again and saying 749 00:43:02,270 --> 00:43:07,130 what's the shortest path on the basis of the accumulated 750 00:43:07,130 --> 00:43:10,180 distance plus the estimated distance? 751 00:43:10,180 --> 00:43:12,080 Here, the accumulated distance plus the 752 00:43:12,080 --> 00:43:14,490 estimated distance is 101. 753 00:43:14,490 --> 00:43:15,600 Here, it's only 11. 754 00:43:15,600 --> 00:43:18,420 So plainly, I extend this guy.
755 00:43:18,420 --> 00:43:21,300 And that gets me to the goal. 756 00:43:21,300 --> 00:43:27,980 And the total accumulated distance is now 111 plus 0 757 00:43:27,980 --> 00:43:29,230 equals 111. 758 00:43:35,130 --> 00:43:37,750 And that's not the shortest path, but wait. 759 00:43:37,750 --> 00:43:39,780 I still have to do my checking, right? 760 00:43:39,780 --> 00:43:45,900 I have to extend A. And when I extend A, I get to B. And now, 761 00:43:45,900 --> 00:43:48,710 when I get to B that way, my accumulated 762 00:43:48,710 --> 00:43:53,230 distance is 2 plus my-- 763 00:43:53,230 --> 00:43:53,920 oh, sorry. 764 00:43:53,920 --> 00:43:57,640 S, A, C. 765 00:43:57,640 --> 00:43:59,170 My accumulated distance is 2. 766 00:43:59,170 --> 00:44:02,910 My estimated distance is 0, so that's equal to 2. 767 00:44:02,910 --> 00:44:05,150 So I'm OK because I'm still going to extend 768 00:44:05,150 --> 00:44:06,810 to this guy, right? 769 00:44:06,810 --> 00:44:07,950 Wrong. 770 00:44:07,950 --> 00:44:09,210 I've already extended that guy. 771 00:44:12,220 --> 00:44:13,530 So I'm hosed. 772 00:44:13,530 --> 00:44:15,640 I won't find the shortest path because I'm 773 00:44:15,640 --> 00:44:17,740 going to stop there. 774 00:44:17,740 --> 00:44:20,085 And I'm going to stop there because this is an admissible 775 00:44:20,085 --> 00:44:24,450 heuristic and that's not good enough unless it's a map. 776 00:44:24,450 --> 00:44:26,620 It's not good enough for this particular case because this 777 00:44:26,620 --> 00:44:27,210 is not geometric. 778 00:44:27,210 --> 00:44:32,900 This cannot be done as a map on a plane. 779 00:44:32,900 --> 00:44:36,850 So that's a situation where what I've talked to you about, 780 00:44:36,850 --> 00:44:39,730 so far, works with branch-and-bound. 781 00:44:39,730 --> 00:44:42,070 Works with branch-and-bound plus an extended list. 782 00:44:42,070 --> 00:44:45,210 But doesn't work when we added an admissible heuristic.
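The pathological case traced above can be reproduced with a short sketch. The link lengths (S-A 1, S-B 1, A-C 1, B-C 10, C-G 100) and the estimates for A, B, and C (100, 0, 0) are the ones given in the lecture; the estimate of 0 for S and the names LINKS, ESTIMATE, and search are assumptions made for this example. The links are written one-way, left to right, to match the direction in which the board example is traversed.

```python
import heapq

# The small pathological graph from the lecture, written as one-way links.
LINKS = {
    "S": {"A": 1, "B": 1},
    "A": {"C": 1},
    "B": {"C": 10},
    "C": {"G": 100},
    "G": {},
}

# Admissible but inconsistent estimates; the value for S is an assumption.
ESTIMATE = {"S": 0, "A": 100, "B": 0, "C": 0, "G": 0}
ZERO = {node: 0 for node in LINKS}  # no heuristic at all


def search(links, heuristic, start, goal, use_extended_list):
    """Branch and bound with a heuristic and an optional extended list."""
    queue = [(heuristic[start], 0, [start])]
    extended = set()
    while queue:
        _, length, path = heapq.heappop(queue)
        node = path[-1]
        if node == goal:
            return length, path
        if use_extended_list:
            if node in extended:
                continue  # this is where the path S, A, C gets thrown away
            extended.add(node)
        for neighbor, cost in links[node].items():
            new_length = length + cost
            heapq.heappush(
                queue, (new_length + heuristic[neighbor], new_length, path + [neighbor])
            )
    return None


hosed = search(LINKS, ESTIMATE, "S", "G", use_extended_list=True)
no_heuristic = search(LINKS, ZERO, "S", "G", use_extended_list=True)
no_extended_list = search(LINKS, ESTIMATE, "S", "G", use_extended_list=False)
print(hosed)             # length 111 via B: not the shortest path
print(no_heuristic)      # length 102 via A
print(no_extended_list)  # length 102 via A
```

Each layer on its own still finds the path of length 102; it is only the combination of the extended list with this particular admissible heuristic that settles for 111.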
783 00:44:45,210 --> 00:44:48,020 So if we're going to do this in general, we need something 784 00:44:48,020 --> 00:44:51,480 stronger than admissibility, which works only on maps. 785 00:44:51,480 --> 00:44:54,590 And so the flourish that I'll tell you about here in the 786 00:44:54,590 --> 00:45:10,780 last few seconds of today's lecture is to add a refinement 787 00:45:10,780 --> 00:45:11,960 as follows. 788 00:45:11,960 --> 00:45:13,590 So far, we've got admissibility. 789 00:45:20,020 --> 00:45:22,410 And if we want to write this down in a kind of mathematical 790 00:45:22,410 --> 00:45:25,680 notation, we could say that it's admissible if the 791 00:45:25,680 --> 00:45:31,500 estimated distance between any node X and the goal is less 792 00:45:31,500 --> 00:45:34,440 than or equal to the actual distance 793 00:45:34,440 --> 00:45:38,490 between X and the goal. 794 00:45:38,490 --> 00:45:40,230 That's the definition of admissible. 795 00:45:40,230 --> 00:45:43,350 As long as the heuristic does that it's admissible. 796 00:45:43,350 --> 00:45:46,250 And A star works if it's a map. 797 00:45:46,250 --> 00:45:48,860 But for that kind of situation where it's not a map we need a 798 00:45:48,860 --> 00:45:51,445 stronger condition, which is called consistency. 799 00:45:54,830 --> 00:45:59,100 And what that says is that the distance between X and the 800 00:45:59,100 --> 00:46:04,940 goal minus the distance between some other node, Y, and the 801 00:46:04,940 --> 00:46:11,010 goal. Take the absolute value of that. 802 00:46:11,010 --> 00:46:13,950 That has to be less than or equal to the actual distance 803 00:46:13,950 --> 00:46:18,700 between X and Y. 804 00:46:18,700 --> 00:46:24,280 So does this heuristic satisfy the consistency condition? 805 00:46:24,280 --> 00:46:25,600 Well, let's see. 806 00:46:25,600 --> 00:46:27,980 Here the guess is 100. 807 00:46:27,980 --> 00:46:28,660 Here it's 0. 808 00:46:28,660 --> 00:46:31,590 So the absolute difference is 100.
809 00:46:31,590 --> 00:46:34,440 But the actual distance is only 2. 810 00:46:34,440 --> 00:46:38,170 So it satisfies admissibility, but it doesn't satisfy 811 00:46:38,170 --> 00:46:38,495 consistency. 812 00:46:38,495 --> 00:46:40,320 And it doesn't work. 813 00:46:40,320 --> 00:46:41,880 So you can almost be guaranteed we'll give you a 814 00:46:41,880 --> 00:46:47,830 situation where if you use an admissible 815 00:46:47,830 --> 00:46:50,490 heuristic you'll lose. 816 00:46:50,490 --> 00:46:54,290 And if you use a consistent heuristic, you'll still win. 817 00:46:57,220 --> 00:47:00,570 So how can we bring this back into the fold? 818 00:47:00,570 --> 00:47:01,950 Well, we can't use that heuristic. 819 00:47:01,950 --> 00:47:03,480 It's no good. 820 00:47:03,480 --> 00:47:09,690 But if this heuristic estimate of the goal were 2, then we'd 821 00:47:09,690 --> 00:47:15,100 be OK because then it would still be admissible. 822 00:47:15,100 --> 00:47:18,310 But it would also be consistent. 823 00:47:18,310 --> 00:47:20,565 So the bottom line is that you now know something you didn't 824 00:47:20,565 --> 00:47:24,000 know when you started out two lectures ago. 825 00:47:24,000 --> 00:47:28,450 You now know how MapQuest and all of its descendants work. 826 00:47:28,450 --> 00:47:30,470 Now you can find an optimal path, as well as a 827 00:47:30,470 --> 00:47:32,390 heuristically good path. 828 00:47:32,390 --> 00:47:34,795 You see that if you don't do anything other than branch and 829 00:47:34,795 --> 00:47:37,510 bound it can be extremely expensive. 830 00:47:37,510 --> 00:47:39,760 And you can even invent pathological cases where it's 831 00:47:39,760 --> 00:47:45,890 exponential in the distance to the goal. 832 00:47:45,890 --> 00:47:48,730 So because it can be so computationally horrible, you 833 00:47:48,730 --> 00:47:51,360 want to use every advantage you can, which, generally, 834 00:47:51,360 --> 00:47:54,470 involves using an extended list.
835 00:47:54,470 --> 00:47:55,790 As well as-- 836 00:47:55,790 --> 00:47:56,870 no laptops, please. 837 00:47:56,870 --> 00:47:57,510 It still holds. 838 00:47:57,510 --> 00:47:59,930 No smoking, no drinking, and no laptops. 839 00:48:04,160 --> 00:48:06,200 So you're going to use all the muscles you can. 840 00:48:06,200 --> 00:48:11,180 And those muscles include using an extended list and an 841 00:48:11,180 --> 00:48:14,410 admissible or consistent heuristic, depending on the 842 00:48:14,410 --> 00:48:16,350 circumstances. 843 00:48:16,350 --> 00:48:19,880 And so, I think we'll conclude there since our time is up. 844 00:48:19,880 --> 00:48:21,590 And Elliot, you can ask a question after class. 845 00:48:21,590 --> 00:48:22,990 Why don't you come up and ask it now?