1 00:00:00,000 --> 00:00:05,790 2 00:00:05,790 --> 00:00:07,320 PROFESSOR: Today I'd like to talk to you about some 3 00:00:07,320 --> 00:00:10,890 techniques that you can add to basic search to enable your 4 00:00:10,890 --> 00:00:13,510 systems to make more intelligent decisions and save 5 00:00:13,510 --> 00:00:15,380 computation time. 6 00:00:15,380 --> 00:00:17,710 Last time, we introduced basic search and the basic idea of 7 00:00:17,710 --> 00:00:21,150 how to encode search so that our system can use search when 8 00:00:21,150 --> 00:00:24,730 encountering an unknown territory or state space. 9 00:00:24,730 --> 00:00:27,000 And today, we're going to review some things you can do 10 00:00:27,000 --> 00:00:31,110 in terms of using the information that you know and 11 00:00:31,110 --> 00:00:33,980 also using estimations of the information that you would 12 00:00:33,980 --> 00:00:38,510 like to know to improve your chances of discovering the 13 00:00:38,510 --> 00:00:42,040 path that you're interested in the fastest. 14 00:00:42,040 --> 00:00:46,490 The first thing that we can do in order to improve our search 15 00:00:46,490 --> 00:00:49,540 is use dynamic programming. 16 00:00:49,540 --> 00:00:53,510 Dynamic programming refers to the idea is that once you've 17 00:00:53,510 --> 00:00:57,080 done a computation for a particular kind of problem, 18 00:00:57,080 --> 00:01:00,640 you can save that computation and use it later, as opposed 19 00:01:00,640 --> 00:01:05,180 to having to engage in that computation a second time. 20 00:01:05,180 --> 00:01:08,020 The way this manifests in search is that once you 21 00:01:08,020 --> 00:01:10,970 visited a particular state, you don't have to visit that 22 00:01:10,970 --> 00:01:11,830 state again. 23 00:01:11,830 --> 00:01:15,610 Because the way your agenda's set up, you found the fastest 24 00:01:15,610 --> 00:01:20,420 way to that particular state, as far as you're concerned. 25 00:01:20,420 --> 00:01:22,930 So the general idea is that once you visit a particular 26 00:01:22,930 --> 00:01:24,920 state, you don't have to visit it again. 27 00:01:24,920 --> 00:01:28,080 28 00:01:28,080 --> 00:01:32,490 If you're building a normal search tree, it can be 29 00:01:32,490 --> 00:01:35,740 difficult to keep track of where you've been, especially 30 00:01:35,740 --> 00:01:39,060 if you're doing something like that depth-first search. 31 00:01:39,060 --> 00:01:41,990 To get around this and to enable dynamic programming, 32 00:01:41,990 --> 00:01:45,020 the easiest thing to do is just keep a list of the states 33 00:01:45,020 --> 00:01:47,150 that you visited as a consequence of this particular 34 00:01:47,150 --> 00:01:50,210 run of search. 35 00:01:50,210 --> 00:01:53,190 I'm going to demonstrate dynamic programming by running 36 00:01:53,190 --> 00:01:56,440 depth-first search on our state transition 37 00:01:56,440 --> 00:01:59,580 diagram from last time. 38 00:01:59,580 --> 00:02:02,450 The first two steps are the same, except for the fact that 39 00:02:02,450 --> 00:02:04,640 we're going to keep track of the fact that we visited both 40 00:02:04,640 --> 00:02:09,400 A, B, and C as a consequence of the first two 41 00:02:09,400 --> 00:02:10,650 iterations of search. 42 00:02:10,650 --> 00:02:17,490 43 00:02:17,490 --> 00:02:20,720 If I'm running depth-first search, my agenda acts as a 44 00:02:20,720 --> 00:02:24,010 stack, which means I'm going to take the partial path that 45 00:02:24,010 --> 00:02:29,010 I added most recently to the agenda, pop it off, and expand 46 00:02:29,010 --> 00:02:32,270 the last node in the partial path. 47 00:02:32,270 --> 00:02:40,710 When I expand C, I'm going to visit B and D. However, B is 48 00:02:40,710 --> 00:02:43,390 already in my list of states that I visited, because it's 49 00:02:43,390 --> 00:02:46,700 already in one of the partial paths in my agenda. 50 00:02:46,700 --> 00:02:50,260 So I'm not going to visit B again. 51 00:02:50,260 --> 00:02:52,670 Instead, I'm only going to add one new partial path to my 52 00:02:52,670 --> 00:02:54,410 agenda, ACD. 53 00:02:54,410 --> 00:03:02,140 54 00:03:02,140 --> 00:03:06,060 I'm also going to add D to my list of visited states. 55 00:03:06,060 --> 00:03:11,690 56 00:03:11,690 --> 00:03:12,870 If I run another iteration of depth-first 57 00:03:12,870 --> 00:03:14,740 search, I find my goal. 58 00:03:14,740 --> 00:03:24,164 59 00:03:24,164 --> 00:03:29,040 For completeness, E is added to list of visited states. 60 00:03:29,040 --> 00:03:30,980 This took even fewer iterations that the original 61 00:03:30,980 --> 00:03:32,450 depth-first search. 62 00:03:32,450 --> 00:03:34,480 We didn't waste time expanding different nodes. 63 00:03:34,480 --> 00:03:37,580 64 00:03:37,580 --> 00:03:39,240 We also used less space. 65 00:03:39,240 --> 00:03:43,240 In the general sense, the concept of dynamic programming 66 00:03:43,240 --> 00:03:44,730 is great to use whatever you can. 67 00:03:44,730 --> 00:03:46,730 And in search, it can save you a lot of time and energy. 68 00:03:46,730 --> 00:03:54,200 69 00:03:54,200 --> 00:03:56,580 Another way we can intelligently improve our 70 00:03:56,580 --> 00:04:00,040 search technique is by making considerations for costs that 71 00:04:00,040 --> 00:04:03,100 we know that are associated with particular transitions. 72 00:04:03,100 --> 00:04:05,990 In this state transition diagram, I've indicated some 73 00:04:05,990 --> 00:04:08,100 costs associated with the transitions 74 00:04:08,100 --> 00:04:10,710 between particular states. 75 00:04:10,710 --> 00:04:13,400 If we know the costs associated with transitions, 76 00:04:13,400 --> 00:04:17,339 then we can use a cumulation of the weights accumulated 77 00:04:17,339 --> 00:04:20,120 through traversing a particular partial path to 78 00:04:20,120 --> 00:04:23,500 prioritize which paths we're going to explore first in an 79 00:04:23,500 --> 00:04:26,890 effort to reduce the amount of cost associated 80 00:04:26,890 --> 00:04:30,120 with our final actions. 81 00:04:30,120 --> 00:04:33,580 In order to do that, we can keep track of the value 82 00:04:33,580 --> 00:04:36,320 associated with that cumulation, and sort the 83 00:04:36,320 --> 00:04:38,960 agenda at every iteration based on that cost. 84 00:04:38,960 --> 00:04:42,190 85 00:04:42,190 --> 00:04:44,580 If we're using dynamic programming, while making 86 00:04:44,580 --> 00:04:49,310 considerations for both cost and heuristics, and also when 87 00:04:49,310 --> 00:04:52,950 we're running the goal test when making considerations for 88 00:04:52,950 --> 00:04:56,050 cost and heuristics, we're actually going to make our 89 00:04:56,050 --> 00:04:59,470 considerations when we expand the node as opposed to when we 90 00:04:59,470 --> 00:05:00,930 visit the node. 91 00:05:00,930 --> 00:05:03,010 This difference is very important because it provides 92 00:05:03,010 --> 00:05:06,110 us with the most optimal solution. 93 00:05:06,110 --> 00:05:09,650 If it's possible for us to visit the goal node, but it's 94 00:05:09,650 --> 00:05:15,860 100 cost units away, It may be worth our while to search for 95 00:05:15,860 --> 00:05:18,130 alternatives that provide a much shorter 96 00:05:18,130 --> 00:05:21,340 path to the goal node. 97 00:05:21,340 --> 00:05:24,970 That's why we switch from considering when we visit to 98 00:05:24,970 --> 00:05:26,300 considering when we expand. 99 00:05:26,300 --> 00:05:28,950 100 00:05:28,950 --> 00:05:31,400 Let's look at uniform cost search run with dynamic 101 00:05:31,400 --> 00:05:33,724 programming. 102 00:05:33,724 --> 00:05:39,540 In my first step, I expand A and add two 103 00:05:39,540 --> 00:05:41,080 partial paths to my queue. 104 00:05:41,080 --> 00:05:45,760 I'm also going to keep track of the cumulative cost 105 00:05:45,760 --> 00:05:48,410 associated with that partial path. 106 00:05:48,410 --> 00:05:51,310 When I expand A, I add A to the agenda. 107 00:05:51,310 --> 00:05:56,710 108 00:05:56,710 --> 00:05:58,640 Note that I haven't talked about stacks or queues or 109 00:05:58,640 --> 00:06:01,070 depth first or bread first or anything. 110 00:06:01,070 --> 00:06:02,850 We're working with a priority queue. 111 00:06:02,850 --> 00:06:05,580 And what that means is that things with a higher priority 112 00:06:05,580 --> 00:06:06,240 float to the top. 113 00:06:06,240 --> 00:06:08,970 Or things with the lowest cost associated with them we're 114 00:06:08,970 --> 00:06:11,720 going to consider first. 115 00:06:11,720 --> 00:06:14,210 This means that I'm going to expand B in the 116 00:06:14,210 --> 00:06:16,090 partial path AB first. 117 00:06:16,090 --> 00:06:19,790 118 00:06:19,790 --> 00:06:22,740 I'll add it to my list. 119 00:06:22,740 --> 00:06:33,650 When I expand B, B has two child nodes, D and C. ABC has 120 00:06:33,650 --> 00:06:36,050 partial path cost 3 associated with it. 121 00:06:36,050 --> 00:06:42,440 And ABD has partial path cost 11 associated with it. 122 00:06:42,440 --> 00:06:46,300 That means when I reprioritize my queue, I'm going to end up 123 00:06:46,300 --> 00:06:55,030 sorting everything such that AC comes first, ABC come 124 00:06:55,030 --> 00:07:03,340 second, And ABD comes third. 125 00:07:03,340 --> 00:07:08,830 126 00:07:08,830 --> 00:07:13,110 Previously, with our strategy for dynamic programming, C 127 00:07:13,110 --> 00:07:17,200 would not have been added to this partial path, because 128 00:07:17,200 --> 00:07:21,210 we've already visited it with path AC. 129 00:07:21,210 --> 00:07:26,290 At this step, we're going to expand the path AC, add C to 130 00:07:26,290 --> 00:07:31,410 the list, and any other time that we end up visiting C, we 131 00:07:31,410 --> 00:07:32,880 will not add it to our paths. 132 00:07:32,880 --> 00:07:37,750 133 00:07:37,750 --> 00:07:42,470 If I expand C, I have a transition to B, which is 134 00:07:42,470 --> 00:07:46,360 already in my list, and a transition to D 135 00:07:46,360 --> 00:07:47,610 that has cost 7. 136 00:07:47,610 --> 00:07:58,140 137 00:07:58,140 --> 00:08:00,370 ABC is going to float to the top of my priority queue. 138 00:08:00,370 --> 00:08:02,940 139 00:08:02,940 --> 00:08:06,230 ACB is not going to be considered, because B is 140 00:08:06,230 --> 00:08:07,480 already part of my list. 141 00:08:07,480 --> 00:08:10,640 142 00:08:10,640 --> 00:08:13,300 And ABD is going to remain with cost 11. 143 00:08:13,300 --> 00:08:30,940 144 00:08:30,940 --> 00:08:33,059 At this point, you might say, but Kendra, why am I 145 00:08:33,059 --> 00:08:36,350 considering partial path ABC when C is in my list-- 146 00:08:36,350 --> 00:08:38,620 or C should be in my list, excuse me-- 147 00:08:38,620 --> 00:08:43,650 as a consequence of expanding the partial path AC? 148 00:08:43,650 --> 00:08:46,510 Even though we've expanded the partial path AC, if we have 149 00:08:46,510 --> 00:08:49,410 not made any considerations to weed out our list at every 150 00:08:49,410 --> 00:08:52,320 iteration, this is still going to float to the top. 151 00:08:52,320 --> 00:08:56,060 And we're still going to have to deal with it, even though 152 00:08:56,060 --> 00:09:01,110 we've already expanded C. Since we've already expanded 153 00:09:01,110 --> 00:09:07,340 C, we're going to ignore this partial path and just move ACD 154 00:09:07,340 --> 00:09:09,730 7 and ABD 11 up to the front. 155 00:09:09,730 --> 00:09:29,590 156 00:09:29,590 --> 00:09:32,870 D is not in our expanded list. 157 00:09:32,870 --> 00:09:41,630 When we expand D, we have one child node E. Because we're 158 00:09:41,630 --> 00:09:45,280 working with cost and heuristics, we do not actually 159 00:09:45,280 --> 00:09:47,340 evaluate the goal test when we visit a node. 160 00:09:47,340 --> 00:09:49,870 We evaluate it when we expand a node. 161 00:09:49,870 --> 00:09:55,260 So I am going to add ACDE to my agenda. 162 00:09:55,260 --> 00:09:56,520 It's going to have cost 8. 163 00:09:56,520 --> 00:10:05,230 164 00:10:05,230 --> 00:10:11,160 And ABD11 is still going to hang out here at the back of 165 00:10:11,160 --> 00:10:12,410 the priority queue. 166 00:10:12,410 --> 00:10:14,970 167 00:10:14,970 --> 00:10:26,260 At this point, I get to expand E. I skipped adding D to the 168 00:10:26,260 --> 00:10:36,230 expanded list when I expanded it from ACD to ACDE. 169 00:10:36,230 --> 00:10:39,540 At this point, I'm going to expand E. And the first thing 170 00:10:39,540 --> 00:10:43,550 I'm going to do is test and see whether or not it passes 171 00:10:43,550 --> 00:10:45,060 the goal test. 172 00:10:45,060 --> 00:10:50,430 At that point, I stop search, return that I successfully 173 00:10:50,430 --> 00:10:54,940 completed the search, and that my partial path 174 00:10:54,940 --> 00:10:56,190 is going to be ACDE. 175 00:10:56,190 --> 00:11:08,480 176 00:11:08,480 --> 00:11:12,420 That covers uniform cost search. 177 00:11:12,420 --> 00:11:14,390 At this point, you might say, Kendra, this is bearing a lot 178 00:11:14,390 --> 00:11:16,180 of similarity to things like maps. 179 00:11:16,180 --> 00:11:20,010 And I would really like to be able to use my knowledge of 180 00:11:20,010 --> 00:11:22,990 things like the Euclidean in order to make more 181 00:11:22,990 --> 00:11:24,910 intelligent, even more intelligent decisions about 182 00:11:24,910 --> 00:11:28,230 where it is that I go with myself. 183 00:11:28,230 --> 00:11:31,300 And I would say yes, you should be able to. 184 00:11:31,300 --> 00:11:32,130 In fact, people do. 185 00:11:32,130 --> 00:11:35,640 In fact, people do all the time. 186 00:11:35,640 --> 00:11:37,610 They say well, some thing's this far 187 00:11:37,610 --> 00:11:38,700 away as the crow flies. 188 00:11:38,700 --> 00:11:40,900 So I know that if I've gone further than that at any 189 00:11:40,900 --> 00:11:43,390 point, then I've wasted some amount of time. 190 00:11:43,390 --> 00:11:45,920 But it represents a good underestimate of the distance 191 00:11:45,920 --> 00:11:48,020 that I'm going to cover. 192 00:11:48,020 --> 00:11:49,615 This is the basic concept of heuristics. 193 00:11:49,615 --> 00:11:54,560 194 00:11:54,560 --> 00:11:57,600 If you're attempting to find a goal, and you have an estimate 195 00:11:57,600 --> 00:12:00,220 for the remaining cost, but you know it's not exactly 196 00:12:00,220 --> 00:12:03,910 right, you can still use that information to attempt to save 197 00:12:03,910 --> 00:12:06,110 you an amount of computation or amount of search. 198 00:12:06,110 --> 00:12:09,080 199 00:12:09,080 --> 00:12:10,570 In particular, you probably shouldn't be using a 200 00:12:10,570 --> 00:12:12,240 heuristic, if you know it's perfect. 201 00:12:12,240 --> 00:12:14,590 Because if you know the heuristic is perfect, then you 202 00:12:14,590 --> 00:12:16,550 should be using the heuristic to solve your problems, 203 00:12:16,550 --> 00:12:18,410 instead of doing search in the first place. 204 00:12:18,410 --> 00:12:21,820 Or if the heuristic already tells you how long it's going 205 00:12:21,820 --> 00:12:24,680 to take to find something, then it probably also has the 206 00:12:24,680 --> 00:12:29,210 path that represents how long or that represents 207 00:12:29,210 --> 00:12:30,560 that amount of cost. 208 00:12:30,560 --> 00:12:36,220 209 00:12:36,220 --> 00:12:39,940 If you want to use a heuristic effectively, you have to make 210 00:12:39,940 --> 00:12:43,130 sure that your heuristic represents a non-strict 211 00:12:43,130 --> 00:12:48,040 underestimate of the amount of cost that is left over. 212 00:12:48,040 --> 00:12:51,240 213 00:12:51,240 --> 00:12:52,160 And what do I mean by that? 214 00:12:52,160 --> 00:12:56,470 I mean that if you have a heuristic, and you're using it 215 00:12:56,470 --> 00:12:58,720 as a thing to tell you whether or not you're wasting your 216 00:12:58,720 --> 00:13:05,830 time, if your heuristic represents information that is 217 00:13:05,830 --> 00:13:13,480 bogus or says this particular path has more cost associated 218 00:13:13,480 --> 00:13:17,460 with it than it actually does, then it will lead you astray. 219 00:13:17,460 --> 00:13:20,320 Or you don't want to use a heuristic that will prevent 220 00:13:20,320 --> 00:13:24,040 you from using a path that actually costs less than the 221 00:13:24,040 --> 00:13:26,990 heuristic advertises. 222 00:13:26,990 --> 00:13:28,880 This is what's known as an admissible heuristic. 223 00:13:28,880 --> 00:13:33,910 An admissible heuristic always underestimates if it makes an 224 00:13:33,910 --> 00:13:38,900 error in estimation about your total distance to the goal. 225 00:13:38,900 --> 00:13:43,740 226 00:13:43,740 --> 00:13:44,040 All right. 227 00:13:44,040 --> 00:13:45,550 So at this point, I've covered dynamic programming. 228 00:13:45,550 --> 00:13:46,520 I've covered costs. 229 00:13:46,520 --> 00:13:48,130 I've covered heuristics. 230 00:13:48,130 --> 00:13:49,810 And it turns out, you can use all of these 231 00:13:49,810 --> 00:13:53,010 techniques at the same time. 232 00:13:53,010 --> 00:13:58,800 When you use both cost and heuristics, in combination, 233 00:13:58,800 --> 00:14:03,010 while evaluating your priority queue, that's known as an 234 00:14:03,010 --> 00:14:04,910 A-Star search. 235 00:14:04,910 --> 00:14:07,200 And you'll see a decent amount of literature on A-Star. 236 00:14:07,200 --> 00:14:10,300 237 00:14:10,300 --> 00:14:14,100 This covers all of the intelligent improvements to 238 00:14:14,100 --> 00:14:17,610 basic search that I will talk about in this course. 239 00:14:17,610 --> 00:14:19,850 We hope you enjoyed 6.01. 240 00:14:19,850 --> 00:14:20,628