The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

JOHN GUTTAG: We ended the last lecture looking at greedy algorithms. Today I want to discuss the pros and cons of greedy. Oh, I should mention -- in response to popular demand, I have put the PowerPoint up, so if you download the ZIP file, you'll find the questions, including question 1, the first question, plus the code, plus the PowerPoint. We actually do read Piazza, and sometimes, at least, pay attention. We should pay attention all the time.

So what are the pros and cons of greedy? The pro -- and it's a big pro -- is that it's really easy to implement, as you could see. Also enormously important -- it's really fast. We looked at the complexity last time -- it was n log n -- quite quick.
The downside -- and this can be either a big problem or not a big problem -- is that it doesn't actually solve the problem, in the sense that we've asked ourselves to optimize something, and we get a solution that may or may not be optimal. Worse -- we don't even know, in this case, how close to optimal it is. Maybe it's almost optimal, but maybe it's really far away. And that's a big problem with many greedy algorithms. There are some very sophisticated greedy algorithms we won't be looking at that give you a bound on how good the approximation is, but most of them don't do that.

Last time we looked at an alternative to a greedy algorithm that was guaranteed to find the right solution. It was a brute-force algorithm. The basic idea is simple -- you enumerate all possible combinations of items, remove the combinations whose total weight exceeds the allowable weight, and then choose the winner from those that are remaining.

Now let's talk about how to implement it. And the way I want to implement it is using something called a search tree. There are lots of different ways to implement it.
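That enumerate-filter-choose recipe can be sketched directly in Python (the function name and the (value, weight) tuple representation here are illustrative, not the lecture's code):

```python
from itertools import combinations

def brute_force_knapsack(items, max_weight):
    """Enumerate every combination of (value, weight) items,
    discard combinations over the weight limit, and return
    the best (value, combination) pair that remains."""
    best_value, best_combo = 0, ()
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            weight = sum(w for _, w in combo)
            value = sum(v for v, _ in combo)
            if weight <= max_weight and value > best_value:
                best_value, best_combo = value, combo
    return best_value, best_combo
```

Since this walks the entire power set of the items, it does 2 to the n iterations -- the same exponential cost the search-tree version will turn out to have.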
In the second half of today's lecture, you'll see why I happened to choose this particular approach. So what is a search tree? A tree is, basically, a kind of graph. And we'll hear much more about graphs next week. But this is a simple form where you have a root and then children of the root. In this particular kind of search tree, each node has two children.

So we start with the root. And then we look at our list of elements to be considered, the ones we might take, and we look at the first element in that list. And then we draw a left branch, which shows the consequence of choosing to take that element, and a right branch, which shows the consequences of not taking that element. And then we consider the second element, and so on and so forth, until we get to the bottom of the tree. So by convention, the left direction will mean we took it, and the right direction will mean we didn't take it.

And then we apply it recursively to the non-leaf children. A leaf means we've gotten to the end -- we've considered the last element to be considered. Nothing else to think about.
When we get to the code, we'll see that, in addition to the description being recursive, it's convenient to write the code that way, too. And then finally, we'll choose the node that has the highest value that meets our constraints.

So let's look at an example. My example is: I have my backpack, which can hold a certain number of calories, if you will. And I'm choosing between, to keep it small, a beer, a pizza, and a burger -- three essential food groups.

The first thing I explore on the left is to take the beer, and then I have the pizza and the burger to continue to consider. I then say, all right, let's take the pizza. Now I have just the burger. Now I take the burger. This traversal -- this generation of the tree -- is called leftmost depth-first. So I go all the way down to the bottom of the tree. I then back up a level and say, all right, I'm now at the bottom. Let's go back and see what happens if I make the other choice one level up the tree. So I went up and said, well, now let's see what happens if I make a different decision -- as in, we didn't take the burger.
And then I work my way -- this is called backtracking -- up another level. I now say, suppose I didn't take the piece of pizza. Now I have the beer, and only the burger to think about, and so on and so forth, until I've generated the whole tree. You'll notice it will always be the case that the leftmost leaf of this tree has got all the possible items in it, and the rightmost leaf none. And then I just check which of these leaves meets the constraint and what their values are. And if I compute the value and the calories in each one, and if our constraint was 750 calories, then I get to choose the winner, which is -- I guess it's the pizza and the burger. Is that right? The most value under 750.

That's the way I go through it. It's quite a straightforward algorithm. And I don't know why we draw our trees with the root at the top and the leaves at the bottom. My only conjecture is computer scientists don't spend enough time outdoors.

Now let's think about the computational complexity of this process.
The time is going to be based on the total number of nodes we generate. So if we know the number of nodes that are in the tree, we then know the complexity of the algorithm -- the asymptotic complexity.

Well, how many levels do we have in the tree? Just the number of items, right? Because at each level of the tree we're deciding to take or not to take an item, and we can only do that for the number of items we have. So if we go back, for example, and we look at the tree -- not that tree, that tree -- and we count the number of levels, it's going to be based upon the total number of items. We know that because if you look at, say, the leftmost node at the bottom, we've made three separate decisions. So counting the root, it's n plus 1. But we don't care about the plus 1 when we're doing asymptotic complexity.

So that tells us how many levels we have in the tree. The next question we need to ask is, how many nodes are there at each level? And you can look at this and see -- the deeper we go, the more nodes we have at each level.
In fact, if we come here, we can see that the number of nodes at level i -- depth i of the tree -- is 2 to the i. That makes sense if you remember that last time we looked at binary numbers. We're representing our choices as either 0 or 1 for what we take. If we have n items to choose from, then the number of possible choices is 2 to the n, the size of the power set. So that tells us the number of nodes at each level.

So if there are n items, the number of nodes in the tree is going to be the sum from i equals 0 to n of 2 to the i, because we have that many levels. And if you've studied a little math, you know that's exactly 2 to the n plus 1, minus 1. Or if you do what I do, you look it up in Wikipedia.

Now, there's an obvious optimization. We don't need to explore the whole tree. If we get to a point where the backpack is overstuffed, there's no point in saying, should we take this next item? Because we know we can't. I generated a bunch of leaves that were useless because the weight was too high.
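You can spot-check that geometric-series identity with a couple of lines of Python:

```python
def num_nodes(n):
    """Total nodes in the full binary search tree over n items:
    2**i nodes at each level i, for levels 0 through n."""
    return sum(2**i for i in range(n + 1))

# The closed form is 2**(n + 1) - 1, which is Theta(2**n).
```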
So you could always abort early and say, oh, no point in generating the rest of this part of the tree, because we know everything in it will be too heavy -- adding something cannot reduce the weight. It's a nice optimization, and it's one you'll see we actually do in the code. But it really doesn't change the complexity. It's not going to change the worst-case complexity.

Exponential, as we saw, I think, in Eric's lecture, is a big number. You don't usually like 2 to the n. Does this mean that brute force is never useful? Well, let's give it a try. We'll look at some code.

Here is the implementation. So it's maxVal, with parameters toConsider and avail. And then we say, if toConsider is empty or avail is 0 -- avail is the weight still available; we're going to go through the list, and toConsider tells us whether or not we still have an element to consider -- then the result will be the tuple of 0 and the empty tuple. We couldn't take anything. This is the base case of our recursion.
Either there's nothing left to consider or there's no available weight -- avail, the amount of weight, is 0, or toConsider is empty. If neither of those is true, then we ask whether toConsider sub 0, the first element to look at, has a cost greater than avail. If it does, we don't need to explore the left branch, because it means we can't afford to put that thing in the backpack -- the knapsack. There's just no room for it. So we'll explore the right branch only. The result will be whatever the maximum value is of the remainder of the list -- the list with the first element sliced off -- with avail unchanged. So it's a recursive implementation, saying, now we only have to consider the right branch of the tree, because we knew we couldn't take this element. It just weighs too much, or costs too much, or was too fattening, in my case.

Otherwise, we now have to consider both branches. So we'll set nextItem to toConsider sub 0, the first one, and explore the left branch.
On this branch, there are two possibilities to think about, which I'm calling withVal and withToTake. So I'm going to call maxVal with everything except the current element, and pass in an available weight of avail minus -- well, let me widen this so we can see the whole code. This is not going to let me widen this window any more. Shame on it. Let me see if I can get rid of the console. Well, we'll have to do this instead.

So we're going to call maxVal with everything except the current element, and give it avail minus the cost of that next item, toConsider sub 0, because we know that the available weight has to have that cost subtracted from it. And then we'll add nextItem.getValue() to withVal. So that's the value if we do take it. Then we'll explore the right branch -- what happens if we don't take it? And then we'll choose the better branch.

So it's a pretty simple recursive algorithm. We just go all the way to the bottom, make the right choice at the bottom, and then percolate back up, like so many recursive algorithms.

We have a simple program to test it.
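Putting the walkthrough above together, the recursive function looks roughly like this (a reconstruction consistent with the description; the minimal Food class here is an assumption standing in for the course's item class, which exposes getValue and getCost):

```python
class Food:
    """Minimal stand-in for the course's menu-item class."""
    def __init__(self, name, value, cost):
        self.name, self.value, self.cost = name, value, cost
    def getValue(self):
        return self.value
    def getCost(self):
        return self.cost

def maxVal(toConsider, avail):
    """Return (total value, tuple of items) for the best choice of
    items from toConsider that fits in avail units of weight."""
    if toConsider == [] or avail == 0:
        # Base case: nothing left to consider, or no room left
        result = (0, ())
    elif toConsider[0].getCost() > avail:
        # Can't afford the first item -- explore the right branch only
        result = maxVal(toConsider[1:], avail)
    else:
        nextItem = toConsider[0]
        # Left branch: take nextItem
        withVal, withToTake = maxVal(toConsider[1:],
                                     avail - nextItem.getCost())
        withVal += nextItem.getValue()
        # Right branch: don't take nextItem
        withoutVal, withoutToTake = maxVal(toConsider[1:], avail)
        # Choose the better branch
        if withVal > withoutVal:
            result = (withVal, withToTake + (nextItem,))
        else:
            result = (withoutVal, withoutToTake)
    return result
```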
I'd better start a console now if I'm going to run it. And we'll run testGreedys on foods -- well, we'll run testGreedys, and then we'll run testMaxVal. So I'm building the same thing we did in Monday's lecture, the same menu. And I'll run the same testGreedys we looked at last time. And we'll see whether or not we get something better when we run the truly optimal one.

Well, indeed we do. You remember that last time -- and, fortunately, this time too -- the best we did was a value of 318. But now we see we can actually get to 353 if we use the truly optimal algorithm. So we see it ran pretty quickly, and it actually gave us a better answer than we got from the greedy algorithm. And that's often the case. If I have time at the end, I'll show you an optimization problem that works perfectly fine with this kind of brute-force algorithm.

Let's go back to the PowerPoint. So I'm just going through the code we just ran. This was the header we saw -- toConsider, the items that correspond to nodes higher up in the tree, and avail, as I said, the amount of space.
And again, here's what the body of the code looked like; I took out the comments. One of the things you might think about when you look at this code is putting the comments back in. I always find that, for me, a really good way to understand code I didn't write is to try and comment it. That forces me to think about what it's really doing. So you'll have both versions -- the PowerPoint version without the comments and the actual code with the comments. You can think about looking at this, and then looking at the real code, and making sure that your understanding jibes.

I should point out that this doesn't actually build the search tree. We've got this local variable, result, starting here, that records the best solution found so far. So it's not the picture I drew, where I generate all the nodes and then inspect them. I just keep track -- as I generate a node, I say, how good is this? Is it better than the best I've found so far? If so, it becomes the new best.
And I can do that because every node I generate is, in some sense, a legal solution to the problem. It's probably rarely the final optimal solution, but it's at least a legal solution. And so if it's better than something we saw before, we can make it the new best. This is very common. And this is, in fact, what most people do when they use a search tree -- they don't actually build the tree in the pictorial way we've looked at it, but play some trick like this of just keeping track of their results.

Any questions about this? All right.

We did just try it on the example from Lecture 1, and we saw that it worked great. It gave us a better answer. It finished quickly. But we should not take too much solace from the fact that it finished quickly, because 2 to the eighth is actually a pretty tiny number. Almost any algorithm is fine when I'm working on something this small. Let's look now at what happens if we have a bigger menu. Here is some code to do a bigger menu.
Since, as you will discover if you haven't already, I'm a pretty lazy person, I didn't want to write out a menu with 100 items, or even 50 items. So I wrote some code to generate the menus, and I used randomness to do that. random is a Python library we'll be using a lot for the rest of the semester. It's used any time you want to generate things at random, and it does many other things; we'll come back to it a lot. Here we're just going to use a very small part of it.

To build a large menu of some numItems, we're going to give the maximum value and the maximum cost for each item. We'll assume the minimum is, in this case, 1. items starts empty. And then, for i in range(numItems), I'm going to call the function random.randint, which takes a range of integers -- from 1 to maxVal, in this case -- and chooses one of them at random. So when you run this, you don't know what you're going to get. random.randint might return 1, it might return 23, it might return 54. The only thing you know is that it will be an integer.
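That generator can be sketched like so (the namedtuple is a stand-in here for the course's menu-item class, which carries a name, a value, and a cost):

```python
import random
from collections import namedtuple

# Stand-in for the course's menu-item class.
Food = namedtuple('Food', ['name', 'value', 'cost'])

def buildLargeMenu(numItems, maxVal, maxCost):
    """Build numItems random menu items, each with an integer
    value in [1, maxVal] and an integer cost in [1, maxCost]."""
    items = []
    for i in range(numItems):
        items.append(Food(str(i),
                          random.randint(1, maxVal),
                          random.randint(1, maxCost)))
    return items
```

Note that random.randint is inclusive on both ends, so every value lands in [1, maxVal] and every cost in [1, maxCost].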
And then I'm going to build menus ranging from 5 items to 60 items -- buildLargeMenu of the number of items, with a maxVal of 90 and a maxCost of 250 -- pleasure and calories. And then I'm going to test maxVal on each of these menus. So I'm building menus of various sizes at random, and then just trying to find the optimal value for each of them. Let's look at the code.

Let's comment this out; we don't need to run that again. So we'll build a large menu, and then we'll try it for a bunch of items and see what we get.

So it's going along. Trying the menus up to 30 items went pretty quickly. So even 2 to the 30 didn't take too long. But you might notice it's kind of bogging down now that we've got to 35. I guess I could ask the question now -- it was one of the questions I was going to ask as a poll, but maybe I won't bother -- how much patience do we have? When do you think we'll run out of patience and quit? If you're out of patience, raise your hand.

Well, some of you are way more patient than I am. So we're going to quit anyway. We were trying to do 40. It might have finished 40. 45? I've never waited long enough to get to 45.
It just takes too long. That raises the question, is it hopeless? And in theory, yes. As I mentioned last time, it is an inherently exponential problem. But in practice, no, because there's something called dynamic programming, which was invented by a fellow at the RAND Corporation called Richard Bellman, a rather remarkable mathematician and computer scientist. He wrote a whole book on it, but I'm not sure why, because it's not that complicated.

When we talk about dynamic programming, there's a kind of funny story, at least to me. I learned it without knowing anything about the history of it, and I had all sorts of theories about why it was called dynamic programming. You know how it is -- how people try and fit a theory to data. And then I read a history book about it, and this was Bellman's own description of why he called it dynamic programming. And it turned out, as you can see, he basically chose the words because they didn't mean anything.
He was doing mathematics, and at the time he was being funded by a part of the Defense Department that didn't approve of mathematics, and he wanted to conceal that fact. And indeed, at the time, the head of defense appropriations in the US Congress didn't much like mathematics. Bellman didn't want to have to go and testify and tell people he was doing math. So he just invented a name that no one would know the meaning of. And students spent years afterwards trying to figure out what it actually did mean.

Anyway, what's the basic idea? To understand it, I want to temporarily abandon the knapsack problem and look at a much simpler problem -- Fibonacci numbers. You've seen this already -- with cute little bunnies, I think, when you saw it. If n equals 0 or n equals 1, return 1. Otherwise, return fib of n minus 1 plus fib of n minus 2. And as I think you saw when you first saw it, it takes a long time to run. Fib of 120, for example, is a very big number. It's shocking how quickly Fibonacci grows.

So let's think about implementing it.
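The recurrence just stated translates directly into the naive recursive version (this convention returns 1 for both fib(0) and fib(1), matching the lecture):

```python
def fib(n):
    """Naive recursive Fibonacci: fib(0) == fib(1) == 1."""
    if n == 0 or n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)
```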
430 00:26:20,890 --> 00:26:22,459 If we run Fibonacci-- 431 00:26:22,459 --> 00:26:23,750 well, maybe we'll just do that. 432 00:26:37,030 --> 00:26:39,464 So here is fib of n, let's just try running it. 433 00:26:39,464 --> 00:26:41,130 And again, we'll test people's patience. 434 00:26:54,140 --> 00:26:55,970 We'll see how long we're letting it run. 435 00:26:55,970 --> 00:26:59,240 I'm going to try for i in the range of 121. 436 00:26:59,240 --> 00:27:02,433 We'll print fib of i. 437 00:27:09,840 --> 00:27:11,205 Comes clumping along. 438 00:27:14,170 --> 00:27:16,970 It slows down pretty quickly. 439 00:27:16,970 --> 00:27:18,910 And if you look at it, it's kind of surprising 440 00:27:18,910 --> 00:27:21,190 it's this slow because these numbers aren't that big. 441 00:27:24,090 --> 00:27:25,800 These are not enormous numbers. 442 00:27:25,800 --> 00:27:28,320 Fib of 35 is not a huge number. 443 00:27:28,320 --> 00:27:32,120 Yet it took a long time to compute. 444 00:27:32,120 --> 00:27:34,280 So you have the numbers growing pretty quickly 445 00:27:34,280 --> 00:27:38,150 but the computation, actually, seems to be growing faster 446 00:27:38,150 --> 00:27:40,540 than the results. 447 00:27:40,540 --> 00:27:41,160 We're at 37. 448 00:27:44,380 --> 00:27:48,250 It's going to get slower and slower, even though our numbers 449 00:27:48,250 --> 00:27:51,650 are not that big. 450 00:27:51,650 --> 00:27:53,870 The question is, what's going on? 451 00:27:53,870 --> 00:27:57,440 Why is it taking so long for Fibonacci 452 00:27:57,440 --> 00:28:00,790 to compute these results? 453 00:28:00,790 --> 00:28:13,580 Well, let's stop it and look at the question. 454 00:28:13,580 --> 00:28:18,560 And to do that I want to look at the call tree. 455 00:28:18,560 --> 00:28:23,360 This is for Fibonacci of 6, which is only 13, 456 00:28:23,360 --> 00:28:25,630 which, I think, most of us would agree 457 00:28:25,630 --> 00:28:27,980 was not a very big number.
458 00:28:27,980 --> 00:28:30,200 And let's look at what's going on here. 459 00:28:33,510 --> 00:28:35,750 If you look at this, what in some sense 460 00:28:35,750 --> 00:28:39,410 seems really stupid about it? 461 00:28:39,410 --> 00:28:44,120 What is it doing that a rational person would not want 462 00:28:44,120 --> 00:28:45,350 to do if they could avoid it? 463 00:28:52,740 --> 00:28:55,320 It's bad enough to do something once. 464 00:28:55,320 --> 00:28:57,780 But to do the same thing over and over again 465 00:28:57,780 --> 00:29:00,990 is really wasteful. 466 00:29:00,990 --> 00:29:04,020 And if we look at this, we'll see, for example, 467 00:29:04,020 --> 00:29:07,440 that fib 4 is being computed here, 468 00:29:07,440 --> 00:29:11,350 and fib 4 is being computed here. 469 00:29:11,350 --> 00:29:16,015 Fib 3 is being computed here, and here, and here. 470 00:29:19,190 --> 00:29:21,980 And do you think we'll get a different answer for fib 3 471 00:29:21,980 --> 00:29:24,480 in one place than we get in the other place? 472 00:29:24,480 --> 00:29:27,230 You sure hope not. 473 00:29:27,230 --> 00:29:33,160 So you think, well, what should we do about this? 474 00:29:33,160 --> 00:29:36,690 How would we go about avoiding doing the same work over 475 00:29:36,690 --> 00:29:38,540 and over again? 476 00:29:38,540 --> 00:29:40,160 And there's kind of an obvious answer, 477 00:29:40,160 --> 00:29:43,600 and that answer is at the heart of dynamic programming. 478 00:29:43,600 --> 00:29:46,760 What's the answer? 479 00:29:46,760 --> 00:29:50,040 AUDIENCE: [INAUDIBLE] 480 00:29:50,040 --> 00:29:51,467 JOHN GUTTAG: Exactly. 481 00:29:51,467 --> 00:29:53,550 And I'm really happy that someone in the front row 482 00:29:53,550 --> 00:29:57,990 answered the question because I can throw it that far. 483 00:29:57,990 --> 00:30:03,660 You store the answer and then look it up when you need it.
484 00:30:03,660 --> 00:30:06,580 Because we know that we can look things up very quickly. 485 00:30:09,100 --> 00:30:12,950 Dictionary, despite what Eric said in his lecture, 486 00:30:12,950 --> 00:30:17,510 almost all the time works in constant time 487 00:30:17,510 --> 00:30:20,960 if you make it big enough, and it usually is in Python. 488 00:30:20,960 --> 00:30:25,280 We'll see later in the term how to do that trick. 489 00:30:25,280 --> 00:30:30,520 So you store it and then you'd never have to compute it again. 490 00:30:30,520 --> 00:30:34,900 And that's the basic trick behind dynamic programming. 491 00:30:34,900 --> 00:30:41,940 And it's something called memoization, 492 00:30:41,940 --> 00:30:44,890 as in you create a memo and you store it in the memo. 493 00:30:48,400 --> 00:30:50,620 So we see this here. 494 00:30:50,620 --> 00:30:56,150 Notice that what we're doing is trading time for space. 495 00:30:56,150 --> 00:31:05,300 It takes some space to store the old results, but negligible 496 00:31:05,300 --> 00:31:08,750 relative to the time we save. 497 00:31:08,750 --> 00:31:10,130 So here's the trick. 498 00:31:10,130 --> 00:31:13,530 We're going to create a table to record what we've done. 499 00:31:13,530 --> 00:31:16,380 And then before computing fib of x, 500 00:31:16,380 --> 00:31:20,370 we'll check if the value has already been computed. 501 00:31:20,370 --> 00:31:22,920 If so, we just look it up and return it. 502 00:31:22,920 --> 00:31:25,000 Otherwise, we'll compute it-- 503 00:31:25,000 --> 00:31:27,000 it's the first time-- and store it in the table. 504 00:31:31,560 --> 00:31:36,190 Here is a fast implementation of Fibonacci that does that. 505 00:31:36,190 --> 00:31:38,980 It looks like the old one, except it's 506 00:31:38,980 --> 00:31:41,410 got an extra argument-- 507 00:31:41,410 --> 00:31:45,350 memo-- which is a dictionary. 508 00:31:45,350 --> 00:31:47,960 The first time we call it, the memo will be empty.
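In outline, the implementation being walked through looks something like this-- a sketch of the same try-except idea, not necessarily character-for-character the handout code:

```python
def fastFib(n, memo={}):
    """Fibonacci with a memo mapping previously computed
    arguments to their results. (Sharing the default dict
    across calls is harmless here, since fib(n) never changes.)"""
    if n == 0 or n == 1:
        return 1
    try:
        # Try to return the value already stored in the memo.
        return memo[n]
    except KeyError:
        # First time we've seen n: compute it, store it, return it.
        result = fastFib(n - 1, memo) + fastFib(n - 2, memo)
        memo[n] = result
        return result
```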
509 00:31:52,320 --> 00:31:57,240 It tries to return the value in the memo. 510 00:31:57,240 --> 00:32:01,080 If it's not there, an exception will get raised, we know that. 511 00:32:01,080 --> 00:32:05,180 And it will branch to here, compute the result, 512 00:32:05,180 --> 00:32:11,190 and then store it in the memo and return it. 513 00:32:11,190 --> 00:32:13,260 It's the same old recursive thing 514 00:32:13,260 --> 00:32:16,920 we did before but with the memo. 515 00:32:16,920 --> 00:32:20,190 Notice, by the way, that I'm using exceptions 516 00:32:20,190 --> 00:32:22,350 not as an error handling mechanism, 517 00:32:22,350 --> 00:32:26,270 really, but just as a flow of control. 518 00:32:26,270 --> 00:32:29,510 To me, this is cleaner than writing code that says, 519 00:32:29,510 --> 00:32:34,440 if this is in the keys, then do this, otherwise, do that. 520 00:32:34,440 --> 00:32:37,440 It's slightly fewer lines of code, and for me, at least, 521 00:32:37,440 --> 00:32:41,050 easier to read, using try-except for this sort of thing. 522 00:32:44,690 --> 00:32:48,930 Let's see what happens if we run this one. 523 00:32:48,930 --> 00:33:02,810 Get rid of the slow fib and we'll run fastFib. 524 00:33:17,780 --> 00:33:20,240 Wow. 525 00:33:20,240 --> 00:33:25,960 We're already done with fib 120. 526 00:33:25,960 --> 00:33:28,980 Pretty amazing, considering last time we got stuck around 40. 527 00:33:31,890 --> 00:33:35,700 It really works, this memoization trick. 528 00:33:35,700 --> 00:33:37,440 An enormous difference. 529 00:33:47,940 --> 00:33:49,560 When can you use it? 530 00:33:49,560 --> 00:33:53,010 It's not that memoization is a magic bullet that 531 00:33:53,010 --> 00:33:54,480 will solve all problems. 532 00:33:58,710 --> 00:34:01,800 But for the problems it can help with, it really 533 00:34:01,800 --> 00:34:02,970 is the right thing.
534 00:34:02,970 --> 00:34:06,780 And by the way, as we'll see, it finds an optimal solution, not 535 00:34:06,780 --> 00:34:10,020 an approximation. 536 00:34:10,020 --> 00:34:13,920 The problems it helps with have two properties, called optimal substructure 537 00:34:13,920 --> 00:34:16,620 and overlapping subproblems. 538 00:34:16,620 --> 00:34:19,350 What do these mean? 539 00:34:19,350 --> 00:34:21,449 We have optimal substructure when 540 00:34:21,449 --> 00:34:23,550 a globally optimal solution can be 541 00:34:23,550 --> 00:34:31,650 found by combining optimal solutions to local subproblems. 542 00:34:31,650 --> 00:34:35,130 So for example, when x is greater than 1 543 00:34:35,130 --> 00:34:42,900 we can solve fib x by solving fib x minus 1 and fib x minus 2 544 00:34:42,900 --> 00:34:47,080 and adding those two things together. 545 00:34:47,080 --> 00:34:50,130 So there is optimal substructure-- 546 00:34:50,130 --> 00:34:53,650 you solve these two smaller problems independently 547 00:34:53,650 --> 00:34:58,060 of each other and then combine the solutions in a fast way. 548 00:35:03,750 --> 00:35:09,490 You also have to have something called overlapping subproblems. 549 00:35:09,490 --> 00:35:11,840 This is why the memo worked. 550 00:35:11,840 --> 00:35:14,570 Finding an optimal solution has to involve 551 00:35:14,570 --> 00:35:19,200 solving the same problem multiple times. 552 00:35:19,200 --> 00:35:21,090 Even if you have optimal substructure, 553 00:35:21,090 --> 00:35:24,570 if you don't see the same problem more than once, 554 00:35:24,570 --> 00:35:25,760 then creating a memo-- 555 00:35:25,760 --> 00:35:28,950 well, it'll work; you can still create the memo. 556 00:35:28,950 --> 00:35:30,790 You'll just never find anything in it 557 00:35:30,790 --> 00:35:32,730 when you look things up because you're 558 00:35:32,730 --> 00:35:34,275 solving each problem once.
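The overlap can be made concrete by counting how many times the naive recursion computes each argument-- a small sketch (fib_counted and the counts tally are illustrative, not lecture code):

```python
from collections import Counter

def fib_counted(n, counts):
    """Naive Fibonacci that also tallies how many times
    each argument n gets computed."""
    counts[n] += 1
    if n == 0 or n == 1:
        return 1
    return fib_counted(n - 1, counts) + fib_counted(n - 2, counts)

counts = Counter()
fib_counted(6, counts)
# In the fib(6) call tree: fib(4) is computed twice, fib(3) three
# times, and there are 25 calls in all, for a result of only 13.
print(counts[4], counts[3], sum(counts.values()))  # → 2 3 25
```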
559 00:35:36,810 --> 00:35:41,090 So you have to be solving the same problem multiple times 560 00:35:41,090 --> 00:35:45,090 and you have to be able to solve it by combining solutions 561 00:35:45,090 --> 00:35:45,975 to smaller problems. 562 00:35:48,940 --> 00:35:51,780 Now, we've seen things with optimal substructure before. 563 00:35:54,920 --> 00:35:58,250 In some sense, merge sort worked that way-- 564 00:35:58,250 --> 00:36:00,770 we were combining separate problems. 565 00:36:00,770 --> 00:36:03,260 Did merge sort have overlapping subproblems? 566 00:36:05,930 --> 00:36:09,980 No, because-- well, I guess, it might 567 00:36:09,980 --> 00:36:13,550 have if the list had the same element many, many times. 568 00:36:13,550 --> 00:36:17,450 But we would expect, mostly not. 569 00:36:17,450 --> 00:36:19,940 Because each time we're solving a different problem, 570 00:36:19,940 --> 00:36:21,860 because we have different lists that we're now 571 00:36:21,860 --> 00:36:24,320 sorting and merging. 572 00:36:24,320 --> 00:36:27,560 So it has one of the properties but not the other. 573 00:36:27,560 --> 00:36:31,520 Dynamic programming will not help us for sorting; it 574 00:36:31,520 --> 00:36:34,620 cannot be used to improve merge sort. 575 00:36:34,620 --> 00:36:37,770 Oh, well, nothing is a silver bullet. 576 00:36:40,510 --> 00:36:43,980 What about the knapsack problem? 577 00:36:43,980 --> 00:36:46,840 Does it have these two properties? 578 00:36:50,640 --> 00:36:55,210 We can look at it in terms of these pictures. 579 00:36:55,210 --> 00:36:58,990 And it's pretty clear that it does have optimal substructure 580 00:36:58,990 --> 00:37:02,320 because we're taking the left branch and the right branch 581 00:37:02,320 --> 00:37:03,400 and choosing the winner. 582 00:37:06,210 --> 00:37:10,490 But what about overlapping subproblems? 583 00:37:10,490 --> 00:37:13,480 Are we ever solving, in this case, the same problem-- 584 00:37:16,330 --> 00:37:17,120 at two nodes?
585 00:37:21,480 --> 00:37:23,910 Well, do any of these nodes look identical? 586 00:37:28,430 --> 00:37:30,860 In this case, no. 587 00:37:30,860 --> 00:37:34,250 We could write a dynamic programming solution 588 00:37:34,250 --> 00:37:35,580 to the knapsack problem-- 589 00:37:35,580 --> 00:37:40,170 and we will-- and run it on this example, 590 00:37:40,170 --> 00:37:42,060 and we'd get the right answer. 591 00:37:42,060 --> 00:37:45,130 We would get zero speedup. 592 00:37:45,130 --> 00:37:46,960 Because at each node, if you can see, 593 00:37:46,960 --> 00:37:49,360 the problems are different. 594 00:37:49,360 --> 00:37:53,310 We have different things in the knapsack or different things 595 00:37:53,310 --> 00:37:54,630 to consider. 596 00:37:54,630 --> 00:37:57,215 Never do we have the same contents and the same things 597 00:37:57,215 --> 00:37:57,840 left to decide. 598 00:38:01,770 --> 00:38:04,800 So "maybe" was not a bad answer if that was the answer 599 00:38:04,800 --> 00:38:05,940 you gave to this question. 600 00:38:08,550 --> 00:38:11,950 But let's look at a different menu. 601 00:38:11,950 --> 00:38:15,010 This menu happens to have two beers in it. 602 00:38:19,110 --> 00:38:22,680 Now, if we look at what happens, do 603 00:38:22,680 --> 00:38:25,820 we see two nodes that are solving the same problem? 604 00:38:31,790 --> 00:38:34,640 The answer is what? 605 00:38:34,640 --> 00:38:35,510 Yes or no? 606 00:38:43,200 --> 00:38:45,480 I haven't drawn the whole tree here. 607 00:38:45,480 --> 00:38:49,440 Well, you'll notice the answer is yes. 608 00:38:49,440 --> 00:38:56,830 This node and this node are solving the same problem. 609 00:38:56,830 --> 00:38:58,060 Why is it? 610 00:38:58,060 --> 00:39:02,130 Well, in this node, we took this beer 611 00:39:02,130 --> 00:39:05,270 and still had this one to consider. 
612 00:39:05,270 --> 00:39:10,250 But in this node, we took that beer 613 00:39:10,250 --> 00:39:12,920 but it doesn't matter which beer we took. 614 00:39:12,920 --> 00:39:17,750 We still have a beer in the knapsack and a burger 615 00:39:17,750 --> 00:39:20,660 and a slice to consider. 616 00:39:20,660 --> 00:39:24,070 So we got there different ways, by choosing different beers, 617 00:39:24,070 --> 00:39:27,770 but we're in the same place. 618 00:39:27,770 --> 00:39:30,880 So in fact, we actually, in this case, 619 00:39:30,880 --> 00:39:37,480 do have the same problem to solve more than once. 620 00:39:37,480 --> 00:39:42,940 Now, here I had two things that were the same. 621 00:39:42,940 --> 00:39:45,310 That's not really necessary. 622 00:39:45,310 --> 00:39:49,430 Here's another very small example. 623 00:39:49,430 --> 00:39:56,330 And the point I want to make here is shown by this. 624 00:39:56,330 --> 00:39:59,390 So here I have again drawn a search tree. 625 00:39:59,390 --> 00:40:02,600 And I'm showing you this because, in fact, it's exactly 626 00:40:02,600 --> 00:40:07,430 this tree that we'll be producing in our dynamic programming 627 00:40:07,430 --> 00:40:10,040 solution to the knapsack problem. 628 00:40:10,040 --> 00:40:16,220 Each node in the tree starts with what you've taken-- 629 00:40:16,220 --> 00:40:18,500 initially, nothing, the empty set. 630 00:40:18,500 --> 00:40:22,430 Then what's left, the total value, and the remaining calories. 631 00:40:22,430 --> 00:40:24,710 There's some redundancy here, by the way. 632 00:40:24,710 --> 00:40:27,410 If I know what I've taken, I could always compute 633 00:40:27,410 --> 00:40:31,310 the value and what's left. 634 00:40:31,310 --> 00:40:33,866 But this is just so it's easier to see. 635 00:40:33,866 --> 00:40:35,740 And I've numbered the nodes here in the order 636 00:40:35,740 --> 00:40:37,130 in which they get generated.
637 00:40:40,240 --> 00:40:44,650 Now, the thing that I want you to notice 638 00:40:44,650 --> 00:40:49,420 is, when we ask whether we're solving the same problem, 639 00:40:49,420 --> 00:40:56,060 we don't actually care what we've taken. 640 00:40:56,060 --> 00:41:00,740 We don't even care about the value. 641 00:41:00,740 --> 00:41:08,680 All we care about is how much room we have left in the knapsack 642 00:41:08,680 --> 00:41:11,515 and which items we have left to consider. 643 00:41:14,280 --> 00:41:20,490 Because what I take next or what I take remaining really 644 00:41:20,490 --> 00:41:23,220 has nothing to do with how much value I already have 645 00:41:23,220 --> 00:41:27,420 because I'm trying to maximize the value that's left, 646 00:41:27,420 --> 00:41:30,600 independent of previous things done. 647 00:41:30,600 --> 00:41:36,210 Similarly, I don't care why I have 100 calories left. 648 00:41:36,210 --> 00:41:39,490 Whether I used it up on beers or a burger doesn't matter. 649 00:41:39,490 --> 00:41:44,570 All that matters is that I just have 100 left. 650 00:41:44,570 --> 00:41:49,910 So we can see that in a large, complicated problem there could easily 651 00:41:49,910 --> 00:41:53,390 be a situation where different choices of what to take 652 00:41:53,390 --> 00:41:57,620 and what to not take would leave you in a situation 653 00:41:57,620 --> 00:41:59,835 where you have the same number of remaining calories. 654 00:42:02,670 --> 00:42:05,700 And therefore you are solving a problem you've already solved. 655 00:42:12,220 --> 00:42:15,490 At each node, we're just given the remaining weight, 656 00:42:15,490 --> 00:42:19,540 and we maximize the value by choosing among the remaining items. 657 00:42:19,540 --> 00:42:20,710 That's all that matters. 658 00:42:23,310 --> 00:42:26,780 And so indeed, you will have overlapping subproblems.
659 00:42:29,320 --> 00:42:33,690 As we see in this tree, for the example we just saw, 660 00:42:33,690 --> 00:42:36,240 the box is around a place where we're actually 661 00:42:36,240 --> 00:42:39,900 solving the same problem, even though we've 662 00:42:39,900 --> 00:42:44,580 made different decisions about what to take, A versus B. 663 00:42:44,580 --> 00:42:46,890 And in fact, we have different amounts of value 664 00:42:46,890 --> 00:42:48,060 in the knapsack-- 665 00:42:48,060 --> 00:42:49,650 6 versus 7. 666 00:42:49,650 --> 00:42:53,770 What matters is we still have C and D to consider 667 00:42:53,770 --> 00:42:56,260 and we have two units left. 668 00:43:03,930 --> 00:43:06,630 It's a small and easy step. 669 00:43:06,630 --> 00:43:08,430 I'm not going to walk you through the code 670 00:43:08,430 --> 00:43:10,860 because it's kind of boring to do so. 671 00:43:10,860 --> 00:43:16,610 How do you modify the maxVal we looked at before to use a memo? 672 00:43:16,610 --> 00:43:19,790 First, you have to add the third argument, which is initially 673 00:43:19,790 --> 00:43:23,610 going to be set to the empty dictionary. 674 00:43:23,610 --> 00:43:26,840 The key of the memo will be a tuple-- 675 00:43:26,840 --> 00:43:32,660 the items left to be considered and the available weight. 676 00:43:32,660 --> 00:43:37,370 Because the items left to be considered are in a list, 677 00:43:37,370 --> 00:43:41,420 we can represent the items left to be considered 678 00:43:41,420 --> 00:43:45,550 by how long the list is. 679 00:43:45,550 --> 00:43:47,710 Because we'll start at the front item and just 680 00:43:47,710 --> 00:43:48,760 work our way to the end. 681 00:43:52,460 --> 00:43:55,040 And then the function works, essentially, 682 00:43:55,040 --> 00:43:57,700 exactly the same way fastFib worked. 683 00:44:03,796 --> 00:44:06,170 I'm not going to run it for you because we're running out 684 00:44:06,170 --> 00:44:08,755 of time. 
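A sketch of the modification just described-- not the course handout code; the Item class and its attribute names here are assumptions for illustration, and the memo is created fresh per top-level call rather than defaulting to a shared empty dictionary:

```python
# A minimal stand-in for the menu-item class used in the
# lecture code; the names (value, calories) are illustrative.
class Item:
    def __init__(self, name, value, calories):
        self.name = name
        self.value = value
        self.calories = calories

def fastMaxVal(toConsider, avail, memo=None):
    """0/1 knapsack with a memo keyed on the tuple
    (number of items left to consider, available weight).
    Returns (total value, tuple of Items taken)."""
    if memo is None:      # fresh memo for each top-level call
        memo = {}
    try:
        return memo[(len(toConsider), avail)]
    except KeyError:
        pass
    if toConsider == [] or avail == 0:
        result = (0, ())
    elif toConsider[0].calories > avail:
        # First item doesn't fit; explore only the right branch.
        result = fastMaxVal(toConsider[1:], avail, memo)
    else:
        nextItem = toConsider[0]
        # Left branch: take the item.
        withVal, withToTake = fastMaxVal(
            toConsider[1:], avail - nextItem.calories, memo)
        withVal += nextItem.value
        # Right branch: don't take it.
        withoutVal, withoutToTake = fastMaxVal(
            toConsider[1:], avail, memo)
        # Choose the better branch.
        if withVal > withoutVal:
            result = (withVal, withToTake + (nextItem,))
        else:
            result = (withoutVal, withoutToTake)
    memo[(len(toConsider), avail)] = result
    return result
```

The key point is the memo key: because we always recurse on toConsider[1:], the items left to consider are always a suffix of the original list, so the length of that suffix is enough to identify them.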
685 00:44:08,755 --> 00:44:10,130 You might want to run it yourself 686 00:44:10,130 --> 00:44:14,120 because it is kind of fun to see how really fast it is. 687 00:44:14,120 --> 00:44:19,570 But more interestingly, we can look at this table. 688 00:44:19,570 --> 00:44:22,810 This column is what we would get with the original recursive 689 00:44:22,810 --> 00:44:26,320 implementation where we didn't use a memo. 690 00:44:26,320 --> 00:44:30,730 And it was therefore 2 to the length of items. 691 00:44:30,730 --> 00:44:34,990 And as you can see, it gets really big 692 00:44:34,990 --> 00:44:37,390 or, as we say at the end, huge. 693 00:44:40,770 --> 00:44:43,950 But the number of calls grows incredibly 694 00:44:43,950 --> 00:44:49,200 slowly for the dynamic programming solution. 695 00:44:49,200 --> 00:44:53,360 In the beginning it's not worth much. Oh, well. 696 00:44:53,360 --> 00:44:58,670 But by the time we get to the last number I wrote, 697 00:44:58,670 --> 00:45:03,290 we're looking at 43,000 versus some really big number 698 00:45:03,290 --> 00:45:06,210 I don't know how to pronounce-- 699 00:45:06,210 --> 00:45:09,410 18 somethings. 700 00:45:09,410 --> 00:45:14,120 Incredible improvement in performance. 701 00:45:14,120 --> 00:45:17,660 And then at the end, it's a number 702 00:45:17,660 --> 00:45:21,080 we couldn't fit on the slide, even in tiny font. 703 00:45:21,080 --> 00:45:25,460 And yet, only 703,000 calls. 704 00:45:25,460 --> 00:45:27,380 How can this be? 705 00:45:27,380 --> 00:45:30,770 We know the problem is inherently exponential. 706 00:45:30,770 --> 00:45:34,050 Have we overturned the laws of the universe? 707 00:45:34,050 --> 00:45:38,860 Is dynamic programming a miracle in the liturgical sense? 708 00:45:38,860 --> 00:45:40,850 No. 709 00:45:40,850 --> 00:45:43,520 But the thing I want you to carry away 710 00:45:43,520 --> 00:45:50,190 is that computational complexity can be a very subtle notion.
711 00:45:50,190 --> 00:45:52,470 The running time of fastMaxVal is 712 00:45:52,470 --> 00:45:55,620 governed by the number of distinct pairs 713 00:45:55,620 --> 00:46:00,690 that we might be able to use as keys in the memo-- 714 00:46:00,690 --> 00:46:03,480 toConsider and available. 715 00:46:03,480 --> 00:46:08,430 The number of possible values of toConsider is small. 716 00:46:08,430 --> 00:46:10,870 It's bounded by the length of the items. 717 00:46:10,870 --> 00:46:16,330 If I have 100 items, it's 0, 1, 2, up to 100. 718 00:46:16,330 --> 00:46:19,300 The number of possible values of the available weight 719 00:46:19,300 --> 00:46:22,340 is harder to characterize. 720 00:46:22,340 --> 00:46:26,030 But it's bounded by the number of distinct sums of weights 721 00:46:26,030 --> 00:46:28,530 you can get. 722 00:46:28,530 --> 00:46:33,590 If I start with 750 calories left, 723 00:46:33,590 --> 00:46:35,100 what are the possibilities? 724 00:46:35,100 --> 00:46:40,330 Well, in fact, in this case, maybe it can take on only 750 values 725 00:46:40,330 --> 00:46:42,620 because we're using whole units. 726 00:46:42,620 --> 00:46:43,520 So it's small. 727 00:46:43,520 --> 00:46:45,940 But it's actually smaller than that because it 728 00:46:45,940 --> 00:46:48,760 has to do with the combinations of ways 729 00:46:48,760 --> 00:46:52,040 I can add up the units I have. 730 00:46:52,040 --> 00:46:53,510 I know this is complicated. 731 00:46:53,510 --> 00:46:56,560 It's not worth my going through the details in the lectures. 732 00:46:56,560 --> 00:47:01,610 It's covered in considerable detail in the assigned reading. 733 00:47:01,610 --> 00:47:03,860 Quickly summarizing lectures 1 and 2, 734 00:47:03,860 --> 00:47:06,320 here's what I want you to take away. 735 00:47:06,320 --> 00:47:08,330 Many problems of practical importance 736 00:47:08,330 --> 00:47:12,082 can be formulated as optimization problems.
737 00:47:12,082 --> 00:47:16,340 Greedy algorithms often provide an adequate though often not 738 00:47:16,340 --> 00:47:18,750 optimal solution. 739 00:47:18,750 --> 00:47:21,870 Even though finding an optimal solution 740 00:47:21,870 --> 00:47:24,630 is, in theory, exponentially hard, 741 00:47:24,630 --> 00:47:29,760 dynamic programming really often yields great results. 742 00:47:29,760 --> 00:47:33,110 It always gives you a correct result, and it sometimes-- 743 00:47:33,110 --> 00:47:37,890 in fact, most of the time-- gives it to you very quickly. 744 00:47:37,890 --> 00:47:39,660 Finally, in the PowerPoint, you'll 745 00:47:39,660 --> 00:47:42,870 find an interesting optimization problem 746 00:47:42,870 --> 00:47:46,000 having to do with whether or not you should roll over problem 747 00:47:46,000 --> 00:47:48,920 set grades into a quiz. 748 00:47:48,920 --> 00:47:51,900 And it's simply a question of solving this optimization 749 00:47:51,900 --> 00:47:53,450 problem.