The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: Good morning, everybody. All right, I ended the last lecture talking about how to calculate the absolute goodness of fit using something called the coefficient of determination. It's usually written as R-squared, or R². And the formula was quite simple. We measure the goodness of the fit as R-squared equals 1 minus the estimated error divided by the measured variance.

As I observed, R-squared always lies between 0 and 1. If R-squared equals 1, that means that the model we constructed, the predicted values if you will, explains all of the variability in the data, so that any change in the data is explained perfectly by the model. We don't usually get 1. In fact, if I ever got 1, I would think somebody cheated me.

Conversely, if R-squared equals 0, it means there's no linear relationship at all between the values predicted by the model and the actual data. That is to say, the model is totally worthless.

So we have code here, at the top of the screen, showing how easy it is to compute R-squared. And for those of you who have a little trouble interpreting the formula, because maybe you're not quite sure what EE and MV mean, this will give you a very straightforward way to understand it.

So now we can run it. We can get some answers. So if we look at it, you'll remember last time we looked at two different fits. We looked at a quadratic fit and a linear fit for the trajectory of an arrow fired from my bow. And we can now compare the two.

And not surprisingly, given what we know about the physics of projectiles, we see it is exactly what we'd expect: the linear fit has an R-squared of 0.0177, showing that, in fact, it explains almost none of the data.
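A minimal sketch of what that on-screen R-squared computation might look like, assuming the measured and predicted values are pylab (numpy) arrays; the function name and the commented usage below are assumptions for illustration, not the exact code shown in the lecture:

```python
import pylab

def rSquared(measured, predicted):
    """Goodness of fit: 1 minus (estimated error / variability of the measured data).
       Assumes measured and predicted are pylab/numpy arrays of equal length."""
    estimatedError = ((predicted - measured)**2).sum()
    meanOfMeasured = measured.sum() / float(len(measured))
    variability = ((measured - meanOfMeasured)**2).sum()
    return 1 - estimatedError / variability

# Hypothetical usage with trajectory data (xVals, yVals not shown here):
#   a, b, c = pylab.polyfit(xVals, yVals, 2)          # quadratic fit y = a*x**2 + b*x + c
#   print(rSquared(yVals, pylab.polyval((a, b, c), xVals)))
```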
Whereas the quadratic fit has a really astonishingly good R-squared of 0.98, saying that almost all of the changes in the values of the variables, that is to say the way the y value changes with respect to the x value, is explained by the model. I.e., we have a really good model of the physical situation. Very comforting.

Essentially, it's telling us that less than 2% of the variation is explained by the linear model, and 98% by the quadratic model. Presumably the other 2% is experimental error.

Well, now that we know that we have a really good model of the data, we can ask the question, why do we care? We have the data itself. What's the point of building a model of the data? And that, of course, is what we're getting when we run polyfit to get this curve. The whole purpose of creating a model, or an important purpose of creating a model, is to be able to answer questions about the actual physical situation.

So one of the questions one might ask, for example, about firing an arrow is, how fast is it going? That's kind of a useful thing to know if you're worried about whether it will penetrate a target and kill somebody on the other side, for example. We can't answer that question directly by looking at the data points. You look at the data; well, I don't know. But we can use the model to answer the question.

And that's an exercise I want to go through now to show you the interplay between models, and theory, and computation, and how we can use the three to answer relevant questions about data.

No, I do not want to check for new software, thank you. In fact, let's make sure it won't do that anymore.

So let's look at the PowerPoint. Here we'll see how I'm using a little bit of theory, not very much, to understand how to use the model to compute the speed of the arrow. So what we see is that we know, by our model and by the good fit, that the trajectory is given by y equals ax-squared plus bx plus c. We know that.
We also know from looking at this equation that the highest point of the arrow, which I'll call yPeak, must occur at xMid, the middle of the x-axis. So if we look at a parabola, and it doesn't matter what the parabola is, we always know that the vertical peak is halfway along the x-axis. The math tells us that from the equation.

So we can say yPeak is a times xMid squared plus b times xMid plus c. So now we have a model that can tell us how high the arrow gets.

The next question I'll ask is: if I fired the arrow from here, and it hits the target here -- I've exaggerated it by drawing it this way; it's nowhere near this steep -- how long does it take to get from here to here? We don't have anything about time in our data. Yet I claim we have enough information to go from the distance here and the distance here to how long it's going to take the arrow to get from here to the target.

Why do I know that? What determines how long it's going to take to get from here to here? It's going to be how long it takes it to fall that far. It's going to be gravity. Because we know that gravity, at least on this planet, is a constant, or close enough to it. Unless maybe the arrow were going a million miles. And it's going to be gravity that tells me how long it takes to get from here to here.

And when it gets to the bottom, it's going to be here. So again, I can use some very simple math and say that the time will be the square root of 2 times yPeak divided by the gravitational constant. Because I know that however long it takes to get from this height to this height is going to be the same time it takes to get from this point to this point.

And that will therefore let me compute the average speed from here to here. And once I know that, I'm done.

Now again, this is assuming no drag and things like that. The thing that we always have to understand about a model is that no model is actually ever correct. On the other hand, many models are very useful, and they're close enough to correct.
So I left out things like drag, wind shear, and stuff like that. But in fact, the answer we get here will turn out to be very close to correct.

We can now go back and look at some code. Get rid of this. And so now, you'll see this on the handout. I'm going to write a little bit of code that just goes through the math I just showed you to compute the average x velocity. I've got a print statement here that I used to debug it. And I'm going to return it. And then, we'll just be able to run it and see what we get.

Well, all right, that we looked at before. I forgot to close the previous figure. So now, I'm sure this is a problem you've all seen. And we'll fix it the way we always fix things, just start over.

I'll bet you guys have also seen this happen. What this is suggesting, as we've seen before, is that the process, the old process, still exists. Not a good thing. Again, I'm sure you've all seen these. Let's make sure we don't have anything running here that looks like IDLE. We don't. Just takes it a little time. There it is, all right. All right, now we'll go back.

All of this happened because I forgot to close the figure and executed pyLab.show twice, which we know can lead to bad things. So let's get rid of this. Now, we'll run it.

And now, we have our figure using just the quadratic fit. And we see that the speed is 136.25 feet per second. Do I believe 136.25? Not really. I know it's in the ballpark. I confused precision with accuracy here by giving it to you to two decimal places. I can compute it as precisely as I want. But that doesn't mean it's actually accurate. Probably, I should have just said it's about 135 or something like that. But it's pretty good.

And for those of you who don't know how to do this arithmetic in your head, like me, this is about 93 miles per hour.
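A minimal sketch of what that velocity computation might look like, given the quadratic coefficients a, b, c from the fit. The function name, the assumption that the x and y measurements are in feet, and the landing height of zero are all illustrative assumptions, not the lecture's exact code:

```python
def getHorizontalSpeed(a, b, c, minX, maxX):
    """Estimate the arrow's average horizontal speed from the fit
       y = a*x**2 + b*x + c, assuming x and y are measured in feet."""
    g = 32.16                          # gravitational acceleration, feet/sec**2
    xMid = (maxX - minX) / 2.0         # the peak of the parabola is halfway along the x-axis
    yPeak = a * xMid**2 + b * xMid + c # height at the peak
    t = (2.0 * yPeak / g) ** 0.5       # time to fall from yPeak to the target height
    return xMid / t                    # half the range over the fall time = average horizontal speed
```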
And for comparison, the speed of sound, as opposed to our 136 feet per second, is about 1,100 feet per second. So it's traveling pretty fast.

Well, what's the point of this? I don't really care if you know how fast an arrow travels. I don't expect you'll ever need to compute that. But I wanted to show you this as an example of a pattern that we use a lot.

So what we did is we started with an experiment. You didn't see this, but I actually stood in my backyard and shot a bunch of arrows and measured them, got real data out of that. And this gave me some data about the behavior of a physical system.

That's what I get for wearing a tie. Maybe it'll be quieter if I put it in my shirt. Actually, it looks silly. Excuse me. I hope none of you will mind if I take my tie off? It seems to be making noises in the microphone. Maybe we should write a computation. All right, so ends my experiment with trying to look dignified. Not something I'm good at.

OK, we had an experiment. That gave us some data. We then used computation both to find and, very importantly, to evaluate a model. It's no good just to find the model. You need to do some evaluation to convince yourself that it's a good model of the actual physical system.

And then, finally, we used some theory and analysis and computation to derive a consequence of the model. And then, since we believe the accuracy of the model, we assume this consequence is also a true fact about the physical system we started with.

This is a pattern that we see over and over again these days in all branches of science and engineering. And it's just the kind of thing that you should get used to doing. It is what you will do if you go on to a career in science or engineering.

OK, that's all I want to say now about the topic of data and experiments and analysis.
We will return to this topic of interpretation of data later in the semester, near the end, when we start talking about machine learning and clustering.

But for now, I want to pull back and start down a new track that will, I'm sure you'll be pleased to hear, dovetail nicely with the next few problem sets that you're going to have to work on. What I want to talk about is the topic of optimization. Not so much optimization in the sense of how do you make a program fast, though we will talk a little about that, but what people refer to as optimization problems. How do we write programs to find optimal solutions to problems that occur in real life?

Every optimization problem we'll look at is going to have two parts. There's going to be (1) an objective function that will either be maximized or minimized. So for example, I might want to find the minimal airfare between Boston and Istanbul. Or, more likely, the minimum bus fare between Boston and New York. So there's an objective function. Sometimes you find the least. Sometimes you find the most. Maybe I want to maximize my income. And (2) a set of constraints that have to be satisfied. So maybe I want to find the minimum-cost transportation between Boston and New York, subject to the constraint that it not take more than eight hours, or some such thing.

So there's the objective function that you're minimizing or maximizing, and some set of constraints that must be obeyed. A vast number of problems of practical importance can be formulated this way. Once we've formulated a problem in this systematic way, we can then think about how to attack it with a computation that will help us solve the problem.

You guys do this all the time. I heard a talk yesterday by Jeremy Wertheimer, an MIT graduate, who founded a company called ITA. If you ever use Kayak, for example, or many of these systems to find an airline fare, they use some of Jeremy's code and algorithms to solve various optimization problems like this.
If you've ever used Google or Bing, they solve optimization problems to decide what pages to show you. They're all over the place.

There are a lot of classic optimization problems that people have worked on for decades. What we often do when confronted with a new problem, and it's something you'll get some experience with on the problem sets, is take a seemingly new problem and map it onto a classic problem, and then use one of the classic solutions. So as we go through this section of the course, we'll look at a number of classic optimization problems. And then you can think about how you would map other problems onto those.

This is the process known as problem reduction, where we take a problem and map it onto an existing problem that we already know how to solve. I'm not going to go through a list of classic optimization problems right now. But we'll see a bunch of them as we go forward.

Now, an important thing to think about when we think about optimization problems is how long, how hard, they are to solve. So far, we have looked at problems that, for the most part, have pretty fast solutions: often sub-linear, as with binary search, sometimes linear, and in the worst case, low-order polynomial.

Optimization problems, as we'll see, are typically much worse than that. In fact, what we'll see is that there is often no computationally efficient way to solve them. And so we end up dealing with approximate solutions to them, or what people might call best-effort solutions. And we see that as an increasing trend in tackling problems.

All right, enough of this abstract stuff. Let's look at an example. One of the classic optimization problems is called the knapsack problem. People know what a knapsack is? Sort of an archaic term. Today, people would use the word backpack. But in the old days, they called them knapsacks when they started looking at these things. And the problem is also discussed in the context of a burglar or various kinds of thieves.
So it's not easy being a burglar, by the way. I don't know if any of you ever tried it. You've got some of the obvious problems, like making sure the house is empty and picking locks, circumventing alarms, et cetera. But one of the really hard problems a burglar has to deal with is deciding what to steal. Because you break into the typical luxury home -- and why would you break into a poor person's house if you were a burglar -- and there's usually far more to steal than you can carry away.

And so the problem is formulated in terms of the burglar having a backpack. They can put a certain amount of stuff in it. And they have to maximize the value of what they steal, subject to the constraint of how much weight they can actually carry. So it's a classic optimization problem. And people have worked for years at how to solve it, not so much because they want to be burglars. But as you'll see, these kinds of optimization problems are actually quite common.

So let's look at an example. You break into the house. And among other things, you have a choice of what to steal. You have a rather strange looking clock, some artwork, a book, a Velvet Elvis in case you lean in that direction, all sorts of things. And for some reason, the owner was nice enough to leave you information about how much everything cost and how much it weighed. So you find this piece of paper. And now, you're trying to decide what to steal based upon this, in a way to maximize your value.

How do we go about doing it? Oh, I should show you, by the way, there's a picture of a typical knapsack. All right, it's almost Easter, after all.

Well, the simplest solution is probably a greedy algorithm. And we'll talk a lot about greedy algorithms because they are very popular and often the right way to tackle a hard problem. So the notion of a greedy algorithm is that it's iterative. And at each step, you pick the locally optimal solution. So you make the best choice, put that item in the knapsack.
Then ask if you have room, or if you're out of weight. If not, you make the best choice of the remaining ones. Ask the same question. You do that until you can't fit anything else in.

Now of course, to do that assumes that we know at each stage what we mean by locally optimal. And of course, we have choices here. We're trying to figure out, in some sense, what greedy algorithm, what approach to being greedy, will give us the best result.

So one could, for example, say: all right, at each step, I'll choose the most valuable item and put that in my knapsack. And I'll do that till I run out of valuable items. Or you could, at each step, say: well, what I'm really going to choose is the one that weighs the least. That will give me the most items. And maybe that will give me the most total value when I'm done. Or maybe, at each step, you could say: well, let me choose the one that has the best value-to-weight ratio and put that in. And maybe that will give me the best solution.

As we will see, in this case, none of those is guaranteed to give you the best solution all the time. In fact, as we'll see, none of them is guaranteed to be better than any of the others all the time. And that's one of the issues with greedy algorithms.

I should point out, by the way, that this version of the knapsack problem that we're talking about is typically called the 0/1 knapsack problem. And that's because we either have to take the entire item or none of the item. We're not allowed to cut the Velvet Elvis in half and take half of it.

This is in contrast to the continuous knapsack problem. If you imagine you break into the house and you see a barrel of gold dust, and a barrel of silver dust, and a barrel of raisins, what you would do is fill your knapsack with as much gold as you could carry, or until you ran out of gold. And then, you would fill it with as much silver as you could carry. And then, if there's any room left, you'd put in the raisins.
For the continuous knapsack problem, a greedy algorithm provides an optimal solution. Unfortunately, most of the problems we actually encounter in life, as we'll see, are 0/1 knapsack problems. You either take something or you don't. And that's more complicated.

All right, let's look at some code. So I'm going to formulate it. I'm first going to start by putting in a class, just so the rest of my code is simpler. This is something we've been talking about, that increasingly people want to start by putting in some useful data abstractions. So I've got a class Item where I can construct an item, get its name, get its value, get its weight, and print it. Kind of a boring class, but useful to have.

Then, I'm going to use this class to build items. And in this case, I'm going to build the items based upon what we just looked at, the table that -- I think it's in your handout, and it's also on this slide. Later, if we want, we can have a randomized program to build up a much bigger choice of items. But here, we'll just try the clock, the painting, the radio, the vase, the book, and the computer.

Now comes the interesting part. I've written a function, greedy, that takes three arguments: the set of items that I have to choose from, which makes sense; the maximum weight the burglar can carry; and something called a key function, which essentially defines what I mean by locally optimal.

Then, it's quite simple. I'm going to sort the items using the key function. Remember, sort has this optional argument that says, what's the ordering? So maybe I'll order it by value. Maybe I'll order it by density. Maybe I'll order it by weight. I'm going to reverse it, because I want the most valuable first, not the least valuable, for example. And then, I'm going to just take the first things on my list until I run out of weight, and then I'm done. And I'll return the result and the total value.
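A minimal sketch of what that Item class and the greedy function might look like. The exact names, the string formatting, and the loop structure are assumptions for illustration, not the code on the lecturer's screen:

```python
class Item(object):
    """A candidate thing to steal, with a name, a value, and a weight."""
    def __init__(self, name, value, weight):
        self.name = name
        self.value = float(value)
        self.weight = float(weight)
    def getName(self):   return self.name
    def getValue(self):  return self.value
    def getWeight(self): return self.weight
    def __str__(self):
        return '<' + self.name + ', ' + str(self.value) + ', ' + str(self.weight) + '>'

def greedy(items, maxWeight, keyFunction):
    """keyFunction defines what 'locally optimal' means.
       Returns (list of items taken, total value of those items)."""
    # Sort so the "best" item under keyFunction comes first
    itemsCopy = sorted(items, key=keyFunction, reverse=True)
    result, totalValue, totalWeight = [], 0.0, 0.0
    for item in itemsCopy:
        # Take each item in order, as long as it still fits under the weight limit
        if totalWeight + item.getWeight() <= maxWeight:
            result.append(item)
            totalWeight += item.getWeight()
            totalValue += item.getValue()
    return result, totalValue
```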
To make life simple, I'm going to define some functions. These are the functions that I can use for the ordering: value, which just returns the value of the item; the inverse of the weight, because I'm thinking, as a greedy algorithm, I'll take the lightest, not the heaviest, and since I'm reversing the sort, I want the inverse; and the density, which is just the value divided by the weight.

OK, make sense to everybody? You with me? Speak now, or not.

And then, we'll test it. So again, kind of a theme of this part of the course: as we write these more complex programs, we tend to have to worry about our test harnesses. So I've got a function that tests the greedy algorithm, and then another function that tests all three greedy approaches -- the one algorithm with different key functions -- and looks at what our results are.

So let's run it. See what we get. Oh, you know what I did? Just the same thing I did last time. But this time, I'm going to be smarter. We're going to get rid of this figure and comment out the code that generated it. And now, we'll test the greedy algorithms.

So we see the items we had to choose from, which I printed using the string function on the items. And if I use greedy by value to fill a knapsack of size 20, we see that I end up getting just the computer. This is for the nerd burglar. If I use weight, I get a different result -- more things, not surprisingly, but a lower total value. And if I use density, I also get four things, but four different things, and I get a higher value.

So I see that I can run these greedy algorithms. I can get an answer. But it's not always the same answer. As I said earlier, greedy by density happens to work best here. But you shouldn't assume that will always be the case.
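A minimal sketch of what those key functions and the test harness might look like, reusing the greedy function sketched above; the function names and the printed text are assumptions, and the items themselves would come from the table on the slide (not reproduced here):

```python
def value(item):
    return item.getValue()

def weightInverse(item):
    # Lightest first: a smaller weight gives a bigger key
    return 1.0 / item.getWeight()

def density(item):
    return item.getValue() / item.getWeight()

def testGreedy(items, maxWeight, keyFunction):
    taken, val = greedy(items, maxWeight, keyFunction)
    print('Total value of items taken =', val)
    for item in taken:
        print('  ', item)

def testGreedys(items, maxWeight):
    print('Greedy by value:')
    testGreedy(items, maxWeight, value)
    print('Greedy by weight (lightest first):')
    testGreedy(items, maxWeight, weightInverse)
    print('Greedy by density (value/weight):')
    testGreedy(items, maxWeight, density)
```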
I'm sure you can all imagine a different assignment of weights and values that would make greedy by density give you a bad answer.

All right, before we talk about how good these answers are -- and we will come back to that, in particular to the case where I want the best answer -- I want to stop for a minute and talk about the algorithmic efficiency of the greedy algorithm.

So let's go back and look at the code. And this is why people use greedy algorithms. Actually, there are two reasons. One reason is that they're easy to program, and that's always a good thing. And the other is that they are typically highly efficient.

So what's the efficiency of this? How would we think about the efficiency of this greedy algorithm? What are we looking at here? Well, the first thing we have to ask is, what's the first thing it does? It sorts the list, right? So one thing that governs the efficiency might be the amount of time it takes to sort the list of items. Well, we know how long that takes. Or we can speculate, at least. Let's assume it does something like merge sort. So what's that term going to be? Order of what? Len of items times what? Times log of the len of items, right?

So maybe that's going to tell us the complexity, but maybe not. The next thing we have to do is look at the while loop and see how many times we go through the while loop. What's the worst case? Somebody? I know I didn't bring any candy today, but you could answer the question anyway. Be a sport. Do it for free. Yeah?

AUDIENCE: The length of the items.

PROFESSOR: The length of the items, right. Well, we know the sorting term, n log n, is bigger than that. So it looks like that's the complexity, right? So we can say, all right, pretty good. Slightly worse than linear in the length of the items, but not bad at all. And that's a big attraction of greedy algorithms.
They are typically order of the length of the items, or order of the length of the items times the log of that length. So greedy algorithms are usually very close to linear. And that's why we really like them. Why we don't like them is that it may be that the accumulation of a sequence of locally optimal solutions does not yield a globally optimal solution.

So now, let's ask the question: suppose that's not good enough. I have a very demanding thief. Or maybe the thief works for a very demanding person and needs to choose the absolute optimal set. Let's think first about how we formulate that carefully. And then, what the complexity of solving it would be. And then, algorithms that might be useful.

Again, the important step here, I think, is not the solution to the problem, but the process used to formulate the problem. Often, it is the case that once one has done a careful formulation of a problem, it becomes obvious how to solve it, at least in a brute force way.

So now, let's look at a formalization of the 0/1 knapsack problem. And it's the kind of formalization we'll use for a lot of problems.

So step one, we'll represent each item by a pair. Because in fact, in deciding whether or not to take an item, we don't care what its name is. We don't care if it's a clock, or a radio, or whatever. What matters is what its value is and what its weight is.

We'll write W for the maximum weight that the thief can carry, or that can fit in the knapsack. So far, so good. Nothing complicated there.

Now comes the interesting step. We're going to represent the set of available items as a vector. We'll call it I. And then we'll have another vector, V, which indicates whether or not each item in I has been taken. So V is a vector of 0's and 1's. If V_i is equal to 1, that implies I_i -- big I sub little i -- has been taken, is in the knapsack. Conversely, if V_i is 0, it means I_i is not in the knapsack.
So, having formulated the situation thusly, we can now go back to our notion of an optimization problem as an objective function and a set of constraints, to carefully state the problem.

For the objective function, we want to maximize the sum of V_i times I_i.value, where i ranges over the length of the vectors. That's the trick of the 0/1: if I don't take an item, it's 0 times the value; if I take it, it's 1 times the value. So this is going to give me the sum of the values of the items I've taken.

And then, this is subject to the constraint. Again, we'll do a summation, and it will look very similar: the sum of V_i times I_i.weight, which has to be less than or equal to W.

Straightforward, but a useful kind of skill. And people do spend a lot of time doing that. If you've ever used MATLAB, you know it wants everything to be a vector. And that's often because a lot of these problems can be nicely formulated in this kind of way.

All right, now let's return to the question of complexity. What happens if we implement this in the most straightforward way? What would the most straightforward implementation look like? Well, we could enumerate all possibilities and then choose the best one that meets the constraint. So this would be the obvious brute force solution to the optimization problem: look at all possible solutions, choose the best one.

I think you can see immediately that this is guaranteed to give you the optimal solution. Actually, an optimal solution; maybe there's more than one best. But in that case, you can just choose whichever one you like, or whichever comes first, for example.

The question is, how long will this take to run? Well, we can think about that by asking the question, how big will this set be? How many possibilities are there?
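Before counting the possibilities, here is a minimal sketch of what that brute-force enumeration might look like, using the Item class sketched earlier and a tuple of 0's and 1's for the vector V, exactly as in the formulation above. The function name and structure are assumptions, not the code used later in the course:

```python
from itertools import product

def chooseBest(items, maxWeight):
    """Enumerate every 0/1 vector V over the items; keep the legal combination
       (total weight <= maxWeight) with the largest total value."""
    bestValue, bestTaken = 0.0, []
    # Each V is a tuple of 0's and 1's of length len(items): 2**n possibilities
    for V in product((0, 1), repeat=len(items)):
        totalValue = sum(v * item.getValue() for v, item in zip(V, items))
        totalWeight = sum(v * item.getWeight() for v, item in zip(V, items))
        if totalWeight <= maxWeight and totalValue > bestValue:
            bestValue = totalValue
            bestTaken = [item for v, item in zip(V, items) if v == 1]
    return bestTaken, bestValue
```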
Well, we can think about that in a pretty straightforward way. Because if we look at our formulation, we can ask ourselves, how many possible vectors are there? How many vectors V could there be, showing which items were taken and which weren't? And what's the answer to that?

Well, if we have n items, how long will V be? Length n, right? A 0 or 1 for each item. If we have a vector of 0's and 1's of length n, how many different values can that vector take on? We asked this question before. What's the answer? Somebody shout it out. I've got a vector of length n. Every value in the vector is either a 0 or a 1. So maybe it looks something like this. How many possible combinations of 0's and 1's are there?

AUDIENCE: 2 to the n?

PROFESSOR: 2 to the n. Because essentially, this is a binary number, exactly. And so if I have an n-bit binary number, I can represent 2 to the n different values. And so we see that we have 2 to the n possible combinations to look at if we use a brute force solution.

How bad is this? Well, if the number of items is small, it's not so bad. And you'll see that, in fact, I can run this on the example we've looked at. 2 to the 5 is not a huge number.

Suppose I have a different number. Suppose I have 50 items to choose from. Not a big problem. I heard yesterday that the number of different airfares between two cities in the US is on the order of 500 -- 500 different airfares between, say, Boston and Chicago. So looking at the best there might mean 2 to the 500, kind of a bigger number. But let's look at 2 to the 50.

Let's say there were 50 items to choose from in this question. And let's say, for the sake of argument, it takes a microsecond, one millionth of a second, to generate a solution. How long will it take to solve this problem in a brute force way for 50 items? Who thinks you can do it in under four seconds? How about under four minutes?
Wow, skeptics. Four hours? That's a lot of computation. Four hours, you're starting to get some people. Four days? All right. Well, how about four years? Still longer, just under four decades?

Looking at one choice every microsecond, it takes you roughly 36 years to evaluate all these possibilities. Certainly for people of my age, that's not a practical solution, to have to wait 36 years for an answer. So we have to find something better. And we'll be talking about that later.
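As a quick check on that figure, here is the arithmetic: 2 to the 50 possibilities at one microsecond each.

```python
# 2**50 possibilities, one microsecond (1e-6 seconds) per possibility
seconds = 2**50 / 1_000_000              # about 1.1 billion seconds
years = seconds / (60 * 60 * 24 * 365)
print(round(years, 1))                   # about 35.7, i.e. roughly 36 years
```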