1 00:00:01,000 --> 00:00:03,000 The following content is provided under a Creative 2 00:00:03,000 --> 00:00:05,000 Commons license. Your support will help MIT 3 00:00:05,000 --> 00:00:08,000 OpenCourseWare continue to offer high quality educational 4 00:00:08,000 --> 00:00:13,000 resources for free. To make a donation or to view 5 00:00:13,000 --> 00:00:18,000 additional materials from hundreds of MIT courses, 6 00:00:18,000 --> 00:00:23,000 visit MIT OpenCourseWare at ocw.mit.edu. 7 00:00:23,000 --> 00:00:28,000 Today we are going to see how to use what we saw last time 8 00:00:28,000 --> 00:00:33,000 about partial derivatives to handle minimization or 9 00:00:33,000 --> 00:00:41,000 maximization problems involving functions of several variables. 10 00:00:41,000 --> 00:00:44,000 Remember last time we said that when we have a function, 11 00:00:44,000 --> 00:00:49,000 say, of two variables, x and y, then we have actually two 12 00:00:49,000 --> 00:00:53,000 different derivatives, partial f, partial x, 13 00:00:53,000 --> 00:01:02,000 also called f sub x, the derivative with respect to 14 00:01:02,000 --> 00:01:11,000 x keeping y constant. And we have partial f, 15 00:01:11,000 --> 00:01:21,000 partial y, also called f sub y, where we vary y and we keep x 16 00:01:21,000 --> 00:01:26,000 as a constant. And now, one thing I didn't 17 00:01:26,000 --> 00:01:30,000 have time to tell you about but hopefully you thought about in 18 00:01:30,000 --> 00:01:37,000 recitation yesterday, is the approximation formula 19 00:01:37,000 --> 00:01:47,000 that tells you what happens if you vary both x and y. 20 00:01:47,000 --> 00:01:50,000 f sub x tells us what happens if we change x a little bit, 21 00:01:50,000 --> 00:01:53,000 by some small amount delta x. f sub y tells us how f changes, 22 00:01:53,000 --> 00:01:56,000 if you change y by a small amount delta y. 
23 00:01:56,000 --> 00:02:00,000 If we do both at the same time then the two effects will add up 24 00:02:00,000 --> 00:02:02,000 with each other, because you can imagine that 25 00:02:02,000 --> 00:02:05,000 first you will change x and then you will change y. 26 00:02:05,000 --> 00:02:12,000 Or the other way around. It doesn't really matter. 27 00:02:12,000 --> 00:02:18,000 If we change x by a certain amount delta x, 28 00:02:18,000 --> 00:02:23,000 and if we change y by the amount delta y, 29 00:02:23,000 --> 00:02:32,000 and let's say that we have z= f(x, y) then that changes by an 30 00:02:32,000 --> 00:02:40,000 amount which is approximately f sub x times delta x plus f sub y 31 00:02:40,000 --> 00:02:45,000 times delta y. And that is one of the most 32 00:02:45,000 --> 00:02:49,000 important formulas about partial derivatives. 33 00:02:49,000 --> 00:02:54,000 The intuition for this, again, is just the two effects 34 00:02:54,000 --> 00:02:58,000 of if I change x by a small amount and then I change y. 35 00:02:58,000 --> 00:03:02,000 Well, first changing x will modify f, how much does it 36 00:03:02,000 --> 00:03:06,000 modify f? The answer is the rate change 37 00:03:06,000 --> 00:03:09,000 is f sub x. And if I change y then the rate 38 00:03:09,000 --> 00:03:13,000 of change of f when I change y is f sub y. 39 00:03:13,000 --> 00:03:17,000 So all together I get this change as a value of f. 40 00:03:17,000 --> 00:03:19,000 And, of course, that is only an approximation 41 00:03:19,000 --> 00:03:22,000 formula. Actually, there would be higher 42 00:03:22,000 --> 00:03:28,000 order terms involving second and third derivatives and so on. 43 00:03:28,000 --> 00:03:43,000 One way to justify this -- Sorry. 44 00:03:43,000 --> 00:03:47,000 I was distracted by the microphone. 45 00:03:47,000 --> 00:03:55,000 OK. How do we justify this formula? 
46 00:03:55,000 --> 00:04:05,000 Well, one way to think about it is in terms of tangent plane 47 00:04:05,000 --> 00:04:10,000 approximation. Let's think about the tangent 48 00:04:10,000 --> 00:04:13,000 plane with regard to a function f. 49 00:04:13,000 --> 00:04:15,000 We have some pictures to show you. 50 00:04:15,000 --> 00:04:20,000 It will be easier if I show you pictures. 51 00:04:20,000 --> 00:04:24,000 Remember, partial f, partial x was obtained by 52 00:04:24,000 --> 00:04:29,000 looking at the situation where y is held constant. 53 00:04:29,000 --> 00:04:33,000 That means I am slicing the graph of f by a plane that is 54 00:04:33,000 --> 00:04:35,000 parallel to the x, z plane. 55 00:04:35,000 --> 00:04:39,000 And when I change x, z changes, and the slope of 56 00:04:39,000 --> 00:04:44,000 that is going to be the derivative with respect to x. 57 00:04:44,000 --> 00:04:49,000 Now, if I do the same in the other direction then I will have 58 00:04:49,000 --> 00:04:53,000 similarly the slope in a slice now parallel to the y, 59 00:04:53,000 --> 00:04:57,000 z plane that will be partial f, partial y. 60 00:04:57,000 --> 00:05:00,000 In fact, in each case, I have a line. 61 00:05:00,000 --> 00:05:02,000 And that line is tangent to the surface. 62 00:05:02,000 --> 00:05:06,000 Now, if I have two lines tangent to the surface, 63 00:05:06,000 --> 00:05:09,000 well, then together they determine for me the tangent 64 00:05:09,000 --> 00:05:13,000 plane to the surface. Let's try to see how that works. 65 00:05:18,000 --> 00:05:28,000 We know that f sub x and f sub y are the slopes of two tangent 66 00:05:28,000 --> 00:05:37,000 lines to this plane, two tangent lines to the graph. 67 00:05:37,000 --> 00:05:39,000 And let's write down the equations of these lines. 68 00:05:39,000 --> 00:05:41,000 I am not going to write parametric equations. 69 00:05:41,000 --> 00:05:45,000 I am going to write them in terms of x, y, 70 00:05:45,000 --> 00:05:49,000 z coordinates. 
Let's say that partial f, 71 00:05:49,000 --> 00:05:53,000 partial x at the given point is equal to a. 72 00:05:53,000 --> 00:06:00,000 That means that we have a line given by the following 73 00:06:00,000 --> 00:06:05,000 conditions. I am going to keep y constant 74 00:06:05,000 --> 00:06:07,000 equal to y0. And I am going to change x. 75 00:06:07,000 --> 00:06:12,000 And, as I change x, z will change at the rate that 76 00:06:12,000 --> 00:06:22,000 is equal to a. That would be z = z0 + a(x - x0). 77 00:06:22,000 --> 00:06:26,000 That is how you would describe a line that, I guess, 78 00:06:26,000 --> 00:06:30,000 the one that is plotted in green here, the intersection with 79 00:06:30,000 --> 00:06:33,000 the slice parallel to the x, z plane. 80 00:06:33,000 --> 00:06:40,000 I hold y constant equal to y0. And z is a function of x that 81 00:06:40,000 --> 00:06:50,000 varies with a rate of a. And now if I look similarly at 82 00:06:50,000 --> 00:06:55,000 the other slice, let's say that the partial with 83 00:06:55,000 --> 00:07:00,000 respect to y is equal to b, then I get another line which 84 00:07:00,000 --> 00:07:06,000 is obtained by the fact that z now will depend on y. 85 00:07:06,000 --> 00:07:10,000 And the rate of change with respect to y will be b. 86 00:07:10,000 --> 00:07:15,000 While x is held constant equal to x0. 87 00:07:15,000 --> 00:07:19,000 These two lines are both going to be in the tangent plane to 88 00:07:19,000 --> 00:07:20,000 the surface. 89 00:07:40,000 --> 00:07:45,000 They are both tangent to the graph of f and together they 90 00:07:45,000 --> 00:07:47,000 determine the plane. 91 00:07:56,000 --> 00:08:08,000 And that plane is just given by the formula z = z0 + a(x - x0) + 92 00:08:08,000 --> 00:08:13,000 b(y - y0). If you look at what happens -- 93 00:08:13,000 --> 00:08:19,000 This is the equation of a plane. z equals constant times x plus 94 00:08:19,000 --> 00:08:24,000 constant times y plus constant. 
And if you look at what happens 95 00:08:24,000 --> 00:08:28,000 if I hold y constant and vary x, I will get the first line. 96 00:08:28,000 --> 00:08:33,000 If I hold x constant and vary y, I get the second line. 97 00:08:33,000 --> 00:08:34,000 Another way to do it, of course, 98 00:08:34,000 --> 00:08:37,000 would provide actually parametric equations of these 99 00:08:37,000 --> 00:08:40,000 lines, get vectors along them and then 100 00:08:40,000 --> 00:08:43,000 take the cross-product to get the normal vector to the plane. 101 00:08:43,000 --> 00:08:47,000 And then get this equation for the plane using the normal 102 00:08:47,000 --> 00:08:49,000 vector. That also works and it gives 103 00:08:49,000 --> 00:08:53,000 you the same formula. If you are curious of the 104 00:08:53,000 --> 00:08:57,000 exercise, do it again using parametrics and using 105 00:08:57,000 --> 00:09:01,000 cross-product to get the plane equation. 106 00:09:01,000 --> 00:09:03,000 That is how we get the tangent plane. 107 00:09:03,000 --> 00:09:06,000 And now what this approximation formula here says is that, 108 00:09:06,000 --> 00:09:10,000 in fact, the graph of a function is close to the tangent 109 00:09:10,000 --> 00:09:12,000 plane. If we were moving on the 110 00:09:12,000 --> 00:09:15,000 tangent plane, this would be an actual 111 00:09:15,000 --> 00:09:17,000 equality. Delta z would be a linear 112 00:09:17,000 --> 00:09:23,000 function of delta x and delta y. And the graph of a function is 113 00:09:23,000 --> 00:09:27,000 near the tangent plane, but is not quite the same, 114 00:09:27,000 --> 00:09:33,000 so it is only an approximation for small delta x and small 115 00:09:33,000 --> 00:09:43,000 delta y. The approximation formula says 116 00:09:43,000 --> 00:09:57,000 the graph of f is close to its tangent plane. 
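As a quick numerical illustration of the tangent plane and the approximation formula, here is a minimal sketch. The function f and the base point (1, 2) are made up for illustration and are not from the lecture, and the partial derivatives are estimated by finite differences rather than computed by hand.

```python
def f(x, y):
    # Hypothetical example function, not from the lecture.
    return x**2 + 3 * x * y + y**3

def partials(f, x0, y0, h=1e-6):
    # Central finite differences approximating f_x and f_y at (x0, y0).
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    return fx, fy

def tangent_plane(f, x0, y0):
    # Returns the function (x, y) -> z0 + a (x - x0) + b (y - y0),
    # where a = f_x, b = f_y and z0 = f(x0, y0).
    a, b = partials(f, x0, y0)
    z0 = f(x0, y0)
    return lambda x, y: z0 + a * (x - x0) + b * (y - y0)

x0, y0 = 1.0, 2.0
plane = tangent_plane(f, x0, y0)

# The gap between the graph and the tangent plane shrinks much faster
# than the step itself, because only higher order terms are left over.
for d in (0.1, 0.01, 0.001):
    gap = f(x0 + d, y0 + d) - plane(x0 + d, y0 + d)
    print(d, gap)
```

Shrinking the step by a factor of ten should shrink the gap by roughly a factor of a hundred, which is what "approximation for small delta x and small delta y" means in practice.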
117 00:09:57,000 --> 00:10:02,000 And we can use that formula over here now to estimate how 118 00:10:02,000 --> 00:10:08,000 the value of f changes if I change x and y at the same time. 119 00:10:08,000 --> 00:10:18,000 Questions about that? Now that we have caught up with 120 00:10:18,000 --> 00:10:23,000 what we were supposed to see on Tuesday, I can tell you now 121 00:10:23,000 --> 00:10:26,000 about max and min problems. 122 00:10:38,000 --> 00:10:48,000 That is going to be an application of partial 123 00:10:48,000 --> 00:11:00,000 derivatives to look at optimization problems. 124 00:11:00,000 --> 00:11:03,000 Maybe ten years from now, when you have a real job, 125 00:11:03,000 --> 00:11:07,000 your job might be to actually minimize the cost of something 126 00:11:07,000 --> 00:11:11,000 or maximize the profit of something or whatever. 127 00:11:11,000 --> 00:11:14,000 But typically the function that you will have to strive to 128 00:11:14,000 --> 00:11:18,000 minimize or maximize will depend on several variables. 129 00:11:18,000 --> 00:11:22,000 If you have a function of one variable, you know that to find 130 00:11:22,000 --> 00:11:26,000 its minimum or its maximum you look at the derivative and set 131 00:11:26,000 --> 00:11:29,000 that equal to zero. And you try to then look at 132 00:11:29,000 --> 00:11:38,000 what happens to the function. Here it is going to be kind of 133 00:11:38,000 --> 00:11:47,000 similar, except, of course, we have several 134 00:11:47,000 --> 00:11:51,000 derivatives. For today we will think about a 135 00:11:51,000 --> 00:11:56,000 function of two variables, but it works exactly the same 136 00:11:56,000 --> 00:12:00,000 if you have three variables, ten variables, 137 00:12:00,000 --> 00:12:07,000 a million variables. 
The first observation is that 138 00:12:07,000 --> 00:12:17,000 if we have a local minimum or a local maximum then both partial 139 00:12:17,000 --> 00:12:21,000 derivatives, so partial f partial x and 140 00:12:21,000 --> 00:12:26,000 partial f partial y, are both zero at the same time. 141 00:12:26,000 --> 00:12:30,000 Why is that? Well, let's say that f sub x is 142 00:12:30,000 --> 00:12:32,000 zero. That means when I vary x, to 143 00:12:32,000 --> 00:12:35,000 first order the function doesn't change. 144 00:12:35,000 --> 00:12:37,000 Maybe that is because it is going through... 145 00:12:37,000 --> 00:12:42,000 If I look only at the slice parallel to the x-axis then 146 00:12:42,000 --> 00:12:45,000 maybe I am going through the minimum. 147 00:12:45,000 --> 00:12:48,000 But if partial f, partial y is not 0 then 148 00:12:48,000 --> 00:12:51,000 actually, by changing y, I could still make the value 149 00:12:51,000 --> 00:12:54,000 larger or smaller. That wouldn't be an actual 150 00:12:54,000 --> 00:12:57,000 maximum or minimum. It would only be a maximum or 151 00:12:57,000 --> 00:13:01,000 minimum if I stay in the slice. But if I allow myself to change 152 00:13:01,000 --> 00:13:04,000 y that doesn't work. I need actually to know that if 153 00:13:04,000 --> 00:13:07,000 I change y the value will not change either to first order. 154 00:13:07,000 --> 00:13:11,000 That is why you also need partial f, partial y to be zero. 155 00:13:11,000 --> 00:13:13,000 Now, let's say that they are both zero. 156 00:13:13,000 --> 00:13:16,000 Well, why is that enough? It is essentially enough 157 00:13:16,000 --> 00:13:20,000 because of this formula telling me that if both of these guys 158 00:13:20,000 --> 00:13:24,000 are zero then to first order the function doesn't change. 
159 00:13:24,000 --> 00:13:26,000 Then, of course, there will be maybe quadratic 160 00:13:26,000 --> 00:13:28,000 terms that will actually turn that, you know, 161 00:13:28,000 --> 00:13:31,000 this won't really say that your function is actually constant. 162 00:13:31,000 --> 00:13:35,000 It will just tell you that maybe it will actually be 163 00:13:35,000 --> 00:13:40,000 quadratic or higher order in delta x and delta y. 164 00:13:40,000 --> 00:13:52,000 That is what you expect to have at a maximum or a minimum. 165 00:13:52,000 --> 00:14:05,000 The condition is the same thing as saying that the tangent plane 166 00:14:05,000 --> 00:14:15,000 to the graph is actually going to be horizontal. 167 00:14:15,000 --> 00:14:18,000 And that is what you want to have. 168 00:14:18,000 --> 00:14:23,000 Say you have a minimum, well, the tangent plane at this 169 00:14:23,000 --> 00:14:30,000 point, at the bottom of the graph is going to be horizontal. 170 00:14:30,000 --> 00:14:35,000 And you can see that on this equation of a tangent plane, 171 00:14:35,000 --> 00:14:40,000 when both these coefficients are 0 that is when the equation 172 00:14:40,000 --> 00:14:44,000 becomes z equals constant: the horizontal plane. 173 00:14:44,000 --> 00:14:50,000 Does that make sense? We will have a name for this 174 00:14:50,000 --> 00:14:52,000 kind of point because, actually, 175 00:14:52,000 --> 00:14:55,000 what we will see very soon is that these conditions are 176 00:14:55,000 --> 00:14:57,000 necessary but are not sufficient. 177 00:14:57,000 --> 00:15:02,000 There are actually other kinds of points where the partial 178 00:15:02,000 --> 00:15:08,000 derivatives are zero. Let's give a name to this. 179 00:15:08,000 --> 00:15:24,000 We say the definition is (x0, y0) is a critical point of f -- 180 00:15:24,000 --> 00:15:36,000 -- if the partial derivative, with respect to x, 181 00:15:36,000 --> 00:15:44,000 and partial derivative with respect to y are both zero. 
182 00:15:44,000 --> 00:15:50,000 Generally, you would want all the partial derivatives, 183 00:15:50,000 --> 00:15:56,000 no matter how many variables you have, to be zero at the same 184 00:15:56,000 --> 00:16:06,000 time. Let's see an example. 185 00:16:06,000 --> 00:16:23,000 Let's say I give you the function f(x, y) = x^2 - 2xy + 3y^2 + 186 00:16:23,000 --> 00:16:28,000 2x - 2y. And let's try to figure out 187 00:16:28,000 --> 00:16:32,000 whether we can minimize or maximize this. 188 00:16:32,000 --> 00:16:37,000 What we would start doing immediately is taking the 189 00:16:37,000 --> 00:16:43,000 partial derivatives. What is f sub x? 190 00:16:43,000 --> 00:16:56,000 It starts with 2x - 2y + 0 + 2. Remember that y is a constant 191 00:16:56,000 --> 00:17:04,000 so this differentiates to zero. Now, if we do f sub y, 192 00:17:04,000 --> 00:17:14,000 that is going to be 0 - 2x + 6y - 2. And what we want to do is set 193 00:17:14,000 --> 00:17:17,000 these things to zero. And we want to solve these two 194 00:17:17,000 --> 00:17:21,000 equations at the same time. An important thing to remember, 195 00:17:21,000 --> 00:17:23,000 and maybe I should have told you a couple of weeks ago 196 00:17:23,000 --> 00:17:25,000 already, if you have two equations to 197 00:17:25,000 --> 00:17:28,000 solve, well, it is very good to try to 198 00:17:28,000 --> 00:17:30,000 simplify them by adding them together or whatever, 199 00:17:30,000 --> 00:17:33,000 but you must keep two equations. If you have two equations, 200 00:17:33,000 --> 00:17:37,000 you shouldn't end up with just one equation out of nowhere. 201 00:17:37,000 --> 00:17:40,000 For example here, we can certainly simplify 202 00:17:40,000 --> 00:17:46,000 things by summing them together. If we add them together, 203 00:17:46,000 --> 00:17:52,000 well, the x's cancel and the constants cancel. 204 00:17:52,000 --> 00:17:56,000 In fact, we are just left with 4y = 0. 205 00:17:56,000 --> 00:18:00,000 That is pretty good. 
That tells us y should be zero. 206 00:18:00,000 --> 00:18:02,000 But then we should, of course, go back to these and 207 00:18:02,000 --> 00:18:07,000 see what else we know. Well, now it tells us, 208 00:18:07,000 --> 00:18:14,000 if you put y = 0 it tells you 2x + 2 = 0. 209 00:18:14,000 --> 00:18:26,000 That tells you x = -1. We have one critical point that 210 00:18:26,000 --> 00:18:33,000 is (x, y) = (-1, 0). 211 00:18:33,000 --> 00:18:39,000 Any questions so far? No. 212 00:18:39,000 --> 00:18:40,000 Well, you should have a question. 213 00:18:40,000 --> 00:18:49,000 The question should be how do we know if it is a maximum or a 214 00:18:49,000 --> 00:18:53,000 minimum? Yeah. 215 00:18:53,000 --> 00:18:55,000 If we had a function of one variable, we would decide things 216 00:18:55,000 --> 00:18:58,000 based on the second derivative. And, in fact, 217 00:18:58,000 --> 00:19:00,000 we will see tomorrow how to do things based on the second 218 00:19:00,000 --> 00:19:03,000 derivative. But that is kind of tricky 219 00:19:03,000 --> 00:19:06,000 because there are a lot of second derivatives. 220 00:19:06,000 --> 00:19:09,000 I mean we already have two first derivatives. 221 00:19:09,000 --> 00:19:14,000 You can imagine that if you keep taking partials you may end 222 00:19:14,000 --> 00:19:17,000 up with more and more, so we will have to figure out 223 00:19:17,000 --> 00:19:19,000 carefully what the condition should be. 224 00:19:19,000 --> 00:19:27,000 We will do that tomorrow. For now, let's just try to look 225 00:19:27,000 --> 00:19:38,000 a bit at how do we understand these things by hand? 226 00:19:38,000 --> 00:19:42,000 In fact, let me point out to you immediately that there is 227 00:19:42,000 --> 00:19:49,000 more than maxima and minima. Remember, we saw the example of 228 00:19:49,000 --> 00:19:52,000 x^2 + y^2. That has a critical point. 229 00:19:52,000 --> 00:19:56,000 That critical point is obviously a minimum. 
230 00:19:56,000 --> 00:19:58,000 And, of course, it could be just a local minimum 231 00:19:58,000 --> 00:20:01,000 because it could be that if you have a more complicated function 232 00:20:01,000 --> 00:20:04,000 there is indeed a minimum here, but then elsewhere the function 233 00:20:04,000 --> 00:20:08,000 drops to a lower value. We call that just a local 234 00:20:08,000 --> 00:20:12,000 minimum to say that it is a minimum if you stick to values 235 00:20:12,000 --> 00:20:15,000 that are close enough to that point. 236 00:20:15,000 --> 00:20:19,000 Of course, you also have local maximum, which I didn't plot, 237 00:20:19,000 --> 00:20:23,000 but it is easy to plot. That is a local maximum. 238 00:20:23,000 --> 00:20:27,000 But there is a third example of critical point, 239 00:20:27,000 --> 00:20:31,000 and that is a saddle point. The saddle point, 240 00:20:31,000 --> 00:20:35,000 it is a new phenomenon that you don't really see in single 241 00:20:35,000 --> 00:20:38,000 variable calculus. It is a critical point that is 242 00:20:38,000 --> 00:20:42,000 neither a minimum nor a maximum because, depending on which 243 00:20:42,000 --> 00:20:46,000 direction you look in, it's either one or the other. 244 00:20:46,000 --> 00:20:50,000 See the point in the middle, at the origin, 245 00:20:50,000 --> 00:20:55,000 is a saddle point. If you look at the tangent 246 00:20:55,000 --> 00:20:58,000 plane to this graph, you will see that it is 247 00:20:58,000 --> 00:21:01,000 actually horizontal at the origin. 248 00:21:01,000 --> 00:21:05,000 You have this mountain pass where the ground is horizontal. 249 00:21:05,000 --> 00:21:08,000 But, depending on which direction you go, 250 00:21:08,000 --> 00:21:12,000 you go up or down. So, we say that a critical point is a 251 00:21:12,000 --> 00:21:16,000 saddle point if it is neither a minimum nor a maximum. 252 00:21:30,000 --> 00:21:38,000 Possibilities could be a local min, a local max or a saddle. 
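To see the three kinds of critical point side by side, here is a small sketch. It uses x^2 + y^2 and y^2 - x^2 from the lecture, plus -(x^2 + y^2) as an assumed example of a local maximum. The helper name `classify_by_sampling`, the radius and the number of sample points are all hypothetical choices, and sampling nearby values is only a rough heuristic, not the second derivative criterion.

```python
import math

def classify_by_sampling(f, x0, y0, r=1e-3, n=16):
    # Compare f at n points on a small circle around (x0, y0) with f(x0, y0).
    # Heuristic only: it assumes (x0, y0) is already known to be a critical point.
    center = f(x0, y0)
    diffs = [
        f(x0 + r * math.cos(2 * math.pi * k / n),
          y0 + r * math.sin(2 * math.pi * k / n)) - center
        for k in range(n)
    ]
    if all(d > 0 for d in diffs):
        return "local min"
    if all(d < 0 for d in diffs):
        return "local max"
    return "saddle (or undecided)"

bowl = lambda x, y: x**2 + y**2        # minimum at the origin
dome = lambda x, y: -(x**2 + y**2)     # maximum at the origin (assumed example)
saddle = lambda x, y: y**2 - x**2      # saddle at the origin

for name, g in [("x^2 + y^2", bowl), ("-(x^2 + y^2)", dome), ("y^2 - x^2", saddle)]:
    print(name, "->", classify_by_sampling(g, 0.0, 0.0))
```

For y^2 - x^2 the sampled values go up along the y direction and down along the x direction, which is the mountain pass picture.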
253 00:21:38,000 --> 00:21:42,000 Tomorrow we will see how to decide which one it is, 254 00:21:42,000 --> 00:21:46,000 in general, using second derivatives. 255 00:21:46,000 --> 00:21:50,000 For this time, let's just try to do it by 256 00:21:50,000 --> 00:21:53,000 hand. I just want to observe, 257 00:21:53,000 --> 00:21:57,000 in fact, I can try to, you know, 258 00:21:57,000 --> 00:21:58,000 these examples that I have here, 259 00:21:58,000 --> 00:22:02,000 they are x^2 + y^2, y^2 - x^2, they are sums or differences of 260 00:22:02,000 --> 00:22:05,000 squares. And, if we know that we can put 261 00:22:05,000 --> 00:22:08,000 things as sum of squares for example, we will be done. 262 00:22:08,000 --> 00:22:16,000 Let's try to express this maybe as the square of something. 263 00:22:16,000 --> 00:22:21,000 The main problem is this 2xy. Observe we know something that 264 00:22:21,000 --> 00:22:26,000 starts with x^2 - 2xy but is actually a square of something 265 00:22:26,000 --> 00:22:32,000 else. It would be x^2 - 2xy + y^2, 266 00:22:32,000 --> 00:22:37,000 not plus 3y^2. Let's try that. 267 00:22:37,000 --> 00:22:48,000 So, we are going to complete the square. 268 00:22:48,000 --> 00:22:53,000 I am going to say it is (x - y)^2, so it gives me the 269 00:22:53,000 --> 00:23:01,000 first two terms and also one y^2. Well, I still need to add two 270 00:23:01,000 --> 00:23:09,000 more y^2, and I also need to add, of course, 271 00:23:09,000 --> 00:23:15,000 the 2x and - 2y. It is still not simple enough 272 00:23:15,000 --> 00:23:19,000 for my taste. I can actually do better. 273 00:23:19,000 --> 00:23:24,000 These guys look like a sum of squares, but here I have this 274 00:23:24,000 --> 00:23:28,000 extra stuff, 2x - 2y. Well, that is 2(x - y). 275 00:23:28,000 --> 00:23:32,000 It looks like maybe we can modify this and make this into 276 00:23:32,000 --> 00:23:36,000 another square. 
So, in fact, 277 00:23:36,000 --> 00:23:45,000 I can simplify this further to (x - y + 1)^2. 278 00:23:45,000 --> 00:23:51,000 That would be (x - y)^2 + 2(x - y), and then there is a plus 279 00:23:51,000 --> 00:23:55,000 one. Well, we don't have a plus one 280 00:23:55,000 --> 00:24:00,000 so let's remove it by subtracting one. 281 00:24:00,000 --> 00:24:07,000 And I still have my 2y^2. Do you see why this is the same 282 00:24:07,000 --> 00:24:13,000 function? Yeah. 283 00:24:13,000 --> 00:24:19,000 Again, if I expand x minus y plus one, squared, 284 00:24:19,000 --> 00:24:28,000 I get (x - y)^2 + 2(x - y) + 1. But I will have minus one that 285 00:24:28,000 --> 00:24:34,000 will cancel out and then I have a plus 2y^2. 286 00:24:34,000 --> 00:24:41,000 Now, what I have is a sum of two squares minus one. 287 00:24:41,000 --> 00:24:44,000 And this critical point, (x, y) = (-1, 0), 288 00:24:44,000 --> 00:24:49,000 that is actually when this is zero and that is zero, 289 00:24:49,000 --> 00:24:55,000 so that is the smallest value. This is always greater than or equal 290 00:24:55,000 --> 00:25:00,000 to zero, the same with that one, so that is always at least 291 00:25:00,000 --> 00:25:03,000 minus one. And minus one happens to be the 292 00:25:03,000 --> 00:25:13,000 value at the critical point. So, it is a minimum. 293 00:25:13,000 --> 00:25:16,000 Now, of course here I was very lucky. 294 00:25:16,000 --> 00:25:19,000 I mean, generally, I couldn't expect things to 295 00:25:19,000 --> 00:25:21,000 simplify that much. In fact, I cheated. 296 00:25:21,000 --> 00:25:26,000 I started from that, I expanded, and then that is 297 00:25:26,000 --> 00:25:30,000 how I got my example. The general method will be a 298 00:25:30,000 --> 00:25:32,000 bit different, but you will see it will 299 00:25:32,000 --> 00:25:34,000 actually also involve completing squares. 300 00:25:34,000 --> 00:25:42,000 Just there is more to it than what we have seen. 
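The worked example can be double-checked numerically. The sketch below uses plain Python, and the function names are made up for illustration. It confirms that the expanded form x^2 - 2xy + 3y^2 + 2x - 2y and the completed-square form (x - y + 1)^2 + 2y^2 - 1 agree, that both partial derivatives vanish at (-1, 0), and that randomly sampled values never drop below -1.

```python
import random

def f(x, y):
    # The example function from the lecture, in expanded form.
    return x**2 - 2*x*y + 3*y**2 + 2*x - 2*y

def f_completed(x, y):
    # Completed-square form: (x - y + 1)^2 + 2y^2 - 1.
    return (x - y + 1)**2 + 2*y**2 - 1

def f_x(x, y):
    # Partial derivative with respect to x (y held constant).
    return 2*x - 2*y + 2

def f_y(x, y):
    # Partial derivative with respect to y (x held constant).
    return -2*x + 6*y - 2

random.seed(0)
samples = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]

# Largest disagreement between the two forms over the samples (rounding only).
max_gap = max(abs(f(x, y) - f_completed(x, y)) for x, y in samples)
print("largest gap between the two forms:", max_gap)

# Both partials vanish at (-1, 0), and the value there is -1.
print("f_x, f_y, f at (-1, 0):", f_x(-1, 0), f_y(-1, 0), f(-1, 0))

# No sampled value falls below the claimed minimum value -1.
print("smallest sampled value:", min(f(x, y) for x, y in samples))
```

Random sampling does not prove anything, of course; the proof is the sum-of-squares form itself. The check is only a guard against sign slips in the algebra.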
301 00:25:42,000 --> 00:25:48,000 We will come back to this tomorrow. 302 00:25:48,000 --> 00:25:56,000 Sorry? How do I know that this equals 303 00:25:56,000 --> 00:26:09,000 -- How do I know that the whole function is greater than or equal to 304 00:26:09,000 --> 00:26:15,000 negative one? Well, I wrote f of x, 305 00:26:15,000 --> 00:26:20,000 y as something squared plus 2y^2 - 1. 306 00:26:20,000 --> 00:26:25,000 This squared is never a 307 00:26:25,000 --> 00:26:27,000 negative number. It is a square. 308 00:26:27,000 --> 00:26:30,000 The square of something is always non-negative. 309 00:26:30,000 --> 00:26:34,000 Similarly, y^2 is also always non-negative. 310 00:26:34,000 --> 00:26:38,000 So if you add something that is at least zero plus something 311 00:26:38,000 --> 00:26:40,000 that is at least zero and you subtract one, 312 00:26:40,000 --> 00:26:43,000 you get always at least minus one. 313 00:26:43,000 --> 00:26:48,000 And, in fact, the only way you can get minus 314 00:26:48,000 --> 00:26:54,000 one is if both of these guys are zero at the same time. 315 00:26:54,000 --> 00:27:17,000 That is how I get my minimum. More about this tomorrow. 316 00:27:17,000 --> 00:27:20,000 In fact, what I would like to tell you 317 00:27:20,000 --> 00:27:23,000 about now instead is a nice application of min, 318 00:27:23,000 --> 00:27:27,000 max problems that maybe you don't think of as a min, 319 00:27:27,000 --> 00:27:31,000 max problem. I mean you won't think of it 320 00:27:31,000 --> 00:27:35,000 that way because probably your calculator can do it for you or, 321 00:27:35,000 --> 00:27:37,000 if not, your computer can do it for you. 322 00:27:37,000 --> 00:27:42,000 But it is actually something where the theory is based on 323 00:27:42,000 --> 00:27:47,000 minimization in two variables. 
Very often in experimental 324 00:27:47,000 --> 00:27:52,000 sciences you have to do something called least-squares 325 00:27:52,000 --> 00:28:01,000 interpolation. And what is that about? 326 00:28:01,000 --> 00:28:07,000 Well, it is the idea that maybe you do some experiments and you 327 00:28:07,000 --> 00:28:11,000 record some data. You have some data x and some 328 00:28:11,000 --> 00:28:13,000 data y. And, I don't know, 329 00:28:13,000 --> 00:28:17,000 maybe, for example, x is -- Maybe you're measuring 330 00:28:17,000 --> 00:28:21,000 frogs and you're trying to measure how big the frog leg is 331 00:28:21,000 --> 00:28:23,000 compared to the eyes of the frog, 332 00:28:23,000 --> 00:28:26,000 or you're trying to measure something. 333 00:28:26,000 --> 00:28:30,000 And if you are doing chemistry then it could be how much you 334 00:28:30,000 --> 00:28:35,000 put of some reactant and how much of the output product that 335 00:28:35,000 --> 00:28:37,000 you wanted to synthesize was generated. 336 00:28:37,000 --> 00:28:43,000 All sorts of things. Make up your own example. 337 00:28:43,000 --> 00:28:46,000 You measure basically, for various values of x, 338 00:28:46,000 --> 00:28:48,000 what the value of y ends up being. 339 00:28:48,000 --> 00:28:52,000 And then you like to claim these points are kind of 340 00:28:52,000 --> 00:28:53,000 aligned. And, of course, 341 00:28:53,000 --> 00:28:55,000 to a mathematician they are not aligned. 342 00:28:55,000 --> 00:28:57,000 But, to an experimental scientist, that is evidence that 343 00:28:57,000 --> 00:29:00,000 there is a relation between the two. 344 00:29:00,000 --> 00:29:03,000 And so you want to claim -- And in your paper you will actually 345 00:29:03,000 --> 00:29:05,000 draw a nice little line like that. 346 00:29:05,000 --> 00:29:10,000 The quantities depend linearly on each other. 
347 00:29:10,000 --> 00:29:15,000 The question is how do we come up with that nice line that 348 00:29:15,000 --> 00:29:19,000 passes smack in the middle of the points? 349 00:29:19,000 --> 00:29:27,000 The question is, given experimental data xi, 350 00:29:27,000 --> 00:29:36,000 yi -- Maybe I should actually be more precise. 351 00:29:36,000 --> 00:29:37,000 You are given some experimental data. 352 00:29:37,000 --> 00:29:45,000 You have data points (x1, y1), (x2, y2) and so on, 353 00:29:45,000 --> 00:29:52,000 (xn, yn), the question would be find the 354 00:29:52,000 --> 00:30:00,000 "best fit" line of the form y equals ax + b 355 00:30:00,000 --> 00:30:08,000 that somehow approximates very well this data. 356 00:30:08,000 --> 00:30:11,000 You can also use that right away to predict various things. 357 00:30:11,000 --> 00:30:13,000 For example, if you look at your new 358 00:30:13,000 --> 00:30:17,000 homework, actually the first problem asks 359 00:30:17,000 --> 00:30:22,000 you to predict how many iPods will be on this planet in ten 360 00:30:22,000 --> 00:30:28,000 years looking at past sales and how they behave. 361 00:30:28,000 --> 00:30:31,000 One thing, right away, before you lose all the money 362 00:30:31,000 --> 00:30:35,000 that you don't have yet, you cannot use that to predict 363 00:30:35,000 --> 00:30:39,000 the stock market. So, don't try to use that to 364 00:30:39,000 --> 00:30:52,000 make money. It doesn't work. 365 00:30:52,000 --> 00:30:58,000 One tricky thing here that I want to draw your attention to 366 00:30:58,000 --> 00:31:02,000 is what are the unknowns here? The natural answer would be to 367 00:31:02,000 --> 00:31:03,000 say that the unknowns are x and y. 368 00:31:03,000 --> 00:31:07,000 That is not actually the case. We are not going to solve for 369 00:31:07,000 --> 00:31:09,000 some x and y. I mean we have some values 370 00:31:09,000 --> 00:31:12,000 given to us. 
And, when we are looking for 371 00:31:12,000 --> 00:31:16,000 that line, we don't really care about the perfect value of x. 372 00:31:16,000 --> 00:31:21,000 What we care about is actually these coefficients a and b that 373 00:31:21,000 --> 00:31:26,000 will tell us what the relation is between x and y. 374 00:31:26,000 --> 00:31:30,000 In fact, we are trying to solve for a and b that will give us 375 00:31:30,000 --> 00:31:34,000 the nicest possible line for these points. 376 00:31:34,000 --> 00:31:36,000 The unknowns, in our equations, 377 00:31:36,000 --> 00:31:39,000 will have to be a and b, not x and y. 378 00:32:11,000 --> 00:32:20,000 The question really is find the "best" 379 00:32:20,000 --> 00:32:23,000 a and b. And, of course, 380 00:32:23,000 --> 00:32:26,000 we have to decide what we mean by best. 381 00:32:26,000 --> 00:32:30,000 Best will mean that we minimize some function of a and b that 382 00:32:30,000 --> 00:32:34,000 measures the total errors that we are making when we are 383 00:32:34,000 --> 00:32:38,000 choosing this line compared to the experimental data. 384 00:32:38,000 --> 00:32:43,000 Maybe, roughly speaking, it should measure how far these 385 00:32:43,000 --> 00:32:49,000 points are from the line. But now there are various ways 386 00:32:49,000 --> 00:32:52,000 to do it. And a lot of them are valid 387 00:32:52,000 --> 00:32:57,000 they give you different answers. You have to decide what it is 388 00:32:57,000 --> 00:32:59,000 that you prefer. For example, 389 00:32:59,000 --> 00:33:04,000 you could measure the distance to the line by projecting 390 00:33:04,000 --> 00:33:08,000 perpendicularly. Or you could measure instead, 391 00:33:08,000 --> 00:33:13,000 for a given value of x, the difference between the 392 00:33:13,000 --> 00:33:17,000 experimental value of y and the predicted one. 
393 00:33:17,000 --> 00:33:21,000 And that is often more relevant because these guys actually may 394 00:33:21,000 --> 00:33:25,000 be expressed in different units. They are not the same type of 395 00:33:25,000 --> 00:33:29,000 quantity. You cannot actually combine 396 00:33:29,000 --> 00:33:32,000 them arbitrarily. Anyway, the convention is 397 00:33:32,000 --> 00:33:34,000 usually we measure distance in this way. 398 00:33:34,000 --> 00:33:38,000 Next, you could try to minimize the largest distance. 399 00:33:38,000 --> 00:33:42,000 Say we look at who has the largest error and we make that 400 00:33:42,000 --> 00:33:44,000 the smallest possible. The drawback of doing that is 401 00:33:44,000 --> 00:33:47,000 experimentally very often you have one data point that is not 402 00:33:47,000 --> 00:33:50,000 good because maybe you fell asleep in front of the 403 00:33:50,000 --> 00:33:53,000 experiment. And so you didn't measure the 404 00:33:53,000 --> 00:33:55,000 right thing. You tend to want to not give 405 00:33:55,000 --> 00:33:59,000 too much importance to some data point that is far away from the 406 00:33:59,000 --> 00:34:02,000 others. Maybe instead you want to 407 00:34:02,000 --> 00:34:06,000 measure the average distance or maybe you want to actually give 408 00:34:06,000 --> 00:34:09,000 more weight to things that are further away. 409 00:34:09,000 --> 00:34:12,000 And then you want to use not the distance but the square of 410 00:34:12,000 --> 00:34:14,000 the distance. There are various possible 411 00:34:14,000 --> 00:34:18,000 answers, but one of them gives us actually a particularly nice 412 00:34:18,000 --> 00:34:22,000 formula for a and b. And so that is why it is the 413 00:34:22,000 --> 00:34:27,000 universally used one. Here it says least squares. 414 00:34:27,000 --> 00:34:31,000 That's because we will measure, actually, the sum of the 415 00:34:31,000 --> 00:34:35,000 squares of the errors. And why do we do that?
416 00:34:35,000 --> 00:34:37,000 Well, part of it is because it looks good. 417 00:34:37,000 --> 00:34:42,000 When you see these plots in scientific papers they really 418 00:34:42,000 --> 00:34:46,000 look like the line is indeed the ideal line. 419 00:34:46,000 --> 00:34:49,000 And the second reason is because actually the 420 00:34:49,000 --> 00:34:52,000 minimization problem that we will get is particularly simple, 421 00:34:52,000 --> 00:34:57,000 well-posed and easy to solve. So we will have a nice formula 422 00:34:57,000 --> 00:35:03,000 for the best a and the best b. If you have a method that is 423 00:35:03,000 --> 00:35:07,000 simple and gives you a good answer then that is probably 424 00:35:07,000 --> 00:35:09,000 good. We have to define best. 425 00:35:09,000 --> 00:35:22,000 Here it is in the sense of minimizing the total square 426 00:35:22,000 --> 00:35:29,000 error. Or maybe I should say total 427 00:35:29,000 --> 00:35:35,000 square deviation instead. What do I mean by this? 428 00:35:35,000 --> 00:35:44,000 The deviation for each data point is the difference between 429 00:35:44,000 --> 00:35:52,000 what you have measured and what you are predicting by your 430 00:35:52,000 --> 00:36:00,000 model. That is the difference between 431 00:36:00,000 --> 00:36:11,000 yi and axi plus b. Now, what we will do is try to 432 00:36:11,000 --> 00:36:25,000 minimize the function capital D, which is just the sum for all 433 00:36:25,000 --> 00:36:36,000 the data points of the square of the deviation. 434 00:36:36,000 --> 00:36:40,000 Let me go over this again. This is a function of a and b. 435 00:36:40,000 --> 00:36:43,000 Of course there are a lot of letters in here, 436 00:36:43,000 --> 00:36:46,000 but xi and yi in real life there will be numbers given to 437 00:36:46,000 --> 00:36:48,000 you. There will be numbers that you 438 00:36:48,000 --> 00:36:51,000 have measured. You have measured all of this 439 00:36:51,000 --> 00:36:53,000 data.
They are just going to be 440 00:36:53,000 --> 00:36:58,000 numbers. You put them in there and you 441 00:36:58,000 --> 00:37:04,000 get a function of a and b. Any questions? 442 00:37:16,000 --> 00:37:20,000 How do we minimize this function of a and b? 443 00:37:20,000 --> 00:37:27,000 Well, let's use your knowledge. Let's actually look for a 444 00:37:27,000 --> 00:37:34,000 critical point. We want to solve for partial D 445 00:37:34,000 --> 00:37:42,000 over partial a = 0, partial D over partial b = 0. 446 00:37:42,000 --> 00:37:48,000 That is how we look for critical points. 447 00:37:48,000 --> 00:37:52,000 Let's take the derivative of this with respect to a. 448 00:37:52,000 --> 00:37:59,000 Well, the derivative of a sum is the sum of the derivatives. 449 00:37:59,000 --> 00:38:04,000 And now we have to take the derivative of this quantity 450 00:38:04,000 --> 00:38:07,000 squared. Remember, we take the 451 00:38:07,000 --> 00:38:11,000 derivative of the square. We take twice this quantity 452 00:38:11,000 --> 00:38:15,000 times the derivative of what we are squaring. 453 00:38:15,000 --> 00:38:26,000 We will get 2(yi - (axi + b)) times the derivative of this with 454 00:38:26,000 --> 00:38:30,000 respect to a. What is the derivative of this 455 00:38:30,000 --> 00:38:35,000 with respect to a? Negative xi, exactly. 456 00:38:35,000 --> 00:38:38,000 And so we will want this to be 0. 457 00:38:38,000 --> 00:38:41,000 And partial D over partial b, we do the same thing, 458 00:38:41,000 --> 00:38:45,000 but differentiating with respect to b instead of with 459 00:38:45,000 --> 00:38:50,000 respect to a. Again, the sum of twice 460 00:38:50,000 --> 00:38:58,000 (yi minus (axi plus b)) times the derivative of this with respect 461 00:38:58,000 --> 00:39:02,000 to b, which is, I think, negative one. 462 00:39:02,000 --> 00:39:07,000 Those are the equations we have to solve. 463 00:39:07,000 --> 00:39:10,000 Well, let's reorganize this a little bit.
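As a rough sketch of the computation just described, the Python below writes out the total square deviation D(a, b) and the two partial derivatives obtained from the chain rule, then compares them with finite-difference estimates. The data points, the function names and the test point (a0, b0) are made up for illustration; none of them come from the lecture.

```python
# Hypothetical data points (x_i, y_i), chosen only for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 4.0, 8.0]


def D(a, b):
    """Total square deviation: sum of (y_i - (a*x_i + b))^2."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def dD_da(a, b):
    """Partial of D in a: sum of 2*(y_i - (a*x_i + b)) * (-x_i)."""
    return sum(2 * (y - (a * x + b)) * (-x) for x, y in zip(xs, ys))


def dD_db(a, b):
    """Partial of D in b: sum of 2*(y_i - (a*x_i + b)) * (-1)."""
    return sum(2 * (y - (a * x + b)) * (-1) for x, y in zip(xs, ys))


# Central finite differences at an arbitrary (hypothetical) point.
a0, b0 = 0.5, -1.0
h = 1e-6
approx_da = (D(a0 + h, b0) - D(a0 - h, b0)) / (2 * h)
approx_db = (D(a0, b0 + h) - D(a0, b0 - h)) / (2 * h)

print(dD_da(a0, b0), approx_da)
print(dD_db(a0, b0), approx_db)
```

Each printed pair should agree closely, which is a quick sanity check that the signs (negative xi and negative one) in the two derivatives are right.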
464 00:39:24,000 --> 00:39:32,000 The first equation. See, there are a's and there 465 00:39:32,000 --> 00:39:36,000 are b's in these equations. I am going to just look at the 466 00:39:36,000 --> 00:39:39,000 coefficients of a and b. If you have good eyes, 467 00:39:39,000 --> 00:39:42,000 you can see probably that these are actually linear equations in 468 00:39:42,000 --> 00:39:45,000 a and b. There is a lot of clutter with 469 00:39:45,000 --> 00:39:47,000 all these x's and y's all over the place. 470 00:39:47,000 --> 00:39:55,000 Let's actually try to expand things and make that more 471 00:39:55,000 --> 00:39:59,000 apparent. The first thing I will do is 472 00:39:59,000 --> 00:40:02,000 actually get rid of these factors of two. 473 00:40:02,000 --> 00:40:05,000 They are just not very important. 474 00:40:05,000 --> 00:40:10,000 I can simplify things. Next, I am going to look at the 475 00:40:10,000 --> 00:40:15,000 coefficient of a. I will get basically a times xi 476 00:40:15,000 --> 00:40:24,000 squared. Let me just do it and it should be 477 00:40:24,000 --> 00:40:33,000 clear. I claim when we simplify this 478 00:40:33,000 --> 00:40:46,000 we get xi squared times a plus xi times b minus xiyi. 479 00:40:46,000 --> 00:40:53,000 And we set this equal to zero. Do you agree that this is what 480 00:40:53,000 --> 00:40:57,000 we get when we expand that product? 481 00:40:57,000 --> 00:41:03,000 Yeah. Kind of? OK. Let's do the other one. 482 00:41:03,000 --> 00:41:08,000 We just multiply by minus one, so we take the opposite of that 483 00:41:08,000 --> 00:41:19,000 which would be axi plus b. I will write that as xia plus b 484 00:41:19,000 --> 00:41:25,000 minus yi. Sorry. I forgot the n here. 485 00:41:25,000 --> 00:41:30,000 And let me just reorganize that by actually putting all the a's 486 00:41:30,000 --> 00:41:34,000 together.
That means I will have sum of 487 00:41:34,000 --> 00:41:40,000 all the xi squared times a plus sum of xi times b minus sum of xiyi equal to 488 00:41:40,000 --> 00:41:41,000 zero. 489 00:42:08,000 --> 00:42:15,000 If I rewrite this, it becomes sum of xi squared times a 490 00:42:15,000 --> 00:42:24,000 plus sum of the xi's times b, and let me move the other guys 491 00:42:24,000 --> 00:42:30,000 to the other side, equals sum of xiyi. 492 00:42:30,000 --> 00:42:37,000 And that one becomes sum of xi times a. 493 00:42:37,000 --> 00:42:41,000 Plus how many b's do I get on this one? 494 00:42:41,000 --> 00:42:45,000 I get one for each data point. When I sum them together, 495 00:42:45,000 --> 00:42:48,000 I will get n. Very good. 496 00:42:48,000 --> 00:42:56,000 n times b equals sum of yi. Now, these quantities look 497 00:42:56,000 --> 00:42:58,000 scary, but they are actually just numbers. 498 00:42:58,000 --> 00:43:01,000 For example, this one, you look at all your 499 00:43:01,000 --> 00:43:05,000 data points. For each of them you take the 500 00:43:05,000 --> 00:43:10,000 value of x and you just sum all these numbers together. 501 00:43:10,000 --> 00:43:19,000 What you get, actually, is a linear system in 502 00:43:19,000 --> 00:43:26,000 a and b, a two by two linear system. 503 00:43:26,000 --> 00:43:32,000 And so now we can solve this for a and b. 504 00:43:32,000 --> 00:43:35,000 In practice, of course, first you plug in 505 00:43:35,000 --> 00:43:40,000 the numbers for xi and yi and then you solve the system that 506 00:43:40,000 --> 00:43:44,000 you get. And we know how to solve two by 507 00:43:44,000 --> 00:43:46,000 two linear systems, I hope. 508 00:43:46,000 --> 00:43:50,000 That's how we find the best fit line. 509 00:43:50,000 --> 00:43:54,000 Now, why is that going to be the best one instead of the 510 00:43:54,000 --> 00:43:56,000 worst one? We just solved for a critical 511 00:43:56,000 --> 00:43:58,000 point.
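To make the two by two system concrete, here is a minimal Python sketch that computes the four sums and solves for a and b with Cramer's rule. The function name best_fit_line and the sample data are assumptions made for illustration; the lecture only says to plug in the numbers and solve the system, without prescribing a method.

```python
def best_fit_line(xs, ys):
    """Solve the 2x2 linear system for the least-squares line y = a*x + b:

        (sum x_i^2) * a + (sum x_i) * b = sum x_i*y_i
        (sum x_i)   * a +      n    * b = sum y_i

    using Cramer's rule (one possible way to solve a 2x2 system).
    """
    n = len(xs)
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)

    det = sxx * n - sx * sx
    if det == 0:
        # Happens when all x_i are equal: no unique line.
        raise ValueError("the x values do not determine a unique line")

    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = best_fit_line(xs, ys)
print(a, b)  # prints 2.0 1.0
```

With data that lies exactly on a line, the total square deviation can be made zero, so the fit recovers that line; with noisy data the same code returns the critical point of D.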
That could actually be a 512 00:43:58,000 --> 00:44:01,000 maximum of this error function D. 513 00:44:01,000 --> 00:44:05,000 We will have the answer to that next time, but trust me. 514 00:44:05,000 --> 00:44:08,000 If you really want to go over the second derivative test that 515 00:44:08,000 --> 00:44:11,000 we will see tomorrow and apply it in this case, 516 00:44:11,000 --> 00:44:14,000 it is quite hard to check, but you can see it is actually 517 00:44:14,000 --> 00:44:28,000 a minimum. I will just say -- -- we can 518 00:44:28,000 --> 00:44:42,000 show that it is a minimum. Now, the linear 519 00:44:42,000 --> 00:44:47,000 case is the one that we are the most familiar with. 520 00:44:47,000 --> 00:44:56,000 Least-squares interpolation actually works in much more 521 00:44:56,000 --> 00:45:03,000 general settings. Because instead of fitting for 522 00:45:03,000 --> 00:45:06,000 the best line, if you think it has a different 523 00:45:06,000 --> 00:45:10,000 kind of relation then maybe you can fit in using a different 524 00:45:10,000 --> 00:45:14,000 kind of formula. Let me actually illustrate that 525 00:45:14,000 --> 00:45:17,000 with an example. I don't know if you are 526 00:45:17,000 --> 00:45:21,000 familiar with Moore's law. It is something that is 527 00:45:21,000 --> 00:45:24,000 supposed to tell you how quickly basically computer chips become 528 00:45:24,000 --> 00:45:27,000 smarter faster and faster all the time. 529 00:45:27,000 --> 00:45:31,000 It's a law that says things about the number of transistors 530 00:45:31,000 --> 00:45:33,000 that you can fit onto a computer chip. 531 00:45:33,000 --> 00:45:45,000 Here I have some data about -- Here is data about the number of 532 00:45:45,000 --> 00:45:58,000 transistors on a standard PC processor as a function of time. 533 00:45:58,000 --> 00:46:01,000 And if you try to do a best-line fit, 534 00:46:01,000 --> 00:46:07,000 well, it doesn't seem to follow a linear trend.
535 00:46:07,000 --> 00:46:11,000 On the other hand, if you plot the diagram in the 536 00:46:11,000 --> 00:46:13,000 log scale, the log of the number of 537 00:46:13,000 --> 00:46:15,000 transistors as a function of time, 538 00:46:15,000 --> 00:46:21,000 then you get a much better line. And so, in fact, 539 00:46:21,000 --> 00:46:26,000 that means that you had an exponential relation between the 540 00:46:26,000 --> 00:46:30,000 number of transistors and time. And so, actually that's what 541 00:46:30,000 --> 00:46:32,000 Moore's law says. It says that the number of 542 00:46:32,000 --> 00:46:36,000 transistors in the chip doubles every 18 months or every two 543 00:46:36,000 --> 00:46:40,000 years. They keep changing the 544 00:46:40,000 --> 00:46:49,000 statement. How do we find the best 545 00:46:49,000 --> 00:46:58,000 exponential fit? Well, an exponential fit would 546 00:46:58,000 --> 00:47:05,000 be something of the form y equals a constant c times exponential of 547 00:47:05,000 --> 00:47:09,000 a times x. That is what we want to look at. 548 00:47:09,000 --> 00:47:13,000 Well, we could try to minimize a square error like we did 549 00:47:13,000 --> 00:47:16,000 before. That doesn't work well at all. 550 00:47:16,000 --> 00:47:18,000 The equations that you get are very complicated. 551 00:47:18,000 --> 00:47:24,000 You cannot solve them. But remember what I showed you 552 00:47:24,000 --> 00:47:28,000 on this log plot. If you plot the log of y as a 553 00:47:28,000 --> 00:47:33,000 function of x then suddenly it becomes a linear relation. 554 00:47:33,000 --> 00:47:43,000 Observe, this is the same as ln of y equals ln of c plus ax. 555 00:47:43,000 --> 00:47:55,000 And that is the linear best fit. What you do is you just look 556 00:47:55,000 --> 00:48:08,000 for the best straight line fit for the log of y. 557 00:48:08,000 --> 00:48:10,000 That is something we already know.
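The log trick can be sketched in a few lines of Python: take ln of each y value, fit a straight line to the pairs (x, ln y), and read off a as the slope and c as the exponential of the intercept. The helper names and the sample data (a quantity that doubles every two units of x) are hypothetical, not the transistor counts shown in the lecture. Note that this minimizes the square deviation of ln y rather than of y itself, which is the simplification the lecture describes.

```python
import math


def best_fit_line(xs, ys):
    """Least-squares line y = a*x + b from the 2x2 linear system."""
    n = len(xs)
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("the x values do not determine a unique line")
    return (sxy * n - sx * sy) / det, (sxx * sy - sx * sxy) / det


def best_fit_exponential(xs, ys):
    """Fit y = c * exp(a*x) by a straight-line fit of ln(y) against x.

    Requires every y to be positive so that the logarithm is defined.
    """
    log_ys = [math.log(y) for y in ys]
    a, ln_c = best_fit_line(xs, log_ys)  # ln y = ln c + a*x
    return math.exp(ln_c), a


# Hypothetical data: starts at 1000 and doubles every 2 units of x.
xs = [0.0, 2.0, 4.0, 6.0, 8.0]
ys = [1000.0 * 2.0 ** (x / 2.0) for x in xs]

c, a = best_fit_exponential(xs, ys)
doubling_time = math.log(2.0) / a
print(c, a, doubling_time)
```

For this made-up data the recovered c should be close to 1000 and the doubling time close to 2, up to floating-point rounding.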
558 00:48:10,000 --> 00:48:12,000 But you can also do, for example, 559 00:48:12,000 --> 00:48:16,000 let's say that we have something more complicated. 560 00:48:16,000 --> 00:48:21,000 Let's say that we have actually a quadratic law. 561 00:48:21,000 --> 00:48:27,000 For example, y is of the form ax^2 + bx + c. 562 00:48:27,000 --> 00:48:31,000 And, of course, you are trying to find somehow 563 00:48:31,000 --> 00:48:34,000 the best. That would mean here fitting 564 00:48:34,000 --> 00:48:37,000 the best parabola for your data points. 565 00:48:37,000 --> 00:48:40,000 Well, to do that, you would need to find a, 566 00:48:40,000 --> 00:48:45,000 b and c. And now you will have actually 567 00:48:45,000 --> 00:48:51,000 a function of a, b and c, which would be the sum 568 00:48:51,000 --> 00:48:57,000 over all the data points of the square deviation. 569 00:48:57,000 --> 00:49:01,000 And, if you try to solve for critical points, 570 00:49:01,000 --> 00:49:03,000 now you will have three equations involving a, 571 00:49:03,000 --> 00:49:05,000 b and c, in fact, you will find a three 572 00:49:05,000 --> 00:49:09,000 by three linear system. And it works the same way. 573 00:49:09,000 --> 00:49:14,000 Just you have a little bit more data. 574 00:49:14,000 --> 00:49:19,000 Basically, you see that these best fit problems are an example 575 00:49:19,000 --> 00:49:24,000 of a minimization problem that maybe you didn't expect to see 576 00:49:24,000 --> 00:49:30,000 minimization problems come in. But that is really the way to 577 00:49:30,000 --> 00:49:34,000 handle these questions. Tomorrow we will go back to the 578 00:49:34,000 --> 00:49:38,000 question of how do we decide whether it is a minimum or a 579 00:49:38,000 --> 00:49:40,000 maximum. And we will continue exploring 580 00:49:40,000 --> 00:49:43,000 functions of several variables.
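For the quadratic case, setting the three partial derivatives of the total square deviation to zero gives a three by three linear system in a, b and c, by the same computation as for the line. The Python below is one possible sketch: it builds that system from sums of powers of xi and solves it with Cramer's rule. The function names, the choice of Cramer's rule and the sample data are assumptions for illustration; the lecture does not write this system out.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def best_fit_parabola(xs, ys):
    """Least-squares parabola y = a*x^2 + b*x + c.

    The critical-point equations form the linear system
        (sum x^4) a + (sum x^3) b + (sum x^2) c = sum x^2*y
        (sum x^3) a + (sum x^2) b + (sum x)   c = sum x*y
        (sum x^2) a + (sum x)   b +     n     c = sum y
    """
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))

    matrix = [[s4, s3, s2], [s3, s2, s1], [s2, s1, n]]
    rhs = [t2, t1, t0]

    d = det3(matrix)
    if d == 0:
        raise ValueError("the data do not determine a unique parabola")

    # Cramer's rule: replace one column at a time by the right-hand side.
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return tuple(solution)


# Hypothetical data lying exactly on y = x^2 - 2x + 3.
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [x * x - 2.0 * x + 3.0 for x in xs]
a, b, c = best_fit_parabola(xs, ys)
print(a, b, c)
```

Because the sample points lie exactly on a parabola, the fit should return coefficients close to 1, -2 and 3; at least three distinct x values are needed for the system to have a unique solution.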