1 00:00:01 --> 00:00:03 The following content is provided under a Creative 2 00:00:03 --> 00:00:05 Commons license. Your support will help MIT 3 00:00:05 --> 00:00:08 OpenCourseWare continue to offer high quality educational 4 00:00:08 --> 00:00:13 resources for free. To make a donation or to view 5 00:00:13 --> 00:00:18 additional materials from hundreds of MIT courses, 6 00:00:18 --> 00:00:23 visit MIT OpenCourseWare at ocw.mit.edu. 7 00:00:23 --> 00:00:27 So far we have learned about partial derivatives and how to 8 00:00:27 --> 00:00:31 use them to find minima and maxima of functions of two 9 00:00:31 --> 00:00:35 variables or several variables. And now we are going to try to 10 00:00:35 --> 00:00:38 study, in more detail, how functions of several 11 00:00:38 --> 00:00:41 variables behave, how to compute their 12 00:00:41 --> 00:00:44 variations. How to estimate the variation 13 00:00:44 --> 00:00:50 in arbitrary directions. And so for that we are going to 14 00:00:50 --> 00:00:56 need some more tools actually to study these things. 15 00:00:56 --> 00:01:00 More tools to study functions. 16 00:01:00 --> 00:01:15 17 00:01:15 --> 00:01:26 Today's topic is going to be differentials. 18 00:01:26 --> 00:01:34 And, just to motivate that, let me remind you about one 19 00:01:34 --> 00:01:43 trick that you probably know from single variable calculus, 20 00:01:43 --> 00:01:48 namely implicit differentiation. 21 00:01:48 --> 00:01:56 Let's say that you have a function y equals f of x then 22 00:01:56 --> 00:02:05 you would sometimes write dy equals f prime of x times dx. 23 00:02:05 --> 00:02:17 And then maybe you would -- We use implicit differentiation to 24 00:02:17 --> 00:02:29 actually relate infinitesimal changes in y with infinitesimal 25 00:02:29 --> 00:02:35 changes in x. 
And one thing we can do with 26 00:02:35 --> 00:02:39 that, for example, is actually figure out the rate 27 00:02:39 --> 00:02:43 of change dy by dx, but also the reciprocal dx by 28 00:02:43 --> 00:02:48 dy. And so, for example, 29 00:02:48 --> 00:02:58 let's say that we have y equals inverse sin(x). 30 00:02:58 --> 00:03:03 Then we can write x equals sin(y). 31 00:03:03 --> 00:03:08 And, from there, we can actually find out what 32 00:03:08 --> 00:03:13 is the derivative of this function if we didn't know the 33 00:03:13 --> 00:03:18 answer already by writing dx equals cosine y dy. 34 00:03:18 --> 00:03:28 That tells us that dy over dx is going to be one over cosine 35 00:03:28 --> 00:03:32 y. And now cosine y, from its relation to 36 00:03:32 --> 00:03:40 sine, is the square root of one minus x^2, so dy over dx is one over square root of one minus x^2. 37 00:03:40 --> 00:03:44 And that is how you find the formula for the derivative of 38 00:03:44 --> 00:03:50 the inverse sine function. A formula that you probably 39 00:03:50 --> 00:03:54 already knew, but that is one way to derive 40 00:03:54 --> 00:03:57 it. Now we are going to use also 41 00:03:57 --> 00:03:59 these kinds of notations, dx, dy and so on, 42 00:03:59 --> 00:04:03 but use them for functions of several variables. 43 00:04:03 --> 00:04:05 And, of course, we will have to learn what the 44 00:04:05 --> 00:04:08 rules of manipulation are and what we can do with them. 45 00:04:08 --> 00:04:17 46 00:04:17 --> 00:04:20 The actual name of that is the total differential, 47 00:04:20 --> 00:04:23 as opposed to the partial derivatives. 48 00:04:23 --> 00:04:28 The total differential includes all of the various causes that 49 00:04:28 --> 00:04:33 can change -- Sorry. All the contributions that can 50 00:04:33 --> 00:04:38 cause the value of your function f to change. 
51 00:04:38 --> 00:04:43 Namely, let's say that you have a function maybe of three 52 00:04:43 --> 00:04:44 variables, x, y, z, 53 00:04:44 --> 00:04:56 then you would write df equals f sub x dx plus f sub y dy plus 54 00:04:56 --> 00:05:02 f sub z dz. Maybe, just to remind you of 55 00:05:02 --> 00:05:07 the other notation, partial f over partial x dx 56 00:05:07 --> 00:05:14 plus partial f over partial y dy plus partial f over partial z 57 00:05:14 --> 00:05:18 dz. Now, what is this object? 58 00:05:18 --> 00:05:22 What are the things on either side of this equality? 59 00:05:22 --> 00:05:24 Well, they are called differentials. 60 00:05:24 --> 00:05:26 And they are not numbers, they are not vectors, 61 00:05:26 --> 00:05:29 they are not matrices, they are a different kind of 62 00:05:29 --> 00:05:32 object. These things have their own 63 00:05:32 --> 00:05:36 rules of manipulations, and we have to learn what we 64 00:05:36 --> 00:05:40 can do with them. So how do we think about them? 65 00:05:40 --> 00:05:51 First of all, how do we not think about them? 66 00:05:51 --> 00:05:55 Here is an important thing to know. 67 00:05:55 --> 00:06:07 Important. df is not the same thing as 68 00:06:07 --> 00:06:12 delta f. That is meant to be a number. 69 00:06:12 --> 00:06:16 It is going to be a number once you have a small variation of x, 70 00:06:16 --> 00:06:19 a small variation of y, a small variation of z. 71 00:06:19 --> 00:06:21 These are numbers. Delta x, delta y and delta z 72 00:06:21 --> 00:06:24 are actual numbers, and this becomes a number. 73 00:06:24 --> 00:06:26 This guy actually is not a number. 74 00:06:26 --> 00:06:30 You cannot give it a particular value. 75 00:06:30 --> 00:06:33 All you can do with a differential is express it in 76 00:06:33 --> 00:06:36 terms of other differentials. In fact, this dx, 77 00:06:36 --> 00:06:38 dy and dz, well, they are mostly symbols out 78 00:06:38 --> 00:06:42 there. 
But if you want to think about 79 00:06:42 --> 00:06:46 them, they are the differentials of x, y and z. 80 00:06:46 --> 00:06:52 In fact, you can think of these differentials as placeholders 81 00:06:52 --> 00:06:57 where you will put other things. Of course, they represent, 82 00:06:57 --> 00:07:02 you know, there is this idea of changes in x, 83 00:07:02 --> 00:07:05 y, z and f. One way that one could explain 84 00:07:05 --> 00:07:09 it, and I don't really like it, is to say they represent 85 00:07:09 --> 00:07:12 infinitesimal changes. Another way to say it, 86 00:07:12 --> 00:07:14 and I think that is probably closer to the truth, 87 00:07:14 --> 00:07:19 is that these things are somehow placeholders to put 88 00:07:19 --> 00:07:22 values and get a tangent approximation. 89 00:07:22 --> 00:07:25 For example, if I do replace these symbols 90 00:07:25 --> 00:07:30 by delta x, delta y and delta z numbers then I will actually get 91 00:07:30 --> 00:07:33 a numerical quantity. And that will be an 92 00:07:33 --> 00:07:39 approximation formula for delta f. It will be the linear 93 00:07:39 --> 00:07:44 approximation, a tangent plane approximation. 94 00:07:44 --> 00:07:52 What we can do -- Well, let me start first with maybe 95 00:07:52 --> 00:08:00 something even before that. The first thing that it does is 96 00:08:00 --> 00:08:10 it can encode how changes in x, y, z affect the value of f. 97 00:08:10 --> 00:08:15 I would say that is the most general answer to what is this 98 00:08:15 --> 00:08:18 formula, what are these differentials. 99 00:08:18 --> 00:08:24 It is a relation between x, y, z and f. 100 00:08:24 --> 00:08:36 And this is a placeholder for small variations, 101 00:08:36 --> 00:08:53 delta x, delta y and delta z to get an approximation formula. 102 00:08:53 --> 00:09:00 Which is delta f is approximately equal to fx delta 103 00:09:00 --> 00:09:06 x plus fy delta y plus fz delta z. 
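The approximation formula just stated can be tried out numerically. The following is a minimal sketch, assuming a sample function f(x, y, z) = x^2 y + z and a sample base point; both are chosen here only for illustration, and the helper names are hypothetical. It substitutes actual numbers delta x, delta y, delta z for the placeholders dx, dy, dz and compares the result with the true change in f.

```python
def f(x, y, z):
    # Sample function, assumed here only for illustration.
    return x * x * y + z

def partials(x, y, z):
    # Exact partial derivatives (f_x, f_y, f_z) of the sample function.
    return (2 * x * y, x * x, 1.0)

def linear_estimate(point, deltas):
    # f_x * delta_x + f_y * delta_y + f_z * delta_z: numbers put in place
    # of the differentials dx, dy, dz.
    fx, fy, fz = partials(*point)
    dx, dy, dz = deltas
    return fx * dx + fy * dy + fz * dz

def actual_change(point, deltas):
    # The true delta f between the base point and the moved point.
    moved = tuple(p + d for p, d in zip(point, deltas))
    return f(*moved) - f(*point)

point = (1.0, 2.0, 3.0)
for h in (0.1, 0.01, 0.001):
    deltas = (h, -h, 2 * h)
    est = linear_estimate(point, deltas)
    act = actual_change(point, deltas)
    # The gap between estimate and true change shrinks much faster than h.
    print(h, est, act, abs(act - est))
```

The estimate and the true change are never required to agree exactly, which is the point of writing "approximately equal" for delta f while df is given by an exact equality.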
It is getting cramped, 104 00:09:06 --> 00:09:11 but I am sure you know what is going on here. 105 00:09:11 --> 00:09:15 And observe how this one is actually equal while that one is 106 00:09:15 --> 00:09:19 approximately equal. So they are really not the same. 107 00:09:19 --> 00:09:22 Another thing that the notation suggests we can do, 108 00:09:22 --> 00:09:26 and I claim we can do, is divide everything by some 109 00:09:26 --> 00:09:29 variable that everybody depends on. 110 00:09:29 --> 00:09:33 Say, for example, that x, y and z actually depend 111 00:09:33 --> 00:09:39 on some parameter t then they will vary, at a certain rate, 112 00:09:39 --> 00:09:42 dx over dt, dy over dt, dz over dt. 113 00:09:42 --> 00:09:46 And what the differential will tell us then is the rate of 114 00:09:46 --> 00:09:51 change of f as a function of t, when you plug in these values 115 00:09:51 --> 00:09:57 of x, y, z, you will get df over dt by 116 00:09:57 --> 00:10:05 dividing everything by dt in here. 117 00:10:05 --> 00:10:21 The first thing we can do is divide by something like dt to 118 00:10:21 --> 00:10:30 get infinitesimal rate of change. 119 00:10:30 --> 00:10:43 Well, let me just say rate of change. 120 00:10:43 --> 00:10:52 df over dt equals f sub x dx over dt plus f sub y dy over dt 121 00:10:52 --> 00:11:00 plus f sub z dz over dt. And that corresponds to the 122 00:11:00 --> 00:11:09 situation where x is a function of t, y is a function of t and z 123 00:11:09 --> 00:11:14 is a function of t. That means you can plug in 124 00:11:14 --> 00:11:18 these values into f to get, well, the value of f will 125 00:11:18 --> 00:11:23 depend on t, and then you can find the rate 126 00:11:23 --> 00:11:27 of change with t of the value of f. 127 00:11:27 --> 00:11:35 These are the basic rules. And this is known as the chain 128 00:11:35 --> 00:11:38 rule. 
It is one instance of a chain 129 00:11:38 --> 00:11:40 rule, which tells you when you have a 130 00:11:40 --> 00:11:42 function that depends on something, 131 00:11:42 --> 00:11:45 and that something in turn depends on something else, 132 00:11:45 --> 00:11:51 how to find the rate of change of a function on the new 133 00:11:51 --> 00:11:56 variable in terms of the derivatives of a function and 134 00:11:56 --> 00:12:01 also the dependence between the various variables. 135 00:12:01 --> 00:12:08 Any questions so far? No. 136 00:12:08 --> 00:12:11 OK. A word of warning, 137 00:12:11 --> 00:12:15 in particular, about what I said up here. 138 00:12:15 --> 00:12:19 It is kind of unfortunate, but the textbook actually has a 139 00:12:19 --> 00:12:23 serious mistake on that. I mean they do have a couple of 140 00:12:23 --> 00:12:29 formulas where they mix a d with a delta, and I warn you not to 141 00:12:29 --> 00:12:32 do that, please. I mean there are d's and there 142 00:12:32 --> 00:12:34 are delta's, and basically they don't live in the same world. 143 00:12:34 --> 00:12:53 They don't see each other. The textbook is lying to you. 144 00:12:53 --> 00:12:59 Let's see. The first and the second 145 00:12:59 --> 00:13:01 claims, I don't really need to justify 146 00:13:01 --> 00:13:05 because the first one is just stating some general principle, 147 00:13:05 --> 00:13:08 but I am not making a precise mathematical claim. 148 00:13:08 --> 00:13:11 The second one, well, we know the approximation 149 00:13:11 --> 00:13:14 formula already, so I don't need to justify it 150 00:13:14 --> 00:13:16 for you. But, on the other hand, 151 00:13:16 --> 00:13:20 this formula here, I mean, you probably have a 152 00:13:20 --> 00:13:24 right to expect some reason for why this works. 153 00:13:24 --> 00:13:27 Why is this valid? After all, I first told you we 154 00:13:27 --> 00:13:29 have these new mysterious objects. 
155 00:13:29 --> 00:13:32 And then I am telling you we can do that, but I kind of 156 00:13:32 --> 00:13:44 pulled it out of my hat. I mean I don't have a hat. 157 00:13:44 --> 00:13:53 Why is this valid? How can I get to this? 158 00:13:53 --> 00:14:06 Here is a first attempt at justifying how to get there. 159 00:14:06 --> 00:14:13 Let's see. Well, we said df is f sub x dx 160 00:14:13 --> 00:14:25 plus f sub y dy plus f sub z dz. But we know if x is a function 161 00:14:25 --> 00:14:37 of t then dx is x prime of t dt, dy is y prime of t dt, 162 00:14:37 --> 00:14:47 dz is z prime of t dt. If we plug these into that 163 00:14:47 --> 00:14:58 formula, we will get that df is f sub x times x prime of t dt plus 164 00:14:58 --> 00:15:08 f sub y y prime of t dt plus f sub z z prime of t dt. 165 00:15:08 --> 00:15:14 And now I have a relation between df and dt. 166 00:15:14 --> 00:15:17 See, I got df equals something times dt. 167 00:15:17 --> 00:15:23 That means the rate of change of f with respect to t should be 168 00:15:23 --> 00:15:38 that coefficient. If I divide by dt then I get 169 00:15:38 --> 00:15:46 the chain rule. That kind of works, 170 00:15:46 --> 00:15:49 but that shouldn't be completely satisfactory. 171 00:15:49 --> 00:15:53 Let's say that you are a true skeptic and you don't believe in 172 00:15:53 --> 00:15:57 differentials yet then it is maybe not very good that I 173 00:15:57 --> 00:16:01 actually used more of these differential notations in 174 00:16:01 --> 00:16:05 deriving the answer. That is actually not how it is 175 00:16:05 --> 00:16:08 proved. The way in which you prove the 176 00:16:08 --> 00:16:13 chain rule is not this way because we shouldn't have too 177 00:16:13 --> 00:16:16 much trust in differentials just yet. 
178 00:16:16 --> 00:16:18 I mean at the end of today's lecture, yes, 179 00:16:18 --> 00:16:20 probably we should believe in them, 180 00:16:20 --> 00:16:26 but so far we should be a little bit reluctant to believe 181 00:16:26 --> 00:16:32 these kind of strange objects telling us weird things. 182 00:16:32 --> 00:16:39 Here is a better way to think about it. 183 00:16:39 --> 00:16:43 One thing that we have trust in so far are approximation 184 00:16:43 --> 00:16:48 formulas. We should have trust in them. 185 00:16:48 --> 00:16:54 We should believe that if we change x a little bit, 186 00:16:54 --> 00:17:02 if we change y a little bit then we are actually going to 187 00:17:02 --> 00:17:11 get a change in f that is approximately given by these 188 00:17:11 --> 00:17:13 guys. And this is true for any 189 00:17:13 --> 00:17:14 changes in x, y, z, 190 00:17:14 --> 00:17:20 but in particular let's look at the changes that we get if we 191 00:17:20 --> 00:17:26 just take these formulas as function of time and change time 192 00:17:26 --> 00:17:32 a little bit by delta t. We will actually use the 193 00:17:32 --> 00:17:39 changes in x, y, z in a small time delta t. 194 00:17:39 --> 00:17:47 Let's divide everybody by delta t. 195 00:17:47 --> 00:17:52 Here I am just dividing numbers so I am not actually playing any 196 00:17:52 --> 00:17:54 tricks on you. I mean we don't really know 197 00:17:54 --> 00:17:57 what it means to divide differentials, 198 00:17:57 --> 00:17:59 but dividing numbers is something we know. 199 00:17:59 --> 00:18:11 And now, if I take delta t very small, this guy tends to the 200 00:18:11 --> 00:18:19 derivative, df over dt. Remember, the definition of df 201 00:18:19 --> 00:18:23 over dt is the limit of this ratio when the time interval 202 00:18:23 --> 00:18:28 delta t tends to zero. 
That means if I choose smaller 203 00:18:28 --> 00:18:32 and smaller values of delta t then these ratios of numbers 204 00:18:32 --> 00:18:35 will actually tend to some value, 205 00:18:35 --> 00:18:41 and that value is the derivative. 206 00:18:41 --> 00:18:51 Similarly, here delta x over delta t, when delta t is really 207 00:18:51 --> 00:18:59 small, will tend to the derivative dx/dt. 208 00:18:59 --> 00:19:00 And similarly for the others. 209 00:19:00 --> 00:19:18 210 00:19:18 --> 00:19:28 That means, in particular, we take the limit as delta t 211 00:19:28 --> 00:19:35 tends to zero and we get df over dt on one side and on the other 212 00:19:35 --> 00:19:42 side we get f sub x dx over dt plus f sub y dy over dt plus f 213 00:19:42 --> 00:19:46 sub z dz over dt. And the approximation becomes 214 00:19:46 --> 00:19:49 better and better. Remember when we write 215 00:19:49 --> 00:19:53 approximately equal that means it is not quite the same, 216 00:19:53 --> 00:19:57 but if we take smaller variations then actually we will 217 00:19:57 --> 00:20:01 end up with values that are closer and closer. 218 00:20:01 --> 00:20:04 When we take the limit, as delta t tends to zero, 219 00:20:04 --> 00:20:06 eventually we get an equality. 220 00:20:06 --> 00:20:21 221 00:20:21 --> 00:20:24 I mean mathematicians have more complicated words to justify 222 00:20:24 --> 00:20:28 this statement. I will spare them for now, 223 00:20:28 --> 00:20:36 and you will see them when you take analysis if you go in that 224 00:20:36 --> 00:20:42 direction. Any questions so far? 225 00:20:42 --> 00:20:46 No. OK. 226 00:20:46 --> 00:20:47 Let's check this with an example. 227 00:20:47 --> 00:20:58 Let's say that we really don't have any faith in these things 228 00:20:58 --> 00:21:06 so let's try to do it. Let's say I give you a function 229 00:21:06 --> 00:21:14 w that is x^2 y + z. And let's say that maybe x will 230 00:21:14 --> 00:21:20 be t, y will be e^t and z will be sin(t). 
231 00:21:20 --> 00:21:34 232 00:21:34 --> 00:21:40 What does the chain rule say? Well, the chain rule tells us 233 00:21:40 --> 00:21:46 that dw/dt is, we start with partial w over 234 00:21:46 --> 00:21:51 partial x, well, what is that? 235 00:21:51 --> 00:21:58 That is 2xy, and maybe I should point out 236 00:21:58 --> 00:22:08 that this is w sub x, times dx over dt plus -- Well, 237 00:22:08 --> 00:22:21 w sub y is x squared times dy over dt plus w sub z, 238 00:22:21 --> 00:22:28 which is going to be just one, dz over dt. 239 00:22:28 --> 00:22:33 And so now let's plug in the actual values of these things. 240 00:22:33 --> 00:22:38 x is t and y is e^t, so that will be 2t e to the t, 241 00:22:38 --> 00:22:47 dx over dt is one, plus x squared is t squared, 242 00:22:47 --> 00:23:00 dy over dt is e to the t, plus dz over dt is cosine t. 243 00:23:00 --> 00:23:06 At the end of calculation we get 2t e to the t plus t squared 244 00:23:06 --> 00:23:11 e to the t plus cosine t. That is what the chain rule 245 00:23:11 --> 00:23:16 tells us. How else could we find that? 246 00:23:16 --> 00:23:20 Well, we could just plug in values of x, y and z, 247 00:23:20 --> 00:23:23 express w as a function of t, and take its derivative. 248 00:23:23 --> 00:23:26 Let's do that just for verification. 249 00:23:26 --> 00:23:30 It should be exactly the same answer. 250 00:23:30 --> 00:23:32 And, in fact, in this case, 251 00:23:32 --> 00:23:35 the two calculations are roughly equal in complication. 252 00:23:35 --> 00:23:39 But say that your function of x, y, z was much more 253 00:23:39 --> 00:23:43 complicated than that, or maybe you actually didn't 254 00:23:43 --> 00:23:45 know a formula for it, you only knew its partial 255 00:23:45 --> 00:23:48 derivatives, then you would need to use the 256 00:23:48 --> 00:23:51 chain rule. So, sometimes plugging in 257 00:23:51 --> 00:23:54 values is easier but not always. 
258 00:23:54 --> 00:24:13 259 00:24:13 --> 00:24:18 Let's just check quickly. The other method would be to 260 00:24:18 --> 00:24:23 substitute. W as a function of t. 261 00:24:23 --> 00:24:36 Remember w was x^2 y + z. x was t, so you get t squared, 262 00:24:36 --> 00:24:41 y is e to the t, plus z was sine t. 263 00:24:41 --> 00:24:47 dw over dt, we know how to take the derivative using single 264 00:24:47 --> 00:24:50 variable calculus. Well, we should know. 265 00:24:50 --> 00:24:55 If we don't know then we should take a look at 18.01 again. 266 00:24:55 --> 00:25:02 By the product rule that will be derivative of t squared is 2t 267 00:25:02 --> 00:25:08 times e to the t plus t squared times the derivative of e to the 268 00:25:08 --> 00:25:16 t is e to the t plus cosine t. And that is the same answer as 269 00:25:16 --> 00:25:19 over there. I ended up writing, 270 00:25:19 --> 00:25:23 you know, maybe I wrote slightly more here, 271 00:25:23 --> 00:25:28 but actually the amount of calculations really was pretty 272 00:25:28 --> 00:25:32 much the same. Any questions about that? 273 00:25:32 --> 00:25:39 Yes? What kind of object is w? 274 00:25:39 --> 00:25:43 Well, you can think of w as just another variable that is 275 00:25:43 --> 00:25:47 given as a function of x, y and z, for example. 276 00:25:47 --> 00:25:51 You would have a function of x, y, z defined by this formula, 277 00:25:51 --> 00:25:57 and I call it w. I call its value w so that I 278 00:25:57 --> 00:26:04 can substitute t instead of x, y, z. 279 00:26:04 --> 00:26:07 Well, let's think of w as a function of three variables. 280 00:26:07 --> 00:26:12 And then, when I plug in the dependence of these three 281 00:26:12 --> 00:26:17 variables on t, then it becomes just a function 282 00:26:17 --> 00:26:19 of t. I mean, really, 283 00:26:19 --> 00:26:23 my w here is pretty much what I called f before. 284 00:26:23 --> 00:26:31 There is no major difference between the two. 
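The two computations of dw/dt in this example can also be compared numerically. The sketch below is illustrative only, with hypothetical function names: it evaluates the chain rule expression for w = x^2 y + z with x = t, y = e^t, z = sin t, and sets it against difference quotients of w written directly as a function of t. The sample time t = 0.5 is an arbitrary choice.

```python
import math

def chain_rule_dw_dt(t):
    # w = x^2 * y + z with x = t, y = e^t, z = sin(t).
    x, y = t, math.exp(t)
    w_x, w_y, w_z = 2 * x * y, x * x, 1.0            # partials of w
    dx_dt, dy_dt, dz_dt = 1.0, math.exp(t), math.cos(t)
    return w_x * dx_dt + w_y * dy_dt + w_z * dz_dt

def w_of_t(t):
    # Direct substitution: w as a function of t alone.
    return t * t * math.exp(t) + math.sin(t)

def difference_quotient(t, dt):
    # delta w over delta t, a ratio of ordinary numbers.
    return (w_of_t(t + dt) - w_of_t(t)) / dt

t = 0.5
print("chain rule:", chain_rule_dw_dt(t))
for dt in (1e-1, 1e-2, 1e-3, 1e-4):
    # The quotients approach the chain rule value as dt shrinks.
    print(dt, difference_quotient(t, dt))
```

The difference quotients are only approximations for any fixed delta t; it is the limit as delta t tends to zero that matches the chain rule value.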
285 00:26:31 --> 00:26:38 Any other questions? No. 286 00:26:38 --> 00:26:45 OK. Let's see. 287 00:26:45 --> 00:26:49 Here is an application of what we have seen. 288 00:26:49 --> 00:26:53 Let's say that you want to understand actually all these 289 00:26:53 --> 00:26:57 rules about taking derivatives in single variable calculus. 290 00:26:57 --> 00:27:00 What I showed you at the beginning, and then erased, 291 00:27:00 --> 00:27:04 basically justifies how to take the derivative of an inverse 292 00:27:04 --> 00:27:06 function. And for that you didn't need 293 00:27:06 --> 00:27:10 multivariable calculus. But let's try to justify the 294 00:27:10 --> 00:27:12 product rule, for example, 295 00:27:12 --> 00:27:21 for the derivative. An application of this actually 296 00:27:21 --> 00:27:31 is to justify the product and quotient rules. 297 00:27:31 --> 00:27:33 Let's think, for example, 298 00:27:33 --> 00:27:39 of a function f of two variables, u and v, that is just the 299 00:27:39 --> 00:27:44 product uv. And let's say that u and v are 300 00:27:44 --> 00:27:48 actually functions of one variable t. 301 00:27:48 --> 00:28:00 Then, well, d of uv over dt is given by the chain rule applied 302 00:28:00 --> 00:28:04 to f. This is df over dt. 303 00:28:04 --> 00:28:15 So df over dt should be f sub u du over dt plus f sub v dv 304 00:28:15 --> 00:28:19 over dt. But now what is the partial of 305 00:28:19 --> 00:28:23 f with respect to u? It is v. 306 00:28:23 --> 00:28:31 That is v du over dt. And partial of f with respect 307 00:28:31 --> 00:28:38 to v is going to be just u, dv over dt. 308 00:28:38 --> 00:28:42 So you get back the usual product rule. 309 00:28:42 --> 00:28:46 That is a slightly complicated way of deriving it, 310 00:28:46 --> 00:28:50 but that is a valid way of understanding how to take the 311 00:28:50 --> 00:28:54 derivative of a product by thinking of the product first as 312 00:28:54 --> 00:28:57 a function of variables, which are u and v. 
313 00:28:57 --> 00:29:00 And then say, oh, but u and v were actually 314 00:29:00 --> 00:29:03 functions of a variable t. And then you do the 315 00:29:03 --> 00:29:08 differentiation in two stages using the chain rule. 316 00:29:08 --> 00:29:16 Similarly, you can do the quotient rule just for practice. 317 00:29:16 --> 00:29:21 If I give you the function g equals u over v. 318 00:29:21 --> 00:29:25 Right now I am thinking of it as a function of two variables, 319 00:29:25 --> 00:29:29 u and v. U and v themselves are actually 320 00:29:29 --> 00:29:39 going to be functions of t. Then, well, dg over dt is going 321 00:29:39 --> 00:29:44 to be partial g, partial u. 322 00:29:44 --> 00:29:48 How much is that? How much is partial g, 323 00:29:48 --> 00:29:53 partial u? One over v times du over dt 324 00:29:53 --> 00:29:58 plus -- Well, next we need to have partial g 325 00:29:58 --> 00:30:01 over partial v. Well, what is the derivative of 326 00:30:01 --> 00:30:04 this with respect to v? Here we need to know how to 327 00:30:04 --> 00:30:11 differentiate the inverse. It is minus u over v squared 328 00:30:11 --> 00:30:20 times dv over dt. And that is actually the usual 329 00:30:20 --> 00:30:28 quotient rule just written in a slightly different way. 330 00:30:28 --> 00:30:30 I mean, just in case you really want to see it, 331 00:30:30 --> 00:30:36 if you clear denominators for v squared then you will see 332 00:30:36 --> 00:30:41 basically u prime times v minus v prime times u. 333 00:30:41 --> 00:31:25 334 00:31:25 --> 00:31:32 Now let's go to something even more crazy. 335 00:31:32 --> 00:31:45 I claim we can do chain rules with more variables. 336 00:31:45 --> 00:31:50 Let's say that I have a quantity. 337 00:31:50 --> 00:31:55 Let's call it w for now. Let's say I have quantity w as 338 00:31:55 --> 00:31:58 a function of say variables x and y. 339 00:31:58 --> 00:32:02 And so in the previous setup x and y depended on some 340 00:32:02 --> 00:32:04 parameters t. 
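Going back for a moment to the product and quotient rules derived above from the chain rule, here is a small numerical sketch. The choices u(t) = sin t and v(t) = e^t are assumptions made only for this illustration (v is never zero, so u over v is defined everywhere), and the function names are hypothetical.

```python
import math

def u(t):
    return math.sin(t)

def du_dt(t):
    return math.cos(t)

def v(t):
    return math.exp(t)   # never zero, so u / v is always defined

def dv_dt(t):
    return math.exp(t)

def product_rate(t):
    # f = u * v has f_u = v and f_v = u, so df/dt = v du/dt + u dv/dt.
    return v(t) * du_dt(t) + u(t) * dv_dt(t)

def quotient_rate(t):
    # g = u / v has g_u = 1/v and g_v = -u/v^2.
    return (1.0 / v(t)) * du_dt(t) + (-u(t) / v(t) ** 2) * dv_dt(t)

def central_difference(func, t, h=1e-6):
    # Symmetric difference quotient, used only as an independent check.
    return (func(t + h) - func(t - h)) / (2 * h)

t = 0.7
print(product_rate(t), central_difference(lambda s: u(s) * v(s), t))
print(quotient_rate(t), central_difference(lambda s: u(s) / v(s), t))
```

In each printed pair the chain rule value and the difference quotient should agree to several decimal places.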
But, actually, 341 00:32:04 --> 00:32:07 let's now look at the case where x and y themselves are 342 00:32:07 --> 00:32:10 functions of several variables. Let's say of two more variables. 343 00:32:10 --> 00:32:25 Let's call them u and v. I am going to stay with these 344 00:32:25 --> 00:32:27 abstract letters, but if it bothers you, 345 00:32:27 --> 00:32:31 if it sounds completely unmotivated think about it maybe 346 00:32:31 --> 00:32:33 in terms of something you might know. 347 00:32:33 --> 00:32:36 Say, polar coordinates. Let's say that I have a 348 00:32:36 --> 00:32:40 function that is defined in terms of the polar coordinate 349 00:32:40 --> 00:32:43 variables r and theta. And then I want to 350 00:32:43 --> 00:32:45 switch to usual coordinates x and y. 351 00:32:45 --> 00:32:49 Or, the other way around, I have a function of x and y 352 00:32:49 --> 00:32:53 and I want to express it in terms of the polar coordinates r 353 00:32:53 --> 00:32:57 and theta. Then I would want to know maybe 354 00:32:57 --> 00:33:02 how the derivatives, with respect to the various 355 00:33:02 --> 00:33:07 sets of variables, relate to each other. 356 00:33:07 --> 00:33:10 One way I could do it is, of course, 357 00:33:10 --> 00:33:16 to say now if I plug the formula for x and the formula 358 00:33:16 --> 00:33:23 for y into the formula for f then w becomes a function of u 359 00:33:23 --> 00:33:27 and v, and I can try to take partial 360 00:33:27 --> 00:33:29 derivatives. If I have explicit formulas, 361 00:33:29 --> 00:33:32 well, that could work. But maybe the formulas are 362 00:33:32 --> 00:33:35 complicated. Typically, if I switch between 363 00:33:35 --> 00:33:37 rectangular and polar coordinates, 364 00:33:37 --> 00:33:41 there might be inverse trig, there might be maybe arctangent 365 00:33:41 --> 00:33:45 to express the polar angle in terms of x and y. 
366 00:33:45 --> 00:33:51 And when I don't really want to actually substitute arctangents 367 00:33:51 --> 00:33:56 everywhere, maybe I would rather deal with the derivatives. 368 00:33:56 --> 00:34:03 How do I do that? The question is what are 369 00:34:03 --> 00:34:11 partial w over partial u and partial w over partial v in 370 00:34:11 --> 00:34:17 terms of, let's see, what do we need to know to 371 00:34:17 --> 00:34:22 understand that? Well, probably we should know 372 00:34:22 --> 00:34:28 how w depends on x and y. If we don't know that then we 373 00:34:28 --> 00:34:32 are probably toast. Partial w over partial x, 374 00:34:32 --> 00:34:36 partial w over partial y should be required. 375 00:34:36 --> 00:34:39 What else should we know? Well, it would probably help to 376 00:34:39 --> 00:34:42 know how x and y depend on u and v. 377 00:34:42 --> 00:34:46 If we don't know that then we don't really know how to do it. 378 00:34:46 --> 00:34:55 We need also x sub u, x sub v, y sub u, 379 00:34:55 --> 00:35:00 y sub v. We have a lot of partials in 380 00:35:00 --> 00:35:07 there. Well, let's see how we can do 381 00:35:07 --> 00:35:13 that. Let's start by writing dw. 382 00:35:13 --> 00:35:19 We know that dw is partial f, well, I don't know why I have 383 00:35:19 --> 00:35:25 two names, w and f. I mean w and f are really the 384 00:35:25 --> 00:35:30 same thing here, but let's say f sub x dx plus f 385 00:35:30 --> 00:35:35 sub y dy. So far that is our new friend, 386 00:35:35 --> 00:35:39 the differential. Now what do we want to do with 387 00:35:39 --> 00:35:42 it? Well, we would like to get rid 388 00:35:42 --> 00:35:47 of dx and dy because we like to express things in terms of, 389 00:35:47 --> 00:35:50 you know, the question we are asking ourselves is let's say 390 00:35:50 --> 00:35:55 that I change u a little bit, how does w change? 
391 00:35:55 --> 00:35:58 Of course, what happens, if I change u a little bit, 392 00:35:58 --> 00:36:01 is x and y will change. How do they change? 393 00:36:01 --> 00:36:05 Well, that is given to me by the differential. 394 00:36:05 --> 00:36:13 dx is going to be, well, I can use the 395 00:36:13 --> 00:36:19 differential again. Well, x is a function of u and 396 00:36:19 --> 00:36:24 v. That will be x sub u times du 397 00:36:24 --> 00:36:28 plus x sub v times dv. That is, again, 398 00:36:28 --> 00:36:31 taking the differential of a function of two variables. 399 00:36:31 --> 00:36:37 Does that make sense? And then we have the other guy, 400 00:36:37 --> 00:36:39 f sub y times, what is dy? 401 00:36:39 --> 00:36:49 Well, similarly dy is y sub u du plus y sub v dv. 402 00:36:49 --> 00:36:54 And now we have a relation between dw and du and dv. 403 00:36:54 --> 00:37:00 We are expressing how w reacts to changes in u and v, 404 00:37:00 --> 00:37:04 which was our goal. Now, let's actually collect 405 00:37:04 --> 00:37:08 terms so that we see it a bit better. 406 00:37:08 --> 00:37:19 It is going to be f sub x times x sub u plus f sub y times y 407 00:37:19 --> 00:37:28 sub u du plus f sub x, x sub v plus f sub y y sub v 408 00:37:28 --> 00:37:32 dv. Now we have dw equals something 409 00:37:32 --> 00:37:38 du plus something dv. Well, the coefficient here has 410 00:37:38 --> 00:37:44 to be partial f over partial u. What else could it be? 411 00:37:44 --> 00:37:49 That's the rate of change of w with respect to u if I forget 412 00:37:49 --> 00:37:54 what happens when I change v. That is the definition of a 413 00:37:54 --> 00:37:58 partial. Similarly, this one has to be 414 00:37:58 --> 00:38:04 partial f over partial v. That is because it is the rate 415 00:38:04 --> 00:38:09 of change with respect to v, if I keep u constant, 416 00:38:09 --> 00:38:13 so that these guys are completely ignored. 
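Written out in symbols, the substitution and the collection of terms described above can be summarized as follows. This is only a transcription of the spoken steps into notation, using the lecture's convention that w and f name the same quantity.

```latex
\begin{aligned}
dw &= f_x\,dx + f_y\,dy \\
   &= f_x\,(x_u\,du + x_v\,dv) + f_y\,(y_u\,du + y_v\,dv) \\
   &= (f_x x_u + f_y y_u)\,du + (f_x x_v + f_y y_v)\,dv ,
\end{aligned}
\qquad\text{so}\qquad
\frac{\partial f}{\partial u} = f_x x_u + f_y y_u ,
\quad
\frac{\partial f}{\partial v} = f_x x_v + f_y y_v .
```

The coefficient of du is read off as the partial with respect to u because setting dv to zero corresponds to keeping v constant, and likewise for the coefficient of dv.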
417 00:38:13 --> 00:38:16 Now you see how the total differential accounts for, 418 00:38:16 --> 00:38:21 somehow, all the partial derivatives that come as 419 00:38:21 --> 00:38:27 coefficients of the individual variables in these expressions. 420 00:38:27 --> 00:38:33 Let me maybe rewrite these formulas in a more visible way 421 00:38:33 --> 00:38:40 and then re-explain them to you. Here is the chain rule for this 422 00:38:40 --> 00:38:46 situation, with two intermediate variables and two variables that 423 00:38:46 --> 00:38:50 you express these in terms of. In our setting, 424 00:38:50 --> 00:38:56 we get partial f over partial u equals partial f over partial x 425 00:38:56 --> 00:39:02 times partial x over partial u plus partial f over partial y 426 00:39:02 --> 00:39:08 times partial y over partial u. And the other one, 427 00:39:08 --> 00:39:15 the same thing with v instead of u, partial f over partial v equals 428 00:39:15 --> 00:39:22 partial f over partial x times partial x over partial v plus 429 00:39:22 --> 00:39:28 partial f over partial y times partial y over partial v. 430 00:39:28 --> 00:39:31 I have to explain various things about these formulas 431 00:39:31 --> 00:39:34 because they look complicated. And, actually, 432 00:39:34 --> 00:39:39 they are not that complicated. A couple of things to know. 433 00:39:39 --> 00:39:42 The first thing, how do we remember a formula 434 00:39:42 --> 00:39:44 like that? Well, that is easy. 435 00:39:44 --> 00:39:47 We want to know how f depends on u. 436 00:39:47 --> 00:39:51 Well, what does f depend on? It depends on x and y. 437 00:39:51 --> 00:39:55 So we will put partial f over partial x and partial f over 438 00:39:55 --> 00:39:59 partial y. Now, x and y, why are they here? 439 00:39:59 --> 00:40:01 Well, they are here because they actually depend on u as 440 00:40:01 --> 00:40:04 well. How does x depend on u? 441 00:40:04 --> 00:40:06 Well, the answer is partial x over partial u. 442 00:40:06 --> 00:40:10 How does y depend on u? 
The answer is partial y over 443 00:40:10 --> 00:40:12 partial u. See, the structure of this 444 00:40:12 --> 00:40:16 formula is simple. To find the partial of f with 445 00:40:16 --> 00:40:20 respect to some new variable you use the partials with respect to 446 00:40:20 --> 00:40:24 the variables that f was initially defined in terms of x 447 00:40:24 --> 00:40:28 and y. And you multiply them by the 448 00:40:28 --> 00:40:33 partials of x and y in terms of the new variable that you want 449 00:40:33 --> 00:40:37 to look at, v here, and you sum these things 450 00:40:37 --> 00:40:40 together. That is the structure of the 451 00:40:40 --> 00:40:42 formula. Why does it work? 452 00:40:42 --> 00:40:45 Well, let me explain it to you in a slightly different 453 00:40:45 --> 00:40:49 language. This asks us how does f change 454 00:40:49 --> 00:40:54 if I change u a little bit? Well, why would f change if u 455 00:40:54 --> 00:40:57 changes a little bit? Well, it would change because f 456 00:40:57 --> 00:41:00 actually depends on x and y and x and y depend on u. 457 00:41:00 --> 00:41:03 If I change u, how quickly does x change? 458 00:41:03 --> 00:41:06 Well, the answer is partial x over partial u. 459 00:41:06 --> 00:41:09 And now, if I change x at this rate, how does f have to 460 00:41:09 --> 00:41:13 change? Well, the answer is partial f 461 00:41:13 --> 00:41:17 over partial x times this guy. Well, at the same time, 462 00:41:17 --> 00:41:21 y is also changing. How fast is y changing if I 463 00:41:21 --> 00:41:24 change u? Well, at the rate of partial y 464 00:41:24 --> 00:41:27 over partial u. But now if I change this how 465 00:41:27 --> 00:41:30 does f change? Well, the rate of change is 466 00:41:30 --> 00:41:34 partial f over partial y. The product is the effect of 467 00:41:34 --> 00:41:37 how you change it, changing u, and therefore 468 00:41:37 --> 00:41:40 changing f.
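To make the two effects concrete, here is a minimal numerical sketch. The function f(x, y) = x^2 y and the substitutions x = u v, y = u + v are hypothetical choices made only for illustration; they do not come from the lecture. The sketch computes the effect through x and the effect through y separately, adds them, and compares the sum with a finite-difference estimate of the rate of change in u with v held fixed.

```python
# Hypothetical example, not from the lecture:
# f(x, y) = x^2 * y, with x = u * v and y = u + v.

def f(x, y):
    return x * x * y

def x_of(u, v):
    return u * v

def y_of(u, v):
    return u + v

def df_du_chain(u, v):
    """Chain rule: f_u = f_x * x_u + f_y * y_u."""
    x, y = x_of(u, v), y_of(u, v)
    f_x = 2 * x * y   # partial of f with respect to x
    f_y = x * x       # partial of f with respect to y
    x_u = v           # partial of x with respect to u
    y_u = 1.0         # partial of y with respect to u
    effect_through_x = f_x * x_u
    effect_through_y = f_y * y_u
    return effect_through_x + effect_through_y

def df_du_numeric(u, v, h=1e-6):
    """Central difference in u, keeping v constant."""
    w_plus = f(x_of(u + h, v), y_of(u + h, v))
    w_minus = f(x_of(u - h, v), y_of(u - h, v))
    return (w_plus - w_minus) / (2 * h)

if __name__ == "__main__":
    u, v = 1.5, 2.0
    print(df_du_chain(u, v), df_du_numeric(u, v))
```

The two printed numbers should agree up to the small error of the finite-difference approximation.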
Now, what happens in real life, 469 00:41:40 --> 00:41:43 if I change u a little bit? Well, both x and y change at 470 00:41:43 --> 00:41:46 the same time. So how does f change? 471 00:41:46 --> 00:41:50 Well, it is the sum of the two effects. 472 00:41:50 --> 00:41:54 Does that make sense? Good. 473 00:41:54 --> 00:42:00 Of course, if f depends on more variables then you just have 474 00:42:00 --> 00:42:02 more terms in here. OK. 475 00:42:02 --> 00:42:05 Here is another thing that may be a little bit confusing. 476 00:42:05 --> 00:42:09 What is tempting? Well, what is tempting here 477 00:42:09 --> 00:42:12 would be to simplify these formulas by removing these 478 00:42:12 --> 00:42:15 partial x's. Let's simplify by partial x. 479 00:42:15 --> 00:42:18 Let's simplify by partial y. We get partial f over partial u 480 00:42:18 --> 00:42:21 equals partial f over partial u plus partial f over partial u. 481 00:42:21 --> 00:42:25 Something is not working properly. 482 00:42:25 --> 00:42:28 Why doesn't it work? The answer is precisely because 483 00:42:28 --> 00:42:32 these are partial derivatives. These are not total derivatives. 484 00:42:32 --> 00:42:36 And so you cannot simplify them in that way. 485 00:42:36 --> 00:42:39 And that is actually the reason why we use this curly d rather 486 00:42:39 --> 00:42:41 than a straight d. It is to remind us, 487 00:42:41 --> 00:42:44 beware, there are these simplifications that we can do 488 00:42:44 --> 00:42:47 with straight d's that are not legal here. 489 00:42:47 --> 00:42:52 Somehow, when you have a partial derivative, 490 00:42:52 --> 00:42:57 you must resist the urge of simplifying things. 491 00:42:57 --> 00:43:02 No simplifications in here. That is the simplest formula 492 00:43:02 --> 00:43:10 you can get. Any questions at this point? 493 00:43:10 --> 00:43:21 No. Yes? 494 00:43:21 --> 00:43:23 When would you use this and what does it describe? 
495 00:43:23 --> 00:43:26 Well, it is basically when you have a function given in terms 496 00:43:26 --> 00:43:29 of a certain set of variables because maybe there is a simple 497 00:43:29 --> 00:43:31 expression in terms of those variables. 498 00:43:31 --> 00:43:35 But ultimately what you care about is not those variables, 499 00:43:35 --> 00:43:39 x and y, but another set of variables, here u and v. 500 00:43:39 --> 00:43:42 So x and y are giving you a nice formula for f, 501 00:43:42 --> 00:43:46 but actually the relevant variables for your problem are u 502 00:43:46 --> 00:43:48 and v. And you know x and y are 503 00:43:48 --> 00:43:50 related to u and v. So, of course, 504 00:43:50 --> 00:43:53 what you could do is plug in the formulas the way that we did, 505 00:43:53 --> 00:43:55 substituting. But maybe that will give you 506 00:43:55 --> 00:43:59 very complicated expressions. And maybe it is actually easier 507 00:43:59 --> 00:44:02 to just work with the derivatives. The important claim here is 508 00:44:02 --> 00:44:05 basically we don't need to know the actual formulas. 509 00:44:05 --> 00:44:07 All we need to know are the rates of change. 510 00:44:07 --> 00:44:11 If we know all these rates of change then we know how to take 511 00:44:11 --> 00:44:14 these derivatives without actually having to plug in 512 00:44:14 --> 00:44:22 values. Yes? 513 00:44:22 --> 00:44:25 Yes, you could certainly do the same things in terms of t. 514 00:44:25 --> 00:44:29 If x and y were functions of t instead of being functions of u 515 00:44:29 --> 00:44:31 and v then it would be the same thing. 516 00:44:31 --> 00:44:34 And you would have the same formulas that I had, 517 00:44:34 --> 00:44:37 well, over there I still have it. 518 00:44:37 --> 00:44:39 Why does that one have straight d's? 519 00:44:39 --> 00:44:42 Well, the answer is I could put curly d's if I wanted, 520 00:44:42 --> 00:44:45 but I end up with a function of a single variable.
521 00:44:45 --> 00:44:48 If you have a single variable then the partial, 522 00:44:48 --> 00:44:50 with respect to that variable, is the same thing as the usual 523 00:44:50 --> 00:44:53 derivative. We don't actually need to worry 524 00:44:53 --> 00:44:57 about curly in that case. But that one is indeed a special 525 00:44:57 --> 00:45:00 case of this one where instead of x and y depending on two 526 00:45:00 --> 00:45:03 variables, u and v, they depend on a single 527 00:45:03 --> 00:45:04 variable t. Now, of course, 528 00:45:04 --> 00:45:06 you can call variables any name you want. 529 00:45:06 --> 00:45:12 It doesn't matter. This is just a slight 530 00:45:12 --> 00:45:16 generalization of that. Well, not quite because here I 531 00:45:16 --> 00:45:18 also had a z. See, I am trying to just 532 00:45:18 --> 00:45:21 confuse you by giving you functions that depend on various 533 00:45:21 --> 00:45:25 numbers of variables. If you have a function of 30 534 00:45:25 --> 00:45:28 variables, things work the same way, just longer, 535 00:45:28 --> 00:45:33 and you are going to run out of letters in the alphabet before 536 00:45:33 --> 00:45:38 the end. Any other questions? 537 00:45:38 --> 00:45:43 No. What? 538 00:45:43 --> 00:45:51 Yes? If u and v themselves depended 539 00:45:51 --> 00:45:55 on another variable then you would continue with your chain 540 00:45:55 --> 00:45:58 rules. Maybe you would know to express 541 00:45:58 --> 00:46:02 partial x over partial u in terms using that chain rule. 542 00:46:02 --> 00:46:05 Sorry. If u and v are dependent on yet 543 00:46:05 --> 00:46:08 another variable then you could get the derivative with respect 544 00:46:08 --> 00:46:11 to that using first the chain rule to pass from u v to that 545 00:46:11 --> 00:46:14 new variable, and then you would plug in 546 00:46:14 --> 00:46:17 these formulas for partials of f with respect to u and v.
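The two answers above can be written out as follows. This is a sketch under stated assumptions: the first line is the single-variable special case with w = f(x, y), x = x(t), y = y(t) (the earlier blackboard formula also had a z term, omitted here), and the name s for the further variable that u and v depend on is a hypothetical label, not one used in the lecture.

```latex
\begin{aligned}
\frac{dw}{dt} &= f_x\,\frac{dx}{dt} + f_y\,\frac{dy}{dt}, \\[4pt]
\frac{dw}{ds} &= \frac{\partial f}{\partial u}\,\frac{du}{ds}
               + \frac{\partial f}{\partial v}\,\frac{dv}{ds} \\
              &= (f_x x_u + f_y y_u)\,\frac{du}{ds}
               + (f_x x_v + f_y y_v)\,\frac{dv}{ds}.
\end{aligned}
```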
547 00:46:17 --> 00:46:19 In fact, if you have several substitutions to do, 548 00:46:19 --> 00:46:21 you can always arrange to use one chain rule at a time. 549 00:46:21 --> 00:46:25 You just have to do them in sequence. 550 00:46:25 --> 00:46:28 That's why we don't actually learn that, but you can just do 551 00:46:28 --> 00:46:32 it by repeating the process. I mean, probably at that stage, 552 00:46:32 --> 00:46:35 the easiest to not get confused actually is to manipulate 553 00:46:35 --> 00:46:38 differentials because that is probably easier. 554 00:46:38 --> 00:46:47 Yes? Curly f does not exist. 555 00:46:47 --> 00:46:50 That's easy. Curly f makes no sense by 556 00:46:50 --> 00:46:52 itself. It doesn't exist alone. 557 00:46:52 --> 00:46:58 What exists is only curly df over curly d some variable. 558 00:46:58 --> 00:47:02 And then that accounts only for the rate of change with respect 559 00:47:02 --> 00:47:05 to that variable leaving the others fixed, 560 00:47:05 --> 00:47:11 while straight df is somehow a total variation of f. 561 00:47:11 --> 00:47:16 It accounts for all of the partial derivatives and their 562 00:47:16 --> 00:47:25 combined effects. OK. Any more questions? No. 563 00:47:25 --> 00:47:29 Let me just finish up very quickly by telling you again one 564 00:47:29 --> 00:47:33 example where concretely you might want to do this. 565 00:47:33 --> 00:47:40 You have a function that you want to switch between 566 00:47:40 --> 00:47:45 rectangular and polar coordinates. 567 00:47:45 --> 00:47:48 To make things a little bit concrete. 568 00:47:48 --> 00:47:55 If you have polar coordinates that means in the plane, 569 00:47:55 --> 00:48:00 instead of using x and y, you will use coordinates r, 570 00:48:00 --> 00:48:05 distance to the origin, and theta, the angle from the 571 00:48:05 --> 00:48:08 x-axis. The change of variables for 572 00:48:08 --> 00:48:14 that is x equals r cosine theta and y equals r sine theta.
573 00:48:14 --> 00:48:21 And so that means if you have a function f that depends on x and 574 00:48:21 --> 00:48:29 y, in fact, you can plug these in as a function of r and theta. 575 00:48:29 --> 00:48:34 Then you can ask yourself, well, what is partial f over 576 00:48:34 --> 00:48:37 partial r? And that is going to be, 577 00:48:37 --> 00:48:42 well, you want to take partial f over partial x times partial x 578 00:48:42 --> 00:48:48 over partial r plus partial f over partial y times partial y over 579 00:48:48 --> 00:48:53 partial r. That will end up being actually 580 00:48:53 --> 00:48:59 f sub x times cosine theta plus f sub y times sine theta. 581 00:48:59 --> 00:49:02 And you can do the same thing to find partial f over 582 00:49:02 --> 00:49:05 partial theta. And so you can express 583 00:49:05 --> 00:49:10 derivatives either in terms of x, y or in terms of r and theta 584 00:49:10 --> 00:49:13 with simple relations between them. 585 00:49:13 --> 00:49:20 And the one last thing I should say. 586 00:49:20 --> 00:49:23 On Thursday we will learn about more tricks we can play with 587 00:49:23 --> 00:49:27 variations of functions. And one that is important, 588 00:49:27 --> 00:49:29 because you need to know it actually to do the p-set, 589 00:49:29 --> 00:49:38 is the gradient vector. The gradient vector is simply a 590 00:49:38 --> 00:49:41 vector. You use this downward pointing 591 00:49:41 --> 00:49:44 triangle as the notation for the gradient. 592 00:49:44 --> 00:49:49 It simply is a vector whose components are the partial 593 00:49:49 --> 00:49:53 derivatives of a function. I mean, in a way, 594 00:49:53 --> 00:49:56 you can think of a differential as a way to package partial 595 00:49:56 --> 00:49:59 derivatives together into some weird object. 596 00:49:59 --> 00:50:01 Well, the gradient is also a way to package partials 597 00:50:01 --> 00:50:04 together.
We will see on Thursday what it 598 00:50:04 --> 00:50:07 is good for, but some of the problems on the p-set use it. 599 00:50:07 --> 00:50:09
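As a closing illustration of the polar-coordinate example from this lecture, here is a small sketch that applies the chain rule with x = r cos(theta) and y = r sin(theta). The test function f(x, y) = x^2 + 3xy is a hypothetical choice, not from the lecture. The formula for the partial with respect to theta uses x_theta = -r sin(theta) and y_theta = r cos(theta), which the lecture leaves as "the same thing" to work out. Both results are compared with finite differences taken directly in r and theta.

```python
import math

# Hypothetical test function, chosen only for illustration.
def f(x, y):
    return x * x + 3.0 * x * y

def f_x(x, y):
    return 2.0 * x + 3.0 * y   # partial of f with respect to x

def f_y(x, y):
    return 3.0 * x             # partial of f with respect to y

def polar_partials(r, theta):
    """Chain rule for x = r cos(theta), y = r sin(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = r * c, r * s
    # x_r = cos(theta), y_r = sin(theta)
    f_r = f_x(x, y) * c + f_y(x, y) * s
    # x_theta = -r sin(theta), y_theta = r cos(theta)
    f_theta = f_x(x, y) * (-r * s) + f_y(x, y) * (r * c)
    return f_r, f_theta

def polar_partials_numeric(r, theta, h=1e-6):
    """Central differences on f expressed directly in r and theta."""
    def g(rr, tt):
        return f(rr * math.cos(tt), rr * math.sin(tt))
    g_r = (g(r + h, theta) - g(r - h, theta)) / (2 * h)
    g_theta = (g(r, theta + h) - g(r, theta - h)) / (2 * h)
    return g_r, g_theta

if __name__ == "__main__":
    print(polar_partials(2.0, 0.7))
    print(polar_partials_numeric(2.0, 0.7))
```

The point of the comparison is the one made in the lecture: the chain rule only needs the rates of change, not the substituted formula for f in terms of r and theta.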