1 00:00:00,040 --> 00:00:02,460 The following content is provided under a Creative 2 00:00:02,460 --> 00:00:03,870 Commons license. 3 00:00:03,870 --> 00:00:06,320 Your support will help MIT OpenCourseWare 4 00:00:06,320 --> 00:00:10,560 continue to offer high-quality educational resources for free. 5 00:00:10,560 --> 00:00:13,300 To make a donation or view additional materials 6 00:00:13,300 --> 00:00:17,210 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,210 --> 00:00:18,793 at ocw.mit.edu. 8 00:00:35,860 --> 00:00:36,850 HERBERT GROSS: Hi. 9 00:00:36,850 --> 00:00:41,060 Today we do a somewhat computational bit. 10 00:00:41,060 --> 00:00:45,240 And actually, the lecture for today 11 00:00:45,240 --> 00:00:47,680 is not nearly as difficult, once you 12 00:00:47,680 --> 00:00:50,320 get through the maze of symbolism, 13 00:00:50,320 --> 00:00:53,940 as it is to apply the material. 14 00:00:53,940 --> 00:00:57,320 In other words, we're going to devote the next two 15 00:00:57,320 --> 00:01:01,190 units of our course to this particular topic, which 16 00:01:01,190 --> 00:01:03,060 is known as the chain rule. 17 00:01:03,060 --> 00:01:07,870 But we'll give one lecture to cover both units. 18 00:01:07,870 --> 00:01:11,100 And again, the idea is that it's not so much 19 00:01:11,100 --> 00:01:14,780 that the concept becomes more difficult as much as it 20 00:01:14,780 --> 00:01:17,820 is that you must develop a certain amount 21 00:01:17,820 --> 00:01:22,650 of dexterity keeping track of the various partial derivatives 22 00:01:22,650 --> 00:01:25,090 and the like. 23 00:01:25,090 --> 00:01:27,330 At any rate, maybe I think the best way 24 00:01:27,330 --> 00:01:31,120 is to just barge into a hypothetical situation 25 00:01:31,120 --> 00:01:34,060 and see what the situation really is. 26 00:01:34,060 --> 00:01:37,610 The idea is essentially the following. 27 00:01:37,610 --> 00:01:40,150 We're given some function, say, w. 
28 00:01:40,150 --> 00:01:43,100 w is a function of, say, the three independent variables 29 00:01:43,100 --> 00:01:45,010 x, y, and z. 30 00:01:45,010 --> 00:01:47,020 Now, for some reason or other, which 31 00:01:47,020 --> 00:01:49,670 we won't worry about right now, it 32 00:01:49,670 --> 00:01:53,880 turns out that x, y, and z are, in turn, 33 00:01:53,880 --> 00:01:57,720 conveniently expressible in terms of the two variables 34 00:01:57,720 --> 00:01:58,915 r and s. 35 00:01:58,915 --> 00:02:01,980 In fact, if you want a physical interpretation of this, 36 00:02:01,980 --> 00:02:06,260 you can think of if x, y, and z are functions 37 00:02:06,260 --> 00:02:09,050 of the two independent variables r and s, 38 00:02:09,050 --> 00:02:11,780 that means that we have two degrees of freedom. 39 00:02:11,780 --> 00:02:15,450 So we may think of this as parametrically representing 40 00:02:15,450 --> 00:02:17,930 the equation of a surface. 41 00:02:17,930 --> 00:02:19,840 And what we're talking about here 42 00:02:19,840 --> 00:02:24,010 is w being a function of something in space and asking, 43 00:02:24,010 --> 00:02:27,900 what does w look like when you restrict your space 44 00:02:27,900 --> 00:02:29,300 to a particular surface? 45 00:02:29,300 --> 00:02:31,700 I mean, that's just a geometrical interpretation 46 00:02:31,700 --> 00:02:33,050 that one could talk about. 47 00:02:33,050 --> 00:02:34,350 But the idea is the following. 
48 00:02:34,350 --> 00:02:38,730 After all, if w depends on x, y, and z, and x, y, 49 00:02:38,730 --> 00:02:43,100 and z each depend on r and s, in particular then, 50 00:02:43,100 --> 00:02:46,660 it's clear that w itself is some function of r and s, 51 00:02:46,660 --> 00:02:49,500 where, again, I use the usual notation 52 00:02:49,500 --> 00:02:52,090 of using a g here rather than the f 53 00:02:52,090 --> 00:02:57,090 up here to indicate that the relationship between r and s 54 00:02:57,090 --> 00:02:59,900 which specifies w, may very well be 55 00:02:59,900 --> 00:03:02,930 a different algebraic relationship than that which 56 00:03:02,930 --> 00:03:05,820 relates x, y, and z to give w. 57 00:03:05,820 --> 00:03:08,750 But the point that we have in mind is the following. 58 00:03:08,750 --> 00:03:10,750 Given that w is a function of x, y, 59 00:03:10,750 --> 00:03:15,170 and z, given that x, y, and z are functions of r and s, 60 00:03:15,170 --> 00:03:17,560 hence w is a function of r and s. 61 00:03:17,560 --> 00:03:20,930 The question that we ask in calculus of several variables 62 00:03:20,930 --> 00:03:24,910 is, first of all, if we can be sure that these were all 63 00:03:24,910 --> 00:03:28,060 continuously differentiable functions, 64 00:03:28,060 --> 00:03:32,190 can we be sure that w will be a continuously differentiable 65 00:03:32,190 --> 00:03:33,310 function of r and s? 66 00:03:33,310 --> 00:03:35,290 That's the first question. 67 00:03:35,290 --> 00:03:38,470 And the second question is, OK, assuming 68 00:03:38,470 --> 00:03:41,490 that the answer to the first question is in the affirmative, 69 00:03:41,490 --> 00:03:44,890 that w is a continuously differentiable function of r 70 00:03:44,890 --> 00:03:49,110 and s, how could we compute, for example, the partial of w 71 00:03:49,110 --> 00:03:53,940 with respect to r, knowing all of the so-called obvious 72 00:03:53,940 --> 00:03:55,330 other partial derivatives? 
73 00:03:55,330 --> 00:03:56,590 What do I mean by that? 74 00:03:56,590 --> 00:04:00,240 Well, what I mean is if you were to look just at this equation, 75 00:04:00,240 --> 00:04:01,960 just looking at this equation, what 76 00:04:01,960 --> 00:04:04,060 are the obvious partial derivatives to take? 77 00:04:04,060 --> 00:04:05,810 You say, well, we'll take the partial of w 78 00:04:05,810 --> 00:04:09,190 with respect to x, the partial of w with respect to y, 79 00:04:09,190 --> 00:04:11,790 and the partial of w with respect to z. 80 00:04:11,790 --> 00:04:14,180 And if you were to look, say, at this equation, 81 00:04:14,180 --> 00:04:17,050 the natural thing to ask is, what is the partial 82 00:04:17,050 --> 00:04:18,350 of x with respect to r? 83 00:04:18,350 --> 00:04:21,029 What is the partial of x with respect to s? 84 00:04:21,029 --> 00:04:22,010 Et cetera. 85 00:04:22,010 --> 00:04:23,830 In other words, what we're saying is, 86 00:04:23,830 --> 00:04:27,580 in this particular problem, we would like to figure out how, 87 00:04:27,580 --> 00:04:30,410 for example, to compute the partial of w with respect 88 00:04:30,410 --> 00:04:34,010 to r, knowing that we have at our disposal 89 00:04:34,010 --> 00:04:38,390 the partials of w with respect to x, y, and z; 90 00:04:38,390 --> 00:04:43,810 the partials of x, y, and z with respect to r; et cetera; 91 00:04:43,810 --> 00:04:45,950 meaning we also have the partials of x, y, 92 00:04:45,950 --> 00:04:48,310 and z with respect to s. 93 00:04:48,310 --> 00:04:51,370 Before I go any further, notice, by the way, 94 00:04:51,370 --> 00:04:53,930 that if I left out that phrase that we were talking 95 00:04:53,930 --> 00:04:57,120 about in our last lecture, "continuously differentiable," 96 00:04:57,120 --> 00:04:59,370 notice that all of this would make sense, 97 00:04:59,370 --> 00:05:02,330 provided that the derivatives existed.
98 00:05:02,330 --> 00:05:05,010 At no place here do I make any statement 99 00:05:05,010 --> 00:05:09,090 that the partials have to not only exist but be continuous. 100 00:05:09,090 --> 00:05:10,270 I never say that at all. 101 00:05:10,270 --> 00:05:11,720 This is what the problem is. 102 00:05:11,720 --> 00:05:13,490 I would like to use the chain rule. 103 00:05:13,490 --> 00:05:15,880 Do you see why it's called the chain rule? 104 00:05:15,880 --> 00:05:18,410 w is a function of x, y, and z. 105 00:05:18,410 --> 00:05:21,400 x, y, and z are each functions of r and s. 106 00:05:21,400 --> 00:05:24,510 Now, what I claim is that not only is it possible 107 00:05:24,510 --> 00:05:27,830 to do this but the recipe for doing 108 00:05:27,830 --> 00:05:32,130 this is a very, very suggestive thing, one which is very, very 109 00:05:32,130 --> 00:05:35,930 easy to remember, once you see how it's put together. 110 00:05:35,930 --> 00:05:37,590 If you don't see how it's put together, 111 00:05:37,590 --> 00:05:39,580 the thing is just a mess-- namely, 112 00:05:39,580 --> 00:05:42,351 the claim is that the partial of w with respect to r 113 00:05:42,351 --> 00:05:44,850 is the partial-- I'll just read it to you-- the partial of w 114 00:05:44,850 --> 00:05:48,210 with respect to x times the partial of x with respect to r, 115 00:05:48,210 --> 00:05:49,960 plus the partial of w with respect 116 00:05:49,960 --> 00:05:53,120 to y times the partial of y with respect to r, 117 00:05:53,120 --> 00:05:54,820 plus the partial of w with respect 118 00:05:54,820 --> 00:05:57,520 to z times the partial of z with respect to r. 119 00:05:57,520 --> 00:05:59,640 And as I say, if you try to memorize that, 120 00:05:59,640 --> 00:06:01,670 it's a very, very nasty business. 121 00:06:01,670 --> 00:06:03,905 But let's look at this in three separate pieces.
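Written out, the rule just read aloud takes roughly the following form. This is a transcription of the spoken statement, using the names f and g from earlier in the lecture. The subscripts marking which variables are held constant are an assumption about how the board was written, added here for emphasis.

```latex
% Setting: w = f(x, y, z), with x = x(r, s), y = y(r, s), z = z(r, s),
% so that w = g(r, s).
\[
\left(\frac{\partial w}{\partial r}\right)_{s}
  = \left(\frac{\partial w}{\partial x}\right)_{y,z}
    \left(\frac{\partial x}{\partial r}\right)_{s}
  + \left(\frac{\partial w}{\partial y}\right)_{x,z}
    \left(\frac{\partial y}{\partial r}\right)_{s}
  + \left(\frac{\partial w}{\partial z}\right)_{x,y}
    \left(\frac{\partial z}{\partial r}\right)_{s}
\]
```

Each of the three products is one of the "separate pieces" discussed next.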
122 00:06:09,910 --> 00:06:13,390 In a way, can you sense that this 123 00:06:13,390 --> 00:06:17,260 is nothing more than the change in w with respect 124 00:06:17,260 --> 00:06:20,410 to r due to the change in x alone? 125 00:06:20,410 --> 00:06:22,110 In other words, you're taking here what? 126 00:06:22,110 --> 00:06:25,970 The change in w due to x and multiplying that 127 00:06:25,970 --> 00:06:28,490 by the change in x with respect to r. 128 00:06:28,490 --> 00:06:32,270 So this is the contribution of the change in w with respect 129 00:06:32,270 --> 00:06:35,490 to r due to x alone. 130 00:06:35,490 --> 00:06:38,590 On the other hand, this is the partial of w with respect 131 00:06:38,590 --> 00:06:41,840 to r due to the change in y alone. 132 00:06:41,840 --> 00:06:43,760 And this is the partial of w with respect 133 00:06:43,760 --> 00:06:46,670 to r due to the change in z alone. 134 00:06:46,670 --> 00:06:49,630 And since x, y, and z are independent, 135 00:06:49,630 --> 00:06:52,990 the change in x, the change in y, and the change in z 136 00:06:52,990 --> 00:06:54,900 are also independent variables. 137 00:06:54,900 --> 00:06:58,150 Consequently, it seems reasonable to assume 138 00:06:58,150 --> 00:07:01,840 that to find the total change of w with respect to r, 139 00:07:01,840 --> 00:07:04,990 we just add up all of the partial contributions. 140 00:07:04,990 --> 00:07:07,700 Namely, we take the partial of w with respect 141 00:07:07,700 --> 00:07:11,570 to r due to x alone, add on to that the partial of w 142 00:07:11,570 --> 00:07:14,070 with respect to r due to y alone, 143 00:07:14,070 --> 00:07:16,140 add on to that the partial of w with respect 144 00:07:16,140 --> 00:07:19,940 to r due to z alone, and that sum should 145 00:07:19,940 --> 00:07:23,810 be the total change in w with respect to r, 146 00:07:23,810 --> 00:07:25,120 treating s as a constant.
147 00:07:25,120 --> 00:07:29,000 And by the way, let me point out a pitfall with this notation. 148 00:07:29,000 --> 00:07:31,980 We're so used to using fractional notation here. 149 00:07:31,980 --> 00:07:34,350 Have you noticed that if you're not careful here, 150 00:07:34,350 --> 00:07:37,400 you're almost tempted to cancel-- 151 00:07:37,400 --> 00:07:40,077 I don't want to write this, because you'll think that it's 152 00:07:40,077 --> 00:07:41,160 the right way of doing it. 153 00:07:41,160 --> 00:07:43,620 But see if we say, let's cancel the partials with respect 154 00:07:43,620 --> 00:07:46,300 to x here, let's cancel the partials with respect 155 00:07:46,300 --> 00:07:49,180 to y here, and let's cancel the partials with respect 156 00:07:49,180 --> 00:07:50,200 to z here. 157 00:07:50,200 --> 00:07:53,450 By the way, if you did that, notice what you would get 158 00:07:53,450 --> 00:07:57,590 is the contradiction that the partial of w with respect to r 159 00:07:57,590 --> 00:07:59,750 is equal to the partial of w with respect 160 00:07:59,750 --> 00:08:02,320 to r, plus the partial of w with respect 161 00:08:02,320 --> 00:08:04,935 to r, plus the partial of w with respect to r. 162 00:08:04,935 --> 00:08:06,310 In other words, it seems that you 163 00:08:06,310 --> 00:08:09,340 would get that the partial of w with respect to r 164 00:08:09,340 --> 00:08:12,150 is always three times itself, which 165 00:08:12,150 --> 00:08:15,080 is, I hope, a glaring enough contradiction so I don't 166 00:08:15,080 --> 00:08:18,500 have to go into any more detail about the contradiction part. 167 00:08:18,500 --> 00:08:26,460 Notice again, though, why I have made such a fetish 168 00:08:26,460 --> 00:08:29,090 over labeling the variables. 
169 00:08:29,090 --> 00:08:31,440 Notice that when you're taking the partial of w 170 00:08:31,440 --> 00:08:35,919 with respect to x, you're assuming that y and z 171 00:08:35,919 --> 00:08:38,490 are the variables that are being held constant. 172 00:08:38,490 --> 00:08:41,620 And when you're taking the partial of x with respect to r, 173 00:08:41,620 --> 00:08:44,900 it's s that you're assuming is being held constant. 174 00:08:44,900 --> 00:08:47,080 And as soon as you look at these subscripts here, 175 00:08:47,080 --> 00:08:50,120 somehow or other that should put you on your guard 176 00:08:50,120 --> 00:08:53,070 to be careful about crossing out because, after all, 177 00:08:53,070 --> 00:08:55,620 the changes are being made with respect 178 00:08:55,620 --> 00:08:57,660 to different sets of variables. 179 00:08:57,660 --> 00:09:00,710 At any rate, this is the statement. 180 00:09:00,710 --> 00:09:04,990 And my other claim is that the proof follows immediately 181 00:09:04,990 --> 00:09:09,940 from the main, key theorem that we stressed last time, even 182 00:09:09,940 --> 00:09:11,380 though we didn't prove it. 183 00:09:11,380 --> 00:09:14,280 But we've had ample exercises using this. 184 00:09:14,280 --> 00:09:16,870 Namely, notice that we have already 185 00:09:16,870 --> 00:09:21,630 seen that if w does happen to be a continuously differentiable 186 00:09:21,630 --> 00:09:26,840 function of x, y, and z, then delta w is the partial of w 187 00:09:26,840 --> 00:09:30,290 with respect to x times delta x, plus the partial of w 188 00:09:30,290 --> 00:09:33,660 with respect to y times delta y, plus the partial of w 189 00:09:33,660 --> 00:09:37,790 with respect to z times delta z, plus an error term. 190 00:09:37,790 --> 00:09:39,200 And what is that error term? 
191 00:09:39,200 --> 00:09:43,740 It's k_1 delta x, plus k_2 delta y, plus k_3 delta 192 00:09:43,740 --> 00:09:48,200 z, where k_1, k_2, and k_3 all approach 193 00:09:48,200 --> 00:09:52,220 0 as delta x, delta y, and delta z approach 0. 194 00:09:52,220 --> 00:09:56,350 Now again, the key step in all of this 195 00:09:56,350 --> 00:10:01,940 is that this amount here I could always call delta w tan, 196 00:10:01,940 --> 00:10:04,810 or as Professor Thomas calls it for more than two variables, 197 00:10:04,810 --> 00:10:10,530 delta w sub lin, l-i-n, meaning that this is a linear equation. 198 00:10:10,530 --> 00:10:12,760 Remember-- I've made an abbreviation here-- 199 00:10:12,760 --> 00:10:17,340 these partials are assumed to be evaluated at a particular point 200 00:10:17,340 --> 00:10:19,040 that we're interested in. 201 00:10:19,040 --> 00:10:24,060 But the idea is, granted that I can always call this delta w 202 00:10:24,060 --> 00:10:30,070 tan, to say that the error has this small a magnitude depends 203 00:10:30,070 --> 00:10:33,110 on the fact that w is a continuously differentiable 204 00:10:33,110 --> 00:10:35,290 function of x, y, and z. 205 00:10:35,290 --> 00:10:37,310 That's why the theory is so important. 206 00:10:37,310 --> 00:10:40,380 What happens in real life is that most examples 207 00:10:40,380 --> 00:10:44,850 you encounter in real-life engineering, the functions 208 00:10:44,850 --> 00:10:48,170 that you're dealing with are continuously differentiable. 209 00:10:48,170 --> 00:10:51,260 So it seems like we're making a big issue over nothing. 
210 00:10:51,260 --> 00:10:54,100 I should point out that on the frontiers of knowledge, 211 00:10:54,100 --> 00:10:57,000 enough situations occur where the functions that we're 212 00:10:57,000 --> 00:10:59,800 dealing with are not continuously differentiable, 213 00:10:59,800 --> 00:11:03,610 that some horrible mistakes can be made by assuming that you 214 00:11:03,610 --> 00:11:09,090 can replace delta w by this, without any significant error. 215 00:11:09,090 --> 00:11:11,660 But as long as this is the case, we can do this. 216 00:11:11,660 --> 00:11:14,990 And now notice, what does the partial of w with respect to r 217 00:11:14,990 --> 00:11:15,680 mean? 218 00:11:15,680 --> 00:11:20,050 It means you take delta w divided by delta r. 219 00:11:20,050 --> 00:11:21,820 And let me just do that here. 220 00:11:21,820 --> 00:11:24,890 I'll just divide every term by delta r. 221 00:11:32,360 --> 00:11:34,610 And now, what do I have to do next to get the partial? 222 00:11:34,610 --> 00:11:37,970 I have to take the limit as delta r approaches 0. 223 00:11:37,970 --> 00:11:41,720 Now the interesting point is as delta r approaches 0, 224 00:11:41,720 --> 00:11:45,300 holding s fixed, this term obviously becomes 225 00:11:45,300 --> 00:11:48,820 the partial of x with respect to r, by definition. 226 00:11:48,820 --> 00:11:51,940 This term becomes the partial of y with respect to r, 227 00:11:51,940 --> 00:11:53,180 by definition. 228 00:11:53,180 --> 00:11:56,650 And this term becomes the partial of z with respect to r, 229 00:11:56,650 --> 00:11:57,860 by definition. 230 00:11:57,860 --> 00:12:01,070 By the way, notice that even though delta x, delta y, 231 00:12:01,070 --> 00:12:05,020 and delta z are all going to 0 as delta r goes to 0, 232 00:12:05,020 --> 00:12:08,620 you cannot immediately conclude that these terms drop out. 233 00:12:08,620 --> 00:12:11,800 Because after all, delta r is also approaching 0.
234 00:12:11,800 --> 00:12:15,380 So delta x over delta r is that 0 over 0 form. 235 00:12:15,380 --> 00:12:18,890 In fact, that's precisely the partial 236 00:12:18,890 --> 00:12:21,560 of x with respect to r term that we're talking about. 237 00:12:21,560 --> 00:12:23,270 The beauty is what? 238 00:12:23,270 --> 00:12:25,780 That as delta x, delta y, and delta z 239 00:12:25,780 --> 00:12:28,840 approach 0, each of the k's approach 0. 240 00:12:31,490 --> 00:12:35,530 You see, the reason that the error term becomes negligible, 241 00:12:35,530 --> 00:12:39,180 becomes 0 in the limit, isn't because delta x, delta y, 242 00:12:39,180 --> 00:12:41,350 and delta z are becoming small. 243 00:12:41,350 --> 00:12:43,550 Because these small numbers are being divided 244 00:12:43,550 --> 00:12:44,850 by another small number. 245 00:12:44,850 --> 00:12:49,010 It's because the k_1, k_2, and k_3 are getting small. 246 00:12:49,010 --> 00:12:52,010 At any rate, putting this all together, 247 00:12:52,010 --> 00:12:54,260 notice that now, in a manner completely 248 00:12:54,260 --> 00:12:57,380 analogous to our part-one treatment of the chain rule, 249 00:12:57,380 --> 00:12:59,680 except that we're now dealing with several variables, 250 00:12:59,680 --> 00:13:01,530 these three terms drop out. 251 00:13:01,530 --> 00:13:06,410 And these three terms become the claim that we made before. 252 00:13:06,410 --> 00:13:10,040 In other words, this is how the partial of w with respect to r 253 00:13:10,040 --> 00:13:10,640 is computed. 254 00:13:10,640 --> 00:13:14,600 And again, the theory is the easiest part of this. 255 00:13:14,600 --> 00:13:15,860 That's the easy part. 256 00:13:15,860 --> 00:13:17,785 The hard part is getting familiarity 257 00:13:17,785 --> 00:13:18,910 with how to work with this. 
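The argument just given can be summarized in three steps. This is a sketch reconstructed from the spoken description, not a copy of the board; the symbols k_1, k_2, k_3 are the lecture's, and the partials of w are understood to be evaluated at the point of interest.

```latex
% Step 1: the key theorem for continuously differentiable w = f(x, y, z).
\[
\Delta w = \frac{\partial w}{\partial x}\,\Delta x
         + \frac{\partial w}{\partial y}\,\Delta y
         + \frac{\partial w}{\partial z}\,\Delta z
         + k_1\,\Delta x + k_2\,\Delta y + k_3\,\Delta z,
\qquad k_1, k_2, k_3 \to 0 \text{ as } \Delta x, \Delta y, \Delta z \to 0 .
\]
% Step 2: divide every term by \Delta r (with s held fixed).
\[
\frac{\Delta w}{\Delta r}
  = \frac{\partial w}{\partial x}\,\frac{\Delta x}{\Delta r}
  + \frac{\partial w}{\partial y}\,\frac{\Delta y}{\Delta r}
  + \frac{\partial w}{\partial z}\,\frac{\Delta z}{\Delta r}
  + k_1\,\frac{\Delta x}{\Delta r}
  + k_2\,\frac{\Delta y}{\Delta r}
  + k_3\,\frac{\Delta z}{\Delta r}.
\]
% Step 3: let \Delta r \to 0. Each quotient tends to a finite partial
% derivative, while each k_i tends to 0, so the last three terms vanish.
\[
\frac{\partial w}{\partial r}
  = \frac{\partial w}{\partial x}\,\frac{\partial x}{\partial r}
  + \frac{\partial w}{\partial y}\,\frac{\partial y}{\partial r}
  + \frac{\partial w}{\partial z}\,\frac{\partial z}{\partial r}.
\]
```

The error terms disappear in step 3 because each k_i is multiplied by a quotient with a finite limit, not because the delta x, delta y, and delta z in the numerators are small.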
258 00:13:18,910 --> 00:13:21,270 And I think the best way to get some familiarity 259 00:13:21,270 --> 00:13:24,470 for working with this is to pick particularly simple problems 260 00:13:24,470 --> 00:13:26,300 for the lecture, problems where it's 261 00:13:26,300 --> 00:13:28,620 so easy to do the problem both ways 262 00:13:28,620 --> 00:13:31,100 that no hangup can possibly occur. 263 00:13:31,100 --> 00:13:32,900 Let's take a very simple example. 264 00:13:32,900 --> 00:13:34,830 Let's suppose that w equals x squared 265 00:13:34,830 --> 00:13:37,260 plus y squared plus z squared. 266 00:13:37,260 --> 00:13:42,690 Suppose we also know that x is r plus s, y is r minus s, 267 00:13:42,690 --> 00:13:44,990 and z happens to be 2r. 268 00:13:44,990 --> 00:13:47,410 In this particular case, notice that we 269 00:13:47,410 --> 00:13:52,330 would find the partial of w with respect to r very conveniently 270 00:13:52,330 --> 00:13:53,980 by direct substitution. 271 00:13:53,980 --> 00:13:57,270 Namely, we simply replace x by r plus s, 272 00:13:57,270 --> 00:14:02,620 we replace y by r minus s, we replace z by 2r. 273 00:14:02,620 --> 00:14:06,700 And then w simply becomes this expression here, 274 00:14:06,700 --> 00:14:09,260 which when we collect terms, becomes 275 00:14:09,260 --> 00:14:12,460 6 r squared plus 2 s squared. 276 00:14:12,460 --> 00:14:14,460 And again, the arithmetic there is simple enough 277 00:14:14,460 --> 00:14:16,543 so I'm not even going to bother worrying about how 278 00:14:16,543 --> 00:14:18,600 we justify these steps. 279 00:14:18,600 --> 00:14:22,690 At which stage, to take the partial of w with respect to r, 280 00:14:22,690 --> 00:14:25,956 holding s constant, this is simply what? 281 00:14:25,956 --> 00:14:27,070 12r. 282 00:14:27,070 --> 00:14:29,110 Because s is being treated as a constant, 283 00:14:29,110 --> 00:14:31,990 its derivative with respect to r is 0. 
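The direct substitution above can be checked numerically. The following is a minimal sketch in Python; the function names and the sample point (r, s) = (1.5, -0.7) are illustrative choices, not part of the lecture, and the derivative is estimated by a central difference rather than computed symbolically.

```python
def w_of_xyz(x, y, z):
    """w as given in terms of x, y, z."""
    return x**2 + y**2 + z**2


def w_of_rs(r, s):
    """w after direct substitution of x = r + s, y = r - s, z = 2r."""
    return w_of_xyz(r + s, r - s, 2 * r)


def partial_r(g, r, s, h=1e-6):
    """Central-difference estimate of the partial of g with respect to r,
    holding s constant."""
    return (g(r + h, s) - g(r - h, s)) / (2 * h)


# Arbitrary sample point (an assumption for illustration).
r, s = 1.5, -0.7

# Collecting terms should give 6 r^2 + 2 s^2.
collected = 6 * r**2 + 2 * s**2
print(w_of_rs(r, s), collected)

# The partial with respect to r should be close to 12 r.
print(partial_r(w_of_rs, r, s), 12 * r)
```

The two numbers on each printed line should agree up to rounding, which is the numerical counterpart of collecting terms and then differentiating with s held constant.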
284 00:14:31,990 --> 00:14:34,360 You see, in an example like this, 285 00:14:34,360 --> 00:14:37,530 one would not really be tempted to use the chain rule. 286 00:14:37,530 --> 00:14:40,680 The chain rule is used in many cases 287 00:14:40,680 --> 00:14:43,980 not just for convenience, but in cases of great theory 288 00:14:43,980 --> 00:14:47,950 where you're only given that w is some function of x, y, 289 00:14:47,950 --> 00:14:50,570 and z, and you're not told explicitly 290 00:14:50,570 --> 00:14:51,740 what the function is. 291 00:14:51,740 --> 00:14:53,690 You're just given f of x, y, z. 292 00:14:53,690 --> 00:14:57,110 In the case where the function is given explicitly, 293 00:14:57,110 --> 00:15:00,380 it's sometimes very easy to substitute directly. 294 00:15:00,380 --> 00:15:03,050 At any rate, what the chain rule says is roughly this. 295 00:15:03,050 --> 00:15:03,920 They say, lookit. 296 00:15:03,920 --> 00:15:06,440 From this equation, you could immediately 297 00:15:06,440 --> 00:15:09,590 say the partial of w with respect to x is 2x, 298 00:15:09,590 --> 00:15:12,470 the partial of w with respect to y is 2y, 299 00:15:12,470 --> 00:15:15,710 the partial of w with respect to z is 2z. 300 00:15:15,710 --> 00:15:17,460 From this equation, you could immediately 301 00:15:17,460 --> 00:15:20,340 say that the partial of x with respect to r is 1, 302 00:15:20,340 --> 00:15:23,040 the partial of x with respect to s is 1, 303 00:15:23,040 --> 00:15:25,920 the partial of y with respect to r is 1, 304 00:15:25,920 --> 00:15:29,850 the partial of y with respect to s is minus 1, 305 00:15:29,850 --> 00:15:32,910 the partial of z with respect to r is 2, 306 00:15:32,910 --> 00:15:36,450 and the partial of z with respect to s is 0. 307 00:15:36,450 --> 00:15:43,380 In particular, summarizing our results, we have these here. 308 00:15:43,380 --> 00:15:45,300 Now, what the chain rule says is what? 
309 00:15:45,300 --> 00:15:48,190 To find the partial of w with respect to r, 310 00:15:48,190 --> 00:15:50,410 you just take the partial of w with respect 311 00:15:50,410 --> 00:15:53,540 to x times the partial of x with respect to r, 312 00:15:53,540 --> 00:15:55,630 plus the partial of w with respect 313 00:15:55,630 --> 00:15:58,600 to y times the partial of y with respect to r, 314 00:15:58,600 --> 00:16:00,320 plus the partial of w with respect 315 00:16:00,320 --> 00:16:03,100 to z times the partial of z with respect to r. 316 00:16:03,100 --> 00:16:06,150 And if we do that in this case, we simply get what? 317 00:16:06,150 --> 00:16:11,380 2x plus 2y plus 4z. 318 00:16:11,380 --> 00:16:14,170 Now again, I picked, deliberately, 319 00:16:14,170 --> 00:16:16,260 a very simple problem here. 320 00:16:16,260 --> 00:16:24,840 Remember, by definition, x is r plus s, y is r minus s, 321 00:16:24,840 --> 00:16:28,340 and z happens to be 2r. 322 00:16:28,340 --> 00:16:31,770 And now you can see very quickly here that when I substitute in, 323 00:16:31,770 --> 00:16:32,380 I get what? 324 00:16:32,380 --> 00:16:40,360 2r plus 2r is 4r, plus 8r is 12r, and 2s minus 2s is 0. 325 00:16:40,360 --> 00:16:42,750 The partial of w with respect to r 326 00:16:42,750 --> 00:16:45,710 is also 12r, also meaning what? 327 00:16:45,710 --> 00:16:48,390 We found that same answer before. 328 00:16:48,390 --> 00:16:51,630 At least that's how the chain rule works. 329 00:16:51,630 --> 00:16:55,220 And again, we have to remember that the chain rule does not 330 00:16:55,220 --> 00:16:57,390 depend on the number of variables, 331 00:16:57,390 --> 00:17:00,730 even though this may start to look a little bit sticky. 332 00:17:00,730 --> 00:17:02,340 Let's word it as follows. 333 00:17:02,340 --> 00:17:07,109 Suppose w happens to be a continuously differentiable 334 00:17:07,109 --> 00:17:13,099 function of the n independent variables x_1 up to x_n. 
335 00:17:13,099 --> 00:17:16,190 See, that's what this parenthetical remark means. 336 00:17:16,190 --> 00:17:20,210 I'm saying that not only do the partials of f with respect 337 00:17:20,210 --> 00:17:22,880 to x_1 up to x_n exist at a given point, 338 00:17:22,880 --> 00:17:24,670 but they are continuous there. 339 00:17:24,670 --> 00:17:26,310 And why do I want that in there? 340 00:17:26,310 --> 00:17:28,650 So I can say that my error term is never 341 00:17:28,650 --> 00:17:33,550 any greater than that k_1 delta x_1, plus k_2 delta x_2, 342 00:17:33,550 --> 00:17:36,680 plus, et cetera, k_n delta x_n, where the k's go 343 00:17:36,680 --> 00:17:39,310 to 0 as the delta x's go to 0. 344 00:17:39,310 --> 00:17:41,880 I'm going to spare you the details of proofs. 345 00:17:41,880 --> 00:17:43,530 But I just want you to keep seeing 346 00:17:43,530 --> 00:17:45,400 why these things are necessary. 347 00:17:45,400 --> 00:17:50,170 At any rate, let's suppose now that each of the n variables 348 00:17:50,170 --> 00:17:56,020 x_1 up to x_n turn out to be functions of the m variables. 349 00:17:56,020 --> 00:17:58,800 n and m could conceivably be equal. 350 00:17:58,800 --> 00:18:01,500 But m could even be more than n. 351 00:18:01,500 --> 00:18:02,637 It can be less than n. 352 00:18:02,637 --> 00:18:04,470 There's no reason why they have to be equal. 353 00:18:04,470 --> 00:18:07,630 All we're saying is, speaking in the most general terms, 354 00:18:07,630 --> 00:18:12,040 suppose each of the n x's is a continuously differentiable 355 00:18:12,040 --> 00:18:16,870 function of the m independent variables y sub 1 up 356 00:18:16,870 --> 00:18:17,880 to y sub m. 357 00:18:17,880 --> 00:18:20,400 In fact, that's what this "et cetera" means here. 358 00:18:20,400 --> 00:18:23,370 The "et cetera" refers to the parenthetical remark here. 
359 00:18:23,370 --> 00:18:27,590 I mean that not only are the x's functions of y_1 up to y_m, 360 00:18:27,590 --> 00:18:30,890 but they're continuously differentiable functions. 361 00:18:30,890 --> 00:18:35,180 Now obviously, what we're saying is that if w can be expressed 362 00:18:35,180 --> 00:18:38,260 in terms of the x's, the x's can be expressed in terms 363 00:18:38,260 --> 00:18:42,480 of the y's, obviously then, w can be expressed in terms 364 00:18:42,480 --> 00:18:43,090 of the y's. 365 00:18:43,090 --> 00:18:47,590 In other words, w is some function of y_1 up to y_m. 366 00:18:47,590 --> 00:18:51,040 Now the question that comes up is that just looking at this, 367 00:18:51,040 --> 00:18:54,810 I can talk about the partial of w with respect to y_1, 368 00:18:54,810 --> 00:18:57,640 the partial of w with respect to y_2, 369 00:18:57,640 --> 00:19:02,660 the partial of w with respect to y_3, et cetera, all the way up 370 00:19:02,660 --> 00:19:05,820 to the partial of w with respect to y sub m. 371 00:19:05,820 --> 00:19:07,610 And the question is, lookit. 372 00:19:07,610 --> 00:19:10,900 From the original form of w, it was easy to talk about 373 00:19:10,900 --> 00:19:14,060 the partials of w with respect to the x's. 374 00:19:14,060 --> 00:19:17,620 From how the x's are given in terms of the y's, it's easy 375 00:19:17,620 --> 00:19:20,420 to talk about the derivatives of the x's with respect 376 00:19:20,420 --> 00:19:21,550 to the y's. 377 00:19:21,550 --> 00:19:23,170 And so the question is, how do you 378 00:19:23,170 --> 00:19:26,660 find the partial of w with respect to, say, y sub 379 00:19:26,660 --> 00:19:29,690 1, given all of these other partial derivatives? 380 00:19:29,690 --> 00:19:32,460 And the answer, again, is something that you just 381 00:19:32,460 --> 00:19:33,640 have to get used to. 
382 00:19:33,640 --> 00:19:36,600 The proof goes through for n and m the same way 383 00:19:36,600 --> 00:19:38,780 as it did for the lower-dimensional case. 384 00:19:38,780 --> 00:19:41,410 And the intuitive interpretation is the same. 385 00:19:41,410 --> 00:19:45,380 Namely, to find the partial of w with respect to y_1, 386 00:19:45,380 --> 00:19:48,590 we simply see how much w changed with respect 387 00:19:48,590 --> 00:19:52,300 to y_1 due to the change in x_1 alone, 388 00:19:52,300 --> 00:19:54,940 add on to that the change in w with respect 389 00:19:54,940 --> 00:19:59,720 to y_1 due to the change in x_2 alone, et cetera, 390 00:19:59,720 --> 00:20:07,450 add on to that, finally, the change in w with respect to y 391 00:20:07,450 --> 00:20:10,970 sub 1 due to the change in x sub n alone. 392 00:20:10,970 --> 00:20:12,580 In other words, again, if you think 393 00:20:12,580 --> 00:20:15,740 of this in terms of cancellation, 394 00:20:15,740 --> 00:20:19,210 if you cross these things out, don't think of adding them, 395 00:20:19,210 --> 00:20:20,710 but think of them as what? 396 00:20:20,710 --> 00:20:24,130 Giving you the individual components 397 00:20:24,130 --> 00:20:27,450 that tell you how the partial of w with respect to y_1 398 00:20:27,450 --> 00:20:28,350 is made up. 399 00:20:28,350 --> 00:20:30,422 By the way, there is one parenthetical remark 400 00:20:30,422 --> 00:20:31,880 that I haven't written on the board 401 00:20:31,880 --> 00:20:33,920 that I would like to make at this time. 402 00:20:33,920 --> 00:20:36,440 In Professor Thomas's text, he has 403 00:20:36,440 --> 00:20:39,360 elected to introduce matrix algebra prior 404 00:20:39,360 --> 00:20:41,420 to this particular chapter. 
405 00:20:41,420 --> 00:20:43,800 It again turns out that one does not 406 00:20:43,800 --> 00:20:46,340 need matrices to talk about the chain rule 407 00:20:46,340 --> 00:20:49,180 but that if one had matrix notation, 408 00:20:49,180 --> 00:20:51,580 the matrix notation is particularly 409 00:20:51,580 --> 00:20:54,680 convenient for summarizing the chain rule. 410 00:20:54,680 --> 00:20:59,010 I have elected to hold off on matrix algebra 411 00:20:59,010 --> 00:21:01,210 till the near future because it comes up 412 00:21:01,210 --> 00:21:03,940 in a much better motivated way, I 413 00:21:03,940 --> 00:21:07,570 think, in terms of these linear approximations. 414 00:21:07,570 --> 00:21:10,410 But the point is if, as you're reading the text, 415 00:21:10,410 --> 00:21:12,980 you see the matrix notation, and you are not 416 00:21:12,980 --> 00:21:15,510 familiar with the matrices, forget it. 417 00:21:15,510 --> 00:21:20,300 All the matrix is, is a shortcut notation for saying this. 418 00:21:20,300 --> 00:21:22,470 And if I want a shortcut notation here, 419 00:21:22,470 --> 00:21:24,530 I don't need matrices for saying this. 420 00:21:24,530 --> 00:21:27,510 I can say this in terms of our sigma notation. 421 00:21:27,510 --> 00:21:29,780 Notice that one other way of writing 422 00:21:29,780 --> 00:21:33,560 this thing very compactly that may be more suggestive 423 00:21:33,560 --> 00:21:34,810 is the following. 424 00:21:34,810 --> 00:21:39,440 Notice that I'm adding up n terms here. 425 00:21:39,440 --> 00:21:43,680 Each term consists of two factors, each of which 426 00:21:43,680 --> 00:21:45,430 looks like a fraction. 427 00:21:45,430 --> 00:21:47,260 The numerator of the first fraction 428 00:21:47,260 --> 00:21:50,960 is always a partial of w. 429 00:21:50,960 --> 00:21:52,790 The denominator of the second fraction 430 00:21:52,790 --> 00:21:55,980 is always the partial y_1. 
431 00:21:55,980 --> 00:21:59,330 And it appears that the denominator of the first 432 00:21:59,330 --> 00:22:03,600 and the numerator of the second always have the same subscript, 433 00:22:03,600 --> 00:22:07,910 but they seem to vary consecutively from 1 to n. 434 00:22:07,910 --> 00:22:10,850 And that's precisely where the sigma notation comes in handy. 435 00:22:10,850 --> 00:22:13,130 Why don't we just write, therefore, 436 00:22:13,130 --> 00:22:16,610 that this is the sum, partial of w with respect 437 00:22:16,610 --> 00:22:20,200 to x sub k, times the partial of x sub k 438 00:22:20,200 --> 00:22:23,020 with respect to y_1, as the subscript 439 00:22:23,020 --> 00:22:28,170 k ranges through all integral values from 1 to n? 440 00:22:28,170 --> 00:22:31,270 In other words, notice that in this particular form, 441 00:22:31,270 --> 00:22:34,310 we have simply rewritten this thing compactly. 442 00:22:34,310 --> 00:22:36,830 But if you look at this and look at this, 443 00:22:36,830 --> 00:22:40,220 I think it's very suggestive to see how the chain rule works. 444 00:22:40,220 --> 00:22:43,940 You see, here's your partial of w with respect to y_1. 445 00:22:43,940 --> 00:22:45,350 And these are what? 446 00:22:45,350 --> 00:22:51,040 The contributions due to each of the changes of the n variables. 447 00:22:51,040 --> 00:22:54,350 This is the change due to the x sub k variable. 448 00:22:54,350 --> 00:22:57,400 And you add these all up because they're independent variables, 449 00:22:57,400 --> 00:22:59,620 as k goes from 1 to n. 450 00:22:59,620 --> 00:23:02,280 And I write "et cetera" here simply to point out 451 00:23:02,280 --> 00:23:04,610 that I could have computed the partial of w 452 00:23:04,610 --> 00:23:07,670 with respect to y_2 instead of y sub 1.
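For reference, the recipe just described can be written out as follows. This is a reconstruction from the spoken description, since the board itself is not part of the transcript. The index j is introduced here only to cover the "et cetera" cases y_2 through y_m.

```latex
\frac{\partial w}{\partial y_1}
  = \frac{\partial w}{\partial x_1}\,\frac{\partial x_1}{\partial y_1}
  + \frac{\partial w}{\partial x_2}\,\frac{\partial x_2}{\partial y_1}
  + \cdots
  + \frac{\partial w}{\partial x_n}\,\frac{\partial x_n}{\partial y_1}
  = \sum_{k=1}^{n} \frac{\partial w}{\partial x_k}\,\frac{\partial x_k}{\partial y_1},
\qquad
\frac{\partial w}{\partial y_j}
  = \sum_{k=1}^{n} \frac{\partial w}{\partial x_k}\,\frac{\partial x_k}{\partial y_j},
  \quad j = 1, \dots, m.
```

Each term is a product of two factors, and the shared subscript k is the one that would "cancel" in the fraction picture.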
453 00:23:07,670 --> 00:23:10,560 By the way, the recipe would've looked exactly the same, 454 00:23:10,560 --> 00:23:12,560 except that if there was a 2 here, there 455 00:23:12,560 --> 00:23:13,940 would have been a 2 here. 456 00:23:13,940 --> 00:23:17,160 If there were a 3 here, there would've been a 3 here. 457 00:23:17,160 --> 00:23:21,020 If there were an m here, there would have been an m here. 458 00:23:21,020 --> 00:23:22,800 OK, now lookit. 459 00:23:22,800 --> 00:23:25,630 At this particular stage of our lecture 460 00:23:25,630 --> 00:23:30,080 today, this could end with the idea 461 00:23:30,080 --> 00:23:32,440 that for the unit that's now assigned, 462 00:23:32,440 --> 00:23:34,450 this is as far as you have to go. 463 00:23:34,450 --> 00:23:36,870 In other words, for the exercises 464 00:23:36,870 --> 00:23:39,890 that I've given you in this particular unit, 465 00:23:39,890 --> 00:23:43,460 we do nothing higher than using the chain rule 466 00:23:43,460 --> 00:23:46,250 for first-order derivatives. 467 00:23:46,250 --> 00:23:49,670 The point is that in many applications in real life, 468 00:23:49,670 --> 00:23:52,840 we must take higher-order derivatives. 469 00:23:52,840 --> 00:23:55,480 In other words, there are many differential equations, 470 00:23:55,480 --> 00:23:58,360 partial differential equations, where we must work 471 00:23:58,360 --> 00:24:00,100 with higher-order derivatives. 472 00:24:00,100 --> 00:24:03,670 And for that reason, it becomes very important, sometimes, 473 00:24:03,670 --> 00:24:06,360 to be able to take a second derivative 474 00:24:06,360 --> 00:24:08,700 or a third derivative or a fourth derivative 475 00:24:08,700 --> 00:24:10,480 by means of the chain rule. 476 00:24:10,480 --> 00:24:14,140 Now the interesting point is that the theory that we've used 477 00:24:14,140 --> 00:24:16,420 so far doesn't change at all.
478 00:24:16,420 --> 00:24:19,070 What does happen is that the average student, 479 00:24:19,070 --> 00:24:21,960 in learning this material for the first time, 480 00:24:21,960 --> 00:24:24,080 gets swamped by the notation. 481 00:24:24,080 --> 00:24:26,340 Consequently, what I want to do is 482 00:24:26,340 --> 00:24:30,090 to give you the lecture on this material at the same time 483 00:24:30,090 --> 00:24:33,850 that I'm lecturing on first-order derivatives, simply 484 00:24:33,850 --> 00:24:37,640 because the continuity follows smoother this way, 485 00:24:37,640 --> 00:24:40,380 so that you see what the whole overall picture is, then 486 00:24:40,380 --> 00:24:42,740 to make sure that you cement these things down. 487 00:24:42,740 --> 00:24:44,830 The next unit after this will give 488 00:24:44,830 --> 00:24:47,490 you drill on taking higher-order derivatives. 489 00:24:47,490 --> 00:24:49,440 What this may mean is that many of you 490 00:24:49,440 --> 00:24:54,640 may prefer to watch this half of the film 491 00:24:54,640 --> 00:24:57,920 a second time, after you've already 492 00:24:57,920 --> 00:24:59,490 tried working some of the problems 493 00:24:59,490 --> 00:25:01,530 with higher-order derivatives, if you're still 494 00:25:01,530 --> 00:25:02,490 confused by this. 495 00:25:02,490 --> 00:25:03,960 But at any rate, let's take a look 496 00:25:03,960 --> 00:25:05,660 at a hypothetical situation. 497 00:25:05,660 --> 00:25:08,160 Since we're so used to polar coordinates, 498 00:25:08,160 --> 00:25:10,910 let's talk in terms of polar coordinates. 499 00:25:10,910 --> 00:25:13,690 Suppose w happens to be a continuously differentiable 500 00:25:13,690 --> 00:25:15,875 function of x and y. 501 00:25:15,875 --> 00:25:20,760 x and y, in turn, are continuously differentiable 502 00:25:20,760 --> 00:25:23,280 functions of the polar coordinates r and theta. 
503 00:25:23,280 --> 00:25:25,860 In fact, they're x equals r cosine theta, 504 00:25:25,860 --> 00:25:27,850 y equals r sine theta. 505 00:25:27,850 --> 00:25:28,540 Now lookit. 506 00:25:28,540 --> 00:25:32,710 If all I want to do is find the partial of w with respect to r, 507 00:25:32,710 --> 00:25:35,650 I can do that by the ordinary chain rule. 508 00:25:35,650 --> 00:25:38,010 Namely, it's the partial of w with respect 509 00:25:38,010 --> 00:25:40,700 to x times the partial of x with respect to r, 510 00:25:40,700 --> 00:25:42,360 plus the partial of w with respect 511 00:25:42,360 --> 00:25:45,680 to y times the partial of y with respect to r. 512 00:25:45,680 --> 00:25:48,240 Now, knowing what x looks like explicitly 513 00:25:48,240 --> 00:25:51,480 in terms of r and theta and what y looks like explicitly 514 00:25:51,480 --> 00:25:53,690 in terms of r and theta, I can certainly 515 00:25:53,690 --> 00:25:56,300 compute the partials of x and y with respect 516 00:25:56,300 --> 00:25:59,560 to r, holding theta constant. 517 00:25:59,560 --> 00:26:02,550 In particular, the partial of x with respect to r 518 00:26:02,550 --> 00:26:04,200 is simply cosine theta. 519 00:26:04,200 --> 00:26:06,740 And the partial of y with respect to r 520 00:26:06,740 --> 00:26:08,170 is simply sine theta. 521 00:26:08,170 --> 00:26:11,140 So the partial of w with respect to r 522 00:26:11,140 --> 00:26:13,560 is partial of w with respect to x times 523 00:26:13,560 --> 00:26:18,010 cosine theta, plus partial of w with respect to y times sine 524 00:26:18,010 --> 00:26:18,510 theta. 525 00:26:18,510 --> 00:26:23,410 And by the way, notice that I cannot simplify these terms. 526 00:26:23,410 --> 00:26:25,790 I cannot simplify these terms in general, 527 00:26:25,790 --> 00:26:29,570 because all I'm given is that w is some function of x and y. 528 00:26:29,570 --> 00:26:33,120 I don't know what w looks like explicitly in terms of x and y. 
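Written out, the first-order computation just described is the following. It is a transcription of the spoken steps into symbols, not a copy of the board.

```latex
\begin{aligned}
x &= r\cos\theta, \qquad y = r\sin\theta, \\
\frac{\partial x}{\partial r} &= \cos\theta, \qquad
\frac{\partial y}{\partial r} = \sin\theta
  \quad (\theta \text{ held constant}), \\
\frac{\partial w}{\partial r}
  &= \frac{\partial w}{\partial x}\,\frac{\partial x}{\partial r}
   + \frac{\partial w}{\partial y}\,\frac{\partial y}{\partial r}
   = \frac{\partial w}{\partial x}\cos\theta
   + \frac{\partial w}{\partial y}\sin\theta .
\end{aligned}
```

The partials of w with respect to x and y stay symbolic here because w has not been given explicitly.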
529 00:26:33,120 --> 00:26:36,660 So all I can do is talk about the partials of w with respect 530 00:26:36,660 --> 00:26:40,320 to x, partial of w with respect to y, without worrying 531 00:26:40,320 --> 00:26:43,300 any more about this, with the understanding that if I knew 532 00:26:43,300 --> 00:26:46,440 what w looked like explicitly in terms of x and y, 533 00:26:46,440 --> 00:26:48,985 I could work out what this thing was. 534 00:26:48,985 --> 00:26:50,440 Now, here's the key point. 535 00:26:50,440 --> 00:26:52,340 That's why I've accentuated it. 536 00:26:52,340 --> 00:26:55,400 In the same way that w is a function 537 00:26:55,400 --> 00:27:00,120 of both x and y, so also are the partials of w with respect 538 00:27:00,120 --> 00:27:03,530 to x and the partials of w with respect to y. 539 00:27:03,530 --> 00:27:06,750 In other words, even though this looks 540 00:27:06,750 --> 00:27:09,180 like this emphasizes the x, notice 541 00:27:09,180 --> 00:27:11,290 that when you take the derivative 542 00:27:11,290 --> 00:27:16,040 of a function of both x and y with respect to x, in general, 543 00:27:16,040 --> 00:27:17,640 the resulting function will again 544 00:27:17,640 --> 00:27:20,300 be a function of both x and y. 545 00:27:20,300 --> 00:27:23,070 And so what we're saying is that if the partials of w 546 00:27:23,070 --> 00:27:27,310 with respect to x and the partials of w with respect to y 547 00:27:27,310 --> 00:27:30,580 also happen to be continuously differentiable functions of x 548 00:27:30,580 --> 00:27:34,905 and y, we could, if we wished, use the chain rule again. 549 00:27:34,905 --> 00:27:37,120 In other words, suppose in the particular problem 550 00:27:37,120 --> 00:27:38,890 that I was dealing with, it wasn't 551 00:27:38,890 --> 00:27:42,320 enough to know the partial of w with respect to r. 552 00:27:42,320 --> 00:27:45,670 Suppose, for example, I wanted the second partial of w 553 00:27:45,670 --> 00:27:46,980 with respect to r.
554 00:27:46,980 --> 00:27:50,220 Well obviously, that simply means what? 555 00:27:50,220 --> 00:27:54,360 Take the partial of this with respect to r. 556 00:27:54,360 --> 00:27:57,590 In other words, the second partial of w with respect to r 557 00:27:57,590 --> 00:28:02,070 is just the partial of the partial of w with respect to r, 558 00:28:02,070 --> 00:28:03,770 with respect to r. 559 00:28:03,770 --> 00:28:07,510 I'm just going to differentiate this thing with respect to r. 560 00:28:07,510 --> 00:28:10,720 In other words, writing this out more succinctly for you, 561 00:28:10,720 --> 00:28:13,860 the second partial derivative of w with respect to r 562 00:28:13,860 --> 00:28:17,290 is the partial with respect to r of cosine theta 563 00:28:17,290 --> 00:28:21,400 partial of w with respect to x, plus sine theta partial 564 00:28:21,400 --> 00:28:23,640 of w with respect to y. 565 00:28:23,640 --> 00:28:24,750 Now, here's the key point. 566 00:28:24,750 --> 00:28:26,890 When we differentiate here, we're 567 00:28:26,890 --> 00:28:29,570 assuming that theta is being held constant. 568 00:28:29,570 --> 00:28:30,850 Isn't that right? 569 00:28:30,850 --> 00:28:33,720 So consequently, when I'm differentiating with respect 570 00:28:33,720 --> 00:28:37,180 to r, cosine theta is a constant. 571 00:28:37,180 --> 00:28:40,530 I can skip over that, see, and differentiate 572 00:28:40,530 --> 00:28:42,764 what's left with respect to r. 573 00:28:42,764 --> 00:28:43,930 In other words, that's what? 574 00:28:43,930 --> 00:28:47,440 It's the partial of w with respect to x differentiated 575 00:28:47,440 --> 00:28:49,430 with respect to r. 576 00:28:49,430 --> 00:28:54,090 See, I'm using the ordinary rule for the derivative of a sum. 577 00:28:54,090 --> 00:28:55,920 Now, the derivative of sine theta-- 578 00:28:55,920 --> 00:28:58,860 see, sine theta is a constant with respect to r.
579 00:28:58,860 --> 00:29:00,590 So the derivative of this term is just 580 00:29:00,590 --> 00:29:03,800 sine theta times the derivative of the partial 581 00:29:03,800 --> 00:29:09,730 of w with respect to y, with respect to r, written this way. 582 00:29:09,730 --> 00:29:13,320 Now, the key point is that both of these functions 583 00:29:13,320 --> 00:29:18,712 here, both of these are functions of x and y. 584 00:29:18,712 --> 00:29:22,240 x and y, in turn, are functions of r and theta. 585 00:29:22,240 --> 00:29:24,920 So in other words, to differentiate 586 00:29:24,920 --> 00:29:30,120 this thing with respect to r, I must use the chain rule again. 587 00:29:30,120 --> 00:29:33,000 Now, because this may seem difficult for you, 588 00:29:33,000 --> 00:29:34,920 all I'm really saying is, lookit. 589 00:29:34,920 --> 00:29:37,670 If this term here looks messy, since we 590 00:29:37,670 --> 00:29:40,000 know that the partial of w with respect to x 591 00:29:40,000 --> 00:29:44,480 is some function of x and y, let's call that h of x, y. 592 00:29:44,480 --> 00:29:50,570 Then all we're saying is that the partial of the partial of w 593 00:29:50,570 --> 00:29:53,820 with respect to x, with respect to r, 594 00:29:53,820 --> 00:29:56,860 is just the partial of h with respect to r. 595 00:29:56,860 --> 00:29:59,540 But to find the partial of h with respect to r, 596 00:29:59,540 --> 00:30:01,640 we know how to use the chain rule there. 597 00:30:01,640 --> 00:30:02,580 It's just what? 598 00:30:02,580 --> 00:30:04,350 It's the partial of h with respect 599 00:30:04,350 --> 00:30:07,720 to x times the partial of x with respect to r, 600 00:30:07,720 --> 00:30:10,040 plus the partial of h with respect 601 00:30:10,040 --> 00:30:13,780 to y times the partial of y with respect to r. 602 00:30:13,780 --> 00:30:17,655 Of course, if we now remember what h is-- see, 603 00:30:17,655 --> 00:30:21,210 h is the partial of w with respect to x. 
604 00:30:21,210 --> 00:30:25,110 So if I differentiate again with respect to x, 605 00:30:25,110 --> 00:30:28,210 I get the second partial of w with respect to x. 606 00:30:28,210 --> 00:30:30,760 We've already seen that the partial of x with respect to r 607 00:30:30,760 --> 00:30:34,220 is cosine theta, so I have this term. 608 00:30:34,220 --> 00:30:37,950 The partial of h with respect to y really says what? 609 00:30:37,950 --> 00:30:41,310 Differentiate the partial of w with respect 610 00:30:41,310 --> 00:30:44,470 to x, with respect to y. 611 00:30:44,470 --> 00:30:46,820 And the usual way of abbreviating that 612 00:30:46,820 --> 00:30:49,590 is like this, which, again, is explained in the reading 613 00:30:49,590 --> 00:30:51,100 material. 614 00:30:51,100 --> 00:30:55,540 And we now multiply that by the partial of y with respect 615 00:30:55,540 --> 00:30:58,580 to r, which happens to be sine theta. 616 00:30:58,580 --> 00:31:03,870 Now, look at this a few times in your spare time, 617 00:31:03,870 --> 00:31:05,060 if it's bothering you. 618 00:31:05,060 --> 00:31:07,390 It is not really that difficult. It 619 00:31:07,390 --> 00:31:10,260 is messy notation in the sense that you're not 620 00:31:10,260 --> 00:31:12,804 used to notation that's quite that messy. 621 00:31:12,804 --> 00:31:13,720 That's why it's messy. 622 00:31:13,720 --> 00:31:15,150 Once you get used to it, it is not 623 00:31:15,150 --> 00:31:18,550 any tougher than the chain rule for one independent variable. 624 00:31:18,550 --> 00:31:21,910 In fact, to take the partial of the partial of w 625 00:31:21,910 --> 00:31:26,270 with respect to y, with respect to r, I'll do that in one step 626 00:31:26,270 --> 00:31:27,830 without using a substitution. 627 00:31:27,830 --> 00:31:30,220 All I'm saying is that this function 628 00:31:30,220 --> 00:31:32,200 depends on both x and y. 
629 00:31:32,200 --> 00:31:35,710 So to see what its derivative is with respect to r, 630 00:31:35,710 --> 00:31:38,790 I'll see what the contribution of its derivative with respect 631 00:31:38,790 --> 00:31:41,780 to r is due to just x alone. 632 00:31:41,780 --> 00:31:44,120 Then I'll see what contribution of its derivative 633 00:31:44,120 --> 00:31:46,940 with respect to r is due to just y alone. 634 00:31:46,940 --> 00:31:48,780 And by the way, when I say it that way, 635 00:31:48,780 --> 00:31:51,070 notice how quick it is to write this thing down. 636 00:31:51,070 --> 00:31:52,490 I differentiate this with respect 637 00:31:52,490 --> 00:31:56,120 to x multiplied by the partial of x with respect to r. 638 00:31:56,120 --> 00:31:59,130 Add on to that the partial of this with respect to y. 639 00:31:59,130 --> 00:32:02,390 Multiply that by the partial of y with respect to r. 640 00:32:02,390 --> 00:32:05,400 If I do this, notice now I have what? 641 00:32:05,400 --> 00:32:07,710 I have the partial with respect to y. 642 00:32:07,710 --> 00:32:10,370 And I differentiate that with respect to x. 643 00:32:10,370 --> 00:32:12,350 That's written this way. 644 00:32:12,350 --> 00:32:16,960 And by the way, notice that this is the reverse order 645 00:32:16,960 --> 00:32:19,020 of what we did over here. 646 00:32:19,020 --> 00:32:22,060 Namely, in one case, we first differentiated with respect 647 00:32:22,060 --> 00:32:24,410 to x and then with respect to y. 648 00:32:24,410 --> 00:32:26,810 In the other case, we differentiated first 649 00:32:26,810 --> 00:32:29,770 with respect to y and then with respect to x. 650 00:32:29,770 --> 00:32:32,510 So that actually, conceptually there is a difference. 651 00:32:32,510 --> 00:32:35,980 That's why we write these things differently. 
652 00:32:35,980 --> 00:32:40,010 It does, again, turn out that in most cases, 653 00:32:40,010 --> 00:32:42,680 the answer that you get-- thank goodness-- 654 00:32:42,680 --> 00:32:44,820 doesn't depend on the order in which you 655 00:32:44,820 --> 00:32:46,120 perform the derivatives. 656 00:32:46,120 --> 00:32:48,070 But this is not at all self-evident, 657 00:32:48,070 --> 00:32:50,170 even though you'd like to believe that it is. 658 00:32:50,170 --> 00:32:54,050 But we'll talk about that more in the exercises and the like. 659 00:32:54,050 --> 00:32:57,320 But all I'm saying now is that if we put everything together 660 00:32:57,320 --> 00:33:00,270 of what we've had before, we can obtain, 661 00:33:00,270 --> 00:33:03,390 in this particular case, that the second partial of w 662 00:33:03,390 --> 00:33:08,700 with respect to r is this somewhat messy but nonetheless 663 00:33:08,700 --> 00:33:10,770 straightforward expression. 664 00:33:10,770 --> 00:33:12,280 And see, I've circled these things 665 00:33:12,280 --> 00:33:16,120 to sort of tell you that if it is permissible to interchange 666 00:33:16,120 --> 00:33:20,670 the order of differentiation, we could combine these two terms. 667 00:33:20,670 --> 00:33:24,650 On the other hand, if you couldn't interchange the order, 668 00:33:24,650 --> 00:33:27,040 this would be a rather dangerous thing 669 00:33:27,040 --> 00:33:31,090 to do over here because these might be different answers. 670 00:33:31,090 --> 00:33:34,960 As I say again, if you have enough continuity, 671 00:33:34,960 --> 00:33:37,514 it turns out that these two factors are the same. 672 00:33:37,514 --> 00:33:39,180 But that's not the important issue here. 673 00:33:39,180 --> 00:33:41,160 The important issue here is that I 674 00:33:41,160 --> 00:33:45,400 can keep using the chain rule to take higher-order derivatives. 
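Since the board is not visible in a transcript, here is a reconstruction of the "somewhat messy but nonetheless straightforward expression," assembled from the steps spoken above. One assumption on notation: the symbol with denominator written as partial y partial x is read as "differentiate with respect to x first, then y." The order of the terms on the actual board may differ.

```latex
\begin{aligned}
\frac{\partial^2 w}{\partial r^2}
  &= \cos\theta\,\frac{\partial}{\partial r}\!\left(\frac{\partial w}{\partial x}\right)
   + \sin\theta\,\frac{\partial}{\partial r}\!\left(\frac{\partial w}{\partial y}\right) \\
  &= \cos\theta\left[\frac{\partial^2 w}{\partial x^2}\cos\theta
       + \frac{\partial^2 w}{\partial y\,\partial x}\sin\theta\right]
   + \sin\theta\left[\frac{\partial^2 w}{\partial x\,\partial y}\cos\theta
       + \frac{\partial^2 w}{\partial y^2}\sin\theta\right] \\
  &= \frac{\partial^2 w}{\partial x^2}\cos^2\theta
   + \frac{\partial^2 w}{\partial y\,\partial x}\sin\theta\cos\theta
   + \frac{\partial^2 w}{\partial x\,\partial y}\sin\theta\cos\theta
   + \frac{\partial^2 w}{\partial y^2}\sin^2\theta .
\end{aligned}
```

The two middle terms are the ones that could be combined when the order of differentiation may be interchanged.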
675 00:33:45,400 --> 00:33:47,780 And even though the notation is messier, 676 00:33:47,780 --> 00:33:51,020 this happened when we dealt with functions of a single variable. 677 00:33:51,020 --> 00:33:52,750 Remember when we used the chain rule 678 00:33:52,750 --> 00:33:58,640 to find dy/dx when y and x were given, say, as functions of t? 679 00:33:58,640 --> 00:34:01,910 We could also use the chain rule to find the second derivative 680 00:34:01,910 --> 00:34:03,380 of y with respect to x. 681 00:34:03,380 --> 00:34:06,550 But we had to be a little bit more careful of the computation 682 00:34:06,550 --> 00:34:10,409 because certain factors crept in that we had to keep track of. 683 00:34:10,409 --> 00:34:13,870 At any rate, again, to illustrate this idea 684 00:34:13,870 --> 00:34:15,980 rather than to keep droning on about it, 685 00:34:15,980 --> 00:34:19,449 let me take a particularly simple computational problem 686 00:34:19,449 --> 00:34:20,409 to check this thing on. 687 00:34:20,409 --> 00:34:21,908 In other words, what I'm going to do 688 00:34:21,908 --> 00:34:25,469 is take this messy formula over here and apply it to a case 689 00:34:25,469 --> 00:34:29,929 where the arithmetic happens to be very, very simple. 690 00:34:29,929 --> 00:34:32,070 I'm going to rig this very, very nicely. 691 00:34:32,070 --> 00:34:35,130 I'm going to let f of x, y just be x squared plus y squared, 692 00:34:35,130 --> 00:34:36,489 in this case. 693 00:34:36,489 --> 00:34:38,800 Let w be x squared plus y squared. 694 00:34:38,800 --> 00:34:41,790 In polar coordinates, notice that x squared plus y squared 695 00:34:41,790 --> 00:34:45,030 is just r squared. 696 00:34:45,030 --> 00:34:46,820 So w is just r squared. 697 00:34:46,820 --> 00:34:49,870 What is the partial of w with respect to r, then? 698 00:34:49,870 --> 00:34:54,139 The partial of w with respect to r is 2r. 
699 00:34:54,139 --> 00:34:56,675 And if I now differentiate that with respect to r, 700 00:34:56,675 --> 00:35:00,010 the second partial of w with respect to r is 2. 701 00:35:00,010 --> 00:35:03,760 Obviously, one would not use the chain rule in real life 702 00:35:03,760 --> 00:35:06,090 to find the answer to this particular problem. 703 00:35:06,090 --> 00:35:10,340 We've chosen this problem simply to emphasize how 704 00:35:10,340 --> 00:35:12,570 the chain rule would work here. 705 00:35:12,570 --> 00:35:15,350 At any rate, going back here, notice 706 00:35:15,350 --> 00:35:18,000 that it's very simple to see from this equation 707 00:35:18,000 --> 00:35:20,910 that the partial of w with respect to x is 2x. 708 00:35:20,910 --> 00:35:24,480 Therefore, the second partial of w with respect to x is 2. 709 00:35:24,480 --> 00:35:28,090 The partial of w with respect to y is 2y. 710 00:35:28,090 --> 00:35:34,400 Therefore, the second partial of w with respect to y is also 2. 711 00:35:34,400 --> 00:35:37,800 The partial of w with respect to x is a function of x alone, 712 00:35:37,800 --> 00:35:38,870 in this case. 713 00:35:38,870 --> 00:35:42,450 Consequently, the derivative with respect to y will be 0. 714 00:35:42,450 --> 00:35:46,510 Similarly, the partial of w with respect to y is a function of y 715 00:35:46,510 --> 00:35:47,320 alone. 716 00:35:47,320 --> 00:35:48,970 Consequently, when I differentiate 717 00:35:48,970 --> 00:35:51,930 that with respect to x, meaning I'm holding y constant, 718 00:35:51,930 --> 00:35:54,190 that derivative will also be 0. 719 00:35:54,190 --> 00:36:00,670 And the interesting point now is if I take these values 720 00:36:00,670 --> 00:36:06,430 and substitute those into this equation, what happens? 721 00:36:06,430 --> 00:36:06,990 Look. 722 00:36:06,990 --> 00:36:10,900 The second partial of w with respect to x is just 2. 723 00:36:10,900 --> 00:36:15,840 The second partial of w with respect to y is just 2. 
724 00:36:15,840 --> 00:36:19,700 The mixed partials are both 0, regardless of which order 725 00:36:19,700 --> 00:36:20,860 you did them in. 726 00:36:20,860 --> 00:36:22,620 That's what we saw over here. 727 00:36:22,620 --> 00:36:25,100 So consequently, according to this recipe, 728 00:36:25,100 --> 00:36:27,840 the second partial of w with respect to r 729 00:36:27,840 --> 00:36:32,580 is 2 cosine squared theta plus 0 plus 2 sine 730 00:36:32,580 --> 00:36:35,390 squared theta plus 0, where the reason I've 731 00:36:35,390 --> 00:36:37,326 written these 0's in is simply so 732 00:36:37,326 --> 00:36:39,200 that when you're looking at your notes later, 733 00:36:39,200 --> 00:36:42,980 that traces the analog of these terms over here. 734 00:36:42,980 --> 00:36:47,310 At any rate, notice now that if I add these up, 735 00:36:47,310 --> 00:36:49,740 2 cosine squared theta plus 2 sine 736 00:36:49,740 --> 00:36:51,900 squared theta, since sine squared 737 00:36:51,900 --> 00:36:54,330 theta plus cosine squared theta is 1, 738 00:36:54,330 --> 00:36:57,570 this sum is just-- I'll write that in white chalk 739 00:36:57,570 --> 00:37:00,020 just so we don't accentuate it. 740 00:37:00,020 --> 00:37:01,830 Let it just be part of the answer. 741 00:37:01,830 --> 00:37:05,180 This is 2 plus 0, which is 2. 742 00:37:05,180 --> 00:37:09,280 And this certainly does check with the result 743 00:37:09,280 --> 00:37:11,520 that we got the so-called easier way. 744 00:37:11,520 --> 00:37:14,990 And again, I don't want to leave you with the idea 745 00:37:14,990 --> 00:37:18,320 that the second way was just the hard way of doing 746 00:37:18,320 --> 00:37:22,140 the same problem that we did easily the first way. 747 00:37:22,140 --> 00:37:25,490 I picked a simple example so you can see how this works. 
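The same check can be sketched numerically. This is only an illustration and not part of the lecture: the function names, the sample point (r = 1.5, theta = 0.7), and the step size h are all assumptions chosen here. It compares a finite-difference estimate of the second partial of w with respect to r against the chain-rule expression, using the values 2, 0, 0, 2 worked out above for w = x squared plus y squared.

```python
import math


def w(x, y):
    # The example from the lecture: w = x^2 + y^2.
    return x * x + y * y


def w_polar(r, theta):
    # w rewritten through x = r cos(theta), y = r sin(theta).
    return w(r * math.cos(theta), r * math.sin(theta))


def second_partial_r_direct(r, theta, h=1e-3):
    # Central-difference estimate of d^2 w / dr^2, theta held constant.
    # The step size h is an arbitrary illustrative choice.
    return (w_polar(r + h, theta) - 2.0 * w_polar(r, theta)
            + w_polar(r - h, theta)) / (h * h)


def second_partial_r_chain(theta, w_xx=2.0, w_yx=0.0, w_xy=0.0, w_yy=2.0):
    # Chain-rule expression; the defaults are the second partials
    # of w = x^2 + y^2 (pure partials 2, mixed partials 0).
    c, s = math.cos(theta), math.sin(theta)
    return w_xx * c * c + w_yx * s * c + w_xy * s * c + w_yy * s * s


r, theta = 1.5, 0.7  # hypothetical sample point
direct = second_partial_r_direct(r, theta)
chain = second_partial_r_chain(theta)
print(direct, chain)
```

Both numbers should agree with the value 2 found the easier way, up to rounding in the finite difference.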
748 00:37:25,490 --> 00:37:28,140 I'm going to have a multitude of exercises 749 00:37:28,140 --> 00:37:31,920 for you to do in the next unit, simply so that you'll pick up 750 00:37:31,920 --> 00:37:36,160 the kind of know-how that will allow you to change variables 751 00:37:36,160 --> 00:37:40,750 using the chain rule, with a minimum degree of difficulty. 752 00:37:40,750 --> 00:37:42,660 In fact, hopefully, I would like to feel 753 00:37:42,660 --> 00:37:45,930 by the time we're through with the next two units, 754 00:37:45,930 --> 00:37:49,280 you will be doing this almost as second nature. 755 00:37:49,280 --> 00:37:51,950 Well, we have other topics to consider 756 00:37:51,950 --> 00:37:55,780 in terms of our linear approximations and the like. 757 00:37:55,780 --> 00:37:58,950 We'll talk about that more as the course unfolds. 758 00:37:58,950 --> 00:38:02,560 For the time being, I would like you to concentrate simply 759 00:38:02,560 --> 00:38:04,930 on mastering the chain rule. 760 00:38:04,930 --> 00:38:07,620 And so until we meet next time, good bye. 761 00:38:12,620 --> 00:38:15,010 Funding for the publication of this video 762 00:38:15,010 --> 00:38:19,860 was provided by the Gabriella and Paul Rosenbaum Foundation. 763 00:38:19,860 --> 00:38:24,040 Help OCW continue to provide free and open access to MIT 764 00:38:24,040 --> 00:38:28,435 courses by making a donation at ocw.mit.edu/donate.