1 00:00:00,040 --> 00:00:02,460 The following content is provided under a Creative 2 00:00:02,460 --> 00:00:03,870 Commons license. 3 00:00:03,870 --> 00:00:06,320 Your support will help MIT OpenCourseWare 4 00:00:06,320 --> 00:00:10,560 continue to offer high-quality educational resources for free. 5 00:00:10,560 --> 00:00:13,300 To make a donation or view additional materials 6 00:00:13,300 --> 00:00:17,210 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,210 --> 00:00:17,862 at ocw.mit.edu. 8 00:00:31,720 --> 00:00:32,549 PROFESSOR: Hi. 9 00:00:32,549 --> 00:00:34,590 In today's lesson, hopefully we will 10 00:00:34,590 --> 00:00:38,760 begin to reap the rewards of our digression 11 00:00:38,760 --> 00:00:41,180 into the subject of linear algebra. 12 00:00:41,180 --> 00:00:43,560 Recall that in the last few lectures, what 13 00:00:43,560 --> 00:00:48,210 we have been dealing with is the problem of inverting systems 14 00:00:48,210 --> 00:00:51,130 of linear equations. 15 00:00:51,130 --> 00:00:53,160 And what we would like to do today 16 00:00:53,160 --> 00:00:55,690 is to tackle the more general problem 17 00:00:55,690 --> 00:00:58,010 of inverting systems of equations, 18 00:00:58,010 --> 00:01:00,510 even if the equations are not linear. 19 00:01:00,510 --> 00:01:03,240 And with this in mind, I simply entitle 20 00:01:03,240 --> 00:01:07,290 today's lesson "Inverting More General Systems of Equations." 21 00:01:07,290 --> 00:01:09,700 And by way of a very brief review, 22 00:01:09,700 --> 00:01:14,730 recall that given the linear system y_1 equals a_(1,1)*x_1 23 00:01:14,730 --> 00:01:19,940 plus, et cetera, a_(1,n)*x_n; up to y_n equals a_(n,1)*x_1 plus, 24 00:01:19,940 --> 00:01:22,100 et cetera, a_(n,n)*x_n. 25 00:01:22,100 --> 00:01:25,910 We saw that that system was invertible, meaning what? 
26 00:01:25,910 --> 00:01:28,890 That we could solve this system for the x's 27 00:01:28,890 --> 00:01:31,760 as linear combinations of the y's if 28 00:01:31,760 --> 00:01:37,960 and only if the inverse of the matrix of coefficients 29 00:01:37,960 --> 00:01:40,200 of this system exists. 30 00:01:40,200 --> 00:01:43,620 And in terms of determinants, recall that that means if 31 00:01:43,620 --> 00:01:47,270 and only if the determinant of the matrix of coefficients 32 00:01:47,270 --> 00:01:49,420 is not 0. 33 00:01:49,420 --> 00:01:54,170 If the determinant of the matrix of coefficients was 0, then 34 00:01:54,170 --> 00:01:58,040 we saw that y_1 up to y_n are not independent. 35 00:01:58,040 --> 00:02:00,530 In fact, in that context, that's where 36 00:02:00,530 --> 00:02:05,870 we came to grips with the concept of a constraint, 37 00:02:05,870 --> 00:02:08,979 that the constraint actually turned out to be what? 38 00:02:08,979 --> 00:02:12,780 The fact that the y's were not independent. 39 00:02:12,780 --> 00:02:16,550 Meaning that we could express a linear combination of the y's 40 00:02:16,550 --> 00:02:18,180 equal to 0. 41 00:02:18,180 --> 00:02:20,800 So what situation were we at then? 42 00:02:20,800 --> 00:02:25,460 Given a linear system, if the matrix of coefficients 43 00:02:25,460 --> 00:02:28,050 does not have its determinant equal to 0, 44 00:02:28,050 --> 00:02:30,020 the system is invertible. 45 00:02:30,020 --> 00:02:32,360 We can solve for the x's in terms of the y's. 46 00:02:32,360 --> 00:02:36,620 If the determinant of the matrix of coefficients is 0, 47 00:02:36,620 --> 00:02:39,950 the y's are not linearly independent. 48 00:02:39,950 --> 00:02:43,840 In other words, the system is not invertible. 49 00:02:43,840 --> 00:02:45,780 And now what we would like to do is 50 00:02:45,780 --> 00:02:49,290 to tackle the more general problem of inverting 51 00:02:49,290 --> 00:02:51,180 any system of equations.
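A minimal sketch of the review above for the case n = 2, in plain Python. The function names `det2` and `solve2` and the sample coefficients are illustrative assumptions, not part of the lecture. It shows a system that inverts because its determinant is nonzero, and one that does not because the y's satisfy a constraint.

```python
def det2(a11, a12, a21, a22):
    """Determinant of the 2x2 coefficient matrix [[a11, a12], [a21, a22]]."""
    return a11 * a22 - a12 * a21


def solve2(a11, a12, a21, a22, y1, y2):
    """Solve y1 = a11*x1 + a12*x2, y2 = a21*x1 + a22*x2 for (x1, x2).

    Returns None when the determinant is 0, i.e. when the system
    is not invertible.
    """
    d = det2(a11, a12, a21, a22)
    if d == 0:
        return None
    x1 = (a22 * y1 - a12 * y2) / d
    x2 = (a11 * y2 - a21 * y1) / d
    return (x1, x2)


# Determinant is 1*4 - 2*3 = -2, so the x's can be recovered from the y's.
invertible = solve2(1, 2, 3, 4, 5, 11)

# Determinant is 1*4 - 2*2 = 0: here y2 = 2*y1 for every choice of x's,
# which is the constraint y2 - 2*y1 = 0 among the y's.
singular = solve2(1, 2, 2, 4, 5, 10)

print(invertible, singular)
```

In the singular case the second equation carries no information beyond the first, so the y's cannot be chosen independently.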
52 00:02:51,180 --> 00:02:52,690 And by any, I mean what? 53 00:02:52,690 --> 00:02:57,470 Now we have y_1 is f sub 1 of x_1 up to x_n, 54 00:02:57,470 --> 00:03:02,440 et cetera, y sub n is f sub n of x_1 up to x_n. 55 00:03:02,440 --> 00:03:04,440 And now what we're saying is, we do not 56 00:03:04,440 --> 00:03:07,870 know whether the f's are linear or not. 57 00:03:07,870 --> 00:03:10,030 In fact, if they are linear, we're 58 00:03:10,030 --> 00:03:13,330 back to, as a special case, what we've tackled before. 59 00:03:13,330 --> 00:03:16,620 But now we're assuming that these need not be linear. 60 00:03:16,620 --> 00:03:18,750 And what we would like to do is to invert 61 00:03:18,750 --> 00:03:22,220 this system, assuming, of course, that such an inversion 62 00:03:22,220 --> 00:03:23,700 is possible. 63 00:03:23,700 --> 00:03:26,160 Again, what do we mean by the inversion? 64 00:03:26,160 --> 00:03:28,740 We mean, somehow, we would like to know that 65 00:03:28,740 --> 00:03:32,140 given this system of n equations and n unknowns, 66 00:03:32,140 --> 00:03:36,480 where the y's are expressed explicitly in terms of the x's, 67 00:03:36,480 --> 00:03:39,630 can we invert this, and express the x's 68 00:03:39,630 --> 00:03:44,070 in terms of the y's, either explicitly or implicitly? 69 00:03:44,070 --> 00:03:46,080 That's the problem that we'd like to tackle. 70 00:03:46,080 --> 00:03:48,770 And what we're going to use to tackle this 71 00:03:48,770 --> 00:03:51,740 is our old friend, the differential, 72 00:03:51,740 --> 00:03:53,960 the linear approximation, again, that 73 00:03:53,960 --> 00:03:57,270 motivated our whole study of linear systems 74 00:03:57,270 --> 00:03:58,410 in the first place. 
75 00:03:58,410 --> 00:04:02,030 Remember, we already know that if y_1 76 00:04:02,030 --> 00:04:05,880 is a differentiable function of x_1 up to x_n, 77 00:04:05,880 --> 00:04:09,410 that delta y_1 sub tan is exactly 78 00:04:09,410 --> 00:04:13,620 equal to the partial of f_1 with respect to x_1 times 79 00:04:13,620 --> 00:04:17,620 delta x_1 plus, et cetera, the partial of f_1 with respect 80 00:04:17,620 --> 00:04:20,010 to x_n times delta x_n. 81 00:04:20,010 --> 00:04:24,290 And in terms of reviewing the notation, because we will 82 00:04:24,290 --> 00:04:26,380 use it later in the lecture, notice 83 00:04:26,380 --> 00:04:31,790 that delta y_1 tan was what we abbreviated to be dy_1. 84 00:04:31,790 --> 00:04:34,910 And that delta x_1 up to delta x_n 85 00:04:34,910 --> 00:04:39,970 are abbreviated respectively as dx_1 up to dx_n. 86 00:04:39,970 --> 00:04:42,500 Generalization of what we did for the differential 87 00:04:42,500 --> 00:04:44,580 in the case of one independent variable and we 88 00:04:44,580 --> 00:04:46,060 went through this discussion when 89 00:04:46,060 --> 00:04:49,779 we talked about exact differentials in block three. 90 00:04:49,779 --> 00:04:51,570 And in this similar way, we could of course 91 00:04:51,570 --> 00:04:55,450 express delta y_2 tan, et cetera, 92 00:04:55,450 --> 00:04:58,890 and we can talk about the linear approximations. 93 00:04:58,890 --> 00:04:59,670 All right? 94 00:04:59,670 --> 00:05:03,850 Not the true delta y's now, but the linear part of the delta 95 00:05:03,850 --> 00:05:07,340 y's, the delta y_1 sub tan, or if you prefer, 96 00:05:07,340 --> 00:05:10,570 delta y_1 sub lin, L-I-N. All right? 97 00:05:10,570 --> 00:05:12,960 And the key point now is what?
98 00:05:12,960 --> 00:05:16,740 If the y's happen to be continuously differentiable 99 00:05:16,740 --> 00:05:20,160 functions of the x's in a neighborhood of the point 100 00:05:20,160 --> 00:05:24,040 x bar equals a bar-- in other words, x_1 up to x_n 101 00:05:24,040 --> 00:05:26,460 is equal to a_1 comma, et cetera, 102 00:05:26,460 --> 00:05:30,190 up to a_n-- x_1 equals a_1, x_2 equals a_2, 103 00:05:30,190 --> 00:05:34,290 et cetera-- then near that point, what we're saying 104 00:05:34,290 --> 00:05:34,800 is what? 105 00:05:34,800 --> 00:05:37,840 That the error term goes to 0 very rapidly. 106 00:05:37,840 --> 00:05:41,240 And as long as the functions are continuously differentiable, 107 00:05:41,240 --> 00:05:41,950 it means what? 108 00:05:41,950 --> 00:05:45,640 That the change in y is approximately 109 00:05:45,640 --> 00:05:48,710 equal to the change in y sub tan. 110 00:05:48,710 --> 00:05:50,780 So that what we're saying is-- and remember 111 00:05:50,780 --> 00:05:52,640 this is what motivated our linear systems 112 00:05:52,640 --> 00:05:56,010 in the first place-- that delta y_1 is approximately 113 00:05:56,010 --> 00:05:58,760 the partial of y_1 with respect to x_1, 114 00:05:58,760 --> 00:06:03,750 evaluated when x bar is a bar, times delta x_1 plus et cetera, 115 00:06:03,750 --> 00:06:06,940 the partial of y_1 with respect to x_n, also evaluated 116 00:06:06,940 --> 00:06:11,600 when x bar equals a bar, times delta x sub n et cetera, 117 00:06:11,600 --> 00:06:14,240 down to delta y sub n is approximately 118 00:06:14,240 --> 00:06:17,380 equal to the partial of y sub n with respect to x_1 times 119 00:06:17,380 --> 00:06:21,410 delta x_1, plus et cetera, the partial of y sub n with respect 120 00:06:21,410 --> 00:06:23,320 to x sub n times delta x_n, 121 00:06:23,320 --> 00:06:27,120 where all of these partials are evaluated specifically 122 00:06:27,120 --> 00:06:29,950 at the point x bar equals a bar. 
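Written out, the system just described reads as follows. This is a transcription of the spoken formulas into symbols; the vertical-bar notation for "evaluated at x bar equals a bar" is a notational assumption.

```latex
\begin{aligned}
\Delta y_1 &\approx \left.\frac{\partial y_1}{\partial x_1}\right|_{\bar{x}=\bar{a}} \Delta x_1
  + \cdots + \left.\frac{\partial y_1}{\partial x_n}\right|_{\bar{x}=\bar{a}} \Delta x_n \\
&\;\;\vdots \\
\Delta y_n &\approx \left.\frac{\partial y_n}{\partial x_1}\right|_{\bar{x}=\bar{a}} \Delta x_1
  + \cdots + \left.\frac{\partial y_n}{\partial x_n}\right|_{\bar{x}=\bar{a}} \Delta x_n
\end{aligned}
```

Each right-hand side is exactly the corresponding delta y sub tan; the approximation sign refers only to replacing delta y sub tan by the true delta y.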
123 00:06:29,950 --> 00:06:31,450 The point being what? 124 00:06:31,450 --> 00:06:33,830 That since we're evaluating all these partials 125 00:06:33,830 --> 00:06:36,860 at a particular point, every one of these coefficients 126 00:06:36,860 --> 00:06:38,080 is a constant. 127 00:06:38,080 --> 00:06:40,390 You see in general, these partial derivatives are 128 00:06:40,390 --> 00:06:44,250 functions, but as soon as we evaluate them at a given value, 129 00:06:44,250 --> 00:06:46,450 they become specific numbers. 130 00:06:46,450 --> 00:06:48,050 So this is now what? 131 00:06:48,050 --> 00:06:51,540 On the right-hand side, we have a linear system, 132 00:06:51,540 --> 00:06:55,650 and the point is that we are using the fundamental result 133 00:06:55,650 --> 00:06:59,850 that we can use the linear approximation as being 134 00:06:59,850 --> 00:07:03,330 very nearly equal to the true change in y. 135 00:07:03,330 --> 00:07:05,490 And it's in that sense that we have 136 00:07:05,490 --> 00:07:10,060 derived our system of n linear equations and n unknowns. 137 00:07:10,060 --> 00:07:12,370 You see, this is a linear system. 138 00:07:12,370 --> 00:07:14,540 The approximation-- again, let me emphasize that, 139 00:07:14,540 --> 00:07:15,706 because it's very important. 140 00:07:15,706 --> 00:07:17,790 The approximation hinges on the fact 141 00:07:17,790 --> 00:07:19,740 that what this is exactly equal to 142 00:07:19,740 --> 00:07:22,880 would be delta y_1 tan et cetera, delta y sub n. 143 00:07:22,880 --> 00:07:24,610 But we're assuming that these are 144 00:07:24,610 --> 00:07:27,440 close enough in a small enough neighborhood of x 145 00:07:27,440 --> 00:07:31,340 bar equals a bar, so that we can make this particular statement. 146 00:07:31,340 --> 00:07:33,140 So this is a linear system. 
147 00:07:33,140 --> 00:07:34,620 And because it is a linear system, 148 00:07:34,620 --> 00:07:38,510 we're back to our special case that this system is invertible 149 00:07:38,510 --> 00:07:43,220 if and only if the determinant of the coefficients-- matrix 150 00:07:43,220 --> 00:07:45,410 coefficients-- is not 0. 151 00:07:45,410 --> 00:07:47,580 And what is that matrix coefficient? 152 00:07:47,580 --> 00:07:51,050 It consists of n rows and n columns, 153 00:07:51,050 --> 00:07:55,610 and the row is determined by the subscript on the y, 154 00:07:55,610 --> 00:07:59,380 and the column is determined by the subscript on the x. 155 00:07:59,380 --> 00:08:02,660 So we write that matrix as the partial of y 156 00:08:02,660 --> 00:08:05,640 sub i with respect to x sub j. 157 00:08:05,640 --> 00:08:09,470 In other words, the i-th row involves the y's 158 00:08:09,470 --> 00:08:12,790 and the j-th column the x. 159 00:08:12,790 --> 00:08:13,480 All right? 160 00:08:13,480 --> 00:08:18,410 And that's exactly, then, how we handle this particular system. 161 00:08:18,410 --> 00:08:18,910 OK? 162 00:08:18,910 --> 00:08:20,040 Quite mechanically. 163 00:08:20,040 --> 00:08:22,800 And let's just summarize that, then, very quickly. 164 00:08:22,800 --> 00:08:26,200 If f sub 1 et cetera and f sub n are continuously 165 00:08:26,200 --> 00:08:30,220 differentiable functions of x_1 up to x_n near x bar 166 00:08:30,220 --> 00:08:34,809 equals a bar, then the system y_1 equals f_1 of x_1 167 00:08:34,809 --> 00:08:38,820 up to x_n et cetera, y_n equals f sub n of x_1 up 168 00:08:38,820 --> 00:08:42,510 to x_n-- that system is invertible if and only 169 00:08:42,510 --> 00:08:48,240 if the determinant-- the n by n determinant 170 00:08:48,240 --> 00:08:52,930 whose entry in i-th row, j-th column is the partial of y 171 00:08:52,930 --> 00:08:55,350 sub i with respect to x sub j-- if 172 00:08:55,350 --> 00:08:58,160 and only if that determinant is not 0. 
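In symbols, the matrix of coefficients of the linearized system and the condition just stated can be sketched like this (same assumed bar notation for evaluation at the point):

```latex
\left[\frac{\partial y_i}{\partial x_j}\right]_{\bar{x}=\bar{a}}
=
\begin{pmatrix}
\dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial y_n}{\partial x_1} & \cdots & \dfrac{\partial y_n}{\partial x_n}
\end{pmatrix}_{\bar{x}=\bar{a}},
\qquad
\det\left[\frac{\partial y_i}{\partial x_j}\right]_{\bar{x}=\bar{a}} \neq 0 .
```

Row i belongs to y sub i and column j to x sub j. The linear system in the delta x's is invertible exactly when this determinant is nonzero.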
173 00:08:58,160 --> 00:09:00,440 Now what does that mean, to say that it's invertible? 174 00:09:00,440 --> 00:09:04,060 It means that we can solve for the x's in terms of the y's. 175 00:09:04,060 --> 00:09:07,050 Now we may not be able to do that explicitly. 176 00:09:07,050 --> 00:09:11,110 The best we may be able to do is to do that implicitly, 177 00:09:11,110 --> 00:09:11,680 meaning this. 178 00:09:11,680 --> 00:09:13,096 Let me just come back to something 179 00:09:13,096 --> 00:09:14,830 I said before to make sure that there's 180 00:09:14,830 --> 00:09:16,630 no misunderstanding about this. 181 00:09:16,630 --> 00:09:20,170 What we're saying is, that in this particular linear system 182 00:09:20,170 --> 00:09:23,390 of equations, as long as this determinant of coefficients is 183 00:09:23,390 --> 00:09:29,280 not 0, we can explicitly solve for delta x_1 up to delta x_n 184 00:09:29,280 --> 00:09:32,580 in terms of delta y_1 up to delta y_n, 185 00:09:32,580 --> 00:09:35,380 even if we may not be able to solve explicitly 186 00:09:35,380 --> 00:09:37,770 for the x's in terms of the y's. 187 00:09:37,770 --> 00:09:39,150 In other words, the crucial point 188 00:09:39,150 --> 00:09:42,030 is, we can solve for the changes in x 189 00:09:42,030 --> 00:09:44,680 in terms of the changes in y. 190 00:09:44,680 --> 00:09:48,850 And that is implicitly enough to see what the x's look like 191 00:09:48,850 --> 00:09:49,890 in terms of the y's. 192 00:09:49,890 --> 00:09:51,290 Once we know what the change of x 193 00:09:51,290 --> 00:09:53,600 looks like in terms of the change in y, 194 00:09:53,600 --> 00:09:57,420 then we really know what x itself 195 00:09:57,420 --> 00:09:59,300 looks like in terms of the y's, even 196 00:09:59,300 --> 00:10:03,430 as I say it may be implicitly rather than explicitly. 197 00:10:03,430 --> 00:10:06,530 At any rate, this matrix is so important 198 00:10:06,530 --> 00:10:09,400 that it's given a very special name. 
199 00:10:09,400 --> 00:10:12,840 Definition: The matrix whose entry 200 00:10:12,840 --> 00:10:15,750 in the i-th row, j-th column is the partial 201 00:10:15,750 --> 00:10:20,160 of y sub i with respect to x sub j is called the Jacobian. 202 00:10:20,160 --> 00:10:22,320 I put "matrix" in quotation marks 203 00:10:22,320 --> 00:10:25,170 here, because some people refer to the Jacobian 204 00:10:25,170 --> 00:10:26,300 meaning a matrix. 205 00:10:26,300 --> 00:10:30,710 Other people call the Jacobian the determinant of the Jacobian 206 00:10:30,710 --> 00:10:31,210 matrix. 207 00:10:31,210 --> 00:10:33,370 I'm not going to make any distinction this way. 208 00:10:33,370 --> 00:10:35,070 It'll be clear from context. 209 00:10:35,070 --> 00:10:37,450 Usually when I say the "Jacobian," 210 00:10:37,450 --> 00:10:39,930 I will mean the Jacobian matrix. 211 00:10:39,930 --> 00:10:42,860 I might sometimes mean the Jacobian determinant. 212 00:10:42,860 --> 00:10:45,790 And so to avoid ambiguity, I will hopefully 213 00:10:45,790 --> 00:10:48,420 say "Jacobian matrix" when I mean the matrix, 214 00:10:48,420 --> 00:10:51,610 and "Jacobian determinant" when I mean the determinant. 215 00:10:51,610 --> 00:10:53,600 But should I forget to do this, or should you 216 00:10:53,600 --> 00:10:55,830 read a textbook where the word Jacobian is 217 00:10:55,830 --> 00:10:59,460 used without the proper noun after it, 218 00:10:59,460 --> 00:11:02,140 it should be clear from context which is meant. 219 00:11:02,140 --> 00:11:05,780 But at any rate, that's what we mean by the Jacobian of y_1 220 00:11:05,780 --> 00:11:09,480 up to y_n with respect to x_1 up to x_n. 221 00:11:09,480 --> 00:11:14,600 And this Jacobian matrix is often abbreviated by-- you 222 00:11:14,600 --> 00:11:17,600 either write J for Jacobian of y_1 223 00:11:17,600 --> 00:11:20,930 up to y_n over x_1 up to x_n. 
224 00:11:20,930 --> 00:11:25,220 Or else you use a modification of the partial derivative 225 00:11:25,220 --> 00:11:26,250 notation. 226 00:11:26,250 --> 00:11:30,560 And you sort of read this as if it said the partial of y_1 227 00:11:30,560 --> 00:11:35,500 up to y_n divided by the partial of x_1 up to x_n. 228 00:11:35,500 --> 00:11:39,620 And again, there is the same analogy 229 00:11:39,620 --> 00:11:42,810 between why this notation was invented 230 00:11:42,810 --> 00:11:47,050 and why the notation dy divided by dx was invented. 231 00:11:47,050 --> 00:11:51,470 But in terms of giving you a general overview of what we're 232 00:11:51,470 --> 00:11:53,640 interested in, I think I would like 233 00:11:53,640 --> 00:11:56,000 to leave the discussion of why we 234 00:11:56,000 --> 00:12:00,160 write this in a fractional form to the homework. 235 00:12:00,160 --> 00:12:04,910 In other words, as either a supplement to the learning 236 00:12:04,910 --> 00:12:09,420 exercises or else as part of the supplementary notes in one 237 00:12:09,420 --> 00:12:11,430 form or another, we will take care 238 00:12:11,430 --> 00:12:14,300 of all of the computational aspects of how 239 00:12:14,300 --> 00:12:16,550 one handles the Jacobian. 240 00:12:16,550 --> 00:12:19,560 But what I wanted to do now was to emphasize 241 00:12:19,560 --> 00:12:24,750 how one uses the Jacobian matrix and differentials to invert 242 00:12:24,750 --> 00:12:27,810 systems of n equations in n unknowns. 243 00:12:27,810 --> 00:12:30,000 And I will use the technique that's 244 00:12:30,000 --> 00:12:31,720 used right in the textbook, and which 245 00:12:31,720 --> 00:12:34,630 is part of the assignment for today's unit. 246 00:12:34,630 --> 00:12:36,610 The example that I have in mind-- 247 00:12:36,610 --> 00:12:39,410 I simply picked the usual case, n equals 2, 248 00:12:39,410 --> 00:12:41,710 so that things don't get that messy. 
249 00:12:41,710 --> 00:12:43,640 Using again, the standard notation 250 00:12:43,640 --> 00:12:46,050 when one deals with two independent variables, 251 00:12:46,050 --> 00:12:48,770 let u equal x squared minus y squared. 252 00:12:48,770 --> 00:12:50,450 Let v equal to 2x*y. 253 00:12:50,450 --> 00:12:52,140 Let's suppose now that I would like 254 00:12:52,140 --> 00:12:53,980 to find the partial of x with respect 255 00:12:53,980 --> 00:12:58,180 to u, treating v as the other independent variable. 256 00:12:58,180 --> 00:13:01,250 You see, again, I want to review this thing. 257 00:13:01,250 --> 00:13:03,790 When I say find the partial of x with respect 258 00:13:03,790 --> 00:13:06,530 to u holding v constant, it is not 259 00:13:06,530 --> 00:13:08,270 the same as finding the partial of u 260 00:13:08,270 --> 00:13:11,710 with respect to x from here and then just inverting it. 261 00:13:11,710 --> 00:13:14,320 Namely the partial of u with respect to x here 262 00:13:14,320 --> 00:13:16,820 assumes that y is being held constant. 263 00:13:16,820 --> 00:13:19,320 And if you then invert that, recall that what you're finding 264 00:13:19,320 --> 00:13:23,090 is the partial of x with respect to u treating y 265 00:13:23,090 --> 00:13:24,230 as the other variable. 266 00:13:24,230 --> 00:13:26,140 We want the partial of x with respect 267 00:13:26,140 --> 00:13:29,970 to u treating u and v as the pair of independent variables. 268 00:13:29,970 --> 00:13:30,580 Why? 269 00:13:30,580 --> 00:13:34,160 Because that's exactly what you mean by inverting this system. 270 00:13:34,160 --> 00:13:37,530 This system as given expresses u and v 271 00:13:37,530 --> 00:13:40,740 in terms of the pair of independent variables x and y.
272 00:13:40,740 --> 00:13:43,990 And now what you'd like to do is express the pair of variables 273 00:13:43,990 --> 00:13:48,540 x and y in terms of the independent variables u and v, 274 00:13:48,540 --> 00:13:51,200 assuming of course, that u and v are indeed 275 00:13:51,200 --> 00:13:52,790 independent variables. 276 00:13:52,790 --> 00:13:55,140 The mechanical solution is simply this. 277 00:13:55,140 --> 00:13:57,580 Using the language of differentials, 278 00:13:57,580 --> 00:14:00,350 we write down du and dv. 279 00:14:00,350 --> 00:14:03,060 Namely, du is what? 280 00:14:03,060 --> 00:14:06,110 The partial of u with respect to x times dx, 281 00:14:06,110 --> 00:14:09,420 plus the partial of u with respect to y times dy. 282 00:14:09,420 --> 00:14:12,670 And from the relationship that u equals x squared minus y 283 00:14:12,670 --> 00:14:17,690 squared, we see that du is 2x*dx minus 2y*dy. 284 00:14:17,690 --> 00:14:23,290 Similarly, since the partial of v with respect to x is 2y, 285 00:14:23,290 --> 00:14:26,500 and the partial of v with respect to y is 2x, 286 00:14:26,500 --> 00:14:31,300 we see that dv is 2y*dx plus 2x*dy. 287 00:14:31,300 --> 00:14:33,700 If we now assume that this is evaluated 288 00:14:33,700 --> 00:14:37,840 at some point (x_0, y_0), what do we have over here? 289 00:14:37,840 --> 00:14:40,620 Once we've picked out a point (x_0, y_0) 290 00:14:40,620 --> 00:14:44,700 to evaluate this at-- and I left out that because it simply 291 00:14:44,700 --> 00:14:46,620 would make the notation too long, 292 00:14:46,620 --> 00:14:48,500 but I'll talk about that more later. 293 00:14:48,500 --> 00:14:51,430 Assuming that we've evaluated this at a particular fixed 294 00:14:51,430 --> 00:14:55,090 value of x and y, we have what? 295 00:14:55,090 --> 00:14:59,836 du is some constant times dx plus some constant times dy. 296 00:14:59,836 --> 00:15:03,960 dv is some constant times dx plus some constant times dy. 
297 00:15:03,960 --> 00:15:06,670 In other words, du and dv are expressed 298 00:15:06,670 --> 00:15:09,410 as linear combinations of dx and dy. 299 00:15:09,410 --> 00:15:12,530 We know how to invert this type of a system, 300 00:15:12,530 --> 00:15:14,060 assuming that it's invertible. 301 00:15:14,060 --> 00:15:17,270 Sparing you the details, what I'm saying is what? 302 00:15:17,270 --> 00:15:19,390 I could multiply, say, the top equation 303 00:15:19,390 --> 00:15:22,140 by x, the bottom equation by y. 304 00:15:22,140 --> 00:15:26,100 And when I add them, the terms involving dy will drop out. 305 00:15:26,100 --> 00:15:31,950 And I will get x*du plus y*dv is 2x*dx plus 2y*dx. 306 00:15:31,950 --> 00:15:35,460 In other words twice-- it's 2 x squared. 307 00:15:35,460 --> 00:15:39,240 I multiply the top equation by x, the bottom equation by y. 308 00:15:39,240 --> 00:15:41,620 So the right-hand side here becomes 309 00:15:41,620 --> 00:15:45,360 2 x squared dx plus 2 y squared dx, which is twice 310 00:15:45,360 --> 00:15:47,490 x squared plus y squared dx. 311 00:15:47,490 --> 00:15:49,840 I now divide both sides of the equation 312 00:15:49,840 --> 00:15:53,780 through by twice x squared plus y squared. 313 00:15:53,780 --> 00:15:56,670 I wind up with the fact that dx is 314 00:15:56,670 --> 00:16:01,270 x over twice the quantity x squared plus y squared du, 315 00:16:01,270 --> 00:16:06,510 plus the quantity y over twice x squared plus y squared dv. 
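The elimination just described, written out term by term (a transcription of the spoken steps, with no new assumptions beyond the notation):

```latex
\begin{aligned}
du &= 2x\,dx - 2y\,dy, \qquad dv = 2y\,dx + 2x\,dy, \\
x\,du + y\,dv &= \left(2x^2\,dx - 2xy\,dy\right) + \left(2y^2\,dx + 2xy\,dy\right)
             = 2\left(x^2 + y^2\right)dx, \\
dx &= \frac{x}{2\left(x^2 + y^2\right)}\,du + \frac{y}{2\left(x^2 + y^2\right)}\,dv .
\end{aligned}
```

The two dy terms cancel because they are -2xy dy and +2xy dy, which is why the multipliers x and y were chosen.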
316 00:16:06,510 --> 00:16:09,010 Recall that I also know by definition 317 00:16:09,010 --> 00:16:11,700 that dx is the partial of x with respect 318 00:16:11,700 --> 00:16:15,960 to u times du plus the partial of x with respect to v times 319 00:16:15,960 --> 00:16:19,990 dv, recalling from our lecture on exact differentials 320 00:16:19,990 --> 00:16:23,370 that the only way two differentials in terms 321 00:16:23,370 --> 00:16:26,460 of the du and dv can be equal is if they're equal 322 00:16:26,460 --> 00:16:28,610 coefficient by coefficient. 323 00:16:28,610 --> 00:16:31,970 I can therefore equate the two coefficients of du 324 00:16:31,970 --> 00:16:35,020 to conclude that the partial of x with respect to u 325 00:16:35,020 --> 00:16:39,670 is x over 2 times the quantity x squared plus y squared. 326 00:16:39,670 --> 00:16:42,120 In fact, I can get the extra piece of information, 327 00:16:42,120 --> 00:16:44,730 even though I wasn't asked for that in this problem, 328 00:16:44,730 --> 00:16:47,110 that the partial of x with respect to v 329 00:16:47,110 --> 00:16:51,240 is y over twice the quantity x squared plus y squared. 330 00:16:51,240 --> 00:16:54,860 By the way, observe, purely algebraically, 331 00:16:54,860 --> 00:16:57,310 that the only time I would be in any difficulty 332 00:16:57,310 --> 00:17:00,490 with this procedure is if x squared plus y squared 333 00:17:00,490 --> 00:17:02,650 happened to equal 0. 334 00:17:02,650 --> 00:17:05,230 In other words, if x squared plus y squared happened 335 00:17:05,230 --> 00:17:08,470 to equal 0, then to divide through by twice 336 00:17:08,470 --> 00:17:11,740 x squared plus y squared is equivalent to dividing through 337 00:17:11,740 --> 00:17:12,750 by 0. 338 00:17:12,750 --> 00:17:15,730 And division by 0 is not permissible. 
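A rough numerical check of these two partials, in plain Python. It relies on an observation that is not made in the lecture and is used here only as a convenience: u + iv = (x + iy) squared, so for x > 0 the inverse can be computed with a complex square root. The function names, the sample point (2, 1), and the step size h are illustrative assumptions.

```python
import cmath


def forward(x, y):
    """The change of variables from the lecture: u = x^2 - y^2, v = 2xy."""
    return x * x - y * y, 2 * x * y


def inverse(u, v):
    """Recover (x, y) with x > 0 from (u, v), using u + iv = (x + iy)^2.

    cmath.sqrt returns the root with nonnegative real part, so this is
    only the inverse for points with x > 0 (an assumption of this sketch).
    """
    z = cmath.sqrt(complex(u, v))
    return z.real, z.imag


x0, y0 = 2.0, 1.0
u0, v0 = forward(x0, y0)  # (3.0, 4.0)

# Central differences: vary u with v held fixed, then v with u held fixed.
h = 1e-6
dx_du_numeric = (inverse(u0 + h, v0)[0] - inverse(u0 - h, v0)[0]) / (2 * h)
dx_dv_numeric = (inverse(u0, v0 + h)[0] - inverse(u0, v0 - h)[0]) / (2 * h)

# Formulas obtained by equating coefficients of du and dv.
r2 = x0 * x0 + y0 * y0
dx_du_formula = x0 / (2 * r2)  # 0.2 at (2, 1)
dx_dv_formula = y0 / (2 * r2)  # 0.1 at (2, 1)

# Reciprocal of the partial of u with respect to x (y held constant).
naive = 1 / (2 * x0)  # 0.25 at (2, 1), not the same thing

print(dx_du_numeric, dx_du_formula, naive)
print(dx_dv_numeric, dx_dv_formula)
```

At this point the reciprocal 1/(2x) is 0.25 while the partial of x with respect to u with v held constant is 0.2, which illustrates why one cannot simply invert the partial of u with respect to x.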
339 00:17:15,730 --> 00:17:17,609 In other words, somehow or other, 340 00:17:17,609 --> 00:17:22,069 I must take into consideration that I am in trouble if x 341 00:17:22,069 --> 00:17:24,030 squared plus y squared is 0. 342 00:17:24,030 --> 00:17:26,079 Notice, by the way, that the only time 343 00:17:26,079 --> 00:17:28,580 that x squared plus y squared can be 0 344 00:17:28,580 --> 00:17:30,800 is if both x and y are 0. 345 00:17:30,800 --> 00:17:33,530 And that means, again, that somehow or other, 346 00:17:33,530 --> 00:17:39,460 at the point 0 comma 0-- in a neighborhood of the point 0 347 00:17:39,460 --> 00:17:42,020 comma 0, in the neighborhood of the origin, 348 00:17:42,020 --> 00:17:45,170 I can expect to have a little bit of trouble. 349 00:17:45,170 --> 00:17:47,860 Now again, the main aim of the lecture 350 00:17:47,860 --> 00:17:49,720 is to give you an overview. 351 00:17:49,720 --> 00:17:52,640 The trouble that comes in at the origin 352 00:17:52,640 --> 00:17:55,410 will again be left for the exercises. 353 00:17:55,410 --> 00:17:59,050 In a learning exercise, we will discuss just what goes wrong 354 00:17:59,050 --> 00:18:01,730 if you take a neighborhood of the origin, 355 00:18:01,730 --> 00:18:04,890 to discuss the change of variables u equals x squared 356 00:18:04,890 --> 00:18:07,490 minus y squared, v equals 2x*y. 357 00:18:07,490 --> 00:18:09,920 Suffice it to say, for the time being, 358 00:18:09,920 --> 00:18:13,410 that the system of equations u equals x squared minus y 359 00:18:13,410 --> 00:18:18,330 squared, v equals 2x*y is invertible in any neighborhood 360 00:18:18,330 --> 00:18:23,630 of a point x_0 comma y_0 except in that one possible case when 361 00:18:23,630 --> 00:18:27,575 you have chosen as the point (x_0, y_0) the origin. 
362 00:18:27,575 --> 00:18:30,770 At any rate, to take this problem away 363 00:18:30,770 --> 00:18:33,790 from the specific concrete example 364 00:18:33,790 --> 00:18:35,820 that we've been talking about, and to put 365 00:18:35,820 --> 00:18:38,570 this in terms of a more general perspective, 366 00:18:38,570 --> 00:18:42,770 let's go back more abstractly to the more general system. 367 00:18:42,770 --> 00:18:45,530 Let's suppose now that u and v are 368 00:18:45,530 --> 00:18:48,350 any two continuously differentiable functions 369 00:18:48,350 --> 00:18:49,440 of x and y. 370 00:18:49,440 --> 00:18:51,260 Let u be f of x, y. 371 00:18:51,260 --> 00:18:53,560 Let v equal to g of x, y. 372 00:18:53,560 --> 00:18:55,620 And what we're saying is, if you pick 373 00:18:55,620 --> 00:19:00,380 a particular point x_0 comma y_0, then by mechanically 374 00:19:00,380 --> 00:19:02,760 using the total differential, we have 375 00:19:02,760 --> 00:19:05,950 that du is the partial of f with respect 376 00:19:05,950 --> 00:19:10,660 to x evaluated at (x_0, y_0) times dx, plus the partial of f 377 00:19:10,660 --> 00:19:15,010 with respect to y, evaluated at (x_0, y_0) times dy. 378 00:19:15,010 --> 00:19:18,100 We have that dv is the partial of g with respect 379 00:19:18,100 --> 00:19:21,310 to x evaluated at (x_0, y_0) times dx 380 00:19:21,310 --> 00:19:22,860 plus the partial of g with respect 381 00:19:22,860 --> 00:19:26,580 to y, evaluated at (x_0, y_0) times dy. 382 00:19:26,580 --> 00:19:28,000 What is this now? 383 00:19:28,000 --> 00:19:32,560 This is a linear system of two equations in two unknowns. 384 00:19:32,560 --> 00:19:37,180 du and dv are linear combinations of the dx and dy. 
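In symbols, with subscripts denoting partial derivatives (a notational assumption), the system just written down is:

```latex
\begin{aligned}
du &= f_x(x_0, y_0)\,dx + f_y(x_0, y_0)\,dy, \\
dv &= g_x(x_0, y_0)\,dx + g_y(x_0, y_0)\,dy .
\end{aligned}
```

The four coefficients are numbers once the point (x_0, y_0) is fixed, so this has the same shape as the linear systems reviewed at the start of the lecture.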
385 00:19:37,180 --> 00:19:40,160 The key point being again-- that's why I put this in here 386 00:19:40,160 --> 00:19:42,840 specifically, with the x sub 0 and the y sub 387 00:19:42,840 --> 00:19:45,280 0-- the key point is that as soon 388 00:19:45,280 --> 00:19:49,450 as you evaluate a partial derivative at a fixed point, 389 00:19:49,450 --> 00:19:52,720 the value is a constant, not a variable. 390 00:19:52,720 --> 00:19:54,030 So this is what? 391 00:19:54,030 --> 00:19:55,830 A linear system. 392 00:19:55,830 --> 00:20:00,116 We have du as a constant times dx plus a constant times dy. 393 00:20:00,116 --> 00:20:03,690 dv is a constant times dx plus a constant times dy. 394 00:20:03,690 --> 00:20:05,610 Again, to make a long story short, 395 00:20:05,610 --> 00:20:09,240 I can solve for dx in terms of du and dv. 396 00:20:09,240 --> 00:20:15,680 I can solve for dy in terms of du and dv, provided what? 397 00:20:15,680 --> 00:20:18,070 That my matrix of coefficients does not 398 00:20:18,070 --> 00:20:20,110 have its determinant equal to 0. 399 00:20:20,110 --> 00:20:22,150 And to review this more explicitly 400 00:20:22,150 --> 00:20:26,430 so you see the mechanics, all I'm saying is, to solve for dx, 401 00:20:26,430 --> 00:20:30,420 I can multiply the top equation by the partial 402 00:20:30,420 --> 00:20:34,630 of g with respect to y evaluated at (x_0, y_0). 403 00:20:34,630 --> 00:20:39,050 I can multiply the bottom equation by minus the partial 404 00:20:39,050 --> 00:20:42,440 of f with respect to y evaluated at (x_0, y_0). 405 00:20:42,440 --> 00:20:44,780 And then when I add these two equations, 406 00:20:44,780 --> 00:20:46,860 the dy term will drop out. 407 00:20:46,860 --> 00:20:49,450 Again, leaving the details for you, 408 00:20:49,450 --> 00:20:51,930 it turns out that dx is what? 409 00:20:51,930 --> 00:20:54,039 The partial of g with respect to y-- 410 00:20:54,039 --> 00:20:56,080 and I've abbreviated this again, this means what?
411 00:20:56,080 --> 00:20:58,100 Evaluated at (x_0, y_0). 412 00:20:58,100 --> 00:21:03,500 Times du minus the partial of f with respect to y at (x_0, y_0) 413 00:21:03,500 --> 00:21:06,880 times dv over the partial of f with respect 414 00:21:06,880 --> 00:21:09,300 to x times the partial of g with respect 415 00:21:09,300 --> 00:21:13,050 to y minus the partial of f with respect to y 416 00:21:13,050 --> 00:21:15,770 times the partial of g with respect to x. 417 00:21:15,770 --> 00:21:21,390 And notice, of course, that this denominator is precisely 418 00:21:21,390 --> 00:21:23,860 the determinant of our matrix of coefficients. 419 00:21:23,860 --> 00:21:28,640 f sub x, f sub y; g sub x, g sub y. 420 00:21:28,640 --> 00:21:30,910 And the only place I've taken a liberty here 421 00:21:30,910 --> 00:21:35,110 is to use the abbreviation of leaving out the (x_0, y_0). 422 00:21:35,110 --> 00:21:36,630 And the key point is what? 423 00:21:36,630 --> 00:21:39,460 The only place I am going to get in trouble 424 00:21:39,460 --> 00:21:42,260 is if this denominator happens to be 0. 425 00:21:42,260 --> 00:21:44,880 In the two by two case-- in other words, 426 00:21:44,880 --> 00:21:48,240 in the case of two equations and two unknowns, 427 00:21:48,240 --> 00:21:50,380 notice that we can see explicitly 428 00:21:50,380 --> 00:21:53,730 what goes wrong when the determinant of coefficients 429 00:21:53,730 --> 00:21:54,490 is 0. 430 00:21:54,490 --> 00:21:57,230 The determinant of coefficients is just this denominator. 431 00:21:57,230 --> 00:22:00,130 And when that denominator is 0, we're in trouble. 432 00:22:00,130 --> 00:22:03,700 In other words, the only time we cannot invert this system, 433 00:22:03,700 --> 00:22:08,080 the only time we cannot find delta x and delta y in terms 434 00:22:08,080 --> 00:22:13,760 of du and dv is when this determinant is 0. 435 00:22:13,760 --> 00:22:17,020 Now you see, I think this is pretty straightforward stuff.
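Written out, with all partials evaluated at (x_0, y_0) and subscripts denoting partial derivatives, the result for dx is the first line below. The second line, for dy, is not stated in the lecture; it follows from the same elimination applied to the dx terms and is included here as a companion.

```latex
\begin{aligned}
dx &= \frac{g_y\,du - f_y\,dv}{f_x g_y - f_y g_x}, \\
dy &= \frac{-g_x\,du + f_x\,dv}{f_x g_y - f_y g_x},
\end{aligned}
\qquad
f_x g_y - f_y g_x =
\begin{vmatrix}
f_x & f_y \\
g_x & g_y
\end{vmatrix} .
```

For u = x^2 - y^2 and v = 2xy the determinant is (2x)(2x) - (-2y)(2y) = 4(x^2 + y^2), which vanishes only at the origin, in agreement with the earlier example.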
436 00:22:17,020 --> 00:22:19,200 The textbook has a section on this, 437 00:22:19,200 --> 00:22:21,020 as you will be reading shortly. 438 00:22:21,020 --> 00:22:23,430 It is not hard to work this mechanically. 439 00:22:23,430 --> 00:22:25,150 And then the question comes up is, 440 00:22:25,150 --> 00:22:28,610 how come when you pick up a book on advanced calculus, 441 00:22:28,610 --> 00:22:33,830 there's usually a huge chapter on Jacobians and inversion? 442 00:22:33,830 --> 00:22:37,820 Why isn't it this simple in the advanced textbooks? 443 00:22:37,820 --> 00:22:41,180 Why can we have it in our book this simply, but yet, 444 00:22:41,180 --> 00:22:44,370 in the advanced book, why is there so much more to this 445 00:22:44,370 --> 00:22:45,880 beneath the surface? 446 00:22:45,880 --> 00:22:49,300 The answer behind all of this is quite subtle. 447 00:22:49,300 --> 00:22:52,105 In fact, the major subtlety is this. 448 00:22:52,105 --> 00:22:55,280 And that is that the notation du-- 449 00:22:55,280 --> 00:23:00,165 and for that matter, dv or dx or dy or dx_1 dx_2, whatever 450 00:23:00,165 --> 00:23:02,900 you're using here-- is ambiguous. 451 00:23:02,900 --> 00:23:06,170 And it's ambiguous for the following reason. 452 00:23:06,170 --> 00:23:10,450 Recall how we defined the meaning of the symbol du. 453 00:23:10,450 --> 00:23:13,480 If we're assuming that u is expressed 454 00:23:13,480 --> 00:23:18,050 as a function of the independent variables x and y, then by du, 455 00:23:18,050 --> 00:23:20,820 we mean delta u tan. 456 00:23:20,820 --> 00:23:23,930 On the other hand, if we inverted this, 457 00:23:23,930 --> 00:23:26,750 du now-- in other words, what do I mean by inverted this? 458 00:23:26,750 --> 00:23:29,460 What I mean, first of all, is if we assume now 459 00:23:29,460 --> 00:23:33,330 that x and y are expressed in terms of u and v-- for example, 460 00:23:33,330 --> 00:23:38,470 suppose x is some function h of u and v, what does du mean now? 
461 00:23:38,470 --> 00:23:40,870 Notice that now u is playing the role 462 00:23:40,870 --> 00:23:42,490 of an independent variable. 463 00:23:42,490 --> 00:23:47,080 For the independent variable, du just means delta u. 464 00:23:47,080 --> 00:23:49,690 In other words, by way of a very quick review, 465 00:23:49,690 --> 00:23:53,620 notice that if we're viewing u as being a dependent variable, 466 00:23:53,620 --> 00:23:56,680 then du means delta u tan. 467 00:23:56,680 --> 00:24:00,510 But if we're viewing u as being an independent variable, 468 00:24:00,510 --> 00:24:03,100 then du means delta u. 469 00:24:03,100 --> 00:24:08,150 And consequently, the results that we're using 470 00:24:08,150 --> 00:24:11,740 hinge very strongly then-- in other words, the inversion 471 00:24:11,740 --> 00:24:15,210 that we're using hinges very strongly on the requirement. 472 00:24:15,210 --> 00:24:19,470 In other words, the inversion requires the validity 473 00:24:19,470 --> 00:24:23,150 of interchanging delta u and delta u tan, et cetera. 474 00:24:23,150 --> 00:24:27,110 Now let me show you what that means more explicitly. 475 00:24:27,110 --> 00:24:29,210 Let's come back to something that we were talking 476 00:24:29,210 --> 00:24:31,700 about just a few moments ago. 477 00:24:31,700 --> 00:24:34,980 From a completely mechanical point of view, 478 00:24:34,980 --> 00:24:39,230 given that u equals f of x, y and v equals g of x, y, 479 00:24:39,230 --> 00:24:41,520 we very mechanically wrote down that du 480 00:24:41,520 --> 00:24:48,475 was f sub x dx plus f sub y dy, dv was g sub x dx plus g sub y 481 00:24:48,475 --> 00:24:51,870 dy, and then we just mechanically solved for dx 482 00:24:51,870 --> 00:24:54,450 in terms of du and dv. 
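The distinction between delta u and delta u tan can be checked numerically. The sketch below is only an illustration: the function f(x, y) = x^2 - y^2, the point (1, 2), and the helper names are assumptions chosen for the example, not something from the lecture. It compares the true change delta u with the tangent-plane change delta u tan as the step shrinks.

```python
# Illustrative choice (an assumption, not from the lecture): u = f(x, y) = x^2 - y^2.
def f(x, y):
    return x**2 - y**2


def f_x(x, y):
    """Partial of f with respect to x."""
    return 2 * x


def f_y(x, y):
    """Partial of f with respect to y."""
    return -2 * y


def delta_u(x0, y0, dx, dy):
    """True change in u when (x, y) moves from (x0, y0) to (x0 + dx, y0 + dy)."""
    return f(x0 + dx, y0 + dy) - f(x0, y0)


def delta_u_tan(x0, y0, dx, dy):
    """Change measured along the tangent plane at (x0, y0): f_x*dx + f_y*dy."""
    return f_x(x0, y0) * dx + f_y(x0, y0) * dy


if __name__ == "__main__":
    x0, y0 = 1.0, 2.0
    for h in (0.1, 0.01, 0.001):
        dx, dy = h, 2 * h
        gap = delta_u(x0, y0, dx, dy) - delta_u_tan(x0, y0, dx, dy)
        print(f"h={h}: delta u - delta u tan = {gap:.3e}")
```

For this particular f the gap works out to dx^2 - dy^2, so it shrinks like the square of the step while delta u tan itself shrinks only like the step. That suggests why interchanging the two is plausible for small steps; it does not prove it in general.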
483 00:24:54,450 --> 00:24:57,960 My claim is that if we translate this thing, if we 484 00:24:57,960 --> 00:25:02,560 translate this thing into the language of delta u's, delta 485 00:25:02,560 --> 00:25:05,840 x's, delta u tan's, delta x tan's, et cetera, 486 00:25:05,840 --> 00:25:08,340 what we really said was what? 487 00:25:08,340 --> 00:25:12,490 That delta u tan was the partial of f with respect to x times 488 00:25:12,490 --> 00:25:16,900 delta x, plus the partial of f with respect to y times delta 489 00:25:16,900 --> 00:25:17,750 y. 490 00:25:17,750 --> 00:25:21,110 And delta v tan was the partial of g with respect to x times 491 00:25:21,110 --> 00:25:24,540 delta x, plus the partial of g with respect to y times delta 492 00:25:24,540 --> 00:25:25,520 y. 493 00:25:25,520 --> 00:25:28,170 And then when we eliminated delta y 494 00:25:28,170 --> 00:25:30,730 by multiplying the top equation by g sub 495 00:25:30,730 --> 00:25:33,780 y and the bottom equation by minus f sub y 496 00:25:33,780 --> 00:25:38,200 and adding, what we found was how to express delta x-- 497 00:25:38,200 --> 00:25:40,080 and catch this, this is the key point-- 498 00:25:40,080 --> 00:25:45,260 what we did was we expressed delta x 499 00:25:45,260 --> 00:25:49,730 as a linear combination-- not of delta u and delta v, 500 00:25:49,730 --> 00:25:53,990 but of delta u tan and delta v tan. 501 00:25:53,990 --> 00:25:57,670 You see, notice that the result that we needed 502 00:25:57,670 --> 00:26:00,250 to have to be able to use differentials 503 00:26:00,250 --> 00:26:02,580 was not this, but this. 504 00:26:02,580 --> 00:26:05,880 See, we found this, not delta x tan 505 00:26:05,880 --> 00:26:12,580 equals g_y delta u minus f sub y delta v over f sub x g sub y 506 00:26:12,580 --> 00:26:14,560 minus f sub y g sub x.
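The key point, that the formula expresses delta x in terms of delta u tan and delta v tan rather than delta u and delta v, can be seen in a small numerical sketch. Everything specific here is an assumption made for illustration and is not from the lecture: the pair u = x^2 - y^2, v = 2xy, the point (1, 2), and the helper names.

```python
# Illustrative choice (an assumption, not from the lecture):
# u = f(x, y) = x^2 - y^2 and v = g(x, y) = 2xy.
def f(x, y):
    return x**2 - y**2


def g(x, y):
    return 2 * x * y


def partials(x, y):
    """Return (f_x, f_y, g_x, g_y) at the point (x, y)."""
    return 2 * x, -2 * y, 2 * y, 2 * x


def solve_dx(x0, y0, du, dv):
    """The lecture's formula: (g_y*du - f_y*dv) / (f_x*g_y - f_y*g_x)."""
    fx, fy, gx, gy = partials(x0, y0)
    det = fx * gy - fy * gx
    if det == 0:
        raise ZeroDivisionError("determinant of coefficients is 0 at this point")
    return (gy * du - fy * dv) / det


def compare(x0, y0, dx, dy):
    """Feed the formula tangent changes, then true changes; return both answers."""
    fx, fy, gx, gy = partials(x0, y0)
    du_tan = fx * dx + fy * dy
    dv_tan = gx * dx + gy * dy
    du_true = f(x0 + dx, y0 + dy) - f(x0, y0)
    dv_true = g(x0 + dx, y0 + dy) - g(x0, y0)
    return solve_dx(x0, y0, du_tan, dv_tan), solve_dx(x0, y0, du_true, dv_true)


if __name__ == "__main__":
    for h in (0.1, 0.01, 0.001):
        from_tan, from_true = compare(1.0, 2.0, h, 2 * h)
        print(f"dx={h}: from tangent changes {from_tan:.6f}, "
              f"from true changes {from_true:.6f}")
```

Fed the tangent changes, the formula returns the original delta x up to rounding. Fed the true changes, it is off by an amount that, in this example, shrinks like the square of the step. At (0, 0) the determinant for this pair is 0 and `solve_dx` raises, which matches the one case the lecture singles out as trouble.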
507 00:26:14,560 --> 00:26:18,720 To be able to say-- to invert this required that this was 508 00:26:18,720 --> 00:26:24,470 the expression that we had, yet the expression 509 00:26:24,470 --> 00:26:26,570 that we were really evaluating was this one. 510 00:26:26,570 --> 00:26:28,420 In fact, let me come back for one moment, 511 00:26:28,420 --> 00:26:30,010 and make sure that we see this. 512 00:26:30,010 --> 00:26:32,570 You see, notice again that the subtlety 513 00:26:32,570 --> 00:26:36,560 of going from here to here and inverting never 514 00:26:36,560 --> 00:26:40,510 shows us that we've interchanged the roles of u and v 515 00:26:40,510 --> 00:26:42,960 from being the dependent variables 516 00:26:42,960 --> 00:26:45,050 to the independent variables. 517 00:26:45,050 --> 00:26:47,690 So the reason that there is so much 518 00:26:47,690 --> 00:26:51,930 work done in advanced textbooks under the heading of inverting 519 00:26:51,930 --> 00:26:55,170 systems of equations is to justify 520 00:26:55,170 --> 00:26:59,260 being able to switch from delta x to delta x tan 521 00:26:59,260 --> 00:27:03,910 or from delta u tan to delta u as we see fit, whenever 522 00:27:03,910 --> 00:27:05,670 it serves our purposes. 523 00:27:05,670 --> 00:27:08,970 The validity of being able to do that 524 00:27:08,970 --> 00:27:11,330 hinges on this more subtle type of proof, 525 00:27:11,330 --> 00:27:13,460 which, as far as I'm concerned, goes 526 00:27:13,460 --> 00:27:17,040 beyond the scope of our text, other than for the fact that 527 00:27:17,040 --> 00:27:21,640 in the learning exercises, I will find excuses to bring up 528 00:27:21,640 --> 00:27:24,660 all of the situations that bring out 529 00:27:24,660 --> 00:27:26,320 where the theory is important. 530 00:27:26,320 --> 00:27:28,030 In other words, there will not be proofs 531 00:27:28,030 --> 00:27:30,240 of these more difficult things.
532 00:27:30,240 --> 00:27:31,950 Not because the proofs aren't important, 533 00:27:31,950 --> 00:27:33,366 but from the point of view of what 534 00:27:33,366 --> 00:27:35,030 we're trying to do in our course, 535 00:27:35,030 --> 00:27:38,890 these proofs tend to obscure the main stream of things. 536 00:27:38,890 --> 00:27:42,000 So what I will do in the learning exercises is bring up 537 00:27:42,000 --> 00:27:45,880 places that will show you why the theory is important, 538 00:27:45,880 --> 00:27:50,520 at which point, I will emphasize what the result of the theory 539 00:27:50,520 --> 00:27:53,910 is without belaboring and beleaguering you 540 00:27:53,910 --> 00:27:55,830 with the proofs of these things. 541 00:27:55,830 --> 00:27:57,680 At any rate, what I'd like to do now, 542 00:27:57,680 --> 00:28:00,540 next time, is to give you an example where 543 00:28:00,540 --> 00:28:03,850 all of the material, of the blocks of material that we've 544 00:28:03,850 --> 00:28:06,760 done now on partial derivatives, are sort of pulled together 545 00:28:06,760 --> 00:28:07,810 very nicely. 546 00:28:07,810 --> 00:28:11,090 But at any rate, we'll talk about that more next time. 547 00:28:11,090 --> 00:28:12,480 And until next time, goodbye. 548 00:28:17,920 --> 00:28:20,290 Funding for the publication of this video 549 00:28:20,290 --> 00:28:25,170 was provided by the Gabriella and Paul Rosenbaum Foundation. 550 00:28:25,170 --> 00:28:29,350 Help OCW continue to provide free and open access to MIT 551 00:28:29,350 --> 00:28:33,760 courses by making a donation at ocw.mit.edu/donate.