1
00:00:00,040 --> 00:00:02,460
The following content is
provided under a Creative

2
00:00:02,460 --> 00:00:03,870
Commons license.

3
00:00:03,870 --> 00:00:06,320
Your support will help
MIT OpenCourseWare

4
00:00:06,320 --> 00:00:10,560
continue to offer high-quality
educational resources for free.

5
00:00:10,560 --> 00:00:13,300
To make a donation or
view additional materials

6
00:00:13,300 --> 00:00:17,210
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,210 --> 00:00:17,862
at ocw.mit.edu.

8
00:00:31,720 --> 00:00:32,549
PROFESSOR: Hi.

9
00:00:32,549 --> 00:00:34,590
In today's lesson,
hopefully we will

10
00:00:34,590 --> 00:00:38,760
begin to reap the
rewards of our digression

11
00:00:38,760 --> 00:00:41,180
into the subject
of linear algebra.

12
00:00:41,180 --> 00:00:43,560
Recall that in the
last few lectures, what

13
00:00:43,560 --> 00:00:48,210
we have been dealing with is
the problem of inverting systems

14
00:00:48,210 --> 00:00:51,130
of linear equations.

15
00:00:51,130 --> 00:00:53,160
And what we would
like to do today

16
00:00:53,160 --> 00:00:55,690
is to tackle the
more general problem

17
00:00:55,690 --> 00:00:58,010
of inverting systems
of equations,

18
00:00:58,010 --> 00:01:00,510
even if the equations
are not linear.

19
00:01:00,510 --> 00:01:03,240
And with this in
mind, I simply entitle

20
00:01:03,240 --> 00:01:07,290
today's lesson "Inverting More
General Systems of Equations."

21
00:01:07,290 --> 00:01:09,700
And by way of a
very brief review,

22
00:01:09,700 --> 00:01:14,730
recall that given the linear
system y_1 equals a_(1,1)*x_1

23
00:01:14,730 --> 00:01:19,940
plus, et cetera, a_(1,n)*x_n; up
to y_n equals a_(n,1)*x_1 plus,

24
00:01:19,940 --> 00:01:22,100
et cetera, a_(n,n)*x_n.

25
00:01:22,100 --> 00:01:25,910
We saw that that system was
invertible, meaning what?

26
00:01:25,910 --> 00:01:28,890
That we could solve
this system for the x's

27
00:01:28,890 --> 00:01:31,760
as linear combinations
of the y's if

28
00:01:31,760 --> 00:01:37,960
and only if the inverse of
the matrix of coefficients

29
00:01:37,960 --> 00:01:40,200
of this system exists.

30
00:01:40,200 --> 00:01:43,620
And in terms of determinants,
recall that that means if

31
00:01:43,620 --> 00:01:47,270
and only if the determinant
of the matrix of coefficients

32
00:01:47,270 --> 00:01:49,420
is not 0.

33
00:01:49,420 --> 00:01:54,170
If the determinant of the matrix
of coefficients was 0, then

34
00:01:54,170 --> 00:01:58,040
we saw that y_1 up to
y_n are not independent.

35
00:01:58,040 --> 00:02:00,530
In fact, in that
context, that's where

36
00:02:00,530 --> 00:02:05,870
we came to grips with the
concept of a constraint,

37
00:02:05,870 --> 00:02:08,979
that the constraint actually
turned out to be what?

38
00:02:08,979 --> 00:02:12,780
The fact that the y's
were not independent.

39
00:02:12,780 --> 00:02:16,550
Meaning that we could express
a linear combination of the y's

40
00:02:16,550 --> 00:02:18,180
equal to 0.

41
00:02:18,180 --> 00:02:20,800
So what situation
were we at then?

42
00:02:20,800 --> 00:02:25,460
Given a linear system, if
the matrix of coefficients

43
00:02:25,460 --> 00:02:28,050
does not have its
determinant equal to 0,

44
00:02:28,050 --> 00:02:30,020
the system is invertible.

45
00:02:30,020 --> 00:02:32,360
We can solve for the
x's in terms of the y's.

46
00:02:32,360 --> 00:02:36,620
If the determinant of the
matrix of coefficients is 0,

47
00:02:36,620 --> 00:02:39,950
the y's are not
linearly independent.

48
00:02:39,950 --> 00:02:43,840
In other words, the
system is not invertible.
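
A small numerical sketch of the review above, for the 2 by 2 case. The two coefficient matrices and the helper names det2 and solve2 are made up for illustration and are not from the lecture. When the determinant is not 0 we can solve for the x's; when it is 0 the y's satisfy a constraint and there is nothing to solve.

```python
def det2(a):
    """Determinant of a 2x2 matrix given as [[a11, a12], [a21, a22]]."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def solve2(a, y):
    """Solve y = a x for x by Cramer's rule; return None when det(a) == 0."""
    d = det2(a)
    if d == 0:
        return None  # not invertible: the y's are not independent
    x1 = (y[0] * a[1][1] - a[0][1] * y[1]) / d
    x2 = (a[0][0] * y[1] - y[0] * a[1][0]) / d
    return (x1, x2)


# Hypothetical sample systems, chosen only for illustration.
invertible = [[2, 1], [1, 3]]  # determinant is 5
singular = [[1, 2], [2, 4]]    # determinant is 0; constraint: y_2 - 2*y_1 = 0

x_inv = solve2(invertible, (5, 10))
x_sing = solve2(singular, (1, 2))
```

In the singular case the second row is twice the first, so y_2 is always twice y_1. That is the kind of constraint the review refers to.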

49
00:02:43,840 --> 00:02:45,780
And now what we
would like to do is

50
00:02:45,780 --> 00:02:49,290
to tackle the more general
problem of inverting

51
00:02:49,290 --> 00:02:51,180
any system of equations.

52
00:02:51,180 --> 00:02:52,690
And by any, I mean what?

53
00:02:52,690 --> 00:02:57,470
Now we have y_1 is f
sub 1 of x_1 up to x_n,

54
00:02:57,470 --> 00:03:02,440
et cetera, y sub n is f
sub n of x_1 up to x_n.

55
00:03:02,440 --> 00:03:04,440
And now what we're
saying is, we do not

56
00:03:04,440 --> 00:03:07,870
know whether the f's
are linear or not.

57
00:03:07,870 --> 00:03:10,030
In fact, if they
are linear, we're

58
00:03:10,030 --> 00:03:13,330
back to, as a special case,
what we've tackled before.

59
00:03:13,330 --> 00:03:16,620
But now we're assuming that
these need not be linear.

60
00:03:16,620 --> 00:03:18,750
And what we would like
to do is to invert

61
00:03:18,750 --> 00:03:22,220
this system, assuming, of
course, that such an inversion

62
00:03:22,220 --> 00:03:23,700
is possible.

63
00:03:23,700 --> 00:03:26,160
Again, what do we
mean by the inversion?

64
00:03:26,160 --> 00:03:28,740
We mean, somehow, we
would like to know that

65
00:03:28,740 --> 00:03:32,140
given this system of n
equations and n unknowns,

66
00:03:32,140 --> 00:03:36,480
where the y's are expressed
explicitly in terms of the x's,

67
00:03:36,480 --> 00:03:39,630
can we invert this,
and express the x's

68
00:03:39,630 --> 00:03:44,070
in terms of the y's, either
explicitly or implicitly?

69
00:03:44,070 --> 00:03:46,080
That's the problem that
we'd like to tackle.

70
00:03:46,080 --> 00:03:48,770
And what we're going
to use to tackle this

71
00:03:48,770 --> 00:03:51,740
is our old friend,
the differential,

72
00:03:51,740 --> 00:03:53,960
the linear approximation,
again, that

73
00:03:53,960 --> 00:03:57,270
motivated our whole
study of linear systems

74
00:03:57,270 --> 00:03:58,410
in the first place.

75
00:03:58,410 --> 00:04:02,030
Remember, we already
know that if y_1

76
00:04:02,030 --> 00:04:05,880
is a differentiable
function of x_1 up to x_n,

77
00:04:05,880 --> 00:04:09,410
that delta y_1
sub tan is exactly

78
00:04:09,410 --> 00:04:13,620
equal to the partial of f_1
with respect to x_1 times

79
00:04:13,620 --> 00:04:17,620
delta x_1 plus, et cetera, the
partial of f_1 with respect

80
00:04:17,620 --> 00:04:20,010
to x_n times delta x_n.

81
00:04:20,010 --> 00:04:24,290
And in terms of reviewing
the notation, because we will

82
00:04:24,290 --> 00:04:26,380
use it later in
the lecture, notice

83
00:04:26,380 --> 00:04:31,790
that delta y_1 tan was what
we abbreviated to be dy_1.

84
00:04:31,790 --> 00:04:34,910
And that delta x_1
up to delta x_n

85
00:04:34,910 --> 00:04:39,970
are abbreviated respectively
as dx_1 up to dx_n.

86
00:04:39,970 --> 00:04:42,500
Generalization of what we
did for the differential

87
00:04:42,500 --> 00:04:44,580
in the case of one
independent variable, and we

88
00:04:44,580 --> 00:04:46,060
went through this
discussion when

89
00:04:46,060 --> 00:04:49,779
we talked about exact
differentials in block three.

90
00:04:49,779 --> 00:04:51,570
And in this similar
way, we could of course

91
00:04:51,570 --> 00:04:55,450
express delta y_2
tan, et cetera,

92
00:04:55,450 --> 00:04:58,890
and we can talk about the
linear approximations.

93
00:04:58,890 --> 00:04:59,670
All right?

94
00:04:59,670 --> 00:05:03,850
Not the true delta y's now, but
the linear part of the delta

95
00:05:03,850 --> 00:05:07,340
y's, the delta y_1 sub
tan, or if you prefer,

96
00:05:07,340 --> 00:05:10,570
delta y_1 sub lin,
L-I-N. All right?

97
00:05:10,570 --> 00:05:12,960
And the key point now is what?

98
00:05:12,960 --> 00:05:16,740
If the y's happen to be
continuously differentiable

99
00:05:16,740 --> 00:05:20,160
functions of the x's in a
neighborhood of the point

100
00:05:20,160 --> 00:05:24,040
x bar equals a bar-- in
other words, x_1 up to x_n

101
00:05:24,040 --> 00:05:26,460
is equal to a_1
comma, et cetera,

102
00:05:26,460 --> 00:05:30,190
up to a_n-- x_1 equals
a_1, x_2 equals a_2,

103
00:05:30,190 --> 00:05:34,290
et cetera-- then near that
point, what we're saying

104
00:05:34,290 --> 00:05:34,800
is what?

105
00:05:34,800 --> 00:05:37,840
That the error term
goes to 0 very rapidly.

106
00:05:37,840 --> 00:05:41,240
And as long as the functions
are continuously differentiable,

107
00:05:41,240 --> 00:05:41,950
it means what?

108
00:05:41,950 --> 00:05:45,640
That the change in
y is approximately

109
00:05:45,640 --> 00:05:48,710
equal to the change
in y sub tan.

110
00:05:48,710 --> 00:05:50,780
So that what we're
saying is-- and remember

111
00:05:50,780 --> 00:05:52,640
this is what motivated
our linear systems

112
00:05:52,640 --> 00:05:56,010
in the first place-- that
delta y_1 is approximately

113
00:05:56,010 --> 00:05:58,760
the partial of y_1
with respect to x_1,

114
00:05:58,760 --> 00:06:03,750
evaluated when x bar is a bar,
times delta x_1 plus et cetera,

115
00:06:03,750 --> 00:06:06,940
the partial of y_1 with
respect to x_n, also evaluated

116
00:06:06,940 --> 00:06:11,600
when x bar equals a bar,
times delta x sub n et cetera,

117
00:06:11,600 --> 00:06:14,240
down to delta y sub
n is approximately

118
00:06:14,240 --> 00:06:17,380
equal to the partial of y sub
n with respect to x_1 times

119
00:06:17,380 --> 00:06:21,410
delta x_1, plus et cetera, the
partial of y sub n with respect

120
00:06:21,410 --> 00:06:23,320
to x sub n times delta x_n,

121
00:06:23,320 --> 00:06:27,120
where all of these partials
are evaluated specifically

122
00:06:27,120 --> 00:06:29,950
at the point x bar equals a bar.

123
00:06:29,950 --> 00:06:31,450
The point being what?

124
00:06:31,450 --> 00:06:33,830
That since we're evaluating
all these partials

125
00:06:33,830 --> 00:06:36,860
at a particular point, every
one of these coefficients

126
00:06:36,860 --> 00:06:38,080
is a constant.

127
00:06:38,080 --> 00:06:40,390
You see in general, these
partial derivatives are

128
00:06:40,390 --> 00:06:44,250
functions, but as soon as we
evaluate them at a given value,

129
00:06:44,250 --> 00:06:46,450
they become specific numbers.

130
00:06:46,450 --> 00:06:48,050
So this is now what?

131
00:06:48,050 --> 00:06:51,540
On the right-hand side,
we have a linear system,

132
00:06:51,540 --> 00:06:55,650
and the point is that we are
using the fundamental result

133
00:06:55,650 --> 00:06:59,850
that we can use the linear
approximation as being

134
00:06:59,850 --> 00:07:03,330
very nearly equal to
the true change in y.
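
To see the approximation at work, here is a minimal sketch under an assumed sample function f(x_1, x_2) = x_1 squared times x_2, which is not from the lecture. The partials are evaluated once at the fixed point (1, 2), so they are constants, and the gap between the true change and the linear change shrinks faster than the step size.

```python
def f(x1, x2):
    # Hypothetical sample function, chosen only for illustration.
    return x1 * x1 * x2


def delta_y_tan(a1, a2, dx1, dx2):
    # Partials of f evaluated at the fixed point (a1, a2): plain numbers.
    f_x1 = 2 * a1 * a2
    f_x2 = a1 * a1
    return f_x1 * dx1 + f_x2 * dx2


a1, a2 = 1.0, 2.0
errors = []
for h in (0.1, 0.01, 0.001):
    true_change = f(a1 + h, a2 + h) - f(a1, a2)
    linear_change = delta_y_tan(a1, a2, h, h)
    errors.append(abs(true_change - linear_change))
```

For this particular f the error works out to 4h squared plus h cubed, so cutting h by a factor of 10 cuts the error by roughly a factor of 100.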

135
00:07:03,330 --> 00:07:05,490
And it's in that
sense that we have

136
00:07:05,490 --> 00:07:10,060
derived our system of n linear
equations and n unknowns.

137
00:07:10,060 --> 00:07:12,370
You see, this is
a linear system.

138
00:07:12,370 --> 00:07:14,540
The approximation-- again,
let me emphasize that,

139
00:07:14,540 --> 00:07:15,706
because it's very important.

140
00:07:15,706 --> 00:07:17,790
The approximation
hinges on the fact

141
00:07:17,790 --> 00:07:19,740
that what this is
exactly equal to

142
00:07:19,740 --> 00:07:22,880
would be delta y_1 tan
et cetera, delta y sub n.

143
00:07:22,880 --> 00:07:24,610
But we're assuming
that these are

144
00:07:24,610 --> 00:07:27,440
close enough in a small
enough neighborhood of x

145
00:07:27,440 --> 00:07:31,340
bar equals a bar, so that we can
make this particular statement.

146
00:07:31,340 --> 00:07:33,140
So this is a linear system.

147
00:07:33,140 --> 00:07:34,620
And because it is
a linear system,

148
00:07:34,620 --> 00:07:38,510
we're back to our special case
that this system is invertible

149
00:07:38,510 --> 00:07:43,220
if and only if the determinant
of the coefficients-- matrix

150
00:07:43,220 --> 00:07:45,410
coefficients-- is not 0.

151
00:07:45,410 --> 00:07:47,580
And what is that
matrix of coefficients?

152
00:07:47,580 --> 00:07:51,050
It consists of n
rows and n columns,

153
00:07:51,050 --> 00:07:55,610
and the row is determined
by the subscript on the y,

154
00:07:55,610 --> 00:07:59,380
and the column is determined
by the subscript on the x.

155
00:07:59,380 --> 00:08:02,660
So we write that matrix
as the partial of y

156
00:08:02,660 --> 00:08:05,640
sub i with respect to x sub j.

157
00:08:05,640 --> 00:08:09,470
In other words, the i-th
row involves the y's

158
00:08:09,470 --> 00:08:12,790
and the j-th column the x.

159
00:08:12,790 --> 00:08:13,480
All right?

160
00:08:13,480 --> 00:08:18,410
And that's exactly, then, how we
handle this particular system.

161
00:08:18,410 --> 00:08:18,910
OK?

162
00:08:18,910 --> 00:08:20,040
Quite mechanically.

163
00:08:20,040 --> 00:08:22,800
And let's just summarize
that, then, very quickly.

164
00:08:22,800 --> 00:08:26,200
If f sub 1 et cetera and
f sub n are continuously

165
00:08:26,200 --> 00:08:30,220
differentiable functions
of x_1 up to x_n near x bar

166
00:08:30,220 --> 00:08:34,809
equals a bar, then the
system y_1 equals f_1 of x_1

167
00:08:34,809 --> 00:08:38,820
up to x_n et cetera, y_n
equals f sub n of x_1 up

168
00:08:38,820 --> 00:08:42,510
to x_n-- that system is
invertible if and only

169
00:08:42,510 --> 00:08:48,240
if the determinant--
the n by n determinant

170
00:08:48,240 --> 00:08:52,930
whose entry in i-th row, j-th
column is the partial of y

171
00:08:52,930 --> 00:08:55,350
sub i with respect
to x sub j-- if

172
00:08:55,350 --> 00:08:58,160
and only if that
determinant is not 0.

173
00:08:58,160 --> 00:09:00,440
Now what does that mean, to
say that it's invertible?

174
00:09:00,440 --> 00:09:04,060
It means that we can solve for
the x's in terms of the y's.

175
00:09:04,060 --> 00:09:07,050
Now we may not be able
to do that explicitly.

176
00:09:07,050 --> 00:09:11,110
The best we may be able to
do is to do that implicitly,

177
00:09:11,110 --> 00:09:11,680
meaning this.

178
00:09:11,680 --> 00:09:13,096
Let me just come
back to something

179
00:09:13,096 --> 00:09:14,830
I said before to make
sure that there's

180
00:09:14,830 --> 00:09:16,630
no misunderstanding about this.

181
00:09:16,630 --> 00:09:20,170
What we're saying is, that in
this particular linear system

182
00:09:20,170 --> 00:09:23,390
of equations, as long as this
determinant of coefficients is

183
00:09:23,390 --> 00:09:29,280
not 0, we can explicitly solve
for delta x_1 up to delta x_n

184
00:09:29,280 --> 00:09:32,580
in terms of delta
y_1 up to delta y_n,

185
00:09:32,580 --> 00:09:35,380
even if we may not be
able to solve explicitly

186
00:09:35,380 --> 00:09:37,770
for the x's in terms of the y's.

187
00:09:37,770 --> 00:09:39,150
In other words,
the crucial point

188
00:09:39,150 --> 00:09:42,030
is, we can solve
for the changes in x

189
00:09:42,030 --> 00:09:44,680
in terms of the changes in y.

190
00:09:44,680 --> 00:09:48,850
And that is implicitly enough
to see what the x's look like

191
00:09:48,850 --> 00:09:49,890
in terms of the y's.

192
00:09:49,890 --> 00:09:51,290
Once we know what
the change of x

193
00:09:51,290 --> 00:09:53,600
looks like in terms
of the change in y,

194
00:09:53,600 --> 00:09:57,420
then we really
know what x itself

195
00:09:57,420 --> 00:09:59,300
looks like in terms
of the y's, even

196
00:09:59,300 --> 00:10:03,430
as I say it may be implicitly
rather than explicitly.

197
00:10:03,430 --> 00:10:06,530
At any rate, this
matrix is so important

198
00:10:06,530 --> 00:10:09,400
that it's given a
very special name.

199
00:10:09,400 --> 00:10:12,840
Definition: The
matrix whose entry

200
00:10:12,840 --> 00:10:15,750
in the i-th row, j-th
column is the partial

201
00:10:15,750 --> 00:10:20,160
of y sub i with respect to x
sub j is called the Jacobian.

202
00:10:20,160 --> 00:10:22,320
I put "matrix" in
quotation marks

203
00:10:22,320 --> 00:10:25,170
here, because some people
refer to the Jacobian

204
00:10:25,170 --> 00:10:26,300
meaning a matrix.

205
00:10:26,300 --> 00:10:30,710
Other people call the Jacobian
the determinant of the Jacobian

206
00:10:30,710 --> 00:10:31,210
matrix.

207
00:10:31,210 --> 00:10:33,370
I'm not going to make
any distinction this way.

208
00:10:33,370 --> 00:10:35,070
It'll be clear from context.

209
00:10:35,070 --> 00:10:37,450
Usually when I say
the "Jacobian,"

210
00:10:37,450 --> 00:10:39,930
I will mean the Jacobian matrix.

211
00:10:39,930 --> 00:10:42,860
I might sometimes mean
the Jacobian determinant.

212
00:10:42,860 --> 00:10:45,790
And so to avoid ambiguity,
I will hopefully

213
00:10:45,790 --> 00:10:48,420
say "Jacobian matrix"
when I mean the matrix,

214
00:10:48,420 --> 00:10:51,610
and "Jacobian determinant"
when I mean the determinant.

215
00:10:51,610 --> 00:10:53,600
But should I forget to
do this, or should you

216
00:10:53,600 --> 00:10:55,830
read a textbook where
the word Jacobian is

217
00:10:55,830 --> 00:10:59,460
used without the
proper noun after it,

218
00:10:59,460 --> 00:11:02,140
it should be clear from
context which is meant.

219
00:11:02,140 --> 00:11:05,780
But at any rate, that's what
we mean by the Jacobian of y_1

220
00:11:05,780 --> 00:11:09,480
up to y_n with respect
to x_1 up to x_n.

221
00:11:09,480 --> 00:11:14,600
And this Jacobian matrix is
often abbreviated by-- you

222
00:11:14,600 --> 00:11:17,600
either write J for
Jacobian of y_1

223
00:11:17,600 --> 00:11:20,930
up to y_n over x_1 up to x_n.

224
00:11:20,930 --> 00:11:25,220
Or else you use a modification
of the partial derivative

225
00:11:25,220 --> 00:11:26,250
notation.

226
00:11:26,250 --> 00:11:30,560
And you sort of read this as
if it said the partial of y_1

227
00:11:30,560 --> 00:11:35,500
up to y_n divided by the
partial of x_1 up to x_n.
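
Written out, the two notations just described would look roughly like this. This is a sketch of the usual convention; the exact layout on the board may differ.

```latex
\[
J\!\left(\frac{y_1,\dots,y_n}{x_1,\dots,x_n}\right)
= \frac{\partial(y_1,\dots,y_n)}{\partial(x_1,\dots,x_n)}
= \begin{pmatrix}
\dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n}\\
\vdots & \ddots & \vdots\\
\dfrac{\partial y_n}{\partial x_1} & \cdots & \dfrac{\partial y_n}{\partial x_n}
\end{pmatrix}
\]
```

The entry in row i, column j is the partial of y sub i with respect to x sub j.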

228
00:11:35,500 --> 00:11:39,620
And again, there
is the same analogy

229
00:11:39,620 --> 00:11:42,810
between why this
notation was invented

230
00:11:42,810 --> 00:11:47,050
and why the notation dy
divided by dx was invented.

231
00:11:47,050 --> 00:11:51,470
But in terms of giving you a
general overview of what we're

232
00:11:51,470 --> 00:11:53,640
interested in, I
think I would like

233
00:11:53,640 --> 00:11:56,000
to leave the
discussion of why we

234
00:11:56,000 --> 00:12:00,160
write this in a fractional
form to the homework.

235
00:12:00,160 --> 00:12:04,910
In other words, as either a
supplement to the learning

236
00:12:04,910 --> 00:12:09,420
exercises or else as part of
the supplementary notes in one

237
00:12:09,420 --> 00:12:11,430
form or another,
we will take care

238
00:12:11,430 --> 00:12:14,300
of all of the computational
aspects of how

239
00:12:14,300 --> 00:12:16,550
one handles the Jacobian.

240
00:12:16,550 --> 00:12:19,560
But what I wanted to
do now was to emphasize

241
00:12:19,560 --> 00:12:24,750
how one uses the Jacobian matrix
and differentials to invert

242
00:12:24,750 --> 00:12:27,810
systems of n equations
in n unknowns.

243
00:12:27,810 --> 00:12:30,000
And I will use the
technique that's

244
00:12:30,000 --> 00:12:31,720
used right in the
textbook, and which

245
00:12:31,720 --> 00:12:34,630
is part of the assignment
for today's unit.

246
00:12:34,630 --> 00:12:36,610
The example that
I have in mind--

247
00:12:36,610 --> 00:12:39,410
I simply picked the
usual case, n equals 2,

248
00:12:39,410 --> 00:12:41,710
so that things don't
get that messy.

249
00:12:41,710 --> 00:12:43,640
Using again, the
standard notation

250
00:12:43,640 --> 00:12:46,050
when one deals with two
independent variables,

251
00:12:46,050 --> 00:12:48,770
let u equal x squared
minus y squared.

252
00:12:48,770 --> 00:12:50,450
Let v equal to 2x*y.

253
00:12:50,450 --> 00:12:52,140
Let's suppose now
that I would like

254
00:12:52,140 --> 00:12:53,980
to find the partial
of x with respect

255
00:12:53,980 --> 00:12:58,180
for u, treating v as the
other independent variable.

256
00:12:58,180 --> 00:13:01,250
You see, again, I want
to review this thing.

257
00:13:01,250 --> 00:13:03,790
When I say find the
partial of x with respect

258
00:13:03,790 --> 00:13:06,530
to u holding v
constant, it is not

259
00:13:06,530 --> 00:13:08,270
the same as finding
the partial of u

260
00:13:08,270 --> 00:13:11,710
with respect to x from here
and then just inverting it.

261
00:13:11,710 --> 00:13:14,320
Namely the partial of u
with respect to x here

262
00:13:14,320 --> 00:13:16,820
assumes that y is
being held constant.

263
00:13:16,820 --> 00:13:19,320
And if you then invert that,
recall that what you're finding

264
00:13:19,320 --> 00:13:23,090
is the partial of x with
respect to u treating y

265
00:13:23,090 --> 00:13:24,230
as the other variable.

266
00:13:24,230 --> 00:13:26,140
We want the partial
of x with respect

267
00:13:26,140 --> 00:13:29,970
to u treating u and v as the
pair of independent variables.

268
00:13:29,970 --> 00:13:30,580
Why?

269
00:13:30,580 --> 00:13:34,160
Because that's exactly what you
mean by inverting this system.

270
00:13:34,160 --> 00:13:37,530
This system as given
expresses u and v

271
00:13:37,530 --> 00:13:40,740
in terms of the pair of
independent variables x and y.

272
00:13:40,740 --> 00:13:43,990
And now what you'd like to do
is express the pair of variables

273
00:13:43,990 --> 00:13:48,540
x and y in terms of the
independent variables u and v,

274
00:13:48,540 --> 00:13:51,200
assuming of course,
that u and v are indeed

275
00:13:51,200 --> 00:13:52,790
independent variables.

276
00:13:52,790 --> 00:13:55,140
The mechanical solution
is simply this.

277
00:13:55,140 --> 00:13:57,580
Using the language
of differentials,

278
00:13:57,580 --> 00:14:00,350
we write down du and dv.

279
00:14:00,350 --> 00:14:03,060
Namely, du is what?

280
00:14:03,060 --> 00:14:06,110
The partial of u with
respect to x times dx,

281
00:14:06,110 --> 00:14:09,420
plus the partial of u with
respect to y times dy.

282
00:14:09,420 --> 00:14:12,670
And from the relationship that
u equals x squared minus y

283
00:14:12,670 --> 00:14:17,690
squared, we see that du
is 2x*dx minus 2y*dy.

284
00:14:17,690 --> 00:14:23,290
Similarly, since the partial
of v with respect to x is 2y,

285
00:14:23,290 --> 00:14:26,500
and the partial of v
with respect to y is 2x,

286
00:14:26,500 --> 00:14:31,300
we see that dv is
2y*dx plus 2x*dy.

287
00:14:31,300 --> 00:14:33,700
If we now assume that
this is evaluated

288
00:14:33,700 --> 00:14:37,840
at some point (x_0, y_0),
what do we have over here?

289
00:14:37,840 --> 00:14:40,620
Once we've picked out
a point (x_0, y_0)

290
00:14:40,620 --> 00:14:44,700
to evaluate this at-- and I
left out that because it simply

291
00:14:44,700 --> 00:14:46,620
would make the
notation too long,

292
00:14:46,620 --> 00:14:48,500
but I'll talk about
that more later.

293
00:14:48,500 --> 00:14:51,430
Assuming that we've evaluated
this at a particular fixed

294
00:14:51,430 --> 00:14:55,090
value of x and y, we have what?

295
00:14:55,090 --> 00:14:59,836
du is some constant times dx
plus some constant times dy.

296
00:14:59,836 --> 00:15:03,960
dv is some constant times dx
plus some constant times dy.

297
00:15:03,960 --> 00:15:06,670
In other words, du
and dv are expressed

298
00:15:06,670 --> 00:15:09,410
as linear combinations
of dx and dy.

299
00:15:09,410 --> 00:15:12,530
We know how to invert
this type of a system,

300
00:15:12,530 --> 00:15:14,060
assuming that it's invertible.

301
00:15:14,060 --> 00:15:17,270
Sparing you the details,
what I'm saying is what?

302
00:15:17,270 --> 00:15:19,390
I could multiply,
say, the top equation

303
00:15:19,390 --> 00:15:22,140
by x, the bottom equation by y.

304
00:15:22,140 --> 00:15:26,100
And when I add them, the terms
involving dy will drop out.

305
00:15:26,100 --> 00:15:31,950
And I will get x*du plus
y*dv is 2x*dx plus 2y*dx.

306
00:15:31,950 --> 00:15:35,460
In other words twice--
it's 2 x squared.

307
00:15:35,460 --> 00:15:39,240
I multiply the top equation by
x, the bottom equation by y.

308
00:15:39,240 --> 00:15:41,620
So the right-hand
side here becomes

309
00:15:41,620 --> 00:15:45,360
2 x squared dx plus 2 y
squared dx, which is twice

310
00:15:45,360 --> 00:15:47,490
x squared plus y squared dx.

311
00:15:47,490 --> 00:15:49,840
I now divide both
sides of the equation

312
00:15:49,840 --> 00:15:53,780
through by twice x
squared plus y squared.

313
00:15:53,780 --> 00:15:56,670
I wind up with the
fact that dx is

314
00:15:56,670 --> 00:16:01,270
x over twice the quantity x
squared plus y squared du,

315
00:16:01,270 --> 00:16:06,510
plus the quantity y over twice
x squared plus y squared dv.
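
The elimination just described can be written out as follows, using only the two differentials du and dv given in the lecture.

```latex
\begin{align*}
du &= 2x\,dx - 2y\,dy, \qquad dv = 2y\,dx + 2x\,dy,\\
x\,du + y\,dv &= (2x^2\,dx - 2xy\,dy) + (2y^2\,dx + 2xy\,dy) = 2(x^2 + y^2)\,dx,\\
dx &= \frac{x}{2(x^2 + y^2)}\,du + \frac{y}{2(x^2 + y^2)}\,dv .
\end{align*}
```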

316
00:16:06,510 --> 00:16:09,010
Recall that I also
know by definition

317
00:16:09,010 --> 00:16:11,700
that dx is the partial
of x with respect

318
00:16:11,700 --> 00:16:15,960
to u times du plus the partial
of x with respect to v times

319
00:16:15,960 --> 00:16:19,990
dv, recalling from our
lecture on exact differentials

320
00:16:19,990 --> 00:16:23,370
that the only way two
differentials in terms

321
00:16:23,370 --> 00:16:26,460
of the du and dv can be
equal is if they're equal

322
00:16:26,460 --> 00:16:28,610
coefficient by coefficient.

323
00:16:28,610 --> 00:16:31,970
I can therefore equate
the two coefficients of du

324
00:16:31,970 --> 00:16:35,020
to conclude that the partial
of x with respect to u

325
00:16:35,020 --> 00:16:39,670
is x over 2 times the quantity
x squared plus y squared.

326
00:16:39,670 --> 00:16:42,120
In fact, I can get the
extra piece of information,

327
00:16:42,120 --> 00:16:44,730
even though I wasn't asked
for that in this problem,

328
00:16:44,730 --> 00:16:47,110
that the partial of
x with respect to v

329
00:16:47,110 --> 00:16:51,240
is y over twice the quantity
x squared plus y squared.
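
As a hedged numerical check of these two partials, the sketch below inverts the 2 by 2 matrix of coefficients at one sample point. The point (1, 2) and the helper names are assumptions chosen for illustration. The top row of the inverse should match x over twice (x squared plus y squared) and y over twice (x squared plus y squared).

```python
def jacobian(x, y):
    # Rows are (u_x, u_y) and (v_x, v_y) for u = x^2 - y^2, v = 2xy.
    return [[2 * x, -2 * y], [2 * y, 2 * x]]


def inverse2(m):
    # Inverse of a 2x2 matrix; assumes the determinant is nonzero.
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


x0, y0 = 1.0, 2.0  # arbitrary sample point away from the origin
inv = inverse2(jacobian(x0, y0))
x_u, x_v = inv[0]  # top row: partials of x with respect to u and v

r2 = x0 * x0 + y0 * y0
expected_x_u = x0 / (2 * r2)
expected_x_v = y0 / (2 * r2)
```

The bottom row of the inverse gives the partials of y with respect to u and v in the same way, although the lecture does not ask for them.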

330
00:16:51,240 --> 00:16:54,860
By the way, observe,
purely algebraically,

331
00:16:54,860 --> 00:16:57,310
that the only time I
would be in any difficulty

332
00:16:57,310 --> 00:17:00,490
with this procedure is if
x squared plus y squared

333
00:17:00,490 --> 00:17:02,650
happened to equal 0.

334
00:17:02,650 --> 00:17:05,230
In other words, if x squared
plus y squared happened

335
00:17:05,230 --> 00:17:08,470
to equal 0, then to
divide through by twice

336
00:17:08,470 --> 00:17:11,740
x squared plus y squared is
equivalent to dividing through

337
00:17:11,740 --> 00:17:12,750
by 0.

338
00:17:12,750 --> 00:17:15,730
And division by 0
is not permissible.

339
00:17:15,730 --> 00:17:17,609
In other words,
somehow or other,

340
00:17:17,609 --> 00:17:22,069
I must take into consideration
that I am in trouble if x

341
00:17:22,069 --> 00:17:24,030
squared plus y squared is 0.

342
00:17:24,030 --> 00:17:26,079
Notice, by the way,
that the only time

343
00:17:26,079 --> 00:17:28,580
that x squared plus
y squared can be 0

344
00:17:28,580 --> 00:17:30,800
is if both x and y are 0.

345
00:17:30,800 --> 00:17:33,530
And that means, again,
that somehow or other,

346
00:17:33,530 --> 00:17:39,460
at the point 0 comma 0-- in
a neighborhood of the point 0

347
00:17:39,460 --> 00:17:42,020
comma 0, in the
neighborhood of the origin,

348
00:17:42,020 --> 00:17:45,170
I can expect to have a
little bit of trouble.

349
00:17:45,170 --> 00:17:47,860
Now again, the main
aim of the lecture

350
00:17:47,860 --> 00:17:49,720
is to give you an overview.

351
00:17:49,720 --> 00:17:52,640
The trouble that
comes in at the origin

352
00:17:52,640 --> 00:17:55,410
will again be left
for the exercises.

353
00:17:55,410 --> 00:17:59,050
In a learning exercise, we will
discuss just what goes wrong

354
00:17:59,050 --> 00:18:01,730
if you take a neighborhood
of the origin,

355
00:18:01,730 --> 00:18:04,890
to discuss the change of
variables u equals x squared

356
00:18:04,890 --> 00:18:07,490
minus y squared, v equals 2x*y.

357
00:18:07,490 --> 00:18:09,920
Suffice it to say,
for the time being,

358
00:18:09,920 --> 00:18:13,410
that the system of equations
u equals x squared minus y

359
00:18:13,410 --> 00:18:18,330
squared, v equals 2x*y is
invertible in any neighborhood

360
00:18:18,330 --> 00:18:23,630
of a point x_0 comma y_0 except
in that one possible case when

361
00:18:23,630 --> 00:18:27,575
you have chosen as the
point (x_0, y_0) the origin.
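
A brief sketch of one symptom at the origin, not necessarily the full story that the learning exercise covers. The determinant of coefficients is 4 times (x squared plus y squared), which is 0 only at the origin, and the points (x, y) and (-x, -y) always give the same u and v. Any neighborhood of the origin contains such pairs. The function names and sample points here are illustrative assumptions.

```python
def uv(x, y):
    # The change of variables from the lecture.
    return (x * x - y * y, 2 * x * y)


def jacobian_det(x, y):
    # det [[2x, -2y], [2y, 2x]] = 4 (x^2 + y^2)
    return (2 * x) * (2 * x) - (-2 * y) * (2 * y)


det_origin = jacobian_det(0.0, 0.0)
det_elsewhere = jacobian_det(0.5, -0.25)

# Two different points, as close to the origin as we like, with one image.
same_image = uv(0.1, 0.2) == uv(-0.1, -0.2)
```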

362
00:18:27,575 --> 00:18:30,770
At any rate, to take
this problem away

363
00:18:30,770 --> 00:18:33,790
from the specific
concrete example

364
00:18:33,790 --> 00:18:35,820
that we've been talking
about, and to put

365
00:18:35,820 --> 00:18:38,570
this in terms of a more
general perspective,

366
00:18:38,570 --> 00:18:42,770
let's go back more abstractly
to the more general system.

367
00:18:42,770 --> 00:18:45,530
Let's suppose now
that u and v are

368
00:18:45,530 --> 00:18:48,350
any two continuously
differentiable functions

369
00:18:48,350 --> 00:18:49,440
of x and y.

370
00:18:49,440 --> 00:18:51,260
Let u be f of x, y.

371
00:18:51,260 --> 00:18:53,560
Let v equal to g of x, y.

372
00:18:53,560 --> 00:18:55,620
And what we're saying
is, if you pick

373
00:18:55,620 --> 00:19:00,380
a particular point x_0 comma
y_0, then by mechanically

374
00:19:00,380 --> 00:19:02,760
using the total
differential, we have

375
00:19:02,760 --> 00:19:05,950
that du is the partial
of f with respect

376
00:19:05,950 --> 00:19:10,660
to x evaluated at (x_0, y_0)
times dx, plus the partial of f

377
00:19:10,660 --> 00:19:15,010
with respect to y, evaluated
at (x_0, y_0) times dy.

378
00:19:15,010 --> 00:19:18,100
We have that dv is the
partial of g with respect

379
00:19:18,100 --> 00:19:21,310
to x evaluated at
(x_0, y_0) times dx

380
00:19:21,310 --> 00:19:22,860
plus the partial
of g with respect

381
00:19:22,860 --> 00:19:26,580
to y, evaluated at
(x_0, y_0) times dy.

382
00:19:26,580 --> 00:19:28,000
What is this now?

383
00:19:28,000 --> 00:19:32,560
This is a linear system of
two equations in two unknowns.

384
00:19:32,560 --> 00:19:37,180
du and dv are linear
combinations of the dx and dy.

385
00:19:37,180 --> 00:19:40,160
The key point being again--
that's why I put this in here

386
00:19:40,160 --> 00:19:42,840
specifically, with the
x sub 0 and the y sub

387
00:19:42,840 --> 00:19:45,280
0-- the key point
is that as soon

388
00:19:45,280 --> 00:19:49,450
as you evaluate a partial
derivative at a fixed point,

389
00:19:49,450 --> 00:19:52,720
the value is a constant,
not a variable.

390
00:19:52,720 --> 00:19:54,030
So this is what?

391
00:19:54,030 --> 00:19:55,830
A linear system.

392
00:19:55,830 --> 00:20:00,116
We have du as a constant times
dx plus a constant times dy.

393
00:20:00,116 --> 00:20:03,690
dv is a constant times dx
plus a constant times dy.

394
00:20:03,690 --> 00:20:05,610
Again, to make a
long story short,

395
00:20:05,610 --> 00:20:09,240
I can solve for dx in
terms of du and dv.

396
00:20:09,240 --> 00:20:15,680
I can solve for dy in terms
of du and dv, provided what?

397
00:20:15,680 --> 00:20:18,070
That my matrix of
coefficients does not

398
00:20:18,070 --> 00:20:20,110
have its determinant equal to 0.

399
00:20:20,110 --> 00:20:22,150
And to review this
more explicitly

400
00:20:22,150 --> 00:20:26,430
so you see the mechanics, all
I'm saying is, to solve for dx,

401
00:20:26,430 --> 00:20:30,420
I can multiply the top
equation by the partial

402
00:20:30,420 --> 00:20:34,630
of g with respect to y
evaluated at (x_0, y_0).

403
00:20:34,630 --> 00:20:39,050
I can multiply the bottom
equation by minus the partial

404
00:20:39,050 --> 00:20:42,440
of f with respect to y
evaluated at (x_0, y_0).

405
00:20:42,440 --> 00:20:44,780
And then when I add
these two equations,

406
00:20:44,780 --> 00:20:46,860
the dy term will drop out.

407
00:20:46,860 --> 00:20:49,450
Again, leaving the
details for you,

408
00:20:49,450 --> 00:20:51,930
it turns out that dx is what?

409
00:20:51,930 --> 00:20:54,039
The partial of g
with respect to y--

410
00:20:54,039 --> 00:20:56,080
and I've abbreviated this
again, this means what?

411
00:20:56,080 --> 00:20:58,100
Evaluated at (x_0, y_0).

412
00:20:58,100 --> 00:21:03,500
Times du minus the partial of f
with respect to y at (x_0, y_0)

413
00:21:03,500 --> 00:21:06,880
times dv over the
partial of f with respect

414
00:21:06,880 --> 00:21:09,300
to x times the partial
of g with respect

415
00:21:09,300 --> 00:21:13,050
to y minus the partial
of f with respect to y

416
00:21:13,050 --> 00:21:15,770
times the partial of
g with respect to x.

417
00:21:15,770 --> 00:21:21,390
And notice, of course, that
this denominator is precisely

418
00:21:21,390 --> 00:21:23,860
our matrix of coefficients.

419
00:21:23,860 --> 00:21:28,640
f sub x, f sub y;
g sub x, g sub y.

420
00:21:28,640 --> 00:21:30,910
And the only place I've
taken a liberty here

421
00:21:30,910 --> 00:21:35,110
is to use the abbreviation of
leaving out the (x_0, y_0).

422
00:21:35,110 --> 00:21:36,630
And the key point is what?

423
00:21:36,630 --> 00:21:39,460
The only place I am
going to get in trouble

424
00:21:39,460 --> 00:21:42,260
is if this denominator
happens to be 0.

425
00:21:42,260 --> 00:21:44,880
In the two by two
case-- in other words,

426
00:21:44,880 --> 00:21:48,240
in the case of two
equations and two unknowns,

427
00:21:48,240 --> 00:21:50,380
notice that we
can see explicitly

428
00:21:50,380 --> 00:21:53,730
what goes wrong when the
determinant of coefficients

429
00:21:53,730 --> 00:21:54,490
is 0.

430
00:21:54,490 --> 00:21:57,230
The determinant of coefficients
is just this denominator.

431
00:21:57,230 --> 00:22:00,130
And when that denominator
is 0, we're in trouble.

432
00:22:00,130 --> 00:22:03,700
In other words, the only time
we cannot invert this system,

433
00:22:03,700 --> 00:22:08,080
the only time we cannot find
delta x and delta y in terms

434
00:22:08,080 --> 00:22:13,760
of du and dv is when
this determinant is 0.
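
[Editor's note: The elimination just described can be sketched in Python. This is only an illustrative sketch and not part of the lecture. The function name solve_dx_dy and the sample coefficient values are assumptions chosen for the example. The arguments fx, fy, gx, gy stand for the partial derivatives evaluated at (x_0, y_0).]

```python
def solve_dx_dy(fx, fy, gx, gy, du, dv):
    """Solve du = fx*dx + fy*dy, dv = gx*dx + gy*dy for (dx, dy).

    fx, fy, gx, gy are constants: the partials evaluated at a fixed point.
    """
    # Determinant of the matrix of coefficients [[fx, fy], [gx, gy]].
    det = fx * gy - fy * gx
    if det == 0:
        # This is the one case in which the system cannot be inverted.
        raise ZeroDivisionError("determinant of coefficients is 0")
    # Multiply the top equation by gy, the bottom by -fy, and add:
    # the dy term drops out, leaving dx.
    dx = (gy * du - fy * dv) / det
    # Multiply the top equation by -gx, the bottom by fx, and add:
    # the dx term drops out, leaving dy.
    dy = (fx * dv - gx * du) / det
    return dx, dy


# Hypothetical coefficients, chosen only for illustration.
fx, fy, gx, gy = 2.0, -2.0, 2.0, 2.0
du = fx * 0.1 + fy * 0.2
dv = gx * 0.1 + gy * 0.2
print(solve_dx_dy(fx, fy, gx, gy, du, dv))
```

[With these sample values the determinant is 8, so the call recovers dx and dy up to rounding. Passing coefficients such as 1, 2, 2, 4 makes the determinant 0, and the function raises an error instead of dividing.]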

435
00:22:13,760 --> 00:22:17,020
Now you see, I think this is
pretty straightforward stuff.

436
00:22:17,020 --> 00:22:19,200
The textbook has
a section on this,

437
00:22:19,200 --> 00:22:21,020
as you will be reading shortly.

438
00:22:21,020 --> 00:22:23,430
It is not hard to work
this mechanically.

439
00:22:23,430 --> 00:22:25,150
And then the
question that comes up is,

440
00:22:25,150 --> 00:22:28,610
how come when you pick up a
book on advanced calculus,

441
00:22:28,610 --> 00:22:33,830
there's usually a huge chapter
on Jacobians and inversion?

442
00:22:33,830 --> 00:22:37,820
Why isn't it this simple
in the advanced textbooks?

443
00:22:37,820 --> 00:22:41,180
Why can we have it in our
book this simply, but yet,

444
00:22:41,180 --> 00:22:44,370
in the advanced book, why is
there so much more to this

445
00:22:44,370 --> 00:22:45,880
beneath the surface?

446
00:22:45,880 --> 00:22:49,300
The answer behind all
of this is quite subtle.

447
00:22:49,300 --> 00:22:52,105
In fact, the major
subtlety is this.

448
00:22:52,105 --> 00:22:55,280
And that is that
the notation du--

449
00:22:55,280 --> 00:23:00,165
and for that matter, dv or dx
or dy or dx_1 dx_2, whatever

450
00:23:00,165 --> 00:23:02,900
you're using here--
is ambiguous.

451
00:23:02,900 --> 00:23:06,170
And it's ambiguous for
the following reason.

452
00:23:06,170 --> 00:23:10,450
Recall how we defined the
meaning of the symbol du.

453
00:23:10,450 --> 00:23:13,480
If we're assuming
that u is expressed

454
00:23:13,480 --> 00:23:18,050
as a function of the independent
variables x and y, then by du,

455
00:23:18,050 --> 00:23:20,820
we mean delta u tan.

456
00:23:20,820 --> 00:23:23,930
On the other hand,
if we inverted this,

457
00:23:23,930 --> 00:23:26,750
du now-- in other words, what
do I mean by inverted this?

458
00:23:26,750 --> 00:23:29,460
What I mean, first of
all, is if we assume now

459
00:23:29,460 --> 00:23:33,330
that x and y are expressed in
terms of u and v-- for example,

460
00:23:33,330 --> 00:23:38,470
suppose x is some function h of
u and v, what does du mean now?

461
00:23:38,470 --> 00:23:40,870
Notice that now u
is playing the role

462
00:23:40,870 --> 00:23:42,490
of an independent variable.

463
00:23:42,490 --> 00:23:47,080
For the independent variable,
du just means delta u.

464
00:23:47,080 --> 00:23:49,690
In other words, by way
of a very quick review,

465
00:23:49,690 --> 00:23:53,620
notice that if we're viewing u
as being a dependent variable,

466
00:23:53,620 --> 00:23:56,680
then du means delta u tan.

467
00:23:56,680 --> 00:24:00,510
But if we're viewing u as
being an independent variable,

468
00:24:00,510 --> 00:24:03,100
then du means delta u.

469
00:24:03,100 --> 00:24:08,150
And consequently, the
results that we're using

470
00:24:08,150 --> 00:24:11,740
hinge very strongly then--
in other words, the inversion

471
00:24:11,740 --> 00:24:15,210
that we're using hinges very
strongly on the requirement.

472
00:24:15,210 --> 00:24:19,470
In other words, the inversion
requires the validity

473
00:24:19,470 --> 00:24:23,150
of interchanging delta u
and delta u tan, et cetera.

474
00:24:23,150 --> 00:24:27,110
Now let me show you what
that means more explicitly.

475
00:24:27,110 --> 00:24:29,210
Let's come back to something
that we were talking

476
00:24:29,210 --> 00:24:31,700
about just a few moments ago.

477
00:24:31,700 --> 00:24:34,980
From a completely
mechanical point of view,

478
00:24:34,980 --> 00:24:39,230
given that u equals f of x,
y and v equals g of x, y,

479
00:24:39,230 --> 00:24:41,520
we very mechanically
wrote down that du

480
00:24:41,520 --> 00:24:48,475
was f sub x dx plus f sub y dy,
dv was g sub x dx plus g sub y

481
00:24:48,475 --> 00:24:51,870
dy, and then we just
mechanically solved for dx

482
00:24:51,870 --> 00:24:54,450
in terms of du and dv.

483
00:24:54,450 --> 00:24:57,960
My claim is that if we
translate this thing, if we

484
00:24:57,960 --> 00:25:02,560
translate this thing into the
language of delta u's, delta

485
00:25:02,560 --> 00:25:05,840
x's, delta u tan's,
delta x tans, et cetera,

486
00:25:05,840 --> 00:25:08,340
what we really said was what?

487
00:25:08,340 --> 00:25:12,490
That delta u tan was the partial
of f with respect to x times

488
00:25:12,490 --> 00:25:16,900
delta x, plus the partial of f
with respect to y times delta

489
00:25:16,900 --> 00:25:17,750
y.

490
00:25:17,750 --> 00:25:21,110
And delta v tan was the partial
of g with respect to x times

491
00:25:21,110 --> 00:25:24,540
delta x, plus the partial of g
with respect to y times delta

492
00:25:24,540 --> 00:25:25,520
y.

493
00:25:25,520 --> 00:25:28,170
And then when we
eliminated delta y

494
00:25:28,170 --> 00:25:30,730
by multiplying the
top equation by g sub

495
00:25:30,730 --> 00:25:33,780
y and the bottom
equation by minus f sub y

496
00:25:33,780 --> 00:25:38,200
and adding, what we found
was how to express delta x--

497
00:25:38,200 --> 00:25:40,080
and catch this, this
is the key point--

498
00:25:40,080 --> 00:25:45,260
what we did was we
expressed delta x

499
00:25:45,260 --> 00:25:49,730
as a linear combination--
not of delta u and delta v,

500
00:25:49,730 --> 00:25:53,990
but of delta u tan
and delta v tan.
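
[Editor's note: A small numerical sketch may make this distinction concrete. It is not from the lecture. The functions u = x^2 - y^2 and v = 2xy, the base point (1, 1), and the name compare are all assumptions chosen for illustration. The sketch feeds the inverted linear system first with delta u tan and delta v tan, then with the true delta u and delta v.]

```python
def f(x, y):
    return x**2 - y**2  # hypothetical u = f(x, y)


def g(x, y):
    return 2 * x * y  # hypothetical v = g(x, y)


x0, y0 = 1.0, 1.0
# Partial derivatives of f and g evaluated at (x0, y0).
fx, fy = 2 * x0, -2 * y0
gx, gy = 2 * y0, 2 * x0
det = fx * gy - fy * gx  # nonzero here, so the linear system inverts


def compare(delta_x, delta_y):
    # True changes in u and v.
    delta_u = f(x0 + delta_x, y0 + delta_y) - f(x0, y0)
    delta_v = g(x0 + delta_x, y0 + delta_y) - g(x0, y0)
    # Changes measured along the tangent plane.
    delta_u_tan = fx * delta_x + fy * delta_y
    delta_v_tan = gx * delta_x + gy * delta_y
    # Inverting with the tangent quantities gives back delta_x.
    from_tan = (gy * delta_u_tan - fy * delta_v_tan) / det
    # Inverting with the true changes gives only an approximation.
    from_true = (gy * delta_u - fy * delta_v) / det
    return delta_x, from_tan, from_true


for step in (0.1, 0.01):
    print(compare(step, 2 * step))
```

[In this example the value computed from the tangent quantities matches delta x up to rounding, while the value computed from the true delta u and delta v is slightly off, and the gap shrinks as the step gets smaller.]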

501
00:25:53,990 --> 00:25:57,670
You see, notice that the
result that we needed

502
00:25:57,670 --> 00:26:00,250
to have to be able
to use differentials

503
00:26:00,250 --> 00:26:02,580
was not this, but this.

504
00:26:02,580 --> 00:26:05,880
See, we found this,
not delta x tan

505
00:26:05,880 --> 00:26:12,580
equals g_y delta u minus f sub
y delta v over f sub x g sub y

506
00:26:12,580 --> 00:26:14,560
minus f sub y g sub x.

507
00:26:14,560 --> 00:26:18,720
To be able to say-- to invert
this required that this was

508
00:26:18,720 --> 00:26:24,470
the expression that we
had, yet the expression

509
00:26:24,470 --> 00:26:26,570
that we were really
evaluating was this one.

510
00:26:26,570 --> 00:26:28,420
In fact, let me come
back for one moment,

511
00:26:28,420 --> 00:26:30,010
and make sure that we see this.

512
00:26:30,010 --> 00:26:32,570
You see, notice again
that the subtlety

513
00:26:32,570 --> 00:26:36,560
of going from here to
here and inverting never

514
00:26:36,560 --> 00:26:40,510
shows us that we've interchanged
the roles of u and v

515
00:26:40,510 --> 00:26:42,960
from being the
dependent variables

516
00:26:42,960 --> 00:26:45,050
to the independent variables.

517
00:26:45,050 --> 00:26:47,690
So the reason that
there is so much

518
00:26:47,690 --> 00:26:51,930
work done in advanced textbooks
under the heading of inverting

519
00:26:51,930 --> 00:26:55,170
systems of equations
is to justify

520
00:26:55,170 --> 00:26:59,260
being able to switch
from delta x to delta x tan

521
00:26:59,260 --> 00:27:03,910
or from delta u tan to delta
u as we see fit, whenever

522
00:27:03,910 --> 00:27:05,670
it serves our purposes.

523
00:27:05,670 --> 00:27:08,970
The validity of
being able to do that

524
00:27:08,970 --> 00:27:11,330
hinges on this more
subtle type of proof,

525
00:27:11,330 --> 00:27:13,460
that as far as I'm
concerned, goes

526
00:27:13,460 --> 00:27:17,040
beyond the scope of our text,
other than for the fact that

527
00:27:17,040 --> 00:27:21,640
in the learning exercises, I
will find excuses to bring up

528
00:27:21,640 --> 00:27:24,660
all of the situations
that bring out

529
00:27:24,660 --> 00:27:26,320
where the theory is important.

530
00:27:26,320 --> 00:27:28,030
In other words, there
will not be proofs

531
00:27:28,030 --> 00:27:30,240
of these more difficult things.

532
00:27:30,240 --> 00:27:31,950
Not because the proofs
aren't important,

533
00:27:31,950 --> 00:27:33,366
but from the point
of view of what

534
00:27:33,366 --> 00:27:35,030
we're trying to
do in our course,

535
00:27:35,030 --> 00:27:38,890
these proofs tend to obscure
the main stream of things.

536
00:27:38,890 --> 00:27:42,000
So what I will do in the
learning exercises is bring up

537
00:27:42,000 --> 00:27:45,880
places that will show you
why the theory is important,

538
00:27:45,880 --> 00:27:50,520
at which point, I will emphasize
what the result of the theory

539
00:27:50,520 --> 00:27:53,910
is without belaboring
and beleaguering you

540
00:27:53,910 --> 00:27:55,830
with the proofs of these things.

541
00:27:55,830 --> 00:27:57,680
At any rate, what
I'd like to do

542
00:27:57,680 --> 00:28:00,540
next time, is to give
you an example where

543
00:28:00,540 --> 00:28:03,850
all of the material, of the
blocks of material that we've

544
00:28:03,850 --> 00:28:06,760
done now on partial derivatives,
are sort of pulled together

545
00:28:06,760 --> 00:28:07,810
very nicely.

546
00:28:07,810 --> 00:28:11,090
But at any rate, we'll talk
about that more next time.

547
00:28:11,090 --> 00:28:12,480
And until next time, goodbye.

548
00:28:17,920 --> 00:28:20,290
Funding for the
publication of this video

549
00:28:20,290 --> 00:28:25,170
was provided by the Gabriella
and Paul Rosenbaum Foundation.

550
00:28:25,170 --> 00:28:29,350
Help OCW continue to provide
free and open access to MIT

551
00:28:29,350 --> 00:28:33,760
courses by making a donation
at ocw.mit.edu/donate.