1
00:00:00,040 --> 00:00:02,460
The following content is
provided under a Creative

2
00:00:02,460 --> 00:00:03,870
Commons license.

3
00:00:03,870 --> 00:00:06,320
Your support will help
MIT OpenCourseWare

4
00:00:06,320 --> 00:00:10,560
continue to offer high quality
educational resources for free.

5
00:00:10,560 --> 00:00:13,300
To make a donation or
view additional materials

6
00:00:13,300 --> 00:00:17,116
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,116 --> 00:00:17,740
at ocw.mit.edu.

8
00:00:28,570 --> 00:00:29,320
HERBERT GROSS: Hi.

9
00:00:29,320 --> 00:00:33,650
Today we try to wrap up
our present discussion

10
00:00:33,650 --> 00:00:38,620
of partial derivatives by means
of a rather practical example,

11
00:00:38,620 --> 00:00:41,110
which not only has
wide application

12
00:00:41,110 --> 00:00:45,320
but which ties in many of
the individual principles

13
00:00:45,320 --> 00:00:49,510
that we've talked about in the
last two blocks of material.

14
00:00:49,510 --> 00:00:52,640
The particular topic
that I have in mind today

15
00:00:52,640 --> 00:00:57,900
is the topic known as the theory
of maxima/minima of functions

16
00:00:57,900 --> 00:00:59,770
in several variables.

17
00:00:59,770 --> 00:01:02,090
You see, in part
one of our course

18
00:01:02,090 --> 00:01:03,740
we studied the
special case where

19
00:01:03,740 --> 00:01:06,840
we had a function from the real
numbers into the real numbers.

20
00:01:06,840 --> 00:01:09,330
And what we were
looking for were

21
00:01:09,330 --> 00:01:13,420
values of the independent
variable for which f

22
00:01:13,420 --> 00:01:15,920
was either maximum or minimum.

23
00:01:15,920 --> 00:01:19,600
And so a natural extension of
this is simply the following:

24
00:01:19,600 --> 00:01:23,110
given a real-valued function
of several real variables--

25
00:01:23,110 --> 00:01:26,670
in other words, assume that f
is a mapping from n-dimensional

26
00:01:26,670 --> 00:01:31,660
space into the real numbers, f
is a function from E^n into E.

27
00:01:31,660 --> 00:01:36,560
Then the n-tuple a bar, in
E^n, is called a local maximum--

28
00:01:36,560 --> 00:01:39,140
and I suppose we might as well
kill two birds with one stone

29
00:01:39,140 --> 00:01:42,370
here and put in the
definition for a local minimum

30
00:01:42,370 --> 00:01:43,710
at the same time.

31
00:01:43,710 --> 00:01:46,530
It's called a local
maximum of f if and only

32
00:01:46,530 --> 00:01:52,510
if there exists a neighborhood
N of a bar such that f of a bar

33
00:01:52,510 --> 00:01:56,940
is greater than or equal to
f of x bar for every x bar

34
00:01:56,940 --> 00:01:58,310
in the neighborhood.

35
00:01:58,310 --> 00:01:59,990
I suppose, by the
way, while we're

36
00:01:59,990 --> 00:02:04,290
at this, for a local minimum
the condition would be what?

37
00:02:04,290 --> 00:02:06,600
For a local minimum,
instead of f of a bar

38
00:02:06,600 --> 00:02:11,020
being greater than or equal to f
of x bar, it would be less than

39
00:02:11,020 --> 00:02:12,270
or equal to.
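
The two spoken definitions can be written compactly. This is a notational sketch only; the symbol N(\bar a) for the neighborhood is chosen here for illustration, following the "a bar" and "x bar" of the lecture.

```latex
\[
\bar a \text{ is a local maximum of } f
\iff \exists\, N(\bar a) \text{ such that } f(\bar a) \ge f(\bar x)
\ \text{ for every } \bar x \in N(\bar a),
\]
\[
\bar a \text{ is a local minimum of } f
\iff \exists\, N(\bar a) \text{ such that } f(\bar a) \le f(\bar x)
\ \text{ for every } \bar x \in N(\bar a).
\]
```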

40
00:02:12,270 --> 00:02:13,700
In other words,
what you're saying

41
00:02:13,700 --> 00:02:18,740
is that what you mean by a
local high point or low point

42
00:02:18,740 --> 00:02:19,450
is what?

43
00:02:19,450 --> 00:02:22,610
That in a neighborhood
of the point in question,

44
00:02:22,610 --> 00:02:28,260
for example, if
that output exceeds

45
00:02:28,260 --> 00:02:31,130
every other possible output
in a sufficiently small

46
00:02:31,130 --> 00:02:35,000
neighborhood, then we say
that that particular input

47
00:02:35,000 --> 00:02:37,080
is the local maximum.

48
00:02:37,080 --> 00:02:39,530
And a similar definition
for local minimum.

49
00:02:39,530 --> 00:02:41,540
In other words,
again, keep in mind

50
00:02:41,540 --> 00:02:44,590
that this is
precisely and almost

51
00:02:44,590 --> 00:02:47,740
a word-for-word translation
of what the terms relative

52
00:02:47,740 --> 00:02:51,050
maximum and relative minimum
or local max and local min

53
00:02:51,050 --> 00:02:54,210
meant for functions of
a single real variable.

54
00:02:54,210 --> 00:02:58,310
Consequently, except for the
fact that in several variables

55
00:02:58,310 --> 00:03:01,850
it's much more difficult
to describe domains because

56
00:03:01,850 --> 00:03:03,730
of all the degrees of
freedom that you have,

57
00:03:03,730 --> 00:03:07,990
one would expect that these same
particular tests for membership

58
00:03:07,990 --> 00:03:10,370
would occur for functions
of several variables

59
00:03:10,370 --> 00:03:13,020
when we're looking for high-low
points or max-min points

60
00:03:13,020 --> 00:03:15,320
as in the case of one
independent variable.

61
00:03:15,320 --> 00:03:18,150
So let's just very
quickly summarize this.

62
00:03:18,150 --> 00:03:22,360
How do we test for candidates
for max-min points?

63
00:03:22,360 --> 00:03:25,780
Now, the idea is simply
this, that if we're

64
00:03:25,780 --> 00:03:28,840
going to have a relative
high or a relative low point,

65
00:03:28,840 --> 00:03:30,730
think of the thing graphically.

66
00:03:30,730 --> 00:03:33,820
If you take a cross section
with respect to any one

67
00:03:33,820 --> 00:03:35,240
of the independent variables.

68
00:03:35,240 --> 00:03:41,150
In other words, if you look at
f as a function just of x_1,

69
00:03:41,150 --> 00:03:44,370
say, and hold x_2
up to x_n constant,

70
00:03:44,370 --> 00:03:47,960
then when w is viewed as
a function of x_1 alone,

71
00:03:47,960 --> 00:03:51,120
then one would expect
that you have what?

72
00:03:51,120 --> 00:03:56,360
That just in the w,x_1-plane,
you must have a candidate,

73
00:03:56,360 --> 00:03:58,920
which means that one would
expect that if you took

74
00:03:58,920 --> 00:04:02,590
the partial of f with respect
to x_1, that must be 0.

75
00:04:02,590 --> 00:04:04,960
Similarly, the partial
of f with respect to x_2

76
00:04:04,960 --> 00:04:06,490
must be 0, et cetera.

77
00:04:06,490 --> 00:04:09,630
The partial of f with
respect to x sub n must be 0.

78
00:04:09,630 --> 00:04:13,280
In other words, to find a
candidate for a max-min point,

79
00:04:13,280 --> 00:04:15,120
notice that right
off the bat we're

80
00:04:15,120 --> 00:04:20,269
back to a practical application
of the study of systems

81
00:04:20,269 --> 00:04:23,470
of several equations
in several unknowns.

82
00:04:23,470 --> 00:04:27,000
Namely, we must solve
the system of equations

83
00:04:27,000 --> 00:04:30,470
the partial of f with respect
to x_1 equals 0, et cetera,

84
00:04:30,470 --> 00:04:33,250
the partial of f with
respect to x_n equals 0.

85
00:04:33,250 --> 00:04:36,460
Find simultaneous
solutions, whenever

86
00:04:36,460 --> 00:04:39,160
such simultaneous
solutions exist.

87
00:04:39,160 --> 00:04:41,950
Now of course, there may
be points in our domain

88
00:04:41,950 --> 00:04:44,380
where the partials
don't exist, just

89
00:04:44,380 --> 00:04:47,170
like there was in the case of
calculus of a single variable.

90
00:04:47,170 --> 00:04:50,410
In other words, the second thing
we do in testing for candidates

91
00:04:50,410 --> 00:04:55,160
is we find points in the domain
where f is not differentiable.

92
00:04:55,160 --> 00:04:56,970
For example, f might
not be continuous.

93
00:04:56,970 --> 00:04:58,530
f might not be defined.

94
00:04:58,530 --> 00:05:00,200
Whatever the reason
is, we look to see

95
00:05:00,200 --> 00:05:02,160
where f is not differentiable.

96
00:05:02,160 --> 00:05:06,210
And all points in the domain at
which f is not differentiable,

97
00:05:06,210 --> 00:05:10,030
they also become candidates
for max-min points.

98
00:05:10,030 --> 00:05:13,920
And thirdly, we check the
boundary of the domain of f.

99
00:05:13,920 --> 00:05:15,810
In other words,
if the domain of f

100
00:05:15,810 --> 00:05:17,890
happens to be a
two-dimensional region,

101
00:05:17,890 --> 00:05:20,190
we check the boundary
of the region.

102
00:05:20,190 --> 00:05:23,730
Again, the reason being
the same as in the calculus

103
00:05:23,730 --> 00:05:25,390
of a single real variable.

104
00:05:25,390 --> 00:05:27,730
If a function is
differentiable, it

105
00:05:27,730 --> 00:05:30,290
must take on its maximum
and minimum values

106
00:05:30,290 --> 00:05:32,540
some place, if
the domain happens

107
00:05:32,540 --> 00:05:36,140
to be a closed set, in
other words, a connected set

108
00:05:36,140 --> 00:05:37,320
with a boundary.

109
00:05:37,320 --> 00:05:39,160
And we won't go into
that in any more

110
00:05:39,160 --> 00:05:41,420
detail at this particular time.

111
00:05:41,420 --> 00:05:42,900
But essentially, what do we do?

112
00:05:42,900 --> 00:05:45,670
We look at the function
wherever it's differentiable,

113
00:05:45,670 --> 00:05:48,440
take all of its partials,
set them equal to 0,

114
00:05:48,440 --> 00:05:52,790
solve that system simultaneously
to see what values of x_1

115
00:05:52,790 --> 00:05:57,180
up to x_n give us permissible
candidates for max-min points,

116
00:05:57,180 --> 00:06:01,280
we check to see where the
derivative does not exist,

117
00:06:01,280 --> 00:06:03,250
and that gives us
another batch of points,

118
00:06:03,250 --> 00:06:06,110
and then we check
the boundary values

119
00:06:06,110 --> 00:06:08,900
to see if anything
peculiar happens there.

120
00:06:08,900 --> 00:06:12,010
Same three tests as we had for
calculus of a single variable.

121
00:06:12,010 --> 00:06:14,090
And why is this
so closely allied

122
00:06:14,090 --> 00:06:15,700
to calculus of a
single variable?

123
00:06:15,700 --> 00:06:18,560
Well, if we take
the case n equals 2,

124
00:06:18,560 --> 00:06:22,000
we again get a nice
geometric interpretation.

125
00:06:22,000 --> 00:06:24,570
Namely, if w is a
function of x and y,

126
00:06:24,570 --> 00:06:26,700
notice again our
notation for what

127
00:06:26,700 --> 00:06:28,980
happens when we have two
independent variables.

128
00:06:28,980 --> 00:06:31,040
They're called x and
y, and usually we

129
00:06:31,040 --> 00:06:34,740
let the dependent variable
be w in that case.

130
00:06:34,740 --> 00:06:37,320
Let's take a look and see
what happens over here.

131
00:06:37,320 --> 00:06:38,800
Suppose, for the
sake of argument,

132
00:06:38,800 --> 00:06:42,030
that we know that we
have a relative low point

133
00:06:42,030 --> 00:06:45,080
corresponding to
the input a comma b.

134
00:06:45,080 --> 00:06:47,990
In other words, suppose we
know that f of a comma b

135
00:06:47,990 --> 00:06:52,560
is the lowest value of f, the
lowest height on the surface,

136
00:06:52,560 --> 00:06:54,840
in a sufficiently small
neighborhood of the point

137
00:06:54,840 --> 00:06:55,990
a comma b.

138
00:06:55,990 --> 00:07:00,160
All I'm saying is this:
take any slice whatsoever,

139
00:07:00,160 --> 00:07:05,260
any plane through the point a
comma b, pick any direction s.

140
00:07:05,260 --> 00:07:07,720
Take that plane
perpendicular to the xy-plane

141
00:07:07,720 --> 00:07:10,850
in the s direction,
and slice the surface w

142
00:07:10,850 --> 00:07:13,260
equals f of x, y
with that plane,

143
00:07:13,260 --> 00:07:16,640
and we get a slice
something like this.

144
00:07:16,640 --> 00:07:19,550
Not something like this, this
is the slice that we get.

145
00:07:19,550 --> 00:07:21,670
Now, let's take a look at
where that low point is.

146
00:07:21,670 --> 00:07:24,750
Since that is to be a low
point in the entire region,

147
00:07:24,750 --> 00:07:28,310
obviously it must be a
low point, in particular,

148
00:07:28,310 --> 00:07:31,390
with respect to the
particular slice that we took.

149
00:07:31,390 --> 00:07:33,420
How could it be the
lowest point every place

150
00:07:33,420 --> 00:07:35,400
if it's not the lowest
point with respect

151
00:07:35,400 --> 00:07:37,120
to any particular slice?

152
00:07:37,120 --> 00:07:39,580
And all we're saying
then is that with respect

153
00:07:39,580 --> 00:07:43,980
to this slice, notice that
w is a function of s alone.

154
00:07:43,980 --> 00:07:46,140
In other words, we can
talk about the directional

155
00:07:46,140 --> 00:07:49,050
derivative df/ds.

156
00:07:49,050 --> 00:07:52,630
And again, all we're saying is
that directional derivative,

157
00:07:52,630 --> 00:07:56,270
df/ds, evaluated at
the point a comma b,

158
00:07:56,270 --> 00:07:59,730
must be 0 for all directions s.

159
00:07:59,730 --> 00:08:02,140
And as we have seen
throughout our course,

160
00:08:02,140 --> 00:08:05,390
if f happens to be a
continuously differentiable

161
00:08:05,390 --> 00:08:08,060
function of the
variables x_1 up to x_n,

162
00:08:08,060 --> 00:08:12,560
then the directional derivative
is determined completely

163
00:08:12,560 --> 00:08:17,220
by our knowledge of the partials
with respect to x_1 up to x_n.
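
For the case n equals 2, the dependence of the directional derivative on the partials can be written out as follows. The angle \theta between the direction s and the positive x-axis is notation assumed here for illustration.

```latex
\[
\left.\frac{df}{ds}\right|_{(a,b)}
  = f_x(a,b)\cos\theta + f_y(a,b)\sin\theta .
\]
% If f_x(a,b) = 0 and f_y(a,b) = 0, the right-hand side is 0 for every
% \theta, so df/ds vanishes in every direction s.
```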

164
00:08:17,220 --> 00:08:21,060
And you see, as far as
the theory is concerned,

165
00:08:21,060 --> 00:08:22,680
that's all there is to it.

166
00:08:22,680 --> 00:08:26,060
As the cliche goes,
the rest is commentary.

167
00:08:26,060 --> 00:08:28,330
Now what kind of
commentary do I mean?

168
00:08:28,330 --> 00:08:30,640
Well, among other
things, once we

169
00:08:30,640 --> 00:08:34,830
have located all the particular
max-min candidates-- see,

170
00:08:34,830 --> 00:08:37,140
notice why I call these
things candidates.

171
00:08:37,140 --> 00:08:39,710
All we're saying
is that wherever

172
00:08:39,710 --> 00:08:42,559
the system of
partial derivatives,

173
00:08:42,559 --> 00:08:45,930
setting them equal
to 0, yields a value,

174
00:08:45,930 --> 00:08:49,410
that point is simply a
candidate for a max-min point.

175
00:08:49,410 --> 00:08:52,160
Just, again, like in the
calculus of a single variable.

176
00:08:52,160 --> 00:08:56,855
If f prime of a is 0, we cannot
conclude that a is a local max

177
00:08:56,855 --> 00:08:58,930
or a local min. It
might be a saddle point,

178
00:08:58,930 --> 00:09:02,260
a stationary point where
the thing just levels off.

179
00:09:02,260 --> 00:09:04,650
It's just that once
we have the candidate,

180
00:09:04,650 --> 00:09:07,870
how do we test whether it
really is a max or a min?

181
00:09:07,870 --> 00:09:12,290
Well, what we have to do is,
looking at f of a comma b,

182
00:09:12,290 --> 00:09:15,240
we have to see whether that's
the lowest possible value

183
00:09:15,240 --> 00:09:18,200
or the highest possible
value in a sufficiently small

184
00:09:18,200 --> 00:09:19,390
neighborhood of that point.

185
00:09:19,390 --> 00:09:22,400
Well, how do you characterize
a neighborhood of the point?

186
00:09:22,400 --> 00:09:25,950
You look at some nearby point,
which we can denote as what?

187
00:09:25,950 --> 00:09:28,560
a plus h comma b plus k.

188
00:09:28,560 --> 00:09:31,610
And if this h and k
seem strange to you,

189
00:09:31,610 --> 00:09:34,800
notice that h is often
what we call delta x,

190
00:09:34,800 --> 00:09:37,330
and k is what we
often call delta y.

191
00:09:37,330 --> 00:09:41,170
In other words, we look at some
nearby point, a plus delta x

192
00:09:41,170 --> 00:09:43,360
comma b plus delta y.

193
00:09:43,360 --> 00:09:46,540
And what we're saying
is that if this

194
00:09:46,540 --> 00:09:50,140
is to be, for
example, a high point,

195
00:09:50,140 --> 00:09:52,850
it means that when you
compute this difference,

196
00:09:52,850 --> 00:09:56,920
this difference had better
be negative all the time.

197
00:09:56,920 --> 00:10:00,140
Because when you look at this
particular thing over here,

198
00:10:00,140 --> 00:10:02,220
if this is to be the
greatest possible value

199
00:10:02,220 --> 00:10:04,004
in a neighborhood
when you subtract it

200
00:10:04,004 --> 00:10:05,670
from something else
in the neighborhood,

201
00:10:05,670 --> 00:10:07,410
you should get a negative value.

202
00:10:07,410 --> 00:10:10,790
In other words, to put it
in still other terms, what

203
00:10:10,790 --> 00:10:13,740
we're saying is that to--
let's just read this again.

204
00:10:13,740 --> 00:10:16,430
Once a max-min candidate
a comma b is found,

205
00:10:16,430 --> 00:10:20,450
we must investigate the sign
of f of a plus h comma b

206
00:10:20,450 --> 00:10:26,080
plus k minus f of a, b for
all sufficiently small values

207
00:10:26,080 --> 00:10:28,965
of h and k.
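
To make the sign investigation concrete, here is a rough numerical sketch in Python. The helper name `difference_signs`, the step size, and the two sample functions are all assumptions made for illustration. Sampling a finite grid can only suggest the answer; it does not prove anything about every sufficiently small h and k.

```python
def difference_signs(f, a, b, step=1e-3, n=8):
    """Collect the signs of f(a+h, b+k) - f(a, b) on a small grid around (a, b)."""
    signs = set()
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if i == 0 and j == 0:
                continue  # skip the point (a, b) itself
            diff = f(a + i * step, b + j * step) - f(a, b)
            signs.add((diff > 0) - (diff < 0))  # +1, 0, or -1
    return signs


# Hypothetical test functions, both with a candidate at (0, 0).
bowl_signs = difference_signs(lambda x, y: x * x + y * y, 0.0, 0.0)
saddle_signs = difference_signs(lambda x, y: x * x - y * y, 0.0, 0.0)
print(bowl_signs, saddle_signs)
```

Only positive differences around the candidate are consistent with a low point. Finding both positive and negative differences rules out a max or a min at that point.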

208
00:10:28,965 --> 00:10:30,110
OK?

209
00:10:30,110 --> 00:10:32,270
That's for all sufficiently
small values of h and k,

210
00:10:32,270 --> 00:10:33,150
which can be messy.

211
00:10:33,150 --> 00:10:34,040
I don't mean that.

212
00:10:34,040 --> 00:10:37,695
I mean for all sufficiently
small values of h and k,

213
00:10:37,695 --> 00:10:39,070
in other words,
in a neighborhood

214
00:10:39,070 --> 00:10:40,710
of the point a comma b.

215
00:10:40,710 --> 00:10:44,360
And what I'm saying is that
this particular computation

216
00:10:44,360 --> 00:10:46,470
can itself be very messy.

217
00:10:46,470 --> 00:10:46,970
You see?

218
00:10:46,970 --> 00:10:50,170
This is, again, going back
to this idea of how we

219
00:10:50,170 --> 00:10:51,520
invert equations and the like.

220
00:10:51,520 --> 00:10:55,260
It's difficult to compute f
of a plus h comma b plus k,

221
00:10:55,260 --> 00:10:57,140
in general, if f is
a messy function,

222
00:10:57,140 --> 00:10:59,664
if f is a computationally
complicated thing.

223
00:10:59,664 --> 00:11:01,830
And not only that, but we
may have several unknowns,

224
00:11:01,830 --> 00:11:03,310
more than two unknowns.

225
00:11:03,310 --> 00:11:06,150
And you see, this was
true in one variable.

226
00:11:06,150 --> 00:11:08,610
We saw in the case of one
variable that, technically

227
00:11:08,610 --> 00:11:10,930
speaking, to test
whether f of a was

228
00:11:10,930 --> 00:11:13,540
a high point or a
low point, we had

229
00:11:13,540 --> 00:11:17,160
to look at f of a plus h minus
f of a for all sufficiently

230
00:11:17,160 --> 00:11:18,630
small values of h.

231
00:11:18,630 --> 00:11:21,810
And that could have been
a messy computation too.

232
00:11:21,810 --> 00:11:24,260
Of course, the thing that
happened in the single variable

233
00:11:24,260 --> 00:11:27,010
that was very helpful to
us was that in the case

234
00:11:27,010 --> 00:11:29,390
of a single variable,
we were often

235
00:11:29,390 --> 00:11:33,130
able to use f double
prime of a as a hint.

236
00:11:33,130 --> 00:11:35,870
In other words, whenever
f double prime of a

237
00:11:35,870 --> 00:11:40,250
wasn't equal to 0, we can
conclude, or could conclude,

238
00:11:40,250 --> 00:11:43,580
whether f of a was
a max or min, once

239
00:11:43,580 --> 00:11:45,280
we knew that f prime of a was 0.

240
00:11:45,280 --> 00:11:47,430
Remember, that was
that holding water

241
00:11:47,430 --> 00:11:49,100
versus spilling water routine.

242
00:11:49,100 --> 00:11:51,980
If f double prime
of a was positive,

243
00:11:51,980 --> 00:11:54,190
that meant that the
curve was holding water.

244
00:11:54,190 --> 00:11:57,410
Holding water meant that
you had a minimum value.

245
00:11:57,410 --> 00:12:02,000
If f double prime
of a was negative,

246
00:12:02,000 --> 00:12:04,160
that meant that the
curve was spilling water.

247
00:12:04,160 --> 00:12:07,329
And spilling water
yielded a maximum value.

248
00:12:07,329 --> 00:12:09,620
And the only problem was, is
when the second derivative

249
00:12:09,620 --> 00:12:12,577
was 0, in which case
the test failed.

250
00:12:12,577 --> 00:12:14,160
What did it mean
that the test failed?

251
00:12:14,160 --> 00:12:16,990
When f double prime of
a was 0, the only way

252
00:12:16,990 --> 00:12:20,030
we could test to see
whether a was a max or a min

253
00:12:20,030 --> 00:12:21,870
or neither, meaning
a saddle point,

254
00:12:21,870 --> 00:12:26,320
was to actually look at
f of a plus h minus f of a

255
00:12:26,320 --> 00:12:29,050
and see what happened
in that particular case.

256
00:12:29,050 --> 00:12:33,320
Now, one would like to believe
that an analogous result held

257
00:12:33,320 --> 00:12:35,520
for the case of
several real variables,

258
00:12:35,520 --> 00:12:38,860
in particular for the case
of two independent variables.

259
00:12:38,860 --> 00:12:41,960
The point is that in a
manner of speaking, it does.

260
00:12:41,960 --> 00:12:43,980
But in another
manner of speaking,

261
00:12:43,980 --> 00:12:46,300
things are much more
complicated than what

262
00:12:46,300 --> 00:12:49,120
happened in the case of
one independent variable.

263
00:12:49,120 --> 00:12:51,950
In particular, what goes
wrong is the following,

264
00:12:51,950 --> 00:12:54,440
and that is that the
second derivative,

265
00:12:54,440 --> 00:12:56,940
in the case of two
independent variables,

266
00:12:56,940 --> 00:12:59,400
involves three
separate partials.

267
00:12:59,400 --> 00:13:01,290
See, what do you mean
by a second derivative?

268
00:13:01,290 --> 00:13:03,600
You mean you must
differentiate something twice.

269
00:13:03,600 --> 00:13:05,630
Well, you could've
differentiated

270
00:13:05,630 --> 00:13:09,250
the function twice with respect
to x at the point a comma b.

271
00:13:09,250 --> 00:13:12,280
That's what we mean
by f sub xx, recall.

272
00:13:12,280 --> 00:13:15,950
Or you might have differentiated
first with respect

273
00:13:15,950 --> 00:13:19,270
to x and then with respect
to y, in which case

274
00:13:19,270 --> 00:13:22,870
it would have been f
sub xy of a comma b.

275
00:13:22,870 --> 00:13:24,300
In fact, I should
be careful here.

276
00:13:24,300 --> 00:13:25,883
There should be a
fourth one here too,

277
00:13:25,883 --> 00:13:28,240
and that is f sub y,x.

278
00:13:28,240 --> 00:13:31,490
Namely, first differentiate
with respect to y and then

279
00:13:31,490 --> 00:13:32,740
with respect to x.

280
00:13:32,740 --> 00:13:34,680
The reason I left
that out over here

281
00:13:34,680 --> 00:13:38,780
was simply because, if we
have a nice function, meaning

282
00:13:38,780 --> 00:13:41,810
one that's continuous, and the
derivatives are continuous,

283
00:13:41,810 --> 00:13:45,360
and the mixed derivatives
exist and are continuous,

284
00:13:45,360 --> 00:13:47,550
we showed that the
order in which we

285
00:13:47,550 --> 00:13:50,740
take the partials made no
difference, that f sub xy

286
00:13:50,740 --> 00:13:53,730
was equal to f sub yx.

287
00:13:53,730 --> 00:13:56,640
But getting back to this
idea, the third possibility

288
00:13:56,640 --> 00:13:58,580
is that you could have
differentiated twice

289
00:13:58,580 --> 00:14:03,170
with respect to y and formed
f sub yy of a comma b.

290
00:14:03,170 --> 00:14:06,150
And so the question is, with all
of these different second-order

291
00:14:06,150 --> 00:14:07,770
partial derivatives
floating around,

292
00:14:07,770 --> 00:14:10,560
what do you mean by the
second partial derivative?

293
00:14:10,560 --> 00:14:12,880
And the key
expression, and I think

294
00:14:12,880 --> 00:14:15,810
this is far from intuitive,
but the key expression

295
00:14:15,810 --> 00:14:20,860
turns out to be that the
determining factor is

296
00:14:20,860 --> 00:14:23,200
the second partial with
respect to x, in other words f

297
00:14:23,200 --> 00:14:27,730
sub xx, multiplied by the second
partial with respect to y, f

298
00:14:27,730 --> 00:14:33,200
sub yy, minus the square
of the mixed partial.

299
00:14:33,200 --> 00:14:36,830
And that particular factor,
the sign of that factor,

300
00:14:36,830 --> 00:14:40,830
determines whether you have a
maximum point, a minimum point,

301
00:14:40,830 --> 00:14:44,270
a saddle point, or else
the test might fail.

302
00:14:44,270 --> 00:14:48,840
By the way, this is a rather
difficult proof to come by.

303
00:14:48,840 --> 00:14:52,040
The proof is done in chapter
18 of the Thomas text,

304
00:14:52,040 --> 00:14:54,270
and is assigned for you.

305
00:14:54,270 --> 00:14:57,300
I tried to give you learning
exercises that take you

306
00:14:57,300 --> 00:14:59,320
through the proof step by step.

307
00:14:59,320 --> 00:15:03,540
And I have also included an
optional supplementary lecture

308
00:15:03,540 --> 00:15:06,370
for those of you who may still
have difficulty following

309
00:15:06,370 --> 00:15:08,660
both the text and the
supplementary notes

310
00:15:08,660 --> 00:15:12,210
and the learning exercises,
because it's all written out

311
00:15:12,210 --> 00:15:14,090
and you might like to
hear the thing spoken.

312
00:15:14,090 --> 00:15:17,140
What I will do is
derive this for you

313
00:15:17,140 --> 00:15:19,570
in an optional lecture for
those of you who want it.

314
00:15:19,570 --> 00:15:22,170
But for the time being,
to give us our overview,

315
00:15:22,170 --> 00:15:24,790
let me simply state
what the properties

316
00:15:24,790 --> 00:15:27,220
of this particular quantity are.

317
00:15:27,220 --> 00:15:29,900
That the main result-- and as I
just write here to remind you,

318
00:15:29,900 --> 00:15:31,580
that the details
are derived later,

319
00:15:31,580 --> 00:15:34,290
both in the text, in
the learning exercises,

320
00:15:34,290 --> 00:15:39,220
and in an optional lecture--
that suppose we have solved

321
00:15:39,220 --> 00:15:43,220
our simultaneous
system and have found

322
00:15:43,220 --> 00:15:45,670
points a comma b,
where the partial

323
00:15:45,670 --> 00:15:48,570
of f with respect to x and the
partial of f with respect to y

324
00:15:48,570 --> 00:15:49,820
equals 0.

325
00:15:49,820 --> 00:15:51,790
So we now have a
candidate, meaning

326
00:15:51,790 --> 00:15:56,120
a comma b now is eligible to be
tested to see whether it yields

327
00:15:56,120 --> 00:15:58,790
a maximum or a minimum value.

328
00:15:58,790 --> 00:16:04,190
The test turns out to be this,
you compute f sub xx times

329
00:16:04,190 --> 00:16:10,330
f sub yy minus f sub xy
squared at the point a comma b.

330
00:16:10,330 --> 00:16:14,680
And if that particular number
turns out to be greater than 0,

331
00:16:14,680 --> 00:16:19,930
then a comma b yields
a local minimum

332
00:16:19,930 --> 00:16:25,230
of f if f sub xx
happens to be positive,

333
00:16:25,230 --> 00:16:30,740
and a local maximum if f sub
xx happens to be negative.

334
00:16:30,740 --> 00:16:32,800
Now, the easiest
way to remember that

335
00:16:32,800 --> 00:16:36,430
is to think in terms of a
partial derivative, again.

336
00:16:36,430 --> 00:16:40,620
Imagine that we've sliced the
surface so that we're looking

337
00:16:40,620 --> 00:16:43,440
at a cut in the wx-plane.

338
00:16:43,440 --> 00:16:49,360
In the wx-plane, notice that
if the second derivative of w

339
00:16:49,360 --> 00:16:51,800
with respect to x
is positive, that

340
00:16:51,800 --> 00:16:55,070
means that the curve
is holding water.

341
00:16:55,070 --> 00:16:59,270
And holding water seems to
indicate a minimum, you see.

342
00:16:59,270 --> 00:17:03,050
And similarly, if it's
negative it's spilling water,

343
00:17:03,050 --> 00:17:05,630
and that would
indicate a maximum.

344
00:17:05,630 --> 00:17:09,750
You might say, what happens
if f sub xx happens to be 0.

345
00:17:09,750 --> 00:17:13,510
And the answer is, lookit,
if f sub xx happens to be 0,

346
00:17:13,510 --> 00:17:16,210
this case couldn't have
occurred in the first place,

347
00:17:16,210 --> 00:17:18,480
because if f sub
xx happens to be 0,

348
00:17:18,480 --> 00:17:20,480
this term drops
out, in which case

349
00:17:20,480 --> 00:17:25,280
this could not be a
positive expression.

350
00:17:25,280 --> 00:17:28,630
With this term missing, the
smallest the square can be

351
00:17:28,630 --> 00:17:37,200
is 0, and the negative of a
positive or zero number

352
00:17:37,200 --> 00:17:38,430
can't be positive.

353
00:17:38,430 --> 00:17:40,780
In other words, you could
not obey this inequality

354
00:17:40,780 --> 00:17:42,967
if f sub xx happened to be 0.

355
00:17:42,967 --> 00:17:44,550
If you want to argue,
why couldn't you

356
00:17:44,550 --> 00:17:49,350
look at f sub yy instead of f
sub xx, the answer is, f sub xx

357
00:17:49,350 --> 00:17:52,630
and f sub yy must both
have the same sign.

358
00:17:52,630 --> 00:17:54,340
Because if we just
look at this thing,

359
00:17:54,340 --> 00:17:56,880
notice that this term
has to be positive.

360
00:17:56,880 --> 00:17:59,160
You're subtracting off
something positive,

361
00:17:59,160 --> 00:18:01,120
therefore the term
you're subtracting from

362
00:18:01,120 --> 00:18:02,340
must be positive.

363
00:18:02,340 --> 00:18:05,270
And the only way the product
of two numbers can be positive

364
00:18:05,270 --> 00:18:08,370
is if each of the factors
has the same sign.

365
00:18:08,370 --> 00:18:10,490
But again, I don't
want to belabor that.

366
00:18:10,490 --> 00:18:13,670
I just want to go through this
thing fairly rapidly with you.

367
00:18:13,670 --> 00:18:17,820
It turns out, by the way, if
this key factor, this key term,

368
00:18:17,820 --> 00:18:22,610
f sub xx f sub yy minus the
square of the mixed partial,

369
00:18:22,610 --> 00:18:25,000
happens to be negative,
then you can be sure

370
00:18:25,000 --> 00:18:26,600
that you have a saddle point.

371
00:18:26,600 --> 00:18:28,230
In other words,
what that means is

372
00:18:28,230 --> 00:18:30,390
for any neighborhood
of the point

373
00:18:30,390 --> 00:18:36,080
a comma b, for some values of
f, for some values of x comma

374
00:18:36,080 --> 00:18:39,290
y in that neighborhood, the
function will be greater than f

375
00:18:39,290 --> 00:18:42,520
of a comma b, and for others
it will be less than f

376
00:18:42,520 --> 00:18:43,390
of a comma b.

377
00:18:43,390 --> 00:18:45,640
So it's neither a max nor a min.

378
00:18:45,640 --> 00:18:48,880
And it turns out that the
situation in which the test

379
00:18:48,880 --> 00:18:53,460
fails is if this particular
expression happens to equal 0.

380
00:18:53,460 --> 00:18:56,530
So again, you see that from
a purely mechanical point

381
00:18:56,530 --> 00:19:01,180
of view, this test is
rather easy to memorize.

382
00:19:01,180 --> 00:19:03,305
The hard part is the proof.

383
00:19:03,305 --> 00:19:04,680
And that's why,
as I say, there's

384
00:19:04,680 --> 00:19:07,750
extra drill on that part, if you
happen to be interested in it.

385
00:19:07,750 --> 00:19:10,060
And by the way, if you're
not interested in it,

386
00:19:10,060 --> 00:19:11,490
skip the proof.

387
00:19:11,490 --> 00:19:14,020
In fact, for the
sake of this course,

388
00:19:14,020 --> 00:19:17,930
I am not concerned with how well
you handle max-min problems.

389
00:19:17,930 --> 00:19:21,390
I'm interested more
in showing you overall

390
00:19:21,390 --> 00:19:24,090
what max-min
problems mean and how

391
00:19:24,090 --> 00:19:27,520
all of the principles of
partial differentiation

392
00:19:27,520 --> 00:19:30,320
seem to come up in that
particular application

393
00:19:30,320 --> 00:19:33,040
of max-min problems.

394
00:19:33,040 --> 00:19:34,960
So from a theoretical
point of view,

395
00:19:34,960 --> 00:19:36,580
that would complete
the study of how

396
00:19:36,580 --> 00:19:41,260
one handles max-min problems,
except that an even more

397
00:19:41,260 --> 00:19:45,050
difficult and subtle form
of computational difficulty

398
00:19:45,050 --> 00:19:49,440
comes up, in terms of some
of the practical applications

399
00:19:49,440 --> 00:19:52,990
that we have, which
motivate such mechanical and

400
00:19:52,990 --> 00:19:55,425
computational devices
known as, for example,

401
00:19:55,425 --> 00:19:59,010
the Lagrange multipliers and
things of this sort that,

402
00:19:59,010 --> 00:20:02,550
again, are discussed in the text
and in the learning exercises

403
00:20:02,550 --> 00:20:04,850
and which I will not discuss
in the lecture, other

404
00:20:04,850 --> 00:20:08,440
than to motivate for
you why they occur.

405
00:20:08,440 --> 00:20:11,570
And let me just finish up
today's lesson in terms of one

406
00:20:11,570 --> 00:20:12,620
more topic.

407
00:20:12,620 --> 00:20:14,510
And this is a bad
name for this topic,

408
00:20:14,510 --> 00:20:17,800
because we've already used this
word in a different context.

409
00:20:17,800 --> 00:20:20,170
But this very often
happens in mathematics,

410
00:20:20,170 --> 00:20:24,020
that the same word is used
in more than one context.

411
00:20:24,020 --> 00:20:26,300
But one often talks
about constraints

412
00:20:26,300 --> 00:20:29,400
when one deals with functions
of several variables.

413
00:20:29,400 --> 00:20:31,190
In other words,
in many cases when

414
00:20:31,190 --> 00:20:33,980
one wants to
maximize or minimize

415
00:20:33,980 --> 00:20:36,100
a function of
several variables, it

416
00:20:36,100 --> 00:20:38,870
turns out that certain
external conditions

417
00:20:38,870 --> 00:20:40,510
happen to be imposed.

418
00:20:40,510 --> 00:20:42,540
Now, if that sounds like
a difficult mouthful

419
00:20:42,540 --> 00:20:45,490
to comprehend, I would
like to start off

420
00:20:45,490 --> 00:20:47,540
with an example that's
already occurred

421
00:20:47,540 --> 00:20:50,250
in a max-min problem in
part one of our course,

422
00:20:50,250 --> 00:20:52,470
but in such a subtle
way that we never

423
00:20:52,470 --> 00:20:54,960
noticed that it was really
involving a function

424
00:20:54,960 --> 00:20:56,220
of several variables.

425
00:20:56,220 --> 00:20:58,291
In fact, if it hadn't
been that subtle,

426
00:20:58,291 --> 00:21:00,040
we'd have been in
trouble, because in part

427
00:21:00,040 --> 00:21:02,570
one of our course, we did
not talk about functions

428
00:21:02,570 --> 00:21:04,270
of more than one real variable.

429
00:21:04,270 --> 00:21:09,060
But for example, let's revisit
a type of problem that says

430
00:21:09,060 --> 00:21:12,270
let's find the minimum
distance, say, from the origin,

431
00:21:12,270 --> 00:21:16,230
0 comma 0, the minimum distance
from the origin to the curve

432
00:21:16,230 --> 00:21:17,960
x*y equals 1.

433
00:21:17,960 --> 00:21:20,030
Well, to find that
minimum distance,

434
00:21:20,030 --> 00:21:23,400
notice that what we have to
do is minimize a distance

435
00:21:23,400 --> 00:21:25,970
function, namely the
square of the distance--

436
00:21:25,970 --> 00:21:28,160
I use the square simply to
eliminate the square root

437
00:21:28,160 --> 00:21:31,530
sign here-- the square of
the distance from the origin

438
00:21:31,530 --> 00:21:35,490
to any point x comma y is
x squared plus y squared.

439
00:21:35,490 --> 00:21:37,720
Well, notice that
in this form, this

440
00:21:37,720 --> 00:21:40,220
is a function of two
independent variables.

441
00:21:40,220 --> 00:21:43,860
The trouble is, you don't want
the distance from the origin

442
00:21:43,860 --> 00:21:45,180
to any old point.

443
00:21:45,180 --> 00:21:49,380
The point that you're
investigating has to be

444
00:21:49,380 --> 00:21:51,950
on the curve x*y equals 1.

445
00:21:51,950 --> 00:21:55,410
And that means that x and
y, for your investigation

446
00:21:55,410 --> 00:21:57,710
in this problem,
are not independent.

447
00:21:57,710 --> 00:21:59,680
Namely, x and y are related.

448
00:21:59,680 --> 00:22:03,540
Now by the way, sometimes this
equation can be so messy that

449
00:22:03,540 --> 00:22:07,000
we cannot solve for y
explicitly in terms of x.

450
00:22:07,000 --> 00:22:10,200
In this particular case,
as long as x is not 0,

451
00:22:10,200 --> 00:22:13,850
we can solve specifically for
y in terms of x, in which case

452
00:22:13,850 --> 00:22:16,140
we get y equals 1 over x.

453
00:22:16,140 --> 00:22:17,890
And that is, in
this particular case

454
00:22:17,890 --> 00:22:20,840
then, the function that
we want to minimize,

455
00:22:20,840 --> 00:22:22,880
even though it looks
like a function of two

456
00:22:22,880 --> 00:22:25,510
independent variables,
is really a function

457
00:22:25,510 --> 00:22:28,050
of one independent
variable, because since y

458
00:22:28,050 --> 00:22:32,850
is equal to 1 over x in our
investigation, f of x comma y

459
00:22:32,850 --> 00:22:36,650
is really f of x comma 1 over x.

460
00:22:36,650 --> 00:22:39,040
In other words, going back
to what f is explicitly

461
00:22:39,040 --> 00:22:44,570
in this case, f of x comma y
is just x squared plus 1 over x

462
00:22:44,570 --> 00:22:45,970
quantity squared.

463
00:22:45,970 --> 00:22:48,970
And that in turn is also
just a function of x.
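As a minimal sketch of this one-variable reduction (the function names and the bisection routine below are illustrative assumptions, not something from the lecture), one could locate the minimum of x squared plus 1 over x squared numerically for positive x by finding where its derivative is 0:

```python
import math

def squared_distance(x):
    # Squared distance from the origin to the point (x, 1/x) on x*y = 1.
    return x * x + 1.0 / (x * x)

def derivative(x):
    # d/dx of x^2 + x^(-2), computed by hand.
    return 2.0 * x - 2.0 / x ** 3

def bisect_root(func, lo, hi, tol=1e-12):
    # Plain bisection; assumes func(lo) and func(hi) have opposite signs.
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Search only the branch x > 0; the bracket [0.5, 2.0] is an assumption.
x_star = bisect_root(derivative, 0.5, 2.0)
y_star = 1.0 / x_star
min_distance = math.sqrt(squared_distance(x_star))
```

Under these assumptions x_star comes out close to 1, so the nearest point on that branch is about (1, 1) and the minimum distance is about the square root of 2. By symmetry, the point (-1, -1) on the other branch is equally close.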

464
00:22:48,970 --> 00:22:51,310
Now you see, what makes
this thing more difficult

465
00:22:51,310 --> 00:22:54,300
is, first of all, we may not be
dealing with a function of just

466
00:22:54,300 --> 00:22:56,300
two variables to minimize.

467
00:22:56,300 --> 00:23:01,280
And secondly, we may have a very
difficult constraint imposed.

468
00:23:01,280 --> 00:23:03,190
Or what makes things
even more difficult

469
00:23:03,190 --> 00:23:05,270
is that if we have to
minimize a function,

470
00:23:05,270 --> 00:23:09,010
say, of five variables, there
may be two or three or even

471
00:23:09,010 --> 00:23:10,780
four constraints imposed.

472
00:23:10,780 --> 00:23:13,420
And the question is, how
do you maximize or minimize

473
00:23:13,420 --> 00:23:15,220
a function, taking
into consideration

474
00:23:15,220 --> 00:23:18,570
the fact that there are
constraints imposed?

475
00:23:18,570 --> 00:23:21,940
And this is what brings up
all of the type of material

476
00:23:21,940 --> 00:23:26,110
that I gave you as
exercises in the last unit,

477
00:23:26,110 --> 00:23:28,450
where we talked about
the Jacobian matrix,

478
00:23:28,450 --> 00:23:31,890
and handled
inverting systems,

479
00:23:31,890 --> 00:23:35,130
and how we could solve
explicitly or implicitly

480
00:23:35,130 --> 00:23:37,490
for functions, implicit
function theorems,

481
00:23:37,490 --> 00:23:39,470
things that we talked
about in those exercises.

482
00:23:39,470 --> 00:23:42,830
All of those things come up
in solving max-min problems

483
00:23:42,830 --> 00:23:44,110
in several unknowns.

484
00:23:44,110 --> 00:23:45,990
See, more generally,
what we're saying

485
00:23:45,990 --> 00:23:48,540
is we have more
variables than two,

486
00:23:48,540 --> 00:23:51,380
and we have more
implicit constraints,

487
00:23:51,380 --> 00:23:55,120
meaning more constraints than
just a single constraint,

488
00:23:55,120 --> 00:23:59,560
and also using more variables,
and also "more" in the sense of

489
00:23:59,560 --> 00:24:02,630
more implicit, meaning that
you just can't solve

490
00:24:02,630 --> 00:24:04,290
for one of the
variables explicitly

491
00:24:04,290 --> 00:24:06,120
in terms of the other,
even though there

492
00:24:06,120 --> 00:24:08,290
is an implicit
relationship involved.

493
00:24:08,290 --> 00:24:10,880
Now, I hate to work
abstractly in general,

494
00:24:10,880 --> 00:24:13,430
but in this particular
lecture I'm going to do that.

495
00:24:13,430 --> 00:24:15,850
I'm going to talk with
you abstractly here,

496
00:24:15,850 --> 00:24:20,480
but all of the exercises will
deal with concrete situations

497
00:24:20,480 --> 00:24:22,910
so that you'll see all
of the theory come alive

498
00:24:22,910 --> 00:24:23,920
in the problems.

499
00:24:23,920 --> 00:24:28,370
But because I want to give
you the material as compactly

500
00:24:28,370 --> 00:24:31,370
as possible, let me just
state what the situation is.

501
00:24:31,370 --> 00:24:34,680
For example, suppose we
want to maximize or minimize

502
00:24:34,680 --> 00:24:37,790
a function of what appears to
be three independent variables,

503
00:24:37,790 --> 00:24:39,770
say, f of x, y, z.

504
00:24:39,770 --> 00:24:41,450
And all of a sudden,
somebody tells us,

505
00:24:41,450 --> 00:24:45,290
hey, in the domain that we're
interested in, x, y, and z

506
00:24:45,290 --> 00:24:46,570
are not independent.

507
00:24:46,570 --> 00:24:48,150
There's a certain constraint.

508
00:24:48,150 --> 00:24:50,300
And let's write
that symbolically

509
00:24:50,300 --> 00:24:53,340
as some function of
x, y, and z equals 0.

510
00:24:53,340 --> 00:24:56,190
Remember, that's
simply our abstract way

511
00:24:56,190 --> 00:24:59,170
of saying that there is some
functional relationship that

512
00:24:59,170 --> 00:25:01,280
relates x, y, and z.

513
00:25:01,280 --> 00:25:04,440
And we'll simply call that
g of x, y, z equals 0.

514
00:25:04,440 --> 00:25:07,840
Maybe I could solve for z
explicitly in terms of x and y

515
00:25:07,840 --> 00:25:09,520
from this particular
relationship.

516
00:25:09,520 --> 00:25:10,780
Maybe I can't.

517
00:25:10,780 --> 00:25:13,230
But at any rate, what we
do know from the learning

518
00:25:13,230 --> 00:25:17,340
exercises of last time is
that g of x, y, z equals 0

519
00:25:17,340 --> 00:25:21,330
will implicitly define z as
some function of x and y,

520
00:25:21,330 --> 00:25:23,960
say k of x, y,
except in that case

521
00:25:23,960 --> 00:25:28,060
where the partial of g with
respect to z happens to be 0.

522
00:25:28,060 --> 00:25:30,110
And I put this thing
in parentheses for you

523
00:25:30,110 --> 00:25:32,370
simply to give you
motivation to review

524
00:25:32,370 --> 00:25:36,250
the exercises of last time if
this seems a bit vague to you.

525
00:25:36,250 --> 00:25:40,050
At any rate, what this
thing means in plain English

526
00:25:40,050 --> 00:25:42,750
is that, subject
to the constraint,

527
00:25:42,750 --> 00:25:45,830
z is some
function of x and y.

528
00:25:45,830 --> 00:25:49,350
We take this constraint,
we put that back

529
00:25:49,350 --> 00:25:54,590
into our original function
that we're trying to maximize.

530
00:25:54,590 --> 00:26:00,300
Notice that f of x, y, z now
becomes f of x, y, k of x, y.

531
00:26:00,300 --> 00:26:02,140
See? z is k of x, y.

532
00:26:02,140 --> 00:26:04,200
If we now look at
this expression,

533
00:26:04,200 --> 00:26:09,780
notice only x and y appear, so
that subject to the constraint

534
00:26:09,780 --> 00:26:13,990
g of x, y, z equals
0, the function f

535
00:26:13,990 --> 00:26:16,710
is a function of only two
independent variables,

536
00:26:16,710 --> 00:26:17,720
not three.

537
00:26:17,720 --> 00:26:21,992
And to indicate that, let's
simply say that f of x, y,

538
00:26:21,992 --> 00:26:25,130
z-- f of x, y, k of
x, y in this case--

539
00:26:25,130 --> 00:26:27,950
is some function h of x and y.

540
00:26:27,950 --> 00:26:29,680
And what our problem
is now saying

541
00:26:29,680 --> 00:26:33,190
is, minimize or
maximize the function

542
00:26:33,190 --> 00:26:36,890
h, which is a function of
two independent variables.

543
00:26:36,890 --> 00:26:38,720
Now notice here--
it's been quite a

544
00:26:38,720 --> 00:26:41,050
while since we've dealt
with the chain rule--

545
00:26:41,050 --> 00:26:44,180
but notice here that the
chain rule now comes up

546
00:26:44,180 --> 00:26:47,960
in a very important practical
application, namely,

547
00:26:47,960 --> 00:26:52,020
this looks like an eyesore:
f of x, y, k of x, y.

548
00:26:52,020 --> 00:26:53,500
How can we handle that?

549
00:26:53,500 --> 00:26:55,330
Notice that another
way of saying this,

550
00:26:55,330 --> 00:27:00,664
utilizing the chain rule, is
to say h of x, y is f of x, y,

551
00:27:00,664 --> 00:27:05,376
z where z is some
function k of x and y.

552
00:27:05,376 --> 00:27:08,160
See, f is a function
of x, y and z,

553
00:27:08,160 --> 00:27:10,225
and z is a function of x and y.

554
00:27:10,225 --> 00:27:12,100
In fact, if you wanted
to say it another way,

555
00:27:12,100 --> 00:27:13,310
you could say what?

556
00:27:13,310 --> 00:27:18,490
f is some function of x, y, z,
where x equals x, y equals y,

557
00:27:18,490 --> 00:27:20,120
and z equals k of x, y.

558
00:27:20,120 --> 00:27:20,620
You see?

559
00:27:20,620 --> 00:27:22,620
That's your
particular chain rule.

560
00:27:22,620 --> 00:27:23,820
Now, lookit.

561
00:27:23,820 --> 00:27:26,610
See, this is, again, one of
the problems of mathematics,

562
00:27:26,610 --> 00:27:29,400
which I hope is crystal
clear by this time.

563
00:27:29,400 --> 00:27:33,000
Granted, that we discussed
the chain rule in block three.

564
00:27:33,000 --> 00:27:37,290
That's no reason why in
block four we can beg off

565
00:27:37,290 --> 00:27:39,711
and say, we had that a long time
ago, I don't remember that.

566
00:27:39,711 --> 00:27:40,210
No.

567
00:27:40,210 --> 00:27:43,060
Hopefully, we've made
the chain rule so clear

568
00:27:43,060 --> 00:27:45,640
that any time I tell you
that we have to invoke it,

569
00:27:45,640 --> 00:27:48,190
you can just write it
down very, very quickly.

570
00:27:48,190 --> 00:27:50,050
How does the chain
rule work now?

571
00:27:50,050 --> 00:27:53,770
To differentiate
this with respect

572
00:27:53,770 --> 00:27:56,060
to x, say, we take
the partial of this

573
00:27:56,060 --> 00:27:59,760
with respect to x, times the
partial of x with respect to x,

574
00:27:59,760 --> 00:28:01,490
plus the partial
of f with respect

575
00:28:01,490 --> 00:28:03,690
to y times the partial
of y with respect

576
00:28:03,690 --> 00:28:06,990
to x, plus the partial of
f with respect to z times

577
00:28:06,990 --> 00:28:08,700
the partial of z
with respect to x.

578
00:28:08,700 --> 00:28:11,800
In other words, the general
theory hasn't changed at all.

579
00:28:11,800 --> 00:28:14,914
To maximize or minimize
h, what we're going to do

580
00:28:14,914 --> 00:28:17,080
is we're going to take the
partial of h with respect

581
00:28:17,080 --> 00:28:19,150
to x and set that
equal to 0, we're

582
00:28:19,150 --> 00:28:21,780
going to take the partial
of h with respect to y

583
00:28:21,780 --> 00:28:23,900
and set that equal
to 0, and solve

584
00:28:23,900 --> 00:28:25,860
that system simultaneously.

585
00:28:25,860 --> 00:28:27,820
But the hard point
computationally

586
00:28:27,820 --> 00:28:31,160
is, sure, you can say let's
set the partials equal to 0.

587
00:28:31,160 --> 00:28:33,370
But before you can do
that, you had better

588
00:28:33,370 --> 00:28:35,600
be able to take the partials.

589
00:28:35,600 --> 00:28:38,170
And that's where the
computational skill comes in.

590
00:28:38,170 --> 00:28:41,780
So how do we take the partial
of h with respect to x here?

591
00:28:41,780 --> 00:28:43,300
Well, we just said that.

592
00:28:43,300 --> 00:28:46,280
The partial of h with respect
to x, using the chain rule,

593
00:28:46,280 --> 00:28:50,020
is f sub x times the partial
of x with respect to x, plus f

594
00:28:50,020 --> 00:28:53,600
sub y times the partial of y
with respect to x, plus f sub

595
00:28:53,600 --> 00:28:56,390
z times the partial of
z with respect to x.

596
00:28:56,390 --> 00:29:00,460
Now notice that the partial
of x with respect to x is 1.

597
00:29:00,460 --> 00:29:03,740
So that gives me f
sub x over there.

598
00:29:03,740 --> 00:29:07,650
Keep in mind that whereas z
is a function of x and y--

599
00:29:07,650 --> 00:29:10,490
let's go back here and take
a quick look at that-- notice

600
00:29:10,490 --> 00:29:14,140
that in our function
h, we're assuming what?

601
00:29:14,140 --> 00:29:17,050
That x and y are the
independent variables,

602
00:29:17,050 --> 00:29:20,100
but that z is a
function of x and y.

603
00:29:20,100 --> 00:29:22,860
The point, therefore,
is that since x and y

604
00:29:22,860 --> 00:29:25,680
are independent
variables, by definition

605
00:29:25,680 --> 00:29:28,580
that means that the partial
of y with respect to x is 0.

606
00:29:28,580 --> 00:29:31,510
In other words, the fact
that y and x are independent

607
00:29:31,510 --> 00:29:35,280
means that the change in y
with respect to a change in x

608
00:29:35,280 --> 00:29:38,850
is 0, because we can change
x without changing y.

609
00:29:38,850 --> 00:29:42,010
And finally, given
that z is k of x, y,

610
00:29:42,010 --> 00:29:45,400
we can compute the partial
of z with respect to x.

611
00:29:45,400 --> 00:29:48,055
And so, what we wind up with
is that the partial of h

612
00:29:48,055 --> 00:29:51,460
with respect to x is the
partial of f with respect to x,

613
00:29:51,460 --> 00:29:57,020
plus the partial of f
with respect to z times

614
00:29:57,020 --> 00:29:58,830
the partial of z with respect to x.

615
00:29:58,830 --> 00:30:01,210
And we must set that equal to 0.

616
00:30:01,210 --> 00:30:05,240
Again, leaving the details to
you as a review of the chain

617
00:30:05,240 --> 00:30:05,750
rule.

618
00:30:05,750 --> 00:30:09,090
In a similar way, we show that
the partial of h with respect

619
00:30:09,090 --> 00:30:12,390
to y is the partial of
f with respect to y,

620
00:30:12,390 --> 00:30:14,570
plus the partial of
f with respect to z,

621
00:30:14,570 --> 00:30:16,860
times the partial of
z with respect to x,

622
00:30:16,860 --> 00:30:18,670
and we set that equal to 0.

623
00:30:18,670 --> 00:30:23,110
Now, the thing to keep in mind
is to observe that f of x, y, z

624
00:30:23,110 --> 00:30:24,380
was a given function.

625
00:30:24,380 --> 00:30:26,020
We know what that looks like.

626
00:30:26,020 --> 00:30:29,580
Consequently, these four
quantities are known.

627
00:30:29,580 --> 00:30:33,360
The trouble is that
g of x, y, z equals 0

628
00:30:33,360 --> 00:30:37,940
defines z implicitly as
a function of x and y.

629
00:30:37,940 --> 00:30:39,770
And consequently,
if it turns out

630
00:30:39,770 --> 00:30:43,410
that we could not have
solved our system for z

631
00:30:43,410 --> 00:30:46,740
explicitly in terms of x
and y, the question mark

632
00:30:46,740 --> 00:30:48,701
would be, for example--

633
00:30:48,701 --> 00:30:49,200
I'm sorry.

634
00:30:49,200 --> 00:30:50,850
This is a misprint over here.

635
00:30:50,850 --> 00:30:53,090
This should be, of course,
the partial of-- we're

636
00:30:53,090 --> 00:30:54,800
differentiating with respect to y.

637
00:30:54,800 --> 00:30:57,010
This is the partial of
f with respect to y,

638
00:30:57,010 --> 00:30:59,430
plus the partial of
f with respect to z,

639
00:30:59,430 --> 00:31:03,710
times the partial of
z with respect to y.

640
00:31:03,710 --> 00:31:08,500
The key point over here is
simply that we must know what?

641
00:31:08,500 --> 00:31:12,410
If we can't solve for z
explicitly in terms of x and y,

642
00:31:12,410 --> 00:31:15,040
how do we know what
these two quantities are?

643
00:31:15,040 --> 00:31:17,350
We only know them implicitly.

644
00:31:17,350 --> 00:31:19,460
And this gives us
our review, again,

645
00:31:19,460 --> 00:31:22,090
of setting differentials
equal to 0 and the like.

646
00:31:22,090 --> 00:31:25,610
Namely, if g of x, y,
z is identically 0,

647
00:31:25,610 --> 00:31:29,580
we can equate the derivative
on both sides to 0.

648
00:31:29,580 --> 00:31:31,780
And if we differentiate
this thing implicitly,

649
00:31:31,780 --> 00:31:34,690
again using the
chain rule, we get what?

650
00:31:34,690 --> 00:31:37,720
The partial of g with respect
to x, plus the partial

651
00:31:37,720 --> 00:31:40,025
of g with respect to z,
times the partial of z

652
00:31:40,025 --> 00:31:42,010
with respect to x is 0.

653
00:31:42,010 --> 00:31:45,410
Notice, by the way, that
g is given explicitly.

654
00:31:45,410 --> 00:31:48,600
We're told what the function
g of x, y, z looks like.

655
00:31:48,600 --> 00:31:52,400
Consequently, these
things here are known.

656
00:31:52,400 --> 00:31:55,630
And we can now solve for the
partial of z with respect to x.

657
00:31:55,630 --> 00:31:58,570
In fact, what is the partial
of z with respect to x?

658
00:31:58,570 --> 00:32:02,770
That's nothing more than what?

659
00:32:02,770 --> 00:32:05,450
It's minus the partial
of g with respect

660
00:32:05,450 --> 00:32:09,740
to x divided by the partial
of g with respect to z.

661
00:32:09,740 --> 00:32:12,430
And that's why the partial
of g with respect to z

662
00:32:12,430 --> 00:32:15,990
had better not be 0, otherwise
we wind up in trouble here.

663
00:32:15,990 --> 00:32:19,000
Similarly, we can find what the
partial of z with respect to y

664
00:32:19,000 --> 00:32:19,630
is.

665
00:32:19,630 --> 00:32:21,150
That's going to
turn out to be what?

666
00:32:21,150 --> 00:32:23,820
Minus the partial
of g with respect

667
00:32:23,820 --> 00:32:27,860
to y divided by the partial
of g with respect to z.
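Collected in one place, the relations described so far can be written as follows. This is only a summary sketch; the subscript shorthand for partial derivatives is an assumption made here for compactness:

```latex
% Shorthand assumed here: subscripts denote partial derivatives,
% and z = k(x, y) is the function defined implicitly by g(x, y, z) = 0.
\begin{align*}
h(x, y) &= f\bigl(x, y, k(x, y)\bigr), \\
h_x &= f_x + f_z \, z_x = 0, \\
h_y &= f_y + f_z \, z_y = 0, \\
g_x + g_z \, z_x &= 0 \;\Longrightarrow\; z_x = -\frac{g_x}{g_z}, \\
g_y + g_z \, z_y &= 0 \;\Longrightarrow\; z_y = -\frac{g_y}{g_z}, \qquad g_z \neq 0.
\end{align*}
```

Substituting the last two lines into the middle two gives f_x g_z = f_z g_x and f_y g_z = f_z g_y, which are to be solved together with g = 0.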

668
00:32:27,860 --> 00:32:31,780
Knowing what these two values
are from these two equations,

669
00:32:31,780 --> 00:32:33,900
we come back in here.

670
00:32:33,900 --> 00:32:37,430
And now we know what
these are, and now

671
00:32:37,430 --> 00:32:40,550
we simply have all of
the known functions

672
00:32:40,550 --> 00:32:43,890
on the right-hand side,
and we solve this system.

673
00:32:43,890 --> 00:32:45,580
Now, this
is one of the times

674
00:32:45,580 --> 00:32:47,390
I hope things don't
sound too clear to you,

675
00:32:47,390 --> 00:32:49,570
meaning you have an
overview, but you

676
00:32:49,570 --> 00:32:53,030
begin to suspect that this
is a very messy situation.

677
00:32:53,030 --> 00:32:55,480
Because, you see, what
I wanted you to see

678
00:32:55,480 --> 00:32:58,250
is that solving such
a system like this

679
00:32:58,250 --> 00:33:00,980
can be extremely cumbersome.

680
00:33:00,980 --> 00:33:04,710
And that's why, in the
exercises, we do two things.

681
00:33:04,710 --> 00:33:06,740
We have to solve
systems like this

682
00:33:06,740 --> 00:33:09,160
to get the experience
of seeing that there's

683
00:33:09,160 --> 00:33:12,330
a big difference between
knowing theoretically

684
00:33:12,330 --> 00:33:14,880
how to solve a
system of equations

685
00:33:14,880 --> 00:33:18,160
and knowing pragmatically
how to carry it out.

686
00:33:18,160 --> 00:33:21,110
And secondly, we hope that
some of these computations

687
00:33:21,110 --> 00:33:24,140
become cumbersome enough
so that you practically

688
00:33:24,140 --> 00:33:26,530
beg to find some shortcuts.

689
00:33:26,530 --> 00:33:29,170
Because if you're not
begging to find a shortcut,

690
00:33:29,170 --> 00:33:31,620
then such things as
Lagrange multipliers

691
00:33:31,620 --> 00:33:33,490
and the like, which
are techniques

692
00:33:33,490 --> 00:33:37,310
for solving max-min problems
subject to constraints,

693
00:33:37,310 --> 00:33:40,180
that those shortcuts
don't appeal to you

694
00:33:40,180 --> 00:33:41,920
and you fail to see
their significance,

695
00:33:41,920 --> 00:33:44,080
and you say, why do I have
to learn these things.

696
00:33:44,080 --> 00:33:46,180
Don't I have enough
problems without it?

697
00:33:46,180 --> 00:33:49,300
You see, again, notice the
difference between what's

698
00:33:49,300 --> 00:33:51,510
happening here
abstractly and what's

699
00:33:51,510 --> 00:33:53,010
happening computationally.

700
00:33:53,010 --> 00:33:56,150
Abstractly, the
theory of max-min

701
00:33:56,150 --> 00:34:00,110
is not that difficult. But
computationally, to handle it,

702
00:34:00,110 --> 00:34:03,890
you have to be extremely
adept at handling systems

703
00:34:03,890 --> 00:34:06,450
of n equations and n
unknowns, not necessarily

704
00:34:06,450 --> 00:34:09,690
linear equations, being
able to throw in constraints

705
00:34:09,690 --> 00:34:11,409
and seeing what's
happening here.

706
00:34:11,409 --> 00:34:15,370
And at any rate, that's
what the learning exercises

707
00:34:15,370 --> 00:34:17,929
will be all about, and
the material in the text.

708
00:34:17,929 --> 00:34:20,719
And I think that's enough
of a mouthful for this time.

709
00:34:20,719 --> 00:34:22,650
So until next time, goodbye.

710
00:34:27,730 --> 00:34:30,090
Funding for the
publication of this video

711
00:34:30,090 --> 00:34:34,969
was provided by the Gabriella
and Paul Rosenbaum Foundation.

712
00:34:34,969 --> 00:34:39,150
Help OCW continue to provide
free and open access to MIT

713
00:34:39,150 --> 00:34:43,070
courses by making a donation
at ocw.mit.edu/donate.