1
00:00:01,000 --> 00:00:03,000
The following content is
provided under a Creative

2
00:00:03,000 --> 00:00:05,000
Commons license.
Your support will help MIT

3
00:00:05,000 --> 00:00:08,000
OpenCourseWare continue to offer
high quality educational

4
00:00:08,000 --> 00:00:13,000
resources for free.
To make a donation or to view

5
00:00:13,000 --> 00:00:18,000
additional materials from
hundreds of MIT courses,

6
00:00:18,000 --> 00:00:23,000
visit MIT OpenCourseWare at
ocw.mit.edu.

7
00:00:23,000 --> 00:00:28,000
Today we are going to see how
to use what we saw last time

8
00:00:28,000 --> 00:00:33,000
about partial derivatives to
handle minimization or

9
00:00:33,000 --> 00:00:41,000
maximization problems involving
functions of several variables.

10
00:00:41,000 --> 00:00:44,000
Remember last time we said that
when we have a function,

11
00:00:44,000 --> 00:00:49,000
say, of two variables, x and y,
then we have actually two

12
00:00:49,000 --> 00:00:53,000
different derivatives,
partial f, partial x,

13
00:00:53,000 --> 00:01:02,000
also called f sub x,
the derivative with respect to

14
00:01:02,000 --> 00:01:11,000
x keeping y constant.
And we have partial f,

15
00:01:11,000 --> 00:01:21,000
partial y, also called f sub y,
where we vary y and we keep x

16
00:01:21,000 --> 00:01:26,000
as a constant.
And now, one thing I didn't

17
00:01:26,000 --> 00:01:30,000
have time to tell you about but
hopefully you thought about in

18
00:01:30,000 --> 00:01:37,000
recitation yesterday,
is the approximation formula

19
00:01:37,000 --> 00:01:47,000
that tells you what happens if
you vary both x and y.

20
00:01:47,000 --> 00:01:50,000
f sub x tells us what happens
if we change x a little bit,

21
00:01:50,000 --> 00:01:53,000
by some small amount delta x.
f sub y tells us how f changes,

22
00:01:53,000 --> 00:01:56,000
if you change y by a small
amount delta y.

23
00:01:56,000 --> 00:02:00,000
If we do both at the same time
then the two effects will add up

24
00:02:00,000 --> 00:02:02,000
with each other,
because you can imagine that

25
00:02:02,000 --> 00:02:05,000
first you will change x and then
you will change y.

26
00:02:05,000 --> 00:02:12,000
Or the other way around.
It doesn't really matter.

27
00:02:12,000 --> 00:02:18,000
If we change x by a certain
amount delta x,

28
00:02:18,000 --> 00:02:23,000
and if we change y by the
amount delta y,

29
00:02:23,000 --> 00:02:32,000
and let's say that we have z=
f(x, y) then that changes by an

30
00:02:32,000 --> 00:02:40,000
amount which is approximately f
sub x times delta x plus f sub y

31
00:02:40,000 --> 00:02:45,000
times delta y.
And that is one of the most

32
00:02:45,000 --> 00:02:49,000
important formulas about partial
derivatives.

33
00:02:49,000 --> 00:02:54,000
The intuition for this,
again, is just the two effects

34
00:02:54,000 --> 00:02:58,000
of if I change x by a small
amount and then I change y.

35
00:02:58,000 --> 00:03:02,000
Well, first changing x will
modify f, how much does it

36
00:03:02,000 --> 00:03:06,000
modify f?
The answer is the rate of change

37
00:03:06,000 --> 00:03:09,000
is f sub x.
And if I change y then the rate

38
00:03:09,000 --> 00:03:13,000
of change of f when I change y
is f sub y.

39
00:03:13,000 --> 00:03:17,000
So all together I get this
change in the value of f.

40
00:03:17,000 --> 00:03:19,000
And, of course,
that is only an approximation

41
00:03:19,000 --> 00:03:22,000
formula.
Actually, there would be higher

42
00:03:22,000 --> 00:03:28,000
order terms involving second and
third derivatives and so on.

43
00:03:28,000 --> 00:03:43,000
One way to justify this --
Sorry.

44
00:03:43,000 --> 00:03:47,000
I was distracted by the
microphone.

45
00:03:47,000 --> 00:03:55,000
OK.
How do we justify this formula?

46
00:03:55,000 --> 00:04:05,000
Well, one way to think about it
is in terms of tangent plane

47
00:04:05,000 --> 00:04:10,000
approximation.
Let's think about the tangent

48
00:04:10,000 --> 00:04:13,000
plane with regard to a function
f.

49
00:04:13,000 --> 00:04:15,000
We have some pictures to show
you.

50
00:04:15,000 --> 00:04:20,000
It will be easier if I show you
pictures.

51
00:04:20,000 --> 00:04:24,000
Remember, partial f,
partial x was obtained by

52
00:04:24,000 --> 00:04:29,000
looking at the situation where y
is held constant.

53
00:04:29,000 --> 00:04:33,000
That means I am slicing the
graph of f by a plane that is

54
00:04:33,000 --> 00:04:35,000
parallel to the x,
z plane.

55
00:04:35,000 --> 00:04:39,000
And when I change x,
z changes, and the slope of

56
00:04:39,000 --> 00:04:44,000
that is going to be the
derivative with respect to x.

57
00:04:44,000 --> 00:04:49,000
Now, if I do the same in the
other direction then I will have

58
00:04:49,000 --> 00:04:53,000
similarly the slope in a slice
now parallel to the y,

59
00:04:53,000 --> 00:04:57,000
z plane that will be partial f,
partial y.

60
00:04:57,000 --> 00:05:00,000
In fact, in each case,
I have a line.

61
00:05:00,000 --> 00:05:02,000
And that line is tangent to the
surface.

62
00:05:02,000 --> 00:05:06,000
Now, if I have two lines
tangent to the surface,

63
00:05:06,000 --> 00:05:09,000
well, then together they
determine for me the tangent

64
00:05:09,000 --> 00:05:13,000
plane to the surface.
Let's try to see how that works.

65
00:05:18,000 --> 00:05:28,000
We know that f sub x and f sub
y are the slopes of two tangent

66
00:05:28,000 --> 00:05:37,000
lines to this plane,
two tangent lines to the graph.

67
00:05:37,000 --> 00:05:39,000
And let's write down the
equations of these lines.

68
00:05:39,000 --> 00:05:41,000
I am not going to write
parametric equations.

69
00:05:41,000 --> 00:05:45,000
I am going to write them in
terms of x, y,

70
00:05:45,000 --> 00:05:49,000
z coordinates.
Let's say that partial f,

71
00:05:49,000 --> 00:05:53,000
partial x at the given point is
equal to a.

72
00:05:53,000 --> 00:06:00,000
That means that we have a line
given by the following

73
00:06:00,000 --> 00:06:05,000
conditions.
I am going to keep y constant

74
00:06:05,000 --> 00:06:07,000
equal to y0.
And I am going to change x.

75
00:06:07,000 --> 00:06:12,000
And, as I change x,
z will change at the rate that

76
00:06:12,000 --> 00:06:22,000
is equal to a.
That would be z = z0 + a(x - x0).

77
00:06:22,000 --> 00:06:26,000
That is how you would describe
a line that, I guess,

78
00:06:26,000 --> 00:06:30,000
the one that is plotted in
green here, the intersection with

79
00:06:30,000 --> 00:06:33,000
the slice parallel to the x,
z plane.

80
00:06:33,000 --> 00:06:40,000
I hold y constant equal to y0.
And z is a function of x that

81
00:06:40,000 --> 00:06:50,000
varies with a rate of a.
And now if I look similarly at

82
00:06:50,000 --> 00:06:55,000
the other slice,
let's say that the partial with

83
00:06:55,000 --> 00:07:00,000
respect to y is equal to b,
then I get another line which

84
00:07:00,000 --> 00:07:06,000
is obtained by the fact that z
now will depend on y.

85
00:07:06,000 --> 00:07:10,000
And the rate of change with
respect to y will be b.

86
00:07:10,000 --> 00:07:15,000
While x is held constant equal
to x0.

87
00:07:15,000 --> 00:07:19,000
These two lines are both going
to be in the tangent plane to

88
00:07:19,000 --> 00:07:20,000
the surface.

89
00:07:40,000 --> 00:07:45,000
They are both tangent to the
graph of f and together they

90
00:07:45,000 --> 00:07:47,000
determine the plane.

91
00:07:56,000 --> 00:08:08,000
And that plane is just given by
the formula z = z0 + a(x - x0) +

92
00:08:08,000 --> 00:08:13,000
b(y - y0).
If you look at what happens --

93
00:08:13,000 --> 00:08:19,000
This is the equation of a plane.
z equals constant times x plus

94
00:08:19,000 --> 00:08:24,000
constant times y plus constant.
And if you look at what happens

95
00:08:24,000 --> 00:08:28,000
if I hold y constant and vary x,
I will get the first line.

96
00:08:28,000 --> 00:08:33,000
If I hold x constant and vary
y, I get the second line.

97
00:08:33,000 --> 00:08:34,000
Another way to do it,
of course,

98
00:08:34,000 --> 00:08:37,000
would be to actually write
parametric equations of these

99
00:08:37,000 --> 00:08:40,000
lines,
get vectors along them and then

100
00:08:40,000 --> 00:08:43,000
take the cross-product to get
the normal vector to the plane.

101
00:08:43,000 --> 00:08:47,000
And then get this equation for
the plane using the normal

102
00:08:47,000 --> 00:08:49,000
vector.
That also works and it gives

103
00:08:49,000 --> 00:08:53,000
you the same formula.
If you are curious, as an

104
00:08:53,000 --> 00:08:57,000
exercise, do it again using
parametrics and using

105
00:08:57,000 --> 00:09:01,000
cross-product to get the plane
equation.

106
00:09:01,000 --> 00:09:03,000
That is how we get the tangent
plane.

107
00:09:03,000 --> 00:09:06,000
And now what this approximation
formula here says is that,

108
00:09:06,000 --> 00:09:10,000
in fact, the graph of a
function is close to the tangent

109
00:09:10,000 --> 00:09:12,000
plane.
If we were moving on the

110
00:09:12,000 --> 00:09:15,000
tangent plane,
this would be an actual

111
00:09:15,000 --> 00:09:17,000
equality.
Delta z would be a linear

112
00:09:17,000 --> 00:09:23,000
function of delta x and delta y.
And the graph of a function is

113
00:09:23,000 --> 00:09:27,000
near the tangent plane,
but is not quite the same,

114
00:09:27,000 --> 00:09:33,000
so it is only an approximation
for small delta x and small

115
00:09:33,000 --> 00:09:43,000
delta y.
The approximation formula says

116
00:09:43,000 --> 00:09:57,000
the graph of f is close to its
tangent plane.

117
00:09:57,000 --> 00:10:02,000
And we can use that formula
over here now to estimate how

118
00:10:02,000 --> 00:10:08,000
the value of f changes if I
change x and y at the same time.

119
00:10:08,000 --> 00:10:18,000
Questions about that?
Now that we have caught up with

120
00:10:18,000 --> 00:10:23,000
what we were supposed to see on
Tuesday, I can tell you now

121
00:10:23,000 --> 00:10:26,000
about max and min problems.

122
00:10:38,000 --> 00:10:48,000
That is going to be an
application of partial

123
00:10:48,000 --> 00:11:00,000
derivatives to look at
optimization problems.

124
00:11:00,000 --> 00:11:03,000
Maybe ten years from now,
when you have a real job,

125
00:11:03,000 --> 00:11:07,000
your job might be to actually
minimize the cost of something

126
00:11:07,000 --> 00:11:11,000
or maximize the profit of
something or whatever.

127
00:11:11,000 --> 00:11:14,000
But typically the function that
you will have to strive to

128
00:11:14,000 --> 00:11:18,000
minimize or maximize will depend
on several variables.

129
00:11:18,000 --> 00:11:22,000
If you have a function of one
variable, you know that to find

130
00:11:22,000 --> 00:11:26,000
its minimum or its maximum you
look at the derivative and set

131
00:11:26,000 --> 00:11:29,000
that equal to zero.
And you try to then look at

132
00:11:29,000 --> 00:11:38,000
what happens to the function.
Here it is going to be kind of

133
00:11:38,000 --> 00:11:47,000
similar, except,
of course, we have several

134
00:11:47,000 --> 00:11:51,000
derivatives.
For today we will think about a

135
00:11:51,000 --> 00:11:56,000
function of two variables,
but it works exactly the same

136
00:11:56,000 --> 00:12:00,000
if you have three variables,
ten variables,

137
00:12:00,000 --> 00:12:07,000
a million variables.
The first observation is that

138
00:12:07,000 --> 00:12:17,000
if we have a local minimum or a
local maximum then both partial

139
00:12:17,000 --> 00:12:21,000
derivatives,
so partial f partial x and

140
00:12:21,000 --> 00:12:26,000
partial f partial y,
are both zero at the same time.

141
00:12:26,000 --> 00:12:30,000
Why is that?
Well, let's say that f sub x is

142
00:12:30,000 --> 00:12:32,000
zero.
That means when I vary x to

143
00:12:32,000 --> 00:12:35,000
first order the function doesn't
change.

144
00:12:35,000 --> 00:12:37,000
Maybe that is because it is
going through...

145
00:12:37,000 --> 00:12:42,000
If I look only at the slice
parallel to the x-axis then

146
00:12:42,000 --> 00:12:45,000
maybe I am going through the
minimum.

147
00:12:45,000 --> 00:12:48,000
But if partial f,
partial y is not 0 then

148
00:12:48,000 --> 00:12:51,000
actually, by changing y,
I could still make the value

149
00:12:51,000 --> 00:12:54,000
larger or smaller.
That wouldn't be an actual

150
00:12:54,000 --> 00:12:57,000
maximum or minimum.
It would only be a maximum or

151
00:12:57,000 --> 00:13:01,000
minimum if I stay in the slice.
But if I allow myself to change

152
00:13:01,000 --> 00:13:04,000
y that doesn't work.
I need actually to know that if

153
00:13:04,000 --> 00:13:07,000
I change y the value will not
change either to first order.

154
00:13:07,000 --> 00:13:11,000
That is why you also need
partial f, partial y to be zero.

155
00:13:11,000 --> 00:13:13,000
Now, let's say that they are
both zero.

156
00:13:13,000 --> 00:13:16,000
Well, why is that enough?
It is essentially enough

157
00:13:16,000 --> 00:13:20,000
because of this formula telling
me that if both of these guys

158
00:13:20,000 --> 00:13:24,000
are zero then to first order the
function doesn't change.

159
00:13:24,000 --> 00:13:26,000
Then, of course,
there will be maybe quadratic

160
00:13:26,000 --> 00:13:28,000
terms that will actually turn
that, you know,

161
00:13:28,000 --> 00:13:31,000
this won't really say that your
function is actually constant.

162
00:13:31,000 --> 00:13:35,000
It will just tell you that
maybe it will actually be

163
00:13:35,000 --> 00:13:40,000
quadratic or higher order in
delta x and delta y.

164
00:13:40,000 --> 00:13:52,000
That is what you expect to have
at a maximum or a minimum.

165
00:13:52,000 --> 00:14:05,000
The condition is the same thing
as saying that the tangent plane

166
00:14:05,000 --> 00:14:15,000
to the graph is actually going
to be horizontal.

167
00:14:15,000 --> 00:14:18,000
And that is what you want to
have.

168
00:14:18,000 --> 00:14:23,000
Say you have a minimum,
well, the tangent plane at this

169
00:14:23,000 --> 00:14:30,000
point, at the bottom of the
graph is going to be horizontal.

170
00:14:30,000 --> 00:14:35,000
And you can see that on this
equation of a tangent plane,

171
00:14:35,000 --> 00:14:40,000
when both these coefficients
are 0 that is when the equation

172
00:14:40,000 --> 00:14:44,000
becomes z equals constant:
the horizontal plane.

173
00:14:44,000 --> 00:14:50,000
Does that make sense?
We will have a name for this

174
00:14:50,000 --> 00:14:52,000
kind of point because,
actually,

175
00:14:52,000 --> 00:14:55,000
what we will see very soon is
that these conditions are

176
00:14:55,000 --> 00:14:57,000
necessary but are not
sufficient.

177
00:14:57,000 --> 00:15:02,000
There are actually other kinds
of points where the partial

178
00:15:02,000 --> 00:15:08,000
derivatives are zero.
Let's give a name to this.

179
00:15:08,000 --> 00:15:24,000
We say the definition is (x0,
y0) is a critical point of f --

180
00:15:24,000 --> 00:15:36,000
-- if the partial derivative,
with respect to x,

181
00:15:36,000 --> 00:15:44,000
and partial derivative with
respect to y are both zero.

182
00:15:44,000 --> 00:15:50,000
Generally, you would want all
the partial derivatives,

183
00:15:50,000 --> 00:15:56,000
no matter how many variables
you have, to be zero at the same

184
00:15:56,000 --> 00:16:06,000
time.
Let's see an example.

185
00:16:06,000 --> 00:16:23,000
Let's say I give you the
function f(x, y) = x^2 - 2xy + 3y^2

186
00:16:23,000 --> 00:16:28,000
+ 2x - 2y.
And let's try to figure out

187
00:16:28,000 --> 00:16:32,000
whether we can minimize or
maximize this.

188
00:16:32,000 --> 00:16:37,000
What we would start doing
immediately is taking the

189
00:16:37,000 --> 00:16:43,000
partial derivatives.
What is f sub x?

190
00:16:43,000 --> 00:16:56,000
It starts with 2x - 2y + 0 + 2.
Remember that y is a constant

191
00:16:56,000 --> 00:17:04,000
so this differentiates to zero.
Now, if we do f sub y,

192
00:17:04,000 --> 00:17:14,000
that is going to be 0 - 2x + 6y - 2.
And what we want to do is set

193
00:17:14,000 --> 00:17:17,000
these things to zero.
And we want to solve these two

194
00:17:17,000 --> 00:17:21,000
equations at the same time.
An important thing to remember,

195
00:17:21,000 --> 00:17:23,000
and maybe I should have told
you a couple of weeks ago

196
00:17:23,000 --> 00:17:25,000
already,
if you have two equations to

197
00:17:25,000 --> 00:17:28,000
solve, well,
it is very good to try to

198
00:17:28,000 --> 00:17:30,000
simplify them by adding them
together or whatever,

199
00:17:30,000 --> 00:17:33,000
but you must keep two equations.
If you have two equations,

200
00:17:33,000 --> 00:17:37,000
you shouldn't end up with just
one equation out of nowhere.

201
00:17:37,000 --> 00:17:40,000
For example here,
we can certainly simplify

202
00:17:40,000 --> 00:17:46,000
things by summing them together.
If we add them together,

203
00:17:46,000 --> 00:17:52,000
well, the x's cancel and the
constants cancel.

204
00:17:52,000 --> 00:17:56,000
In fact, we are just left with
4y = 0.

205
00:17:56,000 --> 00:18:00,000
That is pretty good.
That tells us y should be zero.

206
00:18:00,000 --> 00:18:02,000
But then we should,
of course, go back to these and

207
00:18:02,000 --> 00:18:07,000
see what else we know.
Well, now it tells us,

208
00:18:07,000 --> 00:18:14,000
if you put y = 0 it tells you
2x + 2 = 0.

209
00:18:14,000 --> 00:18:26,000
That tells you x = - 1.
We have one critical point that

210
00:18:26,000 --> 00:18:33,000
is (x, y) =
(-1, 0).

211
00:18:33,000 --> 00:18:39,000
Any questions so far?
No.

212
00:18:39,000 --> 00:18:40,000
Well, you should have a
question.

213
00:18:40,000 --> 00:18:49,000
The question should be how do
we know if it is a maximum or a

214
00:18:49,000 --> 00:18:53,000
minimum?
Yeah.

215
00:18:53,000 --> 00:18:55,000
If we had a function of one
variable, we would decide things

216
00:18:55,000 --> 00:18:58,000
based on the second derivative.
And, in fact,

217
00:18:58,000 --> 00:19:00,000
we will see tomorrow how to do
things based on the second

218
00:19:00,000 --> 00:19:03,000
derivative.
But that is kind of tricky

219
00:19:03,000 --> 00:19:06,000
because there are a lot of
second derivatives.

220
00:19:06,000 --> 00:19:09,000
I mean we already have two
first derivatives.

221
00:19:09,000 --> 00:19:14,000
You can imagine that if you
keep taking partials you may end

222
00:19:14,000 --> 00:19:17,000
up with more and more,
so we will have to figure out

223
00:19:17,000 --> 00:19:19,000
carefully what the condition
should be.

224
00:19:19,000 --> 00:19:27,000
We will do that tomorrow.
For now, let's just try to look

225
00:19:27,000 --> 00:19:38,000
a bit at how do we understand
these things by hand?

226
00:19:38,000 --> 00:19:42,000
In fact, let me point out to
you immediately that there is

227
00:19:42,000 --> 00:19:49,000
more than maxima and minima.
Remember, we saw the example of

228
00:19:49,000 --> 00:19:52,000
x^2 + y^2.
That has a critical point.

229
00:19:52,000 --> 00:19:56,000
That critical point is
obviously a minimum.

230
00:19:56,000 --> 00:19:58,000
And, of course,
it could be a local minimum

231
00:19:58,000 --> 00:20:01,000
because it could be that if you
have a more complicated function

232
00:20:01,000 --> 00:20:04,000
there is indeed a minimum here,
but then elsewhere the function

233
00:20:04,000 --> 00:20:08,000
drops to a lower value.
We call that just a local

234
00:20:08,000 --> 00:20:12,000
minimum to say that it is a
minimum if you stick to values

235
00:20:12,000 --> 00:20:15,000
that are close enough to that
point.

236
00:20:15,000 --> 00:20:19,000
Of course, you also have local
maximum, which I didn't plot,

237
00:20:19,000 --> 00:20:23,000
but it is easy to plot.
That is a local maximum.

238
00:20:23,000 --> 00:20:27,000
But there is a third example of
critical point,

239
00:20:27,000 --> 00:20:31,000
and that is a saddle point.
The saddle point,

240
00:20:31,000 --> 00:20:35,000
it is a new phenomenon that you
don't really see in single

241
00:20:35,000 --> 00:20:38,000
variable calculus.
It is a critical point that is

242
00:20:38,000 --> 00:20:42,000
neither a minimum nor a maximum
because, depending on which

243
00:20:42,000 --> 00:20:46,000
direction you look in,
it's either one or the other.

244
00:20:46,000 --> 00:20:50,000
See the point in the middle,
at the origin,

245
00:20:50,000 --> 00:20:55,000
is a saddle point.
If you look at the tangent

246
00:20:55,000 --> 00:20:58,000
plane to this graph,
you will see that it is

247
00:20:58,000 --> 00:21:01,000
actually horizontal at the
origin.

248
00:21:01,000 --> 00:21:05,000
You have this mountain pass
where the ground is horizontal.

249
00:21:05,000 --> 00:21:08,000
But, depending on which
direction you go,

250
00:21:08,000 --> 00:21:12,000
you go up or down.
So, we say that a point is a

251
00:21:12,000 --> 00:21:16,000
saddle point if it is neither a
minimum nor a maximum.

252
00:21:30,000 --> 00:21:38,000
Possibilities could be a local
min, a local max or a saddle.

253
00:21:38,000 --> 00:21:42,000
Tomorrow we will see how to
decide which one it is,

254
00:21:42,000 --> 00:21:46,000
in general, using second
derivatives.

255
00:21:46,000 --> 00:21:50,000
For this time,
let's just try to do it by

256
00:21:50,000 --> 00:21:53,000
hand.
I just want to observe,

257
00:21:53,000 --> 00:21:57,000
in fact, I can try to,
you know,

258
00:21:57,000 --> 00:21:58,000
these examples that I have
here,

259
00:21:58,000 --> 00:22:02,000
they are x^2 + y^2, y^2 - x^2,
they are sums or differences of

260
00:22:02,000 --> 00:22:05,000
squares.
And, if we know that we can put

261
00:22:05,000 --> 00:22:08,000
things as sum of squares for
example, we will be done.

262
00:22:08,000 --> 00:22:16,000
Let's try to express this maybe
as the square of something.

263
00:22:16,000 --> 00:22:21,000
The main problem is this 2xy.
Observe we know something that

264
00:22:21,000 --> 00:22:26,000
starts with x^2 - 2xy but is
actually a square of something

265
00:22:26,000 --> 00:22:32,000
else.
It would be x^2 - 2xy + y^2,

266
00:22:32,000 --> 00:22:37,000
not plus 3y^2.
Let's try that.

267
00:22:37,000 --> 00:22:48,000
So, we are going to complete
the square.

268
00:22:48,000 --> 00:22:53,000
I am going to say it is x minus
y squared, so it gives me the

269
00:22:53,000 --> 00:23:01,000
first two terms and also the y^2.
Well, I still need to add two

270
00:23:01,000 --> 00:23:09,000
more y^2, and I also need to
add, of course,

271
00:23:09,000 --> 00:23:15,000
the 2x and - 2y.
It is still not simple enough

272
00:23:15,000 --> 00:23:19,000
for my taste.
I can actually do better.

273
00:23:19,000 --> 00:23:24,000
These guys look like a sum of
squares, but here I have this

274
00:23:24,000 --> 00:23:28,000
extra stuff, 2x - 2y.
Well, that is 2 (x - y).

275
00:23:28,000 --> 00:23:32,000
It looks like maybe we can
modify this and make this into

276
00:23:32,000 --> 00:23:36,000
another square.
So, in fact,

277
00:23:36,000 --> 00:23:45,000
I can simplify this further to
(x - y + 1)^2.

278
00:23:45,000 --> 00:23:51,000
That would be (x - y)^2 + 2(x -
y), and then there is a plus

279
00:23:51,000 --> 00:23:55,000
one.
Well, we don't have a plus one

280
00:23:55,000 --> 00:24:00,000
so let's remove it by
subtracting one.

281
00:24:00,000 --> 00:24:07,000
And I still have my 2y^2.
Do you see why this is the same

282
00:24:07,000 --> 00:24:13,000
function?
Yeah.

283
00:24:13,000 --> 00:24:19,000
Again, if I expand x minus y
plus one squared,

284
00:24:19,000 --> 00:24:28,000
I get (x - y)^2 + 2(x - y) + 1.
But I will have minus one that

285
00:24:28,000 --> 00:24:34,000
will cancel out and then I have
a plus 2y^2.

286
00:24:34,000 --> 00:24:41,000
Now, what I know is a sum of
two squares minus one.

287
00:24:41,000 --> 00:24:44,000
And this critical point,
(x, y) = (-1, 0),

288
00:24:44,000 --> 00:24:49,000
that is actually when this is
zero and that is zero,

289
00:24:49,000 --> 00:24:55,000
so that is the smallest value.
This is always greater or equal

290
00:24:55,000 --> 00:25:00,000
to zero, the same with that one,
so that is always at least

291
00:25:00,000 --> 00:25:03,000
minus one.
And minus one happens to be the

292
00:25:03,000 --> 00:25:13,000
value at the critical point.
So, it is a minimum.

293
00:25:13,000 --> 00:25:16,000
Now, of course here I was very
lucky.

294
00:25:16,000 --> 00:25:19,000
I mean, generally,
I couldn't expect things to

295
00:25:19,000 --> 00:25:21,000
simplify that much.
In fact, I cheated.

296
00:25:21,000 --> 00:25:26,000
I started from that,
I expanded, and then that is

297
00:25:26,000 --> 00:25:30,000
how I got my example.
The general method will be a

298
00:25:30,000 --> 00:25:32,000
bit different,
but you will see it will

299
00:25:32,000 --> 00:25:34,000
actually also involve completing
squares.

300
00:25:34,000 --> 00:25:42,000
Just there is more to it than
what we have seen.

301
00:25:42,000 --> 00:25:48,000
We will come back to this
tomorrow.

302
00:25:48,000 --> 00:25:56,000
Sorry?
How do I know that this equals

303
00:25:56,000 --> 00:26:09,000
-- How do I know that the whole
function is greater or equal to

304
00:26:09,000 --> 00:26:15,000
negative one?
Well, I wrote f of x,

305
00:26:15,000 --> 00:26:20,000
y as something squared plus
2y^2 - 1.

306
00:26:20,000 --> 00:26:25,000
This squared is always a
positive number and not a

307
00:26:25,000 --> 00:26:27,000
negative.
It is a square.

308
00:26:27,000 --> 00:26:30,000
The square of something is
always non-negative.

309
00:26:30,000 --> 00:26:34,000
Similarly, y^2 is also always
non-negative.

310
00:26:34,000 --> 00:26:38,000
So if you add something that is
at least zero plus something

311
00:26:38,000 --> 00:26:40,000
that is at least zero and you
subtract one,

312
00:26:40,000 --> 00:26:43,000
you get always at least minus
one.

313
00:26:43,000 --> 00:26:48,000
And, in fact,
the only way you can get minus

314
00:26:48,000 --> 00:26:54,000
one is if both of these guys are
zero at the same time.

315
00:26:54,000 --> 00:27:17,000
That is how I get my minimum.
More about this tomorrow.

316
00:27:17,000 --> 00:27:20,000
In fact,
what I would like to tell you

317
00:27:20,000 --> 00:27:23,000
about now instead is a nice
application of min,

318
00:27:23,000 --> 00:27:27,000
max problems that maybe you
don't think of as a min,

319
00:27:27,000 --> 00:27:31,000
max problem that you will see.
I mean you won't think of it

320
00:27:31,000 --> 00:27:35,000
that way because probably your
calculator can do it for you or,

321
00:27:35,000 --> 00:27:37,000
if not, your computer can do it
for you.

322
00:27:37,000 --> 00:27:42,000
But it is actually something
where the theory is based on

323
00:27:42,000 --> 00:27:47,000
minimization in two variables.
Very often in experimental

324
00:27:47,000 --> 00:27:52,000
sciences you have to do
something called least-squares

325
00:27:52,000 --> 00:28:01,000
interpolation.
And what is that about?

326
00:28:01,000 --> 00:28:07,000
Well, it is the idea that maybe
you do some experiments and you

327
00:28:07,000 --> 00:28:11,000
record some data.
You have some data x and some

328
00:28:11,000 --> 00:28:13,000
data y.
And, I don't know,

329
00:28:13,000 --> 00:28:17,000
maybe, for example,
x is -- Maybe you're measuring

330
00:28:17,000 --> 00:28:21,000
frogs and you're trying to
measure how big the frog leg is

331
00:28:21,000 --> 00:28:23,000
compared to the eyes of the
frog,

332
00:28:23,000 --> 00:28:26,000
or you're trying to measure
something.

333
00:28:26,000 --> 00:28:30,000
And if you are doing chemistry
then it could be how much you

334
00:28:30,000 --> 00:28:35,000
put of some reactant and how
much of the output product that

335
00:28:35,000 --> 00:28:37,000
you wanted to synthesize
generated.

336
00:28:37,000 --> 00:28:43,000
All sorts of things.
Make up your own example.

337
00:28:43,000 --> 00:28:46,000
You measure basically,
for various values of x,

338
00:28:46,000 --> 00:28:48,000
what the value of y ends up
being.

339
00:28:48,000 --> 00:28:52,000
And then you like to claim
these points are kind of

340
00:28:52,000 --> 00:28:53,000
aligned.
And, of course,

341
00:28:53,000 --> 00:28:55,000
to a mathematician they are not
aligned.

342
00:28:55,000 --> 00:28:57,000
But, to an experimental
scientist, that is evidence that

343
00:28:57,000 --> 00:29:00,000
there is a relation between the
two.

344
00:29:00,000 --> 00:29:03,000
And so you want to claim -- And
in your paper you will actually

345
00:29:03,000 --> 00:29:05,000
draw a nice little line like
that.

346
00:29:05,000 --> 00:29:10,000
The functions depend linearly
on each other.

347
00:29:10,000 --> 00:29:15,000
The question is how do we come
up with that nice line that

348
00:29:15,000 --> 00:29:19,000
passes smack in the middle of
the points?

349
00:29:19,000 --> 00:29:27,000
The question is,
given experimental data xi,

350
00:29:27,000 --> 00:29:36,000
yi -- Maybe I should actually
be more precise.

351
00:29:36,000 --> 00:29:37,000
You are given some experimental
data.

352
00:29:37,000 --> 00:29:45,000
You have data points x1,
y1, x2, y2 and so on,

353
00:29:45,000 --> 00:29:52,000
xn, yn,
the question would be find the

354
00:29:52,000 --> 00:30:00,000
"best fit"
line of the form y = ax + b

355
00:30:00,000 --> 00:30:08,000
that somehow approximates very
well this data.

356
00:30:08,000 --> 00:30:11,000
You can also use that right
away to predict various things.

357
00:30:11,000 --> 00:30:13,000
For example,
if you look at your new

358
00:30:13,000 --> 00:30:17,000
homework,
actually the first problem asks

359
00:30:17,000 --> 00:30:22,000
you to predict how many iPods
will be on this planet in ten

360
00:30:22,000 --> 00:30:28,000
years looking at past sales and
how they behave.

361
00:30:28,000 --> 00:30:31,000
One thing, right away,
before you lose all the money

362
00:30:31,000 --> 00:30:35,000
that you don't have yet,
you cannot use that to predict

363
00:30:35,000 --> 00:30:39,000
the stock market.
So, don't try to use that to

364
00:30:39,000 --> 00:30:52,000
make money.
It doesn't work.

365
00:30:52,000 --> 00:30:58,000
One tricky thing here that I
want to draw your attention to

366
00:30:58,000 --> 00:31:02,000
is what are the unknowns here?
The natural answer would be to

367
00:31:02,000 --> 00:31:03,000
say that the unknowns are x and
y.

368
00:31:03,000 --> 00:31:07,000
That is not actually the case.
We are not going to solve for

369
00:31:07,000 --> 00:31:09,000
some x and y.
I mean we have some values

370
00:31:09,000 --> 00:31:12,000
given to us.
And, when we are looking for

371
00:31:12,000 --> 00:31:16,000
that line, we don't really care
about the perfect value of x.

372
00:31:16,000 --> 00:31:21,000
What we care about is actually
these coefficients a and b that

373
00:31:21,000 --> 00:31:26,000
will tell us what the relation
is between x and y.

374
00:31:26,000 --> 00:31:30,000
In fact, we are trying to solve
for a and b that will give us

375
00:31:30,000 --> 00:31:34,000
the nicest possible line for
these points.

376
00:31:34,000 --> 00:31:36,000
The unknowns,
in our equations,

377
00:31:36,000 --> 00:31:39,000
will have to be a and b,
not x and y.

378
00:32:11,000 --> 00:32:20,000
The question really is find the
"best"

379
00:32:20,000 --> 00:32:23,000
a and b.
And, of course,

380
00:32:23,000 --> 00:32:26,000
we have to decide what we mean
by best.

381
00:32:26,000 --> 00:32:30,000
Best will mean that we minimize
some function of a and b that

382
00:32:30,000 --> 00:32:34,000
measures the total errors that
we are making when we are

383
00:32:34,000 --> 00:32:38,000
choosing this line compared to
the experimental data.

384
00:32:38,000 --> 00:32:43,000
Maybe, roughly speaking,
it should measure how far these

385
00:32:43,000 --> 00:32:49,000
points are from the line.
But now there are various ways

386
00:32:49,000 --> 00:32:52,000
to do it.
And a lot of them are valid, but

387
00:32:52,000 --> 00:32:57,000
they give you different answers.
You have to decide what it is

388
00:32:57,000 --> 00:32:59,000
that you prefer.
For example,

389
00:32:59,000 --> 00:33:04,000
you could measure the distance
to the line by projecting

390
00:33:04,000 --> 00:33:08,000
perpendicularly.
Or you could measure instead,

391
00:33:08,000 --> 00:33:13,000
for a given value of x,
the difference between the

392
00:33:13,000 --> 00:33:17,000
experimental value of y and the
predicted one.

393
00:33:17,000 --> 00:33:21,000
And that is often more relevant
because these guys actually may

394
00:33:21,000 --> 00:33:25,000
be expressed in different units.
They are not the same type of

395
00:33:25,000 --> 00:33:29,000
quantity.
You cannot actually combine

396
00:33:29,000 --> 00:33:32,000
them arbitrarily.
Anyway, the convention is

397
00:33:32,000 --> 00:33:34,000
usually we measure distance in
this way.

398
00:33:34,000 --> 00:33:38,000
Next, you could try to minimize
the largest distance.

399
00:33:38,000 --> 00:33:42,000
Say we look at who has the
largest error and we make that

400
00:33:42,000 --> 00:33:44,000
the smallest possible.
The drawback of doing that is

401
00:33:44,000 --> 00:33:47,000
experimentally very often you
have one data point that is not

402
00:33:47,000 --> 00:33:50,000
good because maybe you fell
asleep in front of the

403
00:33:50,000 --> 00:33:53,000
experiment.
And so you didn't measure the

404
00:33:53,000 --> 00:33:55,000
right thing.
You tend to want to not give

405
00:33:55,000 --> 00:33:59,000
too much importance to some data
point that is far away from the

406
00:33:59,000 --> 00:34:02,000
others.
Maybe instead you want to

407
00:34:02,000 --> 00:34:06,000
measure the average distance or
maybe you want to actually give

408
00:34:06,000 --> 00:34:09,000
more weight to things that are
further away.

409
00:34:09,000 --> 00:34:12,000
And then you don't want to do
the distance but the square of

410
00:34:12,000 --> 00:34:14,000
the distance.
There are various possible

411
00:34:14,000 --> 00:34:18,000
answers, but one of them gives
us actually a particularly nice

412
00:34:18,000 --> 00:34:22,000
formula for a and b.
And so that is why it is the

413
00:34:22,000 --> 00:34:27,000
universally used one.
Here it says least squares.

414
00:34:27,000 --> 00:34:31,000
That's because we will measure,
actually, the sum of the

415
00:34:31,000 --> 00:34:35,000
squares of the errors.
And why do we do that?

416
00:34:35,000 --> 00:34:37,000
Well, part of it is because it
looks good.

417
00:34:37,000 --> 00:34:42,000
When you see this plot in
scientific papers they really

418
00:34:42,000 --> 00:34:46,000
look like the line is indeed the
ideal line.

419
00:34:46,000 --> 00:34:49,000
And the second reason is
because actually the

420
00:34:49,000 --> 00:34:52,000
minimization problem that we
will get is particularly simple,

421
00:34:52,000 --> 00:34:57,000
well-posed and easy to solve.
So we will have a nice formula

422
00:34:57,000 --> 00:35:03,000
for the best a and the best b.
If you have a method that is

423
00:35:03,000 --> 00:35:07,000
simple and gives you a good
answer then that is probably

424
00:35:07,000 --> 00:35:09,000
good.
We have to define best.

425
00:35:09,000 --> 00:35:22,000
Here it is in the sense of
minimizing the total square

426
00:35:22,000 --> 00:35:29,000
error.
Or maybe I should say total

427
00:35:29,000 --> 00:35:35,000
square deviation instead.
What do I mean by this?

428
00:35:35,000 --> 00:35:44,000
The deviation for each data
point is the difference between

429
00:35:44,000 --> 00:35:52,000
what you have measured and what
you are predicting by your

430
00:35:52,000 --> 00:36:00,000
model.
That is the difference between

431
00:36:00,000 --> 00:36:11,000
yi and axi plus b.
Now, what we will do is try to

432
00:36:11,000 --> 00:36:25,000
minimize the function capital D,
which is just the sum for all

433
00:36:25,000 --> 00:36:36,000
the data points of the square of
the deviation.
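Written out as a formula, the quantity being minimized looks like this (a sketch; the index range from 1 to n for the n data points is an assumed notation, since the lecture only says "all the data points"):
```latex
D(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2
```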

434
00:36:36,000 --> 00:36:40,000
Let me go over this again.
This is a function of a and b.

435
00:36:40,000 --> 00:36:43,000
Of course there are a lot of
letters in here,

436
00:36:43,000 --> 00:36:46,000
but xi and yi in real life
there will be numbers given to

437
00:36:46,000 --> 00:36:48,000
you.
There will be numbers that you

438
00:36:48,000 --> 00:36:51,000
have measured.
You have measured all of this

439
00:36:51,000 --> 00:36:53,000
data.
They are just going to be

440
00:36:53,000 --> 00:36:58,000
numbers.
You put them in there and you

441
00:36:58,000 --> 00:37:04,000
get a function of a and b.
Any questions?

442
00:37:16,000 --> 00:37:20,000
How do we minimize this
function of a and b?

443
00:37:20,000 --> 00:37:27,000
Well, let's use your knowledge.
Let's actually look for a

444
00:37:27,000 --> 00:37:34,000
critical point.
We want to solve for partial D

445
00:37:34,000 --> 00:37:42,000
over partial a = 0,
partial D over partial b = 0.

446
00:37:42,000 --> 00:37:48,000
That is how we look for
critical points.

447
00:37:48,000 --> 00:37:52,000
Let's take the derivative of
this with respect to a.

448
00:37:52,000 --> 00:37:59,000
Well, the derivative of a sum
is sum of the derivatives.

449
00:37:59,000 --> 00:38:04,000
And now we have to take the
derivative of this quantity

450
00:38:04,000 --> 00:38:07,000
squared.
Remember, we take the

451
00:38:07,000 --> 00:38:11,000
derivative of the square.
We take twice this quantity

452
00:38:11,000 --> 00:38:15,000
times the derivative of what we
are squaring.

453
00:38:15,000 --> 00:38:26,000
We will get 2(yi - axi - b) times
the derivative of this with

454
00:38:26,000 --> 00:38:30,000
respect to a.
What is the derivative of this

455
00:38:30,000 --> 00:38:35,000
with respect to a?
Negative xi, exactly.

456
00:38:35,000 --> 00:38:38,000
And so we will want this to be
0.

457
00:38:38,000 --> 00:38:41,000
And partial D over partial b,
we do the same thing,

458
00:38:41,000 --> 00:38:45,000
but differentiating with
respect to b instead of with

459
00:38:45,000 --> 00:38:50,000
respect to a.
Again, the sum of twice

460
00:38:50,000 --> 00:38:58,000
yi minus axi minus b times the
derivative of this with respect

461
00:38:58,000 --> 00:39:02,000
to b is, I think,
negative one.
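In symbols, the two critical-point equations just described can be written as follows (a sketch using the chain rule on each squared term; the index range 1 to n is an assumed notation):
```latex
\begin{aligned}
\frac{\partial D}{\partial a} &= \sum_{i=1}^{n} 2\,(y_i - a x_i - b)\,(-x_i) = 0 \\
\frac{\partial D}{\partial b} &= \sum_{i=1}^{n} 2\,(y_i - a x_i - b)\,(-1) = 0
\end{aligned}
```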

462
00:39:02,000 --> 00:39:07,000
Those are the equations we have
to solve.

463
00:39:07,000 --> 00:39:10,000
Well, let's reorganize this a
little bit.

464
00:39:24,000 --> 00:39:32,000
The first equation.
See, there are a's and there

465
00:39:32,000 --> 00:39:36,000
are b's in these equations.
I am going to just look at the

466
00:39:36,000 --> 00:39:39,000
coefficients of a and b.
If you have good eyes,

467
00:39:39,000 --> 00:39:42,000
you can see probably that these
are actually linear equations in

468
00:39:42,000 --> 00:39:45,000
a and b.
There is a lot of clutter with

469
00:39:45,000 --> 00:39:47,000
all these x's and y's all over
the place.

470
00:39:47,000 --> 00:39:55,000
Let's actually try to expand
things and make that more

471
00:39:55,000 --> 00:39:59,000
apparent.
The first thing I will do is

472
00:39:59,000 --> 00:40:02,000
actually get rid of these
factors of two.

473
00:40:02,000 --> 00:40:05,000
They are just not very
important.

474
00:40:05,000 --> 00:40:10,000
I can simplify things.
Next, I am going to look at the

475
00:40:10,000 --> 00:40:15,000
coefficient of a.
I will get basically a times xi

476
00:40:15,000 --> 00:40:24,000
squared.
Let me just do it and it should be

477
00:40:24,000 --> 00:40:33,000
clear.
I claim when we simplify this

478
00:40:33,000 --> 00:40:46,000
we get xi squared times a plus
xi times b minus xiyi.

479
00:40:46,000 --> 00:40:53,000
And we set this equal to zero.
Do you agree that this is what

480
00:40:53,000 --> 00:40:57,000
we get when we expand that
product?

481
00:40:57,000 --> 00:41:03,000
Yeah. Kind of?
OK. Let's do the other one.

482
00:41:03,000 --> 00:41:08,000
We just multiply by minus one,
so we take the opposite of that

483
00:41:08,000 --> 00:41:19,000
which would be axi plus b.
I will write that as xia plus b

484
00:41:19,000 --> 00:41:25,000
minus yi.
Sorry. I forgot the n here.

485
00:41:25,000 --> 00:41:30,000
And let me just reorganize that
by actually putting all the a's

486
00:41:30,000 --> 00:41:34,000
together.
That means I will have sum of

487
00:41:34,000 --> 00:41:40,000
all the xi^2 times a plus sum of
xi times b minus sum of xiyi equal to

488
00:41:40,000 --> 00:41:41,000
zero.

489
00:42:08,000 --> 00:42:15,000
If I rewrite this,
it becomes sum of xi^2 times a

490
00:42:15,000 --> 00:42:24,000
plus sum of the xi's times b,
and let me move the other guys

491
00:42:24,000 --> 00:42:30,000
to the other side,
equals sum of xiyi.

492
00:42:30,000 --> 00:42:37,000
And that one becomes sum of xi
times a.

493
00:42:37,000 --> 00:42:41,000
Plus how many b's do I get on
this one?

494
00:42:41,000 --> 00:42:45,000
I get one for each data point.
When I sum them together,

495
00:42:45,000 --> 00:42:48,000
I will get n.
Very good.

496
00:42:48,000 --> 00:42:56,000
N times b equals sum of yi.
Now, these quantities look

497
00:42:56,000 --> 00:42:58,000
scary, but they are actually
just numbers.

498
00:42:58,000 --> 00:43:01,000
For example,
this one, you look at all your

499
00:43:01,000 --> 00:43:05,000
data points.
For each of them you take the

500
00:43:05,000 --> 00:43:10,000
value of x and you just sum all
these numbers together.

501
00:43:10,000 --> 00:43:19,000
What you get,
actually, is a linear system in

502
00:43:19,000 --> 00:43:26,000
a and b, a two by two linear
system.
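Collected in one place, the two by two system reads as follows (a sketch; every sum runs over the n data points, and writing the sums without limits is an assumed shorthand):
```latex
\begin{aligned}
\Bigl(\sum x_i^2\Bigr)\, a + \Bigl(\sum x_i\Bigr)\, b &= \sum x_i y_i \\
\Bigl(\sum x_i\Bigr)\, a + n\, b &= \sum y_i
\end{aligned}
```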

503
00:43:26,000 --> 00:43:32,000
And so now we can solve this
for a and b.

504
00:43:32,000 --> 00:43:35,000
In practice,
of course, first you plug in

505
00:43:35,000 --> 00:43:40,000
the numbers for xi and yi and
then you solve the system that

506
00:43:40,000 --> 00:43:44,000
you get.
And we know how to solve two by

507
00:43:44,000 --> 00:43:46,000
two linear systems,
I hope.

508
00:43:46,000 --> 00:43:50,000
That's how we find the best fit
line.
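As a rough illustration, here is a minimal Python sketch of that recipe: plug in the numbers, then solve the two by two system. The function name fit_line and the use of Cramer's rule are assumptions made for this example, not something from the lecture.
```python
def fit_line(xs, ys):
    """Least-squares line y = a*x + b (hypothetical helper for illustration)."""
    n = len(xs)
    sx = sum(xs)                               # sum of xi
    sy = sum(ys)                               # sum of yi
    sxx = sum(x * x for x in xs)               # sum of xi^2
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of xi*yi
    # The system is:  sxx*a + sx*b = sxy   and   sx*a + n*b = sy
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("all x values are equal, so no unique line exists")
    # Cramer's rule for a two by two linear system.
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b
# Example: points lying exactly on y = 2x + 1.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```
When the points lie exactly on a line, the fit recovers that line; with noisy data it returns the critical point of D.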

509
00:43:50,000 --> 00:43:54,000
Now, why is that going to be
the best one instead of the

510
00:43:54,000 --> 00:43:56,000
worst one?
We just solved for a critical

511
00:43:56,000 --> 00:43:58,000
point.
That could actually be a

512
00:43:58,000 --> 00:44:01,000
maximum of this error function
D.

513
00:44:01,000 --> 00:44:05,000
We will have the answer to that
next time, but trust me.

514
00:44:05,000 --> 00:44:08,000
If you really want to go over
the second derivative test that

515
00:44:08,000 --> 00:44:11,000
we will see tomorrow and apply
it in this case,

516
00:44:11,000 --> 00:44:14,000
it is quite hard to check,
but you can see it is actually

517
00:44:14,000 --> 00:44:28,000
a minimum.
I will just say we can

518
00:44:28,000 --> 00:44:42,000
show that it is a minimum.
Now, even though the linear

519
00:44:42,000 --> 00:44:47,000
case is the one that we are the
most familiar with,

520
00:44:47,000 --> 00:44:56,000
least-squares interpolation
actually works in much more

521
00:44:56,000 --> 00:45:03,000
general settings.
Because instead of fitting for

522
00:45:03,000 --> 00:45:06,000
the best line,
if you think it has a different

523
00:45:06,000 --> 00:45:10,000
kind of relation then maybe you
can fit in using a different

524
00:45:10,000 --> 00:45:14,000
kind of formula.
Let me actually illustrate that

525
00:45:14,000 --> 00:45:17,000
with an example.
I don't know if you are

526
00:45:17,000 --> 00:45:21,000
familiar with Moore's law.
It is something that is

527
00:45:21,000 --> 00:45:24,000
supposed to tell you how quickly
basically computer chips become

528
00:45:24,000 --> 00:45:27,000
smarter faster and faster all
the time.

529
00:45:27,000 --> 00:45:31,000
It's a law that says things
about the number of transistors

530
00:45:31,000 --> 00:45:33,000
that you can fit onto a computer
chip.

531
00:45:33,000 --> 00:45:45,000
Here I have some data about --
Here is data about the number of

532
00:45:45,000 --> 00:45:58,000
transistors on a standard PC
processor as a function of time.

533
00:45:58,000 --> 00:46:01,000
And if you try to do a
best-line fit,

534
00:46:01,000 --> 00:46:07,000
well, it doesn't seem to follow
a linear trend.

535
00:46:07,000 --> 00:46:11,000
On the other hand,
if you plot the diagram in the

536
00:46:11,000 --> 00:46:13,000
log scale,
the log of the number of

537
00:46:13,000 --> 00:46:15,000
transistors as a function of
time,

538
00:46:15,000 --> 00:46:21,000
then you get a much better line.
And so, in fact,

539
00:46:21,000 --> 00:46:26,000
that means that you had an
exponential relation between the

540
00:46:26,000 --> 00:46:30,000
number of transistors and time.
And so, actually that's what

541
00:46:30,000 --> 00:46:32,000
Moore's law says.
It says that the number of

542
00:46:32,000 --> 00:46:36,000
transistors in the chip doubles
every 18 months or every two

543
00:46:36,000 --> 00:46:40,000
years.
They keep changing the

544
00:46:40,000 --> 00:46:49,000
statement.
How do we find the best

545
00:46:49,000 --> 00:46:58,000
exponential fit?
Well, an exponential fit would

546
00:46:58,000 --> 00:47:05,000
be something of a form y equals
a constant times exponential of

547
00:47:05,000 --> 00:47:09,000
a times x.
That is what we want to look at.

548
00:47:09,000 --> 00:47:13,000
Well, we could try to minimize
a square error like we did

549
00:47:13,000 --> 00:47:16,000
before.
That doesn't work well at all.

550
00:47:16,000 --> 00:47:18,000
The equations that you get are
very complicated.

551
00:47:18,000 --> 00:47:24,000
You cannot solve them.
But remember what I showed you

552
00:47:24,000 --> 00:47:28,000
on this log plot.
If you plot the log of y as a

553
00:47:28,000 --> 00:47:33,000
function of x then suddenly it
becomes a linear relation.

554
00:47:33,000 --> 00:47:43,000
Observe, this is the same as ln
of y equals ln of c plus ax.

555
00:47:43,000 --> 00:47:55,000
And that is the linear best fit.
What you do is you just look

556
00:47:55,000 --> 00:48:08,000
for the best straight line fit
for the log of y.

557
00:48:08,000 --> 00:48:10,000
That is something we already
know.
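A minimal Python sketch of this trick, assuming every y value is positive so that ln of y is defined. The name fit_exponential is hypothetical. Note that it minimizes the total square deviation of ln y, not of y itself.
```python
import math
def fit_exponential(xs, ys):
    """Fit y = c * exp(a*x) through a least-squares line for (x, ln y)."""
    if any(y <= 0 for y in ys):
        raise ValueError("ln(y) needs every y to be positive")
    ls = [math.log(y) for y in ys]             # ln(yi) plays the role of yi
    n = len(xs)
    sx = sum(xs)
    sl = sum(ls)
    sxx = sum(x * x for x in xs)
    sxl = sum(x * l for x, l in zip(xs, ls))
    # Same two by two system as for the line: slope a, intercept ln(c).
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("all x values are equal, so no unique fit exists")
    a = (sxl * n - sx * sl) / det
    ln_c = (sxx * sl - sx * sxl) / det
    return a, math.exp(ln_c)
# Example: made-up data that doubles at every step, y = 3 * 2^x.
a, c = fit_exponential([0, 1, 2, 3], [3, 6, 12, 24])
print(a, c)
```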

558
00:48:10,000 --> 00:48:12,000
But you can also do,
for example,

559
00:48:12,000 --> 00:48:16,000
let's say that we have
something more complicated.

560
00:48:16,000 --> 00:48:21,000
Let's say that we have actually
a quadratic law.

561
00:48:21,000 --> 00:48:27,000
For example,
y is of the form ax^2 + bx + c.

562
00:48:27,000 --> 00:48:31,000
And, of course,
you are trying to find somehow

563
00:48:31,000 --> 00:48:34,000
the best.
That would mean here fitting

564
00:48:34,000 --> 00:48:37,000
the best parabola for your data
points.

565
00:48:37,000 --> 00:48:40,000
Well, to do that,
you would need to find a,

566
00:48:40,000 --> 00:48:45,000
b and c.
And now you will have actually

567
00:48:45,000 --> 00:48:51,000
a function of a,
b and c, which would be the sum

568
00:48:51,000 --> 00:48:57,000
of all the data points of the
square deviation.

569
00:48:57,000 --> 00:49:01,000
And, if you try to solve for
critical points,

570
00:49:01,000 --> 00:49:03,000
now you will have three
equations involving a,

571
00:49:03,000 --> 00:49:05,000
b and c,
in fact, you will find a three

572
00:49:05,000 --> 00:49:09,000
by three linear system.
And it works the same way.

573
00:49:09,000 --> 00:49:14,000
Just you have a little bit more
data.
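A minimal Python sketch of the three by three case, under the assumption that setting the three partial derivatives of D to zero gives the system shown in the comments. The name fit_parabola, the elimination method and the 1e-12 pivot threshold are choices made for this example, not the lecture's.
```python
def fit_parabola(xs, ys):
    """Least-squares parabola y = a*x^2 + b*x + c (hypothetical helper)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    # Augmented matrix of the critical-point equations:
    #   s4*a + s3*b + s2*c = t2
    #   s3*a + s2*b + s1*c = t1
    #   s2*a + s1*b + n*c  = t0
    m = [[s4, s3, s2, t2], [s3, s2, s1, t1], [s2, s1, n, t0]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))
# Example: points lying exactly on y = 2x^2 - 3x + 1.
a, b, c = fit_parabola([-1, 0, 1, 2], [6, 1, 0, 3])
print(a, b, c)
```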

574
00:49:14,000 --> 00:49:19,000
Basically, you see that these
best fit problems are an example

575
00:49:19,000 --> 00:49:24,000
of a place where, maybe,
you didn't expect to see

576
00:49:24,000 --> 00:49:30,000
minimization problems come in.
But that is really the way to

577
00:49:30,000 --> 00:49:34,000
handle these questions.
Tomorrow we will go back to the

578
00:49:34,000 --> 00:49:38,000
question of how do we decide
whether it is a minimum or a

579
00:49:38,000 --> 00:49:40,000
maximum.
And we will continue exploring

580
00:49:40,000 --> 00:49:43,000
functions of several variables.