1
00:00:00,000 --> 00:00:01,950
The following
content is provided

2
00:00:01,950 --> 00:00:06,050
by MIT OpenCourseWare under
a Creative Commons license.

3
00:00:06,050 --> 00:00:08,263
Additional information
about our license

4
00:00:08,263 --> 00:00:10,470
and MIT OpenCourseWare
in general

5
00:00:10,470 --> 00:00:11,930
is available at ocw.mit.edu.

6
00:00:15,520 --> 00:00:20,150
PROFESSOR: So we
started last week

7
00:00:20,150 --> 00:00:26,630
on the big topic for the rest
of the semester, optimization.

8
00:00:26,630 --> 00:00:30,780
Maybe, can you close that
door or just partway, anyway.

9
00:00:30,780 --> 00:00:32,820
Thanks.

10
00:00:32,820 --> 00:00:33,320
Great.

11
00:00:33,320 --> 00:00:35,020
That's perfect, just like that.

12
00:00:35,020 --> 00:00:40,240
So, since it was
a few days ago, I

13
00:00:40,240 --> 00:00:46,420
wanted to recap what I
did in the first lecture

14
00:00:46,420 --> 00:00:50,680
about optimization, which was
to pick out the least squares

15
00:00:50,680 --> 00:00:55,810
problem as a beautiful
model problem.

16
00:00:55,810 --> 00:01:01,930
And I followed that through
-- the input is a matrix A,

17
00:01:01,930 --> 00:01:08,090
rectangular, a right-hand
side b, probably measurements.

18
00:01:08,090 --> 00:01:12,730
We would like to get A*u
equal b but we can't.

19
00:01:12,730 --> 00:01:16,280
We got too many measurements.

20
00:01:16,280 --> 00:01:20,250
We do the best possible,
which is the solution u

21
00:01:20,250 --> 00:01:24,540
hat of this normal equation.

22
00:01:24,540 --> 00:01:28,670
Then I rewrote the
normal equations -- well,

23
00:01:28,670 --> 00:01:33,580
I also drew the picture that
you see over here on the left.

24
00:01:33,580 --> 00:01:38,070
The picture that shows
the two optimization

25
00:01:38,070 --> 00:01:40,020
problems at the same time.

26
00:01:40,020 --> 00:01:42,530
Here is the vector b.

27
00:01:42,530 --> 00:01:47,455
Here are all possible A*u's
-- this is the column space,

28
00:01:47,455 --> 00:01:51,700
this is all A*u's.

29
00:01:51,700 --> 00:01:55,200
The best A*u was the projection.

30
00:01:55,200 --> 00:01:59,330
And the error e was
what we couldn't

31
00:01:59,330 --> 00:02:03,380
get right, the part that's
perpendicular to the column

32
00:02:03,380 --> 00:02:06,020
space we can't help.

33
00:02:06,020 --> 00:02:10,380
It's the solution to
another projection problem.

34
00:02:10,380 --> 00:02:14,040
e is the same -- of course,
it's the same over here --

35
00:02:14,040 --> 00:02:19,560
as projecting b onto
this perpendicular line,

36
00:02:19,560 --> 00:02:25,670
which is the line of all
vectors, A transpose e says --

37
00:02:25,670 --> 00:02:33,860
what that says in words is e
is perpendicular to the columns

38
00:02:33,860 --> 00:02:44,110
of A. I've drawn it the best
I could as a perpendicular

39
00:02:44,110 --> 00:02:46,480
to the columns.

40
00:02:46,480 --> 00:02:48,920
So that we have a
plane of columns --

41
00:02:48,920 --> 00:02:51,790
this is like a
three by two matrix.

42
00:02:51,790 --> 00:02:58,030
We have a plane with the two
columns and the perpendicular

43
00:02:58,030 --> 00:02:59,830
line.
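
A minimal sketch of the recap above, assuming a made-up three by two matrix A and right-hand side b (these numbers are illustrative, not from the lecture). It solves the normal equations for u hat and checks that the error e is perpendicular to the columns of A:

```python
import numpy as np

# Hypothetical 3-by-2 example (not from the lecture): three measurements, two unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations: A^T A u_hat = A^T b
u_hat = np.linalg.solve(A.T @ A, A.T @ b)

p = A @ u_hat   # best A*u: the projection of b onto the column space
e = b - p       # the part of b we cannot reach

print("u_hat =", u_hat)
print("A^T e =", A.T @ e)   # zero up to roundoff: e is perpendicular to the columns
```

The vectors p and e split b into the two perpendicular pieces shown in the picture: p lies in the plane of the columns, e lies on the perpendicular line.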

44
00:02:59,830 --> 00:03:06,315
Then just near the end, I said
that it would be great to have

45
00:03:06,315 --> 00:03:10,240
-- this model would be almost
all we need except it should

46
00:03:10,240 --> 00:03:18,160
have one more matrix and
I called that matrix C.

47
00:03:18,160 --> 00:03:20,990
I want just to show you
quickly where that comes.

48
00:03:20,990 --> 00:03:27,000
So this is all section
7.1 of the notes,

49
00:03:27,000 --> 00:03:32,550
except that these words
are not yet typed.

50
00:03:32,550 --> 00:03:35,850
So they'll -- as
soon as possible,

51
00:03:35,850 --> 00:03:40,420
an updated 7.1 will include
this, what I want to do.

52
00:03:40,420 --> 00:03:42,450
Because I think it's the
very best introduction

53
00:03:42,450 --> 00:03:48,260
I could give to this pair
of optimization problems.

54
00:03:48,260 --> 00:03:51,040
You might think, who
needs optimization?

55
00:03:54,750 --> 00:04:00,000
Your main activity might be
solving differential equations.

56
00:04:00,000 --> 00:04:05,570
So, can I just take a time out
here because I happened to see

57
00:04:05,570 --> 00:04:12,280
the homework, upcoming
-- I think it's 16.930,

58
00:04:12,280 --> 00:04:14,240
so it's a course a
little bit like this,

59
00:04:14,240 --> 00:04:19,040
only the first word in the
course title is "Advanced,"

60
00:04:19,040 --> 00:04:22,310
and our first word is
"Introduction," but, of course,

61
00:04:22,310 --> 00:04:27,110
it's the same course
or same stuff.

62
00:04:27,110 --> 00:04:28,940
This is the homework
that I don't

63
00:04:28,940 --> 00:04:32,630
think has even been assigned,
even been handed out yet.

64
00:04:32,630 --> 00:04:40,720
But I just thought it's a
great example of an applied --

65
00:04:40,720 --> 00:04:46,410
so that you see where
optimization appears

66
00:04:46,410 --> 00:04:50,130
and what's involved.

67
00:04:50,130 --> 00:04:53,330
So differential
equations are involved,

68
00:04:53,330 --> 00:04:57,350
but also you got something
that you want to optimize.

69
00:04:57,350 --> 00:05:02,670
So I just wrote it on this
board and I simplified it

70
00:05:02,670 --> 00:05:07,760
a little over the homework
that they'll actually get.

71
00:05:07,760 --> 00:05:09,480
So here's the problem.

72
00:05:09,480 --> 00:05:12,670
We want a certain
distribution of heat.

73
00:05:12,670 --> 00:05:16,410
So I could draw a picture.

74
00:05:16,410 --> 00:05:20,260
We want a heat distribution,
for whatever reason,

75
00:05:20,260 --> 00:05:27,240
that maybe goes like this
over the interval 0 to 1.

76
00:05:27,240 --> 00:05:31,970
And what do we have
at our disposal

77
00:05:31,970 --> 00:05:34,500
to get the heat to be that way?

78
00:05:34,500 --> 00:05:36,950
Well, we've got sources of heat.

79
00:05:36,950 --> 00:05:42,180
But we don't have a
continuous source,

80
00:05:42,180 --> 00:05:52,640
we only have n
parameters to play with.

81
00:05:52,640 --> 00:05:55,000
I mean, right away, you
recognize an optimization

82
00:05:55,000 --> 00:05:55,770
problem.

83
00:05:55,770 --> 00:05:59,430
We're trying to get this
function here, u_naught of x.

84
00:06:02,660 --> 00:06:04,220
We're trying to
match a function,

85
00:06:04,220 --> 00:06:07,550
but we've only got n parameters.

86
00:06:07,550 --> 00:06:12,180
Those will be the
right-hand side.

87
00:06:12,180 --> 00:06:15,460
So what we're allowed to choose,
we can put in little space

88
00:06:15,460 --> 00:06:20,860
heaters, and we can turn them to
the temperatures we want to --

89
00:06:20,860 --> 00:06:23,900
temperatures s_1, s_2, s_3.

90
00:06:23,900 --> 00:06:26,290
That was probably
a stupid choice

91
00:06:26,290 --> 00:06:30,080
to put an s_4 down there
because I don't even know

92
00:06:30,080 --> 00:06:32,940
if negative heat is allowed.

93
00:06:32,940 --> 00:06:35,910
Anyway, we wouldn't
want it if we're

94
00:06:35,910 --> 00:06:38,610
aiming for that distribution.

95
00:06:38,610 --> 00:06:41,280
Now you understand the
s's are not supposed

96
00:06:41,280 --> 00:06:42,310
to match that u_naught.

97
00:06:42,310 --> 00:06:46,560
The s's are the sources
of heat, and the u's

98
00:06:46,560 --> 00:06:50,440
are the outputs,
the distribution,

99
00:06:50,440 --> 00:06:53,650
and they are controlled by
a differential equation.

100
00:06:56,520 --> 00:06:58,670
What we control is
the right-hand side

101
00:06:58,670 --> 00:07:03,490
of the equation, but we
only control n parameters.

102
00:07:03,490 --> 00:07:05,220
Then we have to
solve the equation

103
00:07:05,220 --> 00:07:07,900
and find out what
distribution that gives.

104
00:07:07,900 --> 00:07:11,610
So that's all like
differential equations,

105
00:07:11,610 --> 00:07:13,950
we know how to do it.

106
00:07:13,950 --> 00:07:18,680
It's one-dimensional in this
example, so straightforward.

107
00:07:18,680 --> 00:07:21,030
But now comes the
optimization part.

108
00:07:21,030 --> 00:07:25,610
We take this result
and we compare it

109
00:07:25,610 --> 00:07:29,690
with the desired result.
So the actual result

110
00:07:29,690 --> 00:07:34,250
from some source of heat
might be something like that.

111
00:07:38,380 --> 00:07:48,970
We want this u of x,
the heat distribution

112
00:07:48,970 --> 00:07:51,450
that we're actually
producing, to be as close

113
00:07:51,450 --> 00:07:53,760
as possible to u_naught.

114
00:07:53,760 --> 00:07:56,220
Close could mean
different things,

115
00:07:56,220 --> 00:08:00,990
but if we measure closeness
in this integral square,

116
00:08:00,990 --> 00:08:05,284
mean square sense, then we're
going to have a nice problem.

117
00:08:05,284 --> 00:08:06,950
In fact, we're going
to have a linear --

118
00:08:06,950 --> 00:08:10,080
everything's linear right now.

119
00:08:10,080 --> 00:08:12,310
Well, everything's
quadratic here,

120
00:08:12,310 --> 00:08:16,710
so when we minimize it's going
to give us a linear equation

121
00:08:16,710 --> 00:08:18,740
for the s's.

122
00:08:18,740 --> 00:08:23,980
I just think that's
a good model problem.

123
00:08:23,980 --> 00:08:26,290
A good model of
what we might do.

124
00:08:26,290 --> 00:08:32,810
I think actually in the
problem to be assigned -- well,

125
00:08:32,810 --> 00:08:37,500
it's not a big deal for us --
there is convection as well.

126
00:08:37,500 --> 00:08:41,470
So this is a pure
diffusion problem, just

127
00:08:41,470 --> 00:08:42,780
the second derivative.

128
00:08:42,780 --> 00:08:44,500
We know very well
that if there was

129
00:08:44,500 --> 00:08:48,820
a first derivative in there,
the stuff's convecting,

130
00:08:48,820 --> 00:08:52,670
passing through the region.

131
00:08:52,670 --> 00:09:02,140
So if I put on a
convective term, c*du/dx,

132
00:09:02,140 --> 00:09:04,110
how does that
change the problem?

133
00:09:04,110 --> 00:09:09,280
Not in a big, big major way,
but one thing we can guess

134
00:09:09,280 --> 00:09:14,940
is that now that is not
a symmetric term, right.

135
00:09:14,940 --> 00:09:18,370
We've seen the difference
between first differences

136
00:09:18,370 --> 00:09:20,402
and second differences,
first derivatives

137
00:09:20,402 --> 00:09:21,360
and second derivatives.

138
00:09:21,360 --> 00:09:28,790
So now it's a little trickier
and it's not symmetric.

139
00:09:28,790 --> 00:09:37,370
So somehow there will be a
primal problem, this one,

140
00:09:37,370 --> 00:09:42,580
and there will be an adjoint
problem, dual problem,

141
00:09:42,580 --> 00:09:46,660
perpendicular problem, whatever
name you want to give it,

142
00:09:46,660 --> 00:09:51,790
just as there is in our
very first model over there.

143
00:09:51,790 --> 00:09:59,030
So that's my little
quick look to kind of put

144
00:09:59,030 --> 00:10:04,450
on the board one example
that I didn't invent,

145
00:10:04,450 --> 00:10:11,060
that came from applications
and gives a sort of typical

146
00:10:11,060 --> 00:10:12,550
of what you have to do.

147
00:10:12,550 --> 00:10:18,190
You control an input,
you get an output,

148
00:10:18,190 --> 00:10:20,100
so that's the analysis problem.

149
00:10:20,100 --> 00:10:21,530
Find the output.

150
00:10:21,530 --> 00:10:24,530
But then comes the
optimization problem --

151
00:10:24,530 --> 00:10:27,880
make that output close to
something that you wish.

152
00:10:31,730 --> 00:10:34,470
So what's the typical
algorithm going to do?

153
00:10:34,470 --> 00:10:38,550
It's going to make
a choice of s,

154
00:10:38,550 --> 00:10:43,520
it's going to solve the
analysis problem for the u,

155
00:10:43,520 --> 00:10:46,300
it's going to look and
see what the error is,

156
00:10:46,300 --> 00:10:51,160
it's going to figure out
probably the gradient somehow.

157
00:10:51,160 --> 00:10:55,190
What's the steepest
way to make it closer?

158
00:10:55,190 --> 00:10:58,410
That's going to lead
us to a change of s.

159
00:10:58,410 --> 00:11:03,450
We use a change of s, the new
s, solve that, and iterate.

160
00:11:03,450 --> 00:11:06,110
That would be a
typical algorithm.

161
00:11:06,110 --> 00:11:09,750
We might be able to shortcut it
in a model problem like this.

162
00:11:09,750 --> 00:11:16,310
But that's totally the
typical optimization idea,

163
00:11:16,310 --> 00:11:24,100
is an analysis problem and
then figure out a gradient

164
00:11:24,100 --> 00:11:27,510
to get a better source;
back to the analysis problem

165
00:11:27,510 --> 00:11:29,040
with the new source.
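
The loop just described can be sketched as follows. This is only an illustration under stated assumptions: the grid size, the four equal-width heater strips, the zero boundary values, and the target u_naught = sin(pi x) are all hypothetical choices, not the actual homework problem. The analysis problem is a finite-difference version of -u'' = source, and the error is a discrete version of the integral of (u - u_naught) squared.

```python
import numpy as np

n_grid, n_src = 49, 4            # interior grid points and number of heaters (assumed)
h = 1.0 / (n_grid + 1)
x = np.linspace(h, 1.0 - h, n_grid)

# Second-difference matrix for -u'' with u(0) = u(1) = 0 (assumed boundary conditions)
K = (2.0 * np.eye(n_grid) - np.eye(n_grid, k=1) - np.eye(n_grid, k=-1)) / h**2

# Column j of B is 1 on the strip where heater j acts (equal-width strips, an assumption)
B = np.zeros((n_grid, n_src))
for j in range(n_src):
    B[(x >= j / n_src) & (x < (j + 1) / n_src), j] = 1.0

u0 = np.sin(np.pi * x)           # hypothetical desired distribution u_naught

def analysis(s):
    """Analysis problem: solve -u'' = B s for the heat distribution u."""
    return np.linalg.solve(K, B @ s)

def error(s):
    """Discrete version of the integral of (u - u0)^2."""
    return h * np.sum((analysis(s) - u0) ** 2)

G = np.linalg.solve(K, B)        # u depends linearly on s: u = G s
step = 1.0 / (2.0 * h * np.linalg.norm(G, 2) ** 2)   # safe step size for this quadratic

s = np.zeros(n_src)              # initial choice of sources
for _ in range(5000):
    u = analysis(s)                      # solve the analysis problem for u
    grad = 2.0 * h * G.T @ (u - u0)      # gradient of the error with respect to s
    s = s - step * grad                  # steepest-descent change of s

# Shortcut for this model problem: a quadratic error gives a linear equation for s
s_direct = np.linalg.lstsq(G, u0, rcond=None)[0]
print(s, s_direct)
```

In this linear model the iteration and the direct least-squares solve should agree closely; the iteration is shown only because it is the pattern that carries over when no shortcut is available.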

166
00:11:32,390 --> 00:11:35,000
What the algebra,
the math, has to do

167
00:11:35,000 --> 00:11:38,850
is those two steps
of figuring out OK,

168
00:11:38,850 --> 00:11:43,760
how do we improve
the error, how do we

169
00:11:43,760 --> 00:11:46,660
reduce the error, what's
the steepest direction?

170
00:11:46,660 --> 00:11:50,690
Somehow we got to
compute a derivative.

171
00:11:50,690 --> 00:11:54,310
Actually, that's what
this month is about.

172
00:11:54,310 --> 00:11:56,930
Derivatives that are not
just like the derivative

173
00:11:56,930 --> 00:12:00,990
of x cubed or something.

174
00:12:00,990 --> 00:12:03,260
I often wondered
how many presidents

175
00:12:03,260 --> 00:12:08,340
could take the derivative
of x cubed and I'm not sure.

176
00:12:12,170 --> 00:12:14,360
Anybody occur to
you who you could

177
00:12:14,360 --> 00:12:18,080
count on being able to take
the derivative of x cubed?

178
00:12:18,080 --> 00:12:20,080
I don't think the
current president

179
00:12:20,080 --> 00:12:21,980
would know what it meant.

180
00:12:21,980 --> 00:12:25,180
But I think Carter could
have done it, because he

181
00:12:25,180 --> 00:12:27,170
went to the Naval Academy.

182
00:12:27,170 --> 00:12:30,180
Jefferson was probably
-- he knew everything.

183
00:12:32,720 --> 00:12:33,270
I don't know.

184
00:12:33,270 --> 00:12:37,280
Anybody else has another
candidate they can tell me.

185
00:12:37,280 --> 00:12:43,080
So there is our problem --
finding derivatives that would

186
00:12:43,080 --> 00:12:46,310
be definitely beyond the
capacity of the White House.

187
00:12:52,140 --> 00:12:57,360
Now I want to stay
with this model

188
00:12:57,360 --> 00:13:05,440
a little more because
it's the perfect model.

189
00:13:05,440 --> 00:13:09,710
So this was like model one,
unweighted ordinary least

190
00:13:09,710 --> 00:13:13,830
squared, and it produced the
identity matrix in there.

191
00:13:13,830 --> 00:13:17,800
And I mentioned last
time what I want to do,

192
00:13:17,800 --> 00:13:22,050
the more general model
has another symmetric

193
00:13:22,050 --> 00:13:27,580
positive-definite matrix in
there, but not necessarily I,

194
00:13:27,580 --> 00:13:30,640
and it comes from
weighted least squares.

195
00:13:30,640 --> 00:13:33,070
So that's what I'm
going to talk about.

196
00:13:33,070 --> 00:13:35,050
So what are weighted
least squares?

197
00:13:35,050 --> 00:13:41,460
Well, you've got these
measurements and you think they

198
00:13:41,460 --> 00:13:46,400
maybe don't all -- maybe
they're not independent,

199
00:13:46,400 --> 00:13:49,320
maybe they're not
equally reliable.

200
00:13:49,320 --> 00:13:52,930
So you weight them by
how reliable they are.

201
00:13:52,930 --> 00:13:56,080
A more reliable one you
would put a heavier weight

202
00:13:56,080 --> 00:14:00,410
w on because you want
that to be more important.

203
00:14:00,410 --> 00:14:05,000
So, you change to this problem.

204
00:14:05,000 --> 00:14:06,730
But it looks
practically the same.

205
00:14:06,730 --> 00:14:12,400
The only difference is A has
become W*A, b has become W*b.

206
00:14:12,400 --> 00:14:17,090
So the equations will be
the same, but A is now W*A,

207
00:14:17,090 --> 00:14:22,140
b is now W*b, and those
are the equations.

208
00:14:22,140 --> 00:14:25,270
I guess I should call, just
to make clear that u is

209
00:14:25,270 --> 00:14:30,560
a different u, that the best
u now depends on the choice

210
00:14:30,560 --> 00:14:34,130
of weights, I should
really be calling that u --

211
00:14:34,130 --> 00:14:37,940
somehow indicate that it
depends on the weight.

212
00:14:37,940 --> 00:14:42,890
Now the key nice thing is
that if I write this out as A

213
00:14:42,890 --> 00:14:49,080
transpose -- can you
just write this out?

214
00:14:49,080 --> 00:14:53,500
You get the A transpose,
and the A is over here.

215
00:14:53,500 --> 00:14:56,460
But what's in the middle?

216
00:14:56,460 --> 00:15:00,810
What's this matrix C -- I jumped
the gun and called that matrix

217
00:15:00,810 --> 00:15:05,920
in the middle C, but how
is it connected to W?

218
00:15:05,920 --> 00:15:12,390
You can see it here as C
is W transpose W, right?

219
00:15:12,390 --> 00:15:14,180
It's just sitting
there in the middle.

220
00:15:14,180 --> 00:15:18,680
That's great that the fact that
the combination W transpose

221
00:15:18,680 --> 00:15:21,160
W is all you need to know.

222
00:15:21,160 --> 00:15:27,280
So we can forget W in favor
of -- I'll just put it here --

223
00:15:27,280 --> 00:15:32,630
W transpose W is now
given the name C,

224
00:15:32,630 --> 00:15:36,200
and this matrix is
symmetric positive definite.

225
00:15:40,440 --> 00:15:44,730
So it's a great matrix and
it's exactly the one we want.

226
00:15:44,730 --> 00:15:48,850
And this is exactly
the equation we want.
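
A small numerical check of this step, assuming a made-up A, b, and a diagonal weighting matrix W (all hypothetical). It compares ordinary least squares applied to W*A and W*b with the weighted normal equations A transpose C A u = A transpose C b, where C is W transpose W:

```python
import numpy as np

# Hypothetical data (not from the lecture)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Hypothetical weights: the first measurement is trusted more than the others
W = np.diag([2.0, 1.0, 1.0])
C = W.T @ W                      # symmetric positive definite

# Ordinary least squares with A replaced by W*A and b replaced by W*b ...
WA, Wb = W @ A, W @ b
u_from_W = np.linalg.solve(WA.T @ WA, WA.T @ Wb)

# ... is the same as the weighted normal equations, which only need C
u_W = np.linalg.solve(A.T @ C @ A, A.T @ C @ b)

e = b - A @ u_W
print("u_W       =", u_W)
print("A^T C e   =", A.T @ C @ e)   # zero up to roundoff
print("A^T e     =", A.T @ e)       # generally not zero once weights are present
```

Only the combination C enters the second solve, which is the point of giving W transpose W its own name.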

227
00:15:48,850 --> 00:15:53,480
So if I go back to writing
it in this -- you know,

228
00:15:53,480 --> 00:15:56,210
this was the equation.

229
00:15:56,210 --> 00:16:01,510
You remember the point
that we could go directly

230
00:16:01,510 --> 00:16:05,250
to one equation for
the best u, or we

231
00:16:05,250 --> 00:16:13,880
could keep our options open
and have two equations that

232
00:16:13,880 --> 00:16:15,530
led to the same one.

233
00:16:15,530 --> 00:16:17,410
They're totally equivalent.

234
00:16:17,410 --> 00:16:22,530
But the two equations will give
us not only u, but the error,

235
00:16:22,530 --> 00:16:25,960
b minus A*u, as
another unknown.

236
00:16:25,960 --> 00:16:27,230
That's what we did up there.

237
00:16:27,230 --> 00:16:32,340
So we had two equations,
and if we eliminated e,

238
00:16:32,340 --> 00:16:34,500
we got to this.

239
00:16:34,500 --> 00:16:36,880
Now I want to have
two equations,

240
00:16:36,880 --> 00:16:40,530
and if I eliminate
e, I get to that.

241
00:16:40,530 --> 00:16:44,450
So let me see what
those would be.

242
00:16:44,450 --> 00:16:47,000
Well, here's one of them.

243
00:16:47,000 --> 00:16:53,040
A transpose C*e is
zero -- you see,

244
00:16:53,040 --> 00:16:55,150
that's the only
difference, really.

245
00:16:55,150 --> 00:17:01,260
That the weighted
normal equation,

246
00:17:01,260 --> 00:17:05,480
I just took A transpose C, and
then I took the b minus A*u

247
00:17:05,480 --> 00:17:08,510
together, and this is
what I'm calling e.

248
00:17:08,510 --> 00:17:15,170
So one way to do it is just
the new guy is just now e plus

249
00:17:15,170 --> 00:17:19,630
A*u_W is still b.

250
00:17:19,630 --> 00:17:25,920
But now A transpose C*e is zero.

251
00:17:25,920 --> 00:17:26,570
Good deal.

252
00:17:26,570 --> 00:17:29,040
I mean that's quite nice.

253
00:17:31,950 --> 00:17:34,720
It isn't absolutely
perfect though,

254
00:17:34,720 --> 00:17:36,380
because I've lost the symmetry.

255
00:17:36,380 --> 00:17:39,920
I'm putting a C in there
and I don't really want it.

256
00:17:39,920 --> 00:17:42,720
I want a C but I
don't want it there.

257
00:17:42,720 --> 00:17:46,970
So, I just make a small change.

258
00:17:46,970 --> 00:17:49,770
I'm going to introduce
a new unknown that I'll

259
00:17:49,770 --> 00:17:55,630
call little w, apologies for
the fact that it's also a w,

260
00:17:55,630 --> 00:17:58,410
it just happened to fit.

261
00:17:58,410 --> 00:17:59,380
That'll be the C*e.

262
00:18:02,730 --> 00:18:07,960
So I'm just calling
this a new name here.

263
00:18:07,960 --> 00:18:11,800
So that now my e -- of
course, if I just invert,

264
00:18:11,800 --> 00:18:14,970
e is C inverse w.

265
00:18:14,970 --> 00:18:19,690
So now I'm just going to write
these equations with w instead

266
00:18:19,690 --> 00:18:21,580
of e because I like it better.

267
00:18:21,580 --> 00:18:29,000
So this equation is now A
transpose w equals zero.

268
00:18:29,000 --> 00:18:35,030
This equation, the e has
disappeared in favor of C

269
00:18:35,030 --> 00:18:42,100
inverse w plus A*u equals b.

270
00:18:42,100 --> 00:18:45,300
So that's the system
that I really like.

271
00:18:45,300 --> 00:18:48,550
That's the saddle point
system, the Kuhn-Tucker system,

272
00:18:48,550 --> 00:18:51,080
the primal-dual system,
the fundamental system

273
00:18:51,080 --> 00:18:59,230
of the whole subject in
this linear, matrix case.

274
00:18:59,230 --> 00:19:02,430
We haven't got functions,
we got vectors,

275
00:19:02,430 --> 00:19:09,730
and we've got symmetry,
and we've got linearity.

276
00:19:09,730 --> 00:19:15,930
And we've got a saddle point
matrix that's now the S --

277
00:19:15,930 --> 00:19:19,060
well, let me just
change it here.

278
00:19:19,060 --> 00:19:23,430
It's just changed
to this, C inverse.

279
00:19:23,430 --> 00:19:28,090
That's the fundamental
matrix of the whole subject.

280
00:19:28,090 --> 00:19:30,740
So, S is the saddle
point matrix.
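
A minimal sketch of the saddle point system, assuming the same kind of made-up A, b, and C as a small test case (hypothetical values). It builds the block matrix S with C inverse and A in the first block row and A transpose and zero in the second, solves for w and u together, and compares u with the weighted normal equations:

```python
import numpy as np

# Hypothetical data (not from the lecture)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])
C = np.diag([4.0, 1.0, 1.0])     # symmetric positive definite weighting, C = W^T W

m, n = A.shape

# Saddle point matrix S = [[C^{-1}, A], [A^T, 0]]
S = np.block([[np.linalg.inv(C), A],
              [A.T, np.zeros((n, n))]])

# Two equations at once: C^{-1} w + A u = b  and  A^T w = 0
rhs = np.concatenate([b, np.zeros(n)])
sol = np.linalg.solve(S, rhs)
w, u = sol[:m], sol[m:]

# Eliminating w should lead back to A^T C A u = A^T C b
u_W = np.linalg.solve(A.T @ C @ A, A.T @ C @ b)
print("u from saddle point system:", u)
print("u from weighted normal eqs:", u_W)
```

Here w is C times the error e = b - A*u, so the second block row A transpose w = 0 is the same statement as A transpose C*e = 0.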

281
00:19:38,300 --> 00:19:41,230
So I wanted to get that far.

282
00:19:41,230 --> 00:19:46,470
You see that the whole picture
was elementary linear algebra.

283
00:19:46,470 --> 00:19:52,780
Let me come back to the
elementary figure that

284
00:19:52,780 --> 00:19:55,950
illustrates what's the geometry.

285
00:19:55,950 --> 00:20:00,790
How was the geometry
affected by introducing

286
00:20:00,790 --> 00:20:06,460
this guy W, this weighting
matrix, or the C equal W

287
00:20:06,460 --> 00:20:07,320
transpose W?

288
00:20:07,320 --> 00:20:12,020
Well, here was the
picture from last time.

289
00:20:12,020 --> 00:20:14,390
A right-angled picture.

290
00:20:14,390 --> 00:20:16,810
This was a right triangle.

291
00:20:16,810 --> 00:20:22,050
This line was perpendicular
to that plane,

292
00:20:22,050 --> 00:20:23,500
but not anymore now.

293
00:20:23,500 --> 00:20:31,440
The second one, it's A
transpose C*e is zero.

294
00:20:31,440 --> 00:20:36,380
That's still a line, but it's
not any longer perpendicular

295
00:20:36,380 --> 00:20:41,010
to the -- this is
still all the A*u's.

296
00:20:41,010 --> 00:20:44,600
This plane is still the
column space of all the A*u's.

297
00:20:47,380 --> 00:20:49,940
We have the same b.

298
00:20:49,940 --> 00:20:56,850
But you see the problem is
it lost its 90 degree angle.

299
00:20:56,850 --> 00:21:02,690
Because the projection
is now projection on a --

300
00:21:02,690 --> 00:21:05,990
it's now an oblique
projection, it's slanted.

301
00:21:05,990 --> 00:21:12,360
This is the best A*u -- and if
I occasionally keep up-to-date

302
00:21:12,360 --> 00:21:16,190
I'll put that u_W there.

303
00:21:16,190 --> 00:21:19,800
There's still an error e, and
this is still a parallelogram

304
00:21:19,800 --> 00:21:21,340
but it's not a
rectangle anymore.

305
00:21:25,350 --> 00:21:27,620
Forgive my enthusiasm.

306
00:21:27,620 --> 00:21:32,400
I'm sort of happy that the
picture and the algebra both

307
00:21:32,400 --> 00:21:36,230
come out so neatly.

308
00:21:36,230 --> 00:21:40,170
I totally agree
that at this point

309
00:21:40,170 --> 00:21:46,070
I'm asking you to follow
a model without giving you

310
00:21:46,070 --> 00:21:48,250
an application, and
that's one reason

311
00:21:48,250 --> 00:21:53,910
I threw in this mention of
a specific application that

312
00:21:53,910 --> 00:21:55,840
came from somewhere else.

313
00:21:55,840 --> 00:21:59,850
But this is the picture there.

314
00:21:59,850 --> 00:22:07,280
So I'll say one more
word about the picture.

315
00:22:07,280 --> 00:22:11,790
I said that we lost
the right angle.

316
00:22:11,790 --> 00:22:16,110
We lost perpendicularity,
and, of course,

317
00:22:16,110 --> 00:22:18,360
literally speaking we did.

318
00:22:18,360 --> 00:22:25,870
This is no longer -- this is
not a right angle anymore.

319
00:22:25,870 --> 00:22:27,870
This is not a right
angle anymore.

320
00:22:34,000 --> 00:22:41,220
It's not a right angle in the
usual meaning of right angles.

321
00:22:41,220 --> 00:22:48,540
It is a right angle in
the inner product that's

322
00:22:48,540 --> 00:22:50,420
associated with
C. In other words,

323
00:22:50,420 --> 00:22:55,870
right angles here mean x
transpose y equals zero.

324
00:22:55,870 --> 00:22:58,180
That's the idea
of a right angle,

325
00:22:58,180 --> 00:23:02,210
right? x perpendicular
to y, and they

326
00:23:02,210 --> 00:23:03,720
have different letters here.

327
00:23:03,720 --> 00:23:10,870
Now over here I still have
perpendicular, but I don't --

328
00:23:10,870 --> 00:23:14,060
this is not the right
inner product anymore.

329
00:23:14,060 --> 00:23:19,050
It should be a weighted inner
product, weighted with this C

330
00:23:19,050 --> 00:23:20,600
in the middle.

331
00:23:20,600 --> 00:23:26,660
So that's really what
I mean by C-orthogonal.

332
00:23:26,660 --> 00:23:28,560
Maybe I'll put those words down.

333
00:23:28,560 --> 00:23:32,870
So this weighted thing is
-- if I can squeeze it,

334
00:23:32,870 --> 00:23:39,750
I doubt if I can --
is C-orthogonality,

335
00:23:39,750 --> 00:23:41,670
weighted orthogonality.

336
00:23:41,670 --> 00:23:44,250
So let me circle
the whole thing.

337
00:23:44,250 --> 00:23:50,650
The C is the W transpose W. Just
to say that we aren't giving up

338
00:23:50,650 --> 00:23:54,450
on dot products
and perpendiculars

339
00:23:54,450 --> 00:23:58,120
and good equations,
we're just changing them

340
00:23:58,120 --> 00:24:04,240
by inserting C every time
in the inner product.
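
Written out in symbols, as a sketch of what the board presumably shows (the notation u_W for the weighted solution follows the earlier part of the lecture; the last identity is the Pythagoras statement in the weighted inner product, added here for illustration):

```latex
\[
x^{T} y = 0 \quad \text{(ordinary right angle)}, \qquad
x^{T} C\, y = 0 \quad \text{($C$-orthogonality)}, \qquad C = W^{T} W .
\]
\[
(A u)^{T} C\, e \;=\; u^{T} \left( A^{T} C\, e \right) \;=\; 0
\quad \text{for every } u ,
\]
\[
b = A u_W + e
\quad \Longrightarrow \quad
b^{T} C\, b \;=\; (A u_W)^{T} C\, (A u_W) \;+\; e^{T} C\, e .
\]
```

So the triangle with sides A*u_W, e, and b is still a right triangle, provided lengths and angles are measured with C in the middle.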

341
00:24:04,240 --> 00:24:09,010
What it means is that this
is the natural inner product

342
00:24:09,010 --> 00:24:12,590
for the particular problem.

343
00:24:12,590 --> 00:24:16,230
This is the natural inner
product for Euclid, right?

344
00:24:16,230 --> 00:24:18,850
But then from some
specific application

345
00:24:18,850 --> 00:24:21,990
like this one or
a million others,

346
00:24:21,990 --> 00:24:25,340
they have their own
natural inner product,

347
00:24:25,340 --> 00:24:27,900
and the inner product for
that particular problem

348
00:24:27,900 --> 00:24:33,440
would be one of these guys with
some kind of a matrix C showing

349
00:24:33,440 --> 00:24:35,410
up.

350
00:24:35,410 --> 00:24:40,470
So least squares and
weighted least squares,

351
00:24:40,470 --> 00:24:44,600
that's my example one.

352
00:24:44,600 --> 00:24:51,200
Now I'd like to give a second
example, a more mechanic --

353
00:24:51,200 --> 00:24:53,280
will come closer to mechanics.

354
00:24:53,280 --> 00:24:59,030
Because this is least
squares, statistics, algebra.

355
00:24:59,030 --> 00:25:03,400
But let me put on the middle
board an application out

356
00:25:03,400 --> 00:25:05,170
of mechanics.

357
00:25:05,170 --> 00:25:12,220
It will be, say,
I'll make it small

358
00:25:12,220 --> 00:25:19,070
and just a couple of springs
with a mass between them

359
00:25:19,070 --> 00:25:23,750
and fixed at both ends.

360
00:25:26,900 --> 00:25:32,600
So this spring extends
by some amount.

361
00:25:32,600 --> 00:25:35,380
This spring extends
by some amount.

362
00:25:35,380 --> 00:25:37,580
There's a force on this mass.

363
00:25:37,580 --> 00:25:42,130
So there's a force on this
mass, maybe just gravity, f.

364
00:25:46,270 --> 00:25:51,390
That's the external force
from the mass that's

365
00:25:51,390 --> 00:25:54,200
here in between the springs.

366
00:25:54,200 --> 00:26:02,090
Then, also acting on that
mass are spring forces.

367
00:26:02,090 --> 00:26:05,130
This spring is
pulling it up, right?

368
00:26:05,130 --> 00:26:07,110
There's a spring force, w_1.

369
00:26:10,700 --> 00:26:14,920
Here, do you want me
to draw -- really,

370
00:26:14,920 --> 00:26:16,670
which direction is this?

371
00:26:16,670 --> 00:26:26,810
I'm going to draw it this way
just to show the w_2 drawn that

372
00:26:26,810 --> 00:26:33,860
way would actually be negative,
because I think that this

373
00:26:33,860 --> 00:26:35,790
spring would get
compressed, right --

374
00:26:35,790 --> 00:26:38,090
this mass is pushing it down.

375
00:26:38,090 --> 00:26:40,310
This spring would be
under compression.

376
00:26:40,310 --> 00:26:44,040
It would be pushing
the mass back up.

377
00:26:44,040 --> 00:26:48,440
So the w_2 in that
picture would be negative.

378
00:26:48,440 --> 00:26:53,450
So this w_1 will actually
be positive and go

379
00:26:53,450 --> 00:26:55,240
the way the arrow is showing.

380
00:26:55,240 --> 00:26:57,830
This w_2 would
actually be negative

381
00:26:57,830 --> 00:27:01,400
and go not the way
the arrow is showing.

382
00:27:01,400 --> 00:27:10,850
But what's the -- oh, what
equations have we got then?

383
00:27:10,850 --> 00:27:13,070
What's our optimization problem?

384
00:27:13,070 --> 00:27:14,470
Well actually, we have a choice.

385
00:27:14,470 --> 00:27:20,270
We could work with
equations, period.

386
00:27:20,270 --> 00:27:22,870
Actually, one of the
equations is pretty obvious.

387
00:27:22,870 --> 00:27:24,700
This mass is in equilibrium.

388
00:27:24,700 --> 00:27:30,940
So w_1 is equal to w_2 plus f.

389
00:27:30,940 --> 00:27:33,130
So it doesn't move.

390
00:27:33,130 --> 00:27:37,440
Or you might prefer me to write
it, I would rather write it,

391
00:27:37,440 --> 00:27:44,100
with w's on one side and
source terms on the other.

392
00:27:44,100 --> 00:27:53,410
So that's the
equilibrium equation.

393
00:27:56,900 --> 00:28:01,620
So what decides -- we want
to know what these w's are,

394
00:28:01,620 --> 00:28:03,920
and these springs are extended.

395
00:28:03,920 --> 00:28:08,290
So that first spring is
extended by an amount e_1,

396
00:28:08,290 --> 00:28:12,230
and this second spring is
extended by an amount e_2.

397
00:28:12,230 --> 00:28:16,310
Stretched or compressed.

398
00:28:16,310 --> 00:28:19,770
e_1 is probably going
to be positive here --

399
00:28:19,770 --> 00:28:21,820
that spring's going
to be stretched.

400
00:28:21,820 --> 00:28:24,200
This spring is going to be
compressed, so that e_2 is

401
00:28:24,200 --> 00:28:27,710
probably going to be negative.

402
00:28:27,710 --> 00:28:31,620
What's the mechanics here?

403
00:28:31,620 --> 00:28:34,320
Well, I can state it
two ways, as I said.

404
00:28:34,320 --> 00:28:39,120
I can state the mechanics
in terms of equations --

405
00:28:39,120 --> 00:28:43,810
force and stretching,
elastic constant.

406
00:28:43,810 --> 00:28:48,990
That's how we did it in
the first semester, 18.085.

407
00:28:48,990 --> 00:28:55,840
It's a little clearer because
simple equation, Hooke's law.

408
00:28:55,840 --> 00:29:02,870
Or I can state the problem
as a minimization of energy.

409
00:29:02,870 --> 00:29:06,520
That's what I want
to do today in 18.06.

410
00:29:06,520 --> 00:29:09,320
So I want to minimize -- 18.086.

411
00:29:09,320 --> 00:29:12,980
I want to minimize
the total energy,

412
00:29:12,980 --> 00:29:20,920
the energy in these springs.

413
00:29:20,920 --> 00:29:34,650
Subject to the constraint --
this constraint, equilibrium.

414
00:29:37,410 --> 00:29:40,480
So that's the
optimization statement

415
00:29:40,480 --> 00:29:46,610
of the mechanical problem.

416
00:29:50,940 --> 00:29:54,990
Well, I guess all that
remains is -- well, I guess,

417
00:29:54,990 --> 00:29:55,910
what remains?

418
00:29:55,910 --> 00:30:01,670
First of all, I need an
expression for this energy.

419
00:30:01,670 --> 00:30:04,950
If the springs are
governed by Hooke's law,

420
00:30:04,950 --> 00:30:08,400
then it'll be pretty simple.

421
00:30:08,400 --> 00:30:12,050
If they're real springs that
don't quite obey Hooke's law

422
00:30:12,050 --> 00:30:17,010
then there'll be non--
there'll be fourth-degree,

423
00:30:17,010 --> 00:30:21,190
sixth-degree, whatever,
terms in the energy.

424
00:30:21,190 --> 00:30:25,110
It's like the energy in the
first spring plus the energy

425
00:30:25,110 --> 00:30:27,970
in the second spring, anyway.

426
00:30:27,970 --> 00:30:33,960
E in the first spring, and the
energy in the second spring,

427
00:30:33,960 --> 00:30:36,520
and, of course, the two springs
could have different spring

428
00:30:36,520 --> 00:30:37,020
constants.

429
00:30:40,280 --> 00:30:48,040
So those E's -- I'll make
life easy in solving,

430
00:30:48,040 --> 00:30:54,280
if I choose Hooke's law, if I
choose the energy to be just

431
00:30:54,280 --> 00:30:54,930
a square.

432
00:30:57,700 --> 00:31:03,500
The constraint is linear, so
that'll be the model problem.

433
00:31:03,500 --> 00:31:09,850
And actually, that'll be the
kind of problem I've got here.

434
00:31:14,120 --> 00:31:19,740
One reason for introducing a
new example was to get some

435
00:31:19,740 --> 00:31:29,700
mechanics into the lecture,
but also to get a problem where

436
00:31:29,700 --> 00:31:34,820
we're doing a minimization
but we've got a condition

437
00:31:34,820 --> 00:31:36,420
on the w's.

438
00:31:36,420 --> 00:31:42,090
And the question is how
do you find a minimum?

439
00:31:42,090 --> 00:31:45,170
You can't just set derivatives
of the energy to zero.

440
00:31:45,170 --> 00:31:48,520
You would discover w_1
equals w_2 equals zero.

441
00:31:48,520 --> 00:31:50,770
Nothing happening.

442
00:31:50,770 --> 00:31:52,680
That would be the minimum.

443
00:31:52,680 --> 00:31:57,100
But that minimum is
ruled out, that solution

444
00:31:57,100 --> 00:32:00,340
is ruled out because we
have this constraint.

445
00:32:00,340 --> 00:32:03,920
We've got to balance
the external force.

446
00:32:03,920 --> 00:32:08,740
So this is the question
and you'll maybe

447
00:32:08,740 --> 00:32:15,870
have met this question in other
courses, but it's essential.

448
00:32:15,870 --> 00:32:21,690
How do you deal with minimizing
when there's a constraint?

449
00:32:21,690 --> 00:32:26,520
I guess, in some way,
we had it over here.

450
00:32:26,520 --> 00:32:30,450
We were minimizing something
-- well, the minimum would be,

451
00:32:30,450 --> 00:32:32,480
take u equal u_naught.

452
00:32:32,480 --> 00:32:36,330
But no, there was
a constraint that u

453
00:32:36,330 --> 00:32:43,240
had to satisfy a certain
equilibrium equation.

454
00:32:43,240 --> 00:32:44,890
Here it was a
differential equation

455
00:32:44,890 --> 00:32:47,780
so that problem is a
little harder than this one

456
00:32:47,780 --> 00:32:56,690
where the equation is just
discrete, one simple equation.

457
00:32:56,690 --> 00:32:58,080
So how are you going to do it?

458
00:32:58,080 --> 00:33:05,220
Well, actually the
quickest way would be --

459
00:33:05,220 --> 00:33:08,630
that's such an easy constraint
that I could say hey,

460
00:33:08,630 --> 00:33:12,040
w_2 is w_1 minus f.

461
00:33:12,040 --> 00:33:15,280
So I could just, if I
wanted to really like

462
00:33:15,280 --> 00:33:18,310
shortcut this whole
lecture, I could

463
00:33:18,310 --> 00:33:21,170
say well, w_2 is w_1 minus f.

464
00:33:24,690 --> 00:33:27,440
Now I've accounted
for the constraint.

465
00:33:27,440 --> 00:33:30,560
I've removed w_2
from the problem.

466
00:33:30,560 --> 00:33:33,680
I have a minimization,
an ordinary minimization

467
00:33:33,680 --> 00:33:35,240
with an unknown w_1.

468
00:33:35,240 --> 00:33:37,410
I take the derivative.

469
00:33:37,410 --> 00:33:40,230
I solve derivative equals zero.

470
00:33:40,230 --> 00:33:43,610
I find w_1, then I go
back, I get that w_2.

471
00:33:43,610 --> 00:33:44,740
That's the fast way.
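A minimal sketch of that shortcut, assuming the springs obey Hooke's law so that each energy is the quadratic w_i^2 / (2 c_i); the function name and the sample numbers are illustrative, not from the lecture:
```python
# Elimination shortcut: substitute w_2 = w_1 - f, then minimize over w_1 alone.
# Assumption: quadratic (Hooke's law) energies E_i = w_i**2 / (2 * c_i).
def solve_by_elimination(c1, c2, f):
    # d/dw_1 [ w_1**2/(2*c1) + (w_1 - f)**2/(2*c2) ] = w_1/c1 + (w_1 - f)/c2 = 0
    w1 = c1 * f / (c1 + c2)
    w2 = w1 - f  # go back to the constraint to recover w_2
    return w1, w2
#
w1, w2 = solve_by_elimination(c1=2.0, c2=3.0, f=10.0)
print(w1, w2)  # prints: 4.0 -6.0
```
With these sample values the first spring force comes out positive (stretched) and the second negative (compressed), as expected from the picture.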

472
00:33:47,880 --> 00:33:49,450
Of course, it gets
the right answer.

473
00:33:49,450 --> 00:33:56,980
But there's another way that in
the end turns out to be better.

474
00:33:56,980 --> 00:34:00,440
It's not necessarily better
for this simple problem,

475
00:34:00,440 --> 00:34:06,520
but it's better for
the general approach

476
00:34:06,520 --> 00:34:09,780
to constrained optimization.

477
00:34:09,780 --> 00:34:16,320
So I'm going to not do the
simple deal of solving for w_2,

478
00:34:16,320 --> 00:34:19,680
but I'm going to keep
the constraint around.

479
00:34:19,680 --> 00:34:23,900
It's the idea of
Lagrange multipliers.

480
00:34:23,900 --> 00:34:30,550
You've heard those words
and probably seen it happen.

481
00:34:30,550 --> 00:34:32,570
So what is a Lagrange multiplier?

482
00:34:32,570 --> 00:34:34,650
What is Lagrange's idea?

483
00:34:34,650 --> 00:34:43,770
Lagrange's idea is -- he
constructs a function of w,

484
00:34:43,770 --> 00:34:47,230
to work with, which
is the same energy.

485
00:34:47,230 --> 00:34:49,660
But he's going to
include a multiplier.

486
00:34:49,660 --> 00:34:52,020
Now the next question
is what letter shall

487
00:34:52,020 --> 00:34:57,260
I use for that multiplier,
Lagrange's multiplier.

488
00:34:57,260 --> 00:35:02,840
Books on optimization
often call it lambda --

489
00:35:02,840 --> 00:35:06,560
lambda's sort of,
like, for Lagrange.

490
00:35:06,560 --> 00:35:11,410
So, Lagrange obviously wasn't
Greek, but anyway -- close.

491
00:35:11,410 --> 00:35:17,780
Lambda for us always
means eigenvalue --

492
00:35:17,780 --> 00:35:19,490
for me always means eigenvalue.

493
00:35:19,490 --> 00:35:21,270
So I'm reluctant to use lambda.

494
00:35:21,270 --> 00:35:30,520
And sometimes in books on
economics, which is doing this

495
00:35:30,520 --> 00:35:35,740
all the time, the
multiplier's called pi

496
00:35:35,740 --> 00:35:39,570
because it turns
out to be a price.

497
00:35:43,200 --> 00:35:50,350
But let me use u for
the Lagrange multiplier.

498
00:35:53,770 --> 00:35:56,100
So that'll be the
Lagrange multiplier.

499
00:35:56,100 --> 00:35:57,420
So what do I do with it?

500
00:35:57,420 --> 00:36:04,340
I multiply this equation by
u and I build it in to this.

501
00:36:04,340 --> 00:36:08,990
So this thing is this part --
I'll just copy that down there

502
00:36:08,990 --> 00:36:13,020
-- plus or minus,
depending what I want,

503
00:36:13,020 --> 00:36:17,370
depending which sign I want to
end up with, u, the multiplier,

504
00:36:17,370 --> 00:36:21,980
times the constraint
-- w_1 minus w_2 minus f --

505
00:36:21,980 --> 00:36:25,210
let me put it like that.

506
00:36:25,210 --> 00:36:29,140
So the constraint is
that this should be zero.

507
00:36:29,140 --> 00:36:32,170
So you can say I
haven't added anything.

508
00:36:32,170 --> 00:36:34,770
I've added zero.

509
00:36:34,770 --> 00:36:40,690
But that will only
be true at the end

510
00:36:40,690 --> 00:36:43,650
when I have a
specific w_1 and w_2.

511
00:36:43,650 --> 00:36:48,700
Right now what I've done is
I've built in the constraint

512
00:36:48,700 --> 00:36:50,430
into the function.

513
00:36:53,070 --> 00:36:58,940
Lagrange's brilliant idea was
that now I've got a function

514
00:36:58,940 --> 00:37:03,960
that I can -- whose
derivatives I can take.

515
00:37:03,960 --> 00:37:05,820
I could set the
derivatives to zero.

516
00:37:05,820 --> 00:37:14,420
dL/dw_1 is going to be zero.
dL/dw_2 is going to be zero.

517
00:37:14,420 --> 00:37:20,310
And dL/du is going to be zero.

518
00:37:20,310 --> 00:37:24,540
So we've now got three
equations, three unknowns.

519
00:37:24,540 --> 00:37:26,260
Instead of going
from two to one,

520
00:37:26,260 --> 00:37:29,360
we've gone from two up to three.

521
00:37:29,360 --> 00:37:33,890
But it's so much more systematic
that it's the right thing

522
00:37:33,890 --> 00:37:35,470
to do.

523
00:37:35,470 --> 00:37:38,580
What is this last equation?

524
00:37:38,580 --> 00:37:42,130
What's the u derivative
of this expression?

525
00:37:42,130 --> 00:37:48,260
Well, u doesn't appear here,
the derivative is just w_1 --

526
00:37:48,260 --> 00:37:53,420
this just leads us to
w_1 minus w_2 minus f,

527
00:37:53,420 --> 00:37:56,000
which is the constraint
that we wanted.

528
00:37:56,000 --> 00:38:05,740
So the constraint is
showing up as this equation.

529
00:38:05,740 --> 00:38:09,730
Just the way the
constraint showed up

530
00:38:09,730 --> 00:38:10,940
as this equation here.

531
00:38:15,480 --> 00:38:24,460
I guess what I want to say
is when I wrote this down,

532
00:38:24,460 --> 00:38:27,240
when we did this
first example, I

533
00:38:27,240 --> 00:38:31,820
didn't say here's
the constraint.

534
00:38:31,820 --> 00:38:34,450
I mean that equation
kind of came out

535
00:38:34,450 --> 00:38:38,350
from the normal equations here.

536
00:38:42,540 --> 00:38:47,900
So what's new here
is we're starting with --

537
00:38:47,900 --> 00:38:50,950
the constraint equation is
sort of part of the mechanics

538
00:38:50,950 --> 00:38:55,280
and we're asking the question:
how do I deal with it?

539
00:38:55,280 --> 00:38:57,560
Here it came out
of the geometry,

540
00:38:57,560 --> 00:39:00,630
so it wasn't there
at the beginning,

541
00:39:00,630 --> 00:39:03,730
so we didn't have to say: how
do we deal with this equation?

542
00:39:03,730 --> 00:39:05,620
It just emerged.

543
00:39:05,620 --> 00:39:11,220
Here it's really forced
on us right away.

544
00:39:11,220 --> 00:39:16,830
So anyway, we've still got
to figure out the derivatives

545
00:39:16,830 --> 00:39:22,490
of those, and I guess I do have
to finally now say what choice

546
00:39:22,490 --> 00:39:24,560
-- yeah.

547
00:39:24,560 --> 00:39:36,210
So this is -- what is this,
the derivative of E_1 minus u.

548
00:39:36,210 --> 00:39:39,830
If I take the w_1 derivative,
I've got the derivative of E_1

549
00:39:39,830 --> 00:39:43,300
with respect to w_1.

550
00:39:43,300 --> 00:39:47,130
Then w_1 doesn't appear
there but it appears here,

551
00:39:47,130 --> 00:39:50,370
so it's minus u.

552
00:39:50,370 --> 00:39:54,920
This would be the derivative
of E_2 with respect to w_2,

553
00:39:54,920 --> 00:39:59,320
that second spring minus
-- oh maybe plus the u.

554
00:40:05,680 --> 00:40:08,140
Well, those are the
three equations.
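Written out, taking the minus sign for the multiplier term (one of the two sign choices mentioned, assumed here), the function and its three derivative equations would read roughly:
```latex
\begin{aligned}
L(w_1, w_2, u) &= E_1(w_1) + E_2(w_2) - u\,(w_1 - w_2 - f), \\
\frac{\partial L}{\partial w_1} &= \frac{dE_1}{dw_1} - u = 0, \\
\frac{\partial L}{\partial w_2} &= \frac{dE_2}{dw_2} + u = 0, \\
\frac{\partial L}{\partial u} &= -(w_1 - w_2 - f) = 0.
\end{aligned}
```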

555
00:40:10,890 --> 00:40:16,140
Let me move now to
the linear case,

556
00:40:16,140 --> 00:40:19,530
just so we see the
beautiful pattern.

557
00:40:19,530 --> 00:40:24,780
So if I make the
equations linear, what's

558
00:40:24,780 --> 00:40:31,140
the energy in a
Hooke's law spring

559
00:40:31,140 --> 00:40:41,440
here if the extension is e_1 and
if it produces a force of w_1,

560
00:40:41,440 --> 00:40:44,970
I want to know what
is this E_1 here.

561
00:40:44,970 --> 00:40:48,590
So let me just
remember Hooke's law.

562
00:40:48,590 --> 00:40:51,840
I think Hooke's law
would say that -- well,

563
00:40:51,840 --> 00:40:56,440
there's some elastic
constant c_1,

564
00:40:56,440 --> 00:41:01,815
so there's a c_1 and a c_2
that tell us how hard or soft

565
00:41:01,815 --> 00:41:02,810
the springs are.

566
00:41:02,810 --> 00:41:04,890
So these are physical constants.

567
00:41:12,200 --> 00:41:16,500
If I remember
right, the energy --

568
00:41:16,500 --> 00:41:20,170
so I'm going to erase a little
here just to -- well no.

569
00:41:20,170 --> 00:41:23,410
So what is the energy
in that first spring?

570
00:41:23,410 --> 00:41:31,650
You remember, there's a
1/2 c_1 e_1 squared.

571
00:41:31,650 --> 00:41:38,100
That's the energy in a
spring with constant c_1

572
00:41:38,100 --> 00:41:39,890
and the stretch e_1.

573
00:41:45,280 --> 00:41:48,200
But now, I really want
it not in terms of e_1,

574
00:41:48,200 --> 00:41:53,412
I want it in terms of w_1,
just the way I did here.

575
00:41:53,412 --> 00:41:54,870
I've got to do the
same thing here.

576
00:41:54,870 --> 00:42:00,630
I want to get to w, because the
constraint is in terms of w.

577
00:42:03,290 --> 00:42:04,820
What does Hooke's law say?

578
00:42:04,820 --> 00:42:12,510
Hooke's law says w equals c*e,
that's Hooke, Hooke's law.

579
00:42:15,400 --> 00:42:19,460
The force is the elastic
constant times the stretch.

580
00:42:19,460 --> 00:42:24,650
So in place of the
e, I have w over c.

581
00:42:24,650 --> 00:42:32,310
So this is 1/2 -- e_1 squared
will be w_1 squared over c_1

582
00:42:32,310 --> 00:42:36,520
squared, and then a c_1 cancels
-- that's what it looks like.

583
00:42:40,450 --> 00:42:45,390
I guess I'm certainly
happy to see that I'm

584
00:42:45,390 --> 00:42:47,380
coming up with c inverse.

585
00:42:47,380 --> 00:42:51,400
This c is showing up
in the denominator,

586
00:42:51,400 --> 00:42:55,390
and that's exactly the --
so this is what I mean,

587
00:42:55,390 --> 00:43:00,780
this is what I
want for my energy,

588
00:43:00,780 --> 00:43:10,530
is a 1/2 w_1 squared over c_1,
and a 1/2 w_2 squared over c_2,

589
00:43:10,530 --> 00:43:18,720
and now I was doing a minus sign
because that kind of goes well

590
00:43:18,720 --> 00:43:23,670
with mechanics where a plus
sign would go well with all

591
00:43:23,670 --> 00:43:26,720
the other applications.

592
00:43:26,720 --> 00:43:31,650
So now I've got explicit energy.

593
00:43:31,650 --> 00:43:35,400
So now I can say what
this thing really --

594
00:43:35,400 --> 00:43:37,860
the derivative with
respect to w_1.

595
00:43:37,860 --> 00:43:39,130
OK.

596
00:43:39,130 --> 00:43:44,740
The derivative with respect
to w_1 is just w_1 over c_1.

597
00:43:44,740 --> 00:43:46,770
Is that what I'm getting?

598
00:43:46,770 --> 00:43:48,520
This is zero.

599
00:43:48,520 --> 00:43:55,580
This says w_2 over
c_2 minus u is zero,

600
00:43:55,580 --> 00:44:00,440
and this one says w_1
minus w_2 equals f.

601
00:44:04,120 --> 00:44:07,200
Our three equations.

602
00:44:07,200 --> 00:44:16,660
Yes, so I will kill these equal
signs and just look at -- oh,

603
00:44:16,660 --> 00:44:17,740
that's a plus.

604
00:44:29,090 --> 00:44:32,170
We've got our saddle
point matrix again.

605
00:44:32,170 --> 00:44:34,950
That's the nice thing here.

606
00:44:34,950 --> 00:44:39,130
This is a problem with
a three by three matrix,

607
00:44:39,130 --> 00:44:43,760
with three unknowns,
w_1, w_2, and u.

608
00:44:43,760 --> 00:44:47,320
With right-hand sides
zero, zero, and f.

609
00:44:53,580 --> 00:44:56,030
And with what matrix?

610
00:44:59,190 --> 00:45:06,110
That looks like a 1 over
c_1, zero, and a minus 1.

611
00:45:06,110 --> 00:45:12,640
This looks like zero, a
1 over c_2 and a plus 1.

612
00:45:12,640 --> 00:45:16,240
This looks like --
oh, wait a minute.

613
00:45:16,240 --> 00:45:18,880
I don't want this.

614
00:45:18,880 --> 00:45:21,870
w_1 minus w_2 equal f.

615
00:45:25,500 --> 00:45:29,020
I can live with 1
and minus 1 there,

616
00:45:29,020 --> 00:45:33,040
but it's not really
what I wanted.

617
00:45:33,040 --> 00:45:37,140
I wanted the signs to --
you know what I want here.

618
00:45:37,140 --> 00:45:41,130
I want the transpose of
that to be here, and zero

619
00:45:41,130 --> 00:45:42,330
to be in that block.

620
00:45:42,330 --> 00:45:50,200
So that's what my saddle
point matrix would look like.
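A small sketch of the symmetric form, assuming the third equation is written as minus (w_1 minus w_2) equals minus f so that the last row is the transpose of the last column; the helper names and sample numbers are illustrative only:
```python
# Assemble and solve the symmetric 3x3 saddle point system for (w_1, w_2, u).
# solve3 and saddle_point_solve are hypothetical helper names, not from the lecture.
def solve3(A, b):
    # Gaussian elimination with partial pivoting on a small dense system.
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x
#
def saddle_point_solve(c1, c2, f):
    K = [[1.0 / c1, 0.0, -1.0],   # w_1/c_1 - u = 0
         [0.0, 1.0 / c2, 1.0],    # w_2/c_2 + u = 0
         [-1.0, 1.0, 0.0]]        # -(w_1 - w_2) = -f, zero in the corner block
    return solve3(K, [0.0, 0.0, -f])
#
w1, w2, u = saddle_point_solve(c1=2.0, c2=3.0, f=10.0)
print(w1, w2, u)
```
The zero on the diagonal is why pivoting is used here: the matrix is symmetric but indefinite, so it cannot be treated as positive definite.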

621
00:45:50,200 --> 00:45:55,460
Well, let me just say that
I could live with either.

622
00:45:55,460 --> 00:46:03,450
I was aiming for this one
because it's symmetric.

623
00:46:03,450 --> 00:46:09,180
But a lot of people would
rather have the opposite signs

624
00:46:09,180 --> 00:46:16,140
and have the 1, minus 1 there.

625
00:46:16,140 --> 00:46:22,190
I don't care which
sign f has, of course.

626
00:46:22,190 --> 00:46:25,640
Some people these days are
liking this form better

627
00:46:25,640 --> 00:46:28,120
because then it has
a symmetric part

628
00:46:28,120 --> 00:46:29,590
and an anti-symmetric part.

629
00:46:33,160 --> 00:46:37,010
I mean the thing
is, at some point

630
00:46:37,010 --> 00:46:39,970
we're going to get problems
like this with thousands

631
00:46:39,970 --> 00:46:42,970
of unknowns, and
we're going to think

632
00:46:42,970 --> 00:46:47,640
how do we solve them,
maybe by some iteration.

633
00:46:47,640 --> 00:46:53,750
So we might want the matrix to
be symmetric but indefinite,

634
00:46:53,750 --> 00:46:59,500
or we might want a positive
definite, symmetric part

635
00:46:59,500 --> 00:47:01,360
and an anti-symmetric part.

636
00:47:04,350 --> 00:47:07,840
What we can't have is
positive definite symmetric.

637
00:47:07,840 --> 00:47:11,880
That's like asking for
what can't happen here.

638
00:47:11,880 --> 00:47:17,090
The combination of the problems
is producing a saddle point

639
00:47:17,090 --> 00:47:28,330
and we can play with that sign,
but we can't make that zero

640
00:47:28,330 --> 00:47:33,140
something -- oh,
we could, actually.

641
00:47:33,140 --> 00:47:36,970
What I was going to say is we
can't make that zero something

642
00:47:36,970 --> 00:47:40,520
different, but you could.

643
00:47:40,520 --> 00:47:44,260
That would be a
possible way to --

644
00:47:44,260 --> 00:47:47,630
it's another way people have thought
of for solving these problems,

645
00:47:47,630 --> 00:47:55,800
is artificially throw
in a big number there,

646
00:47:55,800 --> 00:48:02,590
or even a small number, and
push things towards positive.

647
00:48:02,590 --> 00:48:10,630
Anyway, my purpose today
is essentially completed,

648
00:48:10,630 --> 00:48:15,780
that we're getting out of
a physical application,

649
00:48:15,780 --> 00:48:20,200
after I linearized and it
became a linear equation

650
00:48:20,200 --> 00:48:22,380
and it had that
saddle point form.

651
00:48:22,380 --> 00:48:26,420
So, saddle point form here,
saddle point form here,

652
00:48:26,420 --> 00:48:28,120
saddle point forms everywhere.

653
00:48:28,120 --> 00:48:30,710
I mean we'll have
the saddle point

654
00:48:30,710 --> 00:48:33,460
forms for differential
equations,

655
00:48:33,460 --> 00:48:36,390
as well as for matrix equations.

656
00:48:36,390 --> 00:48:44,590
So those are the examples
to sort of hang on to,

657
00:48:44,590 --> 00:48:53,350
and it's section 7.1 that
has a big part of it.

658
00:48:53,350 --> 00:48:58,980
Ah -- I always have
one last thing to say.

659
00:48:58,980 --> 00:49:00,440
What's the meaning of u?

660
00:49:03,180 --> 00:49:06,070
So, u was a Lagrange multiplier.

661
00:49:06,070 --> 00:49:08,470
Lagrange just like
helped us out by saying

662
00:49:08,470 --> 00:49:13,430
OK, deal with constraints by
using one of my multipliers.

663
00:49:13,430 --> 00:49:17,510
But the point is that
the multiplier always

664
00:49:17,510 --> 00:49:20,570
has a real meaning.

665
00:49:20,570 --> 00:49:24,980
I mentioned prices before.

666
00:49:24,980 --> 00:49:28,950
Here, what's the meaning of u?

667
00:49:28,950 --> 00:49:31,390
What's the physical meaning
of the Lagrange multiplier?

668
00:49:31,390 --> 00:49:36,220
It turns out to be the
displacement of the mass.

669
00:49:36,220 --> 00:49:41,740
It's the dual variable, so it
always has some interpretation.

670
00:49:41,740 --> 00:49:44,830
In this case, with
mechanics it's

671
00:49:44,830 --> 00:49:51,810
the amount the mass comes
down when the force acts.

672
00:49:51,810 --> 00:49:57,480
What's more, it also has a
derivative interpretation.

673
00:49:57,480 --> 00:50:01,600
Turns out -- I'll
just put turns out --

674
00:50:01,600 --> 00:50:09,370
that the Lagrange multiplier u
turns out to be the derivative

675
00:50:09,370 --> 00:50:16,300
of the minimum energy in
the system with respect

676
00:50:16,300 --> 00:50:20,640
to the source term.

677
00:50:20,640 --> 00:50:23,710
It's the sensitivity
of the problem somehow.

678
00:50:23,710 --> 00:50:27,620
I'll just use sensitivity.

679
00:50:27,620 --> 00:50:33,860
You often want to know -- it's
actually this quantity that we

680
00:50:33,860 --> 00:50:35,380
need over there.

681
00:50:35,380 --> 00:50:38,700
We want to know how much
does the answer depend

682
00:50:38,700 --> 00:50:44,010
on the source, and that's what
the Lagrange multiplier tells

683
00:50:44,010 --> 00:50:44,970
us.

684
00:50:44,970 --> 00:50:51,420
So, if I computed the solution
here, so the notes do this --

685
00:50:51,420 --> 00:50:53,850
maybe I'll leave
this for the notes.

686
00:50:53,850 --> 00:50:57,580
The notes solve
this little problem.

687
00:50:57,580 --> 00:50:59,820
That's easy to do.

688
00:50:59,820 --> 00:51:05,870
They figure out what the
energy is in the springs.

689
00:51:05,870 --> 00:51:08,200
It depends on the
right-hand side

690
00:51:08,200 --> 00:51:12,150
f, which is just a number here.

691
00:51:12,150 --> 00:51:14,330
The energy turns
out to be quadratic.

692
00:51:14,330 --> 00:51:18,830
You can take its derivative
and you find out that it's u.
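A quick numerical check of that claim, assuming the linear Hooke's law case, where the three equations give w_1 = c_1 u, w_2 = minus c_2 u, and u = f / (c_1 + c_2); function names and numbers are illustrative:
```python
# Check numerically that u equals d(E_min)/df for the two-spring model.
# Assumption: closed-form solution of the linear case with quadratic energies.
def solve(c1, c2, f):
    u = f / (c1 + c2)          # from w_1 - w_2 = f with w_1 = c1*u, w_2 = -c2*u
    return c1 * u, -c2 * u, u
#
def min_energy(c1, c2, f):
    w1, w2, _ = solve(c1, c2, f)
    return w1**2 / (2 * c1) + w2**2 / (2 * c2)
#
c1, c2, f, h = 2.0, 3.0, 10.0, 1e-6
u = solve(c1, c2, f)[2]
# Central difference approximation of the derivative of the minimum energy in f.
dE_df = (min_energy(c1, c2, f + h) - min_energy(c1, c2, f - h)) / (2 * h)
print(u, dE_df)
```
The two printed numbers should agree to within rounding, since the minimum energy works out to f squared over 2 (c_1 + c_2), which is quadratic in f.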

693
00:51:18,830 --> 00:51:25,630
So that Lagrange multiplier,
that's really a key message,

694
00:51:25,630 --> 00:51:28,850
is an important
quantity in itself.

695
00:51:28,850 --> 00:51:30,980
Here it happens to
mean displacement,

696
00:51:30,980 --> 00:51:33,580
which is obviously
a crucial quantity,

697
00:51:33,580 --> 00:51:36,970
and in general it
tells us the change

698
00:51:36,970 --> 00:51:40,850
in the minimum
energy with respect

699
00:51:40,850 --> 00:51:44,480
to a change in the input.

700
00:51:44,480 --> 00:51:49,670
And sensitivity's a natural
word to use for that.

701
00:51:49,670 --> 00:51:51,860
So that's a final
word about -- well,

702
00:51:51,860 --> 00:51:55,000
a near to final word about
Lagrange multipliers.

703
00:51:55,000 --> 00:51:59,280
So I'll see you Wednesday
and by that time

704
00:51:59,280 --> 00:52:02,600
I'll know more
about the projects

705
00:52:02,600 --> 00:52:08,481
and we'll be moving
onward with optimization.

706
00:52:08,481 --> 00:52:08,980
Good.

707
00:52:08,980 --> 00:52:10,230
Thanks.