The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR STRANG: OK. So this is Lecture 14, and it's also the lecture before the exam on Tuesday evening. I thought I would just go ahead and tell you what questions there are, so you could see. I haven't filled in all the numbers, but this will tell you -- it's a way of reviewing, of course, to sort of see the things that we've done. Which is quite a bit, really. And also, of course, the topics that will not be on the exam. You can see that they're not on the exam: for example, topics from 1.7 on the condition number, or 2.3 on Gram-Schmidt, which I can speak about a little bit. The only thing I didn't fill in is the end time there. I don't usually ask four questions, three is more typical. So it's a little bit longer, but it's not a difficult exam. And I never want time to be the essential ingredient. So 7:30 to 9 is the nominal time, but it would not be a surprise if you're still going on after 9 o'clock a little bit. And I'll try not to, like, tear the paper away from you. So just figure that you can move along at a reasonable speed, and you can bring papers, the book, anything with you.

And I'm open for questions now about the exam. You know where 54-100 is; it's a classroom, unfortunately. Sometimes we have the top of Walker, where you have a whole table to work on. This 54-100, in the tallest building of MIT out in the middle of the open space out there, is a large classroom. So if you kind of spread out your papers we'll be OK. It's a pretty big room. Questions. And, of course, don't forget this afternoon in 1-190 rather than here. So, the review is in 1-190, 4 to 5 today.
But if you're not free at that time, don't feel that you've missed anything essential. I thought this would be the right way to tell you what the exam is and guide your preparation for it. Questions. Yes, thanks.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: So these are four different questions. So this would be a question about getting to this matrix and what it's about, A transpose C A.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: OK, I don't use beam bending for that. I'm thinking of the elastic bar. Yeah, the stretching equation, yeah. So the stretching equation, the first one we ever saw, was just u'', or rather -u'', equal to 1. And now that allows a C to sneak in, and that allows a matrix C to sneak in there, but I think you'll see what they should look like, yeah.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: Yeah, yeah. So this is Section 2.2, directly out of the book. M will be the mass matrix, K will be the stiffness matrix, yep.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: OK. So, a good point to raise. In the very first section, 1.1, we gave the name K to a very special matrix, a specific one. But then later, now, I'm using the same letter K for matrices of that type. That was the most special, simplest, completely understood case, but now I'll use K for stiffness matrices, and when we're doing finite elements in a few weeks it'll again be K. Yeah, same name, right. So here you'll want to create K and M and know how to deal with them; this was our only time-dependent thing. So I guess what you're seeing here is not only what time-dependent equation will be there, but also that I'm not going in detail into those trapezoidal difference methods. Important as they are, we can't do everything on the quiz, so I'm really focusing on things that are central to our course. Good. Other questions. I'm very open for more questions this afternoon. Yep.
AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: Others. OK. So, let me -- I don't want to go on and do new material, because we're focused on these things. And this course, the name of this course, is computational science and engineering. And by the way, I just had an email last week from the Dean of Engineering -- a bunch of us did -- to say that the School of Engineering is establishing a Center for Computational Engineering, CCE. Several faculty members there, and others like myself in the School of Science and the Sloan School, are involved with computation, and this new center is going to organize that. So it's a good development. And it's headed by people in Course 2 and Course 16.

So, if we're talking about computations, I do have to say something about how you would actually do the computations, and what the issues about accuracy are. Speed and accuracy are what you're aiming for in the computations. Of course, the first step is to know what problem it is you want to compute. What do you want to solve, what's the equation? That's what we've been doing all along. Now, I just take a little time-out to say, suppose I have the equation. When I write K, I'm thinking of a symmetric, positive definite, or at least semi-definite, matrix. When I write A, I'm thinking of any general, usually tall, thin matrix. Rectangular. So I would need least squares for this guy, where straightforward elimination would work for that one. And so my first question is -- let's take this, so these are two topics for today. This one would come out of 1.7, that discussion of the condition number. This one would come out of 2.3, the least squares section. OK. And I'm thinking that the computational questions emerge when the systems are large. So I'm thinking thousands of unknowns here. Thousands of equations, at the least. OK.
So the question is, I do Gaussian elimination here, ordinary elimination. Backslash. And how accurate is the answer? And how do you understand that? I mean, the accuracy of the answer is going to depend on two things, and it's good to separate them. One is the method you use, like elimination, with whatever adjustments you might make. Pivoting. Exchanging rows to get larger pivots. All that is in the algorithm, in the code. And then the second, very important aspect is the matrix K itself. Is this a tough problem to solve, whatever method you're using, or is it a simple problem? Is the problem ill conditioned, meaning K would be nearly singular, in which case we would know we had a tougher problem to solve, whatever the method? Or is K quite well conditioned? I mean, the best conditioning would be when all the columns are unit vectors, all orthogonal to each other. Yeah, that would be the best conditioning of all. The condition number would be one if this K -- not too likely -- were a matrix that I would call Q. Q, which is going to show up over here in the second problem, stands for a matrix which has orthonormal columns. So, you remember what orthonormal means. Ortho is telling us perpendicular, that's the key point. Normal is telling us that they're unit vectors, lengths one. So that's the Q, and then you might ask what's the R. And the R is upper triangular. OK.

So what I said about this problem -- that there's the method you use, and also the sensitivity, the difficulty of the problem in the first place -- applies just the same here. There's the method you use: do you use A transpose A to find u hat? Now we're looking for u hat, of course, the best solution. Do I use A transpose A? Well, you would say of course, what else.
That equation, that least squares equation, has A transpose A u hat equal to A transpose b -- what's the choice? But if you're interested in high accuracy, and stability, numerical stability, maybe you don't go to A transpose A. Going to A transpose A kind of squares the condition number. You get an A transpose A, that'll be our K, but its condition number will somehow be squared, and if the problem is nice, you're OK with that. But if the problem is delicate? Now, what does delicate mean for Au=b? I'm kind of giving you an overview of the two problems before I start on this one, and then that one.

So with this one the problem was, is the matrix nearly singular. What does that mean? What does MATLAB tell you about the matrix? And that is measured by the condition number. The issue here is, when would this be a numerically difficult, sensitive problem? Well, when the columns of A are not orthonormal. If they are, then you're golden. If the columns of A are orthonormal, then you're all set. So what's the opposite? Well, the extreme opposite would be when the columns of A are dependent. If the columns of A are linearly dependent, and some column is a combination of other columns, you're in trouble right away. So that's like big trouble, that's like K being singular. Those are the extreme cases: K singular here, dependent columns here. Not full rank. So, again, we're supposing we're not facing disaster. Just near disaster. So we want to know, is K nearly singular, and how to measure that; and we want to know what to do when the columns of A are independent, but maybe not very. And that would show up in a large condition number for A transpose A. And this happens all the time; if you don't set up your problem well, your experimental problem, you can easily get matrices A whose columns are not very independent. Measured by A transpose A being close to singular.
Right, everybody here's got that idea. If the columns of A are independent, A transpose A is non-singular. In fact, positive definite. Now we're talking about when we have that property, but the columns of A are not very independent and the matrix A transpose A is not very invertible. OK, so that's what the two things are.

And then, say, on this one, what's the good thing to do? The good thing to do is to call the qr code, which gets its name because it takes the matrix A and factors it. Of course, we all know that lu is the code here; it factors K. And qr is the code that factors A into a very good guy, an optimal Q -- it couldn't be beat -- and an R that's upper triangular, and therefore in the simplest form, so you see exactly what you're dealing with.

Let me continue this least squares idea, because Q and R are probably not so familiar. Maybe the name Gram-Schmidt is familiar? How many have seen Gram-Schmidt? The Gram-Schmidt idea I'll describe quickly, but do those names mean anything to you? Yes, for quite a few. But not all. OK. And can I just say, Gram-Schmidt is kind of our name for getting these two factors. And you'll see why it's very cool to have, why this is a good first step. It costs a little to take that step, but if you're interested in safety, take it. It might cost twice as much as solving the A transpose A equation, so you double the cost by going this safer route. And double is not a big deal, usually. OK.

So, I was going to say that Gram-Schmidt, that's the name everybody uses. But actually their method is no longer the winner. In Section 2.3 there is, and I'll try to describe, a slightly better method than the Gram-Schmidt idea to arrive at Q and R. But let's suppose you got to Q and R. Then, what would be the least squares equation? A transpose A u hat is A transpose b, right?
That's the equation everybody knows. But now if we have A factored into Q times R, let me see how that simplifies. So now A is QR, and A transpose, of course, is R transpose Q transpose. So the equation becomes R transpose Q transpose Q R u hat, equal to R transpose Q transpose b. Same equation; I'm just supposing that I've got A into this nice form, where I've taken these columns that possibly lined up too close to each other -- like, you know, angles of one degree -- and I've got better angles. These columns of A are too close, so I spread them out to columns of Q that are at 90 degrees. Orthogonal columns.

Now, what's the deal with orthogonal columns? Let me just remember the main point about Q. It has orthonormal columns, right, and I'll call those q's, q_1 to q_n. OK. And the good deal is, what happens when I do Q transpose Q? So I have q_1 transpose, these are now rows, down to q_n transpose, times the columns q_1 to q_n. And what do I get when I multiply those matrices, Q transpose times Q? I get I. q_1 transpose q_1, that's the length of q_1 squared, is one, and q_1 is orthogonal to all the others. And then q_2, you see, I get the I. q_3 -- I get the n by n identity matrix. Q transpose Q is I. That's the beautiful fact, just remember that. And use it right away. You see where it's used: Q transpose Q is I in the middle of that. So I can just delete that; I just have R transpose R u hat equal to R transpose Q transpose b, and I can even simplify this further. What can I do now? I have an R transpose on both sides, so what am I left with? I'll multiply both sides by R transpose inverse, and that will lead me to R u hat equals, knocking out the R transpose inverse on both sides, Q transpose b. Well, our least squares equation has become completely easy to solve. We've got a triangular matrix here, so it's just back substitution. It's just back substitution now, with a Q transpose b over there.
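Here is a minimal sketch of those two routes in MATLAB, since that's the system the lecture mentions. The matrix A and right side b below are made up for illustration, and qr(A,0) is the economy-size factorization.

```matlab
% Two routes to the least squares solution u_hat (illustrative data).
m = 100; n = 5;
A = rand(m, n);               % tall, thin matrix
b = rand(m, 1);

% Route 1: the normal equations A'A u = A'b (condition number gets squared).
u_normal = (A'*A) \ (A'*b);

% Route 2: factor A = QR, use Q'Q = I, then R u = Q'b is back substitution.
[Q, R] = qr(A, 0);            % economy QR: Q is m by n, R is n by n upper triangular
u_qr = R \ (Q'*b);

% Both should agree with backslash, which itself uses an orthogonal factorization.
disp(norm(u_normal - A\b));
disp(norm(u_qr - A\b));
```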
So a very simple solution for our equation, after the initial work of A=QR. OK. But very safe; Q is a great matrix to work with. In fact, codes are written so as to use orthogonal matrices Q as often as they can.

Alright, so you've had a look ahead at the computational side of 2.3. Let me come back to the most basic equations, just symmetric, positive definite equations, Ku=f, and consider: how do we measure whether K is nearly singular? Let me just ask that question. That's the central question. How to measure when K, which we're assuming to be symmetric positive definite, is nearly singular? How to know whether we're in any danger or not? OK.

Well, first you might think, OK, if it is singular its determinant is zero. So why not take its determinant? Well, determinants, as we've said, are not a good idea numerically. First, they're not fun to compute. Second, they depend on the number of unknowns, right? Suppose K is twice the identity matrix. You could not get a better problem than that, right? If K was twice the identity matrix, the whole thing's simple. Or suppose K is one millionth of the identity matrix. OK, again, that's a perfect problem, right? If K is one millionth of the identity matrix, well, to solve the problem you just multiply by a million, and you've got the answer. So those are good. And we have to have some measure of bad or good that tells us those are good. OK.

So the determinant won't do. Because the determinant of 2I would be two to the n, where n is the size of the matrix. Or the determinant of one millionth of the identity would be one millionth to the n. Those are not numbers we want. What's a better number? Maybe you could suggest a better number to measure how close the matrix is to being singular.
What would you say? I think if you think about it a little -- so what numbers do we know? Well, eigenvalues jump to mind. Because this matrix K, being symmetric positive definite, has eigenvalues, say lambda_1 less than or equal to lambda_2, and so on, up to lambda_n. So that top one is lambda_max, and lambda_1 is lambda_min, and they're all positive. And so what's your idea of whether the thing's nearly singular now? Look at lambda_1, right? If lambda_1 is near zero, that somehow indicates near singular. So lambda_1 is sort of a natural test. Not that I intend to compute lambda_1 -- that would take longer than solving the system -- but an estimate of lambda_1 would be enough. OK.

But my answer is not just lambda_1. And why is that? Because of the examples I gave you. When I had twice the identity, what would lambda_1 be in that case? If my matrix K was beautiful, twice the identity matrix, lambda_1 would be two. All the eigenvalues of twice the identity are two. Now if my matrix was one millionth of the identity, again I have a beautiful problem. Just as good, just as beautiful a problem. What's lambda_1 for that one? One millionth. It looks much more singular, but that's not really there.

So you could say, well, scale your matrix. And scaling the matrix, in fact scaling individual rows and columns, matters; your unknowns might be somehow in the wrong units, so one of the answers is way big and the second component is way small. That's not good. So scaling is important. But even then you still end up with a matrix K, some eigenvalues, and I'll tell you the condition number. The condition number of K is the ratio of this guy to this one, lambda_max to lambda_min. In other words, two K, or a million K, or one millionth K, all have the same condition number. Because those problems are identical problems.
Multiplying by two, multiplying by a million, dividing by a million didn't change reality there. So if we're in floating point, it just didn't change. So the condition number is going to be lambda_max over lambda_min. And this is for symmetric positive definite matrices. And MATLAB will print out that number -- or print an estimate for that number; as I said, we don't want to compute it exactly. Lambda_max over lambda_min. That measures how sensitive, how tough your problem is. OK.

And then I have to think, how does that come in, why is that an appropriate number? I've tried to give an instinct for why it's appropriate, but we can be pretty specific about it. In fact, let's do that now. So what would be the condition number of twice the identity? It would be one. A perfectly conditioned problem. What would be the condition number of a diagonal matrix? Suppose K was the diagonal matrix with two, three, four. The condition number of that matrix is two, right? Lambda_max is sitting there, lambda_min is sitting there, and the ratio is two. Of course, any condition number under 100 or 1000 is no problem. What's the rule of thumb? I think the number of digits in the condition number -- maybe if the condition number was 1000 you would be taking a chance on your last three digits; in single precision that would be three out of six or so. Somehow the log of the condition number, the number of digits in it, is some measure of the number of digits you'd lose. Because you're doing floating point, of course, here. So that diagonal matrix is a totally well conditioned matrix; I wouldn't touch that one. I mean, that's just fine.

But here's the point that I should make, because here's the computational science point. When this is our special K, our -1, 2, -1 matrix of size n, the condition number goes like n squared.
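A quick numeric check of those statements -- a sketch only; the sizes are my choices, and K is built here with MATLAB's toeplitz as the standard -1, 2, -1 second-difference matrix:

```matlab
% Scaling a matrix doesn't change its condition number.
n = 100;
disp(cond(2*eye(n)));          % 1: perfectly conditioned
disp(cond(eye(n)/1e6));        % 1 as well, even though lambda_min is tiny

% The -1, 2, -1 second-difference matrix: condition number grows like n^2.
for n = [100 200 400]
    K = toeplitz([2 -1 zeros(1, n-2)]);    % tridiagonal -1, 2, -1 matrix
    fprintf('n = %4d   cond(K) = %10.1f   cond(K)/n^2 = %.3f\n', ...
            n, cond(K), cond(K)/n^2);
end
```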
Because we know the eigenvalues of that matrix, we can see it. Say n is 1000, and we're dealing with our standard second difference matrix, the most important example I could possibly present. The exact eigenvalues are actually there in Section 1.5; we didn't do them in detail, and we'll probably come back to them when we need them. But the largest one is about four. And the smallest one is pretty small -- the smallest one is like the sine squared of a small angle, so the smallest eigenvalue is of order 1/n squared. And then when I take that ratio, lambda_max is just about four and lambda_min is like 1/n squared, quite small, so the ratio gives me the n squared.

So there's an indication. Basically, that's not bad. If n is 1000, in most engineering problems that still gives you extremely good accuracy; a condition number of a million you could live with. If n is 100, more typical, a condition number of 10,000 is basically OK, I think, and I would go with it. But if the condition number is way up, then I'd think again: did I model the problem well? OK.

Alright. So now I have to tell you, why is this the appropriate number? How do you look at the error? So can I write down a way of approaching this Ku=f? This is the first time I've used the words round-off error. In all the calculations that you have to do to get to u -- those row operations, and you're doing them to the right side too -- all those are floating point operations in which small errors are sneaking in. And it was very unclear, in the early years, whether the millions and millions of operations that you do in elimination -- additions, subtractions, multiplications -- could add up. If they don't cancel, you've got problems, right?
But in general you would expect that these are just round-off errors; you're making them millions and millions of times, and it would be pretty bad luck -- I mean, like Red Sox twelfth inning bad luck -- to have them pile up on you. So you don't expect that.

Now, what you actually compute: this is the exact equation, Ku=f. The computed one has an error; I'll call it delta u. That's our error. And it's equal to an f plus delta f. So the computed equation is K times u plus delta u, equal to f plus delta f. This delta f is our round-off error, the error we make, and delta u is the error in the answer. In the final answer. OK, now I would like to have an error equation, an equation for that error delta u, because that's what I'm trying to get an idea of. No problem. If I subtract the exact equation from this equation, I have a simple error equation: K delta u equals delta f. So this is my error equation. OK.

So I want to estimate the size of that error, compared to the exact. You might say, and you would be right, well, wait a minute, as you do all these operations you're also creating errors in K. So I could have a K plus delta K here, too. And actually it wouldn't be difficult to deal with, and it would certainly be there in a proper error analysis. And it wouldn't make a big difference; the condition number would still be the right measure. So let me concentrate here on the error in f, where subtracting one equation from the other gives me this simple error equation.

So my question is, when is that error, delta u, big? When do I get a large error? And delta f I'm not controlling. I might control the size, but the details of it I can't know. So for delta f I'll take the worst possible here. Suppose it is of some small size, ten to the minus something, times some vector of errors, but I don't know anything about that vector, and therefore I'd better take the worst possibility. What would be the worst possibility?
What right hand side would give me the biggest delta u? Yeah, maybe that's the right question to ask. So now we're being a little pessimistic. We're saying: what right hand side, what set of errors in the measurements or from the calculations, would give me the largest delta u?

Well, let's see. I'm thinking the worst case would be if delta f was an eigenvector with the smallest eigenvalue, right? If delta f is x_1, the eigenvector that goes with lambda_1 -- the worst case would be for that to be the first eigenvector. That would be the worst direction. Of course, it would be multiplied by some little number. Epsilon is every mathematician's idea of a little number. OK. So delta f is epsilon x_1; then what is delta u? What would be the solution to that equation, if the right-hand side was epsilon times an eigenvector? This is the whole point of eigenvectors. You can tell me what the solution is. Is it a multiple of that eigenvector? You bet. If this is an eigenvector, then I can put in the same eigenvector there; I just have to scale it properly. So delta u will be just that right side, and what do I need? I think I just need lambda_1: delta u is epsilon x_1 over lambda_1. The worst K inverse can be is like 1/lambda_1. If I claim that that's the answer -- if the right hand side is in the worst direction, then the answer is that same right hand side divided by lambda_1 -- let me just check. If I multiply both sides by K, I have K delta u; and what's K times x_1? Everybody with me? What's K*x_1? Lambda_1*x_1, so the lambda_1's cancel, and I get the epsilon x_1 I want. So, no surprise. That's just telling us that the worst error is an error in the direction of the low eigenvector, and that error gets amplified by 1/lambda_1.
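A small numeric sketch of that amplification -- my own example, with K the -1, 2, -1 matrix again and a made-up perturbation size epsilon:

```matlab
% Perturb the right side along the bottom and top eigenvectors of K
% and compare how strongly the solution responds.
n = 50;
K = toeplitz([2 -1 zeros(1, n-2)]);     % -1, 2, -1 matrix (symmetric positive definite)
[X, D] = eig(K);
[lambda, idx] = sort(diag(D));          % sort eigenvalues: lambda(1) = lambda_min
X = X(:, idx);

eps_ = 1e-6;                            % made-up size of the round-off error

du_worst = K \ (eps_ * X(:,1));         % delta f along x_1: amplified by 1/lambda_min
du_mild  = K \ (eps_ * X(:,n));         % delta f along x_n: amplified only by 1/lambda_max

fprintf('delta f along x_1: ||delta u|| = %.3e  (eps/lambda_min = %.3e)\n', ...
        norm(du_worst), eps_/lambda(1));
fprintf('delta f along x_n: ||delta u|| = %.3e  (eps/lambda_max = %.3e)\n', ...
        norm(du_mild),  eps_/lambda(n));
```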
So that's brought lambda_1, lambda_min, into it, in the denominator. Now, here's another point. Second point now. So that would be the absolute error. But we saw for those factors of two and one millionth and so on, that really it's the relative error. So I want to estimate not the absolute error, delta u, but the error delta u relative to u itself. So that if I scale the whole problem, my relative error wouldn't change. So, in other words, what I want to do is ask in this case, what's the right hand side? How big, yeah, I know, I want to know how small u could be, right? I'm shooting for the worst. The relative error is the size of the error relative to the size of u. And I want to know how big that could be. OK.

So now I know how big delta u could be, it could be that big. But u itself, how big could u be? How small could u be, right? u's in the denominator. So if I'm trying to make this big, I'll try to make that small. So when is u the smallest? Over there I said when is delta u the biggest, now I'm going to say when is u the smallest? What f would point me in the direction in which u was the smallest? Got to be the other eigenvector. This end. The worst case would be when this is in the direction of x_n. The top eigenvector. In that case, what is u? So I'm saying the worst f is the one that makes u smallest, and the worst delta f is the one that makes delta u biggest. I'm going for the worst case here, so if the right side is x_n, what is u? x_n over lambda_n. Because when I multiply by K, K times x_n brings me a lambda_n, cancel that lambda_n, I get it right. So there is the smallest u, and here is the largest delta u. And the epsilon is coming from the method we use, so that's not involved with the matrix K.
So do you see, up there, if I'm trying to estimate delta u over u, that's big. The size of delta u over the size of u is what? Delta u has some epsilon, which measures the machine precision -- the number of digits we're keeping, the word length and so on -- times a unit vector, over lambda_1. And u is down here in the denominator, x_n over lambda_n, so the lambda_n flips up. Do you see it? By taking the worst case, I've got the worst relative error. For other f's and delta f's, they won't be the very worst ones. But here I've written down the worst. And that's the reason that this is the condition number.

So I'm speaking about topics that are there in 1.7, trying to give you the main point. The main point is: look at relative error, because that's the right thing to look at. Look at the worst cases, which are in the directions of the top and bottom eigenvectors. In that case, the relative error has this condition number, lambda_n/lambda_1, in it, and that's the good measure for how singular the matrix is. So one millionth of the identity is not a nearly singular matrix, because lambda_max and lambda_min are equal; that's a perfectly conditioned matrix. This diagonal matrix has condition number two, 4/2. It's quite good. This second difference matrix is getting worse, with an n squared in there, if n is big, and other matrices could be worse than that. OK. So that's my discussion of condition numbers.

I'll add one more thing. These eigenvalues were a good measure when my matrix was symmetric positive definite. If I have a matrix that I would never call K -- a matrix like one, one, zero and a million -- OK, I would never write K for that. I would shoot myself first before writing K. So what are the eigenvalues, lambda_min and lambda_max, for that matrix? Are you up now on eigenvalues?
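Before the triangular example, here is the worst-case argument just made, written out in symbols (c(K) denotes the condition number lambda_max/lambda_min):

\[
K\,\delta u = \delta f, \qquad
\delta f = \varepsilon\,x_1 \;\Rightarrow\; \delta u = \frac{\varepsilon\,x_1}{\lambda_{\min}}, \qquad
f = x_n \;\Rightarrow\; u = \frac{x_n}{\lambda_{\max}},
\]
\[
\text{so}\qquad
\frac{\|\delta u\|}{\|u\|} = \frac{\varepsilon/\lambda_{\min}}{1/\lambda_{\max}}
= \frac{\lambda_{\max}}{\lambda_{\min}}\,\varepsilon = c(K)\,\varepsilon,
\qquad\text{and in general}\qquad
\frac{\|\delta u\|}{\|u\|} \;\le\; c(K)\,\frac{\|\delta f\|}{\|f\|}.
\]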
We haven't done a lot with eigenvalues, but triangular matrices are really easy. The eigenvalues of that matrix are? One and one. The condition number should not be 1/1 -- that would be bad. So instead, if this was my matrix and I wanted to know its condition number, what would I do? How would I define the condition number of A? You know what I do whenever I have a matrix that is not symmetric. I get to a symmetric matrix by forming A transpose A. I get a K, I take its condition number by my formula, which I like in the symmetric case, and then I take the square root. So that would be a pretty big number. For that matrix A, the condition number is up around 10^6, even though its eigenvalues are one and one, because when I form A transpose A, those eigenvalues jump all over. This thing will have eigenvalues way up, and the condition number will be high. OK.

So that's a little preview; you'll meet the condition number. MATLAB shows it, and you naturally wonder, what is this? Well, if it's a positive definite matrix, it's just the ratio of lambda_max to lambda_min, and as it gets bigger it tells you that the matrix is tougher to work with. OK.

We have just five minutes left to say a few words about QR. Can I do that in just a few minutes? And much more is in the codes. OK, what's the deal with QR? I'm starting with a matrix A. Let's make it two by two. OK, it's got a couple of columns. Can I draw them? So it's got a column there, that's its first column, and it's got another column there; maybe that's not a very well conditioned matrix. Those are the columns of A, plotted in the plane, two-space. OK, so now the Gram-Schmidt idea is: out of those columns, get orthonormal columns. Get from A to Q.
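Before the Gram-Schmidt picture, here is a quick check of that point about non-symmetric matrices. The entries below are my own illustrative choice (a triangular matrix with ones on the diagonal and one huge off-diagonal entry), not necessarily the numbers on the board:

```matlab
% A triangular matrix with both eigenvalues equal to 1 can still be badly conditioned.
A = [1 1e3; 0 1];             % illustrative entries, chosen so cond(A) is about 10^6

disp(eig(A));                 % both eigenvalues are 1
disp(cond(A));                % roughly 10^6: ratio of largest to smallest singular value
disp(sqrt(cond(A'*A)));       % the same number, via the symmetric matrix A'*A
```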
So the Gram-Schmidt idea is: out of these two vectors, two axes that are not at 90 degrees, produce vectors that are at 90 degrees. Actually, you can guess how you're going to do it. Let me say, OK, I'll settle for that direction; that can be my first direction, q_1. What should q_2 be? If that direction is the right one for q_1, I say OK, I'll settle for that -- what's the q_2 guy? Well, what am I going to do? I mean, Gram thought of it and Schmidt thought of it. Schmidt was a little later, but it wasn't, like, that hard to think of. What do you do here? Well, we know how to do projections from this least squares stuff. What am I looking for? Subtract off the projection, right. Take the projection and subtract it off, and be left with the component that's perpendicular. So this will be the q_1 direction, and this guy e, what we called e, would tell me the q_2 direction. And then we make those into unit vectors and we'd be golden.

And if we did that, we would discover that the original a_1 and a_2 -- the original first column and the original second column -- would be the good q_1 and the good q_2 times some matrix R. So here's our A=QR. It's your chance to see this second major factorization of linear algebra, LU being the first, QR being the second. So what's up? Well, compare first columns. First columns: I didn't change direction. So all I have here is some scaling r_(1,1), and a zero. Some number times q_1 is a_1. That direction was fine. The second direction: q_2 and a_2, those involve also a_1. So there's an r_(1,2) and an r_(2,2) there. The point is, this came out triangular. And that's what makes things good. It came out triangular because of the order that Gram and Schmidt worked in. Gram and Schmidt settled the first one in the first direction. Then they settled the first two in the first two directions.
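Here is a minimal sketch of that two-column picture in code -- my own illustration of classical Gram-Schmidt, with made-up columns; the built-in qr, as mentioned in a moment, uses Householder instead:

```matlab
% Classical Gram-Schmidt for two columns: keep the direction of a_1,
% subtract the projection of a_2 onto it, normalize what's left.
a1 = [2; 1];  a2 = [1; 2];            % made-up columns of a 2-by-2 matrix A

r11 = norm(a1);
q1  = a1 / r11;                       % first unit vector, same direction as a_1

r12 = q1' * a2;                       % how much of a_2 lies along q_1
e   = a2 - r12 * q1;                  % the perpendicular component (our "e")
r22 = norm(e);
q2  = e / r22;                        % second unit vector, at 90 degrees to q_1

Q = [q1 q2];
R = [r11 r12; 0 r22];                 % upper triangular, because of the order of the steps
disp(norm([a1 a2] - Q*R));            % check: A = QR up to round-off
```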
If we were in three dimensions, there'd be an a_3 somewhere here, coming out of the board. And then q_3 would come straight out of the board. Right? If you just see that, you've got Gram-Schmidt completely. a_1 is there, so is q_1. a_2 is there. I'm in the board still: the plane of a_1 and a_2 is the plane of q_1 and q_2, I'm just getting right angles in. a_3, the third column in a three by three case, is coming out at some angle. I want q_3 to come out at a 90 degree angle. So q_3 will involve some combination of all the a's. So if it was three by three, this would grow to q_1, q_2, q_3, and this R would then have three entries in its third column. But maybe you see that picture. So that's what Gram-Schmidt achieves.

And I just can't let time run out without saying that this is a pretty good way. Actually, nobody thought there was a better one for centuries. But then a guy named Householder came up with a different way, a numerically slightly better way. So this is the Gram-Schmidt way. Can I just put those words up here? So there's the classical Gram-Schmidt idea, which was what I described -- the easy one, easy to describe. And then there's a method called Householder, named after him, that MATLAB follows; every good qr code now uses Householder matrices. It achieves the same result. And if I had a little bit more time I could draw a picture of what it does. But there you go.

So that's my quick lecture on numerical linear algebra, these two essential points, and I'll see you this afternoon. Let me bring those quiz questions down again, for any discussion about the quiz.