1
00:00:07,460 --> 00:00:09,910
OK.

2
00:00:09,910 --> 00:00:15,260
Here's lecture sixteen
and if you remember

3
00:00:15,260 --> 00:00:21,150
I ended up the last lecture
with this formula for what

4
00:00:21,150 --> 00:00:24,610
I called a projection matrix.

5
00:00:24,610 --> 00:00:31,610
And maybe I could just
recap for a minute what

6
00:00:31,610 --> 00:00:35,010
is that magic formula doing?

7
00:00:35,010 --> 00:00:38,390
For example, it's
supposed to be --

8
00:00:38,390 --> 00:00:40,230
it's supposed to
produce a projection,

9
00:00:40,230 --> 00:00:44,770
if I multiply by a b,
so I take P times b,

10
00:00:44,770 --> 00:00:51,240
I'm supposed to project that
vector b to the nearest point

11
00:00:51,240 --> 00:00:54,150
in the column space.

12
00:00:54,150 --> 00:00:54,800
OK.

13
00:00:54,800 --> 00:00:56,350
Can I just --

14
00:00:56,350 --> 00:01:01,420
one way to recap is to
take the two extreme cases.

15
00:01:01,420 --> 00:01:05,040
Suppose a vector b is
in the column space?

16
00:01:05,040 --> 00:01:10,030
Then what do I get when
I apply the projection P?

17
00:01:10,030 --> 00:01:13,610
So I'm projecting
into the column space

18
00:01:13,610 --> 00:01:18,780
but I'm starting with a vector
in this case that's already

19
00:01:18,780 --> 00:01:20,950
in the column
space, so of course

20
00:01:20,950 --> 00:01:25,860
when I project it I
get b again, right.

21
00:01:25,860 --> 00:01:29,860
And I want to show you how
that comes out of this formula.

22
00:01:29,860 --> 00:01:32,550
Let me do the other extreme.

23
00:01:32,550 --> 00:01:35,350
Suppose that vector is
perpendicular to the column

24
00:01:35,350 --> 00:01:36,210
space.

25
00:01:36,210 --> 00:01:38,860
So imagine this column
space as a plane

26
00:01:38,860 --> 00:01:42,550
and imagine b as sticking
straight up perpendicular

27
00:01:42,550 --> 00:01:43,490
to it.

28
00:01:43,490 --> 00:01:50,490
What's the nearest point in the
column space to b in that case?

29
00:01:50,490 --> 00:01:54,380
So what's the projection
onto the plane,

30
00:01:54,380 --> 00:01:57,980
the nearest point in the
plane, if the vector b that

31
00:01:57,980 --> 00:02:02,000
I'm looking at is -- got no
component in the column space,

32
00:02:02,000 --> 00:02:05,050
it's sticking completely
-- ninety degrees with it,

33
00:02:05,050 --> 00:02:10,220
then Pb should be zero, right.

34
00:02:10,220 --> 00:02:13,040
So those are the
two extreme cases.

35
00:02:13,040 --> 00:02:18,000
The average vector has a
component P in the column space

36
00:02:18,000 --> 00:02:20,930
and a component
perpendicular to it,

37
00:02:20,930 --> 00:02:25,930
and what the projection
does is it kills this part

38
00:02:25,930 --> 00:02:29,510
and it preserves this part.

39
00:02:29,510 --> 00:02:30,010
OK.

40
00:02:30,010 --> 00:02:32,230
Can we just see why that's true?

41
00:02:32,230 --> 00:02:37,140
Just -- that formula
ought to work.

42
00:02:37,140 --> 00:02:41,010
So let me start with this one.

43
00:02:41,010 --> 00:02:44,260
What vectors are in the -- are
perpendicular to the column

44
00:02:44,260 --> 00:02:45,240
space?

45
00:02:45,240 --> 00:02:48,220
How do I see that
I really get zero?

46
00:02:48,220 --> 00:02:50,850
I have to think, what does
it mean for a vector b

47
00:02:50,850 --> 00:02:54,410
to be perpendicular
to the column space?

48
00:02:54,410 --> 00:02:59,430
So if it's perpendicular
to all the columns,

49
00:02:59,430 --> 00:03:02,100
then it's in some other space.

50
00:03:02,100 --> 00:03:05,740
We've got our four spaces so
the reason I do this is it's

51
00:03:05,740 --> 00:03:10,030
perfectly using what we
know about our four spaces.

52
00:03:10,030 --> 00:03:13,860
What vectors are perpendicular
to the column space?

53
00:03:13,860 --> 00:03:19,190
Those are the guys in the
null space of A transpose,

54
00:03:19,190 --> 00:03:20,290
right?

55
00:03:20,290 --> 00:03:22,740
That's the first
section of this chapter,

56
00:03:22,740 --> 00:03:26,160
that's the key geometry
of these spaces.

57
00:03:26,160 --> 00:03:28,300
If I'm perpendicular
to the column space,

58
00:03:28,300 --> 00:03:30,961
I'm in the null
space of A transpose.

59
00:03:30,961 --> 00:03:31,460
OK.

60
00:03:31,460 --> 00:03:33,790
So if I'm in the null
space of A transpose,

61
00:03:33,790 --> 00:03:41,760
and I multiply this big formula
times b, so now I'm getting Pb,

62
00:03:41,760 --> 00:03:48,530
this is now the projection,
Pb, do you see that I get zero?

63
00:03:48,530 --> 00:03:50,470
Of course I get zero.

64
00:03:50,470 --> 00:03:52,950
Right at the end
there, A transpose b

65
00:03:52,950 --> 00:03:54,840
will give me zero right away.

66
00:03:54,840 --> 00:03:57,780
So that's why that zero's here.

67
00:03:57,780 --> 00:04:00,890
Because if I'm perpendicular
to the column space, then

68
00:04:00,890 --> 00:04:03,840
I'm in the null space of A
transpose and A transpose

69
00:04:03,840 --> 00:04:08,640
b is zilch.

70
00:04:08,640 --> 00:04:09,970
OK, what about the other possibility?

71
00:04:09,970 --> 00:04:13,370
How do I see that this formula
gives me the right answer

72
00:04:13,370 --> 00:04:15,250
if b is in the column space?

73
00:04:18,230 --> 00:04:21,890
So what's a typical vector
in the column space?

74
00:04:21,890 --> 00:04:24,480
It's a combination
of the columns.

75
00:04:24,480 --> 00:04:27,240
How do I write a
combination of the columns?

76
00:04:27,240 --> 00:04:31,090
So tell me, how would
I write, you know,

77
00:04:31,090 --> 00:04:34,440
your everyday vector
that's in the column space?

78
00:04:34,440 --> 00:04:38,860
It would have the form
A times some x, right?

79
00:04:38,860 --> 00:04:42,280
That's what's in the column
space, A times something.

80
00:04:42,280 --> 00:04:44,730
That makes it a
combination of the columns.

81
00:04:44,730 --> 00:04:49,570
So these b's were in the
null space of A transpose.

82
00:04:49,570 --> 00:04:54,640
These guys in the column
space, those b's are Ax-s.

83
00:04:54,640 --> 00:04:55,210
Right?

84
00:04:55,210 --> 00:04:58,825
If b is in the column space
then it has the form Ax.

85
00:05:01,380 --> 00:05:04,450
I'm going to stick that on the
quiz or the final for sure.

86
00:05:04,450 --> 00:05:08,290
That you have to realize --
because we've said it like

87
00:05:08,290 --> 00:05:13,020
a thousand times that the things
in the column space are vectors

88
00:05:13,020 --> 00:05:14,241
A times x.

89
00:05:14,241 --> 00:05:14,740
OK.

90
00:05:14,740 --> 00:05:18,050
And do you see what happens
now if we use our formula?

91
00:05:18,050 --> 00:05:19,990
There's an A transpose A.

92
00:05:19,990 --> 00:05:21,860
Gets canceled by its inverse.

93
00:05:21,860 --> 00:05:25,800
We're left with an A times x.

94
00:05:25,800 --> 00:05:27,530
So the result was Ax.

95
00:05:27,530 --> 00:05:28,570
Which was b.

96
00:05:28,570 --> 00:05:30,100
Do you see that it works?

97
00:05:30,100 --> 00:05:32,750
This is that whole business.

98
00:05:32,750 --> 00:05:35,870
Cancel, cancel, leaving Ax.

99
00:05:35,870 --> 00:05:37,840
And Ax was b.

100
00:05:37,840 --> 00:05:43,730
So that turned out to
be b, in this case.
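
Written out, the two extreme cases look like this. This is a sketch that assumes "this formula" on the board is the projection matrix $P = A(A^{T}A)^{-1}A^{T}$ from the previous lecture; the transcript itself never spells it out.

```latex
\begin{aligned}
A^{T}b = 0 \;&\Longrightarrow\; Pb = A(A^{T}A)^{-1}\,(A^{T}b) = 0,\\
b = Ax \;&\Longrightarrow\; Pb = A(A^{T}A)^{-1}(A^{T}A)\,x = Ax = b.
\end{aligned}
```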

101
00:05:43,730 --> 00:05:51,300
OK, so geometrically what we're
seeing is we're taking a vector

102
00:05:51,300 --> 00:05:53,010
--

103
00:05:53,010 --> 00:06:00,770
we've got the column space
and perpendicular to that

104
00:06:00,770 --> 00:06:06,350
is the null space
of A transpose.

105
00:06:06,350 --> 00:06:10,230
And our typical
vector b is out here.

106
00:06:10,230 --> 00:06:12,970
There's zero, so there's
our typical vector b,

107
00:06:12,970 --> 00:06:19,460
and what we're doing is we're
projecting it to P. And the --

108
00:06:19,460 --> 00:06:22,300
and of course at the same time
we're finding the other part

109
00:06:22,300 --> 00:06:24,810
of it which is e.

110
00:06:24,810 --> 00:06:30,520
So the two pieces, the
projection piece and the error

111
00:06:30,520 --> 00:06:35,280
piece, add up to the original b.

112
00:06:35,280 --> 00:06:36,230
OK.

113
00:06:36,230 --> 00:06:39,520
That's like what
our matrix does.

114
00:06:39,520 --> 00:06:41,440
So this is P --

115
00:06:41,440 --> 00:06:48,260
P is -- this P is Ab, is sorry
-- is Pb, it's the projection,

116
00:06:48,260 --> 00:06:52,590
applied to b, and this one is --

117
00:06:52,590 --> 00:06:55,080
OK, that's a projection too.

118
00:06:55,080 --> 00:06:58,070
That's a projection
down onto that space.

119
00:06:58,070 --> 00:06:59,860
What's a good formula for it?

120
00:06:59,860 --> 00:07:05,340
Suppose I ask you for the
projection of the projection

121
00:07:05,340 --> 00:07:08,830
matrix onto the --

122
00:07:08,830 --> 00:07:13,240
this space, this
perpendicular space?

123
00:07:13,240 --> 00:07:16,960
So if this projection
was P, what's

124
00:07:16,960 --> 00:07:21,070
the projection that gives me e?

125
00:07:21,070 --> 00:07:24,170
It's the -- what I want is to
get the rest of the vector,

126
00:07:24,170 --> 00:07:30,790
so it'll be just I minus P times
b, that's a projection too.

127
00:07:30,790 --> 00:07:35,880
That's the projection onto
the perpendicular space.

128
00:07:38,950 --> 00:07:40,040
OK.

129
00:07:40,040 --> 00:07:44,150
So if P's a projection, I
minus P is a projection.

130
00:07:44,150 --> 00:07:47,790
If P is symmetric, I
minus P is symmetric.

131
00:07:47,790 --> 00:07:52,290
If P squared equals P, then (I
minus P) squared equals I minus

132
00:07:52,290 --> 00:07:55,690
P. It's just --

133
00:07:55,690 --> 00:08:00,460
the algebra -- is only
doing what your --

134
00:08:00,460 --> 00:08:05,270
picture is completely
telling you.
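
For the record, the algebra behind those two claims is short. It uses only the two properties of P that the lecture names, symmetry and P squared equals P:

```latex
\begin{aligned}
(I - P)^{T} &= I - P^{T} = I - P,\\
(I - P)^{2} &= I - 2P + P^{2} = I - 2P + P = I - P.
\end{aligned}
```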

135
00:08:05,270 --> 00:08:08,122
But the algebra leads
to this expression.

136
00:08:11,820 --> 00:08:16,280
That expression for P given --

137
00:08:16,280 --> 00:08:19,810
given a basis for
the subspace, given

138
00:08:19,810 --> 00:08:25,690
the matrix A whose columns are
a basis for our column space.

139
00:08:25,690 --> 00:08:28,820
OK, that's recap because you
-- you need to see that formula

140
00:08:28,820 --> 00:08:30,460
more than once.

141
00:08:30,460 --> 00:08:34,669
And now can I pick
up on using it?

142
00:08:34,669 --> 00:08:37,789
So now -- and the --

143
00:08:37,789 --> 00:08:46,590
it's like, let me do that again,
I'll go right through a problem

144
00:08:46,590 --> 00:08:52,470
that I started at the end, which
is find a best straight line.

145
00:08:52,470 --> 00:08:53,820
You remember that problem, I --

146
00:08:53,820 --> 00:08:57,530
I picked a particular
set of points,

147
00:08:57,530 --> 00:09:00,820
they weren't specially
brilliant, t equal one,

148
00:09:00,820 --> 00:09:07,190
two, three, the heights were
one, two, and then two again.

149
00:09:07,190 --> 00:09:10,570
So they were -- heights
were that point, that point,

150
00:09:10,570 --> 00:09:13,320
which makes it look like I've
got a nice forty-five-degree

151
00:09:13,320 --> 00:09:18,800
line -- but then the third
point didn't lie on the line.

152
00:09:18,800 --> 00:09:22,500
And I wanted to find
the best straight line.

153
00:09:22,500 --> 00:09:26,404
So I'm looking for the
-- this line, y=C+Dt.

154
00:09:30,850 --> 00:09:35,110
And it's not going to go
through all three points,

155
00:09:35,110 --> 00:09:37,880
because no line goes
through all three points.

156
00:09:37,880 --> 00:09:42,190
So I'm going to pick
the best line, the --

157
00:09:42,190 --> 00:09:45,990
the best being the one that
makes the overall error

158
00:09:45,990 --> 00:09:48,430
as small as I can make it.

159
00:09:48,430 --> 00:09:52,390
Now I have to tell you,
what is that overall error?

160
00:09:52,390 --> 00:10:01,600
And -- because that determines
what's the winning line.

161
00:10:01,600 --> 00:10:02,740
If we don't know --

162
00:10:02,740 --> 00:10:06,810
I mean we have to decide
what we mean by the error --

163
00:10:06,810 --> 00:10:12,940
and then we minimize and we find
the right -- the best C and D.

164
00:10:12,940 --> 00:10:18,100
So if I went through this --
if I went through that point,

165
00:10:18,100 --> 00:10:18,600
OK.

166
00:10:18,600 --> 00:10:20,688
I would solve the
equation C+D=1.

167
00:10:23,680 --> 00:10:26,310
Because at t equal to one --

168
00:10:26,310 --> 00:10:30,050
I'd have C plus D, and
it would come out right.

169
00:10:30,050 --> 00:10:34,310
If it went through this point,
I'd have C plus two D equal to

170
00:10:34,310 --> 00:10:35,150
two.

171
00:10:35,150 --> 00:10:38,990
Because at t equal to two, I
would like to get the answer

172
00:10:38,990 --> 00:10:39,550
two.

173
00:10:39,550 --> 00:10:43,950
At the third point, I have
C plus three D because t is

174
00:10:43,950 --> 00:10:47,160
three, but the -- the
answer I'm shooting for is

175
00:10:47,160 --> 00:10:49,850
two again.

176
00:10:49,850 --> 00:10:52,680
So those are my three equations.

177
00:10:52,680 --> 00:10:55,720
And they don't have a solution.

178
00:10:55,720 --> 00:10:58,110
But they've got a best solution.

179
00:10:58,110 --> 00:10:59,890
What do I mean by best solution?

180
00:10:59,890 --> 00:11:04,220
So let me take time out to
remember what I'm talking

181
00:11:04,220 --> 00:11:06,550
about for best solution.

182
00:11:06,550 --> 00:11:11,960
So this is my equation Ax=b.

183
00:11:11,960 --> 00:11:18,440
A is this matrix, one,
one, one, one, two, three.

184
00:11:18,440 --> 00:11:22,580
x is my -- only have
two unknowns, C and D,

185
00:11:22,580 --> 00:11:27,440
and b is my right-hand
side, one, two, two.

186
00:11:27,440 --> 00:11:27,940
OK.
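
In matrix form, using the heights 1, 2, 2 at t = 1, 2, 3 from the data points, the three equations C + D = 1, C + 2D = 2, C + 3D = 2 read:

```latex
Ax = b:\qquad
\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} C \\ D \end{bmatrix}
=
\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
```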

187
00:11:31,930 --> 00:11:34,670
No solution.

188
00:11:34,670 --> 00:11:37,300
Three eq- I have a
three by two matrix,

189
00:11:37,300 --> 00:11:40,200
I do have two
independent columns --

190
00:11:40,200 --> 00:11:42,540
so I do have a basis
for the column space,

191
00:11:42,540 --> 00:11:44,630
those two columns
are independent,

192
00:11:44,630 --> 00:11:46,420
they're a basis for
the column space,

193
00:11:46,420 --> 00:11:52,370
but the column space
doesn't include that vector.

194
00:11:52,370 --> 00:11:57,600
So best possible in this --

195
00:11:57,600 --> 00:12:01,540
what would best possible mean?

196
00:12:01,540 --> 00:12:05,750
The way that comes out to
linear equations is I --

197
00:12:05,750 --> 00:12:13,970
I want to minimize
the sum of these --

198
00:12:13,970 --> 00:12:15,457
I'm going to make an error here.

199
00:12:15,457 --> 00:12:16,790
I'm going to make an error here.

200
00:12:16,790 --> 00:12:18,950
I'm going to make
an error there.

201
00:12:18,950 --> 00:12:24,970
And I'm going to square
and add up those errors.

202
00:12:24,970 --> 00:12:26,720
So it's a sum of squares.

203
00:12:26,720 --> 00:12:30,750
It's a least squares
solution I'm looking for.

204
00:12:30,750 --> 00:12:37,980
So if I -- those errors are the
difference between Ax and b.

205
00:12:37,980 --> 00:12:40,320
That's what I want
to make small.

206
00:12:40,320 --> 00:12:42,951
And the way I'm measuring
this -- this is a vector,

207
00:12:42,951 --> 00:12:43,450
right?

208
00:12:43,450 --> 00:12:45,480
This is e1, e2, e3.

209
00:12:45,480 --> 00:12:49,090
The Ax-b, this is the e.

210
00:12:49,090 --> 00:12:50,890
The error vector.

211
00:12:50,890 --> 00:12:55,890
And small means its length.

212
00:12:55,890 --> 00:12:57,662
The length of that vector.

213
00:12:57,662 --> 00:12:59,370
That's what I'm going
to try to minimize.

214
00:12:59,370 --> 00:13:04,280
And it's convenient to square.

215
00:13:04,280 --> 00:13:06,920
If I make something
small, I make --

216
00:13:09,620 --> 00:13:12,320
this is a never negative
quantity, right?

217
00:13:12,320 --> 00:13:13,690
The length of that vector.

218
00:13:16,760 --> 00:13:20,040
The length will be zero
exactly when the --

219
00:13:20,040 --> 00:13:21,990
when I have the
zero vector here.

220
00:13:21,990 --> 00:13:26,620
That's exactly the case
when I can solve exactly,

221
00:13:26,620 --> 00:13:29,730
b is in the column
space, all great.

222
00:13:29,730 --> 00:13:31,820
But I'm not in that case now.

223
00:13:31,820 --> 00:13:34,070
I'm going to have
an error vector, e.

224
00:13:34,070 --> 00:13:35,815
What's this error
vector in my picture?

225
00:13:38,340 --> 00:13:42,030
I guess what I'm trying
to say is there's --

226
00:13:42,030 --> 00:13:45,270
there's two pictures
of what's going on.

227
00:13:45,270 --> 00:13:47,540
There's two pictures
of what's going on.

228
00:13:47,540 --> 00:13:50,900
One picture is --

229
00:13:50,900 --> 00:13:55,020
in this is the three
points and the line.

230
00:13:55,020 --> 00:14:00,220
And in that picture, what
are the three errors?

231
00:14:00,220 --> 00:14:03,480
The three errors are what
I miss by in this equation.

232
00:14:03,480 --> 00:14:05,150
So it's this --

233
00:14:05,150 --> 00:14:06,740
this little bit here.

234
00:14:06,740 --> 00:14:08,950
That vertical distance
up to the line.

235
00:14:08,950 --> 00:14:12,780
There's one -- sorry there's
one, and there's C plus D.

236
00:14:12,780 --> 00:14:14,700
And it's that difference.

237
00:14:14,700 --> 00:14:17,720
Here's two and here's C+2D.

238
00:14:17,720 --> 00:14:20,600
So vertically it's
that distance --

239
00:14:20,600 --> 00:14:23,620
that little error there is e1.

240
00:14:23,620 --> 00:14:26,220
This little error here is e2.

241
00:14:26,220 --> 00:14:30,540
This little error
coming up is e3.

242
00:14:30,540 --> 00:14:32,350
e3.

243
00:14:32,350 --> 00:14:35,240
And what's my overall error?

244
00:14:35,240 --> 00:14:43,240
It's e1 squared plus e2
squared plus e3 squared.
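
As a rough sketch, that overall error can be written as a small function of C and D. The names `T`, `B`, and `squared_error` are made up for this illustration, and the two candidate lines are arbitrary choices, not lines from the lecture.

```python
from fractions import Fraction

# Data from the lecture: heights 1, 2, 2 at t = 1, 2, 3.
T = [1, 2, 3]
B = [1, 2, 2]

def squared_error(c, d):
    """Return e1^2 + e2^2 + e3^2 for the line y = c + d*t."""
    return sum((Fraction(c) + Fraction(d) * t - b) ** 2 for t, b in zip(T, B))

# Two candidate lines (hypothetical choices, just for comparison).
print(squared_error(0, 1))  # the 45-degree line y = t, total error 1
print(squared_error(2, 0))  # the horizontal line y = 2, total error 1
```

Both candidates miss exactly one point by one unit, so each has total squared error 1. The best line has to do better than that.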

245
00:14:43,240 --> 00:14:44,920
That's what I'm
trying to make small.

246
00:14:44,920 --> 00:14:54,090
I -- some statisticians -- this
is a big part of statistics,

247
00:14:54,090 --> 00:14:56,360
fitting straight lines is
a big part of science --

248
00:14:56,360 --> 00:15:00,310
and specifically statistics,
where the right word to use

249
00:15:00,310 --> 00:15:02,210
would be regression.

250
00:15:02,210 --> 00:15:05,270
I'm doing regression here.

251
00:15:05,270 --> 00:15:06,145
Linear regression.

252
00:15:09,840 --> 00:15:12,820
And I'm using this
sum of squares

253
00:15:12,820 --> 00:15:15,270
as the measure of error.

254
00:15:15,270 --> 00:15:21,190
Again, some statisticians
would be -- they would say, OK,

255
00:15:21,190 --> 00:15:24,000
I'll solve that problem
because it's the clean problem.

256
00:15:24,000 --> 00:15:27,080
It leads to a beautiful
linear system.

257
00:15:27,080 --> 00:15:30,340
But they would be a little
careful about these squares,

258
00:15:30,340 --> 00:15:32,670
for -- in this case.

259
00:15:32,670 --> 00:15:35,990
If one of these
points was way off.

260
00:15:35,990 --> 00:15:39,040
Suppose I had a measurement at
t equal zero that was way off.

261
00:15:41,560 --> 00:15:44,880
Well, would the straight line,
would the best line be the same

262
00:15:44,880 --> 00:15:46,890
if I had this fourth point?

263
00:15:46,890 --> 00:15:50,180
Suppose I have this
fourth data point.

264
00:15:50,180 --> 00:15:54,880
No, certainly the line would --

265
00:15:54,880 --> 00:15:57,620
it wouldn't be the -- that
wouldn't be the best line.

266
00:15:57,620 --> 00:16:01,100
Because that line would
have a giant error --

267
00:16:01,100 --> 00:16:04,820
and when I squared it it
would be like way out of sight

268
00:16:04,820 --> 00:16:06,860
compared to the others.

269
00:16:06,860 --> 00:16:14,280
So this would be called by
statisticians an outlier,

270
00:16:14,280 --> 00:16:17,830
and they would not be happy to
see the whole problem turned

271
00:16:17,830 --> 00:16:21,150
topsy-turvy by this one outlier,
which could be a mistake,

272
00:16:21,150 --> 00:16:22,760
after all.

273
00:16:22,760 --> 00:16:26,500
So they wouldn't -- so they
wouldn't like maybe squaring,

274
00:16:26,500 --> 00:16:29,940
if there were outliers, they
would want to identify them.

275
00:16:29,940 --> 00:16:30,440
OK.

276
00:16:30,440 --> 00:16:35,800
I'm not going to --

277
00:16:35,800 --> 00:16:40,870
I don't want to suggest that
least squares isn't used,

278
00:16:40,870 --> 00:16:44,790
it's the most used, but
it's not exclusively used

279
00:16:44,790 --> 00:16:47,040
because it's a little --

280
00:16:47,040 --> 00:16:50,000
overcompensates for outliers.

281
00:16:50,000 --> 00:16:51,500
Because of that squaring.
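
To see how much one outlier can move the line, here is a small sketch. The fourth point, height 10 at t = 0, is a made-up value for illustration only; the lecture gives no number for it. The helper `best_line` is a hypothetical name, and it solves the 2 by 2 normal equations that the lecture comes to shortly.

```python
from fractions import Fraction

def best_line(ts, bs):
    """Least squares fit of y = C + D*t via the 2x2 normal equations."""
    n = len(ts)
    st = sum(ts)
    stt = sum(t * t for t in ts)
    sb = sum(bs)
    stb = sum(t * b for t, b in zip(ts, bs))
    det = n * stt - st * st              # determinant of A^T A
    c = Fraction(stt * sb - st * stb, det)
    d = Fraction(n * stb - st * sb, det)
    return c, d

# The three points from the lecture: C = 2/3, D = 1/2.
print(best_line([1, 2, 3], [1, 2, 2]))

# Same points plus a hypothetical outlier at t = 0: C = 36/5, D = -23/10.
print(best_line([0, 1, 2, 3], [10, 1, 2, 2]))
```

With the made-up outlier the slope even changes sign, which is the kind of topsy-turvy result the statisticians worry about.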

282
00:16:51,500 --> 00:16:52,000
OK.

283
00:16:52,000 --> 00:16:54,300
So suppose we don't
have this guy,

284
00:16:54,300 --> 00:16:57,300
we just have these
three equations.

285
00:16:57,300 --> 00:17:01,680
And I want to make --
minimize this error.

286
00:17:01,680 --> 00:17:02,650
OK.

287
00:17:02,650 --> 00:17:08,069
Now, what I said is there's
two pictures to look at.

288
00:17:08,069 --> 00:17:10,940
One picture is this one.

289
00:17:10,940 --> 00:17:14,700
The three points, the best line.

290
00:17:14,700 --> 00:17:16,280
And the errors.

291
00:17:16,280 --> 00:17:20,760
Now, on this picture,
what are these points

292
00:17:20,760 --> 00:17:24,890
on the line, the points
that are really on the line?

293
00:17:24,890 --> 00:17:30,490
So they're -- points, let
me call them P1, P2, and P3,

294
00:17:30,490 --> 00:17:35,610
those are three numbers, so
this -- this height is P1,

295
00:17:35,610 --> 00:17:45,700
this height is P2, this height
is P3, and what are those guys?

296
00:17:45,700 --> 00:17:49,930
Suppose those were the
three values instead of --

297
00:17:49,930 --> 00:17:53,840
there's b1, ev- everybody's
seen all these -- sorry,

298
00:17:53,840 --> 00:17:57,090
my art is as usual
not the greatest,

299
00:17:57,090 --> 00:18:04,590
but there's the given b1, the
given b2, and the given b3.

300
00:18:04,590 --> 00:18:09,330
I promise not to put a single
letter more on that picture.

301
00:18:09,330 --> 00:18:10,050
OK.

302
00:18:10,050 --> 00:18:15,600
There's b1, P1 is the one on
the line, and e1 is the distance

303
00:18:15,600 --> 00:18:16,600
between.

304
00:18:16,600 --> 00:18:21,410
And same at points two
and same at points three.

305
00:18:21,410 --> 00:18:23,370
OK, so what's up?

306
00:18:23,370 --> 00:18:26,310
What's up with those Ps?

307
00:18:26,310 --> 00:18:29,930
P1, P2, P3, what are they?

308
00:18:29,930 --> 00:18:32,520
They're the components,
they lie on the line,

309
00:18:32,520 --> 00:18:33,720
right?

310
00:18:33,720 --> 00:18:38,420
They're the points
which if instead

311
00:18:38,420 --> 00:18:44,530
of one, two, two, which
were the b's, suppose I put

312
00:18:44,530 --> 00:18:47,230
P1, P2, P3 in here.

313
00:18:47,230 --> 00:18:50,150
I'll figure out in a minute
what those numbers are.

314
00:18:50,150 --> 00:18:53,040
But I just want to get the
picture of what I'm doing.

315
00:18:53,040 --> 00:18:56,390
If I put P1, P2, P3 in
those three equations,

316
00:18:56,390 --> 00:18:58,795
what would be good about
the three equations?

317
00:19:01,820 --> 00:19:03,820
I could solve them.

318
00:19:03,820 --> 00:19:06,420
A line goes through the Ps.

319
00:19:06,420 --> 00:19:10,400
So the P1, P2, P3 vector,
that's in the column

320
00:19:10,400 --> 00:19:11,320
space.

321
00:19:11,320 --> 00:19:14,480
That is a combination
of these columns.

322
00:19:14,480 --> 00:19:16,400
It's the closest combination.

323
00:19:16,400 --> 00:19:18,180
It's this picture.

324
00:19:18,180 --> 00:19:20,920
See, I've got the two
pictures like here's

325
00:19:20,920 --> 00:19:24,710
the picture that
shows the points, this

326
00:19:24,710 --> 00:19:28,240
is a picture in a
blackboard plane,

327
00:19:28,240 --> 00:19:34,310
here's a picture that's
showing the vectors.

328
00:19:34,310 --> 00:19:38,540
The vector b, which is in
this case, in this example

329
00:19:38,540 --> 00:19:42,090
is the vector one, two, two.

330
00:19:42,090 --> 00:19:47,940
The column space is in
this case spanned by the --

331
00:19:47,940 --> 00:19:49,720
well, you see A there.

332
00:19:49,720 --> 00:19:55,600
The column space of the matrix
one, one, one, one, two, three.

333
00:19:55,600 --> 00:20:01,540
And this picture shows
the nearest point.

334
00:20:01,540 --> 00:20:04,510
There's the -- that
point P1, P2, P3,

335
00:20:04,510 --> 00:20:08,050
which I'm going to compute
before the end of this hour,

336
00:20:08,050 --> 00:20:13,090
is the closest point
in the column space.

337
00:20:13,090 --> 00:20:13,780
OK.

338
00:20:13,780 --> 00:20:19,560
Let me -- I don't dare
leave it any longer --

339
00:20:19,560 --> 00:20:21,650
can I just compute it now.

340
00:20:21,650 --> 00:20:24,850
So I want to compute --

341
00:20:24,850 --> 00:20:28,800
find p. All right.

342
00:20:28,800 --> 00:20:39,250
Find p. Find x, which
is C and D, find p and P. OK.

343
00:20:39,250 --> 00:20:42,430
And I really should put
these little hats on

344
00:20:42,430 --> 00:20:49,830
to remind myself that they're
the estimated -- the best line,

345
00:20:49,830 --> 00:20:51,970
not the perfect line.

346
00:20:51,970 --> 00:20:53,050
OK.

347
00:20:53,050 --> 00:20:54,330
OK.

348
00:20:54,330 --> 00:20:55,540
How do I proceed?

349
00:20:55,540 --> 00:20:58,340
Let's just run
through the mechanics.

350
00:20:58,340 --> 00:21:02,530
What's the equation for x?

351
00:21:02,530 --> 00:21:04,620
The -- or x hat.

352
00:21:04,620 --> 00:21:10,390
The equation for that is A
transpose A x hat equals A

353
00:21:10,390 --> 00:21:12,500
transpose x --

354
00:21:12,500 --> 00:21:14,105
A transpose b.

355
00:21:18,020 --> 00:21:19,530
The most --

356
00:21:19,530 --> 00:21:23,620
I'm -- will venture to call
that the most important equation

357
00:21:23,620 --> 00:21:26,350
in statistics.

358
00:21:26,350 --> 00:21:28,560
And in estimation.

359
00:21:28,560 --> 00:21:33,140
And whatever you're -- wherever
you've got error and noise this

360
00:21:33,140 --> 00:21:36,980
is the estimate
that you use first.

361
00:21:36,980 --> 00:21:37,500
OK.

362
00:21:37,500 --> 00:21:42,740
Whenever you're fitting
things by a few parameters,

363
00:21:42,740 --> 00:21:44,700
that's the equation to use.

364
00:21:44,700 --> 00:21:46,500
OK, let's solve it.

365
00:21:46,500 --> 00:21:47,970
What is A transpose A?

366
00:21:47,970 --> 00:21:50,580
So I have to figure out
what these matrices are.

367
00:21:50,580 --> 00:21:56,860
One, one, one, one, two, three
and one, one, one, one, two,

368
00:21:56,860 --> 00:22:04,490
three, that gives me some
matrix, that gives me

369
00:22:04,490 --> 00:22:12,510
a matrix, what do I get out of
that, three, six, six, and one

370
00:22:12,510 --> 00:22:15,720
and four and nine, fourteen.

371
00:22:15,720 --> 00:22:17,040
OK.

372
00:22:17,040 --> 00:22:21,830
And what do I expect to see in
that matrix and I do see it,

373
00:22:21,830 --> 00:22:25,210
just before I keep going
with the calculation?

374
00:22:25,210 --> 00:22:28,450
I expect that matrix
to be symmetric.

375
00:22:28,450 --> 00:22:30,565
I expect it to be invertible.

376
00:22:34,100 --> 00:22:36,300
And near the end
of the course I'm

377
00:22:36,300 --> 00:22:39,060
going to say I expect it
to be positive definite,

378
00:22:39,060 --> 00:22:45,590
but that's a future fact
about this crucial matrix,

379
00:22:45,590 --> 00:22:47,050
A transpose A.

380
00:22:47,050 --> 00:22:47,670
OK.

381
00:22:47,670 --> 00:22:50,880
And now let me
figure A transpose b.

382
00:22:50,880 --> 00:22:57,280
So let me -- can I tack on b as
an extra column here, one, two,

383
00:22:57,280 --> 00:22:59,950
two?

384
00:22:59,950 --> 00:23:04,770
And tack on the extra
A transpose b is --

385
00:23:04,770 --> 00:23:09,580
looks like five and one
and four and six, eleven.

386
00:23:13,770 --> 00:23:20,760
I think my equations are three
C plus six D equals five,

387
00:23:20,760 --> 00:23:29,700
and six C
plus fourteen D is eleven.

388
00:23:29,700 --> 00:23:33,090
Can I just for safety
see if I did that right?

389
00:23:33,090 --> 00:23:37,350
One, one, one times
one, two, two is five.

390
00:23:37,350 --> 00:23:40,630
One, two, three, that's
one, four and six, eleven.

391
00:23:40,630 --> 00:23:42,667
Looks good.
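
The same safety check can be done mechanically. This is a minimal sketch with plain Python lists; the helper names `transpose` and `matmul` are made up for the illustration.

```python
# A has columns (1, 1, 1) and (1, 2, 3); b holds the heights (1, 2, 2).
A = [[1, 1], [1, 2], [1, 3]]
b = [1, 2, 2]

def transpose(m):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Multiply two list-of-rows matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

At = transpose(A)
AtA = matmul(At, A)
Atb = [sum(a * bi for a, bi in zip(row, b)) for row in At]

print(AtA)  # [[3, 6], [6, 14]]
print(Atb)  # [5, 11]
```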

392
00:23:42,667 --> 00:23:43,625
These are my equations.

393
00:23:48,860 --> 00:23:52,000
That's my -- they're called
the normal equations.

394
00:23:54,610 --> 00:23:56,984
I'll just write that
word down because it --

395
00:24:02,800 --> 00:24:04,470
so I solve them.

396
00:24:04,470 --> 00:24:10,270
I solve that for C and
D. I would like to --

397
00:24:10,270 --> 00:24:13,130
before I solve them could I
do one thing that's on the --

398
00:24:13,130 --> 00:24:16,570
that's just above here?

399
00:24:16,570 --> 00:24:18,110
I would like to --

400
00:24:18,110 --> 00:24:21,470
I'd like to find these
equations from calculus.

401
00:24:21,470 --> 00:24:26,320
I'd like to find them from
this minimizing thing.

402
00:24:26,320 --> 00:24:28,010
So what's the first error?

403
00:24:28,010 --> 00:24:32,690
The first error is what I
missed by in the first equation.

404
00:24:32,690 --> 00:24:36,250
C plus D minus one squared.

405
00:24:36,250 --> 00:24:40,010
And the second error is what
I miss in the second equation.

406
00:24:40,010 --> 00:24:44,110
C plus two D minus two squared.

407
00:24:44,110 --> 00:24:52,350
And the third error squared is C
plus three D minus two squared.

408
00:24:52,350 --> 00:24:56,410
That's my -- overall squared
error that I'm trying

409
00:24:56,410 --> 00:24:58,040
to minimize.

410
00:24:58,040 --> 00:24:58,610
OK.

411
00:24:58,610 --> 00:25:08,910
So how would you minimize that?

412
00:25:08,910 --> 00:25:16,270
OK, linear algebra has given us
the equations for the minimum.

413
00:25:16,270 --> 00:25:20,750
But we could use calculus too.

414
00:25:20,750 --> 00:25:24,440
That's a function of
two variables, C and D,

415
00:25:24,440 --> 00:25:28,010
and we're looking
for the minimum.

416
00:25:28,010 --> 00:25:31,140
So how do we find it?

417
00:25:31,140 --> 00:25:35,160
Directly from calculus, we
take partial derivatives,

418
00:25:35,160 --> 00:25:37,510
right, we've got two
variables, C and D,

419
00:25:37,510 --> 00:25:40,900
so take the partial
derivative with respect to C

420
00:25:40,900 --> 00:25:44,560
and set it to zero, and
you'll get that equation.

421
00:25:44,560 --> 00:25:47,140
Take the partial
derivative with respect --

422
00:25:47,140 --> 00:25:51,570
I'm not going to write it
all out, just -- you will.

423
00:25:51,570 --> 00:25:56,370
The partial derivative with
respect to D, it -- you know,

424
00:25:56,370 --> 00:25:59,340
it's going to be linear,
that's the beauty of these

425
00:25:59,340 --> 00:26:03,220
squares, that if I have the
square of something and I take

426
00:26:03,220 --> 00:26:07,520
its derivative I get something
linear. And this is what I get.

427
00:26:07,520 --> 00:26:11,440
So this is the derivative of
the error with respect to C

428
00:26:11,440 --> 00:26:13,770
being zero, and this
is the derivative

429
00:26:13,770 --> 00:26:17,850
of the error with
respect to D being zero.
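
The lecture leaves the partial derivatives to the reader. Written out, with E standing for the overall squared error (a symbol introduced here, not on the board), they reproduce the two normal equations:

```latex
\begin{aligned}
E(C, D) &= (C + D - 1)^{2} + (C + 2D - 2)^{2} + (C + 3D - 2)^{2},\\[4pt]
\frac{\partial E}{\partial C} &= 2\big[(C + D - 1) + (C + 2D - 2) + (C + 3D - 2)\big]
  = 2\,(3C + 6D - 5) = 0,\\[4pt]
\frac{\partial E}{\partial D} &= 2\big[(C + D - 1) + 2\,(C + 2D - 2) + 3\,(C + 3D - 2)\big]
  = 2\,(6C + 14D - 11) = 0.
\end{aligned}
```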

430
00:26:17,850 --> 00:26:20,660
Wherever you look, these
equations keep coming.

431
00:26:20,660 --> 00:26:22,370
So now I guess I'm
going to solve it,

432
00:26:22,370 --> 00:26:25,830
what will I do, I'll subtract,
I'll do elimination of course,

433
00:26:25,830 --> 00:26:27,820
because that's the only
thing I know how to do.

434
00:26:27,820 --> 00:26:32,540
Two of these away from
this would give me --

435
00:26:32,540 --> 00:26:37,198
let's see, six, so would
that be two Ds equals one?

436
00:26:37,198 --> 00:26:37,697
Ha.

437
00:26:41,440 --> 00:26:43,050
So it wasn't --

438
00:26:43,050 --> 00:26:45,760
I was afraid these numbers
were going to come out awful.

439
00:26:45,760 --> 00:26:48,770
But if I take two of
those away from that,

440
00:26:48,770 --> 00:26:51,480
the equation I get left
is two D equals one,

441
00:26:51,480 --> 00:26:57,700
so I think D is a
half and C is whatever

442
00:26:57,700 --> 00:27:03,650
back substitution gives, six D
is three, so three C plus three

443
00:27:03,650 --> 00:27:07,060
is five, I'm doing back
substitution now, right, three,

444
00:27:07,060 --> 00:27:10,910
can I do it in light
letters, three C plus

445
00:27:10,910 --> 00:27:15,970
that six D is three equals
five, so three C is two,

446
00:27:15,970 --> 00:27:17,440
so I think C is two-thirds.

447
00:27:23,276 --> 00:27:24,275
One-half and two-thirds.

448
00:27:29,230 --> 00:27:38,640
So the best line, the best
line is the constant two-thirds

449
00:27:38,640 --> 00:27:42,760
plus one-half t.
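The elimination and back substitution above can be checked with exact fractions. This is a minimal sketch, assuming the data b = (1, 2, 2) at t = (1, 2, 3) from the lecture. The variable names are illustrative, not from the lecture.

```python
from fractions import Fraction

# Data from the lecture: measurements b = (1, 2, 2) at times t = (1, 2, 3).
t = [1, 2, 3]
b = [1, 2, 2]

# Entries of A^T A and A^T b for the line C + D t (columns of A: ones and t).
n = len(t)
sum_t = sum(t)
sum_tt = sum(ti * ti for ti in t)
sum_b = sum(b)
sum_tb = sum(ti * bi for ti, bi in zip(t, b))

# Normal equations:
#   n*C     + sum_t*D  = sum_b    ->  3C +  6D = 5
#   sum_t*C + sum_tt*D = sum_tb   ->  6C + 14D = 11
# Elimination: subtract (sum_t / n) times the first equation from the second.
m = Fraction(sum_t, n)
D = (sum_tb - m * sum_b) / (sum_tt - m * sum_t)
C = (sum_b - sum_t * D) / n  # back substitution into the first equation

print(C, D)
```

Using `Fraction` keeps one-half and two-thirds exact, so the result can be compared directly with the numbers on the board.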

450
00:27:42,760 --> 00:27:46,820
And I -- is my picture
more or less right?

451
00:27:46,820 --> 00:27:49,890
Let me write, let me copy
that best line down again,

452
00:27:49,890 --> 00:27:52,600
two-thirds and a half.

453
00:27:52,600 --> 00:27:55,510
Let me -- I'll put in the
two-thirds and the half.

454
00:27:59,890 --> 00:28:00,710
OK.

455
00:28:00,710 --> 00:28:05,360
So what's this P1, that's
the value at t equal to one.

456
00:28:05,360 --> 00:28:08,380
At t equal to one, I have
two-thirds plus a half,

457
00:28:08,380 --> 00:28:10,280
which is --

458
00:28:10,280 --> 00:28:13,400
what's that, four-sixths
and three-sixths, so P1, oh,

459
00:28:13,400 --> 00:28:18,400
I promised not to write
another thing on this --

460
00:28:18,400 --> 00:28:21,860
I'll erase P1 and
I'll put seven-sixths.

461
00:28:21,860 --> 00:28:22,580
OK.

462
00:28:22,580 --> 00:28:27,990
And yeah, it's above one,
and e1 is one-sixth, right.

463
00:28:27,990 --> 00:28:28,720
You see it all.

464
00:28:28,720 --> 00:28:29,220
Right?

465
00:28:29,220 --> 00:28:29,830
What's P2?

466
00:28:29,830 --> 00:28:31,660
OK.

467
00:28:31,660 --> 00:28:35,230
At point t equal to two,
where's my line here?

468
00:28:35,230 --> 00:28:38,920
At t equal to two, it's
two-thirds plus one, right?

469
00:28:38,920 --> 00:28:41,580
That's five-thirds.

470
00:28:41,580 --> 00:28:44,130
Two-thirds and t is two,
so that's two-thirds

471
00:28:44,130 --> 00:28:46,070
and one make five-thirds.

472
00:28:46,070 --> 00:28:49,320
And that's -- sure enough,
that's smaller than the exact

473
00:28:49,320 --> 00:28:50,280
two.

474
00:28:50,280 --> 00:28:55,180
And then final P3, when
t is three, oh, what's

475
00:28:55,180 --> 00:28:56,820
two-thirds plus three-halves?

476
00:29:01,390 --> 00:29:03,950
It's the same as
three-halves plus two-thirds.

477
00:29:03,950 --> 00:29:09,280
It's -- so maybe
four-sixths and nine-sixths,

478
00:29:09,280 --> 00:29:11,120
maybe thirteen-sixths.

479
00:29:11,120 --> 00:29:15,110
OK, and again, look,
oh, look at this, OK.

480
00:29:15,110 --> 00:29:19,840
You have to admire the
beauty of this answer.

481
00:29:19,840 --> 00:29:21,260
What's this first error?

482
00:29:21,260 --> 00:29:25,760
So here are the
errors. e1, e2 and e3.

483
00:29:25,760 --> 00:29:28,340
OK, what was that
first error, e1?

484
00:29:28,340 --> 00:29:32,640
Well, if we decide the
errors counting up,

485
00:29:32,640 --> 00:29:35,260
then it's one-sixth.

486
00:29:35,260 --> 00:29:38,670
And the last error,
thirteen-sixths

487
00:29:38,670 --> 00:29:43,420
minus the correct two
is one-sixth again.

488
00:29:43,420 --> 00:29:47,890
And what's this
error in the middle?

489
00:29:47,890 --> 00:29:52,530
Let's see, the correct
answer was two, two.

490
00:29:52,530 --> 00:29:55,900
And we got five-thirds and
it's the other direction,

491
00:29:55,900 --> 00:29:58,445
minus one-third,
minus two-sixths.

492
00:30:02,070 --> 00:30:04,560
That's our error vector.

493
00:30:04,560 --> 00:30:09,220
In our picture, in our
other picture, here it is.

494
00:30:09,220 --> 00:30:13,880
We just found P and e.

495
00:30:13,880 --> 00:30:19,120
e is this vector, one-sixth,
minus two-sixths, one-sixth,

496
00:30:19,120 --> 00:30:21,540
and P is this guy.

497
00:30:21,540 --> 00:30:23,540
Well, maybe I have
the signs of e wrong,

498
00:30:23,540 --> 00:30:26,880
I think I have, let me fix it.

499
00:30:26,880 --> 00:30:32,840
Because I would like
this one-sixth --

500
00:30:32,840 --> 00:30:37,690
I would like this plus the
P to give the original b.

501
00:30:37,690 --> 00:30:42,730
I want P plus e to match b.

502
00:30:42,730 --> 00:30:47,080
So I want minus a
sixth, plus seven-sixths

503
00:30:47,080 --> 00:30:50,650
to give the correct b equal one.

504
00:30:50,650 --> 00:30:52,090
OK.

505
00:30:52,090 --> 00:30:58,060
Now -- I'm going to
take a deep breath here,

506
00:30:58,060 --> 00:31:06,720
and ask what do we know
about this error vector e?

507
00:31:06,720 --> 00:31:09,790
You've seen now this whole
problem worked completely

508
00:31:09,790 --> 00:31:13,780
through, and I even think
the numbers are right.

509
00:31:13,780 --> 00:31:17,500
So there's P, so let me --

510
00:31:17,500 --> 00:31:24,840
I'll write -- if I can put
it down here, b is P plus e.

511
00:31:24,840 --> 00:31:29,110
b I believe was one, two, two.

512
00:31:29,110 --> 00:31:34,860
The nearest point
had seven-sixths,

513
00:31:34,860 --> 00:31:36,120
what were the others?

514
00:31:36,120 --> 00:31:40,590
Five-thirds and thirteen-sixths.

515
00:31:40,590 --> 00:31:46,950
And the e vector was
minus a sixth, two-sixths,

516
00:31:46,950 --> 00:31:49,360
one-third in other
words, and minus a sixth.
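The projection and error vectors just written down can be recomputed from the best line. This is a small sketch, assuming the coefficients two-thirds and one-half found earlier and the data b = (1, 2, 2) at t = (1, 2, 3). The lists `p` and `e` are illustrative names for the lecture's P and e.

```python
from fractions import Fraction

# Lecture data: b = (1, 2, 2) measured at t = (1, 2, 3).
t = [1, 2, 3]
b = [Fraction(1), Fraction(2), Fraction(2)]

# Best-fit coefficients found by elimination in the lecture.
C, D = Fraction(2, 3), Fraction(1, 2)

# p holds the heights of the best line at each t; e = b - p is the error.
p = [C + D * ti for ti in t]
e = [bi - pi for bi, pi in zip(b, p)]

print([str(x) for x in p])
print([str(x) for x in e])
```

With the sign convention e = b - p, adding the two lists entry by entry gives back b.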

517
00:31:58,511 --> 00:31:59,010
OK.

518
00:31:59,010 --> 00:32:01,930
Tell me some stuff
about these two vectors.

519
00:32:01,930 --> 00:32:03,820
Tell me something about
those two vectors,

520
00:32:03,820 --> 00:32:06,480
well, they add to
b, right, great.

521
00:32:06,480 --> 00:32:07,070
OK.

522
00:32:07,070 --> 00:32:09,420
What else?

523
00:32:09,420 --> 00:32:12,520
What else about those
two vectors, the P,

524
00:32:12,520 --> 00:32:18,700
the projection vector P,
and the error vector e.

525
00:32:18,700 --> 00:32:21,470
What else do you
know about them?

526
00:32:21,470 --> 00:32:24,430
They're perpendicular, right.

527
00:32:24,430 --> 00:32:25,860
Do we dare verify that?

528
00:32:29,180 --> 00:32:32,230
Can you take the dot
product of those vectors?

529
00:32:32,230 --> 00:32:35,440
I'm like getting like minus
seven over thirty-six,

530
00:32:35,440 --> 00:32:36,850
can I change that to ten-sixths?

531
00:32:42,180 --> 00:32:45,250
Oh, God, come out right here.

532
00:32:45,250 --> 00:32:50,880
Minus seven over thirty-six,
plus twenty over thirty-six,

533
00:32:50,880 --> 00:32:53,080
minus thirteen over thirty-six.

534
00:32:56,730 --> 00:32:57,630
Thank you, God.

535
00:32:57,630 --> 00:32:59,120
OK.

536
00:32:59,120 --> 00:33:04,030
And what else should we
know about that vector?

537
00:33:04,030 --> 00:33:05,740
Actually we know --

538
00:33:05,740 --> 00:33:08,740
I've got to say we know
even a little more.

539
00:33:08,740 --> 00:33:13,510
This vector, e, is
perpendicular to P,

540
00:33:13,510 --> 00:33:18,480
but it's perpendicular
to other stuff too.

541
00:33:18,480 --> 00:33:22,220
It's perpendicular not just to
this guy in the column space,

542
00:33:22,220 --> 00:33:25,170
this is in the column
space for sure.

543
00:33:25,170 --> 00:33:27,680
This is perpendicular
to the column space.

544
00:33:27,680 --> 00:33:32,710
So like give me another
vector it's perpendicular to.

545
00:33:32,710 --> 00:33:35,000
Another because it's
perpendicular to the whole

546
00:33:35,000 --> 00:33:37,490
column space, not
just to this --

547
00:33:37,490 --> 00:33:40,780
this particular
projection that's --

548
00:33:40,780 --> 00:33:44,880
that is in the column space,
but it's perpendicular to other

549
00:33:44,880 --> 00:33:46,520
stuff, whatever's
in the column space,

550
00:33:46,520 --> 00:33:49,800
so tell me another vector
in the -- oh, well,

551
00:33:49,800 --> 00:33:53,080
I've written down the matrix,
so tell me another vector

552
00:33:53,080 --> 00:33:55,000
in the column space.

553
00:33:55,000 --> 00:33:58,000
Pick a nice one.

554
00:33:58,000 --> 00:33:59,450
One, one, one.

555
00:33:59,450 --> 00:34:01,490
That's what
everybody's thinking.

556
00:34:01,490 --> 00:34:04,230
OK, one, one, one is
in the column space.

557
00:34:04,230 --> 00:34:07,350
And this guy is supposed
to be perpendicular to one,

558
00:34:07,350 --> 00:34:08,090
one, one.

559
00:34:08,090 --> 00:34:10,000
Is it?

560
00:34:10,000 --> 00:34:10,659
Sure.

561
00:34:10,659 --> 00:34:12,550
If I take the dot
product with one,

562
00:34:12,550 --> 00:34:16,690
one, one I get minus a sixth,
plus two-sixths, minus a sixth,

563
00:34:16,690 --> 00:34:18,080
zero.

564
00:34:18,080 --> 00:34:20,659
And it's perpendicular
to one, two, three.

565
00:34:20,659 --> 00:34:23,020
Because if I take the
dot product with one,

566
00:34:23,020 --> 00:34:30,310
two, three I get minus one, plus
four, minus three, zero again.
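The three dot products checked above can be reproduced exactly. This is a sketch using the vectors from the lecture. The helper `dot` and the names `col1` and `col2` are introduced here for illustration.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Projection and error vectors from the lecture.
p = [Fraction(7, 6), Fraction(5, 3), Fraction(13, 6)]
e = [Fraction(-1, 6), Fraction(1, 3), Fraction(-1, 6)]

# The two columns of A, which span the column space.
col1 = [1, 1, 1]
col2 = [1, 2, 3]

print(dot(e, p), dot(e, col1), dot(e, col2))
```

Since e is perpendicular to both columns, it is perpendicular to every combination of them, including p.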

567
00:34:30,310 --> 00:34:32,449
OK, do you see the --

568
00:34:32,449 --> 00:34:35,739
I hope you see the two pictures.

569
00:34:35,739 --> 00:34:41,110
The picture here for vectors
and, the picture here

570
00:34:41,110 --> 00:34:48,120
for the best line, and it's
the same picture, just --

571
00:34:48,120 --> 00:34:51,440
this one's in the plane
and it's showing the line,

572
00:34:51,440 --> 00:34:56,060
this one never did show the
line, this -- in this picture,

573
00:34:56,060 --> 00:34:59,160
C and D never showed up.

574
00:34:59,160 --> 00:35:02,040
In this picture, C and
D were -- you know,

575
00:35:02,040 --> 00:35:04,730
they determined that line.

576
00:35:04,730 --> 00:35:07,020
But the two are
exactly the same.

577
00:35:07,020 --> 00:35:10,540
C and D is the combination
of the two columns

578
00:35:10,540 --> 00:35:14,770
that gives P. OK.

579
00:35:14,770 --> 00:35:19,890
So that's least squares.

580
00:35:19,890 --> 00:35:23,680
And the special
but most important

581
00:35:23,680 --> 00:35:26,820
example of fitting by
straight line, so the homework

582
00:35:26,820 --> 00:35:29,670
that's coming then
Wednesday asks

583
00:35:29,670 --> 00:35:32,750
you to fit by straight lines.

584
00:35:32,750 --> 00:35:40,440
So you're just going to end
up solving the key equation.

585
00:35:40,440 --> 00:35:42,850
You're going to end up
solving that key equation

586
00:35:42,850 --> 00:35:47,100
and then P will be Ax hat.

587
00:35:47,100 --> 00:35:47,870
That's it.

588
00:35:51,640 --> 00:35:54,470
OK.

589
00:35:54,470 --> 00:35:59,650
Now, can I put in a little
piece of linear algebra

590
00:35:59,650 --> 00:36:03,350
that I mentioned
earlier, mentioned again,

591
00:36:03,350 --> 00:36:06,510
but I never did write?

592
00:36:06,510 --> 00:36:09,840
And I've -- I
should do it right.

593
00:36:09,840 --> 00:36:16,070
It's about this matrix
A transpose A. There.

594
00:36:21,840 --> 00:36:26,450
I was sure that that
matrix would be invertible.

595
00:36:26,450 --> 00:36:29,220
And of course I wanted to
be sure it was invertible,

596
00:36:29,220 --> 00:36:36,210
because I planned to solve this
system with that matrix.

597
00:36:36,210 --> 00:36:40,620
So and I announced
like before --

598
00:36:40,620 --> 00:36:42,660
as the chapter
was just starting,

599
00:36:42,660 --> 00:36:45,390
I announced that it
would be invertible.

600
00:36:45,390 --> 00:36:48,615
But now I -- can I
come back to that?

601
00:36:48,615 --> 00:36:49,115
OK.

602
00:36:52,940 --> 00:36:56,050
So what I said was --

603
00:36:56,050 --> 00:37:07,440
that if A has
independent columns,

604
00:37:07,440 --> 00:37:14,616
then A transpose
A is invertible.

605
00:37:20,100 --> 00:37:24,080
And I would like to --

606
00:37:24,080 --> 00:37:27,250
first to repeat
that important fact,

607
00:37:27,250 --> 00:37:32,320
that that's the requirement
that makes everything go here.

608
00:37:32,320 --> 00:37:34,610
It's this independent
columns of A

609
00:37:34,610 --> 00:37:39,050
that guarantees
everything goes through.

610
00:37:39,050 --> 00:37:42,140
And think why.

611
00:37:42,140 --> 00:37:44,970
Why does this matrix
A transpose A,

612
00:37:44,970 --> 00:37:50,410
why is it invertible if the
columns of A are independent?

613
00:37:50,410 --> 00:38:01,840
OK, there's -- so if it
wasn't invertible, I'm --

614
00:38:01,840 --> 00:38:04,750
so I want to prove that.

615
00:38:04,750 --> 00:38:08,060
If it isn't
invertible, then what?

616
00:38:08,060 --> 00:38:10,610
I want to reach --

617
00:38:10,610 --> 00:38:13,010
I want to follow that
-- follow that line --

618
00:38:13,010 --> 00:38:15,400
of thinking and
see what I come to.

619
00:38:15,400 --> 00:38:17,480
Suppose, so proof.

620
00:38:17,480 --> 00:38:26,810
Suppose A transpose Ax is zero.

621
00:38:26,810 --> 00:38:28,400
I'm trying to prove this.

622
00:38:28,400 --> 00:38:30,440
This is now to prove.

623
00:38:30,440 --> 00:38:39,910
I don't, like, hammer away at
too many proofs in this course.

624
00:38:39,910 --> 00:38:41,690
But this is like
the central fact

625
00:38:41,690 --> 00:38:44,320
and it brings in all
the stuff we know.

626
00:38:44,320 --> 00:38:44,820
OK.

627
00:38:44,820 --> 00:38:46,700
So I'll start the proof.

628
00:38:46,700 --> 00:38:51,160
Suppose A transpose Ax is zero.

629
00:38:51,160 --> 00:38:56,110
What -- and I'm aiming to prove
A transpose A is invertible.

630
00:38:56,110 --> 00:38:58,150
So what do I want to prove now?

631
00:39:00,680 --> 00:39:03,560
So I'm aiming to
prove this fact.

632
00:39:03,560 --> 00:39:06,680
I'll use this, and I'm aiming
to prove that this matrix is

633
00:39:06,680 --> 00:39:11,740
invertible, OK, so if I
suppose A transpose Ax is zero,

634
00:39:11,740 --> 00:39:13,875
then what conclusion
do I want to reach?

635
00:39:16,450 --> 00:39:21,200
I'd like to know
that x must be zero.

636
00:39:21,200 --> 00:39:23,510
I want to show x must be zero.

637
00:39:23,510 --> 00:39:33,100
To show now -- to prove x
must be the zero vector.

638
00:39:33,100 --> 00:39:38,640
Is that right, that's what we
worked in the previous chapter

639
00:39:38,640 --> 00:39:43,850
to understand, that a
matrix was invertible

640
00:39:43,850 --> 00:39:51,960
when its null space is
only the zero vector.

641
00:39:51,960 --> 00:39:53,340
So that's what I want to show.

642
00:39:53,340 --> 00:40:00,520
How come if A transpose Ax is
zero, how come x must be zero?

643
00:40:00,520 --> 00:40:01,810
What's going to be the reason?

644
00:40:05,270 --> 00:40:06,880
Actually I have
two ways to do it.

645
00:40:10,270 --> 00:40:12,210
Let me show you one way.

646
00:40:12,210 --> 00:40:14,640
This is -- here, trick.

647
00:40:18,210 --> 00:40:22,880
Take the dot product
of both sides with x.

648
00:40:22,880 --> 00:40:25,980
So I'll multiply both
sides by x transpose.

649
00:40:25,980 --> 00:40:30,100
x transpose A transpose
Ax equals zero.

650
00:40:33,190 --> 00:40:35,100
I shouldn't have written trick.

651
00:40:35,100 --> 00:40:37,640
That makes it sound
like just a dumb idea.

652
00:40:37,640 --> 00:40:39,581
Brilliant idea, I
should have put.

653
00:40:39,581 --> 00:40:40,080
OK.

654
00:40:43,040 --> 00:40:44,215
I'll just put idea.

655
00:40:47,920 --> 00:40:49,230
OK.

656
00:40:49,230 --> 00:40:57,670
Now, I got to that equation,
x transpose A transpose Ax=0,

657
00:40:57,670 --> 00:41:06,229
and I'm hoping you can
see the right way to --

658
00:41:06,229 --> 00:41:07,270
to look at that equation.

659
00:41:12,760 --> 00:41:15,030
What can I conclude
from that equation,

660
00:41:15,030 --> 00:41:17,840
that if I have x
transpose A -- well,

661
00:41:17,840 --> 00:41:21,070
what is x transpose
A transpose Ax?

662
00:41:21,070 --> 00:41:25,360
Does that -- what
is it giving you?

663
00:41:29,620 --> 00:41:32,740
It's again going to be putting
in parentheses, I'm looking

664
00:41:32,740 --> 00:41:37,170
at Ax and what am I seeing here?

665
00:41:37,170 --> 00:41:39,560
Its transpose.

666
00:41:39,560 --> 00:41:47,580
So I'm seeing here this
is (Ax) transpose (Ax).

667
00:41:47,580 --> 00:41:48,325
Equaling zero.

668
00:41:51,640 --> 00:41:57,040
Now if Ax transpose Ax, so like
let's call it y or something,

669
00:41:57,040 --> 00:42:01,450
if y transpose y is zero,
what does that tell me?

670
00:42:06,780 --> 00:42:08,950
That the vector has
to be zero, right?

671
00:42:08,950 --> 00:42:10,650
This is the length
squared, that's

672
00:42:10,650 --> 00:42:15,730
the length of the vector Ax
squared, that's Ax times Ax.

673
00:42:15,730 --> 00:42:18,210
So I conclude that
Ax has to be zero.
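The regrouping step, that x transpose A transpose A x is the squared length of Ax, can be checked numerically. This is a sketch using the lecture's matrix with columns (1, 1, 1) and (1, 2, 3). The test vector `x` is a hypothetical choice, not from the lecture.

```python
# A is the lecture's 3x2 matrix with columns (1, 1, 1) and (1, 2, 3).
A = [[1, 1],
     [1, 2],
     [1, 3]]
x = [2, -1]  # hypothetical test vector, not from the lecture

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

Ax = matvec(A, x)
lhs = dot(x, matvec(transpose(A), Ax))  # x^T (A^T (A x))
rhs = dot(Ax, Ax)                       # (Ax)^T (Ax), the squared length of Ax
print(lhs, rhs)
```

The two numbers agree for any x, and a sum of squares can only be zero when every component of Ax is zero.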

674
00:42:23,474 --> 00:42:24,640
Well, I'm getting somewhere.

675
00:42:29,900 --> 00:42:34,610
Now that I know Ax
is zero, now I'm

676
00:42:34,610 --> 00:42:37,370
going to use my
little hypothesis.

677
00:42:37,370 --> 00:42:43,290
Somewhere every mathematician
has to use the hypothesis.

678
00:42:43,290 --> 00:42:45,050
Right?

679
00:42:45,050 --> 00:42:49,740
Now, if A has independent
columns and we've --

680
00:42:49,740 --> 00:42:55,580
we're at the point where Ax is
zero, what does that tell us?

681
00:42:55,580 --> 00:42:59,610
I could -- I mean that could
be like a fill-in question

682
00:42:59,610 --> 00:43:01,090
on the final exam.

683
00:43:01,090 --> 00:43:06,820
If A has independent columns
and if Ax equals zero then what?

684
00:43:10,390 --> 00:43:15,850
Please say it. x is zero, right.

685
00:43:15,850 --> 00:43:18,370
Which was just what
we wanted to prove.

686
00:43:18,370 --> 00:43:20,790
That -- do you see why that is?

687
00:43:20,790 --> 00:43:24,150
If Ax equals zero,
now we're using --

688
00:43:24,150 --> 00:43:27,190
here we used this was
the square of something,

689
00:43:27,190 --> 00:43:30,810
so I'll put in
little parentheses

690
00:43:30,810 --> 00:43:35,720
the observation we made, that
was a square which is zero,

691
00:43:35,720 --> 00:43:37,610
so the thing has to be zero.

692
00:43:37,610 --> 00:43:43,130
Now we're using the hypothesis
of independent columns

693
00:43:43,130 --> 00:43:48,600
that A has
independent columns.

694
00:43:48,600 --> 00:43:52,060
If A has independent
columns, this is telling me

695
00:43:52,060 --> 00:43:56,040
x is in its null space,
and the only thing

696
00:43:56,040 --> 00:44:00,510
in the null space of such a
matrix is the zero vector.
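Written out as one chain, the argument looks like this. It is a summary of the steps above in standard notation, with N(·) for the null space as an assumed symbol.

```latex
A^{T}A\,x = 0
\;\Longrightarrow\; x^{T}A^{T}A\,x = 0
\;\Longrightarrow\; (Ax)^{T}(Ax) = \lVert Ax \rVert^{2} = 0
\;\Longrightarrow\; Ax = 0
\;\Longrightarrow\; x = 0 \quad \text{(independent columns of } A\text{)}

\text{Hence } N(A^{T}A) = \{0\}, \text{ and the square matrix } A^{T}A \text{ is invertible.}
```

The hypothesis of independent columns is used only in the last implication.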

697
00:44:00,510 --> 00:44:01,320
OK.

698
00:44:01,320 --> 00:44:06,620
So that's the argument and
you see how it really used

699
00:44:06,620 --> 00:44:13,420
our understanding of the
-- of the null space.

700
00:44:13,420 --> 00:44:13,990
OK.

701
00:44:13,990 --> 00:44:15,650
That's great.

702
00:44:15,650 --> 00:44:16,420
All right.

703
00:44:16,420 --> 00:44:20,750
So where are we then?

704
00:44:20,750 --> 00:44:24,430
That board is like
the backup theory

705
00:44:24,430 --> 00:44:28,670
that tells me that
this matrix had

706
00:44:28,670 --> 00:44:32,610
to be invertible because these
columns were independent.

707
00:44:35,530 --> 00:44:38,360
OK.

708
00:44:38,360 --> 00:44:44,940
There's one case
of independent --

709
00:44:44,940 --> 00:44:50,540
there's one case where the
geometry gets even better.

710
00:44:50,540 --> 00:44:55,030
When the -- there's one case
when columns are sure to be

711
00:44:55,030 --> 00:44:56,610
independent.

712
00:44:56,610 --> 00:45:00,060
And let me put that -- let me
write that down and that'll be

713
00:45:00,060 --> 00:45:01,780
the subject for next time.

714
00:45:01,780 --> 00:45:07,040
Columns are sure -- are
certainly independent,

715
00:45:07,040 --> 00:45:23,290
definitely independent,
if they're perpendicular.

716
00:45:23,290 --> 00:45:25,190
Oh, I've got to rule
out the zero column,

717
00:45:25,190 --> 00:45:33,280
let me give them all length one,
so they can't be zero if they

718
00:45:33,280 --> 00:45:37,855
are perpendicular unit vectors.

719
00:45:42,870 --> 00:45:53,370
Like the vectors one, zero,
zero, zero, one, zero and zero,

720
00:45:53,370 --> 00:45:55,480
zero, one.

721
00:45:55,480 --> 00:46:00,660
Those vectors are unit
vectors, they're perpendicular,

722
00:46:00,660 --> 00:46:05,820
and they certainly
are independent.

723
00:46:05,820 --> 00:46:10,610
And what's more, suppose
they're -- oh, that's so nice,

724
00:46:10,610 --> 00:46:14,080
I mean what is A transpose
A for that matrix?

725
00:46:14,080 --> 00:46:16,470
For the matrix with
these three columns?

726
00:46:16,470 --> 00:46:18,280
It's the identity.

727
00:46:18,280 --> 00:46:23,090
So here's the key to the
lecture that's coming.

728
00:46:23,090 --> 00:46:27,210
If we're dealing with
perpendicular unit vectors

729
00:46:27,210 --> 00:46:32,000
and the word for that will
be -- see I could have said

730
00:46:32,000 --> 00:46:35,650
orthogonal, but I
said perpendicular --

731
00:46:35,650 --> 00:46:41,370
and this unit vectors gets
put in as the word normal.

732
00:46:41,370 --> 00:46:42,545
Orthonormal vectors.

733
00:46:46,070 --> 00:46:49,820
Those are the best
columns you could ask for.

734
00:46:49,820 --> 00:46:54,730
Matrices with -- whose
columns are orthonormal,

735
00:46:54,730 --> 00:46:56,950
they're perpendicular
to each other,

736
00:46:56,950 --> 00:47:01,010
and they're unit vectors, well,
they don't have to be those

737
00:47:01,010 --> 00:47:06,280
three, let me do a
final example over here,

738
00:47:06,280 --> 00:47:11,110
how about one at an angle like
that and one at ninety degrees,

739
00:47:11,110 --> 00:47:18,050
that vector would be cos theta,
sine theta, a unit vector,

740
00:47:18,050 --> 00:47:24,150
and this vector would be
minus sine theta cos theta.

741
00:47:24,150 --> 00:47:30,850
That is our absolute favorite
pair of orthonormal vectors.

742
00:47:30,850 --> 00:47:33,630
They're both unit vectors
and they're perpendicular.

743
00:47:33,630 --> 00:47:36,520
That angle is ninety degrees.
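For this pair, A transpose A can be checked to be the identity, just as for the three coordinate vectors. This is a floating-point sketch. The angle `theta` is a hypothetical value, and `gram` is an illustrative name for the matrix of dot products.

```python
import math

theta = 0.7  # hypothetical angle in radians; any value works

# The lecture's pair: (cos theta, sin theta) and (-sin theta, cos theta).
q1 = (math.cos(theta), math.sin(theta))
q2 = (-math.sin(theta), math.cos(theta))

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Entries of Q^T Q, where Q has columns q1 and q2.
gram = [[dot(q1, q1), dot(q1, q2)],
        [dot(q2, q1), dot(q2, q2)]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# Compare with a small tolerance because of floating-point rounding.
is_identity = all(
    math.isclose(gram[i][j], identity[i][j], abs_tol=1e-12)
    for i in range(2) for j in range(2)
)
print(is_identity)
```

The diagonal entries are the squared lengths and the off-diagonal entries are the dot products between different columns, so orthonormal columns give the identity.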

744
00:47:36,520 --> 00:47:41,500
So like our job next
time is first to see

745
00:47:41,500 --> 00:47:43,640
why orthonormal
vectors are great,

746
00:47:43,640 --> 00:47:47,240
and then to make vectors
orthonormal by picking

747
00:47:47,240 --> 00:47:49,530
the right basis.

748
00:47:49,530 --> 00:47:50,625
OK, see you.

749
00:47:57,070 --> 00:47:58,620
Thanks.