PROFESSOR: So let's begin. Today, I'm going to review linear algebra. I'm assuming that you already took some linear algebra course, and I'm going to just review the relevant content that will appear again and again throughout the course. But do interrupt me if some concepts are not clear, or if you don't remember some concept from linear algebra. I hope you do, but please let me know. I just don't know; you have very different background knowledge, so it's hard to tune the lecture to one specific group. So I tailored these lecture notes to be a review for those who took the most basic linear algebra course. If you have that background and something is still unclear, please feel free to interrupt me.

So I'm going to start by talking about matrices. A matrix, in its simplest form, is just a collection of numbers. For example, [1, 2, 3; 2, 3, 4; 4, 5, 10]. You can pick any number of rows and any number of columns; you just write down numbers in a rectangular array, and that's a matrix.

What's special about it? What kind of data can you arrange in a matrix? I'll take an example which looks relevant to us. For example, we can index the rows by stocks, by companies: Apple, Morgan Stanley should be there, and then Google. And then maybe we can index the columns by dates, say July 1st, October 1st, September 1st. For the entries you can pick whatever data you want, but probably the sensible choice is the stock price on that day; I don't know, for example 400, 500, and 5,000. That would be great. That kind of data is just a matrix. So defining a matrix is really simple. But why is it so powerful?
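To make the data point of view concrete, here is a minimal numpy sketch of such a matrix. The company names are the ones from the board; the prices and dates are made-up illustration values, not real quotes.

    import numpy as np

    # Rows indexed by companies, columns by dates (the board example).
    companies = ["Apple", "Morgan Stanley", "Google"]
    dates = ["Jul 1", "Oct 1", "Sep 1"]
    prices = np.array([
        [400.0,  401.0,  399.0],   # Apple (hypothetical prices)
        [500.0,  505.0,  498.0],   # Morgan Stanley (hypothetical)
        [5000.0, 5010.0, 4990.0],  # Google (hypothetical)
    ])
    print(prices.shape)  # (3, 3): 3 companies by 3 dates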
So that's the application point of view: a matrix is just a collection of data. But from a theoretical point of view, a matrix, an m by n matrix A, is an operator: it defines a linear transformation from the n-dimensional vector space to the m-dimensional vector space. That sounds a lot more abstract than the data picture.

So let's take a very small example. If I use the 2 by 2 matrix [2, 0; 0, 3], then [2, 0; 0, 3] times, let's say, [1, 1] is just [2, 3]. Does that make sense? It's just matrix multiplication.

So now try to combine the two points of view. What does it mean to have a linear transformation defined by a data set? And things start to get confusing. Why does a data set define a linear transformation, and does that have any sensible meaning? That's a good question to have in mind today, so try to remember it. Because today I'll develop the theory of eigenvalues and eigenvectors in a purely theoretical language, but it can still be applied to these data sets and yield very important properties and quantities; you can get some useful information out of it. Try to make sense of why that happens. So that will be the goal today: to treat linear algebra as a theoretical subject, while remembering that there's a real data set underlying it.

This board doesn't go up; that was a bad choice for my first board. Sorry.

So the most important concepts for us are the eigenvalues and eigenvectors of a matrix, which are defined as follows: a real number lambda and a vector v are an eigenvalue and eigenvector of a matrix A if A times v is equal to lambda times v. We also say that v is an eigenvector corresponding to lambda.

So remember, eigenvalues and eigenvectors always come in pairs, and they are defined by the property that A*v = lambda*v. First question: does every matrix have eigenvalues and eigenvectors?
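You can check the defining relation numerically; here is a small sketch with the 2 by 2 example above (numpy's eig returns all the eigenvalue/eigenvector pairs at once).

    import numpy as np

    A = np.array([[2.0, 0.0],
                  [0.0, 3.0]])

    print(A @ np.array([1.0, 1.0]))  # [2. 3.], the multiplication from the board

    # Columns of V are eigenvectors; lam holds the matching eigenvalues.
    lam, V = np.linalg.eig(A)
    print(lam)                       # [2. 3.]

    # The defining relation A v = lambda v holds for each pair.
    for i in range(len(lam)):
        assert np.allclose(A @ V[:, i], lam[i] * V[:, i])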
Nope? So Av = lambda*v looks like a very strange equation to satisfy. But if you rewrite it in the form (A - lambda I)v = 0, that still looks strange, but at least you understand that this can happen only if A - lambda I does not have full rank; so the determinant of (A - lambda I) is equal to 0. If and only if, in fact.

So now comes a very interesting observation: det(A - lambda I) is a polynomial of degree n in lambda. I made a mistake; I should have said this is only for n by n matrices, only for square matrices. Sorry. Since it's a polynomial of degree n, it has a root. So the equation has a solution, but the solution might be a complex number.

I'm really sorry; I'm nervous in front of the video. I understand why you were saying that they don't necessarily exist. Let me repeat, since I made a few mistakes here. For an n by n matrix A, a complex number lambda and a vector v are an eigenvalue and eigenvector if they satisfy the condition Av = lambda*v. The eigenvalue doesn't have to be real; sorry about that. And if we rephrase the condition this way, then because det(A - lambda I) is a polynomial, it always has at least one, possibly complex, solution. That was just a side point, very theoretical. So we see that there always exists at least one eigenvalue, with an eigenvector corresponding to it.

Now that we've seen existence, what is the geometric meaning? Let's go back to the linear transformation point of view. Suppose A is a 3 by 3 matrix. Then A takes a vector in R^3 and transforms it into another vector in R^3. But if you have the relation Av = lambda*v, then A, when applied to v, will just scale the vector v. If this was the original v, then Av will just be lambda times this vector. That will be our Av, which is equal to lambda*v.
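A quick sketch of the complex case: a 90-degree rotation matrix scales no real vector, but its degree-2 characteristic polynomial still has two (complex) roots, which is exactly what the eigenvalue solver finds.

    import numpy as np

    # Rotation by 90 degrees: no real eigenvectors exist.
    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])

    # For a 2 by 2 matrix, det(A - lambda I) = lambda^2 - trace*lambda + det.
    coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
    print(np.roots(coeffs))      # [0.+1.j 0.-1.j]
    print(np.linalg.eig(A)[0])   # the same two complex eigenvalues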
So eigenvectors are those special vectors which, when the linear transformation is applied, just get scaled by some amount, and that amount is exactly lambda.

What we've established so far, what we've recalled so far, is that every n by n matrix has at least one such direction: there is some vector which the linear transformation defined by A just scales. Which is quite interesting, if you ever thought about it before; there's no reason such a vector should exist. Of course, I'm lying a little bit, because these might be complex vectors. But at least in the complex world it's true.

If you think about this, it's very helpful: from these vectors' point of view, the linear transformation is really easy to understand. That's why eigenvalues and eigenvectors are so good. They break the linear transformation down into really simple operations.

Let me formalize that a little bit more. In the extreme case, we call an n by n matrix A diagonalizable if there exists an orthonormal matrix U (I'll recall what that is) such that A is equal to U times D times U inverse for a diagonal matrix D.

Let me parse through this a little bit. What is an orthonormal matrix? It's a matrix defined by the relation U times U transpose equals the identity. What is a diagonal matrix? It's a matrix whose nonzero entries are all on the diagonal; all the rest are zero.

Why is it so good to have this decomposition? What does it mean to have an orthonormal matrix like this? Basically, I'll just explain what's happening. If a matrix is diagonalizable, if this A is diagonalizable, there will be three directions v_1, v_2, v_3 (in the 3 by 3 case) such that when you apply A, v_1 scales by some lambda_1, v_2 scales by some lambda_2, and v_3 scales by some lambda_3. So we can completely understand the transformation A just in terms of these three vectors.
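Here is a minimal numerical sketch of this decomposition. I use a small symmetric matrix as the example, since (as we'll see below) symmetric matrices always admit an orthonormal U, and then U inverse is just U transpose.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    # eigh is numpy's solver for symmetric matrices; it returns an
    # orthonormal U whose columns are eigenvectors.
    lam, U = np.linalg.eigh(A)
    D = np.diag(lam)

    assert np.allclose(U @ U.T, np.eye(2))  # U is orthonormal, so U^{-1} = U^T
    assert np.allclose(A, U @ D @ U.T)      # A = U D U^{-1}

    # Applying A to an eigenvector (a column of U) just scales it.
    print(A @ U[:, 0], lam[0] * U[:, 0])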
So this, the material here, is the most important linear algebra that you'll use throughout this course. Let me repeat it really slowly. An eigenvalue and eigenvector are defined by the relation Av = lambda*v. We know that every matrix has at least one eigenvalue, and there is an eigenvector corresponding to it. And eigenvectors have this geometric meaning: a vector is an eigenvector if the linear transformation defined by A just scales that vector.

For our setting, the really good matrices are the ones which can be broken down into these directions. Those directions are given by U, and D records how much each one scales. So in this case U will be our v_1, v_2, v_3, and D will be our lambda_1, lambda_2, lambda_3 on the diagonal, with all other entries 0.

Any questions so far?

So that's the abstract side. Now remember the question I posed in the beginning. Remember that matrix where we had stocks and dates, and stock prices in the entries? What would an eigenvector of that matrix mean? What would an eigenvalue mean? Try to think about that question. It's not that it will have some direct physical counterpart, but there are some really interesting things going on there.

The bad news is that not all matrices are diagonalizable. If a matrix is diagonalizable, it's really easy to understand what it does, because it breaks down into these three directions if it's 3 by 3, or n directions if it's n by n. Unfortunately, not all matrices are diagonalizable. But there is a very special class of matrices which are always diagonalizable, and fortunately we will see those matrices throughout the course; most of the n by n matrices we will study fall into this category.

So: an n by n matrix A is symmetric if A is equal to A transpose. Before proceeding, please raise your hand if you're familiar with all the concepts so far. OK, good feeling.
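Checking symmetry is a one-liner; as a small sketch, here it is for two of the board examples (the 3 by 3 data matrix from the start of the lecture fails the test, which comes up again below).

    import numpy as np

    def is_symmetric(A):
        # A matrix is symmetric when it equals its own transpose.
        return np.allclose(A, A.T)

    print(is_symmetric(np.array([[2.0, 1.0], [1.0, 2.0]])))            # True
    print(is_symmetric(np.array([[1, 2, 3], [2, 3, 4], [4, 5, 10]])))  # False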
So a matrix is symmetric if it's equal to its transpose, and the transpose is obtained by taking the mirror image across the diagonal.

Theorem 1: it is known that all symmetric matrices are diagonalizable. Ah, I've made another mistake: orthonormally. All symmetric matrices are orthonormally diagonalizable, meaning diagonalizable with an orthonormal U as above. A matrix is called just diagonalizable if we drop that condition on U and replace it with invertible.

So symmetric matrices are really good. And fortunately, most of the n by n matrices that we will study are symmetric; just by the nature of the constructions, they will be symmetric. The one I gave as an example is not symmetric, but I will address that issue in a minute.

And another important thing, Theorem 2: symmetric matrices have real eigenvalues. So for symmetric matrices, this geometric scaling picture is really the picture you should have in mind.

Proof of Theorem 2. Suppose lambda is an eigenvalue with eigenvector v; then by definition we have Av = lambda*v. Now multiply by the conjugate transpose of v on both sides: conj(v)^T A v = lambda * conj(v)^T v, and the right side is lambda times the norm of v squared. Now take the complex conjugate of the whole equation. A is real, so conjugating turns the left side into v^T A conj(v), which equals conj(v)^T A^T v, and the right side becomes conj(lambda) times the norm of v squared. But because A is real symmetric, A is equal to A transpose, so the two left-hand expressions, conj(v)^T A v and conj(v)^T A^T v, are the same. Then the right-hand sides must also be the same, which means lambda is equal to the conjugate of lambda. So lambda has to be real.

Theorem 1 is a little bit more complicated, and it involves more advanced concepts like bases and linear subspaces, and so on. Those concepts are not really important for this class, so I'll just skip the proof. But it's really important to remember these two theorems. Whenever you see a symmetric matrix, you should really feel like you have control over it, because you can diagonalize it.
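Here is a quick numerical illustration of Theorem 2: symmetrizing a random matrix forces all eigenvalues onto the real line (a sketch, using numpy's general, complex-capable solver).

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = B + B.T                       # B + B^T is always symmetric

    # The general eigenvalue solver is allowed to return complex numbers,
    # but for the symmetric A every imaginary part vanishes.
    print(np.max(np.abs(np.linalg.eigvals(A).imag)))  # 0.0

    # The unsymmetrized B, by contrast, typically has complex eigenvalues.
    print(np.linalg.eigvals(B))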
And moreover, all eigenvalues are real. So you have really good control over symmetric matrices.

That's good. That was when everything went well: we could diagonalize. So far we've seen that if a matrix is symmetric, we can diagonalize it, and then it's really easy to understand. But what about general matrices? In general, first of all, not all matrices are diagonalizable. But we still want a decomposition like this. Diagonalization was A = U D U^{-1}, and we want something similar. Our goal is to still understand a given matrix A through simple operations, such as scaling. When the matrix was diagonalizable, this was possible; unfortunately not every matrix is, so we have to do something else.

So that's what I want to talk about. And luckily, the good news is that there is a nice tool we can use for all matrices. It is, in fact, a little bit weaker than diagonalization, but it still distills some very important information about the matrix. It's called the singular value decomposition.

So this will be our second tool for understanding matrices. It's very similar to diagonalization, which I'll also call the eigenvalue decomposition, but it has a slightly different form. So what is its form?

Theorem: let A be an m by n matrix. Then there always exist orthonormal matrices U and V such that A is equal to U times Sigma times V transpose, for some diagonal matrix Sigma.

Let me parse through the theorem a little bit more. Whenever you're given a matrix, it doesn't even have to be a square matrix anymore, and it can be non-symmetric. Whenever you're given an m by n matrix, there always exist two orthonormal matrices U and V such that A can be decomposed as U times Sigma times V transpose, where Sigma is a diagonal matrix.
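As a sketch of the theorem in action: numpy computes the three factors directly, for a matrix that is neither square nor symmetric (it returns U, the diagonal of Sigma, and V transpose).

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 5))   # m = 3, n = 5: not square, not symmetric

    U, s, Vt = np.linalg.svd(A, full_matrices=True)

    Sigma = np.zeros((3, 5))          # the m by n "diagonal" matrix
    Sigma[:3, :3] = np.diag(s)

    assert np.allclose(U @ U.T, np.eye(3))    # U is orthonormal (m by m)
    assert np.allclose(Vt @ Vt.T, np.eye(5))  # V is orthonormal (n by n)
    assert np.allclose(A, U @ Sigma @ Vt)     # A = U Sigma V^T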
But now the sizes of the matrices are important: U is an m by m matrix, Sigma is an m by n matrix, and V is an n by n matrix. That just denotes the dimensions of the matrices. So what does it mean for an m by n matrix to be diagonal? It means the same thing: only the (i, i) entries are allowed to be nonzero.

So that was just a bunch of words; let me rephrase it. Let me compare the eigenvalue decomposition with the singular value decomposition. EVD, what we just saw before, only works for n by n matrices which are diagonalizable. SVD works for all m by n matrices in general.

However, the EVD is powerful because it gives you one frame, v_1, v_2, v_3, within which A acts as a scaling operator: that's what A does to each of those directions. That's because the U on both sides of A = U D U^{-1} is the same.

For the singular value decomposition, what you have instead is, first of all, that the spaces are different: A takes a vector in R^n and brings it to R^m. What's going to happen is that there will be one frame in the source space and one frame in the target space. So there will be vectors v_1, v_2, v_3, v_4 here, and vectors u_1, u_2, u_3 there. And when you take v_1, A will take v_1 to u_1 and scale it a little bit, according to that diagonal. A will take v_2 to u_2 and scale it; it will take v_3 to u_3 and scale it. Wait a minute, for v_4 we don't have a u_4. What's going to happen is that it just disappears: v_4, when A is applied, goes to zero.

So I know it's a very vague explanation, but try to compare the geometric pictures. A diagonalization, an eigenvalue decomposition, works within its own frame, so it's very, very powerful: you just have some directions, and you scale those directions.
But the singular value decomposition is applicable to a much more general class of matrices, and in exchange it's more restricted in what it says: you have two frames, one for the original space and one for the target space. What the linear transformation does is send each vector of one frame to the corresponding vector of the other frame, scaled by some amount.

So now is another good time to go back to that matrix from the very beginning. Remember that example where we had a matrix of companies and dates, and the entries were stock prices? If it's an n by n matrix, you can try to apply both the eigenvalue decomposition and the singular value decomposition, but what will be more sensible in this case is the singular value decomposition. I won't explain why, or what's happening here; Peter probably will, and you will come to it later. But just try to do some imagining before hearing what really happens in the real world. Try to use your own imagination, your own language, to express what this decomposition is doing for this matrix.

It might look like total nonsense. Why does this data even have a geometry? Why does it define a linear transformation, and so on? But it's just a beautiful theory which gives a lot of useful information. I can't emphasize it enough, because these decompositions are really universal, used all over science: the eigenvalue decomposition and the singular value decomposition. Not just for this course; it's pretty much safe to say that in every branch of engineering you'll encounter one of these forms.

So let me talk about the proof of the singular value decomposition. Afterwards, I will show you what the singular value decomposition does for an example matrix that I chose.

Proof of the singular value decomposition, which is interesting: it relies on the eigenvalue decomposition. So, given a matrix A, consider the eigenvalues of A transpose A.
First observation: A transpose A is a symmetric matrix. So, if you remember, it will have real eigenvalues, and it's diagonalizable.

So A^T A has eigenvalues lambda_1, lambda_2, up to lambda_n (it's an n by n matrix), and corresponding eigenvectors v_1, v_2, up to v_n. For convenience, I will cut the list at lambda_r and assume all the rest are 0. There might be none which are 0; in that case we use all the eigenvalues. But I am only interested in the nonzero eigenvalues, so I'll say lambda_1 up to lambda_r are nonzero, and afterwards they're 0. It's just a notational choice.

And now I'm just going to make a claim that they're all positive; for this part, just believe me. (In fact, lambda_i times the norm of v_i squared equals v_i^T A^T A v_i, which is the norm of A v_i squared, so no eigenvalue of A^T A can be negative.) Then, if that's the case, we can rewrite the eigenvalues as sigma_1^2, sigma_2^2, up to sigma_r^2, and then 0s.

That was my first step. My second step is to define u_1 as A*v_1 / sigma_1, u_2 as A*v_2 / sigma_2, and so on up to u_r as A*v_r / sigma_r. And then u_(r+1) up to u_m are chosen to complete the above into an orthonormal basis. For those who don't follow: we pick u_1 up to u_r first, and then pick the rest so that everything together forms a basis. And you'll see why I only care about the nonzero eigenvalues: I have to divide by the sigma values, and if a sigma is zero, I can't do the division. So that's why I identified those which are not zero.

And then we're done. It doesn't look at all like we're done, but I'm going to let my U be u_1, u_2, up to u_m, and my V I will pick as v_1, v_2, up to v_r, and then v_(r+1) up to v_n; this again is just completed into a basis.

Now let's see what happens. We want A = U Sigma V transpose, and that is the same as showing that U transpose times A times V is the diagonal matrix Sigma. (I first wrote the product in the wrong order on the board; this is the form I want.) So write out U^T A V: the rows of U^T are u_1 transpose through u_m transpose.
And the columns of A times V are, because of the definition of V: A v_1 = sigma_1 u_1, A v_2 = sigma_2 u_2, up to A v_r = sigma_r u_r, and the remaining columns are zero. (On the board I first wrote A v_j = lambda_j v_j; sorry, that's not right, and thank you for the correction. The lambdas are the eigenvalues of A transpose A, not of A; what's true is A v_j = sigma_j u_j, by the definition of u_j.)

Now let's do a few computations on the entries of U^T A V. The (1, 1) entry is u_1 transpose times A v_1, which is u_1 transpose times sigma_1 u_1, and that is sigma_1, since u_1 is a unit vector.

Then look at the next entry in that row: u_1 transpose times A v_2, which is u_1 transpose times sigma_2 u_2. I claim that this is equal to 0. Why is that the case? u_1 transpose is equal to v_1 transpose A transpose over sigma_1, and sigma_2 u_2 is equal to A v_2, because u_2 is A v_2 over sigma_2; so the sigma_2's cancel, and we are left with v_1^T A^T A v_2 over sigma_1. But v_1 and v_2 are two different eigenvectors of A transpose A, which at the beginning we took from an orthonormal eigendecomposition of A transpose A.
So A^T A v_2 is lambda_2 v_2, and we have v_1^T times lambda_2 v_2 over sigma_1, which is lambda_2 over sigma_1 times v_1 transpose v_2. These two eigenvectors are orthogonal, so this gives 0.

So if you do the whole computation, what you're going to get is sigma_1, sigma_2, up to sigma_r on the diagonal, and 0 everywhere else. That is exactly Sigma, so A = U Sigma V transpose.

Sorry for the confusion; actually the process is quite simple, I was just lost in the computation in the middle. The process is: first look at A transpose A, and find its eigenvalues and eigenvectors. Using those, define the matrix V. And you can define the matrix U by applying A to each v and dividing by sigma; each of those defines a column of U.

The reason I wanted to go through this proof is that it gives you a process for finding a singular value decomposition. It was a little bit painful for me, but if you have a matrix, there are just these simple steps you can follow to find the singular value decomposition: look at A transpose A, find its eigenvalues and eigenvectors, and arrange them in the right way. Of course, the right way needs some practice to be done correctly. But once you do that, you just obtain a singular value decomposition.

And really, I can't explain how powerful it is. You will only see later in the course how powerful this decomposition is, and only then will you appreciate how good it is to have this decomposition, and to be able to compute it so simply.

So let's try to do it by hand. Yes?

STUDENT: So when you compute the [INAUDIBLE].

PROFESSOR: Yes.

STUDENT: [INAUDIBLE]

PROFESSOR: They would have to be orthonormal, yeah. These u's should be orthonormal, and these v's also. (The u_1 up to u_r are automatically unit vectors: the norm of A v_i squared is v_i^T A^T A v_i = lambda_i = sigma_i^2, so dividing by sigma_i normalizes them.) And that's a good point, because that can be annoying when you want to do this decomposition by hand: to complete the basis, you have to do some Gram-Schmidt process or something like that.
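The process from the proof is short enough to code directly. Here is a sketch (assuming numpy) that builds an SVD from the eigendecomposition of A transpose A; it keeps only the nonzero singular values, which anticipates the reduced form discussed later, and it is tried out on the 2 by 3 matrix of the worked example below.

    import numpy as np

    def svd_via_eig(A, tol=1e-10):
        # Step 1: eigenvalues/eigenvectors of the symmetric matrix A^T A.
        # eigh guarantees an orthonormal set of eigenvectors.
        lam, V = np.linalg.eigh(A.T @ A)

        # Sort descending and keep the nonzero lambda_i = sigma_i^2.
        order = np.argsort(lam)[::-1]
        lam, V = lam[order], V[:, order]
        r = int(np.sum(lam > tol))
        sigma = np.sqrt(lam[:r])

        # Step 2: u_i = A v_i / sigma_i (divide each column by its sigma).
        U = A @ V[:, :r] / sigma

        return U, np.diag(sigma), V[:, :r].T

    A = np.array([[3.0, 2.0, 2.0],
                  [2.0, 3.0, -2.0]])
    U, S, Vt = svd_via_eig(A)
    print(np.diag(S))                   # [5. 3.]
    print(np.allclose(A, U @ S @ Vt))   # True

To get the full, non-reduced factors, you would complete U and V to orthonormal bases, for example by Gram-Schmidt, exactly as in the proof.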
By hand, I don't really mean by hand, other than when you're doing homework, because you can use a computer to do it. And in fact, if you use a computer, there are much better algorithms known than this one, which can do it a lot more quickly and more efficiently.

So let's try to do it by hand. Let A be this matrix: [3, 2, 2; 2, 3, -2]. And we want to find the singular value decomposition of this.

A transpose A, we have to compute that: multiply A transpose by [3, 2, 2; 2, 3, -2], and you will get [13, 12, 2; 12, 13, -2; 2, -2, 8].

And let me just say that the eigenvalues are 0, 9, and 25. So in this algorithm, sigma_1^2 will be 25, sigma_2^2 will be 9, and sigma_3^2 will be 0. So we can take sigma_1 to be 5, sigma_2 to be 3, sigma_3 to be 0.

Now we have to find the corresponding eigenvectors to find the singular value decomposition. And I'll just do one, to remind you how to find an eigenvector. A transpose A minus 25 I is equal to, if you subtract 25 from the diagonal entries, [-12, 12, 2; 12, -12, -2; 2, -2, -17]. And then you have to find the vector which annihilates this matrix. I can take that vector, after normalizing, to be (1 over square root 2, 1 over square root 2, 0).

And then just do the same for the other eigenvectors. You find v_2 to be (1 over square root 18, negative 1 over square root 18, 4 over square root 18). Now then find v_3, the one corresponding to eigenvalue 0; I'll just say it's (x, y, z). This will not be important, and I'll explain why.

Then our V, written in columns, is: first column (1 over square root 2, 1 over square root 2, 0), second column (1 over square root 18, negative 1 over square root 18, 4 over square root 18), and third column just (x, y, z).
And U will be defined as (u_1, u_2), where u_1 is A v_1 over sigma_1 and u_2 is A v_2 over sigma_2. So multiply A by each vector and divide by the sigma to get U. I already did the computation for you; I'll write the result below. Yes?

STUDENT: How did you get v_1?

PROFESSOR: v_1? So if you did the computation right in the beginning to get the eigenvalues, then A^T A - 25I cannot have full rank. So there has to be a vector v which, when multiplied by this matrix, gives the (0, 0, 0) vector. You write (a, b, c), set the product equal to (0, 0, 0), and just solve the system of linear equations. There will be several solutions; for example, we could take (1, 1, 0) as well, but I just normalized it to have length 1.

So there's a lot of work involved if you want to do it by hand, even though you can do it. You have to find eigenvalues, find eigenvectors, in this case three of them, and then you have to do more work, and more work. But it can be done, and we are done now.

So this decomposes A into U Sigma V transpose. U is given as [1 over square root 2, 1 over square root 2; 1 over square root 2, minus 1 over square root 2]. Sigma has to be 2 by 3: it is [5, 0, 0; 0, 3, 0]. And V is as above, so V transpose is just the transpose of that. Let me actually write V transpose out, because I want to show you why (x, y, z) is not important: its rows are (1 over square root 2, 1 over square root 2, 0), then (1 over square root 18, minus 1 over square root 18, 4 over square root 18), then (x, y, z).

The reason I'm saying it is not important is that I can just drop the third column of Sigma and this (x, y, z) row altogether: they only ever multiply each other. So the message here is that the eigenvectors corresponding to eigenvalue zero are not important; the only relevant ones are those for nonzero eigenvalues.
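Here is a quick numerical check of this hand computation (a sketch; the unknown (x, y, z) row is set to zero, which is fine because it is multiplied by the zero column of Sigma anyway):

    import numpy as np

    A = np.array([[3.0, 2.0, 2.0],
                  [2.0, 3.0, -2.0]])

    U = np.array([[1.0,  1.0],
                  [1.0, -1.0]]) / np.sqrt(2)
    Sigma = np.array([[5.0, 0.0, 0.0],
                      [0.0, 3.0, 0.0]])
    Vt = np.array([[1/np.sqrt(2),   1/np.sqrt(2),  0.0],
                   [1/np.sqrt(18), -1/np.sqrt(18), 4/np.sqrt(18)],
                   [0.0,            0.0,           0.0]])  # the (x, y, z) row

    print(np.allclose(A, U @ Sigma @ Vt))  # True, whatever (x, y, z) is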
So drop that zero column of Sigma and that (x, y, z) row of V transpose; that will save you some computation.

So let me state a different form of the singular value decomposition; this works in general. There's a corollary: we get a simplified form of the SVD, where A is still equal to U times Sigma times V transpose, and A is an m by n matrix, U is still an m by m matrix, but now Sigma is also an m by m matrix, and V transpose is an m by n matrix (so V is n by m). This only works when m is less than or equal to n.

So the proof is exactly the same, and the last step is just to drop the irrelevant information. I will not write down why it works, but you can see, if you go through it, that dropping this part corresponds to exactly that information.

So that's the reduced form. Let's see: in the beginning we had A (I erased it); A was the 2 by 3 matrix, and we obtained a decomposition into a 2 by 2, a 2 by 2, and a 2 by 3 matrix. If we hadn't deleted the third column of Sigma and the third row of V transpose, we would have obtained a 2 by 2, times a 2 by 3, times a 3 by 3 matrix. But now we can simplify it by removing those.

And it might not look that much different on this board, because I just erased one row and one column. But many matrices that you'll see in real applications have much lower rank than their number of columns and rows. So if r is a lot smaller than both m and n, or if m and n have a big gap (it's not obvious here), then the number of columns that you're saving can be enormous.

So to illustrate with an example, look at the stock prices again, where you have companies and dates. Previously I just gave an example of a 3 by 3 matrix, but it's more sensible to have a lot more dates than companies. So let's say you recorded 365 days of a year, even though the market is not open all those days, and just five companies.
If you did the full decomposition of that matrix, you'd have a 5 by 5, a 5 by 365, and a 365 by 365 matrix here. But in the reduced form, those last two become a 5 by 5 and a 5 by 365, so you're saving a lot of space.

So if you just look at the board, it doesn't look like it's so powerful, but in fact it is. So that's the reduced form, and that will be the form that you'll see most of the time.

So, I made a lot of mistakes today. I have one more topic, a totally unrelated one. Any questions before I move on to the next topic?

Yes?

STUDENT: [INAUDIBLE]

PROFESSOR: Can you press the button?

STUDENT: [INAUDIBLE]
PROFESSOR: Oh, so for this data, what it means. You're asking what the eigenvectors would mean for this data? It will give you some structure among the stocks; it will give you something like correlation. Each eigenvector will give you a group of companies that are correlated somehow; it measures their correlation with each other. So I don't have a very good explanation of its physical meaning. Maybe you can add a little bit more?

GUEST SPEAKER: Possibly. We will get into this in later lectures. But in the singular value decomposition, what you want to think of is that these orthonormal matrices are really defining a new basis, an orthogonal basis. So you're taking the original coordinate system and rotating it, without stretching or squeezing the data; you're just rotating the axes. An orthonormal matrix gives you the cosines of the new coordinate system with respect to the old one. So the singular value decomposition is simply rotating the data into a different orientation. And the orthonormal basis that you're transforming to gives, essentially, the coordinates of the original data in the transformed system.

So as Choongbum was commenting, you're essentially looking at a representation of the original data points in a linearly transformed space, and the correlations between different stocks, say, are represented by how those points are oriented in the transformed space.

PROFESSOR: So you'll have to see real data to really make sense of it. But another way to think about it is where it comes from. The whole singular value decomposition, if you remember the proof, comes from the eigenvectors and eigenvalues of A transpose A. Now look at A transpose A, or, I'll just say, at A times A transpose; it's pretty much the same. If you look at A times A transpose, you're going to get an m by m matrix, and it will be indexed on both sides by the companies. The numbers in it will represent how much the companies are related to each other, how much correlation they have between each other. So by looking at the eigenvectors of this matrix, you're looking at the correlations between these company stock prices. And that information is represented inside the singular value decomposition.

But again, it's a lot better understood when you have real numbers and real data, which you will have later. So please be excited and wait; you're going to see some cool stuff.

So that was all for the eigenvalue decomposition and the singular value decomposition. And the last thing I want to mention today is something called the Perron-Frobenius theorem. This one looks even more theoretical than the ones I showed you. But surprisingly, a few years ago Steve Ross, a faculty member in the business school here, found a very interesting result, called the Ross recovery theorem, that makes use of this Perron-Frobenius theorem that I will tell you about today. Unfortunately, you will only see a lecture on the Ross recovery theorem towards the end of the semester.
816 01:03:13,730 --> 01:03:16,560 So I will try to recall what it is later. 817 01:03:16,560 --> 01:03:19,110 But since we're talking about linear algebra today, 818 01:03:19,110 --> 01:03:22,540 let me introduce the theorem. 819 01:03:22,540 --> 01:03:24,040 This is called Perron-Frobenius. 820 01:03:28,040 --> 01:03:30,470 And you really won't believe that it has any applications 821 01:03:30,470 --> 01:03:33,955 in finance, because it just looks so theoretical. 822 01:03:37,320 --> 01:03:40,830 I'm just stating a really weak form. 823 01:03:40,830 --> 01:03:43,940 Weak form. 824 01:03:43,940 --> 01:03:54,330 Let A be an n by n symmetric matrix 825 01:03:54,330 --> 01:03:57,360 whose entries are all positive. 826 01:04:03,790 --> 01:04:10,910 Then A has a few special properties. 827 01:04:10,910 --> 01:04:14,790 First, there exists 828 01:04:14,790 --> 01:04:20,610 a largest eigenvalue, lambda_0, such 829 01:04:20,610 --> 01:04:24,960 that the absolute value of lambda is less than lambda_0 830 01:04:24,960 --> 01:04:31,182 for all other eigenvalues lambda. 831 01:04:31,182 --> 01:04:34,560 So this statement is really easy for a symmetric matrix. 832 01:04:34,560 --> 01:04:36,709 You can actually drop the symmetry assumption, 833 01:04:36,709 --> 01:04:39,000 but I stated it this way because I'm going to prove it only 834 01:04:39,000 --> 01:04:40,410 for this weak case. 835 01:04:40,410 --> 01:04:44,550 Just think about the statement when it's not symmetric. 836 01:04:44,550 --> 01:04:48,730 So if you have an n by n matrix whose entries are all positive, 837 01:04:48,730 --> 01:04:53,290 then there exists a real eigenvalue, lambda_0, 838 01:04:53,290 --> 01:04:59,170 such that the absolute values of all other eigenvalues 839 01:04:59,170 --> 01:05:02,860 are strictly smaller than this eigenvalue. 840 01:05:02,860 --> 01:05:05,540 So remember that if it's not a symmetric matrix, 841 01:05:05,540 --> 01:05:08,100 the eigenvalues can be complex. 842 01:05:08,100 --> 01:05:10,500 This is saying that there's a unique eigenvalue which 843 01:05:10,500 --> 01:05:13,790 has the largest absolute value, and moreover, it's a real number. 844 01:05:16,810 --> 01:05:23,260 Second, there exists an eigenvector, 845 01:05:23,260 --> 01:05:34,810 a positive eigenvector with positive entries, 846 01:05:34,810 --> 01:05:40,120 corresponding to lambda_0. 847 01:05:40,120 --> 01:05:43,660 So the eigenvector corresponding to this lambda_0 848 01:05:43,660 --> 01:05:46,690 has positive entries. 849 01:05:46,690 --> 01:05:51,320 And the third part is that lambda_0 is 850 01:05:51,320 --> 01:06:05,060 an eigenvalue of multiplicity 1, for those who know what that means. 851 01:06:05,060 --> 01:06:08,070 So this really is a unique eigenvalue 852 01:06:08,070 --> 01:06:11,580 with a unique eigenvector, which has positive entries. 853 01:06:11,580 --> 01:06:14,361 And it's strictly larger than all the other eigenvalues in absolute value. 854 01:06:17,660 --> 01:06:19,660 So from the mathematician's point of view, 855 01:06:19,660 --> 01:06:21,040 this has many applications. 856 01:06:21,040 --> 01:06:23,060 It's used in probability theory. 857 01:06:23,060 --> 01:06:25,290 My main research area is combinatorics, 858 01:06:25,290 --> 01:06:27,090 discrete mathematics. 859 01:06:27,090 --> 01:06:30,080 It's also used there. 860 01:06:30,080 --> 01:06:31,790 So from the theoretical point of view, 861 01:06:31,790 --> 01:06:35,470 this has been used in many contexts.
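Before the proof sketch, here is a quick numerical check of the three claims. This is an editorial illustration under the weak hypothesis stated above (symmetric with strictly positive entries), not something from the lecture:

```python
# Numerically check the weak Perron-Frobenius statement for a random
# symmetric matrix with strictly positive entries.
import numpy as np

rng = np.random.default_rng(2)
B = rng.random((4, 4)) + 0.1    # entries strictly positive
A = (B + B.T) / 2               # symmetrize; entries remain positive

eigvals, eigvecs = np.linalg.eigh(A)   # real eigenvalues, ascending order
lam0 = eigvals[-1]                     # candidate largest eigenvalue
v0 = eigvecs[:, -1]
if v0[0] < 0:                          # eigenvectors are only defined
    v0 = -v0                           # up to sign, so fix the sign

print(lam0 > np.abs(eigvals[:-1]).max())  # 1. strictly dominant (and real)
print(np.all(v0 > 0))                     # 2. eigenvector entrywise positive
print(np.isclose(lam0, eigvals[-2]))      # 3. False, i.e., multiplicity 1
```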
862 01:06:35,470 --> 01:06:38,990 It's not a standard theorem taught in linear algebra. 863 01:06:38,990 --> 01:06:42,990 So probably most of you haven't seen it before. 864 01:06:42,990 --> 01:06:47,420 But it's a well known result, with many uses, 865 01:06:47,420 --> 01:06:49,070 theoretical uses. 866 01:06:49,070 --> 01:06:54,700 But you will also see one use later, as I mentioned, 867 01:06:54,700 --> 01:06:56,921 in finance, which is quite surprising. 868 01:07:03,740 --> 01:07:07,320 So let me just give you some feeling for why it happens. 869 01:07:07,320 --> 01:07:10,300 I won't give you the full details of the proof, just 870 01:07:10,300 --> 01:07:11,590 a very brief description. 871 01:07:16,257 --> 01:07:26,214 Here's a sketch for when A is symmetric, just the simple case. 872 01:07:32,690 --> 01:07:38,540 In this case, look at the statement. 873 01:07:42,800 --> 01:07:44,650 First of all, A has real eigenvalues. 874 01:07:53,540 --> 01:07:59,530 I'll order them from largest to smallest: lambda_1, lambda_2, up to lambda_n. 875 01:07:59,530 --> 01:08:02,790 And at some point, say up to lambda_i, 876 01:08:02,790 --> 01:08:04,510 they're greater than zero, and past that 877 01:08:04,510 --> 01:08:06,670 they're smaller than zero. 878 01:08:06,670 --> 01:08:08,170 There are some positive eigenvalues. 879 01:08:08,170 --> 01:08:11,490 There are some negative eigenvalues. 880 01:08:11,490 --> 01:08:13,775 So that's observation one. 881 01:08:18,050 --> 01:08:22,090 Things are easier to control, because they are all real. 882 01:08:22,090 --> 01:08:25,590 The first statement says that-- maybe I should have indexed 883 01:08:25,590 --> 01:08:27,384 the largest one as lambda_0. 884 01:08:27,384 --> 01:08:30,729 I'll just call lambda_1 lambda_0 instead. 885 01:08:30,729 --> 01:08:34,180 This lambda_0 is in fact larger in absolute value 886 01:08:34,180 --> 01:08:35,630 than lambda_n. 887 01:08:35,630 --> 01:08:42,010 That's the content of the first bullet. 888 01:08:42,010 --> 01:08:45,640 So if the entries are all positive, then 889 01:08:45,640 --> 01:08:47,790 the largest positive eigenvalue 890 01:08:47,790 --> 01:08:58,529 dominates the most negative eigenvalue in absolute value. 891 01:08:58,529 --> 01:09:01,310 So why is that the case? 892 01:09:01,310 --> 01:09:02,920 To see that, you have 893 01:09:02,920 --> 01:09:05,610 to go through a few steps. 894 01:09:05,610 --> 01:09:06,859 So we go to observation two. 895 01:09:10,090 --> 01:09:11,950 So look at lambda_0. 896 01:09:11,950 --> 01:09:21,234 Lambda_0 has an eigenvector with positive entries. 897 01:09:27,529 --> 01:09:29,880 Why is that the case? 898 01:09:29,880 --> 01:09:35,939 That's because if you look at A times v 899 01:09:35,939 --> 01:09:49,185 equals lambda_0 times v. Let me state it this way. 900 01:09:49,185 --> 01:09:54,135 Lambda_0 is the maximum of all the eigenvalues lambda, 901 01:10:01,560 --> 01:10:03,535 and v is an eigenvector 902 01:10:03,535 --> 01:10:04,035 corresponding to it. 903 01:10:08,985 --> 01:10:09,975 Sorry about the earlier mix-up with the indexing. 904 01:10:09,975 --> 01:10:14,610 Now if you look at this, if v has a negative entry, 905 01:10:14,610 --> 01:10:23,750 then flip it. 906 01:10:23,750 --> 01:10:24,970 Flip the sign, 907 01:10:24,970 --> 01:10:34,346 and in this way obtain a new vector v prime. 908 01:10:38,180 --> 01:10:43,070 Since A has positive entries,
909 01:10:47,990 --> 01:10:49,960 what we conclude is that the magnitude of A times v 910 01:10:49,960 --> 01:10:58,590 prime will be larger than the magnitude of A times v. 911 01:10:58,590 --> 01:11:00,810 Think about it: because A has positive entries, 912 01:11:00,810 --> 01:11:02,820 if v had a negative part somewhere, 913 01:11:02,820 --> 01:11:05,120 the magnitude will decrease. 914 01:11:05,120 --> 01:11:10,120 So if you flip the sign, it should increase the magnitude. 915 01:11:10,120 --> 01:11:11,720 And this cannot happen, 916 01:11:11,720 --> 01:11:13,330 because v, the top eigenvector, 917 01:11:13,330 --> 01:11:14,452 already maximizes that magnitude. 918 01:11:20,330 --> 01:11:22,945 That's where the positive entries part is used. 919 01:11:22,945 --> 01:11:29,330 If A has positive entries, 920 01:11:29,330 --> 01:11:32,620 then the top eigenvector should have positive entries as well. 921 01:11:32,620 --> 01:11:38,420 So I will not work through the details of the rest. 922 01:11:38,420 --> 01:11:40,830 I will post them in the lecture notes. 923 01:11:40,830 --> 01:11:44,369 But really this theorem, in fact, 924 01:11:44,369 --> 01:11:46,410 can be stated in a lot more generality than this. 925 01:11:46,410 --> 01:11:47,950 I'm stating only a very weak form. 926 01:11:47,950 --> 01:11:50,630 The matrix doesn't have to have all positive entries. 927 01:11:50,630 --> 01:11:53,690 It only has to be something called irreducible, 928 01:11:53,690 --> 01:11:56,640 which is a concept from probability theory, 929 01:11:56,640 --> 01:11:57,630 from Markov chains. 930 01:12:00,150 --> 01:12:04,810 But here we will only use it in this setting. 931 01:12:04,810 --> 01:12:08,070 So I will review it later, before it's really used. 932 01:12:08,070 --> 01:12:11,220 But just remember how these positive entries kick 933 01:12:11,220 --> 01:12:12,950 into this kind of statement: 934 01:12:12,950 --> 01:12:16,290 why there is a largest eigenvalue, and 935 01:12:16,290 --> 01:12:21,380 why there has to be an eigenvector with all positive entries. 936 01:12:21,380 --> 01:12:24,590 Those will all come into play later. 937 01:12:24,590 --> 01:12:27,272 So I think that's it for today. 938 01:12:27,272 --> 01:12:28,855 Any last-minute questions? 939 01:12:33,450 --> 01:12:36,650 If not, I will see you on Thursday.
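As a final footnote to the proof sketch, the sign-flip step above can also be checked numerically. Again, this is an editorial illustration, not lecture code:

```python
# For a matrix with positive entries, flipping the negative entries of v
# (taking v' = |v| entrywise) never decreases the magnitude of A v:
# each component satisfies (A v')_i = sum_j a_ij |v_j| >= |(A v)_i|,
# by the triangle inequality, since every a_ij > 0.
# This is why a top eigenvector cannot have mixed signs.
import numpy as np

rng = np.random.default_rng(3)
B = rng.random((4, 4)) + 0.1     # strictly positive entries
A = (B + B.T) / 2                # symmetric, as in the weak form

v = rng.standard_normal(4)       # a vector with mixed signs
v_prime = np.abs(v)              # flip the signs of the negative entries

print(np.linalg.norm(A @ v_prime) >= np.linalg.norm(A @ v))   # True
```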