PROFESSOR: So let's begin. Today, I'm going to review linear algebra. I'm assuming that you already took some linear algebra course, and I'm going to just review the relevant content that will appear again and again throughout the course. But do interrupt me if some concepts are not clear, or if you don't remember some concept from linear algebra. I hope you do, but please let me know. I just don't know; you have very different background knowledge, so it's hard to tune the lecture to one specific group. So I tailored these lecture notes to be a review for those who took the most basic linear algebra course. If you have that background and something is still unclear, please feel free to interrupt me.

So I'm going to start by talking about matrices. A matrix, in its simplest form, is just a collection of numbers. For example, [1, 2, 3; 2, 3, 4; 4, 5, 10]. You can pick any number of rows and any number of columns; you just write down numbers in a rectangular array, and that's a matrix.

What's special about it? What kind of data can you arrange in a matrix? I'll take an example which looks relevant to us. For example, we can index the rows by stocks, by companies: Apple, Morgan Stanley should be there, and then Google. And then maybe we can index the columns by dates, say July 1st, October 1st, September 1st. For the entries you can pick whatever data you want, but probably the sensible choice is the stock price on that day; I don't know, for example 400, 500, and 5,000. That would be great. That kind of data is just a matrix. So defining a matrix is really simple. But why is it so powerful?
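To make the data point of view concrete, here is a minimal numpy sketch of such a matrix. The company names are the ones from the board; the prices and dates are made-up illustration values, not real quotes.

    import numpy as np

    # Rows indexed by companies, columns by dates (the board example).
    companies = ["Apple", "Morgan Stanley", "Google"]
    dates = ["Jul 1", "Oct 1", "Sep 1"]
    prices = np.array([
        [400.0,  401.0,  399.0],   # Apple (hypothetical prices)
        [500.0,  505.0,  498.0],   # Morgan Stanley (hypothetical)
        [5000.0, 5010.0, 4990.0],  # Google (hypothetical)
    ])
    print(prices.shape)  # (3, 3): 3 companies by 3 dates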
So that's the application point of view: a matrix is just a collection of data. But from a theoretical point of view, a matrix, an m by n matrix A, is an operator: it defines a linear transformation from the n-dimensional vector space to the m-dimensional vector space. That sounds a lot more abstract than the data picture.

So let's take a very small example. If I use the 2 by 2 matrix [2, 0; 0, 3], then [2, 0; 0, 3] times, let's say, [1, 1] is just [2, 3]. Does that make sense? It's just matrix multiplication.

So now try to combine the two points of view. What does it mean to have a linear transformation defined by a data set? And things start to get confusing. Why does a data set define a linear transformation, and does that have any sensible meaning? That's a good question to have in mind today, so try to remember it. Because today I'll develop the theory of eigenvalues and eigenvectors in a purely theoretical language, but it can still be applied to these data sets and yield very important properties and quantities; you can get some useful information out of it. Try to make sense of why that happens. So that will be the goal today: to treat linear algebra as a theoretical subject, while remembering that there's a real data set underlying it.

This board doesn't go up; that was a bad choice for my first board. Sorry.

So the most important concepts for us are the eigenvalues and eigenvectors of a matrix, which are defined as follows: a real number lambda and a vector v are an eigenvalue and eigenvector of a matrix A if A times v is equal to lambda times v. We also say that v is an eigenvector corresponding to lambda.

So remember, eigenvalues and eigenvectors always come in pairs, and they are defined by the property that A*v = lambda*v. First question: does every matrix have eigenvalues and eigenvectors?
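You can check the defining relation numerically; here is a small sketch with the 2 by 2 example above (numpy's eig returns all the eigenvalue/eigenvector pairs at once).

    import numpy as np

    A = np.array([[2.0, 0.0],
                  [0.0, 3.0]])

    print(A @ np.array([1.0, 1.0]))  # [2. 3.], the multiplication from the board

    # Columns of V are eigenvectors; lam holds the matching eigenvalues.
    lam, V = np.linalg.eig(A)
    print(lam)                       # [2. 3.]

    # The defining relation A v = lambda v holds for each pair.
    for i in range(len(lam)):
        assert np.allclose(A @ V[:, i], lam[i] * V[:, i])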
Nope? So Av = lambda*v looks like a very strange equation to satisfy. But if you rewrite it in the form (A - lambda I)v = 0, that still looks strange, but at least you understand that this can happen only if A - lambda I does not have full rank; so the determinant of (A - lambda I) is equal to 0. If and only if, in fact.

So now comes a very interesting observation: det(A - lambda I) is a polynomial of degree n in lambda. I made a mistake; I should have said this is only for n by n matrices, only for square matrices. Sorry. Since it's a polynomial of degree n, it has a root. So the equation has a solution, but the solution might be a complex number.

I'm really sorry; I'm nervous in front of the video. I understand why you were saying that they don't necessarily exist. Let me repeat, since I made a few mistakes here. For an n by n matrix A, a complex number lambda and a vector v are an eigenvalue and eigenvector if they satisfy the condition Av = lambda*v. The eigenvalue doesn't have to be real; sorry about that. And if we rephrase the condition this way, then because det(A - lambda I) is a polynomial, it always has at least one, possibly complex, solution. That was just a side point, very theoretical. So we see that there always exists at least one eigenvalue, with an eigenvector corresponding to it.

Now that we've seen existence, what is the geometric meaning? Let's go back to the linear transformation point of view. Suppose A is a 3 by 3 matrix. Then A takes a vector in R^3 and transforms it into another vector in R^3. But if you have the relation Av = lambda*v, then A, when applied to v, will just scale the vector v. If this was the original v, then Av will just be lambda times this vector. That will be our Av, which is equal to lambda*v.
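A quick sketch of the complex case: a 90-degree rotation matrix scales no real vector, but its degree-2 characteristic polynomial still has two (complex) roots, which is exactly what the eigenvalue solver finds.

    import numpy as np

    # Rotation by 90 degrees: no real eigenvectors exist.
    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])

    # For a 2 by 2 matrix, det(A - lambda I) = lambda^2 - trace*lambda + det.
    coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
    print(np.roots(coeffs))      # [0.+1.j 0.-1.j]
    print(np.linalg.eig(A)[0])   # the same two complex eigenvalues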
So eigenvectors are those special vectors which, when the linear transformation is applied, just get scaled by some amount, and that amount is exactly lambda.

What we've established so far, what we've recalled so far, is that every n by n matrix has at least one such direction: there is some vector which the linear transformation defined by A just scales. Which is quite interesting, if you ever thought about it before; there's no reason such a vector should exist. Of course, I'm lying a little bit, because these might be complex vectors. But at least in the complex world it's true.

If you think about this, it's very helpful: from these vectors' point of view, the linear transformation is really easy to understand. That's why eigenvalues and eigenvectors are so good. They break the linear transformation down into really simple operations.

Let me formalize that a little bit more. In the extreme case, we call an n by n matrix A diagonalizable if there exists an orthonormal matrix U (I'll recall what that is) such that A is equal to U times D times U inverse for a diagonal matrix D.

Let me parse through this a little bit. What is an orthonormal matrix? It's a matrix defined by the relation U times U transpose equals the identity. What is a diagonal matrix? It's a matrix whose nonzero entries are all on the diagonal; all the rest are zero.

Why is it so good to have this decomposition? What does it mean to have an orthonormal matrix like this? Basically, I'll just explain what's happening. If a matrix is diagonalizable, if this A is diagonalizable, there will be three directions v_1, v_2, v_3 (in the 3 by 3 case) such that when you apply A, v_1 scales by some lambda_1, v_2 scales by some lambda_2, and v_3 scales by some lambda_3. So we can completely understand the transformation A just in terms of these three vectors.
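Here is a minimal numerical sketch of this decomposition. I use a small symmetric matrix as the example, since (as we'll see below) symmetric matrices always admit an orthonormal U, and then U inverse is just U transpose.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    # eigh is numpy's solver for symmetric matrices; it returns an
    # orthonormal U whose columns are eigenvectors.
    lam, U = np.linalg.eigh(A)
    D = np.diag(lam)

    assert np.allclose(U @ U.T, np.eye(2))  # U is orthonormal, so U^{-1} = U^T
    assert np.allclose(A, U @ D @ U.T)      # A = U D U^{-1}

    # Applying A to an eigenvector (a column of U) just scales it.
    print(A @ U[:, 0], lam[0] * U[:, 0])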
So this, the material here, is the most important linear algebra that you'll use throughout this course. Let me repeat it really slowly. An eigenvalue and eigenvector are defined by the relation Av = lambda*v. We know that every matrix has at least one eigenvalue, and there is an eigenvector corresponding to it. And eigenvectors have this geometric meaning: a vector is an eigenvector if the linear transformation defined by A just scales that vector.

For our setting, the really good matrices are the ones which can be broken down into these directions. Those directions are given by U, and D records how much each one scales. So in this case U will be our v_1, v_2, v_3, and D will be our lambda_1, lambda_2, lambda_3 on the diagonal, with all other entries 0.

Any questions so far?

So that's the abstract side. Now remember the question I posed in the beginning. Remember that matrix where we had stocks and dates, and stock prices in the entries? What would an eigenvector of that matrix mean? What would an eigenvalue mean? Try to think about that question. It's not that it will have some direct physical counterpart, but there are some really interesting things going on there.

The bad news is that not all matrices are diagonalizable. If a matrix is diagonalizable, it's really easy to understand what it does, because it breaks down into these three directions if it's 3 by 3, or n directions if it's n by n. Unfortunately, not all matrices are diagonalizable. But there is a very special class of matrices which are always diagonalizable, and fortunately we will see those matrices throughout the course; most of the n by n matrices we will study fall into this category.

So: an n by n matrix A is symmetric if A is equal to A transpose. Before proceeding, please raise your hand if you're familiar with all the concepts so far. OK, good feeling.
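Checking symmetry is a one-liner; as a small sketch, here it is for two of the board examples (the 3 by 3 data matrix from the start of the lecture fails the test, which comes up again below).

    import numpy as np

    def is_symmetric(A):
        # A matrix is symmetric when it equals its own transpose.
        return np.allclose(A, A.T)

    print(is_symmetric(np.array([[2.0, 1.0], [1.0, 2.0]])))            # True
    print(is_symmetric(np.array([[1, 2, 3], [2, 3, 4], [4, 5, 10]])))  # False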
So a matrix is symmetric if it's equal to its transpose, and the transpose is obtained by taking the mirror image across the diagonal.

Theorem 1: it is known that all symmetric matrices are diagonalizable. Ah, I've made another mistake: orthonormally. All symmetric matrices are orthonormally diagonalizable, meaning diagonalizable with an orthonormal U as above. A matrix is called just diagonalizable if we drop that condition on U and replace it with invertible.

So symmetric matrices are really good. And fortunately, most of the n by n matrices that we will study are symmetric; just by the nature of the constructions, they will be symmetric. The one I gave as an example is not symmetric, but I will address that issue in a minute.

And another important thing, Theorem 2: symmetric matrices have real eigenvalues. So for symmetric matrices, this geometric scaling picture is really the picture you should have in mind.

Proof of Theorem 2. Suppose lambda is an eigenvalue with eigenvector v; then by definition we have Av = lambda*v. Now multiply by the conjugate transpose of v on both sides: conj(v)^T A v = lambda * conj(v)^T v, and the right side is lambda times the norm of v squared. Now take the complex conjugate of the whole equation. A is real, so conjugating turns the left side into v^T A conj(v), which equals conj(v)^T A^T v, and the right side becomes conj(lambda) times the norm of v squared. But because A is real symmetric, A is equal to A transpose, so the two left-hand expressions, conj(v)^T A v and conj(v)^T A^T v, are the same. Then the right-hand sides must also be the same, which means lambda is equal to the conjugate of lambda. So lambda has to be real.

Theorem 1 is a little bit more complicated, and it involves more advanced concepts like bases and linear subspaces, and so on. Those concepts are not really important for this class, so I'll just skip the proof. But it's really important to remember these two theorems. Whenever you see a symmetric matrix, you should really feel like you have control over it, because you can diagonalize it.
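Here is a quick numerical illustration of Theorem 2: symmetrizing a random matrix forces all eigenvalues onto the real line (a sketch, using numpy's general, complex-capable solver).

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = B + B.T                       # B + B^T is always symmetric

    # The general eigenvalue solver is allowed to return complex numbers,
    # but for the symmetric A every imaginary part vanishes.
    print(np.max(np.abs(np.linalg.eigvals(A).imag)))  # 0.0

    # The unsymmetrized B, by contrast, typically has complex eigenvalues.
    print(np.linalg.eigvals(B))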
And moreover, all eigenvalues are real. So you have really good control over symmetric matrices.

That's good. That was when everything went well: we could diagonalize. So far we've seen that if a matrix is symmetric, we can diagonalize it, and then it's really easy to understand. But what about general matrices? In general, first of all, not all matrices are diagonalizable. But we still want a decomposition like this. Diagonalization was A = U D U^{-1}, and we want something similar. Our goal is to still understand a given matrix A through simple operations, such as scaling. When the matrix was diagonalizable, this was possible; unfortunately not every matrix is, so we have to do something else.

So that's what I want to talk about. And luckily, the good news is that there is a nice tool we can use for all matrices. It is, in fact, a little bit weaker than diagonalization, but it still distills some very important information about the matrix. It's called the singular value decomposition.

So this will be our second tool for understanding matrices. It's very similar to diagonalization, which I'll also call the eigenvalue decomposition, but it has a slightly different form. So what is its form?

Theorem: let A be an m by n matrix. Then there always exist orthonormal matrices U and V such that A is equal to U times Sigma times V transpose, for some diagonal matrix Sigma.

Let me parse through the theorem a little bit more. Whenever you're given a matrix, it doesn't even have to be a square matrix anymore, and it can be non-symmetric. Whenever you're given an m by n matrix, there always exist two orthonormal matrices U and V such that A can be decomposed as U times Sigma times V transpose, where Sigma is a diagonal matrix.
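As a sketch of the theorem in action: numpy computes the three factors directly, for a matrix that is neither square nor symmetric (it returns U, the diagonal of Sigma, and V transpose).

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 5))   # m = 3, n = 5: not square, not symmetric

    U, s, Vt = np.linalg.svd(A, full_matrices=True)

    Sigma = np.zeros((3, 5))          # the m by n "diagonal" matrix
    Sigma[:3, :3] = np.diag(s)

    assert np.allclose(U @ U.T, np.eye(3))    # U is orthonormal (m by m)
    assert np.allclose(Vt @ Vt.T, np.eye(5))  # V is orthonormal (n by n)
    assert np.allclose(A, U @ Sigma @ Vt)     # A = U Sigma V^T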
But now the sizes of the matrices are important: U is an m by m matrix, Sigma is an m by n matrix, and V is an n by n matrix. That just denotes the dimensions of the matrices. So what does it mean for an m by n matrix to be diagonal? It means the same thing: only the (i, i) entries are allowed to be nonzero.

So that was just a bunch of words; let me rephrase it. Let me compare the eigenvalue decomposition with the singular value decomposition. EVD, what we just saw before, only works for n by n matrices which are diagonalizable. SVD works for all m by n matrices in general.

However, the EVD is powerful because it gives you one frame, v_1, v_2, v_3, within which A acts as a scaling operator: that's what A does to each of those directions. That's because the U on both sides of A = U D U^{-1} is the same.

For the singular value decomposition, what you have instead is, first of all, that the spaces are different: A takes a vector in R^n and brings it to R^m. What's going to happen is that there will be one frame in the source space and one frame in the target space. So there will be vectors v_1, v_2, v_3, v_4 here, and vectors u_1, u_2, u_3 there. And when you take v_1, A will take v_1 to u_1 and scale it a little bit, according to that diagonal. A will take v_2 to u_2 and scale it; it will take v_3 to u_3 and scale it. Wait a minute, for v_4 we don't have a u_4. What's going to happen is that it just disappears: v_4, when A is applied, goes to zero.

So I know it's a very vague explanation, but try to compare the geometric pictures. A diagonalization, an eigenvalue decomposition, works within its own frame, so it's very, very powerful: you just have some directions, and you scale those directions.
But the singular value decomposition is applicable to a much more general class of matrices, and in exchange it's more restricted in what it says: you have two frames, one for the original space and one for the target space. What the linear transformation does is send each vector of one frame to the corresponding vector of the other frame, scaled by some amount.

So now is another good time to go back to that matrix from the very beginning. Remember that example where we had a matrix of companies and dates, and the entries were stock prices? If it's an n by n matrix, you can try to apply both the eigenvalue decomposition and the singular value decomposition, but what will be more sensible in this case is the singular value decomposition. I won't explain why, or what's happening here; Peter probably will, and you will come to it later. But just try to do some imagining before hearing what really happens in the real world. Try to use your own imagination, your own language, to express what this decomposition is doing for this matrix.

It might look like total nonsense. Why does this data even have a geometry? Why does it define a linear transformation, and so on? But it's just a beautiful theory which gives a lot of useful information. I can't emphasize it enough, because these decompositions are really universal, used all over science: the eigenvalue decomposition and the singular value decomposition. Not just for this course; it's pretty much safe to say that in every branch of engineering you'll encounter one of these forms.

So let me talk about the proof of the singular value decomposition. Afterwards, I will show you what the singular value decomposition does for an example matrix that I chose.

Proof of the singular value decomposition, which is interesting: it relies on the eigenvalue decomposition. So, given a matrix A, consider the eigenvalues of A transpose A.
First observation: A transpose A is a symmetric matrix. So, if you remember, it will have real eigenvalues, and it's diagonalizable.

So A^T A has eigenvalues lambda_1, lambda_2, up to lambda_n (it's an n by n matrix), and corresponding eigenvectors v_1, v_2, up to v_n. For convenience, I will cut the list at lambda_r and assume all the rest are 0. There might be none which are 0; in that case we use all the eigenvalues. But I am only interested in the nonzero eigenvalues, so I'll say lambda_1 up to lambda_r are nonzero, and afterwards they're 0. It's just a notational choice.

And now I'm just going to make a claim that they're all positive; for this part, just believe me. (In fact, lambda_i times the norm of v_i squared equals v_i^T A^T A v_i, which is the norm of A v_i squared, so no eigenvalue of A^T A can be negative.) Then, if that's the case, we can rewrite the eigenvalues as sigma_1^2, sigma_2^2, up to sigma_r^2, and then 0s.

That was my first step. My second step is to define u_1 as A*v_1 / sigma_1, u_2 as A*v_2 / sigma_2, and so on up to u_r as A*v_r / sigma_r. And then u_(r+1) up to u_m are chosen to complete the above into an orthonormal basis. For those who don't follow: we pick u_1 up to u_r first, and then pick the rest so that everything together forms a basis. And you'll see why I only care about the nonzero eigenvalues: I have to divide by the sigma values, and if a sigma is zero, I can't do the division. So that's why I identified those which are not zero.

And then we're done. It doesn't look at all like we're done, but I'm going to let my U be u_1, u_2, up to u_m, and my V I will pick as v_1, v_2, up to v_r, and then v_(r+1) up to v_n; this again is just completed into a basis.

Now let's see what happens. We want A = U Sigma V transpose, and that is the same as showing that U transpose times A times V is the diagonal matrix Sigma. (I first wrote the product in the wrong order on the board; this is the form I want.) So write out U^T A V: the rows of U^T are u_1 transpose through u_m transpose.
And the columns of A times V are, because of the definition of V: A v_1 = sigma_1 u_1, A v_2 = sigma_2 u_2, up to A v_r = sigma_r u_r, and the remaining columns are zero. (On the board I first wrote A v_j = lambda_j v_j; sorry, that's not right, and thank you for the correction. The lambdas are the eigenvalues of A transpose A, not of A; what's true is A v_j = sigma_j u_j, by the definition of u_j.)

Now let's do a few computations on the entries of U^T A V. The (1, 1) entry is u_1 transpose times A v_1, which is u_1 transpose times sigma_1 u_1, and that is sigma_1, since u_1 is a unit vector.

Then look at the next entry in that row: u_1 transpose times A v_2, which is u_1 transpose times sigma_2 u_2. I claim that this is equal to 0. Why is that the case? u_1 transpose is equal to v_1 transpose A transpose over sigma_1, and sigma_2 u_2 is equal to A v_2, because u_2 is A v_2 over sigma_2; so the sigma_2's cancel, and we are left with v_1^T A^T A v_2 over sigma_1. But v_1 and v_2 are two different eigenvectors of A transpose A, which at the beginning we took from an orthonormal eigendecomposition of A transpose A.
So A^T A v_2 is lambda_2 v_2, and we have v_1^T times lambda_2 v_2 over sigma_1, which is lambda_2 over sigma_1 times v_1 transpose v_2. These two eigenvectors are orthogonal, so this gives 0.

So if you do the whole computation, what you're going to get is sigma_1, sigma_2, up to sigma_r on the diagonal, and 0 everywhere else. That is exactly Sigma, so A = U Sigma V transpose.

Sorry for the confusion; actually the process is quite simple, I was just lost in the computation in the middle. The process is: first look at A transpose A, and find its eigenvalues and eigenvectors. Using those, define the matrix V. And you can define the matrix U by applying A to each v and dividing by sigma; each of those defines a column of U.

The reason I wanted to go through this proof is that it gives you a process for finding a singular value decomposition. It was a little bit painful for me, but if you have a matrix, there are just these simple steps you can follow to find the singular value decomposition: look at A transpose A, find its eigenvalues and eigenvectors, and arrange them in the right way. Of course, the right way needs some practice to be done correctly. But once you do that, you just obtain a singular value decomposition.

And really, I can't explain how powerful it is. You will only see later in the course how powerful this decomposition is, and only then will you appreciate how good it is to have this decomposition, and to be able to compute it so simply.

So let's try to do it by hand. Yes?

STUDENT: So when you compute the [INAUDIBLE].

PROFESSOR: Yes.

STUDENT: [INAUDIBLE]

PROFESSOR: They would have to be orthonormal, yeah. These u's should be orthonormal, and these v's also. (The u_1 up to u_r are automatically unit vectors: the norm of A v_i squared is v_i^T A^T A v_i = lambda_i = sigma_i^2, so dividing by sigma_i normalizes them.) And that's a good point, because that can be annoying when you want to do this decomposition by hand: to complete the basis, you have to do some Gram-Schmidt process or something like that.
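The process from the proof is short enough to code directly. Here is a sketch (assuming numpy) that builds an SVD from the eigendecomposition of A transpose A; it keeps only the nonzero singular values, which anticipates the reduced form discussed later, and it is tried out on the 2 by 3 matrix of the worked example below.

    import numpy as np

    def svd_via_eig(A, tol=1e-10):
        # Step 1: eigenvalues/eigenvectors of the symmetric matrix A^T A.
        # eigh guarantees an orthonormal set of eigenvectors.
        lam, V = np.linalg.eigh(A.T @ A)

        # Sort descending and keep the nonzero lambda_i = sigma_i^2.
        order = np.argsort(lam)[::-1]
        lam, V = lam[order], V[:, order]
        r = int(np.sum(lam > tol))
        sigma = np.sqrt(lam[:r])

        # Step 2: u_i = A v_i / sigma_i (divide each column by its sigma).
        U = A @ V[:, :r] / sigma

        return U, np.diag(sigma), V[:, :r].T

    A = np.array([[3.0, 2.0, 2.0],
                  [2.0, 3.0, -2.0]])
    U, S, Vt = svd_via_eig(A)
    print(np.diag(S))                   # [5. 3.]
    print(np.allclose(A, U @ S @ Vt))   # True

To get the full, non-reduced factors, you would complete U and V to orthonormal bases, for example by Gram-Schmidt, exactly as in the proof.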
By hand, I don't really mean by hand, other than when you're doing homework, because you can use a computer to do it. And in fact, if you use a computer, there are much better algorithms known than this one, which can do it a lot more quickly and more efficiently.

So let's try to do it by hand. Let A be this matrix: [3, 2, 2; 2, 3, -2]. And we want to find the singular value decomposition of this.

A transpose A, we have to compute that: multiply A transpose by [3, 2, 2; 2, 3, -2], and you will get [13, 12, 2; 12, 13, -2; 2, -2, 8].

And let me just say that the eigenvalues are 0, 9, and 25. So in this algorithm, sigma_1^2 will be 25, sigma_2^2 will be 9, and sigma_3^2 will be 0. So we can take sigma_1 to be 5, sigma_2 to be 3, sigma_3 to be 0.

Now we have to find the corresponding eigenvectors to find the singular value decomposition. And I'll just do one, to remind you how to find an eigenvector. A transpose A minus 25 I is equal to, if you subtract 25 from the diagonal entries, [-12, 12, 2; 12, -12, -2; 2, -2, -17]. And then you have to find the vector which annihilates this matrix. I can take that vector, after normalizing, to be (1 over square root 2, 1 over square root 2, 0).

And then just do the same for the other eigenvectors. You find v_2 to be (1 over square root 18, negative 1 over square root 18, 4 over square root 18). Now then find v_3, the one corresponding to eigenvalue 0; I'll just say it's (x, y, z). This will not be important, and I'll explain why.

Then our V, written in columns, is: first column (1 over square root 2, 1 over square root 2, 0), second column (1 over square root 18, negative 1 over square root 18, 4 over square root 18), and third column just (x, y, z).
And U will be defined as (u_1, u_2), where u_1 is A v_1 over sigma_1 and u_2 is A v_2 over sigma_2. So multiply A by each vector and divide by the sigma to get U. I already did the computation for you; I'll write the result below. Yes?

STUDENT: How did you get v_1?

PROFESSOR: v_1? So if you did the computation right in the beginning to get the eigenvalues, then A^T A - 25I cannot have full rank. So there has to be a vector v which, when multiplied by this matrix, gives the (0, 0, 0) vector. You write (a, b, c), set the product equal to (0, 0, 0), and just solve the system of linear equations. There will be several solutions; for example, we could take (1, 1, 0) as well, but I just normalized it to have length 1.

So there's a lot of work involved if you want to do it by hand, even though you can do it. You have to find eigenvalues, find eigenvectors, in this case three of them, and then you have to do more work, and more work. But it can be done, and we are done now.

So this decomposes A into U Sigma V transpose. U is given as [1 over square root 2, 1 over square root 2; 1 over square root 2, minus 1 over square root 2]. Sigma has to be 2 by 3: it is [5, 0, 0; 0, 3, 0]. And V is as above, so V transpose is just the transpose of that. Let me actually write V transpose out, because I want to show you why (x, y, z) is not important: its rows are (1 over square root 2, 1 over square root 2, 0), then (1 over square root 18, minus 1 over square root 18, 4 over square root 18), then (x, y, z).

The reason I'm saying it is not important is that I can just drop the third column of Sigma and this (x, y, z) row altogether: they only ever multiply each other. So the message here is that the eigenvectors corresponding to eigenvalue zero are not important; the only relevant ones are those for nonzero eigenvalues.
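Here is a quick numerical check of this hand computation (a sketch; the unknown (x, y, z) row is set to zero, which is fine because it is multiplied by the zero column of Sigma anyway):

    import numpy as np

    A = np.array([[3.0, 2.0, 2.0],
                  [2.0, 3.0, -2.0]])

    U = np.array([[1.0,  1.0],
                  [1.0, -1.0]]) / np.sqrt(2)
    Sigma = np.array([[5.0, 0.0, 0.0],
                      [0.0, 3.0, 0.0]])
    Vt = np.array([[1/np.sqrt(2),   1/np.sqrt(2),  0.0],
                   [1/np.sqrt(18), -1/np.sqrt(18), 4/np.sqrt(18)],
                   [0.0,            0.0,           0.0]])  # the (x, y, z) row

    print(np.allclose(A, U @ Sigma @ Vt))  # True, whatever (x, y, z) is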
So drop that zero column of Sigma and that (x, y, z) row of V transpose; that will save you some computation.

So let me state a different form of the singular value decomposition; this works in general. There's a corollary: we get a simplified form of the SVD, where A is still equal to U times Sigma times V transpose, and A is an m by n matrix, U is still an m by m matrix, but now Sigma is also an m by m matrix, and V transpose is an m by n matrix (so V is n by m). This only works when m is less than or equal to n.

So the proof is exactly the same, and the last step is just to drop the irrelevant information. I will not write down why it works, but you can see, if you go through it, that dropping this part corresponds to exactly that information.

So that's the reduced form. Let's see: in the beginning we had A (I erased it); A was the 2 by 3 matrix, and we obtained a decomposition into a 2 by 2, a 2 by 2, and a 2 by 3 matrix. If we hadn't deleted the third column of Sigma and the third row of V transpose, we would have obtained a 2 by 2, times a 2 by 3, times a 3 by 3 matrix. But now we can simplify it by removing those.

And it might not look that much different on this board, because I just erased one row and one column. But many matrices that you'll see in real applications have much lower rank than their number of columns and rows. So if r is a lot smaller than both m and n, or if m and n have a big gap (it's not obvious here), then the number of columns that you're saving can be enormous.

So to illustrate with an example, look at the stock prices again, where you have companies and dates. Previously I just gave an example of a 3 by 3 matrix, but it's more sensible to have a lot more dates than companies. So let's say you recorded 365 days of a year, even though the market is not open all those days, and just five companies.
If you did the full decomposition of that matrix, you'd have a 5 by 5, a 5 by 365, and a 365 by 365 matrix here. But in the reduced form, those last two become a 5 by 5 and a 5 by 365, so you're saving a lot of space.

So if you just look at the board, it doesn't look like it's so powerful, but in fact it is. So that's the reduced form, and that will be the form that you'll see most of the time.

So, I made a lot of mistakes today. I have one more topic, a totally unrelated one. Any questions before I move on to the next topic?

Yes?

STUDENT: [INAUDIBLE]

PROFESSOR: Can you press the button?

STUDENT: [INAUDIBLE]
PROFESSOR: Oh, so for this data, what it means. You're asking what the eigenvectors would mean for this data? It will give you some structure among the stocks; it will give you something like correlation. Each eigenvector will give you a group of companies that are correlated somehow; it measures their correlation with each other. So I don't have a very good explanation of its physical meaning. Maybe you can add a little bit more?

GUEST SPEAKER: Possibly. We will get into this in later lectures. But in the singular value decomposition, what you want to think of is that these orthonormal matrices are really defining a new basis, an orthogonal basis. So you're taking the original coordinate system and rotating it, without stretching or squeezing the data; you're just rotating the axes. An orthonormal matrix gives you the cosines of the new coordinate system with respect to the old one. So the singular value decomposition is simply rotating the data into a different orientation. And the orthonormal basis that you're transforming to gives, essentially, the coordinates of the original data in the transformed system.

So as Choongbum was commenting, you're essentially looking at a representation of the original data points in a linearly transformed space, and the correlations between different stocks, say, are represented by how those points are oriented in the transformed space.

PROFESSOR: So you'll have to see real data to really make sense of it. But another way to think about it is where it comes from. The whole singular value decomposition, if you remember the proof, comes from the eigenvectors and eigenvalues of A transpose A. Now look at A transpose A, or, I'll just say, at A times A transpose; it's pretty much the same. If you look at A times A transpose, you're going to get an m by m matrix, and it will be indexed on both sides by the companies. The numbers in it will represent how much the companies are related to each other, how much correlation they have between each other. So by looking at the eigenvectors of this matrix, you're looking at the correlations between these company stock prices. And that information is represented inside the singular value decomposition.

But again, it's a lot better understood when you have real numbers and real data, which you will have later. So please be excited and wait; you're going to see some cool stuff.

So that was all for the eigenvalue decomposition and the singular value decomposition. And the last thing I want to mention today is something called the Perron-Frobenius theorem. This one looks even more theoretical than the ones I showed you. But surprisingly, a few years ago Steve Ross, a faculty member in the business school here, found a very interesting result, called the Ross recovery theorem, that makes use of this Perron-Frobenius theorem that I will tell you about today. Unfortunately, you will only see a lecture on the Ross recovery theorem towards the end of the semester.
816 01:03:13,730 --> 01:03:16,560 So I will try to recall what it is later. 817 01:03:16,560 --> 01:03:19,110 But since we're talking about linear algebra today, 818 01:03:19,110 --> 01:03:22,540 let me introduce the theorem. 819 01:03:22,540 --> 01:03:24,040 This is called Perron-Frobenius. 820 01:03:28,040 --> 01:03:30,470 And you really won't believe that it has any applications 821 01:03:30,470 --> 01:03:33,955 in finance, because it just looks so theoretical. 822 01:03:37,320 --> 01:03:40,830 I'm just stating a really weak form. 823 01:03:40,830 --> 01:03:43,940 Weak form. 824 01:03:43,940 --> 01:03:54,330 Let A be an n by n symmetric matrix 825 01:03:54,330 --> 01:03:57,360 whose entries are all positive. 826 01:04:03,790 --> 01:04:10,910 Then A has a few special properties. 827 01:04:10,910 --> 01:04:14,790 First, there exists 828 01:04:14,790 --> 01:04:20,610 a largest eigenvalue, lambda_0, such 829 01:04:20,610 --> 01:04:24,960 that the absolute value of lambda is less than lambda_0 830 01:04:24,960 --> 01:04:31,182 for all other eigenvalues lambda. 831 01:04:31,182 --> 01:04:34,560 So this statement is really easy for a symmetric matrix. 832 01:04:34,560 --> 01:04:36,709 You can actually drop the symmetry assumption, 833 01:04:36,709 --> 01:04:39,000 but I stated it this way because I'm going to prove it only 834 01:04:39,000 --> 01:04:40,410 for this weak case. 835 01:04:40,410 --> 01:04:44,550 Just think about the statement when it's not symmetric. 836 01:04:44,550 --> 01:04:48,730 So if you have an n by n matrix whose entries are all positive, 837 01:04:48,730 --> 01:04:53,290 then there exists a real eigenvalue, lambda_0, 838 01:04:53,290 --> 01:04:59,170 such that the absolute values of all other eigenvalues 839 01:04:59,170 --> 01:05:02,860 are strictly smaller than this eigenvalue. 840 01:05:02,860 --> 01:05:05,540 So remember that if it's not a symmetric matrix, 841 01:05:05,540 --> 01:05:08,100 the eigenvalues can be complex. 842 01:05:08,100 --> 01:05:10,500 This is saying that there's a unique eigenvalue which 843 01:05:10,500 --> 01:05:13,790 has the largest absolute value, and moreover, it's a real number. 844 01:05:16,810 --> 01:05:23,260 Second, there exists an eigenvector, 845 01:05:23,260 --> 01:05:34,810 a positive eigenvector with positive entries, 846 01:05:34,810 --> 01:05:40,120 corresponding to lambda_0. 847 01:05:40,120 --> 01:05:43,660 So the eigenvector corresponding to this lambda_0 848 01:05:43,660 --> 01:05:46,690 has positive entries. 849 01:05:46,690 --> 01:05:51,320 And the third part is that lambda_0 is 850 01:05:51,320 --> 01:06:05,060 an eigenvalue of multiplicity 1, for those who know what that means. 851 01:06:05,060 --> 01:06:08,070 So this really is a unique eigenvalue 852 01:06:08,070 --> 01:06:11,580 with a unique eigenvector, which has positive entries. 853 01:06:11,580 --> 01:06:14,361 And it's strictly larger than all the other eigenvalues in absolute value. 854 01:06:17,660 --> 01:06:19,660 So from the mathematician's point of view, 855 01:06:19,660 --> 01:06:21,040 this has many applications. 856 01:06:21,040 --> 01:06:23,060 It's used in probability theory. 857 01:06:23,060 --> 01:06:25,290 My main research area is combinatorics, 858 01:06:25,290 --> 01:06:27,090 discrete mathematics. 859 01:06:27,090 --> 01:06:30,080 It's also used there. 860 01:06:30,080 --> 01:06:31,790 So from the theoretical point of view, 861 01:06:31,790 --> 01:06:35,470 this has been used in many contexts.
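Before the proof sketch, here is a quick numerical check of the three claims. This is an editorial illustration under the weak hypothesis stated above (symmetric with strictly positive entries), not something from the lecture:

```python
# Numerically check the weak Perron-Frobenius statement for a random
# symmetric matrix with strictly positive entries.
import numpy as np

rng = np.random.default_rng(2)
B = rng.random((4, 4)) + 0.1    # entries strictly positive
A = (B + B.T) / 2               # symmetrize; entries remain positive

eigvals, eigvecs = np.linalg.eigh(A)   # real eigenvalues, ascending order
lam0 = eigvals[-1]                     # candidate largest eigenvalue
v0 = eigvecs[:, -1]
if v0[0] < 0:                          # eigenvectors are only defined
    v0 = -v0                           # up to sign, so fix the sign

print(lam0 > np.abs(eigvals[:-1]).max())  # 1. strictly dominant (and real)
print(np.all(v0 > 0))                     # 2. eigenvector entrywise positive
print(np.isclose(lam0, eigvals[-2]))      # 3. False, i.e., multiplicity 1
```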
862 01:06:35,470 --> 01:06:38,990 It's not a standard theorem taught in linear algebra. 863 01:06:38,990 --> 01:06:42,990 So probably most of you haven't seen it before. 864 01:06:42,990 --> 01:06:47,420 But it's a well known result, with many uses, 865 01:06:47,420 --> 01:06:49,070 theoretical uses. 866 01:06:49,070 --> 01:06:54,700 But you will also see one use later, as I mentioned, 867 01:06:54,700 --> 01:06:56,921 in finance, which is quite surprising. 868 01:07:03,740 --> 01:07:07,320 So let me just give you some feeling for why it happens. 869 01:07:07,320 --> 01:07:10,300 I won't give you the full details of the proof, just 870 01:07:10,300 --> 01:07:11,590 a very brief description. 871 01:07:16,257 --> 01:07:26,214 Here's a sketch for when A is symmetric, just the simple case. 872 01:07:32,690 --> 01:07:38,540 In this case, look at the statement. 873 01:07:42,800 --> 01:07:44,650 First of all, A has real eigenvalues. 874 01:07:53,540 --> 01:07:59,530 I'll order them from largest to smallest: lambda_1, lambda_2, up to lambda_n. 875 01:07:59,530 --> 01:08:02,790 And at some point, say up to lambda_i, 876 01:08:02,790 --> 01:08:04,510 they're greater than zero, and past that 877 01:08:04,510 --> 01:08:06,670 they're smaller than zero. 878 01:08:06,670 --> 01:08:08,170 There are some positive eigenvalues. 879 01:08:08,170 --> 01:08:11,490 There are some negative eigenvalues. 880 01:08:11,490 --> 01:08:13,775 So that's observation one. 881 01:08:18,050 --> 01:08:22,090 Things are easier to control, because they are all real. 882 01:08:22,090 --> 01:08:25,590 The first statement says that-- maybe I should have indexed 883 01:08:25,590 --> 01:08:27,384 the largest one as lambda_0. 884 01:08:27,384 --> 01:08:30,729 I'll just call lambda_1 lambda_0 instead. 885 01:08:30,729 --> 01:08:34,180 This lambda_0 is in fact larger in absolute value 886 01:08:34,180 --> 01:08:35,630 than lambda_n. 887 01:08:35,630 --> 01:08:42,010 That's the content of the first bullet. 888 01:08:42,010 --> 01:08:45,640 So if the entries are all positive, then 889 01:08:45,640 --> 01:08:47,790 the largest positive eigenvalue 890 01:08:47,790 --> 01:08:58,529 dominates the most negative eigenvalue in absolute value. 891 01:08:58,529 --> 01:09:01,310 So why is that the case? 892 01:09:01,310 --> 01:09:02,920 To see that, you have 893 01:09:02,920 --> 01:09:05,610 to go through a few steps. 894 01:09:05,610 --> 01:09:06,859 So we go to observation two. 895 01:09:10,090 --> 01:09:11,950 So look at lambda_0. 896 01:09:11,950 --> 01:09:21,234 Lambda_0 has an eigenvector with positive entries. 897 01:09:27,529 --> 01:09:29,880 Why is that the case? 898 01:09:29,880 --> 01:09:35,939 That's because if you look at A times v 899 01:09:35,939 --> 01:09:49,185 equals lambda_0 times v. Let me state it this way. 900 01:09:49,185 --> 01:09:54,135 Lambda_0 is the maximum of all the eigenvalues lambda, 901 01:10:01,560 --> 01:10:03,535 and v is an eigenvector 902 01:10:03,535 --> 01:10:04,035 corresponding to it. 903 01:10:08,985 --> 01:10:09,975 Sorry about the earlier mix-up with the indexing. 904 01:10:09,975 --> 01:10:14,610 Now if you look at this, if v has a negative entry, 905 01:10:14,610 --> 01:10:23,750 then flip it. 906 01:10:23,750 --> 01:10:24,970 Flip the sign, 907 01:10:24,970 --> 01:10:34,346 and in this way obtain a new vector v prime. 908 01:10:38,180 --> 01:10:43,070 Since A has positive entries,
909 01:10:47,990 --> 01:10:49,960 what we conclude is that the magnitude of A times v 910 01:10:49,960 --> 01:10:58,590 prime will be larger than the magnitude of A times v. 911 01:10:58,590 --> 01:11:00,810 Think about it: because A has positive entries, 912 01:11:00,810 --> 01:11:02,820 if v had a negative part somewhere, 913 01:11:02,820 --> 01:11:05,120 the magnitude will decrease. 914 01:11:05,120 --> 01:11:10,120 So if you flip the sign, it should increase the magnitude. 915 01:11:10,120 --> 01:11:11,720 And this cannot happen, 916 01:11:11,720 --> 01:11:13,330 because v, the top eigenvector, 917 01:11:13,330 --> 01:11:14,452 already maximizes that magnitude. 918 01:11:20,330 --> 01:11:22,945 That's where the positive entries part is used. 919 01:11:22,945 --> 01:11:29,330 If A has positive entries, 920 01:11:29,330 --> 01:11:32,620 then the top eigenvector should have positive entries as well. 921 01:11:32,620 --> 01:11:38,420 So I will not work through the details of the rest. 922 01:11:38,420 --> 01:11:40,830 I will post them in the lecture notes. 923 01:11:40,830 --> 01:11:44,369 But really this theorem, in fact, 924 01:11:44,369 --> 01:11:46,410 can be stated in a lot more generality than this. 925 01:11:46,410 --> 01:11:47,950 I'm stating only a very weak form. 926 01:11:47,950 --> 01:11:50,630 The matrix doesn't have to have all positive entries. 927 01:11:50,630 --> 01:11:53,690 It only has to be something called irreducible, 928 01:11:53,690 --> 01:11:56,640 which is a concept from probability theory, 929 01:11:56,640 --> 01:11:57,630 from Markov chains. 930 01:12:00,150 --> 01:12:04,810 But here we will only use it in this setting. 931 01:12:04,810 --> 01:12:08,070 So I will review it later, before it's really used. 932 01:12:08,070 --> 01:12:11,220 But just remember how these positive entries kick 933 01:12:11,220 --> 01:12:12,950 into this kind of statement: 934 01:12:12,950 --> 01:12:16,290 why there is a largest eigenvalue, and 935 01:12:16,290 --> 01:12:21,380 why there has to be an eigenvector with all positive entries. 936 01:12:21,380 --> 01:12:24,590 Those will all come into play later. 937 01:12:24,590 --> 01:12:27,272 So I think that's it for today. 938 01:12:27,272 --> 01:12:28,855 Any last-minute questions? 939 01:12:33,450 --> 01:12:36,650 If not, I will see you on Thursday.
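As a final footnote to the proof sketch, the sign-flip step above can also be checked numerically. Again, this is an editorial illustration, not lecture code:

```python
# For a matrix with positive entries, flipping the negative entries of v
# (taking v' = |v| entrywise) never decreases the magnitude of A v:
# each component satisfies (A v')_i = sum_j a_ij |v_j| >= |(A v)_i|,
# by the triangle inequality, since every a_ij > 0.
# This is why a top eigenvector cannot have mixed signs.
import numpy as np

rng = np.random.default_rng(3)
B = rng.random((4, 4)) + 0.1     # strictly positive entries
A = (B + B.T) / 2                # symmetric, as in the weak form

v = rng.standard_normal(4)       # a vector with mixed signs
v_prime = np.abs(v)              # flip the signs of the negative entries

print(np.linalg.norm(A @ v_prime) >= np.linalg.norm(A @ v))   # True
```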