The following content is provided by MIT OpenCourseWare under a Creative Commons License. Additional information about our license, and MIT OpenCourseWare in general, is available at ocw.mit.edu.

PROFESSOR: OK. Now, where am I with this problem? Well, last time I spoke about what the situation's like as alpha goes to infinity. And I want to say also a word -- more than a word -- about alpha going to 0. And then the real problems come when alpha is in between. The real problems -- the situations, these ill-posed problems that come from inverse problems, trying to find out what's inside your brain by taking measurements at the skull. All sorts of applications involve a finite alpha. And I'm not quite ready to discuss those topics.

I mean, roughly speaking -- I'll write down a reminder now. What happened when alpha went to infinity? When alpha went to infinity, this part became the important part. So as alpha went to infinity, the limit was u_infinity, shall I call it? u_infinity. Well, u_infinity was a minimizer of this term: u_infinity minimized B*u minus d squared. In fact, in my last lecture, I was taking B*u equal d as an equation that had exact solutions, and saying how we actually solve B*u equal d. So u_infinity minimizes B*u minus d. But that might leave some freedom. If B doesn't have that many rows, if its rank is not that big, then this doesn't finish the job. So among these, if there are many -- and that's what we're interested in -- u hat infinity, that limit, will minimize the other bit, A*u minus b squared. Does that make sense somehow? That term is in the problem here for any finite alpha. As alpha gets bigger and bigger, we push harder and harder on this one, so we get a u that's a winner for this one, but a trace of the first part is still around, and if there are many winners, then having that first part in there will give us, among the winners, the one that does the best on that term.
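In symbols -- a sketch of the expression being referred to, in the notation the course has been using, with u hat alpha as the minimizer for a given alpha:

    \[ \hat u_\alpha \;=\; \arg\min_u \;\; \|Au - b\|^2 \;+\; \alpha\,\|Bu - d\|^2 . \]

As \( \alpha \to \infty \), the limit \( \hat u_\infty \) minimizes \( \|Bu - d\|^2 \), and among all minimizers of that term it is the one with the smallest \( \|Au - b\|^2 \).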
And small alpha, going to 0, will just be the opposite, right? This finally struck me over the weekend: I could divide this quantity, this whole expression, by alpha, so then I have a 1 there and a 1 over alpha here, and now as alpha goes to 0, this is the big term -- the rescaled expression is written out below. So now u -- shall I call this u_0? Brilliant notation, right? So this produces a u_alpha. In one limit it converges to a u_infinity that focuses first on this problem, but in the other limit, when alpha's going to 0, it's this term that's biggest. So u_0 minimizes A*u minus b squared, and if there are many minimizers, among these -- well, you know what I'm going to write. u_0 -- see, I put a little hat there. Did I? I don't know, I haven't stayed with these hats very much, but maybe I'll add them. u hat 0 minimizes the term that's not so important, B*u minus d squared.

OK, so today's lecture is still about these limiting cases. As I said, the scientific problems, ill-posed problems, especially these inverse problems, give situations in which these limiting problems are really bad, and you don't get to the limit -- you don't want to. The whole point is to have a finite alpha. But choosing that alpha correctly is the art -- let me just say why -- so I'm sort of anticipating what I'm not ready to do properly. So I'll say why a finite alpha, on Wednesday. Why? Because of noisy data. Because of the noise, at best u is only determined up to some order, say the order of some small quantity delta that measures the noise. This is like a measure of the noise. Then there's no reason to do what we did last time, like forcing B*u equal d. There's no point in forcing B*u equal d if the d in that equation has noise in it; then pushing it all the way to the limit is unreasonable, and may produce a catastrophic result.
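The rescaling just described, written out -- dividing by alpha doesn't change which u wins:

    \[ \min_u \;\; \tfrac{1}{\alpha}\,\|Au - b\|^2 \;+\; \|Bu - d\|^2 . \]

As \( \alpha \to 0 \), the factor \( 1/\alpha \) blows up, so \( \hat u_0 \) minimizes \( \|Au - b\|^2 \) first, and among those minimizers it minimizes \( \|Bu - d\|^2 \).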
So that's when -- so it's really the presence of noise, the presence of uncertainty in the first place, that says OK, a finite alpha is fine; you're not looking for perfection, what you're looking for is some stability, some control on the stability. OK, right. But now -- so that's Wednesday.

Today, let me go -- I didn't give an example, so today, two topics. One is an example with B*u equal d. That was last lecture's topic, and that's the case when alpha goes to infinity. Then secondly is something called the pseudoinverse. You may have seen that expression, the pseudoinverse of A, and sometimes it's written A with a dagger or A with a plus sign. And that is worth knowing about. So this is a topic in linear algebra. It would be in my linear algebra book, but it's a topic that never gets into the 18.06 course because it comes sort of a little late. And that will appear as alpha goes to 0. Right. So that's what today is about; it's linear algebra, because I'm not ready for the noise yet. But it's the noisy data that we have in reality, and that's why, in reality, alpha will be chosen finite.

OK. So part one, then, is to do a very simple example with B*u equal d. And here is the example. OK. So this is my sum of squares in which I plan to let alpha go to infinity. So A is the identity matrix and b is 0, so that quantity is simple. Here I have just one equation, so p is 1; B is p by n, which is 1 by 2. I've just one equation, u_1 minus u_2 equals 6, and in the limit, as alpha goes to infinity, I expect to see that that equation is enforced.

So there are two ways to do it. We can let alpha go to infinity and look at u_alpha going toward u_infinity, maybe with their little hats. Or the second method, which is the null space method, which is what I spoke about last time. The null space method solves the constraint B*u equal d, which is just u_1 minus u_2 equal 6. OK. And that's -- maybe I'll start with that one.
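The example, written out in full -- here A is the 2 by 2 identity, b is 0, B is the 1 by 2 matrix [1, -1], and d is 6:

    \[ \min_{u_1,\,u_2} \;\; u_1^2 + u_2^2 \;+\; \alpha\,(u_1 - u_2 - 6)^2 . \]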
Which looks so simple, of course: just solve u_1 minus u_2 equal 6. I mean, everybody would say, OK, solve it for u_2 equals u_1 minus 6. So here is the method any sensible person would use. But this course doesn't. OK, the sensible method would be: u_2 is u_1 minus 6; plug that into the squares and minimize. So when I plug this in, of course, this is exact, and this becomes -- so I'm minimizing u_1 squared plus, what was it? u_1 minus 6, squared. So that's reduced the problem to one unknown. This is the null space method. The null space method is to solve the equation and remove unknowns -- remove p unknowns coming from the p constraints, and here p is 1.

OK. And by the way, can we just guess -- or not guess, but pretty well be sure -- what's the minimizer here? Anybody just tell me what u_1 would minimize that? Just make a guess, maybe? I'm looking for a number sort of halfway between 0 and 6 somehow. You won't be surprised that the u_1 is 3. And then, from this equation, I should learn that u_2 is minus 3 -- u_2, no, u_(2, infinity). Now I've got too many -- u_(2, infinity) is minus 3. Anyway, simple calculus: setting the derivative of u_1 squared plus (u_1 minus 6) squared to zero gives 2 u_1 plus 2 (u_1 minus 6) equal 0, so you'll get 3, and then you get minus 3 for u_2.

So that's the null space method, except that I didn't follow my complicated QR orthogonalization. And I just want to do that quickly, to reach the same answer, and to say, why don't I just do this anyway? This would be the standard method in the first month of linear algebra: use the row reduced echelon form -- which of course is going to be really, really simple for this matrix; in fact, B is already in row reduced echelon form, so elimination, row reduction, has nothing to do to improve it -- and then solve and then plug in and then go with it. OK, well, the thing is that that row reduced echelon form, the stuff we teach, is not, for large systems, guaranteed stable.
It's not numerically stable. And the option of orthogonalizing is the right one to know for a large system. So you'll have to allow me, on this really small example, to use a method that I described last time. And I just want to recap with an example on the small system.

OK, so what was that method? This is the null space method using qr now -- the MATLAB command qr. So what did we do? qr of B prime. Do you remember that we took that -- that's the MATLAB command that eventually will be, or actually already is, in the notes for this section, and those notes will get updated -- but that's step one in the null space method, qr of B prime. And this gives me a chance to say what's up with this qr algorithm. I mean, after lu, qr is the most important algorithm in MATLAB. And so what does it do? B prime, the transpose of B, is just [1; -1], right? OK.

Now what does Gram-Schmidt do to that matrix? Well, the idea of Gram-Schmidt is to produce orthonormal columns. So the most basic Gram-Schmidt idea would say -- what would Gram and Schmidt say? They'd say, well, we only have one column, and all we would have to do is normalize it. So Gram-Schmidt would produce the normalized column, [1; -1] over the square root of 2, times the square root of 2. That would be the q, and this would be the r, 1 by 1, in Gram-Schmidt.

OK, but here's the point: the qr algorithm in MATLAB no longer uses the Gram-Schmidt idea, it uses a Householder idea instead, and one nice thing about this is that it produces not just this column, but another one. It completes the basis to a full orthonormal basis. So it finds a second vector. So ordinary Gram-Schmidt just had one column times one number. What qr actually does is it ends up with two columns. And, well, everybody can see what the other column is -- [1; 1] over the square root of 2 -- it has length 1, of course, and is orthogonal to the first column. And now that is multiplied by 0.
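A short MATLAB/Octave check of that factorization -- a sketch, not the code from the notes, and the signs in Q may come out flipped, since Householder QR only fixes them up to sign:

    B = [1 -1];          % the 1 by 2 constraint matrix
    [Q, R] = qr(B');     % full (Householder) QR of B'; Q is 2 by 2, R is 2 by 1
    % Q(:,1) is +/- [1; -1]/sqrt(2)  -- it spans the row space of B
    % Q(:,2) is +/- [1;  1]/sqrt(2)  -- it spans the null space of B
    % R is +/- [sqrt(2); 0]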
So this is what qr does. We have this 2 by 1 matrix; it produces a 2 by 2 times a 2 by 1. And you might say it was wasting its time to find this part, because it's multiplied by 0, but what are we learning from that vector -- from this [1, 1] vector, or 1 over square root of 2, 1 over square root of 2 vector? What good can that do us? It's the null space of B. So B was [1, -1]. So let me just -- so that's the connection with the null space of B. If I look at vectors -- there's my matrix B, and if I'm solving B*u equal d, then u is u_particular plus u null space, and if I want u null space, then that's where this comes in -- and these, whatever extra columns there are: this might be p columns and then this would be n minus p columns. That's what that's good for. And of course that column tells me about the null space which, for this matrix, is one-dimensional and easy to find. OK.

So that's maybe just so you know the difference between Gram-Schmidt's qr, which stops -- if you had one column, you end with one column -- and the MATLAB Householder qr, which finds a full square matrix. OK, just good to know, and here we've found a use for it.

OK. So then the algorithm that I gave last time -- and I'll give the code in the notes -- goes through the steps of finding a u_particular, and actually, the u_particular that it would find happens to be [3, -3], which happens to be the actual winner. And therefore the u null space that that algorithm would find -- if I went through all the steps, you would see that, because I'm in this special case of b being 0 and so on, the vector that it would choose -- this is the basis for the null space, but it would choose 0 of that basis vector -- and it would come up with that answer. OK, so that's what the algorithm from last time would have done to this problem.

I also, over the weekend, thought, OK, if it's all true, I should be able to use my first method, the large alpha method, and just find the answer to the original problem and let alpha go to infinity.
Are you willing to do that? That might take a little more calculation, but let me try that. I'm hoping, you know, that it approaches this answer. This is the answer I'm looking for. OK, so do you mind just -- suppose I had to do that minimization. Again, now I'm not using the null space method, so I'm not reducing, I'm not getting u_2 out of the problem. I'm doing the minimum as it stands, and so what do I get? Well, I've got two variables, u_1 and u_2. So I take the derivative with respect to u_1 -- I'm minimizing; everybody, when I point, I'm pointing at that top line. So it's 2*u_1, and what do I have here? Plus 2*alpha times u_1 minus u_2 minus 6, equaling 0. Did I take the u_1 derivative correctly? Now if I take the u_2 derivative, I get 2*u_2. And now the chain rule is going to give me a minus sign, so it would be minus 2*alpha times u_1 minus u_2 minus 6, equals 0. So those two equations will determine u_1 and u_2 for a finite alpha. And then I'll let alpha head to infinity and see what happens.

OK, first I'll multiply by a half and get rid of those useless 2's, and then solve these equations. OK, so what do I have here? I've got a matrix -- in the first equation, u_1 is multiplying 1 plus alpha, and u_2 has a minus alpha. In the second line, u_1 has a minus alpha, and u_2 has a minus times minus -- a plus alpha -- 1 plus alpha, am I right? That matrix times [u_1, u_2] equals -- what's my right-hand side? I guess the right-hand side has alphas in it: 6*alpha and minus 6*alpha, I think. OK. Two equations, two unknowns. These are the normal equations for this problem, written out explicitly. And probably I can find the solution and let alpha go to infinity.

You could say, what are you doing, Professor Strang, this elementary calculation? But there is something sort of satisfying about seeing a small example actually work -- at least to me. OK, so how do I solve those equations? Well, good question.
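Collecting the two equations, after dividing out the 2's, into matrix form:

    \[ \begin{pmatrix} 1+\alpha & -\alpha \\ -\alpha & 1+\alpha \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \;=\; \begin{pmatrix} 6\alpha \\ -6\alpha \end{pmatrix} . \]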
Should I -- with a 2 by 2 matrix, can I do the unforgivable and actually find its inverse? I mean, it's like not allowed in true linear algebra to find the inverse, but maybe we could do it here. So [u_1, u_2] is going to be the inverse matrix times the right-hand side -- so my little recipe for finding 2 by 2 inverses is: this diagonal entry goes down there, and that one comes up here -- well, you can't see the difference, because here they're equal -- the off-diagonal entries stay in place but change sign, and then I divide by the determinant. So what was the determinant of this? 1 plus 2*alpha plus alpha squared, minus alpha squared: I get 1 plus 2*alpha. And that inverse matrix is now multiplying 6*alpha and minus 6*alpha. OK. And if I can do that multiplication, I have -- well, there's this factor 1 over 1 plus 2*alpha, and what do I have? 6*alpha plus 6 alpha squared minus 6 alpha squared -- I think that's 6*alpha. And 6 alpha squared minus 6*alpha minus 6 alpha squared -- I think that's minus 6*alpha.

And, ready for the great moment? Let alpha go to infinity, and what do I get? As alpha goes to infinity, the 1 becomes insignificant, the alpha cancels the alpha, so that approaches [3, -3]. So there you see the large alpha method in practice. OK. And you see what -- well, there's something quite important here. Something quite important, and it's connected with the pseudoinverse.

The pseudoinverse -- so now, we got this answer. And what I want to say is that the limiting alpha system has produced this pseudoinverse. So now I have to tell you about the pseudoinverse and what it means. And basically, the essential thing that it means is: the pseudoinverse gives the solution u which has no null space component. That's what the pseudoinverse is about. I'll draw a picture to say what I'm saying. But it's this fact that means that this limit, which was this number, is the output -- this is the pseudoinverse of B applied to d, which is 6. You see the point?
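A quick numerical check of that limit in MATLAB/Octave -- my own sketch, not the code from the notes; it builds the regularized normal equations for increasing alpha and watches the solution approach [3, -3]:

    A = eye(2);  b = [0; 0];          % the ||u||^2 term (A = I, b = 0)
    B = [1 -1];  d = 6;               % the constraint u_1 - u_2 = 6
    for alpha = [1 1e3 1e6]
        M   = A'*A + alpha*(B'*B);    % the matrix [1+alpha, -alpha; -alpha, 1+alpha]
        rhs = A'*b + alpha*(B'*d);    % the right-hand side [6*alpha; -6*alpha]
        u   = M \ rhs;                % solve the normal equations
        fprintf('alpha = %g:  u = [%g, %g]\n', alpha, u(1), u(2));
    end
    % for alpha = 1e6 the printed u is already very close to [3, -3]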
B hasn't got an inverse. B is [1, -1]. It's a rectangular matrix, and it's not invertible in the normal sense. I can't find a two-sided inverse; a B inverse doesn't exist. But a pseudoinverse does exist. So, just to give a MATLAB command -- as long as I've written a MATLAB command here, why don't I write the other MATLAB command? u is the pseudoinverse -- you remember that pseudo starts with a letter p, so P-I-N-V -- of B, multiplying d. That's what we got automatically. And it's what we get -- and there's a reason we got the pseudoinverse. So let me just say what was special here. What was special, what produced this pseudoinverse -- that I'm going to speak about more -- was this choice of A equal the identity and b equal 0, the fact that we just put the norm of u squared there. The idea is, this produces the pseudoinverse.

And if you like -- so, can I say a little more about this pseudoinverse before drawing the picture that shows what it's about? So I took this thing and let alpha go to infinity. OK, so I could equally well have divided it by alpha; if I divide the whole thing by alpha, that won't change the minimizer -- certainly the same u's will win. And now I see a 1 over alpha going to 0. And that's where the pseudoinverse is usually seen. We take the given problem, which does not completely determine u_1 and u_2, and we throw in a small amount of norm u squared, and find the minimum for that.

So let me say it, somehow. I take B transpose B plus 1 over alpha times the identity -- now alpha is still going to infinity in this lecture, so 1 over alpha, that whole term, is headed for 0 -- that's the term that comes from the 1 over alpha times the norm of u squared, the u_1 squared plus u_2 squared. And that quantity, inverted -- well, I'm not giving the complete formula, but that's what is entering here, and it leads to -- may I use the vague words "leads toward" -- the pseudoinverse B plus.
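The same answer obtained directly, as a MATLAB/Octave sketch -- here the small number 1e-9 simply plays the role of 1 over alpha:

    B = [1 -1];  d = 6;
    u = pinv(B) * d                         % the pinv route: gives [3; -3]
    % the regularized route: (B'*B + (1/alpha)*I) \ (B'*d), with 1/alpha small
    u_reg = (B'*B + 1e-9*eye(2)) \ (B'*d)   % also essentially [3; -3]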
Yeah. And I'll do better with that. OK, I want to go on to the picture. OK, so, right. Do you know the most important picture of linear algebra? The whole picture of what a matrix is actually doing? Here we have a great example to draw that picture. So here's the picture that 18.06 is -- it's at the center of 18.06 -- for our 1 by 2 matrix. So this is our matrix, B equals [1, -1]. This is the picture for that matrix.

OK, so that matrix has a row space. The row space is the set of all vectors that are combinations of the rows. But there's only one row, so the row space is only a line. I guess it's probably that line. So the row space of B, of my matrix, is all multiples of [1, -1]. So it's a line. Let's put the zero point in. OK, then the matrix also has a null space. The null space is the set of solutions to B*u equals 0. It's a line, and in fact it's a perpendicular line. So this is the null space of B, and it contains all -- what does it contain? All the solutions to B*u equals 0, which, in this case, are all multiples of [1, 1]. And just to come back to my earlier comment, that's what the extra half of the qr output is telling us; it's giving us a beautiful basis for the null space. And the key point is that the null space is always perpendicular to the row space, which of course we see here. This z is what we had to compute when there were p components and not just one.

And now, where is -- let's see, what else goes into this picture? Where are the solutions to my equation B*u equal d? So my equation was u_1 minus u_2 equal a particular number, 6, and where are the solutions to u_1 minus u_2 equals 6? OK, so now I want to draw all the -- where are all the -- so this is the u_1, u_2 plane. OK, so one solution is to take c equal to 3: the combination [3, -3], which is right there, is my particular solution. So u_particular, or u row space, is [3, -3].
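For this particular B, the two lines in the picture, written in symbols:

    \[ \text{row space of } B = \{\, c\,(1,-1) \,\}, \qquad \text{null space of } B = \{\, c\,(1,1) \,\}, \qquad (1,-1)\cdot(1,1) = 0 . \]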
That particular solution, [3, -3], solves the equation, and it lies in the row space. And now, if you understand the whole point of linear equations: where are the rest of the solutions? How do I draw the rest of the solutions? Well, to a particular solution I add on any null space solution. The null space solutions go this way. So I add on -- so this is my whole line of all solutions; this is the line of all solutions.

And now the key question is: which solution is the smallest? So this is the idea of the pseudoinverse. When there are many solutions, pick the smallest one, pick the shortest one -- it's the most stable somehow. It's the natural one. And which one is it? OK, so here is the origin. What point on that line is closest to the origin? What point minimizes u_1 squared plus u_2 squared? Everybody can see: this guy, that minimizes it. So the pseudoinverse says, wait a minute, when you've got a whole line of solutions, just tell me a good one. Tell me the special one. And the special one is the one in the row space. And that's the one that the pseudoinverse picks.

So the pseudoinverse of a matrix -- so the general rule is -- and part of the lecture was the fact that, as alpha goes to infinity in this problem, the pseudoinverse is what you get in the limit. Or I could say, just directly, what does the pseudoinverse do? The pseudoinverse -- so B plus, the pseudoinverse, chooses u_p, if you like -- that's B plus multiplying d -- the solution. I can't say B inverse d; everybody knows my equation is B*u equal d. So this is my equation, B*u equal d. And my particular solution, my pseudo-solution, my best solution, is going to be B plus d, and it's going to be in the row space, because it's the smallest solution.

So if you meet the idea of pseudoinverses, now you know what it's talking about. Because we don't have a true inverse, we have a whole line of solutions, we want to pick one, and the pseudoinverse picks this one.
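The same conclusion by a one-line computation, using the solution line found above:

    \[ u = (3,-3) + c\,(1,1), \qquad \|u\|^2 = (3+c)^2 + (c-3)^2 = 18 + 2c^2 , \]

which is smallest at \( c = 0 \), that is, at the row-space solution \( (3,-3) \).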
It's the one in the row space, and it's the shortest, because these are orthogonal. Because these are orthogonal -- u is u_p plus u_n, and because those are orthogonal, the length of u squared, by Pythagoras, is the length of u_p squared plus the length of u_n squared. And which one is shortest? The one that has no u_n. That orthogonal component might as well be 0 if you want the shortest. So all solutions have this part, and this is the length of the shortest one. OK. So that tells you what the pseudoinverse is. At least it tells you what it is for a 1 by 2 matrix.

As long as I'm trying to speak about the pseudoinverse, let me complete this thought. But you saw the idea -- the thought was, there are two ways to get it, again: the null space method that goes for it directly, or the big alpha method that we checked actually works. So that was the point of this board here -- that the big alpha method also produces, in the limit as alpha goes to infinity, u_p. And there's a little -- it doesn't have -- if alpha were 1,000, I wouldn't get exactly the right answer, because this would be 2,001 in the denominator. But as 1,000 becomes a million and alpha goes to infinity, I get the exact one.

OK, so here I was going to draw the picture. So if I draw the row space -- can you imagine this is the row space, whose dimension is the rank of the matrix. Perpendicular to it is the null space, whose dimension is the rest -- the rank, that's the rank I always call r, so this will have dimension n minus r. This is exactly -- these are the two things that MATLAB found here: these were the r vectors in the row space, turned into columns, and these were the n minus r -- here that was only one -- vectors in the null space. So normally we're up in n dimensions, not just two. With two dimensions I just had lines; in n dimensions I have an r-dimensional subspace perpendicular to an (n minus r)-dimensional subspace. And now B. What does B do? OK.
So suppose I take a vector u_n in the null space of B. Then B takes it to 0. So can I just draw that with an arrow? This will be 0. B*u_n is 0; that's the whole idea. OK. But a vector in the row space is not taken to 0. B will take that -- dot dot dot dot -- into the -- I'd better draw it here -- the column space of B. OK. Which I'm drawing as a subspace whose dimension is also this same rank r; that's the great fact about matrices, that the number of independent rows equals the number of independent columns. So this guy heads off to some B times u row space. OK.

And if I complete the picture, as I really should, there's another subspace over here, which happened to be the zero subspace in this example, but usually it's here. It's the null space of B transpose. In that example, B transpose was [1; -1], and its column was independent, so there was no null space. So I had a simple picture, and that's why I wanted to draw you a bigger picture as well. Its dimension will be, well, not n minus r, but -- if B is m by n, let's say -- then it turns out that this null space will have dimension m minus r. No problem. OK.

Now, in the last three minutes, I want to draw the pseudoinverse. So what I'm saying is that every matrix B, every rectangular or square matrix B, has these four spaces -- four fundamental subspaces, they've come to be called. OK, and the null space is the vectors which B takes to 0. B takes any vector into its column space. So now let me just draw what happens to u equal u null space plus u row space. So this was a guy in the row space. What will B do when it multiplies this vector? This vector has a part that's in the null space and a part that's in the row space. But when I multiply by B, what happens to this part? Gone. When I multiply that by B, where does it go? There. So all these guys feed into that same point in the column space.
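In symbols, the mapping just described -- a standard summary for an m by n matrix B of rank r, not copied from the board:

    B u_n = 0 for every u_n in the null space of B (dimension n - r);
    B maps the row space (dimension r) one-to-one onto the column space (dimension r);
    the null space of B transpose (dimension m - r) is the part of the output space that B never reaches.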
B*u is also going there. That's why it's not invertible. Of course. That's why it's not invertible. Here, I guess -- yeah, here I -- sorry. Yeah. This was the null space of B; I didn't write in what it was the null space of. OK. So the matrix couldn't be invertible, actually, because it has a null space, and it sends all of those to the same place.

So what is the pseudoinverse, finally? Finally, last moment: the pseudoinverse is the matrix -- it's like an inverse matrix that comes backwards, right? It reverses what B does. What it cannot do is reverse stuff that has ended up at 0. No matrix could send 0 back to u_n, right? If I multiply by the zero vector, I'm only going to get the zero vector. So the pseudoinverse has to -- what it can do is send this stuff back to this. This is what the pseudoinverse does. If I had a different color chalk I would use it now, but let me use two arrows, or even three. This is what the pseudoinverse does: it takes the column space and sends it back to the row space. And because these have the same dimension r -- the point is, inside B is this r by r matrix that's cool. It's totally invertible. And B plus inverts it. So from row space to column space goes B; from column space back to row space comes the pseudoinverse. But I can't call it a genuine inverse, because all this stuff, including 0 -- the best I can do is send those all back to 0.

There. Now I've really wiped out that figure, but I'll put the three arrows there; that makes it crystal clear. So those three arrows are indicating what the pseudoinverse does. It takes the column space -- its column space is the row space: the column space of B plus is the row space of B. You know, sort of, in these two spaces, that's where the pseudoinverse is alive. And B kills the null space, and B plus kills the other null space, the null space of B transpose.
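One more MATLAB/Octave sketch of that two-way action, again my own check on the 1 by 2 example: B plus times B is the projection onto the row space, and B times B plus is the identity on the column space.

    B  = [1 -1];
    Bp = pinv(B);     % the pseudoinverse, 2 by 1; here it equals [0.5; -0.5]
    Bp * B            % [0.5 -0.5; -0.5 0.5]: projection onto the row space (multiples of [1, -1])
    B * Bp            % 1: the identity on the one-dimensional column space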
Anyway, that pseudoinverse is at the center of the whole theory here. You know, when I take out books from the library about regularizing least squares, they begin by explaining the pseudoinverse. Which, as we've seen, arises as alpha goes to infinity or to 0, whichever end we're at. And what I still have to do next time is: what happens if I'm not prepared to go all the way to the pseudoinverse, because it blows up on me, and I want a finite alpha -- what should that alpha be? And that alpha will be determined, as I said, somehow by the noise level in the system. Right.

And just to emphasize another example that I'll probably mention: CT scans, MRI, all those things that are trying to reconstruct results from a limited number of measurements, measurements that are not really enough for a perfect reconstruction. So this is the theory of imperfect reconstruction, if I can invent an expression, having met perfect reconstruction in the world of wavelets and signal processing. This is the subject of imperfect reconstruction, and I hope to do justice to it on Wednesday. OK. Thank you.