1 00:00:00,000 --> 00:00:00,027 2 00:00:00,027 --> 00:00:02,110 The following content is provided under a Creative 3 00:00:02,110 --> 00:00:03,610 Commons license. 4 00:00:03,610 --> 00:00:05,970 Your support will help MIT OpenCourseWare 5 00:00:05,970 --> 00:00:10,050 continue to offer high quality educational resources for free. 6 00:00:10,050 --> 00:00:12,530 To make a donation, or to view additional materials 7 00:00:12,530 --> 00:00:17,120 from hundreds of MIT courses, visit MIT OpenCourseWare 8 00:00:17,120 --> 00:00:20,830 ocw.mit.edu. 9 00:00:20,830 --> 00:00:25,100 PROFESSOR STRANG: Ready for the least squares lecture, 10 00:00:25,100 --> 00:00:29,360 lecture 11? 11 00:00:29,360 --> 00:00:32,840 Homework is just being posted on the web. 12 00:00:32,840 --> 00:00:38,370 It'll be due, it's really to help you practice, get 13 00:00:38,370 --> 00:00:44,880 some experience on these sections for the first exam. 14 00:00:44,880 --> 00:00:46,950 That's Tuesday evening. 15 00:00:46,950 --> 00:00:49,140 So eight days away. 16 00:00:49,140 --> 00:00:52,750 So the homework will be due the day after. 17 00:00:52,750 --> 00:00:58,270 And actually, we'll try to move the review session to Monday 18 00:00:58,270 --> 00:01:02,240 next week so you can ask me any questions about the homework 19 00:01:02,240 --> 00:01:05,840 or any review material. 20 00:01:05,840 --> 00:01:08,630 So that's all a week away and this week we 21 00:01:08,630 --> 00:01:11,710 get two great examples. 22 00:01:11,710 --> 00:01:16,280 Least squares is one that comes today. 23 00:01:16,280 --> 00:01:19,026 But could I first, because I keep learning more-- 24 00:01:19,026 --> 00:01:20,900 And I've got your MATLAB homeworks to return. 25 00:01:20,900 --> 00:01:26,640 I keep sort of learning a little more from your MATLAB results 26 00:01:26,640 --> 00:01:29,810 and I think because we spoke about it, 27 00:01:29,810 --> 00:01:33,120 it would be worth speaking just a little more. 
28 00:01:33,120 --> 00:01:41,030 So I'm going to take ten minutes about this convection-diffusion 29 00:01:41,030 --> 00:01:46,960 equation in which I put in a coefficient d, a diffusivity, 30 00:01:46,960 --> 00:01:49,670 just to help get the units right. 31 00:01:49,670 --> 00:01:51,450 So this is your example. 32 00:01:51,450 --> 00:01:55,840 And it had d=1 of course. 33 00:01:55,840 --> 00:02:00,290 Well first I realized that later in the book 34 00:02:00,290 --> 00:02:04,500 I completely forgot that I discussed this problem. 35 00:02:04,500 --> 00:02:07,280 About page 509, I think. 36 00:02:07,280 --> 00:02:09,010 I discussed it a little bit. 37 00:02:09,010 --> 00:02:15,120 And just because it's worth, since we 38 00:02:15,120 --> 00:02:19,010 invested a little time, the little bit more will pay off. 39 00:02:19,010 --> 00:02:23,430 So first of all, the point is here 40 00:02:23,430 --> 00:02:27,430 we have convection competing with diffusion. 41 00:02:27,430 --> 00:02:32,110 And always there's some non-dimensional number. 42 00:02:32,110 --> 00:02:34,290 Here it's called the Peclet number. 43 00:02:34,290 --> 00:02:37,810 Actually, there's an accent on one of those e's, Péclet 44 00:02:37,810 --> 00:02:39,200 number. 45 00:02:39,200 --> 00:02:43,680 Which measures the ratio, the importance of convection 46 00:02:43,680 --> 00:02:45,590 relative to diffusion. 47 00:02:45,590 --> 00:02:52,640 So it's V times a length scale in the problem, divided by d. 48 00:02:52,640 --> 00:02:55,180 So then that has the same units as that, 49 00:02:55,180 --> 00:02:58,040 if the result is dimensionless. 50 00:02:58,040 --> 00:03:00,330 Maybe you know the Reynolds number. 51 00:03:00,330 --> 00:03:04,200 This is very like the Reynolds number, which also measures, 52 00:03:04,200 --> 00:03:10,030 in Navier-Stokes equation, the importance of convection, 53 00:03:10,030 --> 00:03:14,620 advection, and diffusion. 
54 00:03:14,620 --> 00:03:19,750 There, in that equation, the velocity V, 55 00:03:19,750 --> 00:03:22,510 that's a non-linear equation, Navier-Stokes, 56 00:03:22,510 --> 00:03:28,490 it's tremendously important and many codes to solve it, 57 00:03:28,490 --> 00:03:33,280 lots of discussion, theory still not complete. 58 00:03:33,280 --> 00:03:36,630 In that problem, the V is u. 59 00:03:36,630 --> 00:03:39,600 It's non-linear and the term there 60 00:03:39,600 --> 00:03:42,980 that we took as a constant, as a given 61 00:03:42,980 --> 00:03:45,510 constant V, it's the same as u. 62 00:03:45,510 --> 00:03:52,200 So in the Reynolds number, this would be u, a typical velocity 63 00:03:52,200 --> 00:03:55,940 u, times a typical length scale, which 64 00:03:55,940 --> 00:04:02,250 would be like one in our zero to one problem, divided by d or mu 65 00:04:02,250 --> 00:04:05,240 or nu, whatever number we use. 66 00:04:05,240 --> 00:04:07,060 So it's like the Reynolds number. 67 00:04:07,060 --> 00:04:11,410 And then it's turned out for this problem 68 00:04:11,410 --> 00:04:15,140 that people also use a number that gets 69 00:04:15,140 --> 00:04:18,740 called the cell Peclet number where 70 00:04:18,740 --> 00:04:23,040 the length is taken to be half the cell 71 00:04:23,040 --> 00:04:24,780 size, delta x over two. 72 00:04:24,780 --> 00:04:33,020 Let me call that number P. So that's P. And what's my point? 73 00:04:33,020 --> 00:04:36,630 This equation's important enough to sort of see a little more 74 00:04:36,630 --> 00:04:40,870 about it than just the numbers that come out. 75 00:04:40,870 --> 00:04:47,670 So the MATLAB homework, which you did really well, 76 00:04:47,670 --> 00:04:51,850 set up finite differences for this. 77 00:04:51,850 --> 00:04:55,700 Right? 78 00:04:55,700 --> 00:05:01,590 And found the eigenvalues and solutions. 79 00:05:01,590 --> 00:05:05,460 It's the eigenvalues I want to say a little more about. 
80 00:05:05,460 --> 00:05:16,980 Because you set up a matrix K over delta x squared and V 81 00:05:16,980 --> 00:05:21,170 times the centered difference over delta x. 82 00:05:21,170 --> 00:05:30,930 And I guess I call that whole combination L, and asked you 83 00:05:30,930 --> 00:05:37,310 about the eigenvalues of L. And you printed them out correctly. 84 00:05:37,310 --> 00:05:43,610 But there's more there than I think we have understood. 85 00:05:43,610 --> 00:05:47,130 And I want to make some more comments about that. 86 00:05:47,130 --> 00:05:49,140 Because it's quite important. 87 00:05:49,140 --> 00:05:55,710 And the comments are clearest if I just reduce to n equal two. 88 00:05:55,710 --> 00:06:01,120 So that matrix, well the off-diagonal part 89 00:06:01,120 --> 00:06:06,240 of that matrix had some number b and some number c. 90 00:06:06,240 --> 00:06:10,330 Actually we could figure out what was the b in this. 91 00:06:10,330 --> 00:06:14,310 This produced a minus one, two, minus one, right? 92 00:06:14,310 --> 00:06:20,360 So part of the b was the minus one over delta x squared. 93 00:06:20,360 --> 00:06:26,970 And then from this was a plus V and a one over, 94 00:06:26,970 --> 00:06:28,960 well it's a centered difference so I 95 00:06:28,960 --> 00:06:31,310 should divide by 2 delta x. 96 00:06:31,310 --> 00:06:32,510 Is that right? 97 00:06:32,510 --> 00:06:37,210 Is that a typical off-diagonal entry 98 00:06:37,210 --> 00:06:39,230 in the matrix that you displayed? 99 00:06:39,230 --> 00:06:42,610 That's what's coming from the off-diagonal of K. 100 00:06:42,610 --> 00:06:46,080 And this is what's coming from the centered difference 101 00:06:46,080 --> 00:06:50,030 C. And then what would this c thing be? 102 00:06:50,030 --> 00:06:54,850 Well the c is below the diagonal so it's also that minus one 103 00:06:54,850 --> 00:06:56,880 over delta x squared. 
104 00:06:56,880 --> 00:07:00,640 But now this is a difference, so it's going to be a minus, 105 00:07:00,640 --> 00:07:01,850 right? 106 00:07:01,850 --> 00:07:09,390 I think those would have been your entries for b and c. 107 00:07:09,390 --> 00:07:12,910 So can we just think first, what are the eigenvalues 108 00:07:12,910 --> 00:07:14,460 of that matrix? 109 00:07:14,460 --> 00:07:17,610 It's a two by two, simple problem. 110 00:07:17,610 --> 00:07:20,760 The trace is zero plus zero. 111 00:07:20,760 --> 00:07:23,670 So that the eigenvalues will be a plus minus pair 112 00:07:23,670 --> 00:07:25,830 because they have to add to zero. 113 00:07:25,830 --> 00:07:29,330 And I think that's the plus minus pair you get. 114 00:07:29,330 --> 00:07:30,510 Let's just check. 115 00:07:30,510 --> 00:07:32,230 What's our other check? 116 00:07:32,230 --> 00:07:35,255 They will add to zero, the plus the square root and minus 117 00:07:35,255 --> 00:07:37,210 the square root. 118 00:07:37,210 --> 00:07:39,430 And the product of the two eigenvalues, 119 00:07:39,430 --> 00:07:42,850 lambda_1 times lambda_2, will be, 120 00:07:42,850 --> 00:07:45,660 we have one of them is plus, one with a minus, 121 00:07:45,660 --> 00:07:48,360 so it'd be minus bc. 122 00:07:48,360 --> 00:07:53,150 And that's correctly the determinant. 123 00:07:53,150 --> 00:07:54,680 So it's good. 124 00:07:54,680 --> 00:07:57,930 These are the correct eigenvalues. 125 00:07:57,930 --> 00:08:03,430 Now let me ask you about the signs of b and c. 126 00:08:03,430 --> 00:08:07,670 If b and c have the same signs, like maybe even equal, one, 127 00:08:07,670 --> 00:08:12,180 one, what are the eigenvalues? 128 00:08:12,180 --> 00:08:15,160 So in that symmetric case if b and c 129 00:08:15,160 --> 00:08:19,350 are equal the eigenvalues are? 130 00:08:19,350 --> 00:08:21,040 Right here. 
131 00:08:21,040 --> 00:08:25,290 If b and c are equal, say equal to one, 132 00:08:25,290 --> 00:08:28,780 the eigenvalues are plus and minus one. 133 00:08:28,780 --> 00:08:32,620 But what if the signs are opposite? 134 00:08:32,620 --> 00:08:33,980 Everything changes. 135 00:08:33,980 --> 00:08:37,590 What if b is one and c is minus one? 136 00:08:37,590 --> 00:08:40,700 That matrix would then be a 90 degree rotation. 137 00:08:40,700 --> 00:08:46,020 It would be anti-symmetric if b was one and c was minus one. 138 00:08:46,020 --> 00:08:50,290 Our formula is still correct, but what does it give us? 139 00:08:50,290 --> 00:08:56,500 If b is one and c is minus one what have I got here? 140 00:08:56,500 --> 00:08:57,730 I've got i. 141 00:08:57,730 --> 00:09:01,070 So the eigenvalues change from plus and minus one 142 00:09:01,070 --> 00:09:04,270 in the symmetric case to plus and minus i 143 00:09:04,270 --> 00:09:05,820 in the anti-symmetric case. 144 00:09:05,820 --> 00:09:12,550 And I think that's what you guys saw at a certain level of V. 145 00:09:12,550 --> 00:09:16,340 I hope you did because that was the point about eigenvalues. 146 00:09:16,340 --> 00:09:20,020 Now you may say, what about the diagonal? 147 00:09:20,020 --> 00:09:22,650 Well I claim diagonal is very simple. 148 00:09:22,650 --> 00:09:26,240 What's the diagonal? 149 00:09:26,240 --> 00:09:28,275 Now I'm going to allow myself a diagonal 150 00:09:28,275 --> 00:09:34,720 and I'm just going to change-- What happens if I have a and a? 151 00:09:34,720 --> 00:09:36,720 Same entry on the diagonal. 152 00:09:36,720 --> 00:09:39,190 What are the eigenvalues now? 153 00:09:39,190 --> 00:09:40,810 This is just like, a great chance 154 00:09:40,810 --> 00:09:43,630 to do some basic eigenvalue stuff. 155 00:09:43,630 --> 00:09:47,380 What are the eigenvalues of that matrix? 156 00:09:47,380 --> 00:09:50,650 Well I've added a times the identity. 
157 00:09:50,650 --> 00:09:54,000 I've just shifted that matrix by a. 158 00:09:54,000 --> 00:09:57,990 So the eigenvalues all shift by a. 159 00:09:57,990 --> 00:10:02,020 So the eigenvalues are now a plus and minus. 160 00:10:02,020 --> 00:10:06,450 So no big deal. 161 00:10:06,450 --> 00:10:10,720 So you say that the a is actually not important, 162 00:10:10,720 --> 00:10:15,540 not the key to this question of are they real 163 00:10:15,540 --> 00:10:18,920 or do they go complex. 164 00:10:18,920 --> 00:10:22,360 So the eigenvalues of this are real when 165 00:10:22,360 --> 00:10:24,770 b and c have the same sign. 166 00:10:24,770 --> 00:10:29,300 If b and c have the same sign, I have a square root, no problem. 167 00:10:29,300 --> 00:10:33,700 When b and c have opposite sign, what do I get? 168 00:10:33,700 --> 00:10:35,530 When b and c have opposite sign, I'm 169 00:10:35,530 --> 00:10:37,460 taking the square root of a negative number 170 00:10:37,460 --> 00:10:38,910 and I've gone complex. 171 00:10:38,910 --> 00:10:44,560 Do you see that the change from real eigenvalues, which 172 00:10:44,560 --> 00:10:49,040 gives a nice curve, to complex eigenvalues, 173 00:10:49,040 --> 00:10:55,380 which gives a very bumpy curve for the solution, 174 00:10:55,380 --> 00:10:59,250 just happens when like, for example, 175 00:10:59,250 --> 00:11:04,620 b-- Is it b that's going to go to zero maybe? 176 00:11:04,620 --> 00:11:10,770 And then beyond that? 177 00:11:10,770 --> 00:11:14,470 Well this sign is for sure negative, right? 178 00:11:14,470 --> 00:11:16,760 So c is staying negative. 179 00:11:16,760 --> 00:11:24,410 And originally for a little delta x, b is also negative. 180 00:11:24,410 --> 00:11:31,410 What's happening here? 181 00:11:31,410 --> 00:11:36,530 I think that the transition that you I hope observed 182 00:11:36,530 --> 00:11:39,390 comes when b hits zero. 
183 00:11:39,390 --> 00:11:44,690 When the combination of V and delta x is such that at b=0 we 184 00:11:44,690 --> 00:11:50,480 switch from real eigenvalues to complex eigenvalues. 185 00:11:50,480 --> 00:11:51,510 And when is b=0? 186 00:11:51,510 --> 00:11:55,080 187 00:11:55,080 --> 00:12:01,690 That's when this negative guy off the diagonal 188 00:12:01,690 --> 00:12:04,780 just exactly cancels this one. 189 00:12:04,780 --> 00:12:09,620 So b is zero when what? 190 00:12:09,620 --> 00:12:13,610 So if this equals this, one over delta x squared 191 00:12:13,610 --> 00:12:17,300 is equal to V over two delta x. 192 00:12:17,300 --> 00:12:19,480 Let me multiply both sides by delta 193 00:12:19,480 --> 00:12:23,360 x squared so that I have a nice one there, 194 00:12:23,360 --> 00:12:28,010 multiplying by delta x squared will put a delta x up here. 195 00:12:28,010 --> 00:12:29,820 And what have we discovered? 196 00:12:29,820 --> 00:12:33,440 This is why I wanted you to see it. 197 00:12:33,440 --> 00:12:39,610 That the transition comes when the Peclet number is one. 198 00:12:39,610 --> 00:12:42,640 So that Peclet number, that cell Peclet number 199 00:12:42,640 --> 00:12:47,450 is exactly the point that we observed of transition 200 00:12:47,450 --> 00:12:53,930 from real eigenvalues to complex eigenvalues. 201 00:12:53,930 --> 00:12:55,510 And that's the transition. 202 00:12:55,510 --> 00:13:00,300 So it's that combination, this is the Peclet number, 203 00:13:00,300 --> 00:13:09,690 cell Peclet number, it's that combination, P_cell maybe. 204 00:13:09,690 --> 00:13:13,600 We've done the computations and now 205 00:13:13,600 --> 00:13:17,700 we gradually get back to the meaning. 206 00:13:17,700 --> 00:13:20,480 And I just wanted to take this step 207 00:13:20,480 --> 00:13:24,890 back to the meaning to see when do those numbers start 208 00:13:24,890 --> 00:13:25,960 going complex. 
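The sign argument above can be checked numerically. The following is a minimal sketch, not the homework code: the function names, the parameter `d` (diffusivity, 1 in the lecture's example), and the sample values of V and delta x are assumptions chosen for illustration. It builds the off-diagonal entries b and c as described, forms the eigenvalues a plus and minus the square root of bc, and evaluates the cell Peclet number.

```python
import cmath

def offdiag_entries(V, dx, d=1.0):
    """Off-diagonal entries of d*K/dx^2 + V*(centered difference)/(2*dx).

    b sits above the diagonal, c below it (naming follows the lecture).
    """
    b = -d / dx**2 + V / (2 * dx)
    c = -d / dx**2 - V / (2 * dx)
    return b, c

def eigenvalues_2x2(a, b, c):
    """Eigenvalues of [[a, b], [c, a]], namely a + sqrt(bc) and a - sqrt(bc)."""
    root = cmath.sqrt(b * c)  # purely imaginary when b and c have opposite signs
    return a + root, a - root

def cell_peclet(V, dx, d=1.0):
    """Cell Peclet number: V times half the cell size, divided by d."""
    return V * (dx / 2) / d

if __name__ == "__main__":
    dx = 0.5                      # illustrative mesh width (assumption)
    a = 2.0 / dx**2               # diagonal entry coming from K/dx^2 with d = 1
    for V in (2.0, 4.0, 8.0):     # below, at, and above the transition
        b, c = offdiag_entries(V, dx)
        print(V, cell_peclet(V, dx), eigenvalues_2x2(a, b, c))
```

With these sample values, b changes sign exactly where `cell_peclet` returns 1, and the eigenvalues pick up a nonzero imaginary part beyond that point.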
209 00:13:25,960 --> 00:13:27,660 You may have noticed or you may not 210 00:13:27,660 --> 00:13:30,120 have noticed that it'll happen when 211 00:13:30,120 --> 00:13:35,460 one of those, when that upper diagonal changes sign. 212 00:13:35,460 --> 00:13:40,140 Now you could say, okay that's the eigenvalues. 213 00:13:40,140 --> 00:13:44,500 What's the consequences for the shape of the solution? 214 00:13:44,500 --> 00:13:47,160 Well, I haven't figured all that out. 215 00:13:47,160 --> 00:13:50,530 I'd be happy to have some more thoughts about that. 216 00:13:50,530 --> 00:13:58,280 But what you noticed, I think, in the computations is 217 00:13:58,280 --> 00:14:05,080 if V got too big so that that P was bigger than one, 218 00:14:05,080 --> 00:14:10,400 if V got too big, so convection was dominating 219 00:14:10,400 --> 00:14:14,130 and our delta x was not small enough to deal with it, 220 00:14:14,130 --> 00:14:19,450 you should have seen the points on the discrete values 221 00:14:19,450 --> 00:14:22,580 were oscillating instead of a proper smooth-- I 222 00:14:22,580 --> 00:14:25,540 mean, the proper, with a large V, 223 00:14:25,540 --> 00:14:30,540 the correct solution, I think, is practically nothing for here 224 00:14:30,540 --> 00:14:33,200 and then it goes, this is a really large V, 225 00:14:33,200 --> 00:14:35,290 take V to a thousand or something. 226 00:14:35,290 --> 00:14:36,930 It climbs up like mad. 227 00:14:36,930 --> 00:14:40,020 Here's the halfway point where the load is. 228 00:14:40,020 --> 00:14:44,130 And then it goes along here and then it climbs down like mad 229 00:14:44,130 --> 00:14:46,890 to satisfy the boundary condition. 230 00:14:46,890 --> 00:14:51,790 I didn't know that that's what would happen for large V. 
231 00:14:51,790 --> 00:14:53,830 What I'm saying is, and undoubtedly it 232 00:14:53,830 --> 00:14:56,770 could be understood physically, so I 233 00:14:56,770 --> 00:14:58,450 guess what I'm saying is there's just 234 00:14:58,450 --> 00:15:05,370 more good stuff in any computation than purely 235 00:15:05,370 --> 00:15:06,450 the numbers. 236 00:15:06,450 --> 00:15:11,050 And this is part of the good stuff in that example. 237 00:15:11,050 --> 00:15:12,750 I hope you liked that. 238 00:15:12,750 --> 00:15:16,160 Because I mean, here you did the work but then, 239 00:15:16,160 --> 00:15:22,270 to understand it is frankly still under way. 240 00:15:22,270 --> 00:15:28,280 More thinking to do. 241 00:15:28,280 --> 00:15:31,060 That's back to least squares. 242 00:15:31,060 --> 00:15:35,680 Here's today's lecture. 243 00:15:35,680 --> 00:15:38,170 So remember where we started last time. 244 00:15:38,170 --> 00:15:39,220 Au=b. 245 00:15:39,220 --> 00:15:40,410 Last time I wrote f. 246 00:15:40,410 --> 00:15:42,300 I regret it terribly. 247 00:15:42,300 --> 00:15:43,800 I can't fix it. 248 00:15:43,800 --> 00:15:45,260 But it's b. 249 00:15:45,260 --> 00:15:49,590 I want b there to be the right-hand side. 250 00:15:49,590 --> 00:15:56,670 And I jumped to the equation that determines the best u. 251 00:15:56,670 --> 00:16:02,100 There's no exact u because we've got too many equations. 252 00:16:02,100 --> 00:16:04,970 You remember the set-up, we have too many equations. 253 00:16:04,970 --> 00:16:07,330 There's noise in the measurements 254 00:16:07,330 --> 00:16:10,720 and we can't get the error down to zero. 255 00:16:10,720 --> 00:16:13,660 There's some error. 256 00:16:13,660 --> 00:16:17,950 And the best u was given by that equation 257 00:16:17,950 --> 00:16:21,120 and we want to say why. 258 00:16:21,120 --> 00:16:26,220 And understand it from two or three ways. 259 00:16:26,220 --> 00:16:28,780 Calculus, geometry, everything. 
260 00:16:28,780 --> 00:16:33,600 Can I first, because I love my little framework here, 261 00:16:33,600 --> 00:16:36,770 fit it in because it's quite important, this example 262 00:16:36,770 --> 00:16:39,130 and then others fit in. 263 00:16:39,130 --> 00:16:42,090 So u is our unknown as always. 264 00:16:42,090 --> 00:16:46,660 Then the matrix A in the problem produces an Au. 265 00:16:46,660 --> 00:16:49,460 266 00:16:49,460 --> 00:16:53,730 Now two things to notice about e, 267 00:16:53,730 --> 00:16:57,055 which, that's the same letter I used for elongation, here 268 00:16:57,055 --> 00:16:59,500 it's standing for error. 269 00:16:59,500 --> 00:17:00,550 Two things to notice. 270 00:17:00,550 --> 00:17:04,500 One is that the source term, which is b, 271 00:17:04,500 --> 00:17:10,910 comes in at this point of the framework. 272 00:17:10,910 --> 00:17:16,570 When we had external forces on springs and on masses 273 00:17:16,570 --> 00:17:19,080 it came in at this point. 274 00:17:19,080 --> 00:17:21,380 We had an f there. 275 00:17:21,380 --> 00:17:23,610 So that's why I'd like to keep those two separate. 276 00:17:23,610 --> 00:17:28,160 The b's are like voltage sources, they come in here. 277 00:17:28,160 --> 00:17:32,820 The f's will be like current sources, they'll come in there. 278 00:17:32,820 --> 00:17:35,590 Actually it's beautiful. 279 00:17:35,590 --> 00:17:37,530 One more thing to notice. 280 00:17:37,530 --> 00:17:40,200 A is coming with a minus sign. 281 00:17:40,200 --> 00:17:45,200 In mechanics, in masses and springs, we had e=Au. 282 00:17:45,200 --> 00:17:48,960 Here it's natural to work with this, 283 00:17:48,960 --> 00:17:53,110 the error or the residual b-Au. 284 00:17:53,110 --> 00:17:57,650 And that minus sign is natural in physics 285 00:17:57,650 --> 00:18:03,152 and in electrical engineering and hydraulics, 286 00:18:03,152 --> 00:18:07,570 you know, flow-- Where's that minus sign coming from in flow? 
287 00:18:07,570 --> 00:18:11,770 Well, flow goes from the higher point to the lower. 288 00:18:11,770 --> 00:18:14,600 Higher voltage to the lower voltage. 289 00:18:14,600 --> 00:18:18,320 And that usually produces that minus sign. 290 00:18:18,320 --> 00:18:23,520 No big deal, of course. 291 00:18:23,520 --> 00:18:27,720 So that step is fine with the framework. 292 00:18:27,720 --> 00:18:32,770 What do we expect in that middle step? 293 00:18:32,770 --> 00:18:37,980 So what's our name for the matrix that goes there? 294 00:18:37,980 --> 00:18:40,510 Everybody's gotta know this framework. 295 00:18:40,510 --> 00:18:42,710 C, right? 296 00:18:42,710 --> 00:18:46,840 Only I've been taking unweighted least squares. 297 00:18:46,840 --> 00:18:52,100 So for unweighted least squares, C will be the identity. 298 00:18:52,100 --> 00:18:55,090 And C doesn't show in our equations. 299 00:18:55,090 --> 00:18:57,010 So C is the identity when there are 300 00:18:57,010 --> 00:19:02,220 no weights, when all the equations are equally reliable. 301 00:19:02,220 --> 00:19:05,430 And that's pretty common, of course. 302 00:19:05,430 --> 00:19:07,630 But not always. 303 00:19:07,630 --> 00:19:11,290 And we'll think, okay, there is a weight e. 304 00:19:11,290 --> 00:19:20,370 So w, which is Ce, is weighted errors, you could say. 305 00:19:20,370 --> 00:19:23,690 So the letter w comes up appropriately again. 306 00:19:23,690 --> 00:19:25,270 Weighted errors. 307 00:19:25,270 --> 00:19:28,060 And then what's the good weighting? 308 00:19:28,060 --> 00:19:31,270 May I stay with C equal the identity for the moment? 309 00:19:31,270 --> 00:19:33,740 Unweighted least squares, because that's by far the most 310 00:19:33,740 --> 00:19:35,700 common. 311 00:19:35,700 --> 00:19:38,190 And then w and e are the same. 312 00:19:38,190 --> 00:19:39,370 C is the identity. 
313 00:19:39,370 --> 00:19:42,160 And finally, there's the last step in our framework 314 00:19:42,160 --> 00:19:45,560 where we always expect to see A transpose. 315 00:19:45,560 --> 00:19:47,330 And we do. 316 00:19:47,330 --> 00:19:48,830 And we have to say why. 317 00:19:48,830 --> 00:19:51,520 So that's where I left it last time. 318 00:19:51,520 --> 00:19:53,550 That this was the picture. 319 00:19:53,550 --> 00:19:54,880 This is the equation. 320 00:19:54,880 --> 00:19:59,910 If I had a matrix C, it would go there and there. 321 00:19:59,910 --> 00:20:00,600 Right? 322 00:20:00,600 --> 00:20:06,590 Because I'd have b-Au and then I'd apply C before A transpose. 323 00:20:06,590 --> 00:20:11,270 So C would slip in there before A transpose on both sides. 324 00:20:11,270 --> 00:20:13,100 So that would, with the C's there, 325 00:20:13,100 --> 00:20:17,690 that would be the weighted least squares equation. 326 00:20:17,690 --> 00:20:19,840 You see that it would be A transpose C 327 00:20:19,840 --> 00:20:26,380 A instead of A transpose A, but still the main facts are there. 328 00:20:26,380 --> 00:20:30,850 So where does the equation come from? 329 00:20:30,850 --> 00:20:36,260 So one source, one way to get the equation is from calculus. 330 00:20:36,260 --> 00:20:42,130 From minimizing, from minimizing. 331 00:20:42,130 --> 00:20:45,540 Set a derivative to zero, calculus. 332 00:20:45,540 --> 00:20:48,420 And what's the quantity we're minimizing? 333 00:20:48,420 --> 00:20:51,260 We're minimizing that squared error 334 00:20:51,260 --> 00:20:55,180 because this is least squares. 335 00:20:55,180 --> 00:20:58,800 We're minimizing this e transpose e, the length of e 336 00:20:58,800 --> 00:20:59,530 squared. 337 00:20:59,530 --> 00:21:01,860 The sum of the squares of the errors. 338 00:21:01,860 --> 00:21:06,220 Which is (b-Au) transpose (b-Au). 339 00:21:06,220 --> 00:21:09,230 340 00:21:09,230 --> 00:21:13,620 Again I could say where to slip in the C matrix. 
341 00:21:13,620 --> 00:21:15,920 If there was one, it would go in there. 342 00:21:15,920 --> 00:21:19,100 C would go in there, C would go in there. 343 00:21:19,100 --> 00:21:20,940 There'd be a C in the equation. 344 00:21:20,940 --> 00:21:26,850 But let's keep C to be the identity. 345 00:21:26,850 --> 00:21:28,120 So I minimized. 346 00:21:28,120 --> 00:21:29,970 It's a quadratic. 347 00:21:29,970 --> 00:21:35,400 It's got u's times u's, so second degree. 348 00:21:35,400 --> 00:21:38,550 And what's the coefficient in that second degree part? 349 00:21:38,550 --> 00:21:44,730 Well, the second degree part is coming from (Au) transpose Au. 350 00:21:44,730 --> 00:21:45,230 Right? 351 00:21:45,230 --> 00:21:47,660 This times this is going to be linear. 352 00:21:47,660 --> 00:21:50,270 This times this is going to be linear. 353 00:21:50,270 --> 00:21:52,940 That times that is just going to be a constant, 354 00:21:52,940 --> 00:21:54,290 its derivative is zero. 355 00:21:54,290 --> 00:21:57,890 But this times this is altogether, 356 00:21:57,890 --> 00:22:02,920 that times that is the u transpose A transpose Au. 357 00:22:02,920 --> 00:22:04,660 Right? 358 00:22:04,660 --> 00:22:07,500 So that's the quadratic part. 359 00:22:07,500 --> 00:22:16,530 And my only point is it's like our old stiffness matrix. 360 00:22:16,530 --> 00:22:21,910 We're seeing the matrix in here is A transpose A. 361 00:22:21,910 --> 00:22:25,750 In other words, when I do calculus 362 00:22:25,750 --> 00:22:32,580 and maybe I'd prefer to see something than just compute 363 00:22:32,580 --> 00:22:35,380 away, take derivatives mechanically. 364 00:22:35,380 --> 00:22:42,020 So I'm going to leave that which is done in the text, 365 00:22:42,020 --> 00:22:44,460 finding the derivative, setting to zero. 366 00:22:44,460 --> 00:22:45,520 And what does it give? 367 00:22:45,520 --> 00:22:48,430 It gives us our equation. 
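The calculus step that is left to the text can be sketched as follows. This is a reconstruction under the lecture's assumption that C is the identity; the symbol E for the squared error is a label introduced here, and the last line records the weighted form for a symmetric C.

```latex
\begin{aligned}
E(u) &= (b - Au)^{T}(b - Au)
      = u^{T}A^{T}A\,u \;-\; 2\,u^{T}A^{T}b \;+\; b^{T}b,\\[2pt]
\nabla E(u) &= 2\,A^{T}A\,u - 2\,A^{T}b = 0
      \quad\Longrightarrow\quad A^{T}A\,\hat{u} = A^{T}b,\\[2pt]
\text{weighted, } E(u) = (b - Au)^{T}C\,(b - Au):\quad
      & A^{T}C A\,\hat{u} = A^{T}C\,b .
\end{aligned}
```

The quadratic term carries the matrix A transpose A, the two cross terms combine into the linear term, and the constant b transpose b drops out when the derivative is taken.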
368 00:22:48,430 --> 00:22:52,430 So that equation will come when I set the derivatives 369 00:22:52,430 --> 00:22:54,170 of this thing to zero. 370 00:22:54,170 --> 00:22:58,080 So that's one totally okay approach. 371 00:22:58,080 --> 00:23:02,240 But I like to see a picture with it. 372 00:23:02,240 --> 00:23:03,560 I hope that's alright. 373 00:23:03,560 --> 00:23:06,130 To take the second approach is to see 374 00:23:06,130 --> 00:23:11,230 why A transpose w equal zero. 375 00:23:11,230 --> 00:23:13,350 Why is that? 376 00:23:13,350 --> 00:23:16,660 What's going on in that key step? 377 00:23:16,660 --> 00:23:17,990 This is always the key step. 378 00:23:17,990 --> 00:23:20,060 This is like the set-up step. 379 00:23:20,060 --> 00:23:23,320 This is the weighting step with constants coming in. 380 00:23:23,320 --> 00:23:26,630 And here's the key step. 381 00:23:26,630 --> 00:23:28,360 Let's see that. 382 00:23:28,360 --> 00:23:32,400 So my picture. 383 00:23:32,400 --> 00:23:39,160 Let me draw that picture again. 384 00:23:39,160 --> 00:23:45,210 And my example was in three dimensions, so m=3. 385 00:23:45,210 --> 00:23:48,100 386 00:23:48,100 --> 00:23:51,700 I've got three equations. 387 00:23:51,700 --> 00:23:55,110 The matrix A, oh I'm afraid I don't remember what it was, 388 00:23:55,110 --> 00:23:56,780 but I think it was something like 1, 389 00:23:56,780 --> 00:24:02,610 1, 1; 0, 1, 3, was that maybe it? 390 00:24:02,610 --> 00:24:04,950 Just to connect to last time. 391 00:24:04,950 --> 00:24:12,100 And what I'm now calling b was the vector [1, 2, 3] was it? 392 00:24:12,100 --> 00:24:13,680 Or was it not? 393 00:24:13,680 --> 00:24:15,840 It was [0, 1, 2] maybe? 394 00:24:15,840 --> 00:24:18,890 That's right? 395 00:24:18,890 --> 00:24:20,070 And what was the point? 396 00:24:20,070 --> 00:24:24,320 If I draw the vector b it goes there somewhere. 397 00:24:24,320 --> 00:24:27,660 If I draw the first column of A, it goes here somewhere. 
398 00:24:27,660 --> 00:24:31,480 If I draw the second column of A, it goes there somewhere. 399 00:24:31,480 --> 00:24:38,300 And if I draw all combinations of these columns, 400 00:24:38,300 --> 00:24:43,310 all combinations of that vector and that vector, what do I get? 401 00:24:43,310 --> 00:24:45,670 I get a plane. 402 00:24:45,670 --> 00:24:47,560 I get a plane. 403 00:24:47,560 --> 00:24:48,540 There it is. 404 00:24:48,540 --> 00:24:49,410 That's the plane. 405 00:24:49,410 --> 00:24:50,440 That's the plane. 406 00:24:50,440 --> 00:24:52,820 This is from column one. 407 00:24:52,820 --> 00:24:54,580 Here's column two. 408 00:24:54,580 --> 00:24:58,860 This plane is the column plane or column space. 409 00:24:58,860 --> 00:25:03,230 It's the column space of A because it 410 00:25:03,230 --> 00:25:05,600 comes from the columns of A. 411 00:25:05,600 --> 00:25:08,730 Now what's the point about this plane? 412 00:25:08,730 --> 00:25:15,020 The point is that if b is on the plane then I'm golden. 413 00:25:15,020 --> 00:25:20,310 If b is on the plane then b is a combination of the columns, 414 00:25:20,310 --> 00:25:26,210 that's what the plane is, and I have a solution to Au=b. 415 00:25:26,210 --> 00:25:35,440 So b on a plane, b on the plane means Au=b is solvable. 416 00:25:35,440 --> 00:25:40,570 And it could happen, of course. 417 00:25:40,570 --> 00:25:42,510 Like perfect measurements. 418 00:25:42,510 --> 00:25:46,670 But we can't expect it. 419 00:25:46,670 --> 00:25:50,190 When we have three measurements or 100 measurements or 10,000 420 00:25:50,190 --> 00:25:53,390 measurements we can't expect perfection. 421 00:25:53,390 --> 00:25:56,780 So usually b will be off the plane. 422 00:25:56,780 --> 00:25:57,810 Now what? 423 00:25:57,810 --> 00:25:59,860 What happens when b is off the plane? 424 00:25:59,860 --> 00:26:01,710 Let me just complete that picture. 425 00:26:01,710 --> 00:26:06,670 And you know what's coming. 
426 00:26:06,670 --> 00:26:15,490 If we're going to get-- Au or Au hat is going to be on the plane 427 00:26:15,490 --> 00:26:19,080 so I'm looking for the best u hat. 428 00:26:19,080 --> 00:26:22,090 Can I just erase this to make space 429 00:26:22,090 --> 00:26:25,560 for what you know I'm going to draw? 430 00:26:25,560 --> 00:26:28,910 Here are these little columns, let me put them there. 431 00:26:28,910 --> 00:26:31,400 What am I going to draw? 432 00:26:31,400 --> 00:26:32,830 The projection. 433 00:26:32,830 --> 00:26:33,710 The projection. 434 00:26:33,710 --> 00:26:36,170 I'm going to draw, what's the projection? 435 00:26:36,170 --> 00:26:38,370 The projection is the nearest point 436 00:26:38,370 --> 00:26:42,530 that is in the plane to the b that's not in the plane. 437 00:26:42,530 --> 00:26:45,340 So here's the projection p. 438 00:26:45,340 --> 00:26:47,350 I drop down this thing. 439 00:26:47,350 --> 00:26:50,710 There's the projection p, little p. 440 00:26:50,710 --> 00:26:53,830 That's the projection of b onto the plane. 441 00:26:53,830 --> 00:26:58,210 I think your mind says yeah, that's the right choice. 442 00:26:58,210 --> 00:27:03,000 And do you want to tell me what this? 443 00:27:03,000 --> 00:27:06,610 That is the part that we can't deal with. 444 00:27:06,610 --> 00:27:09,120 The part we can't improve. 445 00:27:09,120 --> 00:27:13,230 We've made it as small as we could and it's e. 446 00:27:13,230 --> 00:27:18,700 That's the error e and this p is the best guy 447 00:27:18,700 --> 00:27:20,250 that is in the plane. 448 00:27:20,250 --> 00:27:24,350 Do you see that this is the picture. 449 00:27:24,350 --> 00:27:27,330 You get an actual picture of what's going on. 450 00:27:27,330 --> 00:27:31,150 You're splitting b, the measurements, 451 00:27:31,150 --> 00:27:36,820 into the part you can deal with, the projection, the Au hat that 452 00:27:36,820 --> 00:27:38,250 is in the column space. 
453 00:27:38,250 --> 00:27:40,280 It is a combination of the columns. 454 00:27:40,280 --> 00:27:42,360 Those points do lie on a line if I'm 455 00:27:42,360 --> 00:27:43,860 doing straight line fitting. 456 00:27:43,860 --> 00:27:48,720 And the part that you can't deal with, the e, the difference, 457 00:27:48,720 --> 00:27:53,670 b-Au hat, which is not in the plane. 458 00:27:53,670 --> 00:27:56,210 And now I'm still looking for the equations. 459 00:27:56,210 --> 00:27:56,710 Right? 460 00:27:56,710 --> 00:27:58,720 I've just named some stuff. 461 00:27:58,720 --> 00:28:05,270 But I haven't got an equation for that projection. 462 00:28:05,270 --> 00:28:08,280 So what's the key fact? 463 00:28:08,280 --> 00:28:11,940 What's the key fact in this picture that's 464 00:28:11,940 --> 00:28:16,930 going to lead me to an equation for p and e 465 00:28:16,930 --> 00:28:20,260 and u hat and everything? 466 00:28:20,260 --> 00:28:27,570 The key fact is that that dotted line is perpendicular, 467 00:28:27,570 --> 00:28:29,640 perpendicular to the plane. 468 00:28:29,640 --> 00:28:31,920 If I'm looking for the closest point, 469 00:28:31,920 --> 00:28:36,790 everybody knows project, that's what projection involves. 470 00:28:36,790 --> 00:28:37,710 Go perpendicular. 471 00:28:37,710 --> 00:28:40,480 This is a right angle. 472 00:28:40,480 --> 00:28:44,560 That e is perpendicular to the whole plane. 473 00:28:44,560 --> 00:28:46,520 Not only perpendicular to p, it's 474 00:28:46,520 --> 00:28:48,501 perpendicular to everybody in that plane. 475 00:28:48,501 --> 00:28:49,000 Right? 476 00:28:49,000 --> 00:28:51,730 I'm dropping the perpendicular to the plane. 477 00:28:51,730 --> 00:28:54,250 Do you accept that? 478 00:28:54,250 --> 00:28:56,350 Because if you do, we're through.
479 00:28:56,350 --> 00:28:59,570 We just write down the equations for perpendicular 480 00:28:59,570 --> 00:29:03,060 and we've got what we want from the picture 481 00:29:03,060 --> 00:29:06,730 instead of from a calculation. 482 00:29:06,730 --> 00:29:09,800 So what's the idea? 483 00:29:09,800 --> 00:29:14,720 So e is perpendicular to the first column. 484 00:29:14,720 --> 00:29:18,200 So b in the plane, we would be golden. 485 00:29:18,200 --> 00:29:20,920 Let's suppose we're not in the plane. 486 00:29:20,920 --> 00:29:26,100 So now we have this 90 degree angle, this perpendicular 487 00:29:26,100 --> 00:29:27,700 projection. 488 00:29:27,700 --> 00:29:31,670 And it tells me that the first column-- 489 00:29:31,670 --> 00:29:35,000 oh I better name the columns. 490 00:29:35,000 --> 00:29:37,740 Can I just call this column a_1? 491 00:29:37,740 --> 00:29:42,490 That first column is a_1 and the second column is a_2. 492 00:29:42,490 --> 00:29:46,140 So those two columns, whatever they are, 493 00:29:46,140 --> 00:29:51,020 are the guys whose combinations give us the plane. 494 00:29:51,020 --> 00:29:54,990 And it's the plane that we're projecting onto. 495 00:29:54,990 --> 00:29:56,970 It's the plane of all combinations 496 00:29:56,970 --> 00:29:59,150 that comes up here. 497 00:29:59,150 --> 00:30:01,450 So what's this 90 degree angle? 498 00:30:01,450 --> 00:30:06,970 It says that a_1 is perpendicular to p, right? 499 00:30:06,970 --> 00:30:09,740 Sorry! 500 00:30:09,740 --> 00:30:11,480 Say that right for me. 501 00:30:11,480 --> 00:30:17,930 The first equation says that a_1 and what are perpendicular? 502 00:30:17,930 --> 00:30:19,590 e, thank you, e. 503 00:30:19,590 --> 00:30:26,740 So the first equation says that a_1 transpose e is zero. 504 00:30:26,740 --> 00:30:33,350 And the second equation says that a_2 transpose e is zero. 505 00:30:33,350 --> 00:30:40,300 Those are my two equations. 
506 00:30:40,300 --> 00:30:43,060 I have to convert those now into matrix language 507 00:30:43,060 --> 00:30:47,370 because I've done them two separate-- vector, 508 00:30:47,370 --> 00:30:48,870 I mean vector language, and I want 509 00:30:48,870 --> 00:30:50,700 to get into matrix language. 510 00:30:50,700 --> 00:30:52,980 But it's easy to do. 511 00:30:52,980 --> 00:30:58,150 Here I have, look, if I have two equations, 512 00:30:58,150 --> 00:31:04,770 let's get a matrix here. 513 00:31:04,770 --> 00:31:10,530 What's it saying? a_1 transpose and a_2 transpose, 514 00:31:10,530 --> 00:31:13,310 what are those? 515 00:31:13,310 --> 00:31:14,870 They're the rows of A transpose. 516 00:31:14,870 --> 00:31:19,930 So the matrix way to say that is A transpose e equal zero. 517 00:31:19,930 --> 00:31:24,060 In other words, this is saying both at once, right? 518 00:31:24,060 --> 00:31:27,850 The first row of A transpose times e 519 00:31:27,850 --> 00:31:31,140 gives zero, the second row of A transpose times e is zero. 520 00:31:31,140 --> 00:31:33,200 So it's A transpose e equal zero which 521 00:31:33,200 --> 00:31:40,150 is what we wanted in this case where w and e are the same. 522 00:31:40,150 --> 00:31:42,430 Because C is the identity. 523 00:31:42,430 --> 00:31:45,540 And let's just go one step further and see. 524 00:31:45,540 --> 00:31:51,050 That's A transpose (b-Au hat) is zero. 525 00:31:51,050 --> 00:32:00,060 Remember this zero stands for [0, 0], right? 526 00:32:00,060 --> 00:32:02,120 I wanted to put the two equations together. 527 00:32:02,120 --> 00:32:07,010 So I've got two components on the right-hand side. 528 00:32:07,010 --> 00:32:09,280 And then I just plugged in what e is. 529 00:32:09,280 --> 00:32:11,880 And now everybody sees it, right? 530 00:32:11,880 --> 00:32:14,990 Everybody sees that we've got the picture, 531 00:32:14,990 --> 00:32:21,620 this 90 degree angle was the key to these equations.
532 00:32:21,620 --> 00:32:27,300 Because if I put A transpose A u hat onto the other side, 533 00:32:27,300 --> 00:32:34,820 I've got exactly the normal equations that I wanted. 534 00:32:34,820 --> 00:32:40,240 We're taking the time to see the picture 535 00:32:40,240 --> 00:32:44,610 and the form of the equations. 536 00:32:44,610 --> 00:32:50,540 Then I can plug in the numbers, but the thinking 537 00:32:50,540 --> 00:32:54,950 is where the equations come from. 538 00:32:54,950 --> 00:32:57,230 We're there. 539 00:32:57,230 --> 00:32:59,450 Now what to do next? 540 00:32:59,450 --> 00:33:03,540 Now we've understood where the equations come from. 541 00:33:03,540 --> 00:33:06,910 I didn't go through the steps of taking the derivatives, 542 00:33:06,910 --> 00:33:09,420 but that would work. 543 00:33:09,420 --> 00:33:12,120 Or this picture. 544 00:33:12,120 --> 00:33:14,220 I love this picture. 545 00:33:14,220 --> 00:33:17,000 Let me stay with that a little bit longer. 546 00:33:17,000 --> 00:33:19,570 What is u hat? 547 00:33:19,570 --> 00:33:29,150 Can I just go over here to say, okay what have we got here? 548 00:33:29,150 --> 00:33:37,960 We started with Au=b and then we got the projection was A u hat. 549 00:33:37,960 --> 00:33:39,650 But now what is u hat? 550 00:33:39,650 --> 00:33:45,160 I'm just going to assemble things here. u hat, 551 00:33:45,160 --> 00:33:47,870 we figured out by the 90 degree angle, 552 00:33:47,870 --> 00:33:51,640 comes from this equation, which is that equation, which 553 00:33:51,640 --> 00:33:55,780 is A transpose A u hat equal A transpose 554 00:33:55,780 --> 00:33:59,320 b, the central equation. 555 00:33:59,320 --> 00:34:02,720 That's the central equation. 556 00:34:02,720 --> 00:34:08,740 Now plug in u hat here so I get a formula for the projection. 
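[Editor's sketch] The normal equations A transpose A u hat = A transpose b can be tried on a small straight-line fit. The block below is a minimal sketch, not the lecture's own computation: it uses exact fractions from the Python standard library, the measurement times 0, 1, 3 that the lecture's line-fitting example refers to, and a right-hand side b = (1, 3, 4) that is purely hypothetical. The helper names (`transpose`, `matmul`, `matvec`, `solve_2x2`) are illustrative assumptions.

```python
from fractions import Fraction as F

def transpose(M):
    """Rows become columns."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(M, v):
    """Matrix times vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in M]

def solve_2x2(M, rhs):
    """Solve a 2x2 system by Cramer's rule (assumes a nonzero determinant)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]

times = [0, 1, 3]                      # measurement times
A = [[F(1), F(t)] for t in times]      # columns: all ones, and the times
b = [F(1), F(3), F(4)]                 # hypothetical measurements, not from the lecture

At = transpose(A)
AtA = matmul(At, A)                    # the square, symmetric matrix A^T A
u_hat = solve_2x2(AtA, matvec(At, b))  # normal equations: A^T A u_hat = A^T b
p = matvec(A, u_hat)                   # projection of b onto the column space
e = [bi - pi for bi, pi in zip(b, p)]  # error, the part of b off the plane
At_e = matvec(At, e)                   # should be the zero vector [0, 0]
```

With these made-up measurements the best line is 10/7 + (13/14) t, and `At_e` comes out exactly zero, which is the 90 degree angle written as two equations.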
557 00:34:08,740 --> 00:34:11,520 While we're doing all this stuff we might just as well 558 00:34:11,520 --> 00:34:13,400 put those two pieces together and have 559 00:34:13,400 --> 00:34:15,300 a formula for the projection. 560 00:34:15,300 --> 00:34:20,810 So it's A times u hat-- I hope you like this formula. 561 00:34:20,810 --> 00:34:25,410 It's kind of goofy-looking but you'll remember it. 562 00:34:25,410 --> 00:34:27,220 What is u hat? 563 00:34:27,220 --> 00:34:31,170 The whole point is that this matrix is good. 564 00:34:31,170 --> 00:34:34,330 It's square, it's symmetric, it's invertible, 565 00:34:34,330 --> 00:34:37,130 we'll have another word about that. 566 00:34:37,130 --> 00:34:42,120 And now I'll invert it. 567 00:34:42,120 --> 00:34:48,470 Times A transpose b. 568 00:34:48,470 --> 00:34:53,110 That's the goofy formula that I wanted you to see. 569 00:34:53,110 --> 00:35:02,870 The projection of vector b onto these columns of A comes from 570 00:35:02,870 --> 00:35:09,270 applying this matrix, sometimes I call it the matrix of four 571 00:35:09,270 --> 00:35:11,830 A's. 572 00:35:11,830 --> 00:35:16,440 Now it's worth looking at that matrix. 573 00:35:16,440 --> 00:35:19,040 Often I'll call that matrix capital 574 00:35:19,040 --> 00:35:22,070 P. It's the projection matrix. 575 00:35:22,070 --> 00:35:26,120 You give me any vector b, I multiply it by this matrix 576 00:35:26,120 --> 00:35:28,380 and I get the projection. 577 00:35:28,380 --> 00:35:33,610 It's just worth seeing what this matrix P, 578 00:35:33,610 --> 00:35:41,830 these four A's, what are projection matrices like. 579 00:35:41,830 --> 00:35:45,440 Now first of all, when I have an inverse 580 00:35:45,440 --> 00:35:48,700 of a product, any reasonable person 581 00:35:48,700 --> 00:35:52,180 would say okay, split that into A inverse 582 00:35:52,180 --> 00:35:55,640 times A transpose inverse and simplify the whole thing. 583 00:35:55,640 --> 00:35:58,060 And what will happen? 
584 00:35:58,060 --> 00:36:01,670 It's not going to be legal, but let's just pretend. 585 00:36:01,670 --> 00:36:05,780 If I split this into A inverse times A transpose inverse 586 00:36:05,780 --> 00:36:09,250 and simplify, what do I get for P? 587 00:36:09,250 --> 00:36:10,040 Do you see it? 588 00:36:10,040 --> 00:36:13,350 I'll get A and if I try to split this 589 00:36:13,350 --> 00:36:19,930 into that, what do I have here? 590 00:36:19,930 --> 00:36:22,520 I've got the identity. 591 00:36:22,520 --> 00:36:23,790 That's the identity. 592 00:36:23,790 --> 00:36:25,400 That's the identity. 593 00:36:25,400 --> 00:36:29,370 The result is the identity. 594 00:36:29,370 --> 00:36:36,240 That doesn't look good, right? p is not the same as b. 595 00:36:36,240 --> 00:36:41,880 This matrix cannot be split into these two pieces. 596 00:36:41,880 --> 00:36:45,710 A is rectangular, that's its problem. 597 00:36:45,710 --> 00:36:49,060 If A was square, oh yeah, think about the case 598 00:36:49,060 --> 00:36:50,460 when A is square. 599 00:36:50,460 --> 00:36:52,270 Suppose m equals n. 600 00:36:52,270 --> 00:36:54,500 That case'll be included here. 601 00:36:54,500 --> 00:36:58,510 If m equals n and my matrix is square and invertible and 602 00:36:58,510 --> 00:37:01,050 golden then all this works. 603 00:37:01,050 --> 00:37:04,130 The projection is the identity matrix. 604 00:37:04,130 --> 00:37:06,870 And what's with my picture? 605 00:37:06,870 --> 00:37:09,240 What's my picture look like in the case 606 00:37:09,240 --> 00:37:13,090 where A is a square matrix? 607 00:37:13,090 --> 00:37:14,420 Give it another column. 608 00:37:14,420 --> 00:37:17,030 Fit this thing by a quadratic. 609 00:37:17,030 --> 00:37:20,780 So if I was fitting instead of by a straight line, 610 00:37:20,780 --> 00:37:23,090 by a quadratic, it turns out I'd have 611 00:37:23,090 --> 00:37:29,980 zero squared, one squared and three squared in that column. 
612 00:37:29,980 --> 00:37:32,790 I'd have a three by three matrix. 613 00:37:32,790 --> 00:37:35,960 It comes out to be invertible. 614 00:37:35,960 --> 00:37:37,850 Now what's going on? 615 00:37:37,850 --> 00:37:40,370 Now what's my problem Au=b? 616 00:37:40,370 --> 00:37:49,880 Now suddenly m is still three, but now n is three. b is? 617 00:37:49,880 --> 00:37:53,800 And what happened to the plane? b is in there. 618 00:37:53,800 --> 00:37:57,800 And now what's there? 619 00:37:57,800 --> 00:38:01,400 It's now the combinations of what? 620 00:38:01,400 --> 00:38:03,490 Why did that plane come in? 621 00:38:03,490 --> 00:38:05,450 That was the combinations of two columns. 622 00:38:05,450 --> 00:38:06,580 But now I've got three. 623 00:38:06,580 --> 00:38:08,930 The combinations of three columns, 624 00:38:08,930 --> 00:38:14,450 those three columns of an invertible matrix is what? 625 00:38:14,450 --> 00:38:17,160 Are you with me? 626 00:38:17,160 --> 00:38:20,490 If I have a three by three invertible matrix, these three 627 00:38:20,490 --> 00:38:24,280 columns independent, pointing off different directions, 628 00:38:24,280 --> 00:38:31,300 not in a plane, then when I take the combinations I get? 629 00:38:31,300 --> 00:38:32,640 I get R^3. 630 00:38:32,640 --> 00:38:34,000 I get the whole space. 631 00:38:34,000 --> 00:38:35,560 I get everybody. 632 00:38:35,560 --> 00:38:38,650 Every vector including this b and any other b 633 00:38:38,650 --> 00:38:41,150 you want to suggest will be a combination 634 00:38:41,150 --> 00:38:42,430 of these three guys. 635 00:38:42,430 --> 00:38:44,520 So what's my picture here? 636 00:38:44,520 --> 00:38:49,710 My picture is that plane grew to be the whole space. 637 00:38:49,710 --> 00:38:56,350 So what's the projection of b onto the whole space? b itself. 638 00:38:56,350 --> 00:38:58,071 And what's the error? 639 00:38:58,071 --> 00:38:58,570 Zero. 640 00:38:58,570 --> 00:38:59,230 Good. 
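[Editor's sketch] The square case can be checked numerically. The block below is a minimal sketch under stated assumptions: the times 0, 1, 3 come from the lecture, the measurements b = (1, 3, 4) are hypothetical, and `solve` is an illustrative Gauss-Jordan helper written for this example. With a third column of squared times the matrix is three by three and invertible, so Au = b is solved exactly and the error is zero.

```python
from fractions import Fraction as F

def solve(M, rhs):
    """Gauss-Jordan elimination with exact fractions; assumes M is invertible."""
    n = len(M)
    aug = [[F(x) for x in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

times = [0, 1, 3]
A = [[1, t, t * t] for t in times]     # columns: ones, times, times squared
b = [1, 3, 4]                          # hypothetical measurements

u = solve(A, b)                        # exact solution, no least squares needed
p = [sum(F(a) * x for a, x in zip(row, u)) for row in A]
e = [bi - pi for bi, pi in zip(b, p)]  # error vector, all zeros here
```

Here the column space is all of R^3, so the projection p equals b itself, matching the picture where the plane has grown to be the whole space.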
641 00:38:59,230 --> 00:39:01,690 So that's the nice case. 642 00:39:01,690 --> 00:39:03,360 That's the standard case that we've 643 00:39:03,360 --> 00:39:07,790 thought about in the past when m equalled n. 644 00:39:07,790 --> 00:39:11,650 In that case P is the identity and that'd be all true. 645 00:39:11,650 --> 00:39:14,780 But normally it's not. 646 00:39:14,780 --> 00:39:17,970 So I want to come back to this P just 647 00:39:17,970 --> 00:39:22,780 to mention an important fact about P. 648 00:39:22,780 --> 00:39:24,920 And it comes again from the picture. 649 00:39:24,920 --> 00:39:26,190 So this is a projection. 650 00:39:26,190 --> 00:39:31,850 This is what I'm calling the projection matrix. 651 00:39:31,850 --> 00:39:34,760 It's the matrix that does the projection. 652 00:39:34,760 --> 00:39:36,410 And there it is. 653 00:39:36,410 --> 00:39:40,310 Four A's in a row that multiplies b. 654 00:39:40,310 --> 00:39:42,910 Now here's my little question. 655 00:39:42,910 --> 00:39:46,910 So linear algebra's full of these different kinds 656 00:39:46,910 --> 00:39:49,300 of matrices. 657 00:39:49,300 --> 00:39:53,630 Rotations, reflections, symmetric matrices, 658 00:39:53,630 --> 00:39:58,830 Markov matrices, so it's just every problem has matrices. 659 00:39:58,830 --> 00:40:01,160 Now here we have a projection matrix. 660 00:40:01,160 --> 00:40:07,010 Now what I want to know is what happens if I project again? 661 00:40:07,010 --> 00:40:10,900 If I take the vector b, any vector b, I project it 662 00:40:10,900 --> 00:40:13,090 and then I project again. 663 00:40:13,090 --> 00:40:18,820 So project twice and just tell me, you know what will happen. 664 00:40:18,820 --> 00:40:21,630 I'm back to this picture. 665 00:40:21,630 --> 00:40:26,580 I project b to p and now I project again. 666 00:40:26,580 --> 00:40:28,950 Where do I go? 667 00:40:28,950 --> 00:40:31,490 Same place, right?
668 00:40:31,490 --> 00:40:33,670 Once I'm in the plane the projection 669 00:40:33,670 --> 00:40:35,270 stays right where it is. 670 00:40:35,270 --> 00:40:36,880 So what does that tell me? 671 00:40:36,880 --> 00:40:44,840 That tells me that P squared on b is the same as P on b. 672 00:40:44,840 --> 00:40:47,700 If I project twice, no change. 673 00:40:47,700 --> 00:40:50,010 It's the same as projecting once. 674 00:40:50,010 --> 00:40:55,110 So the projection matrix has the property that P squared is P. 675 00:40:55,110 --> 00:40:57,730 And actually, we should be able to see it 676 00:40:57,730 --> 00:41:01,600 if I write out this whole miserable thing twice. 677 00:41:01,600 --> 00:41:04,590 So now I'm going to be up to eight A's. 678 00:41:04,590 --> 00:41:09,320 Sorry about this, but I promise not to do P cubed. 679 00:41:09,320 --> 00:41:13,250 A times (A transpose A) inverse times A transpose, that's 680 00:41:13,250 --> 00:41:20,840 one P. I'll write it again. 681 00:41:20,840 --> 00:41:26,440 There's the second P. So that's P squared. 682 00:41:26,440 --> 00:41:27,770 Do you see anything good there? 683 00:41:27,770 --> 00:41:33,110 Do you see in here A transpose A, that combination 684 00:41:33,110 --> 00:41:35,980 and that combination there. 685 00:41:35,980 --> 00:41:39,880 This cancels that to give the identity. 686 00:41:39,880 --> 00:41:42,500 And what am I left with? 687 00:41:42,500 --> 00:41:46,390 I'm left with A, the (A transpose A) inverse, times A transpose, which 688 00:41:46,390 --> 00:41:51,530 was exactly P. The algebra is just coming along 689 00:41:51,530 --> 00:41:54,930 with the understanding that we know. 690 00:41:54,930 --> 00:41:58,580 So that's the projection matrix. 691 00:41:58,580 --> 00:42:03,880 So this is the theory of projections in a nutshell, 692 00:42:03,880 --> 00:42:05,670 in a nutshell. 693 00:42:05,670 --> 00:42:08,780 This is projections onto the column space of A.
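[Editor's sketch] The "four A's" matrix P = A (A transpose A) inverse A transpose and the property P squared equals P can be verified on a small case. This is a sketch under assumptions: A has a column of ones and a column of times 0, 1, 3 as in the lecture's line fit, the arithmetic is done in exact fractions, and the helper names are made up for this example.

```python
from fractions import Fraction as F

def transpose(M):
    """Rows become columns."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(1), F(t)] for t in (0, 1, 3)]  # 3 by 2, independent columns
At = transpose(A)

# P = A (A^T A)^{-1} A^T, a 3 by 3 matrix of rank 2
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)
P_squared = matmul(P, P)               # projecting twice

trace_P = sum(P[i][i] for i in range(3))
```

For this A the entries of P are fourteenths: P = (1/14) [[10, 6, -2], [6, 5, 3], [-2, 3, 13]]. It is symmetric, `P_squared` equals `P` exactly, and its trace is 2, the dimension of the plane being projected onto.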
694 00:42:08,780 --> 00:42:14,690 Now I have to remind you about one little math point. 695 00:42:14,690 --> 00:42:16,350 Not so little, I guess. 696 00:42:16,350 --> 00:42:20,400 How could I say little for math? 697 00:42:20,400 --> 00:42:22,990 Is A transpose A invertible? 698 00:42:22,990 --> 00:42:26,790 We're plowing along as if it is, that's 699 00:42:26,790 --> 00:42:28,490 going to be our assumption. 700 00:42:28,490 --> 00:42:31,170 But what's the condition for A transpose A 701 00:42:31,170 --> 00:42:35,170 to be invertible, which allows all this to work? 702 00:42:35,170 --> 00:42:43,120 When is A transpose A invertible? 703 00:42:43,120 --> 00:42:46,960 What I'm doing here now is I'm separating 704 00:42:46,960 --> 00:42:52,010 the positive definite one, when A transpose A is positive 705 00:42:52,010 --> 00:42:56,850 definite, the good normal case when all our equations work, 706 00:42:56,850 --> 00:43:01,450 from the semi-definite one where we overlooked 707 00:43:01,450 --> 00:43:04,570 the fact that A transpose A-- where somehow 708 00:43:04,570 --> 00:43:06,660 the experiment wasn't well set up, 709 00:43:06,660 --> 00:43:12,180 we got an A transpose A that is singular. 710 00:43:12,180 --> 00:43:15,470 And just to see, when could that happen? 711 00:43:15,470 --> 00:43:20,450 Let me just remind you. 712 00:43:20,450 --> 00:43:21,270 This is important. 713 00:43:21,270 --> 00:43:26,730 Why don't I give it some space. 714 00:43:26,730 --> 00:43:28,810 It's really straightforward. 715 00:43:28,810 --> 00:43:34,780 Let me just go through those steps again. 716 00:43:34,780 --> 00:43:42,650 If it's not invertible, if some A transpose A u is zero. 717 00:43:42,650 --> 00:43:44,300 This is always the risk that we have 718 00:43:44,300 --> 00:43:49,430 to check out and be sure we don't have and understand. 
719 00:43:49,430 --> 00:43:54,020 So if A transpose Au is zero, then that would lead us, 720 00:43:54,020 --> 00:44:00,430 I could multiply both sides by u transpose. 721 00:44:00,430 --> 00:44:03,310 u transpose zero, right? 722 00:44:03,310 --> 00:44:04,380 Safe. 723 00:44:04,380 --> 00:44:06,750 Multiply whatever that u might be, 724 00:44:06,750 --> 00:44:09,090 multiply both sides by u transpose. 725 00:44:09,090 --> 00:44:12,350 But what is u transpose zero? 726 00:44:12,350 --> 00:44:17,720 Zero, nothing there. 727 00:44:17,720 --> 00:44:20,680 Now how do I understand this guy? 728 00:44:20,680 --> 00:44:22,130 Well you remember the key. 729 00:44:22,130 --> 00:44:24,040 Everybody remembers the key? 730 00:44:24,040 --> 00:44:26,680 You look at that thing and you say hey, 731 00:44:26,680 --> 00:44:29,540 if I put in parentheses in the right place that's 732 00:44:29,540 --> 00:44:34,410 the length of Au squared. 733 00:44:34,410 --> 00:44:38,500 So that's the small trick that this multiplying by u transpose 734 00:44:38,500 --> 00:44:42,430 and then seeing what you've got that we've done 735 00:44:42,430 --> 00:44:44,460 and you should know it. 736 00:44:44,460 --> 00:44:47,250 And now if the length squared is zero, what does that 737 00:44:47,250 --> 00:44:48,340 tell me about Au? 738 00:44:48,340 --> 00:44:51,290 739 00:44:51,290 --> 00:44:54,520 If I have a vector here whose length is zero, 740 00:44:54,520 --> 00:44:56,740 that vector must be? 741 00:44:56,740 --> 00:44:57,920 Zero. 742 00:44:57,920 --> 00:45:01,620 Zero vector's the only one for which the sum of the squares 743 00:45:01,620 --> 00:45:05,050 will give zero. 744 00:45:05,050 --> 00:45:08,480 And if Au is zero I could multiply both sides 745 00:45:08,480 --> 00:45:14,590 by A transpose and complete the loop. 746 00:45:14,590 --> 00:45:17,810 Actually I thought of that when I was swimming this morning, 747 00:45:17,810 --> 00:45:21,540 that line. 
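[Editor's note] Written out in symbols, the loop described above is the following chain. It is only a restatement of the spoken argument, with the double bars denoting the length of a vector.

```latex
\begin{aligned}
A^{\mathsf T} A u = 0
  &\;\Longrightarrow\; u^{\mathsf T} A^{\mathsf T} A u = u^{\mathsf T} 0 = 0 \\
  &\;\Longrightarrow\; (Au)^{\mathsf T}(Au) = \lVert Au \rVert^{2} = 0 \\
  &\;\Longrightarrow\; Au = 0, \\[4pt]
Au = 0
  &\;\Longrightarrow\; A^{\mathsf T} A u = A^{\mathsf T} 0 = 0 .
\end{aligned}
```

So A transpose A u = 0 and Au = 0 have exactly the same solutions u.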
748 00:45:21,540 --> 00:45:27,450 Just to see once again when, it's sort of interesting then. 749 00:45:27,450 --> 00:45:30,620 A transpose Au equal zero which is the bad thing 750 00:45:30,620 --> 00:45:34,780 we hope we don't deal with. 751 00:45:34,780 --> 00:45:36,360 And when does it happen? 752 00:45:36,360 --> 00:45:41,770 It happens when Au is zero. 753 00:45:41,770 --> 00:45:45,180 So our assumption always has to be this, 754 00:45:45,180 --> 00:45:48,310 that there aren't any u's, except the zero 755 00:45:48,310 --> 00:45:50,900 vector of course, that's always going to happen, 756 00:45:50,900 --> 00:45:59,420 but we always have to assume that Au is never zero. 757 00:45:59,420 --> 00:46:01,320 So we have to avoid this. 758 00:46:01,320 --> 00:46:12,880 So to avoid that assume A has -- this is the key word -- 759 00:46:12,880 --> 00:46:18,390 independent columns. 760 00:46:18,390 --> 00:46:21,260 Since this is a combination of the columns, 761 00:46:21,260 --> 00:46:23,280 independent columns means what? 762 00:46:23,280 --> 00:46:26,150 It means that the only combination of the columns 763 00:46:26,150 --> 00:46:31,150 to give zero is the zero combination. 764 00:46:31,150 --> 00:46:34,040 So did I have independent columns over here? 765 00:46:34,040 --> 00:46:35,490 I sure did. 766 00:46:35,490 --> 00:46:39,300 That column and that column were off in different directions, 767 00:46:39,300 --> 00:46:40,640 they were independent. 768 00:46:40,640 --> 00:46:43,030 And that's why I knew we were fine. 769 00:46:43,030 --> 00:46:46,190 A transpose A was invertible. 770 00:46:46,190 --> 00:46:52,520 I would have to really struggle to find a, 771 00:46:52,520 --> 00:46:56,210 well I'd have to think a bit to find an example where 772 00:46:56,210 --> 00:46:58,520 we run into trouble.
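[Editor's sketch] One way to run into trouble is to take every measurement at the same time, so the column of times is a multiple of the column of ones. The block below is a small illustration under that assumption: the "dependent" matrix is hypothetical and not from the lecture, while the "independent" one uses the lecture's times 0, 1, 3. The helper names are made up for this example.

```python
def gram(A):
    """Return A^T A for a matrix given as a list of rows."""
    cols = list(zip(*A))
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in cols] for ci in cols]

def det_2x2(M):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = M
    return a * d - b * c

independent = [[1, 0], [1, 1], [1, 3]]  # ones and times 0, 1, 3
dependent = [[1, 2], [1, 2], [1, 2]]    # all three times equal 2 (hypothetical)

det_good = det_2x2(gram(independent))   # nonzero, so A^T A is invertible
det_bad = det_2x2(gram(dependent))      # zero, so A^T A is singular

# A nonzero combination of the dependent columns that gives zero: 2*(col 1) - (col 2)
u = [2, -1]
Au = [sum(x * y for x, y in zip(row, u)) for row in dependent]
```

With the dependent columns there is a nonzero u with Au = 0, so A transpose A u = 0 as well and the normal equations cannot pin down a unique line.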
773 00:46:58,520 --> 00:47:04,270 In least squares, well I certainly could in many applications, 774 00:47:04,270 --> 00:47:07,030 but the straightforward applications 775 00:47:07,030 --> 00:47:09,490 of fitting a straight line, A is going 776 00:47:09,490 --> 00:47:13,920 to be a column vector of ones and a column vector of times 777 00:47:13,920 --> 00:47:20,460 and those are different directions and no problem. 778 00:47:20,460 --> 00:47:23,290 So that's A transpose A. 779 00:47:23,290 --> 00:47:30,430 What else to do with this topic? 780 00:47:30,430 --> 00:47:33,220 Because there's a whole world of estimation. 781 00:47:33,220 --> 00:47:38,490 I mean, statistics is looking over our shoulder I guess. 782 00:47:38,490 --> 00:47:41,120 Really, we should realize that a statistician 783 00:47:41,120 --> 00:47:46,270 would say, yeah, I know that, but, and then go on. 784 00:47:46,270 --> 00:47:52,850 And what is that guy, what more does he have to say? 785 00:47:52,850 --> 00:47:55,250 So you've got the central ideas. 786 00:47:55,250 --> 00:48:01,280 I guess the statistician comes in in this, that's 787 00:48:01,280 --> 00:48:05,780 the statistical constant now. 788 00:48:05,780 --> 00:48:08,910 And what do statisticians compute? 789 00:48:08,910 --> 00:48:14,840 They say you've got errors, right? 790 00:48:14,840 --> 00:48:17,289 And of course, in any particular case 791 00:48:17,289 --> 00:48:19,080 we don't know what that error is, otherwise 792 00:48:19,080 --> 00:48:23,260 we could take it out and we'd get exact solutions. 793 00:48:23,260 --> 00:48:25,550 We don't know what the error is. 794 00:48:25,550 --> 00:48:32,350 What is reasonable to know about errors? 795 00:48:32,350 --> 00:48:43,190 We're doing a little statistics here.
796 00:48:43,190 --> 00:48:45,760 Somehow that error, that particular error 797 00:48:45,760 --> 00:48:49,550 of the experiment that we happen to run, and if we ran it 798 00:48:49,550 --> 00:48:52,130 again we'd get a different error, 799 00:48:52,130 --> 00:48:56,970 those errors come out of some sort of error population. 800 00:48:56,970 --> 00:48:59,900 Like dark matter or something. 801 00:48:59,900 --> 00:49:03,740 Just like, a bunch of errors are out there, noise. 802 00:49:03,740 --> 00:49:07,610 And what could we reasonably assume 803 00:49:07,610 --> 00:49:09,950 that we know about the noise? 804 00:49:09,950 --> 00:49:14,840 We could assume that its average is zero, mean zero. 805 00:49:14,840 --> 00:49:18,741 So statisticians always, that just resets the meter. 806 00:49:18,741 --> 00:49:19,240 Right? 807 00:49:19,240 --> 00:49:23,000 If you had a meter or a clock that was always three minutes 808 00:49:23,000 --> 00:49:27,730 ahead (like this one) you would reset it. 809 00:49:27,730 --> 00:49:30,380 And we'll do that one day. 810 00:49:30,380 --> 00:49:34,590 So you'd reset to get the average zero. 811 00:49:34,590 --> 00:49:36,990 But that doesn't mean every error is zero, right? 812 00:49:36,990 --> 00:49:39,910 That just means the average error is zero. 813 00:49:39,910 --> 00:49:42,110 So what's the other number? 814 00:49:42,110 --> 00:49:46,170 What's the other number that statisticians live on? 815 00:49:46,170 --> 00:49:48,420 It's the deviation or its square, 816 00:49:48,420 --> 00:49:51,610 which is called the variance. 817 00:49:51,610 --> 00:49:52,250 Right. 818 00:49:52,250 --> 00:49:53,420 Variance. 819 00:49:53,420 --> 00:49:56,780 So that's the thing that you could 820 00:49:56,780 --> 00:50:01,130 assume that the errors have mean zero and have some variance. 821 00:50:01,130 --> 00:50:05,220 You could suppose that you knew something about the variance. 
822 00:50:05,220 --> 00:50:07,000 You don't know the individual errors, 823 00:50:07,000 --> 00:50:14,270 but you know whether the errors are like, are very small 824 00:50:14,270 --> 00:50:20,650 or close to zero or large. 825 00:50:20,650 --> 00:50:22,410 So this is a small variance. 826 00:50:22,410 --> 00:50:26,140 So one over sigma is sort of that distance. 827 00:50:26,140 --> 00:50:27,220 One over sigma. 828 00:50:27,220 --> 00:50:33,270 Here, this is a large variance where 829 00:50:33,270 --> 00:50:37,650 the magnitude of the error could be much larger from this. 830 00:50:37,650 --> 00:50:39,940 So those are the two numbers, mean zero, 831 00:50:39,940 --> 00:50:43,720 that leaves us just one number, and the variance, 832 00:50:43,720 --> 00:50:48,040 the standard deviation sigma or the variance sigma squared. 833 00:50:48,040 --> 00:50:55,050 One moment on these squares. 834 00:50:55,050 --> 00:50:58,770 Let me just say what the weighting matrix would be. 835 00:50:58,770 --> 00:51:02,780 And then I can tell you in a moment why. 836 00:51:02,780 --> 00:51:05,190 What would the weighting matrix be 837 00:51:05,190 --> 00:51:09,910 if our three equations, you know, 838 00:51:09,910 --> 00:51:11,895 that came from one measurement and this 839 00:51:11,895 --> 00:51:13,520 came from a second measurement and this 840 00:51:13,520 --> 00:51:14,940 came from a third measurement. 841 00:51:14,940 --> 00:51:18,670 If they came from different meter readers 842 00:51:18,670 --> 00:51:21,830 with different variances, suppose, 843 00:51:21,830 --> 00:51:28,270 then the right C matrix will be a diagonal matrix, beautiful. 844 00:51:28,270 --> 00:51:34,060 And what sits up there, what sits there, what sits there? 845 00:51:34,060 --> 00:51:36,660 We don't have spring constants anymore. 846 00:51:36,660 --> 00:51:39,340 We have statistics constants. 847 00:51:39,340 --> 00:51:43,080 And what's the number that goes there? 
848 00:51:43,080 --> 00:51:45,340 That one is the third guy. 849 00:51:45,340 --> 00:51:49,410 So it's associated with the third measurement. 850 00:51:49,410 --> 00:51:55,230 It's one over sigma_3 squared. 851 00:51:55,230 --> 00:51:57,720 Those are the numbers that go on the diagonal, 852 00:51:57,720 --> 00:52:00,710 the inverses of the variances. 853 00:52:00,710 --> 00:52:03,360 And just to see that that makes sense. 854 00:52:03,360 --> 00:52:11,430 If that number is unreliable, if it has a large variance 855 00:52:11,430 --> 00:52:18,810 then I want to give it little weight, right? 856 00:52:18,810 --> 00:52:21,840 If this third meter is very unreliable 857 00:52:21,840 --> 00:52:23,840 I'm not going to throw it out entirely, 858 00:52:23,840 --> 00:52:26,710 but I know that its variance is large 859 00:52:26,710 --> 00:52:31,990 and therefore I'll weight that equation only a little, 860 00:52:31,990 --> 00:52:33,550 with a small weight. 861 00:52:33,550 --> 00:52:39,530 Suppose sigma_2, so this guy is one over sigma_2 squared, 862 00:52:39,530 --> 00:52:44,570 suppose this is an extremely reliable meter. 863 00:52:44,570 --> 00:52:48,100 That measurement has little expected error. 864 00:52:48,100 --> 00:52:49,970 Then I want to weight it heavily. 865 00:52:49,970 --> 00:52:56,800 So it has a small sigma_2 and that gives it a large weight. 866 00:52:56,800 --> 00:52:58,580 And sigma_1 similarly. 867 00:52:58,580 --> 00:53:04,270 So that's the weighting for the case 868 00:53:04,270 --> 00:53:08,770 that you can actually hope to use in practice. 869 00:53:08,770 --> 00:53:11,820 I'll just mention that statisticians would also 870 00:53:11,820 --> 00:53:13,620 say, wait a minute. 871 00:53:13,620 --> 00:53:17,350 Measurement two and measurement three might be interconnected. 872 00:53:17,350 --> 00:53:19,290 They might not be independent. 873 00:53:19,290 --> 00:53:21,980 There might be a covariance.
874 00:53:21,980 --> 00:53:25,500 And then that gets them into more great linear algebra 875 00:53:25,500 --> 00:53:26,360 actually. 876 00:53:26,360 --> 00:53:30,360 But if I want a diagonal matrix C 877 00:53:30,360 --> 00:53:34,280 that's the case when my measurements are independent. 878 00:53:34,280 --> 00:53:41,210 And basically, I'm whitening the system. 879 00:53:41,210 --> 00:53:44,890 I'm making the system white, making it all equal variances 880 00:53:44,890 --> 00:53:46,790 by rescaling. 881 00:53:46,790 --> 00:53:48,720 By weighting the equations. 882 00:53:48,720 --> 00:53:51,450 Okay, thanks. 883 00:53:51,450 --> 00:53:55,030 Wednesday is the next big example 884 00:53:55,030 --> 00:53:57,830 of the framework with b and f. 885 00:53:57,830 --> 00:53:59,380 See you then.
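[Editor's sketch] The diagonal weighting can be tried on a straight-line fit. The block below is a minimal sketch, assuming the weighted normal equations A transpose C A u hat = A transpose C b with C = diag(1/sigma_1 squared, 1/sigma_2 squared, 1/sigma_3 squared). The times 0, 1, 3 follow the lecture's example, while the measurements b and the standard deviations are hypothetical, chosen so that the second meter is reliable and the third is not. The function name `weighted_line_fit` is an illustrative assumption.

```python
from fractions import Fraction as F

def weighted_line_fit(times, b, sigmas):
    """Fit b ~ c0 + c1*t by solving (A^T C A) u = A^T C b, C = diag(1/sigma_i^2)."""
    w = [1 / F(s) ** 2 for s in sigmas]              # weights: inverse variances
    s0 = sum(w)                                      # entries of A^T C A
    s1 = sum(wi * t for wi, t in zip(w, times))
    s2 = sum(wi * t * t for wi, t in zip(w, times))
    r0 = sum(wi * bi for wi, bi in zip(w, b))        # entries of A^T C b
    r1 = sum(wi * t * bi for wi, t, bi in zip(w, times, b))
    det = s0 * s2 - s1 * s1                          # nonzero when times are not all equal
    return [(s2 * r0 - s1 * r1) / det, (s0 * r1 - s1 * r0) / det]

times = [0, 1, 3]
b = [1, 3, 4]                                        # hypothetical measurements

plain = weighted_line_fit(times, b, [1, 1, 1])       # equal variances: C is the identity
weighted = weighted_line_fit(times, b, [1, F(1, 2), 2])  # meter 2 reliable, meter 3 not

# Error in the second equation (t = 1) under each fit
err_plain = b[1] - (plain[0] + plain[1] * times[1])
err_weighted = b[1] - (weighted[0] + weighted[1] * times[1])
```

With these made-up numbers the heavily weighted second measurement ends up with a smaller error (9/41 instead of 9/14), which is the effect the weighting is meant to have.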