1 00:00:07,460 --> 00:00:09,910 OK. 2 00:00:09,910 --> 00:00:15,260 Here's lecture sixteen and if you remember 3 00:00:15,260 --> 00:00:21,150 I ended up the last lecture with this formula for what 4 00:00:21,150 --> 00:00:24,610 I called a projection matrix. 5 00:00:24,610 --> 00:00:31,610 And maybe I could just recap for a minute what 6 00:00:31,610 --> 00:00:35,010 is that magic formula doing? 7 00:00:35,010 --> 00:00:38,390 For example, it's supposed to be -- 8 00:00:38,390 --> 00:00:40,230 it's supposed to produce a projection, 9 00:00:40,230 --> 00:00:44,770 if I multiply by a b, so I take P times b, 10 00:00:44,770 --> 00:00:51,240 I'm supposed to project that vector b to the nearest point 11 00:00:51,240 --> 00:00:54,150 in the column space. 12 00:00:54,150 --> 00:00:54,800 OK. 13 00:00:54,800 --> 00:00:56,350 Can I just -- 14 00:00:56,350 --> 00:01:01,420 one way to recap is to take the two extreme cases. 15 00:01:01,420 --> 00:01:05,040 Suppose a vector b is in the column space? 16 00:01:05,040 --> 00:01:10,030 Then what do I get when I apply the projection P? 17 00:01:10,030 --> 00:01:13,610 So I'm projecting into the column space 18 00:01:13,610 --> 00:01:18,780 but I'm starting with a vector in this case that's already 19 00:01:18,780 --> 00:01:20,950 in the column space, so of course 20 00:01:20,950 --> 00:01:25,860 when I project it I get b again, right. 21 00:01:25,860 --> 00:01:29,860 And I want to show you how that comes out of this formula. 22 00:01:29,860 --> 00:01:32,550 Let me do the other extreme. 23 00:01:32,550 --> 00:01:35,350 Suppose that vector is perpendicular to the column 24 00:01:35,350 --> 00:01:36,210 space. 25 00:01:36,210 --> 00:01:38,860 So imagine this column space as a plane 26 00:01:38,860 --> 00:01:42,550 and imagine b as sticking straight up perpendicular 27 00:01:42,550 --> 00:01:43,490 to it. 28 00:01:43,490 --> 00:01:50,490 What's the nearest point in the column space to b in that case? 
29 00:01:50,490 --> 00:01:54,380 So what's the projection onto the plane, 30 00:01:54,380 --> 00:01:57,980 the nearest point in the plane, if the vector b that 31 00:01:57,980 --> 00:02:02,000 I'm looking at is -- got no component in the column space, 32 00:02:02,000 --> 00:02:05,050 it's sticking completely -- ninety degrees with it, 33 00:02:05,050 --> 00:02:10,220 then Pb should be zero, right. 34 00:02:10,220 --> 00:02:13,040 So those are the two extreme cases. 35 00:02:13,040 --> 00:02:18,000 The average vector has a component P in the column space 36 00:02:18,000 --> 00:02:20,930 and a component perpendicular to it, 37 00:02:20,930 --> 00:02:25,930 and what the projection does is it kills this part 38 00:02:25,930 --> 00:02:29,510 and it preserves this part. 39 00:02:29,510 --> 00:02:30,010 OK. 40 00:02:30,010 --> 00:02:32,230 Can we just see why that's true? 41 00:02:32,230 --> 00:02:37,140 Just -- that formula ought to work. 42 00:02:37,140 --> 00:02:41,010 So let me start with this one. 43 00:02:41,010 --> 00:02:44,260 What vectors are in the -- are perpendicular to the column 44 00:02:44,260 --> 00:02:45,240 space? 45 00:02:45,240 --> 00:02:48,220 How do I see that I really get zero? 46 00:02:48,220 --> 00:02:50,850 I have to think, what does it mean for a vector b 47 00:02:50,850 --> 00:02:54,410 to be perpendicular to the column space? 48 00:02:54,410 --> 00:02:59,430 So if it's perpendicular to all the columns, 49 00:02:59,430 --> 00:03:02,100 then it's in some other space. 50 00:03:02,100 --> 00:03:05,740 We've got our four spaces so the reason I do this is it's 51 00:03:05,740 --> 00:03:10,030 perfectly using what we know about our four spaces. 52 00:03:10,030 --> 00:03:13,860 What vectors are perpendicular to the column space? 53 00:03:13,860 --> 00:03:19,190 Those are the guys in the null space of A transpose, 54 00:03:19,190 --> 00:03:20,290 right? 
55 00:03:20,290 --> 00:03:22,740 That's the first section of this chapter, 56 00:03:22,740 --> 00:03:26,160 that's the key geometry of these spaces. 57 00:03:26,160 --> 00:03:28,300 If I'm perpendicular to the column space, 58 00:03:28,300 --> 00:03:30,961 I'm in the null space of A transpose. 59 00:03:30,961 --> 00:03:31,460 OK. 60 00:03:31,460 --> 00:03:33,790 So if I'm in the null space of A transpose, 61 00:03:33,790 --> 00:03:41,760 and I multiply this big formula times b, so now I'm getting Pb, 62 00:03:41,760 --> 00:03:48,530 this is now the projection, Pb, do you see that I get zero? 63 00:03:48,530 --> 00:03:50,470 Of course I get zero. 64 00:03:50,470 --> 00:03:52,950 Right at the end there, A transpose b 65 00:03:52,950 --> 00:03:54,840 will give me zero right away. 66 00:03:54,840 --> 00:03:57,780 So that's why that zero's here. 67 00:03:57,780 --> 00:04:00,890 Because if I'm perpendicular to the column space, then 68 00:04:00,890 --> 00:04:03,840 I'm in the null space of A transpose and A transpose 69 00:04:03,840 --> 00:04:08,640 b is zilch. 70 00:04:08,640 --> 00:04:09,970 OK, what about the other possibility? 71 00:04:09,970 --> 00:04:13,370 How do I see that this formula gives me the right answer 72 00:04:13,370 --> 00:04:15,250 if b is in the column space? 73 00:04:18,230 --> 00:04:21,890 So what's a typical vector in the column space? 74 00:04:21,890 --> 00:04:24,480 It's a combination of the columns. 75 00:04:24,480 --> 00:04:27,240 How do I write a combination of the columns? 76 00:04:27,240 --> 00:04:31,090 So tell me, how would I write, you know, 77 00:04:31,090 --> 00:04:34,440 your everyday vector that's in the column space? 78 00:04:34,440 --> 00:04:38,860 It would have the form A times some x, right? 79 00:04:38,860 --> 00:04:42,280 That's what's in the column space, A times something. 80 00:04:42,280 --> 00:04:44,730 That makes it a combination of the columns. 
81 00:04:44,730 --> 00:04:49,570 So these b's were in the null space of A transpose. 82 00:04:49,570 --> 00:04:54,640 These guys in the column space, those b's are Ax-s. 83 00:04:54,640 --> 00:04:55,210 Right? 84 00:04:55,210 --> 00:04:58,825 If b is in the column space then it has the form Ax. 85 00:05:01,380 --> 00:05:04,450 I'm going to stick that on the quiz or the final for sure. 86 00:05:04,450 --> 00:05:08,290 That you have to realize -- because we've said it like 87 00:05:08,290 --> 00:05:13,020 a thousand times that the things in the column space are vectors 88 00:05:13,020 --> 00:05:14,241 A times x. 89 00:05:14,241 --> 00:05:14,740 OK. 90 00:05:14,740 --> 00:05:18,050 And do you see what happens now if we use our formula? 91 00:05:18,050 --> 00:05:19,990 There's an A transpose A. 92 00:05:19,990 --> 00:05:21,860 Gets canceled by its inverse. 93 00:05:21,860 --> 00:05:25,800 We're left with an A times x. 94 00:05:25,800 --> 00:05:27,530 So the result was Ax. 95 00:05:27,530 --> 00:05:28,570 Which was b. 96 00:05:28,570 --> 00:05:30,100 Do you see that it works? 97 00:05:30,100 --> 00:05:32,750 This is that whole business. 98 00:05:32,750 --> 00:05:35,870 Cancel, cancel, leaving Ax. 99 00:05:35,870 --> 00:05:37,840 And Ax was b. 100 00:05:37,840 --> 00:05:43,730 So that turned out to be b, in this case. 101 00:05:43,730 --> 00:05:51,300 OK, so geometrically what we're seeing is we're taking a vector 102 00:05:51,300 --> 00:05:53,010 -- 103 00:05:53,010 --> 00:06:00,770 we've got the column space and perpendicular to that 104 00:06:00,770 --> 00:06:06,350 is the null space of A transpose. 105 00:06:06,350 --> 00:06:10,230 And our typical vector b is out here. 106 00:06:10,230 --> 00:06:12,970 There's zero, so there's our typical vector b, 107 00:06:12,970 --> 00:06:19,460 and what we're doing is we're projecting it to P. 
And the -- 108 00:06:19,460 --> 00:06:22,300 and of course at the same time we're finding the other part 109 00:06:22,300 --> 00:06:24,810 of it which is e. 110 00:06:24,810 --> 00:06:30,520 So the two pieces, the projection piece and the error 111 00:06:30,520 --> 00:06:35,280 piece, add up to the original b. 112 00:06:35,280 --> 00:06:36,230 OK. 113 00:06:36,230 --> 00:06:39,520 That's like what our matrix does. 114 00:06:39,520 --> 00:06:41,440 So this is P -- 115 00:06:41,440 --> 00:06:48,260 P is -- this P is Ab, is sorry -- is Pb, it's the projection, 116 00:06:48,260 --> 00:06:52,590 applied to b, and this one is -- 117 00:06:52,590 --> 00:06:55,080 OK, that's a projection too. 118 00:06:55,080 --> 00:06:58,070 That's a projection down onto that space. 119 00:06:58,070 --> 00:06:59,860 What's a good formula for it? 120 00:06:59,860 --> 00:07:05,340 Suppose I ask you for the projection of the projection 121 00:07:05,340 --> 00:07:08,830 matrix onto the -- 122 00:07:08,830 --> 00:07:13,240 this space, this perpendicular space? 123 00:07:13,240 --> 00:07:16,960 So if this projection was P, what's 124 00:07:16,960 --> 00:07:21,070 the projection that gives me e? 125 00:07:21,070 --> 00:07:24,170 It's the -- what I want is to get the rest of the vector, 126 00:07:24,170 --> 00:07:30,790 so it'll be just I minus P times b, that's a projection too. 127 00:07:30,790 --> 00:07:35,880 That's the projection onto the perpendicular space. 128 00:07:38,950 --> 00:07:40,040 OK. 129 00:07:40,040 --> 00:07:44,150 So if P's a projection, I minus P is a projection. 130 00:07:44,150 --> 00:07:47,790 If P is symmetric, I minus P is symmetric. 131 00:07:47,790 --> 00:07:52,290 If P squared equals P, then I minus P squared equals I minus 132 00:07:52,290 --> 00:07:55,690 P. It's just -- 133 00:07:55,690 --> 00:08:00,460 the algebra -- is only doing what your -- 134 00:08:00,460 --> 00:08:05,270 picture is completely telling you. 
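The recap above can be checked numerically. This is a minimal sketch, not anything written on the board: the 3 by 2 matrix A is an illustrative assumption, and the helper names (`transpose`, `matmul`, `inverse_2x2`, `projection_matrix`, `apply`) are hypothetical. It builds P from the formula A (A transpose A) inverse A transpose with exact fractions and tests the two extreme cases, plus the claims about I minus P.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projection_matrix(A):
    """P = A (A^T A)^{-1} A^T, for a matrix A with two independent columns."""
    At = transpose(A)
    return matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

def apply(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Illustrative matrix with independent columns (an assumption for this sketch).
A = [[Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)],
     [Fraction(1), Fraction(3)]]
P = projection_matrix(A)
I = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
I_minus_P = [[I[i][j] - P[i][j] for j in range(3)] for i in range(3)]

b_in = apply(A, [Fraction(2), Fraction(-1)])        # b = Ax, so b is in the column space
b_perp = [Fraction(1), Fraction(-2), Fraction(1)]   # A^T b_perp = 0, so b_perp is perpendicular

print(apply(P, b_in) == b_in)           # projection leaves b unchanged
print(apply(P, b_perp) == [0, 0, 0])    # projection kills b
print(matmul(P, P) == P, transpose(P) == P)
print(matmul(I_minus_P, I_minus_P) == I_minus_P)
```

Exact fractions are used so the equalities hold exactly rather than up to rounding.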
135 00:08:05,270 --> 00:08:08,122 But the algebra leads to this expression. 136 00:08:11,820 --> 00:08:16,280 That expression for P given -- 137 00:08:16,280 --> 00:08:19,810 given a basis for the subspace, given 138 00:08:19,810 --> 00:08:25,690 the matrix A whose columns are a basis for our column space. 139 00:08:25,690 --> 00:08:28,820 OK, that's recap because you -- you need to see that formula 140 00:08:28,820 --> 00:08:30,460 more than once. 141 00:08:30,460 --> 00:08:34,669 And now can I pick up on using it? 142 00:08:34,669 --> 00:08:37,789 So now -- and the -- 143 00:08:37,789 --> 00:08:46,590 it's like, let me do that again, I'll go right through a problem 144 00:08:46,590 --> 00:08:52,470 that I started at the end, which is find a best straight line. 145 00:08:52,470 --> 00:08:53,820 You remember that problem, I -- 146 00:08:53,820 --> 00:08:57,530 I picked a particular set of points, 147 00:08:57,530 --> 00:09:00,820 they weren't specially brilliant, t equal one, 148 00:09:00,820 --> 00:09:07,190 two, three, the heights were one, two, and then two again. 149 00:09:07,190 --> 00:09:10,570 So they were -- heights were that point, that point, 150 00:09:10,570 --> 00:09:13,320 which makes it look like I've got a nice forty-five-degree 151 00:09:13,320 --> 00:09:18,800 line -- but then the third point didn't lie on the line. 152 00:09:18,800 --> 00:09:22,500 And I wanted to find the best straight line. 153 00:09:22,500 --> 00:09:26,404 So I'm looking for the -- this line, y=C+Dt. 154 00:09:30,850 --> 00:09:35,110 And it's not going to go through all three points, 155 00:09:35,110 --> 00:09:37,880 because no line goes through all three points. 156 00:09:37,880 --> 00:09:42,190 So I'm going to pick the best line, the -- 157 00:09:42,190 --> 00:09:45,990 the best being the one that makes the overall error 158 00:09:45,990 --> 00:09:48,430 as small as I can make it. 159 00:09:48,430 --> 00:09:52,390 Now I have to tell you, what is that overall error? 
160 00:09:52,390 --> 00:10:01,600 And -- because that determines what's the winning line. 161 00:10:01,600 --> 00:10:02,740 If we don't know -- 162 00:10:02,740 --> 00:10:06,810 I mean we have to decide what we mean by the error -- 163 00:10:06,810 --> 00:10:12,940 and then we minimize and we find the right -- the best C and D. 164 00:10:12,940 --> 00:10:18,100 So if I went through this -- if I went through that point, 165 00:10:18,100 --> 00:10:18,600 OK. 166 00:10:18,600 --> 00:10:20,688 I would solve the equation C+D=1. 167 00:10:23,680 --> 00:10:26,310 Because at t equal to one -- 168 00:10:26,310 --> 00:10:30,050 I'd have C plus D, and it would come out right. 169 00:10:30,050 --> 00:10:34,310 If it went through this point, I'd have C plus two D equal to 170 00:10:34,310 --> 00:10:35,150 two. 171 00:10:35,150 --> 00:10:38,990 Because at t equal to two, I would like to get the answer 172 00:10:38,990 --> 00:10:39,550 two. 173 00:10:39,550 --> 00:10:43,950 At the third point, I have C plus three D because t is 174 00:10:43,950 --> 00:10:47,160 three, but the -- the answer I'm shooting for is 175 00:10:47,160 --> 00:10:49,850 two again. 176 00:10:49,850 --> 00:10:52,680 So those are my three equations. 177 00:10:52,680 --> 00:10:55,720 And they don't have a solution. 178 00:10:55,720 --> 00:10:58,110 But they've got a best solution. 179 00:10:58,110 --> 00:10:59,890 What do I mean by best solution? 180 00:10:59,890 --> 00:11:04,220 So let me take time out to remember what I'm talking 181 00:11:04,220 --> 00:11:06,550 about for best solution. 182 00:11:06,550 --> 00:11:11,960 So this is my equation Ax=b. 183 00:11:11,960 --> 00:11:18,440 A is this matrix, one, one, one, one, two, three. 184 00:11:18,440 --> 00:11:22,580 x is my -- only have two unknowns, C and D, 185 00:11:22,580 --> 00:11:27,440 and b is my right-hand side, one, two, two. 186 00:11:27,440 --> 00:11:27,940 OK. 187 00:11:31,930 --> 00:11:34,670 No solution. 
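Written out, the three equations and their matrix form look like this. The layout is a reconstruction from the spoken description, with the heights one, two, two on the right-hand side:

```latex
\[
\begin{aligned}
C + D  &= 1 \\
C + 2D &= 2 \\
C + 3D &= 2
\end{aligned}
\qquad\Longleftrightarrow\qquad
\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} C \\ D \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}}_{b}
\]
```

The first two equations force C = 0 and D = 1, and then the third would need 0 + 3 = 2, which is why there is no exact solution.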
188 00:11:34,670 --> 00:11:37,300 Three eq- I have a three by two matrix, 189 00:11:37,300 --> 00:11:40,200 I do have two independent columns -- 190 00:11:40,200 --> 00:11:42,540 so I do have a basis for the column space, 191 00:11:42,540 --> 00:11:44,630 those two columns are independent, 192 00:11:44,630 --> 00:11:46,420 they're a basis for the column space, 193 00:11:46,420 --> 00:11:52,370 but the column space doesn't include that vector. 194 00:11:52,370 --> 00:11:57,600 So best possible in this -- 195 00:11:57,600 --> 00:12:01,540 what would best possible mean? 196 00:12:01,540 --> 00:12:05,750 The way that comes out to linear equations is I -- 197 00:12:05,750 --> 00:12:13,970 I want to minimize the sum of these -- 198 00:12:13,970 --> 00:12:15,457 I'm going to make an error here. 199 00:12:15,457 --> 00:12:16,790 I'm going to make an error here. 200 00:12:16,790 --> 00:12:18,950 I'm going to make an error there. 201 00:12:18,950 --> 00:12:24,970 And I'm going to sum and square and add up those errors. 202 00:12:24,970 --> 00:12:26,720 So it's a sum of squares. 203 00:12:26,720 --> 00:12:30,750 It's a least squares solution I'm looking for. 204 00:12:30,750 --> 00:12:37,980 So if I -- those errors are the difference between Ax and b. 205 00:12:37,980 --> 00:12:40,320 That's what I want to make small. 206 00:12:40,320 --> 00:12:42,951 And the way I'm measuring this -- this is a vector, 207 00:12:42,951 --> 00:12:43,450 right? 208 00:12:43,450 --> 00:12:45,480 This is e1,e2 ,e3. 209 00:12:45,480 --> 00:12:49,090 The Ax-b, this is the e. 210 00:12:49,090 --> 00:12:50,890 The error vector. 211 00:12:50,890 --> 00:12:55,890 And small means its length. 212 00:12:55,890 --> 00:12:57,662 The length of that vector. 213 00:12:57,662 --> 00:12:59,370 That's what I'm going to try to minimize. 214 00:12:59,370 --> 00:13:04,280 And it's convenient to square. 
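To make the measure of error concrete, here is a small sketch that evaluates the error vector e = Ax - b and its squared length for a candidate line y = C + Dt. The function names `error_vector` and `squared_error` are hypothetical, and the candidate lines tried at the bottom are arbitrary choices for illustration, not the best line.

```python
from fractions import Fraction

t_values = [1, 2, 3]   # the three t's from the example
b = [1, 2, 2]          # the three measured heights

def error_vector(C, D):
    """e = Ax - b: what each equation C + D*t_i = b_i misses by."""
    return [C + D * t - bi for t, bi in zip(t_values, b)]

def squared_error(C, D):
    """Squared length of e: e1^2 + e2^2 + e3^2."""
    return sum(e * e for e in error_vector(C, D))

# The 45-degree line y = t goes through the first two points and misses the third by 1.
print(error_vector(0, 1), squared_error(0, 1))            # [0, 0, 1] 1

# Another arbitrary candidate, y = 1/2 + t/2, does better.
print(squared_error(Fraction(1, 2), Fraction(1, 2)))      # 1/4
```

The squared error is zero only when every component of e is zero, which is the case where b is in the column space.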
215 00:13:04,280 --> 00:13:06,920 If I make something small, I make -- 216 00:13:09,620 --> 00:13:12,320 this is a never negative quantity, right? 217 00:13:12,320 --> 00:13:13,690 The length of that vector. 218 00:13:16,760 --> 00:13:20,040 The length will be zero exactly when the -- 219 00:13:20,040 --> 00:13:21,990 when I have the zero vector here. 220 00:13:21,990 --> 00:13:26,620 That's exactly the case when I can solve exactly, 221 00:13:26,620 --> 00:13:29,730 b is in the column space, all great. 222 00:13:29,730 --> 00:13:31,820 But I'm not in that case now. 223 00:13:31,820 --> 00:13:34,070 I'm going to have an error vector, e. 224 00:13:34,070 --> 00:13:35,815 What's this error vector in my picture? 225 00:13:38,340 --> 00:13:42,030 I guess what I'm trying to say is there's -- 226 00:13:42,030 --> 00:13:45,270 there's two pictures of what's going on. 227 00:13:45,270 --> 00:13:47,540 There's two pictures of what's going on. 228 00:13:47,540 --> 00:13:50,900 One picture is -- 229 00:13:50,900 --> 00:13:55,020 in this is the three points and the line. 230 00:13:55,020 --> 00:14:00,220 And in that picture, what are the three errors? 231 00:14:00,220 --> 00:14:03,480 The three errors are what I miss by in this equation. 232 00:14:03,480 --> 00:14:05,150 So it's this -- 233 00:14:05,150 --> 00:14:06,740 this little bit here. 234 00:14:06,740 --> 00:14:08,950 That vertical distance up to the line. 235 00:14:08,950 --> 00:14:12,780 There's one -- sorry there's one, and there's C plus D. 236 00:14:12,780 --> 00:14:14,700 And it's that difference. 237 00:14:14,700 --> 00:14:17,720 Here's two and here's C+2D. 238 00:14:17,720 --> 00:14:20,600 So vertically it's that distance -- 239 00:14:20,600 --> 00:14:23,620 that little error there is e1. 240 00:14:23,620 --> 00:14:26,220 This little error here is e2. 241 00:14:26,220 --> 00:14:30,540 This little error coming up is e3. 242 00:14:30,540 --> 00:14:32,350 e3. 
243 00:14:32,350 --> 00:14:35,240 And what's my overall error? 244 00:14:35,240 --> 00:14:43,240 It's e1 squared plus e2 squared plus e3 squared. 245 00:14:43,240 --> 00:14:44,920 That's what I'm trying to make small. 246 00:14:44,920 --> 00:14:54,090 I -- some statisticians -- this is a big part of statistics, 247 00:14:54,090 --> 00:14:56,360 fitting straight lines is a big part of science -- 248 00:14:56,360 --> 00:15:00,310 and specifically statistics, where the right word to use 249 00:15:00,310 --> 00:15:02,210 would be regression. 250 00:15:02,210 --> 00:15:05,270 I'm doing regression here. 251 00:15:05,270 --> 00:15:06,145 Linear regression. 252 00:15:09,840 --> 00:15:12,820 And I'm using this sum of squares 253 00:15:12,820 --> 00:15:15,270 as the measure of error. 254 00:15:15,270 --> 00:15:21,190 Again, some statisticians would be -- they would say, OK, 255 00:15:21,190 --> 00:15:24,000 I'll solve that problem because it's the clean problem. 256 00:15:24,000 --> 00:15:27,080 It leads to a beautiful linear system. 257 00:15:27,080 --> 00:15:30,340 But they would be a little careful about these squares, 258 00:15:30,340 --> 00:15:32,670 for -- in this case. 259 00:15:32,670 --> 00:15:35,990 If one of these points was way off. 260 00:15:35,990 --> 00:15:39,040 Suppose I had a measurement at t equal zero that was way off. 261 00:15:41,560 --> 00:15:44,880 Well, would the straight line, would the best line be the same 262 00:15:44,880 --> 00:15:46,890 if I had this fourth point? 263 00:15:46,890 --> 00:15:50,180 Suppose I have this fourth data point. 264 00:15:50,180 --> 00:15:54,880 No, certainly the line would -- 265 00:15:54,880 --> 00:15:57,620 it wouldn't be the -- that wouldn't be the best line. 266 00:15:57,620 --> 00:16:01,100 Because that line would have a giant error -- 267 00:16:01,100 --> 00:16:04,820 and when I squared it it would be like way out of sight 268 00:16:04,820 --> 00:16:06,860 compared to the others. 
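The effect of one far-off point can be sketched in a few lines. Two things here are assumptions: the height 10 for the fourth point at t = 0 is a made-up value (the lecture only says it is "way off"), and `fit_line` is a hypothetical helper that uses the standard closed-form least-squares fit for a straight line.

```python
from fractions import Fraction

def fit_line(ts, ys):
    """Least-squares C, D for y = C + D*t, solving the 2x2 system
    n*C + sum(t)*D = sum(y),  sum(t)*C + sum(t^2)*D = sum(t*y)."""
    n = len(ts)
    st = sum(ts)
    stt = sum(t * t for t in ts)
    sy = sum(ys)
    sty = sum(t * y for t, y in zip(ts, ys))
    det = n * stt - st * st
    C = Fraction(sy * stt - st * sty, det)
    D = Fraction(n * sty - st * sy, det)
    return C, D

# The three points from the example.
C3, D3 = fit_line([1, 2, 3], [1, 2, 2])

# Same points plus a hypothetical far-off measurement at t = 0.
C4, D4 = fit_line([0, 1, 2, 3], [10, 1, 2, 2])

print(D3 > 0, D4 < 0)   # True True: the one extra point flips the slope
```

With this made-up value the slope changes sign, which is the kind of topsy-turvy result that makes people want to identify outliers before squaring.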
269 00:16:06,860 --> 00:16:14,280 So this would be called by statisticians an outlier, 270 00:16:14,280 --> 00:16:17,830 and they would not be happy to see the whole problem turned 271 00:16:17,830 --> 00:16:21,150 topsy-turvy by this one outlier, which could be a mistake, 272 00:16:21,150 --> 00:16:22,760 after all. 273 00:16:22,760 --> 00:16:26,500 So they wouldn't -- so they wouldn't like maybe squaring, 274 00:16:26,500 --> 00:16:29,940 if there were outliers, they would want to identify them. 275 00:16:29,940 --> 00:16:30,440 OK. 276 00:16:30,440 --> 00:16:35,800 I'm not going to -- 277 00:16:35,800 --> 00:16:40,870 I don't want to suggest that least squares isn't used, 278 00:16:40,870 --> 00:16:44,790 it's the most used, but it's not exclusively used 279 00:16:44,790 --> 00:16:47,040 because it's a little -- 280 00:16:47,040 --> 00:16:50,000 overcompensates for outliers. 281 00:16:50,000 --> 00:16:51,500 Because of that squaring. 282 00:16:51,500 --> 00:16:52,000 OK. 283 00:16:52,000 --> 00:16:54,300 So suppose we don't have this guy, 284 00:16:54,300 --> 00:16:57,300 we just have these three equations. 285 00:16:57,300 --> 00:17:01,680 And I want to make -- minimize this error. 286 00:17:01,680 --> 00:17:02,650 OK. 287 00:17:02,650 --> 00:17:08,069 Now, what I said is there's two pictures to look at. 288 00:17:08,069 --> 00:17:10,940 One picture is this one. 289 00:17:10,940 --> 00:17:14,700 The three points, the best line. 290 00:17:14,700 --> 00:17:16,280 And the errors. 291 00:17:16,280 --> 00:17:20,760 Now, on this picture, what are these points 292 00:17:20,760 --> 00:17:24,890 on the line, the points that are really on the line? 293 00:17:24,890 --> 00:17:30,490 So they're -- points, let me call them P1, P2, and P3, 294 00:17:30,490 --> 00:17:35,610 those are three numbers, so this -- this height is P1, 295 00:17:35,610 --> 00:17:45,700 this height is P2, this height is P3, and what are those guys? 
296 00:17:45,700 --> 00:17:49,930 Suppose those were the three values instead of -- 297 00:17:49,930 --> 00:17:53,840 there's b1, ev- everybody's seen all these -- sorry, 298 00:17:53,840 --> 00:17:57,090 my art is as usual not the greatest, 299 00:17:57,090 --> 00:18:04,590 but there's the given b1, the given b2, and the given b3. 300 00:18:04,590 --> 00:18:09,330 I promise not to put a single letter more on that picture. 301 00:18:09,330 --> 00:18:10,050 OK. 302 00:18:10,050 --> 00:18:15,600 There's b1, P1 is the one on the line, and e1 is the distance 303 00:18:15,600 --> 00:18:16,600 between. 304 00:18:16,600 --> 00:18:21,410 And same at points two and same at points three. 305 00:18:21,410 --> 00:18:23,370 OK, so what's up? 306 00:18:23,370 --> 00:18:26,310 What's up with those Ps? 307 00:18:26,310 --> 00:18:29,930 P1, P2, P3, what are they? 308 00:18:29,930 --> 00:18:32,520 They're the components, they lie on the line, 309 00:18:32,520 --> 00:18:33,720 right? 310 00:18:33,720 --> 00:18:38,420 They're the points which if instead 311 00:18:38,420 --> 00:18:44,530 of one, two, two, which were the b's, suppose I put 312 00:18:44,530 --> 00:18:47,230 P1, P2, P3 in here. 313 00:18:47,230 --> 00:18:50,150 I'll figure out in a minute what those numbers are. 314 00:18:50,150 --> 00:18:53,040 But I just want to get the picture of what I'm doing. 315 00:18:53,040 --> 00:18:56,390 If I put P1, P2, P3 in those three equations, 316 00:18:56,390 --> 00:18:58,795 what would be good about the three equations? 317 00:19:01,820 --> 00:19:03,820 I could solve them. 318 00:19:03,820 --> 00:19:06,420 A line goes through the Ps. 319 00:19:06,420 --> 00:19:10,400 So the P1, P2, P3 vector, that's in the column 320 00:19:10,400 --> 00:19:11,320 space. 321 00:19:11,320 --> 00:19:14,480 That is a combination of these columns. 322 00:19:14,480 --> 00:19:16,400 It's the closest combination. 323 00:19:16,400 --> 00:19:18,180 It's this picture. 
324 00:19:18,180 --> 00:19:20,920 See, I've got the two pictures like here's 325 00:19:20,920 --> 00:19:24,710 the picture that shows the points, this 326 00:19:24,710 --> 00:19:28,240 is a picture in a blackboard plane, 327 00:19:28,240 --> 00:19:34,310 here's a picture that's showing the vectors. 328 00:19:34,310 --> 00:19:38,540 The vector b, which is in this case, in this example 329 00:19:38,540 --> 00:19:42,090 is the vector one, two, two. 330 00:19:42,090 --> 00:19:47,940 The column space is in this case spanned by the -- 331 00:19:47,940 --> 00:19:49,720 well, you see A there. 332 00:19:49,720 --> 00:19:55,600 The column space of the matrix one, one, one, one, two, three. 333 00:19:55,600 --> 00:20:01,540 And this picture shows the nearest point. 334 00:20:01,540 --> 00:20:04,510 There's the -- that point P1, P2, P3, 335 00:20:04,510 --> 00:20:08,050 which I'm going to compute before the end of this hour, 336 00:20:08,050 --> 00:20:13,090 is the closest point in the column space. 337 00:20:13,090 --> 00:20:13,780 OK. 338 00:20:13,780 --> 00:20:19,560 Let me -- I don't dare leave it any longer -- 339 00:20:19,560 --> 00:20:21,650 can I just compute it now. 340 00:20:21,650 --> 00:20:24,850 So I want to compute -- 341 00:20:24,850 --> 00:20:28,800 find P. All right. 342 00:20:28,800 --> 00:20:39,250 Find P. Find x, which is C and D, and find P. OK. 343 00:20:39,250 --> 00:20:42,430 And I really should put these little hats on 344 00:20:42,430 --> 00:20:49,830 to remind myself that they're the estimated the best line, 345 00:20:49,830 --> 00:20:51,970 not the perfect line. 346 00:20:51,970 --> 00:20:53,050 OK. 347 00:20:53,050 --> 00:20:54,330 OK. 348 00:20:54,330 --> 00:20:55,540 How do I proceed? 349 00:20:55,540 --> 00:20:58,340 Let's just run through the mechanics. 350 00:20:58,340 --> 00:21:02,530 What's the equation for x? 351 00:21:02,530 --> 00:21:04,620 The -- or x hat. 
352 00:21:04,620 --> 00:21:10,390 The equation for that is A transpose A x hat equals A 353 00:21:10,390 --> 00:21:12,500 transpose x -- 354 00:21:12,500 --> 00:21:14,105 A transpose b. 355 00:21:18,020 --> 00:21:19,530 The most -- 356 00:21:19,530 --> 00:21:23,620 I'm -- will venture to call that the most important equation 357 00:21:23,620 --> 00:21:26,350 in statistics. 358 00:21:26,350 --> 00:21:28,560 And in estimation. 359 00:21:28,560 --> 00:21:33,140 And whatever you're -- wherever you've got error and noise this 360 00:21:33,140 --> 00:21:36,980 is the estimate that you use first. 361 00:21:36,980 --> 00:21:37,500 OK. 362 00:21:37,500 --> 00:21:42,740 Whenever you're fitting things by a few parameters, 363 00:21:42,740 --> 00:21:44,700 that's the equation to use. 364 00:21:44,700 --> 00:21:46,500 OK, let's solve it. 365 00:21:46,500 --> 00:21:47,970 What is A transpose A? 366 00:21:47,970 --> 00:21:50,580 So I have to figure out what these matrices are. 367 00:21:50,580 --> 00:21:56,860 One, one, one, one, two, three and one, one, one, one, two, 368 00:21:56,860 --> 00:22:04,490 three, that gives me some matrix, that gives me 369 00:22:04,490 --> 00:22:12,510 a matrix, what do I get out of that, three, six, six, and one 370 00:22:12,510 --> 00:22:15,720 and four and nine, fourteen. 371 00:22:15,720 --> 00:22:17,040 OK. 372 00:22:17,040 --> 00:22:21,830 And what do I expect to see in that matrix and I do see it, 373 00:22:21,830 --> 00:22:25,210 just before I keep going with the calculation? 374 00:22:25,210 --> 00:22:28,450 I expect that matrix to be symmetric. 375 00:22:28,450 --> 00:22:30,565 I expect it to be invertible. 376 00:22:34,100 --> 00:22:36,300 And near the end of the course I'm 377 00:22:36,300 --> 00:22:39,060 going to say I expect it to be positive definite, 378 00:22:39,060 --> 00:22:45,590 but that's a future fact about this crucial matrix, 379 00:22:45,590 --> 00:22:47,050 A transpose A. 380 00:22:47,050 --> 00:22:47,670 OK. 
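A quick check of that matrix, as a sketch in plain Python (the variable names are mine): form A transpose A for the A in this example and confirm the two properties expected of it so far, symmetric and invertible.

```python
A = [[1, 1],
     [1, 2],
     [1, 3]]

# Rows of A^T are the columns of A.
At = [list(col) for col in zip(*A)]

# (A^T A)[i][j] is the dot product of column i of A with column j of A.
AtA = [[sum(x * y for x, y in zip(r, c)) for c in At] for r in At]

det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]

print(AtA)                       # [[3, 6], [6, 14]]
print(AtA[0][1] == AtA[1][0])    # True: symmetric
print(det)                       # 6, nonzero, so invertible
```

The determinant is nonzero here because the two columns of A are independent.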
381 00:22:47,670 --> 00:22:50,880 And now let me figure A transpose b. 382 00:22:50,880 --> 00:22:57,280 So let me -- can I tack on b as an extra column here, one, two, 383 00:22:57,280 --> 00:22:59,950 two? 384 00:22:59,950 --> 00:23:04,770 And tack on the extra A transpose b is -- 385 00:23:04,770 --> 00:23:09,580 looks like five and one and four and six, eleven. 386 00:23:13,770 --> 00:23:20,760 I think my equations are three C plus six D equals five, 387 00:23:20,760 --> 00:23:29,700 and six C plus fourteen D is eleven. 388 00:23:29,700 --> 00:23:33,090 Can I just for safety see if I did that right? 389 00:23:33,090 --> 00:23:37,350 One, one, one times one, two, two is five. 390 00:23:37,350 --> 00:23:40,630 One, two, three, that's one, four and six, eleven. 391 00:23:40,630 --> 00:23:42,667 Looks good. 392 00:23:42,667 --> 00:23:43,625 These are my equations. 393 00:23:48,860 --> 00:23:52,000 That's my -- they're called the normal equations. 394 00:23:54,610 --> 00:23:56,984 I'll just write that word down because it -- 395 00:24:02,800 --> 00:24:04,470 so I solve them. 396 00:24:04,470 --> 00:24:10,270 I solve that for C and D. I would like to -- 397 00:24:10,270 --> 00:24:13,130 before I solve them could I do one thing that's on the -- 398 00:24:13,130 --> 00:24:16,570 that's just above here? 399 00:24:16,570 --> 00:24:18,110 I would like to -- 400 00:24:18,110 --> 00:24:21,470 I'd like to find these equations from calculus. 401 00:24:21,470 --> 00:24:26,320 I'd like to find them from this minimizing thing. 402 00:24:26,320 --> 00:24:28,010 So what's the first error? 403 00:24:28,010 --> 00:24:32,690 The first error is what I missed by in the first equation. 404 00:24:32,690 --> 00:24:36,250 C plus D minus one squared. 405 00:24:36,250 --> 00:24:40,010 And the second error is what I miss in the second equation. 406 00:24:40,010 --> 00:24:44,110 C plus two D minus two squared. 
407 00:24:44,110 --> 00:24:52,350 And the third error squared is C plus three D minus two squared. 408 00:24:52,350 --> 00:24:56,410 That's my -- overall squared error that I'm trying 409 00:24:56,410 --> 00:24:58,040 to minimize. 410 00:24:58,040 --> 00:24:58,610 OK. 411 00:24:58,610 --> 00:25:08,910 So how would you minimize that? 412 00:25:08,910 --> 00:25:16,270 OK, linear algebra has given us the equations for the minimum. 413 00:25:16,270 --> 00:25:20,750 But we could use calculus too. 414 00:25:20,750 --> 00:25:24,440 That's a function of two variables, C and D, 415 00:25:24,440 --> 00:25:28,010 and we're looking for the minimum. 416 00:25:28,010 --> 00:25:31,140 So how do we find it? 417 00:25:31,140 --> 00:25:35,160 Directly from calculus, we take partial derivatives, 418 00:25:35,160 --> 00:25:37,510 right, we've got two variables, C and D, 419 00:25:37,510 --> 00:25:40,900 so take the partial derivative with respect to C 420 00:25:40,900 --> 00:25:44,560 and set it to zero, and you'll get that equation. 421 00:25:44,560 --> 00:25:47,140 Take the partial derivative with respect -- 422 00:25:47,140 --> 00:25:51,570 I'm not going to write it all out, just -- you will. 423 00:25:51,570 --> 00:25:56,370 The partial derivative with respect to D, it -- you know, 424 00:25:56,370 --> 00:25:59,340 it's going to be linear, that's the beauty of these 425 00:25:59,340 --> 00:26:03,220 squares, that if I have the square of something and I take 426 00:26:03,220 --> 00:26:07,520 its derivative I get something linear. And this is what I get. 427 00:26:07,520 --> 00:26:11,440 So this is the derivative of the error with respect to C 428 00:26:11,440 --> 00:26:13,770 being zero, and this is the derivative 429 00:26:13,770 --> 00:26:17,850 of the error with respect to D being zero. 430 00:26:17,850 --> 00:26:20,660 Wherever you look, these equations keep coming. 
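For anyone who does want it written out, here is the calculus step as a worked derivation. The symbol E for the overall squared error is a label I am introducing; everything else comes from the three squared terms above.

```latex
\[
E(C, D) = (C + D - 1)^2 + (C + 2D - 2)^2 + (C + 3D - 2)^2
\]
\[
\frac{\partial E}{\partial C}
  = 2(C + D - 1) + 2(C + 2D - 2) + 2(C + 3D - 2)
  = 2\,(3C + 6D - 5) = 0
  \;\Longrightarrow\; 3C + 6D = 5
\]
\[
\frac{\partial E}{\partial D}
  = 2(C + D - 1) + 4(C + 2D - 2) + 6(C + 3D - 2)
  = 2\,(6C + 14D - 11) = 0
  \;\Longrightarrow\; 6C + 14D = 11
\]
```

Both derivatives are linear in C and D, and setting them to zero reproduces the two normal equations exactly.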
431 00:26:20,660 --> 00:26:22,370 So now I guess I'm going to solve it, 432 00:26:22,370 --> 00:26:25,830 what will I do, I'll subtract, I'll do elimination of course, 433 00:26:25,830 --> 00:26:27,820 because that's the only thing I know how to do. 434 00:26:27,820 --> 00:26:32,540 Two of these away from this would give me -- 435 00:26:32,540 --> 00:26:37,198 let's see, six, so would that be two Ds equals one? 436 00:26:37,198 --> 00:26:37,697 Ha. 437 00:26:41,440 --> 00:26:43,050 So it wasn't -- 438 00:26:43,050 --> 00:26:45,760 I was afraid these numbers were going to come out awful. 439 00:26:45,760 --> 00:26:48,770 But if I take two of those away from that, 440 00:26:48,770 --> 00:26:51,480 the equation I get left is two D equals one, 441 00:26:51,480 --> 00:26:57,700 so I think D is a half and C is whatever 442 00:26:57,700 --> 00:27:03,650 back substitution gives, six D is three, so three C plus three 443 00:27:03,650 --> 00:27:07,060 is five, I'm doing back substitution now, right, three, 444 00:27:07,060 --> 00:27:10,910 can I do it in light letters, three C plus 445 00:27:10,910 --> 00:27:15,970 that six D is three equals five, so three C is two, 446 00:27:15,970 --> 00:27:17,440 so I think C is two-thirds. 447 00:27:23,276 --> 00:27:24,275 One-half and two-thirds. 448 00:27:29,230 --> 00:27:38,640 So the best line, the best line is the constant two-thirds 449 00:27:38,640 --> 00:27:42,760 plus one-half t. 450 00:27:42,760 --> 00:27:46,820 And I -- is my picture more or less right? 451 00:27:46,820 --> 00:27:49,890 Let me write, let me copy that best line down again, 452 00:27:49,890 --> 00:27:52,600 two-thirds and a half. 453 00:27:52,600 --> 00:27:55,510 Let me -- I'll put in the two-thirds and the half. 454 00:27:59,890 --> 00:28:00,710 OK. 455 00:28:00,710 --> 00:28:05,360 So what's this P1, that's the value at t equal to one. 
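The elimination and back substitution just described can be replayed in a few lines. This is a sketch with exact fractions; the names `row1`, `row2`, and `multiplier` are mine.

```python
from fractions import Fraction

# Normal equations as augmented rows: 3C + 6D = 5 and 6C + 14D = 11.
row1 = [Fraction(3), Fraction(6), Fraction(5)]
row2 = [Fraction(6), Fraction(14), Fraction(11)]

# Elimination: take two of the first equation away from the second.
multiplier = row2[0] / row1[0]
row2 = [r2 - multiplier * r1 for r1, r2 in zip(row1, row2)]   # leaves 2D = 1

# Back substitution.
D = row2[2] / row2[1]
C = (row1[2] - row1[1] * D) / row1[0]

print(C, D)   # 2/3 1/2
```

That matches the best line two-thirds plus one-half t.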
456 00:28:05,360 --> 00:28:08,380 At t equal to one, I have two-thirds plus a half, 457 00:28:08,380 --> 00:28:10,280 which is -- 458 00:28:10,280 --> 00:28:13,400 what's that, four-sixths and three-sixths, so P1, oh, 459 00:28:13,400 --> 00:28:18,400 I promised not to write another thing on this -- 460 00:28:18,400 --> 00:28:21,860 I'll erase P1 and I'll put seven-sixths. 461 00:28:21,860 --> 00:28:22,580 OK. 462 00:28:22,580 --> 00:28:27,990 And yeah, it's above one, and e1 is one-sixth, right. 463 00:28:27,990 --> 00:28:28,720 You see it all. 464 00:28:28,720 --> 00:28:29,220 Right? 465 00:28:29,220 --> 00:28:29,830 What's P2? 466 00:28:29,830 --> 00:28:31,660 OK. 467 00:28:31,660 --> 00:28:35,230 At point t equal to two, where's my line here? 468 00:28:35,230 --> 00:28:38,920 At t equal to two, it's two-thirds plus one, right? 469 00:28:38,920 --> 00:28:41,580 That's five-thirds. 470 00:28:41,580 --> 00:28:44,130 Two-thirds and t is two, so that's two-thirds 471 00:28:44,130 --> 00:28:46,070 and one make five-thirds. 472 00:28:46,070 --> 00:28:49,320 And that's -- sure enough, that's smaller than the exact 473 00:28:49,320 --> 00:28:50,280 two. 474 00:28:50,280 --> 00:28:55,180 And then final P3, when t is three, oh, what's 475 00:28:55,180 --> 00:28:56,820 two-thirds plus three-halves? 476 00:29:01,390 --> 00:29:03,950 It's the same as three-halves plus two-thirds. 477 00:29:03,950 --> 00:29:09,280 It's -- so maybe four-sixths and nine-sixths, 478 00:29:09,280 --> 00:29:11,120 maybe thirteen-sixths. 479 00:29:11,120 --> 00:29:15,110 OK, and again, look, oh, look at this, OK. 480 00:29:15,110 --> 00:29:19,840 You have to admire the beauty of this answer. 481 00:29:19,840 --> 00:29:21,260 What's this first error? 482 00:29:21,260 --> 00:29:25,760 So here are the errors. e1, e2 and e3. 483 00:29:25,760 --> 00:29:28,340 OK, what was that first error, e1? 
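A minimal sketch of those three evaluations, assuming the best line is 2/3 + (1/2) t and the measured values are 1, 2, 2 at t = 1, 2, 3 (the variable names are illustrative only):

```python
from fractions import Fraction

C, D = Fraction(2, 3), Fraction(1, 2)   # best line C + D*t
ts = [1, 2, 3]
bs = [1, 2, 2]                          # measured values, assumed from the board

p = [C + D * t for t in ts]             # height of the line at each t
print([str(v) for v in p])              # ['7/6', '5/3', '13/6']

for t, pv, b in zip(ts, p, bs):
    side = "above" if pv > b else "below"
    print(f"t={t}: line is {side} the data point")
```

The line passes above the first and third points and below the middle one.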
484 00:29:28,340 --> 00:29:32,640 Well, if we decide the errors counting up, 485 00:29:32,640 --> 00:29:35,260 then it's one-sixth. 486 00:29:35,260 --> 00:29:38,670 And the last error, thirteen-sixths 487 00:29:38,670 --> 00:29:43,420 minus the correct two is one-sixth again. 488 00:29:43,420 --> 00:29:47,890 And what's this error in the middle? 489 00:29:47,890 --> 00:29:52,530 Let's see, the correct answer was two, two. 490 00:29:52,530 --> 00:29:55,900 And we got five-thirds and it's the other direction, 491 00:29:55,900 --> 00:29:58,445 minus one-third, minus two-sixths. 492 00:30:02,070 --> 00:30:04,560 That's our error vector. 493 00:30:04,560 --> 00:30:09,220 In our picture, in our other picture, here it is. 494 00:30:09,220 --> 00:30:13,880 We just found P and e. 495 00:30:13,880 --> 00:30:19,120 e is this vector, one-sixth, minus two-sixths, one-sixth, 496 00:30:19,120 --> 00:30:21,540 and P is this guy. 497 00:30:21,540 --> 00:30:23,540 Well, maybe I have the signs of e wrong, 498 00:30:23,540 --> 00:30:26,880 I think I have, let me fix it. 499 00:30:26,880 --> 00:30:32,840 Because I would like this one-sixth -- 500 00:30:32,840 --> 00:30:37,690 I would like this plus the P to give the original b. 501 00:30:37,690 --> 00:30:42,730 I want P plus e to match b. 502 00:30:42,730 --> 00:30:47,080 So I want minus a sixth, plus seven-sixths 503 00:30:47,080 --> 00:30:50,650 to give the correct b equal one. 504 00:30:50,650 --> 00:30:52,090 OK. 505 00:30:52,090 --> 00:30:58,060 Now -- I'm going to take a deep breath here, 506 00:30:58,060 --> 00:31:06,720 and ask what do we know about this error vector e? 507 00:31:06,720 --> 00:31:09,790 You've seen now this whole problem worked completely 508 00:31:09,790 --> 00:31:13,780 through, and I even think the numbers are right. 509 00:31:13,780 --> 00:31:17,500 So there's P, so let me -- 510 00:31:17,500 --> 00:31:24,840 I'll write -- if I can put it down here, B is P plus e. 
511 00:31:24,840 --> 00:31:29,110 b I believe was one, two, two. 512 00:31:29,110 --> 00:31:34,860 The nearest point had seven-sixths, 513 00:31:34,860 --> 00:31:36,120 what were the others? 514 00:31:36,120 --> 00:31:40,590 Five-thirds and thirteen-sixths. 515 00:31:40,590 --> 00:31:46,950 And the e vector was minus a sixth, two-sixths, 516 00:31:46,950 --> 00:31:49,360 one-third in other words, and minus a sixth. 517 00:31:58,511 --> 00:31:59,010 OK. 518 00:31:59,010 --> 00:32:01,930 Tell me some stuff about these two vectors. 519 00:32:01,930 --> 00:32:03,820 Tell me something about those two vectors, 520 00:32:03,820 --> 00:32:06,480 well, they add to b, right, great. 521 00:32:06,480 --> 00:32:07,070 OK. 522 00:32:07,070 --> 00:32:09,420 What else? 523 00:32:09,420 --> 00:32:12,520 What else about those two vectors, the P, 524 00:32:12,520 --> 00:32:18,700 the projection vector P, and the error vector e. 525 00:32:18,700 --> 00:32:21,470 What else do you know about them? 526 00:32:21,470 --> 00:32:24,430 They're perpendicular, right. 527 00:32:24,430 --> 00:32:25,860 Do we dare verify that? 528 00:32:29,180 --> 00:32:32,230 Can you take the dot product of those vectors? 529 00:32:32,230 --> 00:32:35,440 I'm like getting like minus seven over thirty-six, 530 00:32:35,440 --> 00:32:36,850 can I change that to ten-sixths? 531 00:32:42,180 --> 00:32:45,250 Oh, God, come out right here. 532 00:32:45,250 --> 00:32:50,880 Minus seven over thirty-six, plus twenty over thirty-six, 533 00:32:50,880 --> 00:32:53,080 minus thirteen over thirty-six. 534 00:32:56,730 --> 00:32:57,630 Thank you, God. 535 00:32:57,630 --> 00:32:59,120 OK. 536 00:32:59,120 --> 00:33:04,030 And what else should we know about that vector? 537 00:33:04,030 --> 00:33:05,740 Actually we know -- 538 00:33:05,740 --> 00:33:08,740 I've got to say we know even a little more. 
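The check on the board can be repeated with exact fractions. This is only a sketch using the numbers quoted above: b = (1, 2, 2) and the projection (7/6, 5/3, 13/6).

```python
from fractions import Fraction as F

b = [F(1), F(2), F(2)]
p = [F(7, 6), F(5, 3), F(13, 6)]

# Error vector with the sign convention b = p + e.
e = [bi - pi for bi, pi in zip(b, p)]
print([str(v) for v in e])          # ['-1/6', '1/3', '-1/6']

# The three products; 5/9 is the same number as 20/36.
terms = [pi * ei for pi, ei in zip(p, e)]
print([str(v) for v in terms])      # ['-7/36', '5/9', '-13/36']

dot = sum(terms)
print(dot)                          # 0
```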
539 00:33:08,740 --> 00:33:13,510 This vector, e, is perpendicular to P, 540 00:33:13,510 --> 00:33:18,480 but it's perpendicular to other stuff too. 541 00:33:18,480 --> 00:33:22,220 It's perpendicular not just to this guy in the column space, 542 00:33:22,220 --> 00:33:25,170 this is in the column space for sure. 543 00:33:25,170 --> 00:33:27,680 This is perpendicular to the column space. 544 00:33:27,680 --> 00:33:32,710 So like give me another vector it's perpendicular to. 545 00:33:32,710 --> 00:33:35,000 Another because it's perpendicular to the whole 546 00:33:35,000 --> 00:33:37,490 column space, not just to this -- 547 00:33:37,490 --> 00:33:40,780 this particular projection that's -- 548 00:33:40,780 --> 00:33:44,880 that is in the column space, but it's perpendicular to other 549 00:33:44,880 --> 00:33:46,520 stuff, whatever's in the column space, 550 00:33:46,520 --> 00:33:49,800 so tell me another vector in the -- oh, well, 551 00:33:49,800 --> 00:33:53,080 I've written down the matrix, so tell me another vector 552 00:33:53,080 --> 00:33:55,000 in the column space. 553 00:33:55,000 --> 00:33:58,000 Pick a nice one. 554 00:33:58,000 --> 00:33:59,450 One, one, one. 555 00:33:59,450 --> 00:34:01,490 That's what everybody's thinking. 556 00:34:01,490 --> 00:34:04,230 OK, one, one, one is in the column space. 557 00:34:04,230 --> 00:34:07,350 And this guy is supposed to be perpendicular to one, 558 00:34:07,350 --> 00:34:08,090 one, one. 559 00:34:08,090 --> 00:34:10,000 Is it? 560 00:34:10,000 --> 00:34:10,659 Sure. 561 00:34:10,659 --> 00:34:12,550 If I take the dot product with one, 562 00:34:12,550 --> 00:34:16,690 one, one I get minus a sixth, plus two-sixths, minus a sixth, 563 00:34:16,690 --> 00:34:18,080 zero. 564 00:34:18,080 --> 00:34:20,659 And it's perpendicular to one, two, three. 
565 00:34:20,659 --> 00:34:23,020 Because if I take the dot product with one, 566 00:34:23,020 --> 00:34:30,310 two, three I get minus one-sixth, plus four-sixths, minus three-sixths, zero again. 567 00:34:30,310 --> 00:34:32,449 OK, do you see the -- 568 00:34:32,449 --> 00:34:35,739 I hope you see the two pictures. 569 00:34:35,739 --> 00:34:41,110 The picture here for vectors, and the picture here 570 00:34:41,110 --> 00:34:48,120 for the best line, and it's the same picture, just -- 571 00:34:48,120 --> 00:34:51,440 this one's in the plane and it's showing the line, 572 00:34:51,440 --> 00:34:56,060 this one never did show the line, this -- in this picture, 573 00:34:56,060 --> 00:34:59,160 C and D never showed up. 574 00:34:59,160 --> 00:35:02,040 In this picture, C and D were -- you know, 575 00:35:02,040 --> 00:35:04,730 they determined that line. 576 00:35:04,730 --> 00:35:07,020 But the two are exactly the same. 577 00:35:07,020 --> 00:35:10,540 C and D is the combination of the two columns 578 00:35:10,540 --> 00:35:14,770 that gives P. OK. 579 00:35:14,770 --> 00:35:19,890 So that's least squares. 580 00:35:19,890 --> 00:35:23,680 And the special but most important 581 00:35:23,680 --> 00:35:26,820 example of fitting by a straight line, so the homework 582 00:35:26,820 --> 00:35:29,670 that's coming then Wednesday asks 583 00:35:29,670 --> 00:35:32,750 you to fit by straight lines. 584 00:35:32,750 --> 00:35:40,440 So you're just going to end up solving the key equation. 585 00:35:40,440 --> 00:35:42,850 You're going to end up solving that key equation 586 00:35:42,850 --> 00:35:47,100 and then P will be Ax hat. 587 00:35:47,100 --> 00:35:47,870 That's it. 588 00:35:51,640 --> 00:35:54,470 OK. 589 00:35:54,470 --> 00:35:59,650 Now, can I put in a little piece of linear algebra 590 00:35:59,650 --> 00:36:03,350 that I mentioned earlier, mentioned again, 591 00:36:03,350 --> 00:36:06,510 but I never did write? 592 00:36:06,510 --> 00:36:09,840 And I've -- I should do it right.
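Both pictures can be tied together in a short sketch, assuming A has the columns (1, 1, 1) and (1, 2, 3) and x hat is (C, D) = (2/3, 1/2). The projection is the combination A x hat of the columns, and the error is perpendicular to every column, which is the same as saying A transpose e is zero. The helper names are made up for the example.

```python
from fractions import Fraction as F

A = [[1, 1],
     [1, 2],
     [1, 3]]                       # columns (1,1,1) and (1,2,3)
b = [F(1), F(2), F(2)]
x_hat = [F(2, 3), F(1, 2)]         # (C, D) from the best line

def times(A, x):
    """Return A x: a combination of the columns of A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose_times(A, v):
    """Return A^T v: one dot product per column of A."""
    return [sum(row[j] * vi for row, vi in zip(A, v)) for j in range(len(A[0]))]

p = times(A, x_hat)
e = [bi - pi for bi, pi in zip(b, p)]
print([str(v) for v in p])                           # ['7/6', '5/3', '13/6']
print([str(v) for v in transpose_times(A, e)])       # ['0', '0']
```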
593 00:36:09,840 --> 00:36:16,070 It's about this matrix A transpose A. There. 594 00:36:21,840 --> 00:36:26,450 I was sure that that matrix would be invertible. 595 00:36:26,450 --> 00:36:29,220 And of course I wanted to be sure it was invertible, 596 00:36:29,220 --> 00:36:36,210 because I planned to solve this system with that matrix. 597 00:36:36,210 --> 00:36:40,620 So and I announced like before -- 598 00:36:40,620 --> 00:36:42,660 as the chapter was just starting, 599 00:36:42,660 --> 00:36:45,390 I announced that it would be invertible. 600 00:36:45,390 --> 00:36:48,615 But now I -- can I come back to that? 601 00:36:48,615 --> 00:36:49,115 OK. 602 00:36:52,940 --> 00:36:56,050 So what I said was -- 603 00:36:56,050 --> 00:37:07,440 that if A has independent columns, 604 00:37:07,440 --> 00:37:14,616 then A transpose A is invertible. 605 00:37:20,100 --> 00:37:24,080 And I would like to -- 606 00:37:24,080 --> 00:37:27,250 first to repeat that important fact, 607 00:37:27,250 --> 00:37:32,320 that that's the requirement that makes everything go here. 608 00:37:32,320 --> 00:37:34,610 It's this independent columns of A 609 00:37:34,610 --> 00:37:39,050 that guarantees everything goes through. 610 00:37:39,050 --> 00:37:42,140 And think why. 611 00:37:42,140 --> 00:37:44,970 Why does this matrix A transpose A, 612 00:37:44,970 --> 00:37:50,410 why is it invertible if the columns of A are independent? 613 00:37:50,410 --> 00:38:01,840 OK, there's -- so if it wasn't invertible, I'm -- 614 00:38:01,840 --> 00:38:04,750 so I want to prove that. 615 00:38:04,750 --> 00:38:08,060 If it isn't invertible, then what? 616 00:38:08,060 --> 00:38:10,610 I want to reach -- 617 00:38:10,610 --> 00:38:13,010 I want to follow that -- follow that line -- 618 00:38:13,010 --> 00:38:15,400 of thinking and see what I come to. 619 00:38:15,400 --> 00:38:17,480 Suppose, so proof. 620 00:38:17,480 --> 00:38:26,810 Suppose A transpose Ax is zero.
621 00:38:26,810 --> 00:38:28,400 I'm trying to prove this. 622 00:38:28,400 --> 00:38:30,440 This is now to prove. 623 00:38:30,440 --> 00:38:39,910 I don't like hammer away at too many proofs in this course. 624 00:38:39,910 --> 00:38:41,690 But this is like the central fact 625 00:38:41,690 --> 00:38:44,320 and it brings in all the stuff we know. 626 00:38:44,320 --> 00:38:44,820 OK. 627 00:38:44,820 --> 00:38:46,700 So I'll start the proof. 628 00:38:46,700 --> 00:38:51,160 Suppose A transpose Ax is zero. 629 00:38:51,160 --> 00:38:56,110 What -- and I'm aiming to prove A transpose A is invertible. 630 00:38:56,110 --> 00:38:58,150 So what do I want to prove now? 631 00:39:00,680 --> 00:39:03,560 So I'm aiming to prove this fact. 632 00:39:03,560 --> 00:39:06,680 I'll use this, and I'm aiming to prove that this matrix is 633 00:39:06,680 --> 00:39:11,740 invertible, OK, so if I suppose A transpose Ax is zero, 634 00:39:11,740 --> 00:39:13,875 then what conclusion do I want to reach? 635 00:39:16,450 --> 00:39:21,200 I'd like to know that x must be zero. 636 00:39:21,200 --> 00:39:23,510 I want to show x must be zero. 637 00:39:23,510 --> 00:39:33,100 To show now -- to prove x must be the zero vector. 638 00:39:33,100 --> 00:39:38,640 Is that right, that's what we worked in the previous chapter 639 00:39:38,640 --> 00:39:43,850 to understand, that a matrix was invertible 640 00:39:43,850 --> 00:39:51,960 when its null space is only the zero vector. 641 00:39:51,960 --> 00:39:53,340 So that's what I want to show. 642 00:39:53,340 --> 00:40:00,520 How come if A transpose Ax is zero, how come x must be zero? 643 00:40:00,520 --> 00:40:01,810 What's going to be the reason? 644 00:40:05,270 --> 00:40:06,880 Actually I have two ways to do it. 645 00:40:10,270 --> 00:40:12,210 Let me show you one way. 646 00:40:12,210 --> 00:40:14,640 This is -- here, trick. 647 00:40:18,210 --> 00:40:22,880 Take the dot product of both sides with x. 
648 00:40:22,880 --> 00:40:25,980 So I'll multiply both sides by x transpose. 649 00:40:25,980 --> 00:40:30,100 x transpose A transpose Ax equals zero. 650 00:40:33,190 --> 00:40:35,100 I shouldn't have written trick. 651 00:40:35,100 --> 00:40:37,640 That makes it sound like just a dumb idea. 652 00:40:37,640 --> 00:40:39,581 Brilliant idea, I should have put. 653 00:40:39,581 --> 00:40:40,080 OK. 654 00:40:43,040 --> 00:40:44,215 I'll just put idea. 655 00:40:47,920 --> 00:40:49,230 OK. 656 00:40:49,230 --> 00:40:57,670 Now, I got to that equation, x transpose A transpose Ax=0, 657 00:40:57,670 --> 00:41:06,229 and I'm hoping you can see the right way to -- 658 00:41:06,229 --> 00:41:07,270 to look at that equation. 659 00:41:12,760 --> 00:41:15,030 What can I conclude from that equation, 660 00:41:15,030 --> 00:41:17,840 that if I have x transpose A -- well, 661 00:41:17,840 --> 00:41:21,070 what is x transpose A transpose Ax? 662 00:41:21,070 --> 00:41:25,360 Does that -- what's it giving you? 663 00:41:29,620 --> 00:41:32,740 It's again going to be putting in parentheses, I'm looking 664 00:41:32,740 --> 00:41:37,170 at Ax and what am I seeing here? 665 00:41:37,170 --> 00:41:39,560 Its transpose. 666 00:41:39,560 --> 00:41:47,580 So I'm seeing here this is (Ax) transpose Ax. 667 00:41:47,580 --> 00:41:48,325 Equaling zero. 668 00:41:51,640 --> 00:41:57,040 Now if (Ax) transpose Ax, so like let's call Ax y or something, 669 00:41:57,040 --> 00:42:01,450 if y transpose y is zero, what does that tell me? 670 00:42:06,780 --> 00:42:08,950 That the vector has to be zero, right? 671 00:42:08,950 --> 00:42:10,650 This is the length squared, that's 672 00:42:10,650 --> 00:42:15,730 the length of the vector Ax squared, that's Ax times Ax. 673 00:42:15,730 --> 00:42:18,210 So I conclude that Ax has to be zero. 674 00:42:23,474 --> 00:42:24,640 Well, I'm getting somewhere.
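The regrouping step, x transpose A transpose A x = (Ax) transpose (Ax) = length of Ax squared, can be checked numerically. This is a small sketch with the lecture's matrix (columns (1, 1, 1) and (1, 2, 3)) and an arbitrary test vector x chosen only for illustration; the identity holds for any A and x.

```python
def matvec(M, v):
    """Return M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1, 1],
     [1, 2],
     [1, 3]]
x = [2, -5]                                  # arbitrary (hypothetical) test vector

Ax = matvec(A, x)
lhs = dot(x, matvec(transpose(A), Ax))       # x^T (A^T (A x))
rhs = dot(Ax, Ax)                            # (Ax)^T (Ax) = |Ax|^2
print(Ax)        # [-3, -8, -13]
print(lhs, rhs)  # 242 242
```

Since the right side is a sum of squares, it can only be zero when every component of Ax is zero.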
675 00:42:29,900 --> 00:42:34,610 Now that I know Ax is zero, now I'm 676 00:42:34,610 --> 00:42:37,370 going to use my little hypothesis. 677 00:42:37,370 --> 00:42:43,290 Somewhere every mathematician has to use the hypothesis. 678 00:42:43,290 --> 00:42:45,050 Right? 679 00:42:45,050 --> 00:42:49,740 Now, if A has independent columns and we've -- 680 00:42:49,740 --> 00:42:55,580 we're at the point where Ax is zero, what does that tell us? 681 00:42:55,580 --> 00:42:59,610 I could -- I mean that could be like a fill-in question 682 00:42:59,610 --> 00:43:01,090 on the final exam. 683 00:43:01,090 --> 00:43:06,820 If A has independent columns and if Ax equals zero then what? 684 00:43:10,390 --> 00:43:15,850 Please say it. x is zero, right. 685 00:43:15,850 --> 00:43:18,370 Which was just what we wanted to prove. 686 00:43:18,370 --> 00:43:20,790 That -- do you see why that is? 687 00:43:20,790 --> 00:43:24,150 If Ax equals zero, now we're using -- 688 00:43:24,150 --> 00:43:27,190 here we used that this was the square of something, 689 00:43:27,190 --> 00:43:30,810 so I'll put in little parentheses 690 00:43:30,810 --> 00:43:35,720 the observation we made, that was a square which is zero, 691 00:43:35,720 --> 00:43:37,610 so the thing has to be zero. 692 00:43:37,610 --> 00:43:43,130 Now we're using the hypothesis of independent columns 693 00:43:43,130 --> 00:43:48,600 that A has independent columns. 694 00:43:48,600 --> 00:43:52,060 If A has independent columns, this is telling me 695 00:43:52,060 --> 00:43:56,040 x is in its null space, and the only thing 696 00:43:56,040 --> 00:44:00,510 in the null space of such a matrix is the zero vector. 697 00:44:00,510 --> 00:44:01,320 OK. 698 00:44:01,320 --> 00:44:06,620 So that's the argument and you see how it really used 699 00:44:06,620 --> 00:44:13,420 our understanding of the -- of the null space. 700 00:44:13,420 --> 00:44:13,990 OK. 701 00:44:13,990 --> 00:44:15,650 That's great.
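To see why the hypothesis matters, here is a small contrast, offered as an illustration and not as part of the lecture. A has the independent columns (1, 1, 1) and (1, 2, 3); B is a made-up matrix whose second column is twice its first. For 2 by 2 matrices, invertibility can be read off the determinant.

```python
def gram(M):
    """Return M^T M, the matrix of dot products between the columns of M."""
    cols = list(zip(*M))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def det2(G):
    """Determinant of a 2x2 matrix."""
    return G[0][0] * G[1][1] - G[0][1] * G[1][0]

A = [[1, 1], [1, 2], [1, 3]]   # independent columns
B = [[1, 2], [1, 2], [1, 2]]   # hypothetical: column 2 = 2 * column 1

print(gram(A), det2(gram(A)))  # [[3, 6], [6, 14]] 6
print(gram(B), det2(gram(B)))  # [[3, 6], [6, 12]] 0

# With dependent columns there is a nonzero x with Bx = 0.
x = [2, -1]
Bx = [sum(b * xi for b, xi in zip(row, x)) for row in B]
print(Bx)                      # [0, 0, 0]
```

For B the null space contains the nonzero vector (2, -1), so the last step of the argument fails and B transpose B is singular.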
702 00:44:15,650 --> 00:44:16,420 All right. 703 00:44:16,420 --> 00:44:20,750 So where are we then? 704 00:44:20,750 --> 00:44:24,430 That board is like the backup theory 705 00:44:24,430 --> 00:44:28,670 that tells me that this matrix had 706 00:44:28,670 --> 00:44:32,610 to be invertible because these columns were independent. 707 00:44:35,530 --> 00:44:38,360 OK. 708 00:44:38,360 --> 00:44:44,940 There's one case of independent -- 709 00:44:44,940 --> 00:44:50,540 there's one case where the geometry gets even better. 710 00:44:50,540 --> 00:44:55,030 When the -- there's one case when columns are sure to be 711 00:44:55,030 --> 00:44:56,610 independent. 712 00:44:56,610 --> 00:45:00,060 And let me put that -- let me write that down and that'll be 713 00:45:00,060 --> 00:45:01,780 the subject for next time. 714 00:45:01,780 --> 00:45:07,040 Columns are sure -- are certainly independent, 715 00:45:07,040 --> 00:45:23,290 definitely independent, if they're perpendicular. 716 00:45:23,290 --> 00:45:25,190 Oh, I've got to rule out the zero column, 717 00:45:25,190 --> 00:45:33,280 let me give them all length one, so they can't be zero if they 718 00:45:33,280 --> 00:45:37,855 are perpendicular unit vectors. 719 00:45:42,870 --> 00:45:53,370 Like the vectors one, zero, zero; zero, one, zero; and zero, 720 00:45:53,370 --> 00:45:55,480 zero, one. 721 00:45:55,480 --> 00:46:00,660 Those vectors are unit vectors, they're perpendicular, 722 00:46:00,660 --> 00:46:05,820 and they certainly are independent. 723 00:46:05,820 --> 00:46:10,610 And what's more, suppose they're -- oh, that's so nice, 724 00:46:10,610 --> 00:46:14,080 I mean what is A transpose A for that matrix? 725 00:46:14,080 --> 00:46:16,470 For the matrix with these three columns? 726 00:46:16,470 --> 00:46:18,280 It's the identity. 727 00:46:18,280 --> 00:46:23,090 So here's the key to the lecture that's coming.
728 00:46:23,090 --> 00:46:27,210 If we're dealing with perpendicular unit vectors 729 00:46:27,210 --> 00:46:32,000 and the word for that will be -- see I could have said 730 00:46:32,000 --> 00:46:35,650 orthogonal, but I said perpendicular -- 731 00:46:35,650 --> 00:46:41,370 and this unit vectors gets put in as the word normal. 732 00:46:41,370 --> 00:46:42,545 Orthonormal vectors. 733 00:46:46,070 --> 00:46:49,820 Those are the best columns you could ask for. 734 00:46:49,820 --> 00:46:54,730 Matrices with -- whose columns are orthonormal, 735 00:46:54,730 --> 00:46:56,950 they're perpendicular to each other, 736 00:46:56,950 --> 00:47:01,010 and they're unit vectors, well, they don't have to be those 737 00:47:01,010 --> 00:47:06,280 three, let me do a final example over here, 738 00:47:06,280 --> 00:47:11,110 how about one at an angle like that and one at ninety degrees, 739 00:47:11,110 --> 00:47:18,050 that vector would be cos theta, sine theta, a unit vector, 740 00:47:18,050 --> 00:47:24,150 and this vector would be minus sine theta cos theta. 741 00:47:24,150 --> 00:47:30,850 That is our absolute favorite pair of orthonormal vectors. 742 00:47:30,850 --> 00:47:33,630 They're both unit vectors and they're perpendicular. 743 00:47:33,630 --> 00:47:36,520 That angle is ninety degrees. 744 00:47:36,520 --> 00:47:41,500 So like our job next time is first to see 745 00:47:41,500 --> 00:47:43,640 why orthonormal vectors are great, 746 00:47:43,640 --> 00:47:47,240 and then to make vectors orthonormal by picking 747 00:47:47,240 --> 00:47:49,530 the right basis. 748 00:47:49,530 --> 00:47:50,625 OK, see you. 749 00:47:57,070 --> 00:47:58,620 Thanks.
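As a closing sketch, the claim that orthonormal columns give A transpose A equal to the identity can be checked for both examples from the lecture: the three standard unit vectors, and the pair (cos theta, sin theta), (-sin theta, cos theta). The angle 0.7 is an arbitrary value picked for illustration, and `gram` is a hypothetical helper name.

```python
import math

def gram(cols):
    """Return the matrix of all dot products between the given columns (Q^T Q)."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

# The three standard unit vectors.
standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(gram(standard))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# The rotated pair; theta = 0.7 is an arbitrary angle.
theta = 0.7
q1 = (math.cos(theta), math.sin(theta))
q2 = (-math.sin(theta), math.cos(theta))
G = gram([q1, q2])

# Compare with the identity up to floating-point rounding.
is_identity = all(
    math.isclose(G[i][j], 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(2) for j in range(2)
)
print(is_identity)  # True
```

The diagonal entries are cos^2 + sin^2 = 1 (unit length) and the off-diagonal entries are -cos sin + sin cos = 0 (perpendicular).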