1 00:00:00,000 --> 00:00:00,500 2 00:00:00,500 --> 00:00:02,944 The following content is provided under a Creative 3 00:00:02,944 --> 00:00:03,610 Commons license. 4 00:00:03,610 --> 00:00:06,730 Your support will help MIT OpenCourseWare 5 00:00:06,730 --> 00:00:10,050 continue to offer high quality educational resources for free. 6 00:00:10,050 --> 00:00:12,530 To make a donation or to view additional materials 7 00:00:12,530 --> 00:00:15,860 from hundreds of MIT courses visit MIT OpenCourseWare 8 00:00:15,860 --> 00:00:19,390 at ocw.mit.edu. 9 00:00:19,390 --> 00:00:24,680 PROFESSOR STRANG: You can pick up the homeworks 10 00:00:24,680 --> 00:00:27,470 after if you didn't get one. 11 00:00:27,470 --> 00:00:29,350 And the good grade is five. 12 00:00:29,350 --> 00:00:32,900 And the MATLABs are coming in, that's all good. 13 00:00:32,900 --> 00:00:40,050 And I'll have some thoughts about the MATLAB Monday. 14 00:00:40,050 --> 00:00:44,490 I wanted to say a little more about finite differences 15 00:00:44,490 --> 00:00:47,310 in time just because they're so important. 16 00:00:47,310 --> 00:00:53,080 And I got the idea of a forward Euler and a backward Euler 17 00:00:53,080 --> 00:00:57,160 but I want to give you a couple of more possibilities 18 00:00:57,160 --> 00:01:02,660 just so you see how finite difference methods are created. 19 00:01:02,660 --> 00:01:06,470 And then I want to get started on least 20 00:01:06,470 --> 00:01:10,610 squares, the next major topic, the next case 21 00:01:10,610 --> 00:01:15,480 where A transpose A shows up at the core of everything. 22 00:01:15,480 --> 00:01:18,040 So what we did with finite differences 23 00:01:18,040 --> 00:01:22,000 was this was our model problem for one spring 24 00:01:22,000 --> 00:01:25,030 and this is our model problem for n springs 25 00:01:25,030 --> 00:01:27,990 with n different masses. 26 00:01:27,990 --> 00:01:34,090 So this is like, scalar. u is just a single unknown. 27 00:01:34,090 --> 00:01:37,530 Here u is a vector unknown. 28 00:01:37,530 --> 00:01:42,390 If I reduce things to like, even this model problem, 29 00:01:42,390 --> 00:01:45,450 if I introduce the velocity, because everybody's also 30 00:01:45,450 --> 00:01:47,850 interested in computing velocity, 31 00:01:47,850 --> 00:01:51,000 then the velocity is u' and the equation 32 00:01:51,000 --> 00:01:53,730 tells me that v' is minus u. 33 00:01:53,730 --> 00:01:55,620 So I have a system. 34 00:01:55,620 --> 00:01:58,780 Better to think in terms of first order systems 35 00:01:58,780 --> 00:02:00,550 because they include everything. 36 00:02:00,550 --> 00:02:02,590 So the first order system there, do 37 00:02:02,590 --> 00:02:05,580 you see what the matrix is on that right-hand side? 38 00:02:05,580 --> 00:02:10,350 I'm always seeing a matrix that's multiplying [u, v]. 39 00:02:10,350 --> 00:02:17,360 So this would be our example of this general set-up. 40 00:02:17,360 --> 00:02:19,330 This general set-up. 41 00:02:19,330 --> 00:02:24,680 First order systems, taking them to be linear here. 42 00:02:24,680 --> 00:02:26,670 And then I better say a little bit 43 00:02:26,670 --> 00:02:34,110 about what happens when they're not linear today and later. 44 00:02:34,110 --> 00:02:36,140 So this is my problem. du/dt=Au. 45 00:02:36,140 --> 00:02:39,720 46 00:02:39,720 --> 00:02:42,930 And the exact solution would come from the eigenvalues 47 00:02:42,930 --> 00:02:50,260 and eigenvectors of A. We would have the e^(lambda*t)'s, giving 48 00:02:50,260 --> 00:02:54,360 us the time growth or decay or oscillation, 49 00:02:54,360 --> 00:02:56,960 times their eigenvectors. 50 00:02:56,960 --> 00:03:02,690 And a combination of those would be the exact answer. 51 00:03:02,690 --> 00:03:05,620 So that's the general method for using 52 00:03:05,620 --> 00:03:13,190 eigenvalues and eigenvectors to get exact solutions. 53 00:03:13,190 --> 00:03:15,620 I'm not speaking now about exact solutions. 54 00:03:15,620 --> 00:03:19,880 I'm going to talk about finite differences in time 55 00:03:19,880 --> 00:03:23,630 because for more general problems 56 00:03:23,630 --> 00:03:26,550 I must expect to go to finite differences. 57 00:03:26,550 --> 00:03:31,870 I can't expect exact solutions like pure cos(t) or sin(t). 58 00:03:31,870 --> 00:03:36,330 So here's my problem, model problem. 59 00:03:36,330 --> 00:03:39,980 Here is Euler's first idea. 60 00:03:39,980 --> 00:03:45,230 So idea one from Euler was replace the derivative 61 00:03:45,230 --> 00:03:49,860 by a finite difference taking time steps forward in time 62 00:03:49,860 --> 00:03:55,610 and use the equation to tell you the slope 63 00:03:55,610 --> 00:03:56,880 at the start of the step. 64 00:03:56,880 --> 00:04:01,680 Do you see Euler's equation there? 65 00:04:01,680 --> 00:04:08,510 And that is definitely a reasonable thing to start with. 66 00:04:08,510 --> 00:04:11,150 It's not very accurate. 67 00:04:11,150 --> 00:04:13,090 It's not perfectly stable. 68 00:04:13,090 --> 00:04:16,200 It doesn't preserve energy. 69 00:04:16,200 --> 00:04:22,160 We saw the answer spiraling out but if you only 70 00:04:22,160 --> 00:04:26,610 want to compute up to a limited time 71 00:04:26,610 --> 00:04:30,760 you could use it with a pretty small time step. 72 00:04:30,760 --> 00:04:34,190 But you can do better. 73 00:04:34,190 --> 00:04:36,950 Now backward Euler was the other way. 74 00:04:36,950 --> 00:04:40,150 So it's just the contrast of the two that you want to see. 75 00:04:40,150 --> 00:04:43,880 Neither one is an outstanding method. 76 00:04:43,880 --> 00:04:48,170 They're both only first order accurate. 77 00:04:48,170 --> 00:04:54,010 So backward Euler, this time step in time 78 00:04:54,010 --> 00:05:00,470 is still the difference that replaces the derivative. 79 00:05:00,470 --> 00:05:04,130 But now everybody notices the big difference. 80 00:05:04,130 --> 00:05:08,110 The slope is being computed at the end of the time step. 81 00:05:08,110 --> 00:05:09,520 And that's more stable. 82 00:05:09,520 --> 00:05:14,480 That's the one that spiraled in. 83 00:05:14,480 --> 00:05:21,580 And I would call this method explicit. 84 00:05:21,580 --> 00:05:23,620 What does explicit mean? 85 00:05:23,620 --> 00:05:27,960 It means that I can find u_(n+1) directly. 86 00:05:27,960 --> 00:05:31,170 This method I would call implicit. 87 00:05:31,170 --> 00:05:33,610 And what does implicit mean? 88 00:05:33,610 --> 00:05:37,020 Implicit means that u_(n+1) is appearing on the right-hand 89 00:05:37,020 --> 00:05:40,660 side as well as the left so I've got to move it over and I have 90 00:05:40,660 --> 00:05:45,720 to solve a system of equations to find the next u_(n+1). 91 00:05:45,720 --> 00:05:51,890 So you see that basic separation of ideas here. 92 00:05:51,890 --> 00:05:57,000 Forward, faster, but less stable. 93 00:05:57,000 --> 00:06:00,370 Backward, slower, generally more stable 94 00:06:00,370 --> 00:06:03,640 and in fact, in this case somehow too stable. 95 00:06:03,640 --> 00:06:05,100 It dissipates more energy. 96 00:06:05,100 --> 00:06:10,710 You don't want to lose all your energy. 97 00:06:10,710 --> 00:06:12,590 So you look for something better. 98 00:06:12,590 --> 00:06:16,850 And also you look for something more accurate. 99 00:06:16,850 --> 00:06:21,010 I want to suggest a more accurate method, 100 00:06:21,010 --> 00:06:23,130 the trapezoidal method. 101 00:06:23,130 --> 00:06:26,250 Which just follows that basic principle 102 00:06:26,250 --> 00:06:30,310 that if I center things at halfway 103 00:06:30,310 --> 00:06:33,300 I'm probably going to pick up an order of accuracy. 104 00:06:33,300 --> 00:06:40,850 So centering it halfway I took half of backward Euler and half 105 00:06:40,850 --> 00:06:41,870 of forward Euler. 106 00:06:41,870 --> 00:06:47,010 So is this one explicit or implicit? 107 00:06:47,010 --> 00:06:47,690 Implicit. 108 00:06:47,690 --> 00:06:50,850 Implicit. u_(n+1) is still appearing here, 109 00:06:50,850 --> 00:06:52,470 has to move over there. 110 00:06:52,470 --> 00:06:56,420 So if I move it over I could say here I have, 111 00:06:56,420 --> 00:07:01,251 let me multiply up by delta t and bring the u_(n+1) over 112 00:07:01,251 --> 00:07:01,750 here. 113 00:07:01,750 --> 00:07:04,370 So I'd have u_(n+1). 114 00:07:04,370 --> 00:07:10,950 I'll have minus delta t over two A. 115 00:07:10,950 --> 00:07:14,240 All that's what's multiplying u_(n+1). 116 00:07:14,240 --> 00:07:15,250 Right? 117 00:07:15,250 --> 00:07:17,500 When I multiply up, let me do that. 118 00:07:17,500 --> 00:07:19,540 Let me make it easy for your eyes. 119 00:07:19,540 --> 00:07:23,380 I'll multiply up by the delta t. 120 00:07:23,380 --> 00:07:26,980 And then I'm bringing this part, the implicit part. 121 00:07:26,980 --> 00:07:29,210 So that's my left-hand side. 122 00:07:29,210 --> 00:07:31,720 And what's my right-hand side? u_n 123 00:07:31,720 --> 00:07:38,240 is going over as the identity plus delta t over two A. 124 00:07:38,240 --> 00:07:41,570 All that's what's multiplying u_n. 125 00:07:41,570 --> 00:07:45,040 Good. 126 00:07:45,040 --> 00:07:49,200 So this is the matrix that has to get inverted at every step. 127 00:07:49,200 --> 00:07:55,340 Of course, in this model problem that's not expensive to do. 128 00:07:55,340 --> 00:07:57,990 If we have the same matrix at every step, 129 00:07:57,990 --> 00:08:02,280 if we have a linear time-invariant system, 130 00:08:02,280 --> 00:08:08,690 well we can just invert it once or factor it in into L times U 131 00:08:08,690 --> 00:08:09,760 once. 132 00:08:09,760 --> 00:08:11,400 We don't have to do it at every step. 133 00:08:11,400 --> 00:08:13,950 So that's, of course, very fast. 134 00:08:13,950 --> 00:08:17,300 But when the problem becomes non-linear 135 00:08:17,300 --> 00:08:20,480 we're going to start paying a price for a more 136 00:08:20,480 --> 00:08:23,300 complicated implicit part. 137 00:08:23,300 --> 00:08:27,010 What I'm interested in right now just 138 00:08:27,010 --> 00:08:33,140 to sort of see these differences, 139 00:08:33,140 --> 00:08:39,610 there's a particular matrix A that is my model here. 140 00:08:39,610 --> 00:08:41,570 It's antisymmetric. 141 00:08:41,570 --> 00:08:44,710 Notice it's antisymmetric and that sort of 142 00:08:44,710 --> 00:08:47,750 goes with conserving energy. 143 00:08:47,750 --> 00:08:50,820 I'm not going to-- In a first order system, 144 00:08:50,820 --> 00:08:56,800 that antisymmetric tells me that the eigenvalues of that matrix 145 00:08:56,800 --> 00:08:58,710 are pure imaginary. 146 00:08:58,710 --> 00:09:02,110 And so I'm going to get e^(it). 147 00:09:02,110 --> 00:09:05,200 I'm going to get things that don't blow up, 148 00:09:05,200 --> 00:09:07,450 that don't decay. 149 00:09:07,450 --> 00:09:14,600 So I want to see, what's the growth factor? 150 00:09:14,600 --> 00:09:21,080 Suppose you want to understand, is this method good, 151 00:09:21,080 --> 00:09:22,830 what's its growth factor? 152 00:09:22,830 --> 00:09:24,900 So that will tell me about stability. 153 00:09:24,900 --> 00:09:28,030 It'll tell me about accuracy, too. 154 00:09:28,030 --> 00:09:34,090 So its growth factor, all I want to do 155 00:09:34,090 --> 00:09:42,800 is put everything on the right side of the equation. 156 00:09:42,800 --> 00:09:44,280 I want to write it in this form. 157 00:09:44,280 --> 00:09:46,930 So that G will be the growth matrix. 158 00:09:46,930 --> 00:09:50,310 And what is G? 159 00:09:50,310 --> 00:09:55,270 Well, I just have to invert that. 160 00:09:55,270 --> 00:09:59,900 That's I minus delta t over two A. The inverse of that, so 161 00:09:59,900 --> 00:10:08,100 that's the implicit part, times the explicit part. 162 00:10:08,100 --> 00:10:10,180 I'm not doing anything difficult here. 163 00:10:10,180 --> 00:10:16,520 Just seeing how would you approach 164 00:10:16,520 --> 00:10:20,830 to understand whether the solution to this 165 00:10:20,830 --> 00:10:28,120 is growing, decaying, or staying as we hope, 166 00:10:28,120 --> 00:10:31,490 with maybe constant energy. 167 00:10:31,490 --> 00:10:33,974 Because that's what the true solution is doing. 168 00:10:33,974 --> 00:10:35,640 The true solution, you remember, is just 169 00:10:35,640 --> 00:10:38,920 going around in a circle in our model problem 170 00:10:38,920 --> 00:10:45,050 and just oscillating forever in our model system. 171 00:10:45,050 --> 00:10:49,830 What about the growth or decay or whatever 172 00:10:49,830 --> 00:10:53,030 of that growth factor? 173 00:10:53,030 --> 00:10:55,250 Well again, this is the point where 174 00:10:55,250 --> 00:10:57,820 I would look for eigenvalues. 175 00:10:57,820 --> 00:11:02,210 If I have a matrix G, and everybody recognizes that 176 00:11:02,210 --> 00:11:07,740 after n time steps, what matrix am I going to have? 177 00:11:07,740 --> 00:11:13,220 What matrix connects u_n to the original u_0? 178 00:11:13,220 --> 00:11:15,510 That matrix is? 179 00:11:15,510 --> 00:11:17,310 G^n. 180 00:11:17,310 --> 00:11:21,550 At every step I'm multiplying by a G. 181 00:11:21,550 --> 00:11:25,160 So with finite differences, you have powers. 182 00:11:25,160 --> 00:11:26,430 That's the rule. 183 00:11:26,430 --> 00:11:30,640 With differential equations, you have exponentials. 184 00:11:30,640 --> 00:11:36,220 With finite differences, finite steps, you have powers of G. 185 00:11:36,220 --> 00:11:42,220 So I'm interested in G^n and what tells me about that is 186 00:11:42,220 --> 00:11:44,730 the eigenvalues here. 187 00:11:44,730 --> 00:11:49,190 If I had to ask for the eigenvalues of this matrix, 188 00:11:49,190 --> 00:11:56,010 they would be the eigenvalues of G. So eig(G), can I say, 189 00:11:56,010 --> 00:11:59,980 for this case. 190 00:11:59,980 --> 00:12:04,700 Actually, can I just say what it would be? 191 00:12:04,700 --> 00:12:07,390 What are the eigenvalues of this guy? 192 00:12:07,390 --> 00:12:12,330 The eigenvalues of that factor are one plus delta t 193 00:12:12,330 --> 00:12:17,260 over two times the eigenvalues of A. That's 194 00:12:17,260 --> 00:12:19,080 what we're getting from here. 195 00:12:19,080 --> 00:12:20,700 And here, this is the inverse. 196 00:12:20,700 --> 00:12:26,240 So its eigenvalues will come in the denominator. 197 00:12:26,240 --> 00:12:32,230 This'll be one minus delta t over two times the eigenvalues 198 00:12:32,230 --> 00:12:35,920 of A. That's pretty good. 199 00:12:35,920 --> 00:12:37,830 That's pretty good. 200 00:12:37,830 --> 00:12:41,680 Actually this sort of shows me a lot. 201 00:12:41,680 --> 00:12:46,000 Well, in the case of this model example, 202 00:12:46,000 --> 00:12:51,220 the eigenvalues were i and minus i. 203 00:12:51,220 --> 00:12:57,170 This is for the special case A equals [0, 1; -1, 0]. 204 00:12:57,170 --> 00:13:04,100 This will be one plus i delta t over two divided by one 205 00:13:04,100 --> 00:13:09,000 minus i delta t over two. 206 00:13:09,000 --> 00:13:12,560 And the complex conjugate for the other eigenvalue. 207 00:13:12,560 --> 00:13:15,000 It'd be two eigenvalues. 208 00:13:15,000 --> 00:13:18,550 Let me work with the one, let me take the one that's i 209 00:13:18,550 --> 00:13:22,510 and then there would be a similar deal with minus i. 210 00:13:22,510 --> 00:13:32,190 That's our eigenvalue of G. Where in the complex plane 211 00:13:32,190 --> 00:13:34,320 is that number? 212 00:13:34,320 --> 00:13:37,780 If I give you that complex number. 213 00:13:37,780 --> 00:13:40,190 We're meeting complex numbers here. 214 00:13:40,190 --> 00:13:43,140 We're not meeting complex functions, 215 00:13:43,140 --> 00:13:48,650 just we have to be able to deal with complex numbers. 216 00:13:48,650 --> 00:13:51,570 Actually, when I look at those two numbers, 217 00:13:51,570 --> 00:13:56,440 what do I see right away about them? 218 00:13:56,440 --> 00:13:59,140 How are they related? 219 00:13:59,140 --> 00:14:02,210 They're conjugates, right? 220 00:14:02,210 --> 00:14:03,640 Conjugates of each other. 221 00:14:03,640 --> 00:14:07,650 This one is in the complex plane, 222 00:14:07,650 --> 00:14:12,360 I would go along one, the real axis, and up delta t over two. 223 00:14:12,360 --> 00:14:15,780 And this one I would go down delta t over two. 224 00:14:15,780 --> 00:14:21,610 And that symmetry is what I, the word you used, 225 00:14:21,610 --> 00:14:25,500 the complex conjugate. 226 00:14:25,500 --> 00:14:30,830 Compare that length for the top with that length for the bottom 227 00:14:30,830 --> 00:14:32,750 and what do you get? 228 00:14:32,750 --> 00:14:33,940 Same length. 229 00:14:33,940 --> 00:14:37,000 So I'm concluding that in this case 230 00:14:37,000 --> 00:14:43,470 the magnitude of these eigenvalues of G is what? 231 00:14:43,470 --> 00:14:47,570 So magnitude, absolute value, is just 232 00:14:47,570 --> 00:14:49,270 the absolute value of that divided 233 00:14:49,270 --> 00:14:51,670 by the absolute value of that. 234 00:14:51,670 --> 00:14:54,800 It's this distance divided by this distance. 235 00:14:54,800 --> 00:14:59,450 And you told me already the answer is one. 236 00:14:59,450 --> 00:15:06,720 The eigenvalues are right on the unit circle. 237 00:15:06,720 --> 00:15:09,470 In some ways that's wonderful. 238 00:15:09,470 --> 00:15:13,490 The solution if you compute exactly 239 00:15:13,490 --> 00:15:18,120 will stay on the unit circle. 240 00:15:18,120 --> 00:15:23,490 Of course, it will not be exactly the same 241 00:15:23,490 --> 00:15:27,710 as the continuous solution. 242 00:15:27,710 --> 00:15:31,461 But the energy won't change. 243 00:15:31,461 --> 00:15:32,710 We're not going to spiral out. 244 00:15:32,710 --> 00:15:34,030 We're not going to spiral in. 245 00:15:34,030 --> 00:15:37,020 It's this average and it's more accurate. 246 00:15:37,020 --> 00:15:39,460 So trapezoidal method is like, the workhorse 247 00:15:39,460 --> 00:15:44,120 for finite element codes. 248 00:15:44,120 --> 00:15:52,330 Trapezoidal method is a pretty successful method. 249 00:15:52,330 --> 00:15:53,610 What's the price? 250 00:15:53,610 --> 00:15:58,200 The price is this implicit stuff that you have to solve. 251 00:15:58,200 --> 00:16:00,990 So I should say it's the workhorse 252 00:16:00,990 --> 00:16:03,300 among implicit methods. 253 00:16:03,300 --> 00:16:06,660 And it's simple. 254 00:16:06,660 --> 00:16:11,630 So just to have a picture, finite element codes 255 00:16:11,630 --> 00:16:16,200 usually are not shooting for great accuracy. 256 00:16:16,200 --> 00:16:18,780 Many finite element calculations, 257 00:16:18,780 --> 00:16:21,610 you're happy with two or three decimal places. 258 00:16:21,610 --> 00:16:23,590 So we're not shooting for great accuracy. 259 00:16:23,590 --> 00:16:27,670 Second order is very adequate. 260 00:16:27,670 --> 00:16:33,910 What we don't want, of course, is to be unstable. 261 00:16:33,910 --> 00:16:39,700 We don't want to lose all the energy. 262 00:16:39,700 --> 00:16:49,240 So this is really a good method. 263 00:16:49,240 --> 00:17:02,310 I have to say it's a good method, but not perfect. 264 00:17:02,310 --> 00:17:08,320 For linear problems, for my model problem this is fine. 265 00:17:08,320 --> 00:17:11,860 For a real problem, a difficult problem 266 00:17:11,860 --> 00:17:25,770 in mechanics you might find that the energy, you might find it 267 00:17:25,770 --> 00:17:28,870 goes a little unstable and non-linearity 268 00:17:28,870 --> 00:17:33,960 tends to grab onto a little instability and make it worse. 269 00:17:33,960 --> 00:17:38,540 So like Professor Bathe, if you take his course 270 00:17:38,540 --> 00:17:46,820 on finite elements, this trapezoidal rule-- Many people 271 00:17:46,820 --> 00:17:50,090 have reinvented it. 272 00:17:50,090 --> 00:17:52,940 There's a Newmark family of methods 273 00:17:52,940 --> 00:17:56,890 and with special parameter, you get back this one and everybody 274 00:17:56,890 --> 00:17:59,000 makes that choice practically. 275 00:17:59,000 --> 00:18:04,050 There's just a host of finite difference methods. 276 00:18:04,050 --> 00:18:06,450 I'm wondering whether you want me to tell you one more. 277 00:18:06,450 --> 00:18:09,130 Do you want one more finite difference method? 278 00:18:09,130 --> 00:18:13,540 Just to like, see what. 279 00:18:13,540 --> 00:18:15,210 You said yes, right? 280 00:18:15,210 --> 00:18:16,130 One more method. 281 00:18:16,130 --> 00:18:21,120 One more and then, this is a subject on its own. 282 00:18:21,120 --> 00:18:23,500 But just to see what else could you do. 283 00:18:23,500 --> 00:18:25,180 What else might you do? 284 00:18:25,180 --> 00:18:27,020 And let me say why you would. 285 00:18:27,020 --> 00:18:30,990 And Professor Bathe had to do it. 286 00:18:30,990 --> 00:18:37,250 In some problems he found that with non-linear problems, 287 00:18:37,250 --> 00:18:42,980 he found that this method, which for a perfect linear problem 288 00:18:42,980 --> 00:18:48,420 stays exactly one, that's great, but you're playing with fire. 289 00:18:48,420 --> 00:18:54,520 To be right on the unit circle, if non-linearity pushes you off 290 00:18:54,520 --> 00:19:00,850 then you wish you had not tried tried for that perfection. 291 00:19:00,850 --> 00:19:10,040 So let me describe a backward difference. 292 00:19:10,040 --> 00:19:18,090 Let me write down, I'll call this BDF2. 293 00:19:18,090 --> 00:19:22,650 All these formulas have names and numbers. 294 00:19:22,650 --> 00:19:26,230 So this would be Backward Difference Formula second order 295 00:19:26,230 --> 00:19:27,810 accurate. 296 00:19:27,810 --> 00:19:30,360 You'll see, it goes two steps back. 297 00:19:30,360 --> 00:19:38,240 So here it is. u_(n+1)-u_n over delta t. 298 00:19:38,240 --> 00:19:41,520 And then there's another term which 299 00:19:41,520 --> 00:19:43,330 gets the second order accuracy. 300 00:19:43,330 --> 00:19:54,220 It happens to be u_(n+1) - 2u_n + u_(n-1) over delta t. 301 00:19:54,220 --> 00:19:56,310 And then that really is delta t. 302 00:19:56,310 --> 00:19:58,260 Equals Au_(n+1). 303 00:19:58,260 --> 00:20:02,360 304 00:20:02,360 --> 00:20:04,700 It's good practice. 305 00:20:04,700 --> 00:20:11,010 That formula came from somewhere. 306 00:20:11,010 --> 00:20:12,110 What if we look at it. 307 00:20:12,110 --> 00:20:15,950 What do we see? 308 00:20:15,950 --> 00:20:19,990 What's the picture for a formula like that? 309 00:20:19,990 --> 00:20:25,180 Is it implicit or explicit first, our first question. 310 00:20:25,180 --> 00:20:25,950 Implicit. 311 00:20:25,950 --> 00:20:29,870 Because it's using, this right-hand side involves this. 312 00:20:29,870 --> 00:20:32,190 And if it was a non-linear equation, 313 00:20:32,190 --> 00:20:35,560 whatever that right-hand side is, not linear, 314 00:20:35,560 --> 00:20:38,330 would be evaluated at the new step 315 00:20:38,330 --> 00:20:43,200 and therefore would have to go back over here. 316 00:20:43,200 --> 00:20:45,640 It is second order accurate and I won't go 317 00:20:45,640 --> 00:20:51,080 through the checking on that. 318 00:20:51,080 --> 00:20:53,970 And is it stable? 319 00:20:53,970 --> 00:20:56,120 That's another question. 320 00:20:56,120 --> 00:20:59,620 We have to find eigenvalues here. 321 00:20:59,620 --> 00:21:07,860 Let me not go through all details there. 322 00:21:07,860 --> 00:21:09,920 It is stable. 323 00:21:09,920 --> 00:21:12,460 And it's slightly dissipative. 324 00:21:12,460 --> 00:21:16,110 It's not as dissipative as backward Euler. 325 00:21:16,110 --> 00:21:19,130 There you're losing energy fast. 326 00:21:19,130 --> 00:21:24,750 Here the eigenvalues, the thing would 327 00:21:24,750 --> 00:21:33,390 stay much closer to the unit circle than backward Euler. 328 00:21:33,390 --> 00:21:41,350 I'll just put up there so that you see. 329 00:21:41,350 --> 00:21:45,590 What are other features that you see right away 330 00:21:45,590 --> 00:21:47,220 from this method? 331 00:21:47,220 --> 00:21:52,050 The fact that it involves not only u_(n+1) and u_n, 332 00:21:52,050 --> 00:21:54,050 but also u_(n-1). 333 00:21:54,050 --> 00:21:55,320 What does that mean? 334 00:21:55,320 --> 00:21:57,870 How do I get started with that method? 335 00:21:57,870 --> 00:21:59,110 Right? 336 00:21:59,110 --> 00:22:02,910 This calls to find the new u. 337 00:22:02,910 --> 00:22:07,630 I need the previous one and the one before that. 338 00:22:07,630 --> 00:22:09,270 No problem, I have them. 339 00:22:09,270 --> 00:22:11,160 Except at the start I don't. 340 00:22:11,160 --> 00:22:14,520 So it'll need a sort of special start 341 00:22:14,520 --> 00:22:17,840 to be able to figure out what should u_n be. 342 00:22:17,840 --> 00:22:20,250 It'll need a separate formula to decide. 343 00:22:20,250 --> 00:22:26,410 And it could use one step of backward Euler to find u_1. 344 00:22:26,410 --> 00:22:33,510 And then take off, now finding u_2 from u_1 and u_0. 345 00:22:33,510 --> 00:22:34,750 And then onward. 346 00:22:34,750 --> 00:22:36,510 So that's quite fast. 347 00:22:36,510 --> 00:22:40,100 More stable, good method. 348 00:22:40,100 --> 00:22:46,660 And you maybe can see that I could get more formulas 349 00:22:46,660 --> 00:22:49,730 by going back further in time. 350 00:22:49,730 --> 00:22:54,720 And by doing that I can get the accuracy higher. 351 00:22:54,720 --> 00:22:58,120 So that's good to see that particular BDF2. 352 00:22:58,120 --> 00:23:02,430 So that's a backward difference formula. 353 00:23:02,430 --> 00:23:09,300 Oh, just to mention what's developed. 354 00:23:09,300 --> 00:23:14,720 So this is stable. 355 00:23:14,720 --> 00:23:17,140 It actually loses a little energy. 356 00:23:17,140 --> 00:23:21,360 So in fact now both Professor Bathe and I 357 00:23:21,360 --> 00:23:28,070 have studied the possibility of taking a trapezoidal step, 358 00:23:28,070 --> 00:23:30,900 which was a little dangerous in the non-linear case 359 00:23:30,900 --> 00:23:32,890 because you were playing with fire, 360 00:23:32,890 --> 00:23:34,630 you were right on the circle. 361 00:23:34,630 --> 00:23:39,100 And then put in a backward difference step 362 00:23:39,100 --> 00:23:43,210 to recover a little stability. 363 00:23:43,210 --> 00:23:47,170 And then trapezoidal backward difference. 364 00:23:47,170 --> 00:23:48,710 So split step. 365 00:23:48,710 --> 00:23:52,520 Split the step into a trapezoidal part and a backward 366 00:23:52,520 --> 00:23:53,340 difference part. 367 00:23:53,340 --> 00:23:58,350 That's actually discussed later in Chapter 2. 368 00:23:58,350 --> 00:24:01,200 Section 2.6 and I might come back to that. 369 00:24:01,200 --> 00:24:04,150 I just wanted to say that much this morning. 370 00:24:04,150 --> 00:24:10,650 Having got as far as forward and backward Euler. 371 00:24:10,650 --> 00:24:14,570 I didn't want to leave you without a better method which 372 00:24:14,570 --> 00:24:18,660 is the trapezoidal method. 373 00:24:18,660 --> 00:24:23,600 I could take any question. 374 00:24:23,600 --> 00:24:26,850 I would like to devote the second half, 375 00:24:26,850 --> 00:24:29,840 if you're okay to just change gear, 376 00:24:29,840 --> 00:24:34,460 to beginning on least squares. 377 00:24:34,460 --> 00:24:40,660 So this was our kind of excitement 378 00:24:40,660 --> 00:24:43,410 to have time dependence for a little while. 379 00:24:43,410 --> 00:24:48,280 But now I'm going back to steady-state problems. 380 00:24:48,280 --> 00:24:49,870 So they're not moving. 381 00:24:49,870 --> 00:24:51,970 I'm looking at equilibrium. 382 00:24:51,970 --> 00:24:57,630 And what I'm going to do now in the next lectures 383 00:24:57,630 --> 00:25:06,470 is more to see this framework that we identified once 384 00:25:06,470 --> 00:25:09,680 for the masses and springs, to see it again and again. 385 00:25:09,680 --> 00:25:12,600 Because it's the basic framework of applied math. 386 00:25:12,600 --> 00:25:15,330 So now I'm ready for least squares. 387 00:25:15,330 --> 00:25:17,300 Least squares. 388 00:25:17,300 --> 00:25:22,700 What's the problem? 389 00:25:22,700 --> 00:25:27,120 Well the problem is I'm given a system of equations Au=f. 390 00:25:27,120 --> 00:25:34,240 391 00:25:34,240 --> 00:25:42,410 These f's are observations. 392 00:25:42,410 --> 00:25:45,850 The u is the unknown vector. 393 00:25:45,850 --> 00:25:47,520 You say what's different here? 394 00:25:47,520 --> 00:25:49,480 It's just a linear system. 395 00:25:49,480 --> 00:25:53,190 What's different is too many equations. 396 00:25:53,190 --> 00:26:01,220 This is an m by n matrix with m bigger than n. 397 00:26:01,220 --> 00:26:05,550 Maybe much bigger than n. 398 00:26:05,550 --> 00:26:07,200 So what do we do? 399 00:26:07,200 --> 00:26:18,920 We've got too many equations and no solution, no exact solution. 400 00:26:18,920 --> 00:26:21,980 I would say probably that what we're coming to, 401 00:26:21,980 --> 00:26:25,390 what we're starting on today is the most important, 402 00:26:25,390 --> 00:26:29,500 the application of linear algebra that I see the most. 403 00:26:29,500 --> 00:26:30,830 So it interests everybody. 404 00:26:30,830 --> 00:26:36,650 It interests engineers, scientists, statisticians, 405 00:26:36,650 --> 00:26:43,680 everybody has to deal with this problem of too many equations. 406 00:26:43,680 --> 00:26:46,170 And those equations come from measurements 407 00:26:46,170 --> 00:26:48,290 so you don't want to throw them away. 408 00:26:48,290 --> 00:26:50,360 I don't want to just throw away. 409 00:26:50,360 --> 00:26:54,480 I want to somehow get the best solution. 410 00:26:54,480 --> 00:26:57,980 I'm looking for the u that comes closest 411 00:26:57,980 --> 00:27:00,920 when I can't find an exact u. 412 00:27:00,920 --> 00:27:02,290 So that's the idea. 413 00:27:02,290 --> 00:27:05,120 There's no exact solution and the problem 414 00:27:05,120 --> 00:27:12,080 is what is the best u. 415 00:27:12,080 --> 00:27:14,920 And I'm going to call that u hat. 416 00:27:14,920 --> 00:27:18,960 My favorite choice of u will be u hat. 417 00:27:18,960 --> 00:27:22,460 So I'm going to get an equation for u hat. 418 00:27:22,460 --> 00:27:25,110 That is my goal. 419 00:27:25,110 --> 00:27:31,810 And what I'm starting here just goes all the way to, 420 00:27:31,810 --> 00:27:33,940 there will be weighted least squares, 421 00:27:33,940 --> 00:27:36,380 there will be Kalman filters. 422 00:27:36,380 --> 00:27:39,420 It's a giant world here of estimating 423 00:27:39,420 --> 00:27:44,940 the best solution when there's noise in the right-hand side. 424 00:27:44,940 --> 00:27:46,520 And what's the model problem? 425 00:27:46,520 --> 00:27:48,360 Always good to have a model problem. 426 00:27:48,360 --> 00:27:51,600 Let me draw a model problem. 427 00:27:51,600 --> 00:28:00,550 Model problem is fit by a straight line. 428 00:28:00,550 --> 00:28:04,040 So say C+Dt. 429 00:28:04,040 --> 00:28:07,650 430 00:28:07,650 --> 00:28:09,750 Shall I use that as the--? 431 00:28:09,750 --> 00:28:11,130 C+Dx. 432 00:28:11,130 --> 00:28:13,030 It's got two unknowns. 433 00:28:13,030 --> 00:28:15,310 So n is two. 434 00:28:15,310 --> 00:28:21,100 But m is big. 435 00:28:21,100 --> 00:28:22,330 What do I have? 436 00:28:22,330 --> 00:28:23,520 Let me draw the picture. 437 00:28:23,520 --> 00:28:26,030 You've seen this often. 438 00:28:26,030 --> 00:28:29,870 This is the t-direction, this is the f. 439 00:28:29,870 --> 00:28:31,140 These are the measurements. 440 00:28:31,140 --> 00:28:39,170 So I measure at time t=0, let's say. 441 00:28:39,170 --> 00:28:45,300 I measure my position. 442 00:28:45,300 --> 00:28:47,140 That would be f_1. 443 00:28:47,140 --> 00:28:51,130 At time t=1 I've moved somewhere, 444 00:28:51,130 --> 00:28:52,290 I've measured where I am. 445 00:28:52,290 --> 00:28:54,740 I'm tracking a satellite, let's say. 446 00:28:54,740 --> 00:28:56,700 So I'm tracking this satellite. 447 00:28:56,700 --> 00:28:59,410 Well the times don't have to be equally spaced. 448 00:28:59,410 --> 00:29:03,570 I'll take the next time to be three. 449 00:29:03,570 --> 00:29:10,590 Let's say the engine is off, this satellite. 450 00:29:10,590 --> 00:29:13,630 If the measurements were perfect-- 451 00:29:13,630 --> 00:29:16,520 Does that look too perfect to you? 452 00:29:16,520 --> 00:29:18,870 It's almost on a straight line, isn't it? 453 00:29:18,870 --> 00:29:23,270 Of course my point is that measurements, well I mean, 454 00:29:23,270 --> 00:29:25,560 of course in reality measurements 455 00:29:25,560 --> 00:29:27,190 would be close to a straight line. 456 00:29:27,190 --> 00:29:31,380 I'm going to have to draw, in order for you to see anything, 457 00:29:31,380 --> 00:29:35,990 I'm going to have to draw a really-- 458 00:29:35,990 --> 00:29:39,520 Suppose f_1 is one. 459 00:29:39,520 --> 00:29:46,410 Suppose it starts at position one and suppose f_2 is two 460 00:29:46,410 --> 00:29:53,200 and this guy will be-- Where do you want me to take it? 461 00:29:53,200 --> 00:29:58,160 Let's see, if it was linear, what would be the--? 462 00:29:58,160 --> 00:30:01,240 It would be four, right? 463 00:30:01,240 --> 00:30:03,640 So can I take a different number? 464 00:30:03,640 --> 00:30:04,710 Three. 465 00:30:04,710 --> 00:30:05,600 Is three okay? 466 00:30:05,600 --> 00:30:09,180 Because five I haven't got space for. 467 00:30:09,180 --> 00:30:12,810 And you don't want to see pi or some dumb thing or e. 468 00:30:12,810 --> 00:30:20,960 So let me take three. 469 00:30:20,960 --> 00:30:23,370 I want to fit that data which is, 470 00:30:23,370 --> 00:30:25,520 we're saying, close to a straight line, 471 00:30:25,520 --> 00:30:28,360 I want to fit it by the best straight line. 472 00:30:28,360 --> 00:30:31,350 So the best straight line would go probably, 473 00:30:31,350 --> 00:30:34,010 I don't know what your eyes suggest 474 00:30:34,010 --> 00:30:36,120 for the best straight line through three points. 475 00:30:36,120 --> 00:30:40,290 Do you see I've got three equations, two unknowns? 476 00:30:40,290 --> 00:30:41,730 That's the first point to see. 477 00:30:41,730 --> 00:30:46,630 Somehow I'm trying to fit three things with only two 478 00:30:46,630 --> 00:30:50,440 degrees of freedom and I'm not going to succeed, usually. 479 00:30:50,440 --> 00:30:54,030 But I'm going to do my best and probably the best line 480 00:30:54,030 --> 00:30:59,420 goes sort of, it won't exactly go through any of them. 481 00:30:59,420 --> 00:31:03,970 So I'm doing the best least squares approximation. 482 00:31:03,970 --> 00:31:05,210 And what does that mean? 483 00:31:05,210 --> 00:31:08,160 Well, what would the three equations be? 484 00:31:08,160 --> 00:31:12,990 What does my linear equations, my unsolvable ones, 485 00:31:12,990 --> 00:31:15,460 say that at time zero? 486 00:31:15,460 --> 00:31:19,730 So at time zero, at time one and at time three, 487 00:31:19,730 --> 00:31:24,200 at each of those times I have an equation C+Dt should agree 488 00:31:24,200 --> 00:31:31,070 with-- So that C plus D times zero should match f_1. 489 00:31:31,070 --> 00:31:40,140 At t=1 my line will be C+D*1, it should equal f_2. 490 00:31:40,140 --> 00:31:46,570 And at t=3, the height of the line will be C+3D, 491 00:31:46,570 --> 00:31:50,950 and I would like it to go through that height f_3. 492 00:31:50,950 --> 00:31:53,780 493 00:31:53,780 --> 00:31:58,780 But I'm not going to be able-- If there 494 00:31:58,780 --> 00:32:02,110 was noise in the measurements that system, that's 495 00:32:02,110 --> 00:32:06,140 my unsolvable system. 496 00:32:06,140 --> 00:32:08,660 What's the matrix? 497 00:32:08,660 --> 00:32:11,660 I want to write three equations and you're 498 00:32:11,660 --> 00:32:19,700 getting good at seeing three equations like so. 499 00:32:19,700 --> 00:32:24,810 So I've a 3 by 2 matrix. 500 00:32:24,810 --> 00:32:28,710 And my unknown u is [C, D]. 501 00:32:28,710 --> 00:32:30,140 Those are my unknowns. 502 00:32:30,140 --> 00:32:33,190 And my right-hand sides are these heights, well, 503 00:32:33,190 --> 00:32:40,720 I decided on particular numbers, one, two and three. 504 00:32:40,720 --> 00:32:42,420 One, two, three. 505 00:32:42,420 --> 00:32:44,480 And what's the matrix? 506 00:32:44,480 --> 00:32:47,160 What's the matrix A that you read off when you 507 00:32:47,160 --> 00:32:50,040 see that system of equations? 508 00:32:50,040 --> 00:32:52,750 The first column of the matrix is? 509 00:32:52,750 --> 00:32:55,720 All ones because that's multiplying the C's. 510 00:32:55,720 --> 00:32:59,980 And the second column of the matrix is the times. 511 00:32:59,980 --> 00:33:02,380 Zero, one, three, is that right? 512 00:33:02,380 --> 00:33:08,070 That multiply the D. So this is the same as that. 513 00:33:08,070 --> 00:33:12,840 So here I'm in my set-up. 514 00:33:12,840 --> 00:33:24,510 I'll erase m equal big because m was only three, not that big. 515 00:33:24,510 --> 00:33:30,060 What's the best answer? 516 00:33:30,060 --> 00:33:32,010 What's the best u hat? 517 00:33:32,010 --> 00:33:38,770 The best u hat will now be C hat and D hat. 518 00:33:38,770 --> 00:33:45,730 The best I can do. 519 00:33:45,730 --> 00:33:50,920 I need some idea of what does best mean. 520 00:33:50,920 --> 00:33:55,520 And there is not a single possible meaning. 521 00:33:55,520 --> 00:33:59,670 There are many possible ways I could say the best line. 522 00:33:59,670 --> 00:34:05,830 One way would be to make the, well, 523 00:34:05,830 --> 00:34:09,110 what could a best line be? 524 00:34:09,110 --> 00:34:11,510 I'm going to have three errors here, right? 525 00:34:11,510 --> 00:34:14,190 That did not go right through the point. 526 00:34:14,190 --> 00:34:15,970 This did not go right through the point. 527 00:34:15,970 --> 00:34:17,470 They came pretty close. 528 00:34:17,470 --> 00:34:23,280 I've got three small errors. e_1, e_2, e_3. 529 00:34:23,280 --> 00:34:26,510 Those are the errors in my equations. 530 00:34:26,510 --> 00:34:31,690 So I will get equality when I add in the e_1, e_2, e_3, 531 00:34:31,690 --> 00:34:39,680 the little bits that will bring it onto the line. 532 00:34:39,680 --> 00:34:41,820 One idea. 533 00:34:41,820 --> 00:34:46,880 Make the largest error as small as I can. 534 00:34:46,880 --> 00:34:54,730 Minimize the maximum of the e's Try to balance them so no e, 535 00:34:54,730 --> 00:34:58,790 no error is bigger than the others. 536 00:34:58,790 --> 00:35:00,630 Look for that balance. 537 00:35:00,630 --> 00:35:04,770 That's a reasonable idea. 538 00:35:04,770 --> 00:35:07,360 But it's not the least squares idea. 539 00:35:07,360 --> 00:35:11,290 So what's the least squares idea? 540 00:35:11,290 --> 00:35:14,680 The least squares idea makes the sum 541 00:35:14,680 --> 00:35:17,640 of the squares of the errors as small as possible. 542 00:35:17,640 --> 00:35:29,075 So the least squares idea will be to minimize the sum 543 00:35:29,075 --> 00:35:33,550 of the squares of the errors. e_1 squared plus... 544 00:35:33,550 --> 00:35:35,940 e_m squared. 545 00:35:35,940 --> 00:35:38,610 It would be just e_1 squared plus e_2 squared 546 00:35:38,610 --> 00:35:42,440 plus e_3 squared. 547 00:35:42,440 --> 00:35:44,710 What is this? 548 00:35:44,710 --> 00:35:48,240 Let me began to write this in matrix-- 549 00:35:48,240 --> 00:35:51,240 I want to bring in the matrix here. 550 00:35:51,240 --> 00:35:56,210 This is the error. 551 00:35:56,210 --> 00:36:00,670 The error is the difference between the measurements 552 00:36:00,670 --> 00:36:02,650 and Au. 553 00:36:02,650 --> 00:36:06,820 So that's what I'm trying to make small. 554 00:36:06,820 --> 00:36:09,680 I'd love to make it zero but I can't. 555 00:36:09,680 --> 00:36:12,080 I've got more equations than unknowns. 556 00:36:12,080 --> 00:36:15,410 There's no two unknowns that will make all three errors 557 00:36:15,410 --> 00:36:16,680 zero. 558 00:36:16,680 --> 00:36:18,950 So I want to make the errors small. 559 00:36:18,950 --> 00:36:23,620 And this is the length of e squared. 560 00:36:23,620 --> 00:36:30,110 The length in this sum of squares method. 561 00:36:30,110 --> 00:36:37,900 It's a pretty good measure of the error. 562 00:36:37,900 --> 00:36:41,800 Gauss was the first to apply least squares. 563 00:36:41,800 --> 00:36:45,740 What I'm going to do today is Gauss. 564 00:36:45,740 --> 00:36:48,620 Who was, by the way, the greatest mathematician 565 00:36:48,620 --> 00:36:52,030 of all time. 566 00:36:52,030 --> 00:36:54,910 And here, he was doing astronomy actually. 567 00:36:54,910 --> 00:37:03,750 And writing in Latin. 568 00:37:03,750 --> 00:37:08,750 The message got out somehow. 569 00:37:08,750 --> 00:37:12,420 So his idea was sum of squares. 570 00:37:12,420 --> 00:37:18,510 So this e is the distance between f and Au. 571 00:37:18,510 --> 00:37:22,340 I have to begin to write. 572 00:37:22,340 --> 00:37:23,960 I have to write some things. 573 00:37:23,960 --> 00:37:28,740 I can write some things out in detail, but then I also, 574 00:37:28,740 --> 00:37:30,900 at the same time, have to carry along 575 00:37:30,900 --> 00:37:34,410 the way I would look at it for any matrix A 576 00:37:34,410 --> 00:37:36,210 and any right-hand side f. 577 00:37:36,210 --> 00:37:39,370 So do you see that this is the error? 578 00:37:39,370 --> 00:37:44,040 The meaning of this double bars squared is exactly that. 579 00:37:44,040 --> 00:37:46,670 That it's the sum of the squares of the components. 580 00:37:46,670 --> 00:37:54,190 So that's where the word least squares come in. 581 00:37:54,190 --> 00:38:02,920 Can I just say what's better about least squares 582 00:38:02,920 --> 00:38:06,660 and what's maybe a drawback. 583 00:38:06,660 --> 00:38:09,510 So actually this next sentence is 584 00:38:09,510 --> 00:38:12,150 pretty important in practice. 585 00:38:12,150 --> 00:38:14,340 What's better about least squares, what's 586 00:38:14,340 --> 00:38:19,190 really nice about least squares is well, 587 00:38:19,190 --> 00:38:23,960 for one thing, the equations we get for the best [C, D] 588 00:38:23,960 --> 00:38:29,620 will be linear, will be linear equations. 589 00:38:29,620 --> 00:38:32,810 You may say, not surprising, I started out 590 00:38:32,810 --> 00:38:35,955 with a linear system and I'm going 591 00:38:35,955 --> 00:38:39,040 to end up with a linear system. 592 00:38:39,040 --> 00:38:45,030 Actually I prefer to use-- Well, might be too late. 593 00:38:45,030 --> 00:38:47,090 Next lecture I'm going to put b in there 594 00:38:47,090 --> 00:38:49,340 for the right-hand side. 595 00:38:49,340 --> 00:38:54,000 But I'll leave it with that for now. 596 00:38:54,000 --> 00:38:57,420 So good point is we'll get linear equations. 597 00:38:57,420 --> 00:39:04,810 The not so good point in some applications 598 00:39:04,810 --> 00:39:11,380 is when I look at the squares of errors, 599 00:39:11,380 --> 00:39:18,290 well big errors, outliers, really bad measurements 600 00:39:18,290 --> 00:39:20,650 have a big influence on the answer 601 00:39:20,650 --> 00:39:22,590 because of getting squared. 602 00:39:22,590 --> 00:39:30,620 So if I have ten readings that are very accurate but then 603 00:39:30,620 --> 00:39:35,660 in an eleventh reading that is way off and I don't know it 604 00:39:35,660 --> 00:39:38,890 and if I don't realize that that's way off, then 605 00:39:38,890 --> 00:39:43,770 that eleventh error will-- It's like having 606 00:39:43,770 --> 00:39:45,550 a whole lot of points close to a line 607 00:39:45,550 --> 00:39:48,040 and then another point way off. 608 00:39:48,040 --> 00:39:55,250 That will have a significant effect on the best line. 609 00:39:55,250 --> 00:40:02,240 So you might say too great an effect. 610 00:40:02,240 --> 00:40:05,570 Depends on the application. 611 00:40:05,570 --> 00:40:10,900 I just had to say before starting on least squares, 612 00:40:10,900 --> 00:40:16,790 as always, there are advantages and disadvantages 613 00:40:16,790 --> 00:40:18,880 but the advantages are very, very great. 614 00:40:18,880 --> 00:40:25,780 So it's an important idea here, least squares. 615 00:40:25,780 --> 00:40:31,650 I'm ready now to ask for the equation for u hat. 616 00:40:31,650 --> 00:40:38,700 So the equation for u hat is the u that minimizes here. 617 00:40:38,700 --> 00:40:45,880 So we have touched on minimizing quadratics. 618 00:40:45,880 --> 00:40:54,060 This is squares. 619 00:40:54,060 --> 00:40:59,670 I could expand that out as f minus A u transpose times 620 00:40:59,670 --> 00:41:06,700 f minus Au just to see another way to write it. 621 00:41:06,700 --> 00:41:11,050 The length squared of a vector is always the transpose-- 622 00:41:11,050 --> 00:41:14,030 Its inner product with itself. 623 00:41:14,030 --> 00:41:19,090 And I could split this out into all these different terms. 624 00:41:19,090 --> 00:41:22,040 I would have then, some quadratic expression 625 00:41:22,040 --> 00:41:24,480 to minimize. 626 00:41:24,480 --> 00:41:28,940 In other words, let me jump to the answer. 627 00:41:28,940 --> 00:41:33,540 Let me jump to the equation for the best u. 628 00:41:33,540 --> 00:41:38,570 And then come back to see why. 629 00:41:38,570 --> 00:41:41,990 Because you must see what that equation is. 630 00:41:41,990 --> 00:41:44,720 It's the fundamental equation of, this 631 00:41:44,720 --> 00:41:49,280 might be called linear regression, fitting data. 632 00:41:49,280 --> 00:41:51,570 You're just constantly doing it. 633 00:41:51,570 --> 00:41:56,280 So what is the equation for the best u hat? 634 00:41:56,280 --> 00:41:58,480 Can I put it here? 635 00:41:58,480 --> 00:42:01,400 This'll be equation that we get to. 636 00:42:01,400 --> 00:42:04,960 It'll be A transpose A. You're not 637 00:42:04,960 --> 00:42:07,040 surprised to see A transpose A up here. 638 00:42:07,040 --> 00:42:09,920 First of all because this is 18.085 639 00:42:09,920 --> 00:42:14,050 and also because this A is rectangular 640 00:42:14,050 --> 00:42:16,760 and when you have rectangular matrices, 641 00:42:16,760 --> 00:42:19,480 sooner or later A transpose A comes up. 642 00:42:19,480 --> 00:42:21,670 So that's the matrix. 643 00:42:21,670 --> 00:42:26,860 And then the right-hand side is A transpose f. 644 00:42:26,860 --> 00:42:30,490 So that's the key equation for least squares. 645 00:42:30,490 --> 00:42:33,870 That's the central equation of least squares. 646 00:42:33,870 --> 00:42:37,480 And let's just see what it looks like. 647 00:42:37,480 --> 00:42:40,350 You could say the way I arrived at it, 648 00:42:40,350 --> 00:42:43,620 I mean the short way is this is an equation that I 649 00:42:43,620 --> 00:42:45,810 can't satisfy. 650 00:42:45,810 --> 00:42:55,570 I multiply both sides by A transpose 651 00:42:55,570 --> 00:43:00,050 and now this is the equation for u hat. 652 00:43:00,050 --> 00:43:05,150 And what that did was kind of average out the m equations. 653 00:43:05,150 --> 00:43:07,920 Because how many equations do I now have? 654 00:43:07,920 --> 00:43:09,700 A as m by n. 655 00:43:09,700 --> 00:43:13,420 What's the shape of A transpose A? 656 00:43:13,420 --> 00:43:15,650 Everybody's on top of that, right? 657 00:43:15,650 --> 00:43:21,350 The shape of A transpose A is square, n by n. 658 00:43:21,350 --> 00:43:28,960 Because A transpose is n by m. n by m times m by n 659 00:43:28,960 --> 00:43:30,610 leaves us an n by n. 660 00:43:30,610 --> 00:43:34,060 So we've averaged the m equations 661 00:43:34,060 --> 00:43:38,900 that were too many to get n equations. 662 00:43:38,900 --> 00:43:47,410 And of course this is what it should be, n by m, m by one. 663 00:43:47,410 --> 00:43:48,520 So it's n by one. 664 00:43:48,520 --> 00:43:51,500 It's a good right-hand side. 665 00:43:51,500 --> 00:44:02,030 That's the equation of least squares. 666 00:44:02,030 --> 00:44:07,300 That's the equation I want to explain, understand and solve. 667 00:44:07,300 --> 00:44:10,520 Actually why don't we solve it for this particular problem 668 00:44:10,520 --> 00:44:17,210 just to see the whole thing for this example. 669 00:44:17,210 --> 00:44:22,060 Just to do it. 670 00:44:22,060 --> 00:44:25,980 So there is A. Here is u. 671 00:44:25,980 --> 00:44:32,900 And here is f. what shall I call these? 672 00:44:32,900 --> 00:44:36,310 They're mostly called the normal equations. 673 00:44:36,310 --> 00:44:42,110 That's one possible word for the key equation of least squares, 674 00:44:42,110 --> 00:44:44,610 the normal equations. 675 00:44:44,610 --> 00:44:47,990 Can you tell me these matrices A transpose A? 676 00:44:47,990 --> 00:44:50,320 And u hat I know. 677 00:44:50,320 --> 00:44:53,460 That'll be the best C and the best D. 678 00:44:53,460 --> 00:44:59,390 And over here can you compute A transpose f? 679 00:44:59,390 --> 00:45:01,730 If I write A transpose above it, will that 680 00:45:01,730 --> 00:45:05,500 help you do these multiplications? 681 00:45:05,500 --> 00:45:06,580 Let me just do. 682 00:45:06,580 --> 00:45:12,650 So there was the matrix A. Let me write A transpose above it. 683 00:45:12,650 --> 00:45:22,900 So it has a row of ones and then a row of times. 684 00:45:22,900 --> 00:45:25,380 So what shape is the matrix? 685 00:45:25,380 --> 00:45:30,470 The A transpose A matrix. 686 00:45:30,470 --> 00:45:33,480 It's going to be that A transpose times that A. 687 00:45:33,480 --> 00:45:37,550 The size will be two by two, right? 688 00:45:37,550 --> 00:45:39,890 Two by three times three by two, it's 689 00:45:39,890 --> 00:45:43,530 averaging out to get me a two by two matrix. 690 00:45:43,530 --> 00:45:46,130 What's the first entry of this? 691 00:45:46,130 --> 00:45:50,890 Can you do A transpose times A just so we see this matrix. 692 00:45:50,890 --> 00:45:52,220 Three. 693 00:45:52,220 --> 00:45:54,790 And off the diagonal? 694 00:45:54,790 --> 00:45:55,630 Four. 695 00:45:55,630 --> 00:45:56,900 And here? 696 00:45:56,900 --> 00:45:58,490 And you knew it would be symmetric. 697 00:45:58,490 --> 00:45:59,790 And here? 698 00:45:59,790 --> 00:46:03,660 Ten. 699 00:46:03,660 --> 00:46:06,700 And tell me A transpose f while we're at it. 700 00:46:06,700 --> 00:46:08,470 So that's just a vector. 701 00:46:08,470 --> 00:46:11,340 If you multiply that by the right-hand sides, 702 00:46:11,340 --> 00:46:12,650 looks like a six. 703 00:46:12,650 --> 00:46:18,640 Is that right? 704 00:46:18,640 --> 00:46:21,210 11, maybe. 705 00:46:21,210 --> 00:46:24,020 Is that right? 706 00:46:24,020 --> 00:46:29,730 Two, nine making 11, yeah. 707 00:46:29,730 --> 00:46:31,110 So those are the numbers. 708 00:46:31,110 --> 00:46:35,710 I can't write those numbers, write A transpose A, 709 00:46:35,710 --> 00:46:39,390 without asking you to tell me one more time what kind 710 00:46:39,390 --> 00:46:42,640 of a matrix have I got here? 711 00:46:42,640 --> 00:46:44,770 It's symmetric positive definite, right? 712 00:46:44,770 --> 00:46:47,840 We know that's going to be. 713 00:46:47,840 --> 00:46:50,430 And we see it. 714 00:46:50,430 --> 00:46:54,230 Our test for positive definite might be the determinant, 715 00:46:54,230 --> 00:46:57,030 that one by one determinant is three, 716 00:46:57,030 --> 00:47:04,870 that two by two determinant is 30 minus 16, 14, positive. 717 00:47:04,870 --> 00:47:06,710 We got a good problem here. 718 00:47:06,710 --> 00:47:09,750 And I could solve for C hat and D hat. 719 00:47:09,750 --> 00:47:13,730 I see the numbers are not coming out fantastically 720 00:47:13,730 --> 00:47:18,530 but they would produce a line that would, I'm pretty sure, 721 00:47:18,530 --> 00:47:25,310 it would be, this looks optimal to me. 722 00:47:25,310 --> 00:47:29,570 If I rotated any, I'm going to make things worse. 723 00:47:29,570 --> 00:47:31,480 I think it would look like that. 724 00:47:31,480 --> 00:47:35,590 So that's the system that you end up with. 725 00:47:35,590 --> 00:47:37,790 But why? 726 00:47:37,790 --> 00:47:41,740 You really have to understand, where 727 00:47:41,740 --> 00:47:44,710 did this equation come from. 728 00:47:44,710 --> 00:47:50,010 Where did it come from? 729 00:47:50,010 --> 00:47:56,380 It's worth understanding, this least squares stuff. 730 00:47:56,380 --> 00:48:01,690 So I'm going to try to draw a picture that makes it clear 731 00:48:01,690 --> 00:48:08,190 where that equation comes from. 732 00:48:08,190 --> 00:48:11,560 So what am I doing here? 733 00:48:11,560 --> 00:48:12,060 Au=f. 734 00:48:12,060 --> 00:48:15,040 735 00:48:15,040 --> 00:48:18,130 Start there. 736 00:48:18,130 --> 00:48:21,130 And the particular A was, I'll even 737 00:48:21,130 --> 00:48:25,230 copy the A. It was 1, 1, 1; 0, 1, 3, 738 00:48:25,230 --> 00:48:30,400 multiplied u to give me f as [0, 1, 3]. 739 00:48:30,400 --> 00:48:32,290 But of course, I couldn't solve it 740 00:48:32,290 --> 00:48:35,550 because I don't have enough unknowns. 741 00:48:35,550 --> 00:48:39,740 What's the picture? 742 00:48:39,740 --> 00:48:46,540 Everybody likes to see what's happening by a picture 743 00:48:46,540 --> 00:48:49,170 as well as by algebra. 744 00:48:49,170 --> 00:48:55,750 So the picture here is I'm in three dimensions 745 00:48:55,750 --> 00:48:57,820 and I have a vector [0, 1, 3]. 746 00:48:57,820 --> 00:49:01,480 So 0 in that direction, one in that, three up. 747 00:49:01,480 --> 00:49:07,640 So somewhere there is my f. 748 00:49:07,640 --> 00:49:15,760 Now I'll put in C, D here. 749 00:49:15,760 --> 00:49:19,570 What is the equation asking me to do? 750 00:49:19,570 --> 00:49:22,630 Which actually, I won't be able to do because I can't solve it. 751 00:49:22,630 --> 00:49:35,610 But the equation, how do we see a system of linear equations? 752 00:49:35,610 --> 00:49:38,060 If I have a system of linear equations 753 00:49:38,060 --> 00:49:41,090 I'm looking for numbers C and D so 754 00:49:41,090 --> 00:49:52,230 that C times column one plus D times column two gives me that. 755 00:49:52,230 --> 00:49:55,180 That's how I think of a system of equations. 756 00:49:55,180 --> 00:50:00,990 A combination of the columns. 757 00:50:00,990 --> 00:50:04,400 Tell me, what vectors do I get if I take combinations 758 00:50:04,400 --> 00:50:05,160 of the columns. 759 00:50:05,160 --> 00:50:07,160 Well, if I took the combination C=1, 760 00:50:07,160 --> 00:50:09,860 D=0 I would just get the first column. 761 00:50:09,860 --> 00:50:14,880 So that's a candidate. [1, 1, 1], I don't know where that is. 762 00:50:14,880 --> 00:50:20,380 Wherever [1, 1, 1] might be. 763 00:50:20,380 --> 00:50:23,490 I'm not too sure where to draw [1, 1, 1]. 764 00:50:23,490 --> 00:50:27,840 I want to go one there, one there and one there. 765 00:50:27,840 --> 00:50:31,350 Damn. 766 00:50:31,350 --> 00:50:35,920 Let's define that to be the vector [1, 1, 1] right there. 767 00:50:35,920 --> 00:50:36,420 Wait. 768 00:50:36,420 --> 00:50:40,000 You let me put that up there and I didn't mean to, 769 00:50:40,000 --> 00:50:48,770 right? f should be zero, what, sorry? f was [1, 2, 3], yeah. 770 00:50:48,770 --> 00:50:49,920 Damn! 771 00:50:49,920 --> 00:50:52,010 Don't let me make mistakes. 772 00:50:52,010 --> 00:50:57,780 These mistakes are permanent if you let them slide by. 773 00:50:57,780 --> 00:51:04,660 That's it, same point. 774 00:51:04,660 --> 00:51:06,660 I didn't have the point right in the first place 775 00:51:06,660 --> 00:51:08,620 so now it's just perfect. 776 00:51:08,620 --> 00:51:15,920 There it is. 777 00:51:15,920 --> 00:51:18,250 Before of course, if I had [0, 1, 3], 778 00:51:18,250 --> 00:51:21,810 I could've solved the equation. 779 00:51:21,810 --> 00:51:26,060 But with [1, 2, 3] I can't. 780 00:51:26,060 --> 00:51:27,250 Here's the situation. 781 00:51:27,250 --> 00:51:31,270 This vector is not a combination of those two. 782 00:51:31,270 --> 00:51:36,320 Because the combinations of two vectors, what's the picture? 783 00:51:36,320 --> 00:51:40,290 If I try to draw, if I look at all combinations of two 784 00:51:40,290 --> 00:51:44,940 vectors, [1, 1, 1] which is that vector, [0, 1, 3] 785 00:51:44,940 --> 00:51:49,880 which is maybe this vector, let's just say. 786 00:51:49,880 --> 00:51:53,010 If I take the combinations of these two column vectors, 787 00:51:53,010 --> 00:51:55,490 what do I get? 788 00:51:55,490 --> 00:51:57,760 Now this is for everybody to know. 789 00:51:57,760 --> 00:52:00,640 If I take the combinations of two vectors 790 00:52:00,640 --> 00:52:04,170 here in three-dimensional space I get a plane. 791 00:52:04,170 --> 00:52:08,990 I get the plane that contains those vectors. 792 00:52:08,990 --> 00:52:14,450 So this I could call the column plane or the column space. 793 00:52:14,450 --> 00:52:21,650 This is all combinations of the columns. 794 00:52:21,650 --> 00:52:27,890 That's the same thing as saying this is all the f's 795 00:52:27,890 --> 00:52:38,650 that have exact solutions. 796 00:52:38,650 --> 00:52:41,290 So let's just see this picture. 797 00:52:41,290 --> 00:52:45,120 This particular right-hand side is not in the plane, right? 798 00:52:45,120 --> 00:52:46,630 That's my problem. 799 00:52:46,630 --> 00:52:51,100 This particular vector f points out of the plane. 800 00:52:51,100 --> 00:52:53,740 But if I change it a little, like 801 00:52:53,740 --> 00:52:56,350 if I change it to [1, 2, 4]. 802 00:52:56,350 --> 00:52:57,790 Do you see that that would--? 803 00:52:57,790 --> 00:53:01,110 What's different now that I've changed it to [1, 2, 3] 804 00:53:01,110 --> 00:53:05,220 for a moment? 805 00:53:05,220 --> 00:53:07,820 What's different about [1, 2, 4]? 806 00:53:07,820 --> 00:53:11,890 Where is [1, 2, 4] in my picture? 807 00:53:11,890 --> 00:53:14,850 Do you see what's great about [1, 2, 4]? 808 00:53:14,850 --> 00:53:16,380 It is a combination. 809 00:53:16,380 --> 00:53:20,520 Right? [1, 2, 4] is a combination, with C=1, D=1. 810 00:53:20,520 --> 00:53:24,560 It would satisfy the equation. 811 00:53:24,560 --> 00:53:28,750 So where is [1, 2, 4] in my picture? 812 00:53:28,750 --> 00:53:31,410 It's in the plane. 813 00:53:31,410 --> 00:53:38,770 The plane are the heights that do lie on a straight line. 814 00:53:38,770 --> 00:53:42,660 So the plane are all the ones that I can get exactly. 815 00:53:42,660 --> 00:53:49,440 But this vector, these observations, [1, 2, 3], 816 00:53:49,440 --> 00:53:50,660 I couldn't get exactly. 817 00:53:50,660 --> 00:53:54,440 So let me-- In 30 seconds or less, 818 00:53:54,440 --> 00:53:57,370 let me tell you the best thing to do. 819 00:53:57,370 --> 00:53:59,620 Or let you tell me the best thing to do. 820 00:53:59,620 --> 00:54:02,660 I have a right-hand side that's not in the plane. 821 00:54:02,660 --> 00:54:05,160 I can get my straight lines correspond 822 00:54:05,160 --> 00:54:07,610 to vectors, right-hand sides that are in the plane. 823 00:54:07,610 --> 00:54:10,990 So what do I do? 824 00:54:10,990 --> 00:54:12,060 I project. 825 00:54:12,060 --> 00:54:15,830 I take the nearest point that is the plane 826 00:54:15,830 --> 00:54:18,830 as my right-hand side. 827 00:54:18,830 --> 00:54:21,040 I project down. 828 00:54:21,040 --> 00:54:23,015 And it's that projection that's going 829 00:54:23,015 --> 00:54:26,340 to lead us to the equation that I'm 830 00:54:26,340 --> 00:54:32,250 shooting for, A transpose A u hat equals A transpose f. 831 00:54:32,250 --> 00:54:38,760 This comes from projecting f down into the plane 832 00:54:38,760 --> 00:54:43,320 where straight lines do work exactly. 833 00:54:43,320 --> 00:54:47,570 So there's an error here that I can't deal with. 834 00:54:47,570 --> 00:54:50,170 And there's a part here, the projection part, 835 00:54:50,170 --> 00:54:52,240 that I can deal with. 836 00:54:52,240 --> 00:54:56,300 This is important and it's fun and we'll 837 00:54:56,300 --> 00:54:57,480 come back to it Monday. 838 00:54:57,480 --> 00:54:59,260 Thanks for patience.