The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR STRANG: OK. So this is Lecture 14, and it's also the lecture before the exam on Tuesday evening. I thought I would just go ahead and tell you what questions there are, so you could see. I haven't filled in all the numbers, but this will tell you -- it's a way of reviewing, of course, to sort of see the things that we've done. Which is quite a bit, really. And also, of course, the topics that will not be on the exam. You can see that they're not on the exam: for example, topics from 1.7 on the condition number, or 2.3 on Gram-Schmidt, which I can speak about a little bit. The only thing I didn't fill in is the end time there. I don't usually ask four questions, three is more typical. So it's a little bit longer, but it's not a difficult exam. And I never want time to be the essential ingredient. So 7:30 to 9 is the nominal time, but it would not be a surprise if you're still going on after 9 o'clock a little bit. And I'll try not to, like, tear the paper away from you. So just figure that you can move along at a reasonable speed, and you can bring papers, the book, anything with you.

And I'm open for questions now about the exam. You know where 54-100 is; it's a classroom, unfortunately. Sometimes we have the top of Walker, where you have a whole table to work on. This 54-100, in the tallest building of MIT out in the middle of the open space out there, is a large classroom. So if you kind of spread out your papers we'll be OK. It's a pretty big room. Questions. And, of course, don't forget this afternoon in 1-190 rather than here. So, the review is in 1-190, 4 to 5 today.
But if you're not free at that time, don't feel that you've missed anything essential. I thought this would be the right way to tell you what the exam is and guide your preparation for it. Questions. Yes, thanks.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: So these are four different questions. So this would be a question about getting to this matrix and what it's about, A transpose C A.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: OK, I don't use beam bending for that. I'm thinking of the elastic bar. Yeah, the stretching equation, yeah. So the stretching equation, the first one we ever saw, was just u'', or rather -u'', equal to 1. And now that allows a C to sneak in, and that allows a matrix C to sneak in there, but I think you'll see what they should look like, yeah.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: Yeah, yeah. So this is Section 2.2, directly out of the book. M will be the mass matrix, K will be the stiffness matrix, yep.

AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: OK. So, a good point to raise. In the very first section, 1.1, we gave the name K to a very special matrix, a specific one. But then later, now, I'm using the same letter K for matrices of that type. That was the most special, simplest, completely understood case, but now I'll use K for stiffness matrices, and when we're doing finite elements in a few weeks it'll again be K. Yeah, same name, right. So here you'll want to create K and M and know how to deal with them; this was our only time-dependent thing. So I guess what you're seeing here is not only what time-dependent equation will be there, but also that I'm not going in detail into those trapezoidal difference methods. Important as they are, we can't do everything on the quiz, so I'm really focusing on things that are central to our course. Good. Other questions. I'm very open for more questions this afternoon. Yep.
AUDIENCE: [INAUDIBLE]

PROFESSOR STRANG: Others. OK. So, let me -- I don't want to go on and do new material, because we're focused on these things. And this course, the name of this course, is computational science and engineering. And by the way, I just had an email last week from the Dean of Engineering -- a bunch of us did -- to say that the School of Engineering is establishing a Center for Computational Engineering, CCE. Several faculty members there, and others like myself in the School of Science and the Sloan School, are involved with computation, and this new center is going to organize that. So it's a good development. And it's headed by people in Course 2 and Course 16.

So, if we're talking about computations, I do have to say something about how you would actually do the computations, and what the issues about accuracy are. Speed and accuracy are what you're aiming for in the computations. Of course, the first step is to know what problem it is you want to compute. What do you want to solve, what's the equation? That's what we've been doing all along. Now, I just take a little time-out to say, suppose I have the equation. When I write K, I'm thinking of a symmetric, positive definite, or at least semi-definite, matrix. When I write A, I'm thinking of any general, usually tall, thin matrix. Rectangular. So I would need least squares for this guy, where straightforward elimination would work for that one. And so my first question is -- let's take this, so these are two topics for today. This one would come out of 1.7, that discussion of the condition number. This one would come out of 2.3, the least squares section. OK. And I'm thinking that the computational questions emerge when the systems are large. So I'm thinking thousands of unknowns here. Thousands of equations, at the least. OK.
So the question is, I do Gaussian elimination here, ordinary elimination. Backslash. And how accurate is the answer? And how do you understand that? I mean, the accuracy of the answer is going to depend on two things, and it's good to separate them. One is the method you use, like elimination, with whatever adjustments you might make. Pivoting. Exchanging rows to get larger pivots. All that is in the algorithm, in the code. And then the second, very important aspect is the matrix K itself. Is this a tough problem to solve, whatever method you're using, or is it a simple problem? Is the problem ill conditioned, meaning K would be nearly singular, in which case we would know we had a tougher problem to solve, whatever the method? Or is K quite well conditioned? I mean, the best conditioning would be when all the columns are unit vectors, all orthogonal to each other. Yeah, that would be the best conditioning of all. The condition number would be one if this K -- not too likely -- were a matrix that I would call Q. Q, which is going to show up over here in the second problem, stands for a matrix which has orthonormal columns. So, you remember what orthonormal means. Ortho is telling us perpendicular, that's the key point. Normal is telling us that they're unit vectors, lengths one. So that's the Q, and then you might ask what's the R. And the R is upper triangular. OK.

So what I said about this problem -- that there's the method you use, and also the sensitivity, the difficulty of the problem in the first place -- applies just the same here. There's the method you use: do you use A transpose A to find u hat? Now we're looking for u hat, of course, the best solution. Do I use A transpose A? Well, you would say of course, what else.
That equation, that least squares equation, has A transpose A u hat equal to A transpose b -- what's the choice? But if you're interested in high accuracy, and stability, numerical stability, maybe you don't go to A transpose A. Going to A transpose A kind of squares the condition number. You get an A transpose A, that'll be our K, but its condition number will somehow be squared, and if the problem is nice, you're OK with that. But if the problem is delicate? Now, what does delicate mean for Au=b? I'm kind of giving you an overview of the two problems before I start on this one, and then that one.

So with this one the problem was, is the matrix nearly singular. What does that mean? What does MATLAB tell you about the matrix? And that is measured by the condition number. The issue here is, when would this be a numerically difficult, sensitive problem? Well, when the columns of A are not orthonormal. If they are, then you're golden. If the columns of A are orthonormal, then you're all set. So what's the opposite? Well, the extreme opposite would be when the columns of A are dependent. If the columns of A are linearly dependent, and some column is a combination of other columns, you're in trouble right away. So that's like big trouble, that's like K being singular. Those are the extreme cases: K singular here, dependent columns here. Not full rank. So, again, we're supposing we're not facing disaster. Just near disaster. So we want to know, is K nearly singular, and how to measure that; and we want to know what to do when the columns of A are independent, but maybe not very. And that would show up in a large condition number for A transpose A. And this happens all the time; if you don't set up your problem well, your experimental problem, you can easily get matrices A whose columns are not very independent. Measured by A transpose A being close to singular.
Right, everybody here's got that idea. If the columns of A are independent, A transpose A is non-singular. In fact, positive definite. Now we're talking about when we have that property, but the columns of A are not very independent and the matrix A transpose A is not very invertible. OK, so that's what the two things are.

And then, say, on this one, what's the good thing to do? The good thing to do is to call the qr code, which gets its name because it takes the matrix A and factors it. Of course, we all know that lu is the code here; it factors K. And qr is the code that factors A into a very good guy, an optimal Q -- it couldn't be beat -- and an R that's upper triangular, and therefore in the simplest form, so you see exactly what you're dealing with.

Let me continue this least squares idea, because Q and R are probably not so familiar. Maybe the name Gram-Schmidt is familiar? How many have seen Gram-Schmidt? The Gram-Schmidt idea I'll describe quickly, but do those names mean anything to you? Yes, for quite a few. But not all. OK. And can I just say, Gram-Schmidt is kind of our name for getting these two factors. And you'll see why it's very cool to have, why this is a good first step. It costs a little to take that step, but if you're interested in safety, take it. It might cost twice as much as solving the A transpose A equation, so you double the cost by going this safer route. And double is not a big deal, usually. OK.

So, I was going to say that Gram-Schmidt, that's the name everybody uses. But actually their method is no longer the winner. In Section 2.3 there is, and I'll try to describe, a slightly better method than the Gram-Schmidt idea to arrive at Q and R. But let's suppose you got to Q and R. Then, what would be the least squares equation? A transpose A u hat is A transpose b, right?
That's the equation everybody knows. But now if we have A factored into Q times R, let me see how that simplifies. So now A is QR, and A transpose, of course, is R transpose Q transpose. So the equation becomes R transpose Q transpose Q R u hat, equal to R transpose Q transpose b. Same equation; I'm just supposing that I've got A into this nice form, where I've taken these columns that possibly lined up too close to each other -- like, you know, angles of one degree -- and I've got better angles. These columns of A are too close, so I spread them out to columns of Q that are at 90 degrees. Orthogonal columns.

Now, what's the deal with orthogonal columns? Let me just remember the main point about Q. It has orthonormal columns, right, and I'll call those q's, q_1 to q_n. OK. And the good deal is, what happens when I do Q transpose Q? So I have q_1 transpose, these are now rows, down to q_n transpose, times the columns q_1 to q_n. And what do I get when I multiply those matrices, Q transpose times Q? I get I. q_1 transpose q_1, that's the length of q_1 squared, is one, and q_1 is orthogonal to all the others. And then q_2, you see, I get the I. q_3 -- I get the n by n identity matrix. Q transpose Q is I. That's the beautiful fact, just remember that. And use it right away. You see where it's used: Q transpose Q is I in the middle of that. So I can just delete that; I just have R transpose R u hat equal to R transpose Q transpose b, and I can even simplify this further. What can I do now? I have an R transpose on both sides, so what am I left with? I'll multiply both sides by R transpose inverse, and that will lead me to R u hat equals, knocking out the R transpose inverse on both sides, Q transpose b. Well, our least squares equation has become completely easy to solve. We've got a triangular matrix here, so it's just back substitution. It's just back substitution now, with a Q transpose b over there.
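Here is a minimal sketch of those two routes in MATLAB, since that's the system the lecture mentions. The matrix A and right side b below are made up for illustration, and qr(A,0) is the economy-size factorization.

```matlab
% Two routes to the least squares solution u_hat (illustrative data).
m = 100; n = 5;
A = rand(m, n);               % tall, thin matrix
b = rand(m, 1);

% Route 1: the normal equations A'A u = A'b (condition number gets squared).
u_normal = (A'*A) \ (A'*b);

% Route 2: factor A = QR, use Q'Q = I, then R u = Q'b is back substitution.
[Q, R] = qr(A, 0);            % economy QR: Q is m by n, R is n by n upper triangular
u_qr = R \ (Q'*b);

% Both should agree with backslash, which itself uses an orthogonal factorization.
disp(norm(u_normal - A\b));
disp(norm(u_qr - A\b));
```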
So a very simple solution for our equation, after the initial work of A=QR. OK. But very safe; Q is a great matrix to work with. In fact, codes are written so as to use orthogonal matrices Q as often as they can.

Alright, so you've had a look ahead at the computational side of 2.3. Let me come back to the most basic equations, just symmetric, positive definite equations, Ku=f, and consider: how do we measure whether K is nearly singular? Let me just ask that question. That's the central question. How to measure when K, which we're assuming to be symmetric positive definite, is nearly singular? How to know whether we're in any danger or not? OK.

Well, first you might think, OK, if it is singular its determinant is zero. So why not take its determinant? Well, determinants, as we've said, are not a good idea numerically. First, they're not fun to compute. Second, they depend on the number of unknowns, right? Suppose K is twice the identity matrix. You could not get a better problem than that, right? If K was twice the identity matrix, the whole thing's simple. Or suppose K is one millionth of the identity matrix. OK, again, that's a perfect problem, right? If K is one millionth of the identity matrix, well, to solve the problem you just multiply by a million, and you've got the answer. So those are good. And we have to have some measure of bad or good that tells us those are good. OK.

So the determinant won't do. Because the determinant of 2I would be two to the n, where n is the size of the matrix. Or the determinant of one millionth of the identity would be one millionth to the n. Those are not numbers we want. What's a better number? Maybe you could suggest a better number to measure how close the matrix is to being singular.
What would you say? I think if you think about it a little -- so what numbers do we know? Well, eigenvalues jump to mind. Because this matrix K, being symmetric positive definite, has eigenvalues, say lambda_1 less than or equal to lambda_2, and so on, up to lambda_n. So that top one is lambda_max, and lambda_1 is lambda_min, and they're all positive. And so what's your idea of whether the thing's nearly singular now? Look at lambda_1, right? If lambda_1 is near zero, that somehow indicates near singular. So lambda_1 is sort of a natural test. Not that I intend to compute lambda_1 -- that would take longer than solving the system -- but an estimate of lambda_1 would be enough. OK.

But my answer is not just lambda_1. And why is that? Because of the examples I gave you. When I had twice the identity, what would lambda_1 be in that case? If my matrix K was beautiful, twice the identity matrix, lambda_1 would be two. All the eigenvalues of twice the identity are two. Now if my matrix was one millionth of the identity, again I have a beautiful problem. Just as good, just as beautiful a problem. What's lambda_1 for that one? One millionth. It looks much more singular, but that's not really there.

So you could say, well, scale your matrix. And scaling the matrix, in fact scaling individual rows and columns, matters; your unknowns might be somehow in the wrong units, so one of the answers is way big and the second component is way small. That's not good. So scaling is important. But even then you still end up with a matrix K, some eigenvalues, and I'll tell you the condition number. The condition number of K is the ratio of this guy to this one, lambda_max to lambda_min. In other words, two K, or a million K, or one millionth K, all have the same condition number. Because those problems are identical problems.
Multiplying by two, multiplying by a million, dividing by a million didn't change reality there. So if we're in floating point, it just didn't change. So the condition number is going to be lambda_max over lambda_min. And this is for symmetric positive definite matrices. And MATLAB will print out that number -- or print an estimate for that number; as I said, we don't want to compute it exactly. Lambda_max over lambda_min. That measures how sensitive, how tough your problem is. OK.

And then I have to think, how does that come in, why is that an appropriate number? I've tried to give an instinct for why it's appropriate, but we can be pretty specific about it. In fact, let's do that now. So what would be the condition number of twice the identity? It would be one. A perfectly conditioned problem. What would be the condition number of a diagonal matrix? Suppose K was the diagonal matrix with two, three, four. The condition number of that matrix is two, right? Lambda_max is sitting there, lambda_min is sitting there, and the ratio is two. Of course, any condition number under 100 or 1000 is no problem. What's the rule of thumb? I think the number of digits in the condition number -- maybe if the condition number was 1000 you would be taking a chance on your last three digits; in single precision that would be three out of six or so. Somehow the log of the condition number, the number of digits in it, is some measure of the number of digits you'd lose. Because you're doing floating point, of course, here. So that diagonal matrix is a totally well conditioned matrix; I wouldn't touch that one. I mean, that's just fine.

But here's the point that I should make, because here's the computational science point. When this is our special K, our -1, 2, -1 matrix of size n, the condition number goes like n squared.
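A quick numeric check of those statements -- a sketch only; the sizes are my choices, and K is built here with MATLAB's toeplitz as the standard -1, 2, -1 second-difference matrix:

```matlab
% Scaling a matrix doesn't change its condition number.
n = 100;
disp(cond(2*eye(n)));          % 1: perfectly conditioned
disp(cond(eye(n)/1e6));        % 1 as well, even though lambda_min is tiny

% The -1, 2, -1 second-difference matrix: condition number grows like n^2.
for n = [100 200 400]
    K = toeplitz([2 -1 zeros(1, n-2)]);    % tridiagonal -1, 2, -1 matrix
    fprintf('n = %4d   cond(K) = %10.1f   cond(K)/n^2 = %.3f\n', ...
            n, cond(K), cond(K)/n^2);
end
```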
Because we know the eigenvalues of that matrix, we can see it. Say n is 1000, and we're dealing with our standard second difference matrix, the most important example I could possibly present. The exact eigenvalues are actually there in Section 1.5; we didn't do them in detail, and we'll probably come back to them when we need them. But the largest one is about four. And the smallest one is pretty small -- the smallest one is like the sine squared of a small angle, so the smallest eigenvalue is of order 1/n squared. And then when I take that ratio, lambda_max is just about four and lambda_min is like 1/n squared, quite small, so the ratio gives me the n squared.

So there's an indication. Basically, that's not bad. If n is 1000, in most engineering problems that still gives you extremely good accuracy; a condition number of a million you could live with. If n is 100, more typical, a condition number of 10,000 is basically OK, I think, and I would go with it. But if the condition number is way up, then I'd think again: did I model the problem well? OK.

Alright. So now I have to tell you, why is this the appropriate number? How do you look at the error? So can I write down a way of approaching this Ku=f? This is the first time I've used the words round-off error. In all the calculations that you have to do to get to u -- those row operations, and you're doing them to the right side too -- all those are floating point operations in which small errors are sneaking in. And it was very unclear, in the early years, whether the millions and millions of operations that you do in elimination -- additions, subtractions, multiplications -- could add up. If they don't cancel, you've got problems, right?
But in general you would expect that these are just round-off errors; you're making them millions and millions of times, and it would be pretty bad luck -- I mean, like Red Sox twelfth inning bad luck -- to have them pile up on you. So you don't expect that.

Now, what you actually compute: this is the exact equation, Ku=f. The computed one has an error; I'll call it delta u. That's our error. And it's equal to an f plus delta f. So the computed equation is K times u plus delta u, equal to f plus delta f. This delta f is our round-off error, the error we make, and delta u is the error in the answer. In the final answer. OK, now I would like to have an error equation, an equation for that error delta u, because that's what I'm trying to get an idea of. No problem. If I subtract the exact equation from this equation, I have a simple error equation: K delta u equals delta f. So this is my error equation. OK.

So I want to estimate the size of that error, compared to the exact. You might say, and you would be right, well, wait a minute, as you do all these operations you're also creating errors in K. So I could have a K plus delta K here, too. And actually it wouldn't be difficult to deal with, and it would certainly be there in a proper error analysis. And it wouldn't make a big difference; the condition number would still be the right measure. So let me concentrate here on the error in f, where subtracting one equation from the other gives me this simple error equation.

So my question is, when is that error, delta u, big? When do I get a large error? And delta f I'm not controlling. I might control the size, but the details of it I can't know. So for delta f I'll take the worst possible here. Suppose it is of some small size, ten to the minus something, times some vector of errors, but I don't know anything about that vector, and therefore I'd better take the worst possibility. What would be the worst possibility?
What right hand side would give me the biggest delta u? Yeah, maybe that's the right question to ask. So now we're being a little pessimistic. We're saying: what right hand side, what set of errors in the measurements or from the calculations, would give me the largest delta u?

Well, let's see. I'm thinking the worst case would be if delta f was an eigenvector with the smallest eigenvalue, right? If delta f is x_1, the eigenvector that goes with lambda_1 -- the worst case would be for that to be the first eigenvector. That would be the worst direction. Of course, it would be multiplied by some little number. Epsilon is every mathematician's idea of a little number. OK. So delta f is epsilon x_1; then what is delta u? What would be the solution to that equation, if the right-hand side was epsilon times an eigenvector? This is the whole point of eigenvectors. You can tell me what the solution is. Is it a multiple of that eigenvector? You bet. If this is an eigenvector, then I can put in the same eigenvector there; I just have to scale it properly. So delta u will be just that right side, and what do I need? I think I just need lambda_1: delta u is epsilon x_1 over lambda_1. The worst K inverse can be is like 1/lambda_1. If I claim that that's the answer -- if the right hand side is in the worst direction, then the answer is that same right hand side divided by lambda_1 -- let me just check. If I multiply both sides by K, I have K delta u; and what's K times x_1? Everybody with me? What's K*x_1? Lambda_1*x_1, so the lambda_1's cancel, and I get the epsilon x_1 I want. So, no surprise. That's just telling us that the worst error is an error in the direction of the low eigenvector, and that error gets amplified by 1/lambda_1.
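A small numeric sketch of that amplification -- my own example, with K the -1, 2, -1 matrix again and a made-up perturbation size epsilon:

```matlab
% Perturb the right side along the bottom and top eigenvectors of K
% and compare how strongly the solution responds.
n = 50;
K = toeplitz([2 -1 zeros(1, n-2)]);     % -1, 2, -1 matrix (symmetric positive definite)
[X, D] = eig(K);
[lambda, idx] = sort(diag(D));          % sort eigenvalues: lambda(1) = lambda_min
X = X(:, idx);

eps_ = 1e-6;                            % made-up size of the round-off error

du_worst = K \ (eps_ * X(:,1));         % delta f along x_1: amplified by 1/lambda_min
du_mild  = K \ (eps_ * X(:,n));         % delta f along x_n: amplified only by 1/lambda_max

fprintf('delta f along x_1: ||delta u|| = %.3e  (eps/lambda_min = %.3e)\n', ...
        norm(du_worst), eps_/lambda(1));
fprintf('delta f along x_n: ||delta u|| = %.3e  (eps/lambda_max = %.3e)\n', ...
        norm(du_mild),  eps_/lambda(n));
```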
So that's brought lambda_1, lambda_min, into it, in the denominator. Now, here's another point. Second point now. So that would be the absolute error. But we saw for those factors of two and one millionth and so on, that really it's the relative error. So I want to estimate not the absolute error, delta u, but the error delta u relative to u itself. So that if I scale the whole problem, my relative error wouldn't change. So, in other words, what I want to do is ask in this case, what's the right hand side? How big, yeah, I know, I want to know how small u could be, right? I'm shooting for the worst. The relative error is the size of the error relative to the size of u. And I want to know how big that could be. OK.

So now I know how big delta u could be, it could be that big. But u itself, how big could u be? How small could u be, right? u's in the denominator. So if I'm trying to make this big, I'll try to make that small. So when is u the smallest? Over there I said when is delta u the biggest, now I'm going to say when is u the smallest? What f would point me in the direction in which u was the smallest? Got to be the other eigenvector. This end. The worst case would be when this is in the direction of x_n. The top eigenvector. In that case, what is u? So I'm saying the worst f is the one that makes u smallest, and the worst delta f is the one that makes delta u biggest. I'm going for the worst case here, so if the right side is x_n, what is u? x_n over lambda_n. Because when I multiply by K, K times x_n brings me a lambda_n, cancel that lambda_n, I get it right. So there is the smallest u, and here is the largest delta u. And the epsilon is coming from the method we use, so that's not involved with the matrix K.
So do you see, up there, if I'm trying to estimate delta u over u, that's big. The size of delta u over the size of u is what? Delta u has some epsilon, which measures the machine precision -- the number of digits we're keeping, the word length and so on -- times a unit vector, over lambda_1. And u is down here in the denominator, x_n over lambda_n, so the lambda_n flips up. Do you see it? By taking the worst case, I've got the worst relative error. For other f's and delta f's, they won't be the very worst ones. But here I've written down the worst. And that's the reason that this is the condition number.

So I'm speaking about topics that are there in 1.7, trying to give you the main point. The main point is: look at relative error, because that's the right thing to look at. Look at the worst cases, which are in the directions of the top and bottom eigenvectors. In that case, the relative error has this condition number, lambda_n/lambda_1, in it, and that's the good measure for how singular the matrix is. So one millionth of the identity is not a nearly singular matrix, because lambda_max and lambda_min are equal; that's a perfectly conditioned matrix. This diagonal matrix has condition number two, 4/2. It's quite good. This second difference matrix is getting worse, with an n squared in there, if n is big, and other matrices could be worse than that. OK. So that's my discussion of condition numbers.

I'll add one more thing. These eigenvalues were a good measure when my matrix was symmetric positive definite. If I have a matrix that I would never call K -- a matrix like one, one, zero and a million -- OK, I would never write K for that. I would shoot myself first before writing K. So what are the eigenvalues, lambda_min and lambda_max, for that matrix? Are you up now on eigenvalues?
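Before the triangular example, here is the worst-case argument just made, written out in symbols (c(K) denotes the condition number lambda_max/lambda_min):

\[
K\,\delta u = \delta f, \qquad
\delta f = \varepsilon\,x_1 \;\Rightarrow\; \delta u = \frac{\varepsilon\,x_1}{\lambda_{\min}}, \qquad
f = x_n \;\Rightarrow\; u = \frac{x_n}{\lambda_{\max}},
\]
\[
\text{so}\qquad
\frac{\|\delta u\|}{\|u\|} = \frac{\varepsilon/\lambda_{\min}}{1/\lambda_{\max}}
= \frac{\lambda_{\max}}{\lambda_{\min}}\,\varepsilon = c(K)\,\varepsilon,
\qquad\text{and in general}\qquad
\frac{\|\delta u\|}{\|u\|} \;\le\; c(K)\,\frac{\|\delta f\|}{\|f\|}.
\]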
We haven't done a lot with eigenvalues, but triangular matrices are really easy. The eigenvalues of that matrix are? One and one. The condition number should not be 1/1 -- that would be bad. So instead, if this was my matrix and I wanted to know its condition number, what would I do? How would I define the condition number of A? You know what I do whenever I have a matrix that is not symmetric. I get to a symmetric matrix by forming A transpose A. I get a K, I take its condition number by my formula, which I like in the symmetric case, and then I take the square root. So that would be a pretty big number. For that matrix A, the condition number is up around 10^6, even though its eigenvalues are one and one, because when I form A transpose A, those eigenvalues jump all over. This thing will have eigenvalues way up, and the condition number will be high. OK.

So that's a little preview; you'll meet the condition number. MATLAB shows it, and you naturally wonder, what is this? Well, if it's a positive definite matrix, it's just the ratio of lambda_max to lambda_min, and as it gets bigger it tells you that the matrix is tougher to work with. OK.

We have just five minutes left to say a few words about QR. Can I do that in just a few minutes? And much more is in the codes. OK, what's the deal with QR? I'm starting with a matrix A. Let's make it two by two. OK, it's got a couple of columns. Can I draw them? So it's got a column there, that's its first column, and it's got another column there; maybe that's not a very well conditioned matrix. Those are the columns of A, plotted in the plane, two-space. OK, so now the Gram-Schmidt idea is: out of those columns, get orthonormal columns. Get from A to Q.
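Before the Gram-Schmidt picture, here is a quick check of that point about non-symmetric matrices. The entries below are my own illustrative choice (a triangular matrix with ones on the diagonal and one huge off-diagonal entry), not necessarily the numbers on the board:

```matlab
% A triangular matrix with both eigenvalues equal to 1 can still be badly conditioned.
A = [1 1e3; 0 1];             % illustrative entries, chosen so cond(A) is about 10^6

disp(eig(A));                 % both eigenvalues are 1
disp(cond(A));                % roughly 10^6: ratio of largest to smallest singular value
disp(sqrt(cond(A'*A)));       % the same number, via the symmetric matrix A'*A
```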
So the Gram-Schmidt idea is: out of these two vectors, two axes that are not at 90 degrees, produce vectors that are at 90 degrees. Actually, you can guess how you're going to do it. Let me say, OK, I'll settle for that direction; that can be my first direction, q_1. What should q_2 be? If that direction is the right one for q_1, I say OK, I'll settle for that -- what's the q_2 guy? Well, what am I going to do? I mean, Gram thought of it and Schmidt thought of it. Schmidt was a little later, but it wasn't, like, that hard to think of. What do you do here? Well, we know how to do projections from this least squares stuff. What am I looking for? Subtract off the projection, right. Take the projection and subtract it off, and be left with the component that's perpendicular. So this will be the q_1 direction, and this guy e, what we called e, would tell me the q_2 direction. And then we make those into unit vectors and we'd be golden.

And if we did that, we would discover that the original a_1 and a_2 -- the original first column and the original second column -- would be the good q_1 and the good q_2 times some matrix R. So here's our A=QR. It's your chance to see this second major factorization of linear algebra, LU being the first, QR being the second. So what's up? Well, compare first columns. First columns: I didn't change direction. So all I have here is some scaling r_(1,1), and a zero. Some number times q_1 is a_1. That direction was fine. The second direction: q_2 and a_2, those involve also a_1. So there's an r_(1,2) and an r_(2,2) there. The point is, this came out triangular. And that's what makes things good. It came out triangular because of the order that Gram and Schmidt worked in. Gram and Schmidt settled the first one in the first direction. Then they settled the first two in the first two directions.
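Here is a minimal sketch of that two-column picture in code -- my own illustration of classical Gram-Schmidt, with made-up columns; the built-in qr, as mentioned in a moment, uses Householder instead:

```matlab
% Classical Gram-Schmidt for two columns: keep the direction of a_1,
% subtract the projection of a_2 onto it, normalize what's left.
a1 = [2; 1];  a2 = [1; 2];            % made-up columns of a 2-by-2 matrix A

r11 = norm(a1);
q1  = a1 / r11;                       % first unit vector, same direction as a_1

r12 = q1' * a2;                       % how much of a_2 lies along q_1
e   = a2 - r12 * q1;                  % the perpendicular component (our "e")
r22 = norm(e);
q2  = e / r22;                        % second unit vector, at 90 degrees to q_1

Q = [q1 q2];
R = [r11 r12; 0 r22];                 % upper triangular, because of the order of the steps
disp(norm([a1 a2] - Q*R));            % check: A = QR up to round-off
```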
If we were in three dimensions, there'd be an a_3 somewhere here, coming out of the board. And then q_3 would come straight out of the board. Right? If you just see that, you've got Gram-Schmidt completely. a_1 is there, so is q_1. a_2 is there. I'm in the board still: the plane of a_1 and a_2 is the plane of q_1 and q_2, I'm just getting right angles in. a_3, the third column in a three by three case, is coming out at some angle. I want q_3 to come out at a 90 degree angle. So q_3 will involve some combination of all the a's. So if it was three by three, this would grow to q_1, q_2, q_3, and this R would then have three entries in its third column. But maybe you see that picture. So that's what Gram-Schmidt achieves.

And I just can't let time run out without saying that this is a pretty good way. Actually, nobody thought there was a better one for centuries. But then a guy named Householder came up with a different way, a numerically slightly better way. So this is the Gram-Schmidt way. Can I just put those words up here? So there's the classical Gram-Schmidt idea, which was what I described -- the easy one, easy to describe. And then there's a method called Householder, named after him, that MATLAB follows; every good qr code now uses Householder matrices. It achieves the same result. And if I had a little bit more time I could draw a picture of what it does. But there you go.

So that's my quick lecture on numerical linear algebra, these two essential points, and I'll see you this afternoon. Let me bring those quiz questions down again, for any discussion about the quiz.