The following content is provided by MIT OpenCourseWare, under a Creative Commons license. Additional information about our license and MIT OpenCourseWare, in general, is available at ocw.mit.edu.

PROFESSOR: Several pieces of good news. Since Brett is back now, all of chapters five and six of my notes -- that is to say, all that we've covered in class, plus a little more -- are on the web page now. I was cautious at the beginning of the semester and didn't put that up, and then forgot that it wasn't there. So now you have something to work with, particularly on any experiments -- for example, minimum degree ordering.

Oh, and also, there's a movie now. Tim Davis from Florida and [? Perils ?] [? Person ?], my friend here, created a little movie to show the order in which elimination occurs. And I had a lot of email with Tim Davis.

On that one-page handout, for 2D Laplace's equation, it mentions N cubed for nested dissection, where you cut the region in half, and half, and half, by separators. I believe, and I think experiments will show, that minimum degree is even better. But the analysis is extremely weak, apparently, on minimum degree. So anyway, one possible project, either for now or for the final project, would be some experiments to see what the actual convergence is like, and maybe a little understanding of minimum degree. So we'll talk more about that. Anyway, there's a movie and a lot more material there on the website. It'll get updated and corrected as I do the edits.

Second bit of good news: I made a mistake. That's not normally good news, but this time I reported a little calculation for multigrid, and I computed the wrong thing. You remember that I computed M. M was the Jacobi matrix -- the iteration matrix with Jacobi -- which is I minus D inverse A. This is for the 1D second differences; in fact, this was the 5 by 5 matrix that we've been talking about. And this is multiplied by D inverse, which is a half, but then with a weighting factor, that becomes a third.
So that's weighted Jacobi now. And this is what we would get for ordinary, simple iteration. And it's not satisfactory. Remember, it had an eigenvalue lambda_max -- the spectral radius -- of about 0.9.

Then I reported the eigenvalues of the multigrid matrix built from S and M, and I was a little disappointed in them. The reason is, I should have been using I minus S, not S. When I do that, I'm not disappointed anymore.

What are the eigenvalues of this? You remember, S is the matrix that tells us how much error we're capturing. S is the matrix that came from -- the error we captured at the end of multigrid was S times the error that we entered with. And so the error that's left -- the remaining error, the new error -- is the rest. It's the part we don't get, and that's the one I forgot. That's (I minus S) e. So that's why I should have been using I minus S. So then, step one of multigrid does a smoother. Steps two, three, four leave me I minus S. And step five does another smoother. And this is just with one Jacobi step, where you would probably do three.

Anyway, the eigenvalues: well, this I minus S has two zero eigenvalues and three 1's. So that matrix has eigenvalues 0, 0, 1, 1, 1. Why is that? Well, we get two 0's -- you remember the eigenvalues had to be 0 or 1, because S squared equaled S, and (I minus S) squared equals I minus S. We remember that. Now, the question is, what do those become in the whole cycle, with the smoothers? And the answer is that 1/9 is a triple eigenvalue. I was astonished to discover 1/9 -- of course, the 0, 0 survived. The rank is 3. And this is now looking much better. What I reported last time was something more like 7/8, and I thought, oh, multigrid, it's not showing its true colors. But actually it does. This is the kind of eigenvalue we expect, like 1/10, for a multigrid cycle. So, much better.
If we just did three M's, for example, I would only get down to 0.9 cubed, which is more than 0.7, whereas by doing multigrid instead, we're down to about 0.1. OK, so that's the good news. And, of course, if you experiment a little, you'll find different numbers for different sizes and really see what's happening. And, in fact, you could do 2D. I mention all this partly for now, if you're wanting to do it now, and partly for later.

Oh, and thinking about projects, just one word to repeat: the Monday after spring break -- I think that's April the third -- no class. So it'll be Wednesday that I'll see you again. But Mr. [? Cho ?] will be available. He has office hours, so that's April 3: if you wanted to discuss issues with your project with Mr. [? Cho ?], that would be at class time, in his office, 2-130, at one o'clock. Otherwise, take an extra day of spring break, and I'll see you Wednesday. OK, so that's various bits of good news.

Oh -- one other thing that I guess I didn't finish in that lecture was to identify the two eigenvalues. Well, eigenvectors. Those will be in the notes too. I guess I found one eigenvector, [1, 2, 2, 2, 1]. And glancing at the matrix, I didn't spot the other one, but it has one oscillation: it's [1, 2, 0, -2, -1]. Anyway, this is the main thing -- a much better result. So that's multigrid, which we now can use as our iteration. Or we can use it as a preconditioner. Many of the recommended methods use multigrid as a preconditioner before something even more powerful. Well, I don't know about even more powerful -- multigrid people would say that's not possible. But in some situations we may want to go to a different method, but we want a preconditioner, and multigrid is excellent. By multigrid, I include the smoother. So it'd be that. That would be a possible preconditioner for something else. OK.
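If you want to reproduce that 1/9 yourself, here is a small sketch of the check. The smoother weight 2/3 is the one from class; the linear interpolation and full-weighting restriction below are my assumptions about the standard transfer operators from the multigrid lectures, not something fixed by this transcript. With those choices the numbers do come out as quoted: spectral radius of M about 0.91, and two-grid eigenvalues 0, 0, 1/9, 1/9, 1/9.

    % two-grid eigenvalue check for the 5 by 5 model problem (a sketch)
    n = 5;
    K = 2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);   % 1D second differences
    M = eye(n) - K/3;                     % weighted Jacobi: D inverse = 1/2, weight 2/3
    max(abs(eig(M)))                      % spectral radius, about 0.91
    P = 0.5*[1 0; 2 0; 1 1; 0 2; 0 1];    % assumed linear interpolation from 2 coarse points
    R = 0.5*P';                           % assumed full-weighting restriction
    Kc = R*K*P;                           % coarse-grid matrix
    S = P*(Kc\(R*K));                     % coarse-grid correction captures S times the error
    T = M*(eye(n) - S)*M;                 % smoother, steps 2-4, smoother
    sort(real(eig(T)))                    % 0, 0, 1/9, 1/9, 1/9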
So where are we now? We're into the new section, starting this moment. And what's the section about? It's about things associated with this name: Krylov. So I'm going to use the letter K for the things that are associated with it -- and I'm always solving A*u equals b.

There's a Krylov matrix that's created exactly as you see here. This is the j-th one -- the one with j columns: b, A*b, A squared b, up to A to the j minus 1 b. And the Krylov space is the combinations of those vectors. That's what this word, span, means. I could erase span and say "all combinations of." Maybe that's even more familiar. So "spanned by" is the same thing as saying "all linear combinations of" those j vectors. And, of course, that's the same as saying it's the column space of the matrix, because the column space is all combinations. OK.

So, why am I interested? Why was Krylov interested? Why is everybody interested in these vectors? Because actually, that's what an iteration like Jacobi produces. If I use Jacobi's method -- or Gauss-Seidel, any of those -- after one step, I've got b. After two steps, there's a multiplication by A in there, right? And some combination is taken, depending on the particular method. So after two steps, I've got a combination of b and A*b. After three steps, Jacobi produces some combination of b, A*b, A squared b. In other words, all of those iterative methods are picking their j-th approximation in this space K_j, actually. I should put a little j on it, to indicate that. So we have these spaces growing. They grow by one dimension, by one new basis vector, at each iteration. And the point is, Jacobi makes a particular choice of x_j or u_j, the approximation after j steps. But does it make the best choice? Probably not. In fact, pretty definitely not. And so the idea is, let's make the best choice.
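Just to make that concrete, here is a minimal sketch of building the Krylov matrix column by column; each new column is one sparse multiplication by A applied to the previous column. The small matrix and right-hand side are purely illustrative.

    A = [2 -1 0; -1 2 -1; 0 -1 2];   % a small symmetric example
    b = [1; 0; 0];
    j = 3;
    K = zeros(length(b), j);
    K(:,1) = b;
    for k = 2:j
        K(:,k) = A*K(:,k-1);         % each new column is one multiplication by A
    end
    K                                % columns b, A*b, A^2*b; their span is K_3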
Let's not just use a simple iteration that builds in a choice that might not be so terrific. Let's make the best choice. There are a whole lot of Krylov methods, which all choose x_j -- since I think this section will use x, I'm going to change to x. They'll all be iterative. They'll all start with x_0 as 0, say. And then the first guess will be b, maybe. Or maybe not -- maybe some multiple of b. Actually, a good Krylov method will take the best multiple of b as the first guess, and not necessarily 1 times b. And onwards. So we have this word, best, coming up. What's the best vector in that space? And there are different methods depending on what I mean by best.

Oh, let me tell you the name of the method that I'm going to concentrate on first and most: the conjugate gradient method. CG. The conjugate gradient method. What determines x_j? It chooses x_j in K_j, the space that we always look for our approximation in.

Let me not forget to say: these vectors b, A*b, A squared b, and so on are easy to compute, because each one is just a matrix multiplication away from the previous one. And the matrix is -- we're assuming we're working with sparse matrices, and sometimes, especially, symmetric ones. So let me just put in here, before I even finish that sentence, that CG is for A transpose equal to A -- symmetric. It's only for those. And positive definite -- so symmetric positive definite. So it's a limited class of problems, but a highly, highly important class. And you may say, what do we do if the matrix is symmetric, but indefinite? Well, that comes next. Or what if the matrix is not symmetric at all? Well, if you're brave, you might try conjugate gradients anyway. But if you're cautious, then you would use one of the other methods, like MINRES.
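In practice that is just a choice of solver. A quick sketch with MATLAB's built-in Krylov routines (the matrices here are only illustrations): pcg, conjugate gradients, for a symmetric positive definite A, and minres when A is symmetric but possibly indefinite.

    n = 100;  e = ones(n,1);
    A = spdiags([-e 2*e -e], -1:1, n, n);   % sparse second differences: symmetric positive definite
    b = e;
    [x, flag] = pcg(A, b, 1e-8, 500);       % conjugate gradients
    norm(b - A*x)
    Aind = A - 2*speye(n);                  % symmetric but indefinite
    [x, flag] = minres(Aind, b, 1e-8, 500); % MINRES asks only for symmetry
    norm(b - Aind*x)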
Maybe I'll just mention one more on our eventual list. Actually, the name probably tells us what choice it makes: MINRES chooses the x_j to minimize the residual. Minimum r. Remember, r is b minus A*x. The residual r will always denote the error in the equation. And so it's minimum residual. OK. So that's a natural choice, but you'll see that conjugate gradients is a fantastic method -- superior, just quicker than MINRES, for a nice reason.

So it chooses x_j in K_j, and I think the rule it uses is that x_j should be orthogonal -- I'll go and check it in my notes -- to the residual r_j. Let me just be sure: if I'm going to write that down, I'd better be sure I said it correctly. Actually, I didn't say it correctly. r_j is orthogonal to the whole space K_j. It turns out that we can choose x_j -- when I make a choice of x_j, it's in K_j. Now, when I compute r, there's a multiplication by A to get the residual, so that moves us up to the next space, and I'm going to make a choice where this residual is orthogonal to the whole space K_j. That's the choice conjugate gradients make. And I can't start on that, for a good reason. I have to start with something called Arnoldi. And what's Arnoldi about?

Well, let me come back to these vectors. Quick to compute. But what's the other property? Everything in applied numerical mathematics -- you're choosing basis vectors, and you're looking for two properties. And I guess this is like a general rule: whenever you meet a whole new problem, in the end, you're going to construct some basis vectors if you're going to compute. And what properties are you after? You're after speed. So they have to be quick to compute and work with. Well, these are: multiplying by A is sparse matrix multiplication, you can't beat that. But the other property that you need is some decent independence. And not just barely independent.
You want the condition number of the basis, somehow -- if I use that word -- to be good. Not enormous. And best of all, you would like the basis vectors to be orthonormal. That's a condition number of one. That's the best basis you can hope for. And the point is that this construction produces a lousy basis from the point of view of conditioning. So Arnoldi's job is to orthogonalize the Krylov basis, which is b, A*b, and so on, producing an orthonormal basis q_1, q_2, up to q_j. It's just beautiful. So Arnoldi is like a preliminary step that Krylov has to take, to get something that is numerically reasonable to work with.

Then, is it fast? Arnoldi does have to do a multiplication by A -- you'll see one here in the middle of Arnoldi. And then, of course, if you look at those ten lines of MATLAB -- we could go through them a little carefully, but let's just take a first look. The first step is like Gram-Schmidt, right? It's sort of a Gram-Schmidt idea. You take that first vector b, you accept that direction, and the only step remaining is to normalize it: divide by its length. Then you go on to the next. Then you have some trial vector t, which would in this case be A*b, in the direction of A*b, which won't be orthogonal to the original b, right? So this won't be orthogonal to that one. So, of course, you've got to compute an inner product between them, and subtract off the right multiple of the previous one from the current t to get the improved t. And then you're going to normalize it again: you compute its length and divide by the length.

You see that overall pattern in Arnoldi? It's the familiar idea of Gram-Schmidt: take a new vector, find its projections onto the ones that are already set, subtract those components off, you're left with a piece, and you find its length and normalize that to be a unit vector. It's very familiar.
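The ten MATLAB lines from the notes aren't reproduced in this transcript, so here is my own sketch of the same Gram-Schmidt-style loop -- not the code from the notes, and with my own variable names (it would live in a file arnoldi.m).

    function [Q, H] = arnoldi(A, b, j)
    % ARNOLDI  Orthonormalize the Krylov basis b, A*b, ..., A^(j-1)*b.
    %   Returns Q (n by j, orthonormal columns) and H ((j+1) by j, Hessenberg).
    n = length(b);
    Q = zeros(n, j);  H = zeros(j+1, j);
    Q(:,1) = b / norm(b);                 % accept b's direction, just normalize it
    for k = 1:j
        t = A*Q(:,k);                     % trial vector: one multiplication by A
        for i = 1:k                       % subtract off components along the earlier q's
            H(i,k) = Q(:,i)'*t;
            t = t - H(i,k)*Q(:,i);
        end
        H(k+1,k) = norm(t);               % length of the piece that is left
        if k < j
            Q(:,k+1) = t / H(k+1,k);      % normalize it to get the next q
        end
    end
    end

Calling [Q, H] = arnoldi(A, b, j) returns the orthonormal q's as the columns of Q and the multipliers h as the entries of H; for a symmetric A the inner loop could stop at the two most recent q's, which is the short recurrence coming next.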
It's exactly the type of algorithm that you would write down immediately. And it just involves the one multiplication by A, so it's not going to cost us a lot to make the vectors orthonormal.

Well -- wait a minute. Is it going to be expensive or not? That's like the key question. It's certainly going to produce a good result: producing orthonormal vectors is going to be a good thing to do. But is it expensive or not? That depends. It's certainly not expensive if the new trial vector t has components in only one or two of the already-settled directions. In other words, if I only have to subtract off a couple of earlier components, then I'm golden. And that's the case when A is symmetric. So that will be the key point here, and I just make it here. If A is symmetric, A transpose equals A, then in this subtracting off, where I'm headed for the new q, I only need to subtract off h_(j, j), multiplying the q_j -- the q that was just set -- and the one before, with h_(j-1, j). I'll call that a short recurrence. It's short because it only has two terms. And then there'll be a new h_(j+1, j), which is just the length.

So there will be one of these magic things that appear in so many parts of mathematical analysis: a three-term recurrence relation will hold. In fact, I guess the reason we see three-term recurrence relations in all the classical things -- Legendre polynomials, Chebyshev polynomials, all of those -- the reason is actually the same as it is here: something in the background is symmetric.

I want to put a few numbers on the board so you see what is typical input and output for Arnoldi. And you'll see it's symmetric, and then you'll see the short recurrence, and then we want to see -- what? OK. So I worked out a typical Arnoldi example. Not that one. Here. OK. I think this is a good example to look at. So the matrix A is not only symmetric, it's diagonal.
So of course, to find A inverse b is not difficult. But we're going to do it through conjugate gradients. So that means that we don't figure out A inverse, which, of course, we easily could. We just use A, which is very sparse. Here's our right-hand side. Here's our Krylov matrix: it has b in its first column, it has A times b in the second column, multiply by A again to get the third column, and A one more time to get the fourth column. So that's K_4, right? And for a 4 by 4 problem, that's the end. So actually, here I'm carrying Krylov to the end of it. There's no more to do, because once I've got this K_4, the combinations of those columns are all of R^4, all of four-dimensional space.

Oh, yeah -- that raises an important point about how we're improving on iterations. If I was using Jacobi, or Gauss-Seidel, or one of those others, then -- well, I won't say for this particular A, but usually -- after four steps I am by no means finished. I've taken four steps, I've gotten closer, but I haven't got the exact answer. But now, in this world of Krylov methods, after four steps I will have the exact answer. Why? Because Krylov methods take the best x_4 out of the space spanned by those four columns. Well, the whole space R^4 is spanned by those columns, and taking the best one must give the answer. So x_4 will actually be A inverse b, the answer we're looking for.

So these methods are, on the one hand, direct methods. They finish after four steps -- finished, you've got the answer. Now, that's actually how they were born: as direct methods that gave the answer from an interesting iteration. And they nearly died for that reason. They were born and died, or were on death's door, as direct methods. Because as direct methods, there are better ones.
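That finish-in-n-steps point is easy to check. The diagonal entries 1, 2, 3, 4 and the all-ones right-hand side below are my guess at the board example -- they are at least consistent with the H entries quoted a little later -- and the combination of the four Krylov columns that satisfies the equation is already the exact solution.

    A = diag([1 2 3 4]);                 % assumed board example: diagonal, symmetric
    b = ones(4,1);
    K4 = [b, A*b, A*A*b, A*A*A*b];       % the Krylov matrix K_4
    c = (A*K4)\b;                        % pick the combination x_4 = K4*c with A*x_4 = b
    x4 = K4*c;
    norm(x4 - A\b)                       % zero up to roundoff: four steps give the exact answer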
Then, some years later, people went back to it -- back to conjugate gradients -- and noticed that, thought of as just going partway, it gave very successful answers. So we're going to think of Krylov -- we're planning to stop before j equals n. In this picture, j equals n equals 4. But we plan to stop for j much below n. We're thinking of applications where n is 10 to the fifth, and we're going to stop at 10 squared -- 100 steps, maybe. Or ten steps. So that meant a total reconsideration, a reanalysis, of conjugate gradients, and it is now back to being a big success. Well, it's a rather unusual thing that a method that people have put aside gets picked up again and becomes a favorite, as conjugate gradients has.

First of all, I was saying that the basis -- those basis vectors -- are not very independent. That's not a good basis. Of course, it's an attractive basis, and maybe -- do you know the name for a matrix that has this particular form? The first column is constant -- that's the key column there -- and then it's first powers, second powers, third powers. Do you remember whose name is associated with that matrix? It comes up in interpolation, and it starts with a V? Anybody know? Vandermonde, yeah. Vandermonde. So I could call it V for Vandermonde. And Vandermonde matrices are not very well conditioned.

So now, a little timeout to say -- because it's so important -- how do you judge the good or poor conditioning of a matrix? Those vectors are linearly independent. The determinant of V is not zero. The matrix is not singular, but it's too close to singular. And how do you test the -- suppose you have a basis, as we have here, and you want to know: is it good or not? So, of course, you always put your basis vectors -- let me call them -- well, I don't want to call them v because -- well, I guess I could call them v, they come out of Vandermonde: v_1, v_2, v_3, v_4. That's our matrix. Essentially, I want its condition number.
And to find its condition number -- it's not symmetric, so I can't just take the eigenvalues of that. You might say, look at lambda_max and lambda_min; the condition number is associated with max divided by min. But when the matrix isn't symmetric, just taking its eigenvalues directly is not cool. It's not reliable. I could have a matrix that's badly conditioned, but all its eigenvalues are 1. I mean, a matrix with 1's on the diagonal and zillions up above the diagonal would have eigenvalues of 1, but it would be badly conditioned.

So the right way is to take V transpose V. Look at v_1 transpose, v_2 transpose... As always, if a matrix isn't symmetric -- if the matrix V is not symmetric -- it's a good idea to form V transpose V. That is symmetric. It does have positive eigenvalues. And those eigenvalues, the eigenvalues of V transpose V, are the singular values -- or rather the singular values squared -- of V. So I guess I'm saying, you can't trust the eigenvalues of V; it's the singular values you can trust. And the way to find singular values is to form V transpose V. That gives you a symmetric matrix, and its eigenvalues -- the i-th eigenvalue -- would be the i-th singular value squared.

So the condition number is sigma_max over sigma_min. And, well, it's not enormous for this 4 by 4 matrix, but if I go up to 10 by 10 or 100 by 100 -- 100 by 100 would just totally wipe me out; 10 by 10 would already be disastrous. Completely disastrous, actually. A 10 by 10 Vandermonde matrix, with 1, 2, 3, 4, up to 10 as the points, would have entries -- if that ended in a 10 and I had 10 rows -- something like 10 to the ninth power. I mean, the dynamic range would be terrible.

So finally, just to say: V transpose V is called the Gram matrix. So that guy Gram is coming back in as giving the good measure. So the point was, then, that this measures the dependence or independence of the v's. That ratio.
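A quick numerical sketch of that measure, on the same assumed 4 by 4 example: the ratio sigma_max over sigma_min for the Krylov (Vandermonde-like) columns, next to the condition number of an orthonormal basis for the same space.

    A = diag([1 2 3 4]);  b = ones(4,1);     % same assumed example as before
    V = [b, A*b, A*A*b, A*A*A*b];            % columns 1, k, k^2, k^3: Vandermonde-like
    s = svd(V);
    s(1)/s(end)                              % sigma_max / sigma_min for the Krylov columns
    [Q, R] = qr(V);
    cond(Q)                                  % an orthonormal basis for the same space: exactly 1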
The bigger that ratio is, the more dependent they are. What's the ratio if they're orthonormal? What's the ratio for the q's? What's the condition number of the q's? If we've run Arnoldi and got a good basis, it's one. What's the Gram matrix? If this is Q transpose Q, with the q's -- the orthonormal basis -- in the columns, then Q transpose Q is the identity. That's its Gram matrix. So Q transpose Q would be the identity, and its condition number is the best possible one: lambda_max is 1, lambda_min is 1, the condition number is one. OK. So that's just some comments about why Arnoldi gets brought in to fix the situation. OK.

So I'll just leave those ten Arnoldi steps on the board. The notes give ten comments. I'm quite proud of -- well, my MATLAB is unreliable. Actually, you'll find a few errors, like after "end" I accidentally put a semicolon, which was absurd. But the comments I'm pleased with, and then the numerical example that runs through one cycle of this, with these numbers, to see what q_2 is, is, I hope, useful.

Why do we have a short recurrence? This is a key point here. And in this example, if I work out the h's, I'll discover, sure enough, h_(1, 3) will be 0, h_(1, 4) will be 0. Here's the key equation. Here's Arnoldi in matrix language. Let me see if I can remember Arnoldi in matrix language. So, Arnoldi is taking the matrix -- yeah. So Arnoldi in matrix language is going to be this: A*Q equals Q*H. I can't write out all of Q. So that's the big equation. It's a very important equation that we now have all the pieces for. A is our original matrix that we were given -- symmetric, let's say. Q is our basis out of Arnoldi, and H holds the multipliers that gave that basis. So this Q*H is a little bit like Gram-Schmidt. Do you remember, Gram-Schmidt is described by Q times R. Q, again, is orthonormal. So it's an orthogonal matrix.
In Gram-Schmidt, R is upper triangular. Here it's not. Here it's Hessenberg -- so H stands for Hessenberg. I'll write down what the actual H is for these numbers. I won't write down the Q; I'll just write down what H turned out to be for those numbers, if I did it right. Five halves. Oh, interesting -- the diagonal entries all turned out to be five halves. And this entry below turned out to be root 5 over 2, this one turned out to be the square root of 4 over 5, and this one turned out to be the square root of 9 over 20. And the point is, from here, this one is just below the diagonal, and it will show up as symmetric: root 5 over 2, root 4/5, and root 9/20 appear above the diagonal too. OK.

So, what am I seeing from that particular H, which somehow can't be an accident? It must be that it's built in. It's the fact that H is symmetric and tridiagonal. And what does that tridiagonal tell us? It tells us that we have short recurrences. It's the three-term recurrence relation that I'm seeing here in matrix language, because there are three non-zeros in the columns of H. All right.

I was going to try to understand this one by looking at the first column. What if I take the first column of both sides? That'll be A times q_1, right? q_1 is the first column vector -- the first basis vector. And what do I have here when I take q's times h's? Do you do matrix multiplication a column at a time? You should. OK. So this says: take 5/2 of the first column, and take that factor, root 5 over 2, times the second column. And I could track back and see, yes, that's what Arnoldi has produced. And then the second column would be A*q_2; the next would be A*q_3.

Well, look, here it is. I want to show that H is symmetric when A is. Could you do that? We know what the property of Q is here.
We know that Q -- because Arnoldi made it that way -- has Q transpose Q equal to I. And I can, in one step, show that H is symmetric if A is symmetric. How do you do that? I guess we need a formula for H, so I just multiply by Q inverse. So that's good. And even better is to recognize what Q inverse is. So what is Q inverse here? It's Q transpose. Any time we see Q -- that's my letter, always, for an orthogonal matrix. So this is Q transpose A Q. And now what? The argument's finished. We're here. If A is symmetric, what can you say about that combination -- if A is a symmetric matrix? It is symmetric. Right? That's how you get symmetric matrices: you start with one, you multiply on one side by a matrix and on the other side by its transpose, and the thing has to be symmetric. Because if I transpose this whole thing, what will happen? To transpose a product, the transposes come in the opposite order. So Q -- its transpose comes first. A -- its transpose comes in the middle, but what's that? The transpose of A is A; we're assuming A to be symmetric. And then the transpose of Q transpose is? Is Q. So we get this back again when we transpose, so it's symmetric. So H is symmetric. So the conclusion is, H equals H transpose.

And then we know immediately that it's tridiagonal, because every H from Arnoldi is Hessenberg. We know those zeros are there: in column j, the Arnoldi cycle produces the entries h_(i, j) and stops one below the diagonal, so everything further below is zero. So we know those are zero; but now, if we know the matrix is symmetric, then the matching entries above the diagonal are zero too. And, of course, it works out that way. So the conclusion is that we can orthogonalize the Krylov basis quickly and easily, and work with that basis -- either explicitly, by computing it, or implicitly, by keeping things orthogonal -- and that's what conjugate gradients will do.
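To close the loop on the small example, here is a check of that fact -- assuming again that the board example was A = diag(1, 2, 3, 4) with b of all ones, a choice consistent with the entries quoted above. Orthonormalizing the Krylov columns and forming Q transpose A Q gives a symmetric tridiagonal H with 5/2 down the diagonal and root 5 over 2, root 4/5, root 9/20 beside it.

    A = diag([1 2 3 4]);  b = ones(4,1);   % assumed board example
    V = [b, A*b, A*A*b, A*A*A*b];          % Krylov columns
    [Q, R] = qr(V);                        % orthonormalize (Gram-Schmidt in matrix form)
    Q = Q*diag(sign(diag(R)));             % fix signs to match Arnoldi's positive normalization
    H = Q'*A*Q                             % symmetric tridiagonal, 5/2 on the diagonal
    norm(A*Q - Q*H)                        % and A*Q = Q*H, up to roundoff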
So next time, I'm going to make an effort to describe the conjugate gradient method. I'll pick out the highlights of it. It has fantastic properties, and to verify those properties in full detail is often more confusing than not. If you've seen today's lecture, you're seeing the important points: the role of the symmetry of A, and the Arnoldi algorithm. OK. So that's our first lecture on the Krylov ideas. And next time, we'll probably complete that topic, and it will be on the web. So I'll see you Wednesday for the end of Krylov. Thanks. Good.