1 00:00:00,000 --> 00:00:01,950 The following content is provided 2 00:00:01,950 --> 00:00:06,050 by MIT OpenCourseWare under a Creative Commons license. 3 00:00:06,050 --> 00:00:08,263 Additional information about our license 4 00:00:08,263 --> 00:00:10,470 and MIT OpenCourseWare in general 5 00:00:10,470 --> 00:00:11,930 is available at ocw.mit.edu. 6 00:00:15,520 --> 00:00:20,150 PROFESSOR: So we started last week 7 00:00:20,150 --> 00:00:26,630 on the big topic for the rest of the semester, optimization. 8 00:00:26,630 --> 00:00:30,780 Maybe, can you close that door or just partway, anyway. 9 00:00:30,780 --> 00:00:32,820 Thanks. 10 00:00:32,820 --> 00:00:33,320 Great. 11 00:00:33,320 --> 00:00:35,020 That's perfect, just like that. 12 00:00:35,020 --> 00:00:40,240 So, since it was a few days ago, I 13 00:00:40,240 --> 00:00:46,420 wanted to recap what I did in the first lecture 14 00:00:46,420 --> 00:00:50,680 about optimization, which was to pick out the least squares 15 00:00:50,680 --> 00:00:55,810 problem as a beautiful model problem. 16 00:00:55,810 --> 00:01:01,930 And I followed that through -- the input is a matrix A, 17 00:01:01,930 --> 00:01:08,090 rectangular, a right-hand side b, probably measurements. 18 00:01:08,090 --> 00:01:12,730 We would like to get A*u equal b but we can't. 19 00:01:12,730 --> 00:01:16,280 We got too many measurements. 20 00:01:16,280 --> 00:01:20,250 We do the best possible, which is the solution u 21 00:01:20,250 --> 00:01:24,540 hat of this normal equation. 22 00:01:24,540 --> 00:01:28,670 Then I rewrote the normal equations -- well, 23 00:01:28,670 --> 00:01:33,580 I also drew the picture that you see over here on the left. 24 00:01:33,580 --> 00:01:38,070 The picture that shows the two optimization 25 00:01:38,070 --> 00:01:40,020 problems at the same time. 26 00:01:40,020 --> 00:01:42,530 Here is the vector b. 
27 00:01:42,530 --> 00:01:47,455 Here are all possible A*u's -- this is the column space, 28 00:01:47,455 --> 00:01:51,700 this is all A*u's. 29 00:01:51,700 --> 00:01:55,200 The best A*u was the projection. 30 00:01:55,200 --> 00:01:59,330 And the error e was what we couldn't 31 00:01:59,330 --> 00:02:03,380 get right, the part that's perpendicular to the column 32 00:02:03,380 --> 00:02:06,020 space we can't help. 33 00:02:06,020 --> 00:02:10,380 It's the solution to another projection problem. 34 00:02:10,380 --> 00:02:14,040 e is the same -- of course, it's the same over here -- 35 00:02:14,040 --> 00:02:19,560 as projecting b onto this perpendicular line, 36 00:02:19,560 --> 00:02:25,670 which is the line of all vectors, A transpose e says -- 37 00:02:25,670 --> 00:02:33,860 what that says in words is e is perpendicular to the columns 38 00:02:33,860 --> 00:02:44,110 of A. I've drawn it the best I could as a perpendicular 39 00:02:44,110 --> 00:02:46,480 to the columns. 40 00:02:46,480 --> 00:02:48,920 So that we have a plane of columns -- 41 00:02:48,920 --> 00:02:51,790 this is like a three by two matrix. 42 00:02:51,790 --> 00:02:58,030 We have a plane with the two columns and the perpendicular 43 00:02:58,030 --> 00:02:59,830 line. 44 00:02:59,830 --> 00:03:06,315 Then just near the end, I said that it would be great to have 45 00:03:06,315 --> 00:03:10,240 -- this model would be almost all we need except it should 46 00:03:10,240 --> 00:03:18,160 have one more matrix and I called that matrix C. 47 00:03:18,160 --> 00:03:20,990 I want just to show you quickly where that comes. 48 00:03:20,990 --> 00:03:27,000 So this is all section 7.1 of the notes, 49 00:03:27,000 --> 00:03:32,550 except that these words are not yet typed. 50 00:03:32,550 --> 00:03:35,850 So they'll -- as soon as possible, 51 00:03:35,850 --> 00:03:40,420 an updated 7.1 will include this, what I want to do. 
52 00:03:40,420 --> 00:03:42,450 Because I think it's the very best introduction 53 00:03:42,450 --> 00:03:48,260 I could give to this pair of optimization problems. 54 00:03:48,260 --> 00:03:51,040 You might think, who needs optimization? 55 00:03:54,750 --> 00:04:00,000 Your main activity might be solving differential equations. 56 00:04:00,000 --> 00:04:05,570 So, can I just take a time out here because I happened to see 57 00:04:05,570 --> 00:04:12,280 the homework, upcoming -- I think it's 16.930, 58 00:04:12,280 --> 00:04:14,240 so it's a course a little bit like this, 59 00:04:14,240 --> 00:04:19,040 only the first word in the course title is "Advanced," 60 00:04:19,040 --> 00:04:22,310 and our first word is "Introduction," but, of course, 61 00:04:22,310 --> 00:04:27,110 it's the same course or same stuff. 62 00:04:27,110 --> 00:04:28,940 This is the homework that I don't 63 00:04:28,940 --> 00:04:32,630 think has even been assigned, even been handed out yet. 64 00:04:32,630 --> 00:04:40,720 But I just thought it's a great example of an applied -- 65 00:04:40,720 --> 00:04:46,410 so that you see where optimization appears 66 00:04:46,410 --> 00:04:50,130 and what's involved. 67 00:04:50,130 --> 00:04:53,330 So differential equations are involved, 68 00:04:53,330 --> 00:04:57,350 but also you've got something that you want to optimize. 69 00:04:57,350 --> 00:05:02,670 So I just wrote it on this board and I simplified it 70 00:05:02,670 --> 00:05:07,760 a little over the homework that they'll actually get. 71 00:05:07,760 --> 00:05:09,480 So here's the problem. 72 00:05:09,480 --> 00:05:12,670 We want a certain distribution of heat. 73 00:05:12,670 --> 00:05:16,410 So I could draw a picture. 74 00:05:16,410 --> 00:05:20,260 We want a heat distribution, for whatever reason, 75 00:05:20,260 --> 00:05:27,240 that maybe goes like this over the interval 0 to 1. 
76 00:05:27,240 --> 00:05:31,970 And what do we have at our disposal 77 00:05:31,970 --> 00:05:34,500 to get the heat to be that way? 78 00:05:34,500 --> 00:05:36,950 Well, we've got sources of heat. 79 00:05:36,950 --> 00:05:42,180 But we don't have a continuous source, 80 00:05:42,180 --> 00:05:52,640 we only have n parameters to play with. 81 00:05:52,640 --> 00:05:55,000 I mean, right away, you recognize an optimization 82 00:05:55,000 --> 00:05:55,770 problem. 83 00:05:55,770 --> 00:05:59,430 We're trying to get this function here, u_naught of x. 84 00:06:02,660 --> 00:06:04,220 We're trying to match a function, 85 00:06:04,220 --> 00:06:07,550 but we've only got n parameters. 86 00:06:07,550 --> 00:06:12,180 Those will be the right-hand side. 87 00:06:12,180 --> 00:06:15,460 So what we're allowed to choose, we can put in little space 88 00:06:15,460 --> 00:06:20,860 heaters, and we can turn them to the temperatures we want to -- 89 00:06:20,860 --> 00:06:23,900 temperatures s_1, s_2, s_3. 90 00:06:23,900 --> 00:06:26,290 That was probably a stupid choice 91 00:06:26,290 --> 00:06:30,080 to put an s_4 down there because I don't even know 92 00:06:30,080 --> 00:06:32,940 if negative heat is allowed. 93 00:06:32,940 --> 00:06:35,910 Anyway, we wouldn't want it if we're 94 00:06:35,910 --> 00:06:38,610 aiming for that distribution. 95 00:06:38,610 --> 00:06:41,280 Now you understand the s's are not supposed 96 00:06:41,280 --> 00:06:42,310 to match that u_naught. 97 00:06:42,310 --> 00:06:46,560 The s's are the sources of heat, and the u's 98 00:06:46,560 --> 00:06:50,440 are the outputs, the distribution, 99 00:06:50,440 --> 00:06:53,650 and they are controlled by a differential equation. 100 00:06:56,520 --> 00:06:58,670 What we control is the right-hand side 101 00:06:58,670 --> 00:07:03,490 of the equation, but we only control n parameters. 
102 00:07:03,490 --> 00:07:05,220 Then we have to solve the equation 103 00:07:05,220 --> 00:07:07,900 and find out what distribution that gives. 104 00:07:07,900 --> 00:07:11,610 So that's all like differential equations, 105 00:07:11,610 --> 00:07:13,950 we know how to do it. 106 00:07:13,950 --> 00:07:18,680 It's one-dimensional in this example, so straightforward. 107 00:07:18,680 --> 00:07:21,030 But now comes the optimization part. 108 00:07:21,030 --> 00:07:25,610 We take this result and we compare it 109 00:07:25,610 --> 00:07:29,690 with the desired result. So the actual result 110 00:07:29,690 --> 00:07:34,250 from some source of heat might be something like that. 111 00:07:38,380 --> 00:07:48,970 We want this u of x, the heat distribution 112 00:07:48,970 --> 00:07:51,450 that we're actually producing, to be as close 113 00:07:51,450 --> 00:07:53,760 as possible to u_naught. 114 00:07:53,760 --> 00:07:56,220 Close could mean different things, 115 00:07:56,220 --> 00:08:00,990 but if we measure closeness in this integral square, 116 00:08:00,990 --> 00:08:05,284 mean square sense, then we're going to have a nice problem. 117 00:08:05,284 --> 00:08:06,950 In fact, we're going to have a linear -- 118 00:08:06,950 --> 00:08:10,080 everything's linear right now. 119 00:08:10,080 --> 00:08:12,310 Well, everything's quadratic here, 120 00:08:12,310 --> 00:08:16,710 so when we minimize it's going to give us a linear equation 121 00:08:16,710 --> 00:08:18,740 for the s's. 122 00:08:18,740 --> 00:08:23,980 I just think that's a good model problem. 123 00:08:23,980 --> 00:08:26,290 A good model of what we might do. 124 00:08:26,290 --> 00:08:32,810 I think actually in the problem to be assigned -- well, 125 00:08:32,810 --> 00:08:37,500 it's not a big deal for us -- there is convection as well. 126 00:08:37,500 --> 00:08:41,470 So this is a pure diffusion problem, just 127 00:08:41,470 --> 00:08:42,780 the second derivative. 
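[Editor's sketch] The heat-source problem described above can be tried numerically. This is only a minimal illustration under stated assumptions, not the homework's actual setup: the boundary conditions u(0) = u(1) = 0, the Gaussian heater shapes and their positions, the grid size, and the target u_naught = sin(pi x) are all hypothetical choices made for the example. It shows the two stages from the lecture: the analysis problem (solve the diffusion equation for each heater) and the optimization problem (a linear equation for the s's).

```python
import numpy as np

N = 99                       # number of interior grid points (assumed)
h = 1.0 / (N + 1)
x = np.linspace(h, 1.0 - h, N)

# Second-difference matrix for -u'' with u(0) = u(1) = 0 (assumed boundary conditions)
K = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2

# Hypothetical heaters: n = 4 narrow Gaussian bumps at assumed positions
centers = np.array([0.2, 0.4, 0.6, 0.8])
Phi = np.exp(-((x[:, None] - centers[None, :]) / 0.05) ** 2)   # N by n

# Analysis problem: column j of G is the temperature from heater j at unit strength
G = np.linalg.solve(K, Phi)

# Hypothetical desired distribution u_naught(x)
u0 = np.sin(np.pi * x)

# Optimization problem: minimize ||G s - u0||^2, which gives G^T G s = G^T u0
s = np.linalg.solve(G.T @ G, G.T @ u0)
u = G @ s
residual = u - u0

print("heater strengths s:", s)
print("discrete mean-square error:", h * residual @ residual)
```

Because u depends linearly on s, the squared error is quadratic in s and one linear solve gives the best heater settings; with a convection term the matrix K would no longer be symmetric, but the same two-stage structure would apply.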
128 00:08:42,780 --> 00:08:44,500 We know very well that if there was 129 00:08:44,500 --> 00:08:48,820 a first derivative in there, the stuff's convecting, 130 00:08:48,820 --> 00:08:52,670 passing through the region. 131 00:08:52,670 --> 00:09:02,140 So if I put on a convective term, c*du/dx, 132 00:09:02,140 --> 00:09:04,110 how does that change the problem? 133 00:09:04,110 --> 00:09:09,280 Not in a big, big major way, but one thing we can guess 134 00:09:09,280 --> 00:09:14,940 is that now that is not a symmetric term, right. 135 00:09:14,940 --> 00:09:18,370 We've seen the difference between first differences 136 00:09:18,370 --> 00:09:20,402 and second differences, first derivatives 137 00:09:20,402 --> 00:09:21,360 and second derivatives. 138 00:09:21,360 --> 00:09:28,790 So now it's a little trickier and it's not symmetric. 139 00:09:28,790 --> 00:09:37,370 So somehow there will be a primal problem, this one, 140 00:09:37,370 --> 00:09:42,580 and there will be an adjoint problem, dual problem, 141 00:09:42,580 --> 00:09:46,660 perpendicular problem, whatever name you want to give it, 142 00:09:46,660 --> 00:09:51,790 just as there is in our very first model over there. 143 00:09:51,790 --> 00:09:59,030 So that's my little quick look to kind of put 144 00:09:59,030 --> 00:10:04,450 on the board one example that I didn't invent, 145 00:10:04,450 --> 00:10:11,060 that came from applications and is sort of typical 146 00:10:11,060 --> 00:10:12,550 of what you have to do. 147 00:10:12,550 --> 00:10:18,190 You control an input, you get an output, 148 00:10:18,190 --> 00:10:20,100 so that's the analysis problem. 149 00:10:20,100 --> 00:10:21,530 Find the output. 150 00:10:21,530 --> 00:10:24,530 But then comes the optimization problem -- 151 00:10:24,530 --> 00:10:27,880 make that output close to something that you wish. 152 00:10:31,730 --> 00:10:34,470 So what's the typical algorithm going to do? 
153 00:10:34,470 --> 00:10:38,550 It's going to make a choice of s, 154 00:10:38,550 --> 00:10:43,520 it's going to solve the analysis problem for the u, 155 00:10:43,520 --> 00:10:46,300 it's going to look and see what the error is, 156 00:10:46,300 --> 00:10:51,160 it's going to figure out probably the gradient somehow. 157 00:10:51,160 --> 00:10:55,190 What's the steepest way to make it closer? 158 00:10:55,190 --> 00:10:58,410 That's going to lead us to a change of s. 159 00:10:58,410 --> 00:11:03,450 We use a change of s, the new s, solve that, and iterate. 160 00:11:03,450 --> 00:11:06,110 That would be a typical algorithm. 161 00:11:06,110 --> 00:11:09,750 We might be able to shortcut it in a model problem like this. 162 00:11:09,750 --> 00:11:16,310 But that's totally the typical optimization idea, 163 00:11:16,310 --> 00:11:24,100 is an analysis problem and then figure out a gradient 164 00:11:24,100 --> 00:11:27,510 to get a better source; back to the analysis problem 165 00:11:27,510 --> 00:11:29,040 with the new source. 166 00:11:32,390 --> 00:11:35,000 What the algebra, the math, has to do 167 00:11:35,000 --> 00:11:38,850 is those two steps of figuring out OK, 168 00:11:38,850 --> 00:11:43,760 how do we improve the error, how do we 169 00:11:43,760 --> 00:11:46,660 reduce the error, what's the steepest direction? 170 00:11:46,660 --> 00:11:50,690 Somehow we've got to compute a derivative. 171 00:11:50,690 --> 00:11:54,310 Actually, that's what this month is about. 172 00:11:54,310 --> 00:11:56,930 Derivatives that are not just like the derivative 173 00:11:56,930 --> 00:12:00,990 of x cubed or something. 174 00:12:00,990 --> 00:12:03,260 I often wondered how many presidents 175 00:12:03,260 --> 00:12:08,340 could take the derivative of x cubed and I'm not sure. 176 00:12:12,170 --> 00:12:14,360 Anybody occur to you who you could 177 00:12:14,360 --> 00:12:18,080 count on being able to take the derivative of x cubed? 
178 00:12:18,080 --> 00:12:20,080 I don't think the current president 179 00:12:20,080 --> 00:12:21,980 would know what it meant. 180 00:12:21,980 --> 00:12:25,180 But I think Carter could have done it, because he 181 00:12:25,180 --> 00:12:27,170 went to the Naval Academy. 182 00:12:27,170 --> 00:12:30,180 Jefferson was probably -- he knew everything. 183 00:12:32,720 --> 00:12:33,270 I don't know. 184 00:12:33,270 --> 00:12:37,280 Anybody else has another candidate they can tell me. 185 00:12:37,280 --> 00:12:43,080 So there is our problem -- finding derivatives that would 186 00:12:43,080 --> 00:12:46,310 be definitely beyond the capacity of the White House. 187 00:12:52,140 --> 00:12:57,360 Now I want to stay with this model 188 00:12:57,360 --> 00:13:05,440 a little more because it's the perfect model. 189 00:13:05,440 --> 00:13:09,710 So this was like model one, unweighted ordinary least 190 00:13:09,710 --> 00:13:13,830 squares, and it produced the identity matrix in there. 191 00:13:13,830 --> 00:13:17,800 And I mentioned last time what I want to do, 192 00:13:17,800 --> 00:13:22,050 the more general model has another symmetric 193 00:13:22,050 --> 00:13:27,580 positive-definite matrix in there, but not necessarily I, 194 00:13:27,580 --> 00:13:30,640 and it comes from weighted least squares. 195 00:13:30,640 --> 00:13:33,070 So that's what I'm going to talk about. 196 00:13:33,070 --> 00:13:35,050 So what are weighted least squares? 197 00:13:35,050 --> 00:13:41,460 Well, you've got these measurements and you think they 198 00:13:41,460 --> 00:13:46,400 maybe don't all -- maybe they're not independent, 199 00:13:46,400 --> 00:13:49,320 maybe they're not equally reliable. 200 00:13:49,320 --> 00:13:52,930 So you weight them by how reliable they are. 201 00:13:52,930 --> 00:13:56,080 A more reliable one you would put a heavier weight 202 00:13:56,080 --> 00:14:00,410 w on because you want that to be more important. 
203 00:14:00,410 --> 00:14:05,000 So, you change to this problem. 204 00:14:05,000 --> 00:14:06,730 But it looks practically the same. 205 00:14:06,730 --> 00:14:12,400 The only difference is A has become W*A, b has become W*b. 206 00:14:12,400 --> 00:14:17,090 So the equations will be the same, but A is now W*A, 207 00:14:17,090 --> 00:14:22,140 b is now W*b, and those are the equations. 208 00:14:22,140 --> 00:14:25,270 I guess I should call, just to make clear that u is 209 00:14:25,270 --> 00:14:30,560 a different u, that the best u now depends on the choice 210 00:14:30,560 --> 00:14:34,130 of weights, I should really be calling that u -- 211 00:14:34,130 --> 00:14:37,940 somehow indicate that it depends on the weight. 212 00:14:37,940 --> 00:14:42,890 Now the key nice thing is that if I write this out as A 213 00:14:42,890 --> 00:14:49,080 transpose -- can you just write this out? 214 00:14:49,080 --> 00:14:53,500 You get the A transpose, and the A is over here. 215 00:14:53,500 --> 00:14:56,460 But what's in the middle? 216 00:14:56,460 --> 00:15:00,810 What's this matrix C -- I jumped the gun and called that matrix 217 00:15:00,810 --> 00:15:05,920 in the middle C, but how is it connected to W? 218 00:15:05,920 --> 00:15:12,390 You can see it here as C is W transpose W, right? 219 00:15:12,390 --> 00:15:14,180 It's just sitting there in the middle. 220 00:15:14,180 --> 00:15:18,680 That's great that the fact that the combination W transpose 221 00:15:18,680 --> 00:15:21,160 W is all you need to know. 222 00:15:21,160 --> 00:15:27,280 So we can forget W in favor of -- I'll just put it here -- 223 00:15:27,280 --> 00:15:32,630 W transpose W is now given the name C, 224 00:15:32,630 --> 00:15:36,200 and this matrix is symmetric positive definite. 225 00:15:40,440 --> 00:15:44,730 So it's a great matrix and it's exactly the one we want. 226 00:15:44,730 --> 00:15:48,850 And this is exactly the equation we want. 
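[Editor's sketch] A small numerical check of the statement that only the combination C = W transpose W matters. The 3 by 2 matrix A, the measurements b, and the diagonal weights W below are made-up values for illustration, not taken from the lecture. The sketch solves the weighted normal equations A^T C A u = A^T C b and compares with an ordinary least squares solve applied to W*A and W*b.

```python
import numpy as np

# Hypothetical data: fit a line to three measurements (values are assumptions)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 4.0])

# Hypothetical weights: the third measurement is trusted most
W = np.diag([1.0, 2.0, 3.0])
C = W.T @ W                      # symmetric positive definite

# Weighted normal equations: A^T C A u_W = A^T C b
u_W = np.linalg.solve(A.T @ C @ A, A.T @ C @ b)

# Same problem seen as ordinary least squares with A -> W A, b -> W b
u_check, *_ = np.linalg.lstsq(W @ A, W @ b, rcond=None)

print("u_W from normal equations:", u_W)
print("u_W from least squares on (W A, W b):", u_check)
```

Both routes give the same u_W, which is why W can be dropped in favor of C once the equations are written down.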
227 00:15:48,850 --> 00:15:53,480 So if I go back to writing it in this -- you know, 228 00:15:53,480 --> 00:15:56,210 this was the equation. 229 00:15:56,210 --> 00:16:01,510 You remember the point that we could go directly 230 00:16:01,510 --> 00:16:05,250 to one equation for the best u, or we 231 00:16:05,250 --> 00:16:13,880 could keep our options open and have two equations that 232 00:16:13,880 --> 00:16:15,530 led to the same one. 233 00:16:15,530 --> 00:16:17,410 They're totally equivalent. 234 00:16:17,410 --> 00:16:22,530 But the two equations will give us not only u, but the error, 235 00:16:22,530 --> 00:16:25,960 b minus A*u, as another unknown. 236 00:16:25,960 --> 00:16:27,230 That's what we did up there. 237 00:16:27,230 --> 00:16:32,340 So we had two equations, and if we eliminated e, 238 00:16:32,340 --> 00:16:34,500 we got to this. 239 00:16:34,500 --> 00:16:36,880 Now I want to have two equations, 240 00:16:36,880 --> 00:16:40,530 and if I eliminate e, I get to that. 241 00:16:40,530 --> 00:16:44,450 So let me see what those would be. 242 00:16:44,450 --> 00:16:47,000 Well, here's one of them. 243 00:16:47,000 --> 00:16:53,040 A transpose C*e is zero -- you see, 244 00:16:53,040 --> 00:16:55,150 that's the only difference, really. 245 00:16:55,150 --> 00:17:01,260 That the weighted normal equation, 246 00:17:01,260 --> 00:17:05,480 I just took A transpose C, and then I took the b minus A*u 247 00:17:05,480 --> 00:17:08,510 together, and this is what I'm calling e. 248 00:17:08,510 --> 00:17:15,170 So one way to do it is just the new guy is just now e plus 249 00:17:15,170 --> 00:17:19,630 A*u_W is still b. 250 00:17:19,630 --> 00:17:25,920 But now A transpose C*e is zero. 251 00:17:25,920 --> 00:17:26,570 Good deal. 252 00:17:26,570 --> 00:17:29,040 I mean that's quite nice. 253 00:17:31,950 --> 00:17:34,720 It isn't absolutely perfect though, 254 00:17:34,720 --> 00:17:36,380 because I've lost the symmetry. 
255 00:17:36,380 --> 00:17:39,920 I'm putting a C in there and I don't really want it. 256 00:17:39,920 --> 00:17:42,720 I want a C but I don't want it there. 257 00:17:42,720 --> 00:17:46,970 So, I just make a small change. 258 00:17:46,970 --> 00:17:49,770 I'm going to introduce a new unknown that I'll 259 00:17:49,770 --> 00:17:55,630 call little w, apologies for the fact that it's also a w, 260 00:17:55,630 --> 00:17:58,410 it just happened to fit. 261 00:17:58,410 --> 00:17:59,380 That'll be the C*e. 262 00:18:02,730 --> 00:18:07,960 So I'm just calling this a new name here. 263 00:18:07,960 --> 00:18:11,800 So that now my e -- of course, if I just invert, 264 00:18:11,800 --> 00:18:14,970 e is C inverse w. 265 00:18:14,970 --> 00:18:19,690 So now I'm just going to write these equations with w instead 266 00:18:19,690 --> 00:18:21,580 of e because I like it better. 267 00:18:21,580 --> 00:18:29,000 So this equation is now A transpose w equals zero. 268 00:18:29,000 --> 00:18:35,030 This equation, the e has disappeared in favor of C 269 00:18:35,030 --> 00:18:42,100 inverse w plus A*u equals b. 270 00:18:42,100 --> 00:18:45,300 So that's the system that I really like. 271 00:18:45,300 --> 00:18:48,550 That's the saddle point system, the Kuhn-Tucker system, 272 00:18:48,550 --> 00:18:51,080 the primal-dual system, the fundamental system 273 00:18:51,080 --> 00:18:59,230 of the whole subject in this linear, matrix case. 274 00:18:59,230 --> 00:19:02,430 We haven't got functions, we got vectors, 275 00:19:02,430 --> 00:19:09,730 and we've got symmetry, and we've got linearity. 276 00:19:09,730 --> 00:19:15,930 And we've got a saddle point matrix that's now the S -- 277 00:19:15,930 --> 00:19:19,060 well, let me just change it here. 278 00:19:19,060 --> 00:19:23,430 It's just changed to this, C inverse. 279 00:19:23,430 --> 00:19:28,090 That's the fundamental matrix of the whole subject. 280 00:19:28,090 --> 00:19:30,740 So, S is the saddle point matrix. 
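[Editor's sketch] The two equations above, C inverse w + A*u = b and A transpose w = 0, can be assembled into one symmetric block system and solved directly. The block layout S = [[C^-1, A], [A^T, 0]] follows from those two equations; the numerical values of A, b, and C below are made-up assumptions for illustration. The sketch checks that eliminating w recovers the weighted normal equations.

```python
import numpy as np

# Hypothetical data (assumed values, 3 by 2 as in the lecture's picture)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 4.0])
C = np.diag([1.0, 4.0, 9.0])     # assumed symmetric positive definite weights

m, n = A.shape

# Saddle point matrix: first block row is C^-1 w + A u = b, second is A^T w = 0
S = np.block([[np.linalg.inv(C), A],
              [A.T, np.zeros((n, n))]])
rhs = np.concatenate([b, np.zeros(n)])

sol = np.linalg.solve(S, rhs)
w, u = sol[:m], sol[m:]

# Eliminating w = C (b - A u) should give back A^T C A u = A^T C b
u_normal = np.linalg.solve(A.T @ C @ A, A.T @ C @ b)

print("u from saddle point system:", u)
print("u from weighted normal equations:", u_normal)
print("w = C e:", w)
```

S is symmetric but not positive definite (the zero block sees to that), which is what makes it a saddle point rather than a minimum problem in the combined unknowns (w, u).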
281 00:19:38,300 --> 00:19:41,230 So I wanted to get that far. 282 00:19:41,230 --> 00:19:46,470 You see that the whole picture was elementary linear algebra. 283 00:19:46,470 --> 00:19:52,780 Let me come back to the elementary figure that 284 00:19:52,780 --> 00:19:55,950 illustrates what's the geometry. 285 00:19:55,950 --> 00:20:00,790 How was the geometry affected by introducing 286 00:20:00,790 --> 00:20:06,460 this guy W, this weighting matrix, or the C equal W 287 00:20:06,460 --> 00:20:07,320 transpose W? 288 00:20:07,320 --> 00:20:12,020 Well, here was the picture from last time. 289 00:20:12,020 --> 00:20:14,390 A right-angled picture. 290 00:20:14,390 --> 00:20:16,810 This was a right triangle. 291 00:20:16,810 --> 00:20:22,050 This line was perpendicular to that plane, 292 00:20:22,050 --> 00:20:23,500 but not anymore now. 293 00:20:23,500 --> 00:20:31,440 The second one, it's A transpose C*e is zero. 294 00:20:31,440 --> 00:20:36,380 That's still a line, but it's not any longer perpendicular 295 00:20:36,380 --> 00:20:41,010 to the -- this is still all the A*u's. 296 00:20:41,010 --> 00:20:44,600 This plane is still the column space of all the A*u's. 297 00:20:47,380 --> 00:20:49,940 We have the same b. 298 00:20:49,940 --> 00:20:56,850 But you see the problem is it lost its 90 degree angle. 299 00:20:56,850 --> 00:21:02,690 Because the projection is now projection on a -- 300 00:21:02,690 --> 00:21:05,990 it's now an oblique projection, it's slanted. 301 00:21:05,990 --> 00:21:12,360 This is the best A*u -- and if I occasionally keep up-to-date 302 00:21:12,360 --> 00:21:16,190 I'll put that u_W there. 303 00:21:16,190 --> 00:21:19,800 There's still an error e, and this is still a parallelogram 304 00:21:19,800 --> 00:21:21,340 but it's not a rectangle anymore. 305 00:21:25,350 --> 00:21:27,620 Forgive my enthusiasm. 
306 00:21:27,620 --> 00:21:32,400 I'm sort of happy that the picture and the algebra both 307 00:21:32,400 --> 00:21:36,230 come out so neatly. 308 00:21:36,230 --> 00:21:40,170 I totally agree that at this point 309 00:21:40,170 --> 00:21:46,070 I'm asking you to follow a model without giving you 310 00:21:46,070 --> 00:21:48,250 an application, and that's one reason 311 00:21:48,250 --> 00:21:53,910 I threw in this mention of a specific application that 312 00:21:53,910 --> 00:21:55,840 came from somewhere else. 313 00:21:55,840 --> 00:21:59,850 But this is the picture there. 314 00:21:59,850 --> 00:22:07,280 So I'll say one more word about the picture. 315 00:22:07,280 --> 00:22:11,790 I said that we lost the right angle. 316 00:22:11,790 --> 00:22:16,110 We lost perpendicularity, and, of course, 317 00:22:16,110 --> 00:22:18,360 literally speaking we did. 318 00:22:18,360 --> 00:22:25,870 This is no longer -- this is not a right angle anymore. 319 00:22:25,870 --> 00:22:27,870 This is not a right angle anymore. 320 00:22:34,000 --> 00:22:41,220 It's not a right angle in the usual meaning of right angles. 321 00:22:41,220 --> 00:22:48,540 It is a right angle in the inner product that's 322 00:22:48,540 --> 00:22:50,420 associated with C. In other words, 323 00:22:50,420 --> 00:22:55,870 right angles here mean x transpose y equals zero. 324 00:22:55,870 --> 00:22:58,180 That's the idea of a right angle, 325 00:22:58,180 --> 00:23:02,210 right? x perpendicular to y, and they 326 00:23:02,210 --> 00:23:03,720 have different letters here. 327 00:23:03,720 --> 00:23:10,870 Now over here I still have perpendicular, but I don't -- 328 00:23:10,870 --> 00:23:14,060 this is not the right inner product anymore. 329 00:23:14,060 --> 00:23:19,050 It should be a weighted inner product, weighted with this C 330 00:23:19,050 --> 00:23:20,600 in the middle. 331 00:23:20,600 --> 00:23:26,660 So that's really what I mean by C-orthogonal. 
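[Editor's sketch] A quick numerical illustration of the lost right angle. With made-up values for A, b, and C (assumptions, not from the lecture), the weighted error e = b - A*u_W is not perpendicular to the columns of A in the usual inner product x^T y, but it is perpendicular in the weighted inner product x^T C y.

```python
import numpy as np

# Hypothetical data (assumed values)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 4.0])
C = np.diag([1.0, 4.0, 9.0])     # assumed weighting matrix, C = W^T W

# Weighted solution and its error
u_W = np.linalg.solve(A.T @ C @ A, A.T @ C @ b)
e = b - A @ u_W

# Ordinary inner products of e with the columns of A: not zero in general
ordinary = A.T @ e
# C-weighted inner products of e with the columns of A: zero
weighted = A.T @ C @ e

print("A^T e   =", ordinary)
print("A^T C e =", weighted)
```

Setting C to the identity in this sketch makes the two inner products coincide, which is the right-angled picture of ordinary least squares.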
332 00:23:26,660 --> 00:23:28,560 Maybe I'll put those words down. 333 00:23:28,560 --> 00:23:32,870 So this weighted thing is -- if I can squeeze it, 334 00:23:32,870 --> 00:23:39,750 I doubt if I can -- is C-orthogonality, 335 00:23:39,750 --> 00:23:41,670 weighted orthogonality. 336 00:23:41,670 --> 00:23:44,250 So let me circle the whole thing. 337 00:23:44,250 --> 00:23:50,650 The C is the W transpose W. Just to say that we aren't giving up 338 00:23:50,650 --> 00:23:54,450 on dot products and perpendiculars 339 00:23:54,450 --> 00:23:58,120 and good equations, we're just changing them 340 00:23:58,120 --> 00:24:04,240 by inserting C every time in the inner product. 341 00:24:04,240 --> 00:24:09,010 What it means is that this is the natural inner product 342 00:24:09,010 --> 00:24:12,590 for the particular problem. 343 00:24:12,590 --> 00:24:16,230 This is the natural inner product for Euclid, right? 344 00:24:16,230 --> 00:24:18,850 But then from some specific application 345 00:24:18,850 --> 00:24:21,990 like this one or a million others, 346 00:24:21,990 --> 00:24:25,340 they have their own natural inner product, 347 00:24:25,340 --> 00:24:27,900 and the inner product for that particular problem 348 00:24:27,900 --> 00:24:33,440 would be one of these guys with some kind of a matrix C showing 349 00:24:33,440 --> 00:24:35,410 up. 350 00:24:35,410 --> 00:24:40,470 So least squares and weighted least squares, 351 00:24:40,470 --> 00:24:44,600 that's my example one. 352 00:24:44,600 --> 00:24:51,200 Now I'd like to give a second example, a more mechanic -- 353 00:24:51,200 --> 00:24:53,280 will come closer to mechanics. 354 00:24:53,280 --> 00:24:59,030 Because this is least squares, statistics, algebra. 355 00:24:59,030 --> 00:25:03,400 But let me put on the middle board an application out 356 00:25:03,400 --> 00:25:05,170 of mechanics. 
357 00:25:05,170 --> 00:25:12,220 It will be, say, I'll make it small 358 00:25:12,220 --> 00:25:19,070 and just a couple of springs with a mass between them 359 00:25:19,070 --> 00:25:23,750 and fixed at both ends. 360 00:25:26,900 --> 00:25:32,600 So this spring extends by some amount. 361 00:25:32,600 --> 00:25:35,380 This spring extends by some amount. 362 00:25:35,380 --> 00:25:37,580 There's a force on this mass. 363 00:25:37,580 --> 00:25:42,130 So there's a force on this mass, maybe just gravity, f. 364 00:25:46,270 --> 00:25:51,390 That's the external force from the mass that's 365 00:25:51,390 --> 00:25:54,200 here in between the springs. 366 00:25:54,200 --> 00:26:02,090 Then, also acting on that mass are spring forces. 367 00:26:02,090 --> 00:26:05,130 This spring is pulling it up, right? 368 00:26:05,130 --> 00:26:07,110 There's a spring force, w_1. 369 00:26:10,700 --> 00:26:14,920 Here, do you want me to draw -- really, 370 00:26:14,920 --> 00:26:16,670 which direction is this? 371 00:26:16,670 --> 00:26:26,810 I'm going to draw it this way just to show the w_2 drawn that 372 00:26:26,810 --> 00:26:33,860 way would actually be negative, because I think that this 373 00:26:33,860 --> 00:26:35,790 spring would get compressed, right -- 374 00:26:35,790 --> 00:26:38,090 this mass is pushing it down. 375 00:26:38,090 --> 00:26:40,310 This spring would be under compression. 376 00:26:40,310 --> 00:26:44,040 It would be pushing the mass back up. 377 00:26:44,040 --> 00:26:48,440 So the w_2 in that picture would be negative. 378 00:26:48,440 --> 00:26:53,450 So this w_1 will actually be positive and go 379 00:26:53,450 --> 00:26:55,240 the way the arrow is showing. 380 00:26:55,240 --> 00:26:57,830 This w_2 would actually be negative 381 00:26:57,830 --> 00:27:01,400 and go not the way the arrow is showing. 382 00:27:01,400 --> 00:27:10,850 But what's the -- oh, what equations have we got then? 
383 00:27:10,850 --> 00:27:13,070 What's our optimization problem? 384 00:27:13,070 --> 00:27:14,470 Well actually, we have a choice. 385 00:27:14,470 --> 00:27:20,270 We could work with equations, period. 386 00:27:20,270 --> 00:27:22,870 Actually, one of the equations is pretty obvious. 387 00:27:22,870 --> 00:27:24,700 This mass is in equilibrium. 388 00:27:24,700 --> 00:27:30,940 So w_1 is equal to w_2 plus f. 389 00:27:30,940 --> 00:27:33,130 So it doesn't move. 390 00:27:33,130 --> 00:27:37,440 Or you might prefer me to write it, I would rather write it, 391 00:27:37,440 --> 00:27:44,100 with w's on one side and source terms on the other. 392 00:27:44,100 --> 00:27:53,410 So that's the equilibrium equation. 393 00:27:56,900 --> 00:28:01,620 So what decides-- we want to know what these w's are, 394 00:28:01,620 --> 00:28:03,920 and these springs are extended. 395 00:28:03,920 --> 00:28:08,290 So that first spring is extended by an amount e_1, 396 00:28:08,290 --> 00:28:12,230 and this second spring is extended by an amount e_2. 397 00:28:12,230 --> 00:28:16,310 Stretched or compressed. 398 00:28:16,310 --> 00:28:19,770 e_1 is probably going to be positive here -- 399 00:28:19,770 --> 00:28:21,820 that spring's going to be stretched. 400 00:28:21,820 --> 00:28:24,200 This spring is going to be compressed, so that e_2 is 401 00:28:24,200 --> 00:28:27,710 probably going to be negative. 402 00:28:27,710 --> 00:28:31,620 What's the mechanics here? 403 00:28:31,620 --> 00:28:34,320 Well, I can state it two ways, as I said. 404 00:28:34,320 --> 00:28:39,120 I can state the mechanics in terms of equations -- 405 00:28:39,120 --> 00:28:43,810 force and stretching, elastic constant. 406 00:28:43,810 --> 00:28:48,990 That's how we did it in the first semester, 18.085. 407 00:28:48,990 --> 00:28:55,840 It's a little clearer because simple equation, Hooke's law. 408 00:28:55,840 --> 00:29:02,870 Or I can state the problem as a minimization of energy. 
409 00:29:02,870 --> 00:29:06,520 That's what I want to do today in 18.06. 410 00:29:06,520 --> 00:29:09,320 So I want to minimize -- 18.086. 411 00:29:09,320 --> 00:29:12,980 I want to minimize the total energy, 412 00:29:12,980 --> 00:29:20,920 the energy in these springs. 413 00:29:20,920 --> 00:29:34,650 Subject to the constraint -- this constraint, equilibrium. 414 00:29:37,410 --> 00:29:40,480 So that's the optimization statement 415 00:29:40,480 --> 00:29:46,610 of the mechanical problem. 416 00:29:50,940 --> 00:29:54,990 Well, I guess all that remains is -- well, I guess, 417 00:29:54,990 --> 00:29:55,910 what remains? 418 00:29:55,910 --> 00:30:01,670 First of all, I need an expression for this energy. 419 00:30:01,670 --> 00:30:04,950 If the springs are governed by Hooke's law, 420 00:30:04,950 --> 00:30:08,400 then it'll be pretty simple. 421 00:30:08,400 --> 00:30:12,050 If they're real springs that don't quite obey Hooke's law 422 00:30:12,050 --> 00:30:17,010 then there'll be non-- there'll be fourth-degree, 423 00:30:17,010 --> 00:30:21,190 sixth-degree, whatever, terms in the energy. 424 00:30:21,190 --> 00:30:25,110 It's like the energy in the first spring plus the energy 425 00:30:25,110 --> 00:30:27,970 in the second spring, anyway. 426 00:30:27,970 --> 00:30:33,960 E in the first spring, and the energy in the second spring, 427 00:30:33,960 --> 00:30:36,520 and, of course, the two springs could have different spring 428 00:30:36,520 --> 00:30:37,020 constants. 429 00:30:40,280 --> 00:30:48,040 So those E's -- I'll make life easy in solving, 430 00:30:48,040 --> 00:30:54,280 if I choose Hooke's law, if I choose the energy to be just 431 00:30:54,280 --> 00:30:54,930 a square. 432 00:30:57,700 --> 00:31:03,500 The constraint is linear, so that'll be the model problem. 433 00:31:03,500 --> 00:31:09,850 And actually, that'll be the kind of problem I've got here. 
434 00:31:14,120 --> 00:31:19,740 One reason for introducing a new example was to get some 435 00:31:19,740 --> 00:31:29,700 mechanics into the lecture, but also to get a problem where 436 00:31:29,700 --> 00:31:34,820 we're doing a minimization but we've got a condition 437 00:31:34,820 --> 00:31:36,420 on the w's. 438 00:31:36,420 --> 00:31:42,090 And the question is how do you find a minimum? 439 00:31:42,090 --> 00:31:45,170 You can't just set derivatives of the energy to zero. 440 00:31:45,170 --> 00:31:48,520 You would discover w_1 equal w_2 equals zero. 441 00:31:48,520 --> 00:31:50,770 Nothing happening. 442 00:31:50,770 --> 00:31:52,680 That would be the minimum. 443 00:31:52,680 --> 00:31:57,100 But that minimum is ruled out, that solution 444 00:31:57,100 --> 00:32:00,340 is ruled out because we have this constraint. 445 00:32:00,340 --> 00:32:03,920 We've got to balance the external force. 446 00:32:03,920 --> 00:32:08,740 So this is the question and you'll maybe 447 00:32:08,740 --> 00:32:15,870 have met this question in other courses, but it's essential. 448 00:32:15,870 --> 00:32:21,690 How do you deal with minimizing when there's a constraint? 449 00:32:21,690 --> 00:32:26,520 I guess, in some way, we had it over here. 450 00:32:26,520 --> 00:32:30,450 We were minimizing something -- well, the minimum would be, 451 00:32:30,450 --> 00:32:32,480 take u equal u_naught. 452 00:32:32,480 --> 00:32:36,330 But no, there was a constraint that u 453 00:32:36,330 --> 00:32:43,240 had to satisfy a certain equilibrium equation. 454 00:32:43,240 --> 00:32:44,890 Here it was a differential equation 455 00:32:44,890 --> 00:32:47,780 so that problem is a little harder than this one 456 00:32:47,780 --> 00:32:56,690 where the equation is just discrete, one simple equation. 457 00:32:56,690 --> 00:32:58,080 So how are you going to do it? 
458 00:32:58,080 --> 00:33:05,220 Well, actually the quickest way would be -- 459 00:33:05,220 --> 00:33:08,630 that's such an easy constraint that I could say hey, 460 00:33:08,630 --> 00:33:12,040 w_2 is w_1 minus f. 461 00:33:12,040 --> 00:33:15,280 So I could just, if I wanted to really like 462 00:33:15,280 --> 00:33:18,310 shortcut this whole lecture, I could 463 00:33:18,310 --> 00:33:21,170 say well, w_2 is w_1 minus f. 464 00:33:24,690 --> 00:33:27,440 Now I've accounted for the constraint. 465 00:33:27,440 --> 00:33:30,560 I've removed w_2 from the problem. 466 00:33:30,560 --> 00:33:33,680 I have a minimization, an ordinary minimization 467 00:33:33,680 --> 00:33:35,240 with an unknown w_1. 468 00:33:35,240 --> 00:33:37,410 I take the derivative. 469 00:33:37,410 --> 00:33:40,230 I solve derivative equals zero. 470 00:33:40,230 --> 00:33:43,610 I find w_1, then I go back, I get that w_2. 471 00:33:43,610 --> 00:33:44,740 That's the fast way. 472 00:33:47,880 --> 00:33:49,450 Of course, gets the right answer. 473 00:33:49,450 --> 00:33:56,980 But there's another way that in the end turns out to be better. 474 00:33:56,980 --> 00:34:00,440 It's not necessarily better for this simple problem, 475 00:34:00,440 --> 00:34:06,520 but it's better for the general approach 476 00:34:06,520 --> 00:34:09,780 to constrained optimization. 477 00:34:09,780 --> 00:34:16,320 So I'm going to not do the simple deal of solving for w_2, 478 00:34:16,320 --> 00:34:19,680 but I'm going to keep the constraint around. 479 00:34:19,680 --> 00:34:23,900 It's the idea of Lagrange multipliers. 480 00:34:23,900 --> 00:34:30,550 You've heard those words and probably seen it happen. 481 00:34:30,550 --> 00:34:32,570 So what is Lagrange multiplier? 482 00:34:32,570 --> 00:34:34,650 What is Lagrange's idea? 483 00:34:34,650 --> 00:34:43,770 Lagrange's idea is -- he constructs a function of w, 484 00:34:43,770 --> 00:34:47,230 to work with, which is the same energy. 
485 00:34:47,230 --> 00:34:49,660 But he's going to include a multiplier. 486 00:34:49,660 --> 00:34:52,020 Now the next question is what letter shall 487 00:34:52,020 --> 00:34:57,260 I use for that multiplier, Lagrange's multiplier. 488 00:34:57,260 --> 00:35:02,840 Books on optimization often call it lambda -- 489 00:35:02,840 --> 00:35:06,560 lambda's sort of, like, for Lagrange. 490 00:35:06,560 --> 00:35:11,410 So, Lagrange obviously wasn't Greek, but anyway -- close. 491 00:35:11,410 --> 00:35:17,780 Lambda for us always means eigenvalue -- 492 00:35:17,780 --> 00:35:19,490 for me always means eigenvalue. 493 00:35:19,490 --> 00:35:21,270 So I'm reluctant to use lambda. 494 00:35:21,270 --> 00:35:30,520 And sometimes in books on economics, which is doing this 495 00:35:30,520 --> 00:35:35,740 all the time, the multiplier's called pi 496 00:35:35,740 --> 00:35:39,570 because it turns out to be a price. 497 00:35:43,200 --> 00:35:50,350 But let me use u for the Lagrange multiplier. 498 00:35:53,770 --> 00:35:56,100 So that'll be the Lagrange multiplier. 499 00:35:56,100 --> 00:35:57,420 So what do I do with it? 500 00:35:57,420 --> 00:36:04,340 I multiply this equation by u and I build it in to this. 501 00:36:04,340 --> 00:36:08,990 So this thing is this part -- I'll just copy that down there 502 00:36:08,990 --> 00:36:13,020 -- plus or minus, depending what I want, 503 00:36:13,020 --> 00:36:17,370 depending which sign I want to end up with, u, the multiplier, 504 00:36:17,370 --> 00:36:21,980 times the constraint -- w_1 minus w_2 -- 505 00:36:21,980 --> 00:36:25,210 let me put it like that. 506 00:36:25,210 --> 00:36:29,140 So the constraint is that this should be zero. 507 00:36:29,140 --> 00:36:32,170 So you can say I haven't added anything. 508 00:36:32,170 --> 00:36:34,770 I've added zero. 509 00:36:34,770 --> 00:36:40,690 But that will only be true at the end 510 00:36:40,690 --> 00:36:43,650 when I have a specific w_1 and w_2. 
511 00:36:43,650 --> 00:36:48,700 Right now what I've done is I've built in the constraint 512 00:36:48,700 --> 00:36:50,430 into the function. 513 00:36:53,070 --> 00:36:58,940 Lagrange's brilliant idea was that now I've got a function 514 00:36:58,940 --> 00:37:03,960 that I can -- whose derivatives I can take. 515 00:37:03,960 --> 00:37:05,820 I could set the derivatives to zero. 516 00:37:05,820 --> 00:37:14,420 dL/dw_1 is going to be zero. dL/dw_2 is going to be zero. 517 00:37:14,420 --> 00:37:20,310 And dL/du is going to be zero. 518 00:37:20,310 --> 00:37:24,540 So we got now three equations, three unknowns. 519 00:37:24,540 --> 00:37:26,260 Instead of going from two to one, 520 00:37:26,260 --> 00:37:29,360 we've gone from two up to three. 521 00:37:29,360 --> 00:37:33,890 But it's so much more systematic that it's the right thing 522 00:37:33,890 --> 00:37:35,470 to do. 523 00:37:35,470 --> 00:37:38,580 What is this last equation? 524 00:37:38,580 --> 00:37:42,130 What's the u derivative of this expression? 525 00:37:42,130 --> 00:37:48,260 Well, u doesn't appear here, the derivative is just w_1 -- 526 00:37:48,260 --> 00:37:53,420 this just leads us to w_1 minus w_2 minus f, 527 00:37:53,420 --> 00:37:56,000 which is the constraint that we wanted. 528 00:37:56,000 --> 00:38:05,740 So the constraint is showing up as this equation. 529 00:38:05,740 --> 00:38:09,730 Just the way the constraint showed up 530 00:38:09,730 --> 00:38:10,940 as this equation here. 531 00:38:15,480 --> 00:38:24,460 I guess what I want to say is when I wrote this down, 532 00:38:24,460 --> 00:38:27,240 when we did this first example, I 533 00:38:27,240 --> 00:38:31,820 didn't say here's the constraint. 534 00:38:31,820 --> 00:38:34,450 I mean that equation kind of came out 535 00:38:34,450 --> 00:38:38,350 from the normal equations here. 
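[Editor's sketch] The three stationarity conditions just described can be checked numerically. This is only a minimal sketch under stated assumptions: the energies are taken to be the quadratic ones, (1/2) w^2 / c, the sign in front of u is the minus choice, and the values of c1, c2, f and the helper names `lagrangian` and `partial` are made up for illustration; none of them come from the lecture.

```python
def lagrangian(w1, w2, u, c1=2.0, c2=3.0, f=10.0):
    """L = E1(w1) + E2(w2) - u * (w1 - w2 - f), with assumed quadratic energies."""
    energy = 0.5 * w1 ** 2 / c1 + 0.5 * w2 ** 2 / c2
    return energy - u * (w1 - w2 - f)


def partial(func, point, index, h=1e-6):
    """Central difference approximation to one partial derivative of func."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)


# Candidate stationary point (w1, w2, u) for c1=2, c2=3, f=10, worked out by hand.
point = (4.0, -6.0, 2.0)
gradient = [partial(lagrangian, point, i) for i in range(3)]
print(gradient)  # all three entries should be zero up to rounding

# Away from the constraint, dL/du is -(w1 - w2 - f), which is not zero.
off_constraint = (1.0, 0.0, 0.0)
print(partial(lagrangian, off_constraint, 2))
```

The second print illustrates the point made above: the u-derivative vanishes only when w_1 minus w_2 equals f, so setting it to zero recovers the constraint.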
536 00:38:42,540 --> 00:38:47,900 So what we're doing new here is we're starting with -- 537 00:38:47,900 --> 00:38:50,950 the constraint equation is sort of part of the mechanics 538 00:38:50,950 --> 00:38:55,280 and we're asking the question: how do I deal with it? 539 00:38:55,280 --> 00:38:57,560 Here it came out of the geometry, 540 00:38:57,560 --> 00:39:00,630 so it wasn't there at the beginning, 541 00:39:00,630 --> 00:39:03,730 so we didn't have to say: how do we deal with this equation? 542 00:39:03,730 --> 00:39:05,620 It just emerged. 543 00:39:05,620 --> 00:39:11,220 Here it's really forced on us right away. 544 00:39:11,220 --> 00:39:16,830 So anyway, we've still got to figure out the derivatives 545 00:39:16,830 --> 00:39:22,490 of those, and I guess I do have to finally now say what choice 546 00:39:22,490 --> 00:39:24,560 -- yeah. 547 00:39:24,560 --> 00:39:36,210 So this is -- what is this, the derivative of E_1 minus u. 548 00:39:36,210 --> 00:39:39,830 If I take the w_1 derivative, I've got the derivative of E_1 549 00:39:39,830 --> 00:39:43,300 with respect to w_1. 550 00:39:43,300 --> 00:39:47,130 Then w_1 doesn't appear there but it appears here, 551 00:39:47,130 --> 00:39:50,370 so it's minus u. 552 00:39:50,370 --> 00:39:54,920 This would be the derivative of E_2 with respect to w_2, 553 00:39:54,920 --> 00:39:59,320 that second spring minus -- oh maybe plus the u. 554 00:40:05,680 --> 00:40:08,140 Well, those are the three equations. 555 00:40:10,890 --> 00:40:16,140 Let me move now to the linear case, 556 00:40:16,140 --> 00:40:19,530 just so we see the beautiful pattern. 557 00:40:19,530 --> 00:40:24,780 So if I make the equations linear, what's 558 00:40:24,780 --> 00:40:31,140 the energy in a Hooke's law spring 559 00:40:31,140 --> 00:40:41,440 here if the extension is e_1 and if it produces a force of w_1, 560 00:40:41,440 --> 00:40:44,970 I want to know what is this E_1 here.
561 00:40:44,970 --> 00:40:48,590 So let me just remember Hooke's law. 562 00:40:48,590 --> 00:40:51,840 I think Hooke's law would say that -- well, 563 00:40:51,840 --> 00:40:56,440 there's some elastic constant c_1, 564 00:40:56,440 --> 00:41:01,815 so there's a c_1 and a c_2 that tell us how hard or soft 565 00:41:01,815 --> 00:41:02,810 the springs are. 566 00:41:02,810 --> 00:41:04,890 So these are physical constants. 567 00:41:12,200 --> 00:41:16,500 If I remember right, the energy -- 568 00:41:16,500 --> 00:41:20,170 so I'm going to erase a little here just to -- well no. 569 00:41:20,170 --> 00:41:23,410 So what is the energy in that first spring? 570 00:41:23,410 --> 00:41:31,650 You remember, there's a 1/2 c_1 e_1 squared. 571 00:41:31,650 --> 00:41:38,100 That's the energy in a spring with constant c_1 572 00:41:38,100 --> 00:41:39,890 and the stretch e_1. 573 00:41:45,280 --> 00:41:48,200 But now, I really want it not in terms of e_1, 574 00:41:48,200 --> 00:41:53,412 I want it in terms of w_1, just the way I did here. 575 00:41:53,412 --> 00:41:54,870 I've got to do the same thing here. 576 00:41:54,870 --> 00:42:00,630 I want to get to w, because the constraint is in terms of w. 577 00:42:03,290 --> 00:42:04,820 What does Hooke's law say? 578 00:42:04,820 --> 00:42:12,510 Hooke's law says w equals c*e, that's Hooke, Hooke's law. 579 00:42:15,400 --> 00:42:19,460 The force is the elastic constant times the stretch. 580 00:42:19,460 --> 00:42:24,650 So in place of the e, I have w over c. 581 00:42:24,650 --> 00:42:32,310 So this is 1/2 -- e_1 squared will be w_1 squared over c_1 582 00:42:32,310 --> 00:42:36,520 squared, and then a c_1 cancels -- that's what it looks like. 583 00:42:40,450 --> 00:42:45,390 I guess I'm certainly happy to see that I'm 584 00:42:45,390 --> 00:42:47,380 coming up with c inverse.
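[Editor's sketch] The change of variable from stretch to force can be checked with a couple of lines. This is just an illustration under assumptions: the function names and the sample values of c1 and e1 are invented here, not taken from the lecture.

```python
def energy_from_stretch(c, e):
    """Hooke's law energy written in terms of the stretch e: (1/2) * c * e**2."""
    return 0.5 * c * e ** 2


def energy_from_force(c, w):
    """The same energy written in terms of the force w = c * e: (1/2) * w**2 / c."""
    return 0.5 * w ** 2 / c


c1 = 4.0          # hypothetical spring constant
e1 = 0.25         # hypothetical stretch
w1 = c1 * e1      # Hooke's law: force = elastic constant * stretch

# Both expressions should give the same number for the same spring state.
print(energy_from_stretch(c1, e1), energy_from_force(c1, w1))
```

Substituting e = w / c into (1/2) c e^2 leaves one factor of c in the denominator, which is the c inverse mentioned above.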
585 00:42:47,380 --> 00:42:51,400 This c is showing up in the denominator, 586 00:42:51,400 --> 00:42:55,390 and that's exactly the -- so this is what I mean, 587 00:42:55,390 --> 00:43:00,780 this is what I want for my energy, 588 00:43:00,780 --> 00:43:10,530 is a 1/2 w_1 squared over c_1, and a 1/2 w_2 squared over c_2, 589 00:43:10,530 --> 00:43:18,720 and now I was doing a minus sign because that kind of goes well 590 00:43:18,720 --> 00:43:23,670 with mechanics where a plus sign would go well with all 591 00:43:23,670 --> 00:43:26,720 the other applications. 592 00:43:26,720 --> 00:43:31,650 So now I've got explicit energy. 593 00:43:31,650 --> 00:43:35,400 So now I can say what this thing really -- 594 00:43:35,400 --> 00:43:37,860 the derivative with respect to w_1. 595 00:43:37,860 --> 00:43:39,130 OK. 596 00:43:39,130 --> 00:43:44,740 The derivative with respect to w_1 is just w_1 over c_1. 597 00:43:44,740 --> 00:43:46,770 Is that what I'm getting? 598 00:43:46,770 --> 00:43:48,520 This is zero. 599 00:43:48,520 --> 00:43:55,580 This says w_2 over c_2 minus u is zero, 600 00:43:55,580 --> 00:44:00,440 and this one says w_1 minus w_2 equals f. 601 00:44:04,120 --> 00:44:07,200 Our three equations. 602 00:44:07,200 --> 00:44:16,660 Yes, so I will kill these equal signs and just look at -- oh, 603 00:44:16,660 --> 00:44:17,740 that's a plus. 604 00:44:29,090 --> 00:44:32,170 We've got our saddle point matrix again. 605 00:44:32,170 --> 00:44:34,950 That's the nice thing here. 606 00:44:34,950 --> 00:44:39,130 This is a problem with a three by three matrix, 607 00:44:39,130 --> 00:44:43,760 with three unknowns, w_1, w_2, and u. 608 00:44:43,760 --> 00:44:47,320 With right-hand sides zero, zero, and f. 609 00:44:53,580 --> 00:44:56,030 And with what matrix? 610 00:44:59,190 --> 00:45:06,110 That looks like a 1 over c_1, zero, and a minus 1. 611 00:45:06,110 --> 00:45:12,640 This looks like zero, a 1 over c_2 and a plus 1. 
612 00:45:12,640 --> 00:45:16,240 This looks like -- oh, wait a minute. 613 00:45:16,240 --> 00:45:18,880 I don't want this. 614 00:45:18,880 --> 00:45:21,870 w_1 minus w_2 equal f. 615 00:45:25,500 --> 00:45:29,020 I can live with 1 and minus 1 there, 616 00:45:29,020 --> 00:45:33,040 but it's not really what I wanted. 617 00:45:33,040 --> 00:45:37,140 I wanted the signs to -- you know what I want here. 618 00:45:37,140 --> 00:45:41,130 I want the transpose of that to be here, and zero 619 00:45:41,130 --> 00:45:42,330 to be in that block. 620 00:45:42,330 --> 00:45:50,200 So that's what my saddle point matrix would look like. 621 00:45:50,200 --> 00:45:55,460 Well, let me just say that I could live with either. 622 00:45:55,460 --> 00:46:03,450 I was aiming for this one because it's symmetric. 623 00:46:03,450 --> 00:46:09,180 But a lot of people would rather have the opposite signs 624 00:46:09,180 --> 00:46:16,140 and have the 1, minus 1 there. 625 00:46:16,140 --> 00:46:22,190 I don't care which sign f has, of course. 626 00:46:22,190 --> 00:46:25,640 Some people these days are liking this form better 627 00:46:25,640 --> 00:46:28,120 because then it has a symmetric part 628 00:46:28,120 --> 00:46:29,590 and an anti-symmetric part. 629 00:46:33,160 --> 00:46:37,010 I mean the thing is, at some point 630 00:46:37,010 --> 00:46:39,970 we're going to get problems like this with thousands 631 00:46:39,970 --> 00:46:42,970 of unknowns, and we're going to think 632 00:46:42,970 --> 00:46:47,640 how do we solve them and maybe some iteration. 633 00:46:47,640 --> 00:46:53,750 So we might want the matrix to be symmetric but indefinite, 634 00:46:53,750 --> 00:46:59,500 or we might want a positive definite, symmetric part 635 00:46:59,500 --> 00:47:01,360 and an anti-symmetric part. 636 00:47:04,350 --> 00:47:07,840 What we can't have is positive definite symmetric. 637 00:47:07,840 --> 00:47:11,880 That's like asking for what can't happen here. 
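[Editor's sketch] The three by three system can be written out and solved directly. The sketch below uses the symmetric form, with the constraint row written as minus w_1 plus w_2 equals minus f so that the last row is the transpose of the last column. The helper names, the plain Gaussian elimination routine, and the sample values of c1, c2 and f are assumptions made for illustration only.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def spring_saddle_system(c1, c2, f):
    """Symmetric saddle point form for the unknowns (w1, w2, u)."""
    matrix = [
        [1.0 / c1, 0.0, -1.0],   # w1 / c1 - u = 0
        [0.0, 1.0 / c2, 1.0],    # w2 / c2 + u = 0
        [-1.0, 1.0, 0.0],        # -w1 + w2 = -f (constraint, sign flipped)
    ]
    rhs = [0.0, 0.0, -f]
    return matrix, rhs


c1, c2, f = 2.0, 3.0, 10.0   # hypothetical values
matrix, rhs = spring_saddle_system(c1, c2, f)
w1, w2, u = solve_linear_system(matrix, rhs)
print(w1, w2, u)
```

Partial pivoting is used because the zero in the lower right corner means the matrix is not positive definite, so elimination cannot rely on the diagonal entries alone.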
638 00:47:11,880 --> 00:47:17,090 The combination of the problems is producing a saddle point 639 00:47:17,090 --> 00:47:28,330 and we can play with that sign, but we can't make that zero 640 00:47:28,330 --> 00:47:33,140 something -- oh, we could, actually. 641 00:47:33,140 --> 00:47:36,970 What I was going to say is we can't make that zero something 642 00:47:36,970 --> 00:47:40,520 different, but you could. 643 00:47:40,520 --> 00:47:44,260 That would be a possible way to -- 644 00:47:44,260 --> 00:47:47,630 it's another way people thought of for solving these problems, 645 00:47:47,630 --> 00:47:55,800 is artificially throw in a big number there, 646 00:47:55,800 --> 00:48:02,590 or even a small number, and push things towards positive. 647 00:48:02,590 --> 00:48:10,630 Anyway, my purpose today is essentially completed, 648 00:48:10,630 --> 00:48:15,780 that we're getting out of a physical application, 649 00:48:15,780 --> 00:48:20,200 after I linearized and it became a linear equation 650 00:48:20,200 --> 00:48:22,380 and it had that saddle point form. 651 00:48:22,380 --> 00:48:26,420 So, saddle point form here, saddle point form here, 652 00:48:26,420 --> 00:48:28,120 saddle point forms everywhere. 653 00:48:28,120 --> 00:48:30,710 I mean we'll have the saddle point 654 00:48:30,710 --> 00:48:33,460 forms for differential equations, 655 00:48:33,460 --> 00:48:36,390 as well as for matrix equations. 656 00:48:36,390 --> 00:48:44,590 So those are the examples to sort of hang on to, 657 00:48:44,590 --> 00:48:53,350 and it's section 7.1 that has a big part of it. 658 00:48:53,350 --> 00:48:58,980 Ah -- I always have one last thing to say. 659 00:48:58,980 --> 00:49:00,440 What's the meaning of u? 660 00:49:03,180 --> 00:49:06,070 So, u was a Lagrange multiplier. 661 00:49:06,070 --> 00:49:08,470 Lagrange just like helped us out by saying 662 00:49:08,470 --> 00:49:13,430 OK, deal with constraints by using one of my multipliers.
663 00:49:13,430 --> 00:49:17,510 But the point is that the multiplier always 664 00:49:17,510 --> 00:49:20,570 has a real meaning. 665 00:49:20,570 --> 00:49:24,980 I mentioned prices before. 666 00:49:24,980 --> 00:49:28,950 Here, what's the meaning of u? 667 00:49:28,950 --> 00:49:31,390 What's the physical meaning of the Lagrange multiplier? 668 00:49:31,390 --> 00:49:36,220 It turns out to be the displacement of the mass. 669 00:49:36,220 --> 00:49:41,740 It's the dual variable, so it always has some interpretation. 670 00:49:41,740 --> 00:49:44,830 In this case, with mechanics it's 671 00:49:44,830 --> 00:49:51,810 the amount the mass comes down when the force acts. 672 00:49:51,810 --> 00:49:57,480 What's more, it also has a derivative interpretation. 673 00:49:57,480 --> 00:50:01,600 Turns out -- I'll just put turns out -- 674 00:50:01,600 --> 00:50:09,370 that the Lagrange multiplier u turns out to be the derivative 675 00:50:09,370 --> 00:50:16,300 of the minimum energy in the system with respect 676 00:50:16,300 --> 00:50:20,640 to the source term. 677 00:50:20,640 --> 00:50:23,710 It's the sensitivity of the problem somehow. 678 00:50:23,710 --> 00:50:27,620 I'll just use sensitivity. 679 00:50:27,620 --> 00:50:33,860 You often want to know -- it's actually this quantity that we 680 00:50:33,860 --> 00:50:35,380 need over there. 681 00:50:35,380 --> 00:50:38,700 We want to know how much does the answer depend 682 00:50:38,700 --> 00:50:44,010 on the source, and that's what the Lagrange multiplier tells 683 00:50:44,010 --> 00:50:44,970 us. 684 00:50:44,970 --> 00:50:51,420 So, if I computed the solution here, so the notes do this -- 685 00:50:51,420 --> 00:50:53,850 maybe I'll leave this for the notes. 686 00:50:53,850 --> 00:50:57,580 The notes solve this little problem. 687 00:50:57,580 --> 00:50:59,820 That's easy to do. 688 00:50:59,820 --> 00:51:05,870 They figure out what the energy is in the springs.
689 00:51:05,870 --> 00:51:08,200 It depends on the right-hand side 690 00:51:08,200 --> 00:51:12,150 f, which is just a number here. 691 00:51:12,150 --> 00:51:14,330 The energy turns out to be quadratic. 692 00:51:14,330 --> 00:51:18,830 You can take its derivative and you find out that it's u. 693 00:51:18,830 --> 00:51:25,630 So that Lagrange multiplier, that's really a key message, 694 00:51:25,630 --> 00:51:28,850 is an important quantity in itself. 695 00:51:28,850 --> 00:51:30,980 Here it happens to mean displacement, 696 00:51:30,980 --> 00:51:33,580 which is obviously a crucial quantity, 697 00:51:33,580 --> 00:51:36,970 and in general it tells us the change 698 00:51:36,970 --> 00:51:40,850 in the minimum energy with respect 699 00:51:40,850 --> 00:51:44,480 to a change in the input. 700 00:51:44,480 --> 00:51:49,670 And sensitivity's a natural word to use for that. 701 00:51:49,670 --> 00:51:51,860 So that's a final word about -- well, 702 00:51:51,860 --> 00:51:55,000 a near to final word about Lagrange multipliers. 703 00:51:55,000 --> 00:51:59,280 So I'll see you Wednesday and by that time 704 00:51:59,280 --> 00:52:02,600 I'll know more about the projects 705 00:52:02,600 --> 00:52:08,481 and we'll be moving onward with optimization. 706 00:52:08,481 --> 00:52:08,980 Good. 707 00:52:08,980 --> 00:52:10,230 Thanks.
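[Editor's sketch] The sensitivity statement in the closing remarks can be checked numerically for the two-spring problem. The sketch assumes the quadratic Hooke's law energies and the constraint w_1 minus w_2 equals f from the lecture; the function names, the sample values, and the use of a finite difference are illustrative choices, not the method in the notes. The minimum energy is found the "fast way", by eliminating w_2.

```python
def minimum_energy(c1, c2, f):
    """Minimum spring energy under w1 - w2 = f, found by eliminating w2."""
    # Substitute w2 = w1 - f; setting the w1-derivative of the energy to zero
    # gives w1 / c1 + (w1 - f) / c2 = 0, so w1 = c1 * f / (c1 + c2).
    w1 = c1 * f / (c1 + c2)
    w2 = w1 - f
    return 0.5 * w1 ** 2 / c1 + 0.5 * w2 ** 2 / c2


def multiplier(c1, c2, f):
    """Lagrange multiplier u from the first stationarity equation, w1 / c1 - u = 0."""
    w1 = c1 * f / (c1 + c2)
    return w1 / c1


c1, c2, f = 2.0, 3.0, 10.0   # hypothetical values
h = 1e-6
# Central difference approximation of d(E_min)/df.
sensitivity = (minimum_energy(c1, c2, f + h) - minimum_energy(c1, c2, f - h)) / (2 * h)
print(sensitivity, multiplier(c1, c2, f))
```

The two printed numbers should agree up to rounding. In closed form the minimum energy works out to f^2 / (2 (c_1 + c_2)), which is quadratic in f, and its derivative f / (c_1 + c_2) equals u.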