1 00:00:00,040 --> 00:00:02,460 The following content is provided under a Creative 2 00:00:02,460 --> 00:00:03,870 Commons license. 3 00:00:03,870 --> 00:00:06,320 Your support will help MIT OpenCourseWare 4 00:00:06,320 --> 00:00:10,560 continue to offer high quality educational resources for free. 5 00:00:10,560 --> 00:00:13,300 To make a donation or view additional materials 6 00:00:13,300 --> 00:00:17,210 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,210 --> 00:00:19,500 at ocw.mit.edu. 8 00:00:32,560 --> 00:00:33,620 HERBERT GROSS: Hi. 9 00:00:33,620 --> 00:00:37,120 As I was standing here wondering how to begin today's lesson, 10 00:00:37,120 --> 00:00:40,440 an old story came to mind, of the professor who passed out 11 00:00:40,440 --> 00:00:43,175 an examination to his class, and one of the students 12 00:00:43,175 --> 00:00:44,800 said, "Professor, this is the same test 13 00:00:44,800 --> 00:00:46,640 you gave us last week". 14 00:00:46,640 --> 00:00:48,820 And the professor said, "I know, but this time I 15 00:00:48,820 --> 00:00:50,470 changed the answers." 16 00:00:50,470 --> 00:00:52,710 And I was thinking of this in terms of the fact 17 00:00:52,710 --> 00:00:56,170 that much of the new mathematics is essentially 18 00:00:56,170 --> 00:00:59,370 the old mathematics with some of the answers changed. 19 00:00:59,370 --> 00:01:02,390 One of the topics that we used to belittle 20 00:01:02,390 --> 00:01:05,180 in the traditional curriculum, because it was too easy, 21 00:01:05,180 --> 00:01:08,380 was the topic called linear equations. 22 00:01:08,380 --> 00:01:11,430 And it turns out that in the study of several variables 23 00:01:11,430 --> 00:01:14,160 in particular-- but it was already present in calculus 24 00:01:14,160 --> 00:01:17,410 of a single variable-- we very strongly used 25 00:01:17,410 --> 00:01:20,210 the concept of linearity. 
26 00:01:20,210 --> 00:01:24,100 I could've called today's lesson "something old, something new." 27 00:01:24,100 --> 00:01:27,690 Meaning that the old topic that we were going to revisit 28 00:01:27,690 --> 00:01:29,770 would be that of linear functions, 29 00:01:29,770 --> 00:01:33,270 and the new topic would be how it manifests 30 00:01:33,270 --> 00:01:36,070 into the modern curriculum in the sense 31 00:01:36,070 --> 00:01:38,240 that one introduces a subject called 32 00:01:38,240 --> 00:01:40,980 linear algebra, or matrix algebra, 33 00:01:40,980 --> 00:01:44,960 as a standard portion of a modern calculus course, 34 00:01:44,960 --> 00:01:47,730 whereas in the traditional calculus courses, 35 00:01:47,730 --> 00:01:51,970 essentially nothing was ever said about matrix algebra 36 00:01:51,970 --> 00:01:53,140 or linearity. 37 00:01:53,140 --> 00:01:56,230 Instead I picked a more conservative title 38 00:01:56,230 --> 00:02:00,790 for today's lesson, I simply call it "Linearity Revisited". 39 00:02:00,790 --> 00:02:03,410 And as I say, it goes back to when 40 00:02:03,410 --> 00:02:05,730 we were in junior high school or high school, 41 00:02:05,730 --> 00:02:09,419 when we were taught that linear functions were very nice. 42 00:02:09,419 --> 00:02:13,750 For example, given the equation y equals m*x plus b-- 43 00:02:13,750 --> 00:02:15,510 the linear equation meaning what? 44 00:02:15,510 --> 00:02:19,050 It graphs as a straight line, but that the two variables 45 00:02:19,050 --> 00:02:22,100 are related linearly, y is a constant multiple 46 00:02:22,100 --> 00:02:24,670 of x, plus a constant. 47 00:02:24,670 --> 00:02:28,140 We were told solve for x in terms of y. 48 00:02:28,140 --> 00:02:31,470 And what we found was that if y equals m*x plus b, 49 00:02:31,470 --> 00:02:37,280 this was true if and only if x was equal to y minus b over m. 
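The inversion of y = m*x + b described above can be sketched in a few lines of Python. This is only an illustration: the names `f` and `f_inverse` are hypothetical, and the sketch assumes m is nonzero, since for m = 0 every x gives the same y and no inverse exists.

```python
def f(x, m, b):
    """Linear function y = m*x + b."""
    return m * x + b


def f_inverse(y, m, b):
    """Solve y = m*x + b for x; only possible when m is nonzero."""
    if m == 0:
        raise ValueError("m must be nonzero for the inverse to exist")
    return (y - b) / m


# Round trip: starting from x = 3, go to y and come back.
y = f(3.0, 2.0, 5.0)
x_back = f_inverse(y, 2.0, 5.0)
print(y, x_back)
```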
50 00:02:37,280 --> 00:02:40,450 What we showed was given a value of x, 51 00:02:40,450 --> 00:02:44,380 there corresponded a value of y, and conversely, 52 00:02:44,380 --> 00:02:48,730 given a value for y, there corresponded a unique value 53 00:02:48,730 --> 00:02:49,780 for x. 54 00:02:49,780 --> 00:02:52,140 And to put this into the language of functions, 55 00:02:52,140 --> 00:02:56,650 what we were saying was that if f of x equals m*x plus b, 56 00:02:56,650 --> 00:02:59,420 then f inverse exists. 57 00:02:59,420 --> 00:03:03,110 In other words, what we're saying is that no two different 58 00:03:03,110 --> 00:03:07,010 x values can give you the same y value, 59 00:03:07,010 --> 00:03:10,235 if the function has the form y equals m*x plus b. 60 00:03:10,235 --> 00:03:12,110 And just about the time that we were learning 61 00:03:12,110 --> 00:03:14,060 to enjoy this kind of an equation 62 00:03:14,060 --> 00:03:18,540 our dream world was shattered, and we were told it's too bad, 63 00:03:18,540 --> 00:03:21,360 but most functions aren't linear. 64 00:03:21,360 --> 00:03:24,260 We were given things like y equals x to the seventh 65 00:03:24,260 --> 00:03:25,920 plus x to the fifth, and we found 66 00:03:25,920 --> 00:03:28,610 that we couldn't solve for x very conveniently 67 00:03:28,610 --> 00:03:29,920 in terms of y. 68 00:03:29,920 --> 00:03:32,790 And that's what began our intermediate algebra 69 00:03:32,790 --> 00:03:34,410 and advanced algebra courses. 70 00:03:34,410 --> 00:03:38,850 In other words, the fact that most functions are non-linear. 71 00:03:38,850 --> 00:03:41,180 Now an interesting thing occurred though. 72 00:03:41,180 --> 00:03:44,050 Let me just emphasize this. 73 00:03:44,050 --> 00:03:45,460 And this is the key point. 74 00:03:45,460 --> 00:03:49,460 In terms of calculus, we discovered-- 75 00:03:49,460 --> 00:03:51,840 and here's a key word coming up-- Most functions 76 00:03:51,840 --> 00:03:54,140 are locally linear. 
77 00:03:54,140 --> 00:03:56,730 Now that sounds a little bit like a tongue twister, 78 00:03:56,730 --> 00:04:00,110 but actually back in the first part of the course 79 00:04:00,110 --> 00:04:04,520 when we talked about delta y sub tan-- a change in y 80 00:04:04,520 --> 00:04:05,910 to the tangent line. 81 00:04:05,910 --> 00:04:07,570 Notice what we were saying. 82 00:04:07,570 --> 00:04:11,680 We were saying that to study f of x near x equals a, 83 00:04:11,680 --> 00:04:15,450 we saw that f of a plus delta x minus f of a 84 00:04:15,450 --> 00:04:20,089 was equal to f prime of a times delta x plus k delta 85 00:04:20,089 --> 00:04:23,660 x, where the limit of k, as delta x went to 0, 86 00:04:23,660 --> 00:04:25,400 was 0 itself. 87 00:04:25,400 --> 00:04:29,550 Provided of course that f was differentiable at x equals a; 88 00:04:29,550 --> 00:04:32,550 otherwise, you couldn't write down f prime of a here. 89 00:04:32,550 --> 00:04:34,420 The interesting point is this. 90 00:04:34,420 --> 00:04:38,100 But if you look just at this term over here, 91 00:04:38,100 --> 00:04:44,410 this expresses delta f as a linear function of delta x. 92 00:04:44,410 --> 00:04:47,440 The part that makes this thing non-linear is the term 93 00:04:47,440 --> 00:04:50,150 called k delta x. 94 00:04:50,150 --> 00:04:52,180 But that's the term that's going to 0 95 00:04:52,180 --> 00:04:54,710 as a second-order infinitesimal. 96 00:04:54,710 --> 00:04:57,080 So what we're really saying is this: 97 00:04:57,080 --> 00:04:59,420 that provided that f is differentiable at x 98 00:04:59,420 --> 00:05:03,690 equals a-- in other words locally we mean this: 99 00:05:03,690 --> 00:05:09,400 near x equals a, we can say that delta f is approximately 100 00:05:09,400 --> 00:05:12,240 f prime of a times delta x. 101 00:05:12,240 --> 00:05:15,480 That's what we call delta f sub tan, recall. 
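As a rough numerical check of the statement that k goes to 0 with delta x, here is a small Python sketch. It assumes, purely for illustration, the function x to the seventh plus x to the fifth mentioned earlier and the point a = 1; the helper names `f`, `f_prime`, and `k` are hypothetical.

```python
def f(x):
    """The non-linear example from earlier: x^7 + x^5."""
    return x**7 + x**5


def f_prime(x):
    """Derivative of f, computed by hand: 7x^6 + 5x^4."""
    return 7 * x**6 + 5 * x**4


def k(a, dx):
    """The k defined by f(a + dx) - f(a) = f'(a)*dx + k*dx."""
    return (f(a + dx) - f(a)) / dx - f_prime(a)


# k should shrink as dx shrinks, which is what "locally linear" asks for.
for dx in (0.1, 0.01, 0.001):
    print(dx, k(1.0, dx))
```

The printed values of k should decrease roughly in proportion to dx, which is why the error term k delta x shrinks faster than delta x itself.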
102 00:05:15,480 --> 00:05:18,430 And what we mean by approximately here 103 00:05:18,430 --> 00:05:21,710 is that the error k delta x goes to 0 very, 104 00:05:21,710 --> 00:05:24,900 very rapidly as delta x goes to 0. 105 00:05:24,900 --> 00:05:27,900 And what we mean by locally is this-- suppose 106 00:05:27,900 --> 00:05:30,970 f prime exists also when x equals b. 107 00:05:30,970 --> 00:05:35,750 We can again compute delta f near x equals b. Now delta f 108 00:05:35,750 --> 00:05:37,370 is equal to what? 109 00:05:37,370 --> 00:05:40,610 Approximately f prime of b times delta 110 00:05:40,610 --> 00:05:44,160 x plus that error term which goes to 0 very rapidly. 111 00:05:44,160 --> 00:05:50,180 We again call this thing here delta f tan, 112 00:05:50,180 --> 00:05:54,240 but the thing to keep in mind is since f prime of a need 113 00:05:54,240 --> 00:06:00,100 not equal f prime of b, delta f tan is different at a and at b. 114 00:06:00,100 --> 00:06:02,920 In other words, even though it's always true 115 00:06:02,920 --> 00:06:04,810 where f is differentiable, that we 116 00:06:04,810 --> 00:06:08,950 can say that delta f is approximately delta f tan, 117 00:06:08,950 --> 00:06:12,580 the value of delta f tan depends on the value 118 00:06:12,580 --> 00:06:14,280 of x that we're near. 119 00:06:14,280 --> 00:06:16,170 And that's what we mean by saying 120 00:06:16,170 --> 00:06:18,750 that approximating delta f by delta f 121 00:06:18,750 --> 00:06:21,580 tan is a local property. 122 00:06:21,580 --> 00:06:23,830 Now I think that sometimes, by putting these things 123 00:06:23,830 --> 00:06:26,900 into words, it sounds harder than it really is. 124 00:06:26,900 --> 00:06:29,460 So I think what might be nice is if we just 125 00:06:29,460 --> 00:06:33,690 look at a specific illustration, a problem which I deliberately 126 00:06:33,690 --> 00:06:37,250 picked to be as simple a non-linear example as I 127 00:06:37,250 --> 00:06:38,210 can think of. 
128 00:06:38,210 --> 00:06:41,830 Let me come back to our old friend, 129 00:06:41,830 --> 00:06:45,040 the function f of x equals x squared, which as I say, 130 00:06:45,040 --> 00:06:48,850 is about as simple a non-linear function as we can get into. 131 00:06:48,850 --> 00:06:52,030 Now we know that f of x equals x squared plots as the curve y 132 00:06:52,030 --> 00:06:54,930 equals x squared, the parabola. 133 00:06:54,930 --> 00:06:57,360 Let's take a couple of points on this parabola. 134 00:06:57,360 --> 00:07:01,870 Let's say the point 1 comma 1 and the point 2 comma 4. 135 00:07:01,870 --> 00:07:08,480 Draw in the tangent lines to the curve at these two points. 136 00:07:08,480 --> 00:07:09,470 And we know what? 137 00:07:09,470 --> 00:07:13,110 That the equation of the tangent line to the curve at (1, 1) 138 00:07:13,110 --> 00:07:17,630 is y minus 1 over x minus 1 equals the slope. 139 00:07:17,630 --> 00:07:21,750 Since y is equal to x squared, the slope is 2x; when x is 1 140 00:07:21,750 --> 00:07:22,970 the slope is 2. 141 00:07:22,970 --> 00:07:26,640 So the equation of this tangent line is given by y minus 1 142 00:07:26,640 --> 00:07:28,920 over x minus 1 equals 2. 143 00:07:28,920 --> 00:07:31,790 At the point corresponding to x equals 2, 144 00:07:31,790 --> 00:07:36,250 2x is 4, so the equation of the tangent line here is y minus 4 145 00:07:36,250 --> 00:07:38,870 over x minus 2 equals 4. 146 00:07:38,870 --> 00:07:41,150 So now I've induced three functions 147 00:07:41,150 --> 00:07:42,400 that I can talk about. 148 00:07:42,400 --> 00:07:46,230 My original function, f of x is x squared. 149 00:07:46,230 --> 00:07:49,990 This straight line is the linear function-- just solving 150 00:07:49,990 --> 00:07:55,860 for y in terms of x-- g of x equals 4x minus 4. 151 00:07:55,860 --> 00:07:59,770 And this straight line corresponds to the function h 152 00:07:59,770 --> 00:08:03,059 of x equals 2x minus 1. 
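The two tangent lines just derived can be reproduced with a short Python sketch. The helper name `tangent_line` is hypothetical; it simply applies the point-slope form to f of x equals x squared, whose slope at x = a is 2a.

```python
def f(x):
    """The parabola y = x^2."""
    return x * x


def tangent_line(a):
    """Return (slope, intercept) of the tangent to y = x^2 at x = a."""
    slope = 2 * a                  # derivative of x^2 at x = a
    intercept = f(a) - slope * a   # from y - f(a) = slope * (x - a)
    return slope, intercept


h_coeffs = tangent_line(1)  # tangent at (1, 1), i.e. h(x) = 2x - 1
g_coeffs = tangent_line(2)  # tangent at (2, 4), i.e. g(x) = 4x - 4
print(h_coeffs, g_coeffs)
```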
153 00:08:03,059 --> 00:08:04,600 Now the interesting point, of course, 154 00:08:04,600 --> 00:08:06,754 is that these two functions here are linear. 155 00:08:06,754 --> 00:08:08,420 They are completely different functions. 156 00:08:08,420 --> 00:08:10,530 Notice not only pictorially are they different, 157 00:08:10,530 --> 00:08:13,200 but algebraically their slopes are different, 158 00:08:13,200 --> 00:08:16,470 and their y-intercepts are different, 159 00:08:16,470 --> 00:08:19,340 and back in our course in part one, 160 00:08:19,340 --> 00:08:21,910 we talked about things geometrically saying, 161 00:08:21,910 --> 00:08:25,140 lookit, near the point of tangency, 162 00:08:25,140 --> 00:08:28,820 the tangent line serves as a good approximation 163 00:08:28,820 --> 00:08:30,030 to the curve itself. 164 00:08:30,030 --> 00:08:31,780 What were we really saying then? 165 00:08:31,780 --> 00:08:35,692 What we were saying was that near the point of tangency, 166 00:08:35,692 --> 00:08:39,690 g of x, which was a linear function, 167 00:08:39,690 --> 00:08:43,520 could replace f of x, which was a non-linear function. 168 00:08:43,520 --> 00:08:45,320 Of course, when we moved too far away 169 00:08:45,320 --> 00:08:48,680 from a given point, then when we said that f of x still 170 00:08:48,680 --> 00:08:50,880 had a linear approximation, we had 171 00:08:50,880 --> 00:08:53,780 to pick a different linear function. 172 00:08:53,780 --> 00:08:55,390 By the way, again because we were 173 00:08:55,390 --> 00:08:57,700 dealing with one independent variable and one 174 00:08:57,700 --> 00:09:00,490 dependent variable, it was very easy to invent 175 00:09:00,490 --> 00:09:02,280 the concept of a graph. 176 00:09:02,280 --> 00:09:06,320 As we shall show in a little while, the concept of linearity 177 00:09:06,320 --> 00:09:08,860 extends to several variables, but you 178 00:09:08,860 --> 00:09:10,880 can't draw the graph as nicely. 
179 00:09:10,880 --> 00:09:15,530 So let me now revisit the same result here, only 180 00:09:15,530 --> 00:09:17,730 without reference to the graph. 181 00:09:17,730 --> 00:09:21,120 What we're saying is that our function 182 00:09:21,120 --> 00:09:24,555 is mapping the real number line into the real number line. 183 00:09:24,555 --> 00:09:26,680 In other words, instead of putting x and y at right 184 00:09:26,680 --> 00:09:31,800 angles to each other, let's put x and y horizontally parallel 185 00:09:31,800 --> 00:09:32,840 to one another. 186 00:09:32,840 --> 00:09:39,060 And what we're saying is that f maps the interval from 0 to 2 187 00:09:39,060 --> 00:09:43,370 onto the interval from 0 to 4. 188 00:09:43,370 --> 00:09:44,750 Now what does h do? 189 00:09:44,750 --> 00:09:47,760 Remember h is the function 2x minus 1. 190 00:09:47,760 --> 00:09:51,620 h maps the interval from 0 to 2 onto the interval 191 00:09:51,620 --> 00:09:54,770 from minus 1 to 3. 192 00:09:54,770 --> 00:09:58,700 And you see this is all this diagram means. f maps 0 into 0, 193 00:09:58,700 --> 00:10:03,340 it maps 1 into 1, it maps 2 into 4, et cetera. 194 00:10:03,340 --> 00:10:06,070 In other words, f is the function which squares 195 00:10:06,070 --> 00:10:08,140 the input to yield the output. 196 00:10:08,140 --> 00:10:13,430 And correspondingly, h maps 0 into minus 1, it maps 1 into 1, 197 00:10:13,430 --> 00:10:16,190 and it maps 2 into 3. 198 00:10:16,190 --> 00:10:18,110 Now the interesting point is that f and h 199 00:10:18,110 --> 00:10:19,570 are very different. 200 00:10:19,570 --> 00:10:23,710 In fact, the only time f and h have the same output 201 00:10:23,710 --> 00:10:27,610 is when x equals 1. 202 00:10:27,610 --> 00:10:29,310 Which, of course, we knew from before, 203 00:10:29,310 --> 00:10:31,730 because how was h of x constructed? 
204 00:10:31,730 --> 00:10:36,300 h of x was constructed to be the line tangent to the parabola y 205 00:10:36,300 --> 00:10:39,600 equals x squared at the point x equals 1, y equals 1. 206 00:10:39,600 --> 00:10:41,830 So that should be no great surprise. 207 00:10:41,830 --> 00:10:45,040 But if we didn't know that, notice that algebraically, we 208 00:10:45,040 --> 00:10:48,450 could equate f of x to h of x, conclude, therefore, 209 00:10:48,450 --> 00:10:51,910 that that means x squared must equal 2x minus 1. 210 00:10:51,910 --> 00:10:57,610 We then transpose, and get that x minus 1 squared must be 0, 211 00:10:57,610 --> 00:11:00,290 whence x must equal 1. 212 00:11:00,290 --> 00:11:03,890 And what we have is that near x equals 1, 213 00:11:03,890 --> 00:11:08,080 x squared behaves like-- and I put this in quotation marks, 214 00:11:08,080 --> 00:11:11,210 because that's the hardest part of the course that's 215 00:11:11,210 --> 00:11:14,350 going to follow, was what do you mean by behaves like-- but x 216 00:11:14,350 --> 00:11:17,370 squared behaves like 2x minus 1. 217 00:11:17,370 --> 00:11:19,260 And what we mean by that is this, at least 218 00:11:19,260 --> 00:11:20,740 in terms of a picture. 219 00:11:20,740 --> 00:11:24,130 If I pick a small interval surrounding 220 00:11:24,130 --> 00:11:28,020 x equals 1 on the x-axis, and a small interval-- 221 00:11:28,020 --> 00:11:33,660 like a thick dot-- surrounding y equals 1 on the y-axis here. 222 00:11:33,660 --> 00:11:40,090 Then, as a mapping from this domain into this range, 223 00:11:40,090 --> 00:11:45,520 I can essentially not distinguish f from h. 224 00:11:45,520 --> 00:11:49,800 The error is so small that as the size of the interval 225 00:11:49,800 --> 00:11:53,370 shrinks, the error goes to 0 even faster. 
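One way to see that the error goes to 0 faster than the interval shrinks is to divide the error by the distance from x equals 1. The Python sketch below does this; the name `relative_error` is hypothetical. Since x squared minus (2x minus 1) equals (x minus 1) squared, the ratio should come out equal to the distance itself.

```python
def f(x):
    """The non-linear function x^2."""
    return x * x


def h(x):
    """The tangent line to x^2 at x = 1."""
    return 2 * x - 1


def relative_error(dx):
    """Error |f - h| at x = 1 + dx, divided by the distance |dx|."""
    return abs(f(1 + dx) - h(1 + dx)) / abs(dx)


# The ratio itself shrinks with dx, so the error is of higher order.
for dx in (0.1, 0.01, 0.001):
    print(dx, relative_error(dx))
```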
226 00:11:53,370 --> 00:11:56,900 And therefore, if I stay close enough, locally, 227 00:11:56,900 --> 00:11:59,520 to the point in question-- if I stay close enough to this 228 00:11:59,520 --> 00:12:01,880 point, I cannot tell the difference between 229 00:12:01,880 --> 00:12:05,140 the non-linear function and the linear function. 230 00:12:05,140 --> 00:12:07,290 But what I have to be careful about is this-- 231 00:12:07,290 --> 00:12:11,350 that whereas x squared can be replaced by 2x minus 1 232 00:12:11,350 --> 00:12:15,120 near x equals 1, near x equals 2, 233 00:12:15,120 --> 00:12:18,860 x squared can be replaced again by a linear function, namely 234 00:12:18,860 --> 00:12:20,230 4x minus 4. 235 00:12:20,230 --> 00:12:23,610 But 4x minus 4 is not approximately 236 00:12:23,610 --> 00:12:26,950 the same as 2x minus 1, no matter where you look. 237 00:12:26,950 --> 00:12:29,460 You might say well, lookit, don't these two straight lines 238 00:12:29,460 --> 00:12:31,010 intersect at the particular point? 239 00:12:31,010 --> 00:12:33,050 The answer is yes they do. 240 00:12:33,050 --> 00:12:35,610 But even at the point that they intersect, 241 00:12:35,610 --> 00:12:38,410 there was no neighborhood in which 242 00:12:38,410 --> 00:12:42,410 these lines can serve as approximations for one another. 243 00:12:42,410 --> 00:12:44,220 Those are two straight lines that 244 00:12:44,220 --> 00:12:46,490 intersect at a constant angle, and as soon 245 00:12:46,490 --> 00:12:48,430 as you leave the point of intersection 246 00:12:48,430 --> 00:12:50,050 there is a significant error. 247 00:12:50,050 --> 00:12:53,860 Meaning an error which does not go to 0 more 248 00:12:53,860 --> 00:12:56,260 rapidly than the change in x. 249 00:12:56,260 --> 00:12:59,820 You don't have that higher-order infinitesimal over here. 
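The same ratio test shows why the two tangent lines cannot stand in for each other. In this Python sketch (with the hypothetical helper `gap_ratio`), the lines 4x minus 4 and 2x minus 1 meet at x = 1.5, but the gap between them divided by the distance from that point stays at about 2 instead of shrinking.

```python
def g(x):
    """Tangent line to x^2 at x = 2."""
    return 4 * x - 4


def h(x):
    """Tangent line to x^2 at x = 1."""
    return 2 * x - 1


def gap_ratio(dx):
    """Gap |g - h| at x = 1.5 + dx, divided by the distance |dx|."""
    return abs(g(1.5 + dx) - h(1.5 + dx)) / abs(dx)


# Unlike the tangent-line error, this ratio does not go to 0.
for dx in (0.1, 0.01, 0.001):
    print(dx, gap_ratio(dx))
```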
250 00:12:59,820 --> 00:13:02,470 At any rate, leaving this to the exercises 251 00:13:02,470 --> 00:13:04,880 and the supplementary notes, for you to get more out of, 252 00:13:04,880 --> 00:13:08,810 in summary, let's just say this: if f is continuously 253 00:13:08,810 --> 00:13:12,530 differentiable at x equals a, then locally-- meaning near x 254 00:13:12,530 --> 00:13:16,060 equals a-- f behaves linearly. 255 00:13:16,060 --> 00:13:17,030 In other words, 256 00:13:17,030 --> 00:13:21,970 f of x is approximately f of a plus f prime of a times 257 00:13:21,970 --> 00:13:26,880 the quantity x minus a, and you see, once x is chosen to be a, 258 00:13:26,880 --> 00:13:29,830 this is a number, this is a number, 259 00:13:29,830 --> 00:13:32,080 delta x here is the only variable 260 00:13:32,080 --> 00:13:33,440 on the right-hand side. 261 00:13:33,440 --> 00:13:36,960 So what we're saying is that f of x is a what? 262 00:13:36,960 --> 00:13:39,770 Linear function of delta x. 263 00:13:39,770 --> 00:13:42,160 And the more interesting point is-- 264 00:13:42,160 --> 00:13:44,860 since this is all review, so I say-- what I mean 265 00:13:44,860 --> 00:13:46,509 by interesting point is what? 266 00:13:46,509 --> 00:13:48,300 That we don't have to just review this way, 267 00:13:48,300 --> 00:13:50,330 we did this simply to refresh your memories 268 00:13:50,330 --> 00:13:53,300 as to how linearity was playing a big role in calculus 269 00:13:53,300 --> 00:13:54,800 of a single variable. 270 00:13:54,800 --> 00:13:57,900 Now what we're going to do is extend the result 271 00:13:57,900 --> 00:14:00,190 to several variables. 272 00:14:00,190 --> 00:14:02,420 Let me just say that at the outset. 273 00:14:02,420 --> 00:14:05,240 That this concept does extend to n variables, 274 00:14:05,240 --> 00:14:08,270 but n equals 2 yields a particularly good 275 00:14:08,270 --> 00:14:10,010 geometric insight. 
276 00:14:10,010 --> 00:14:13,870 For example, let's suppose I look at two equations and two 277 00:14:13,870 --> 00:14:15,500 unknowns. 278 00:14:15,500 --> 00:14:19,002 Well actually, I'll use u and v instead. 279 00:14:19,002 --> 00:14:19,960 Let those be variables. 280 00:14:19,960 --> 00:14:21,960 Also, we can think of this as a function. 281 00:14:21,960 --> 00:14:25,540 I have u of x, y is x squared minus y squared, 282 00:14:25,540 --> 00:14:28,510 whereas v of x, y is 2x*y. 283 00:14:28,510 --> 00:14:30,960 Notice that these are not linear, 284 00:14:30,960 --> 00:14:32,640 because here we have things appearing 285 00:14:32,640 --> 00:14:36,540 to the second power, squares, and here we have what? 286 00:14:36,540 --> 00:14:38,560 The variables multiplying one another. 287 00:14:38,560 --> 00:14:42,230 These are not linear equations, but the beautiful point 288 00:14:42,230 --> 00:14:45,830 is-- if you look at it this way-- is even without a picture, 289 00:14:45,830 --> 00:14:47,720 I can think of this as a mapping which 290 00:14:47,720 --> 00:14:50,630 maps two-dimensional space into two-dimensional space. 291 00:14:50,630 --> 00:14:52,540 And how does this mapping take place? 292 00:14:52,540 --> 00:14:55,720 It maps the point or the pair, or the 2-tuple-- 293 00:14:55,720 --> 00:14:57,220 whichever way you want to say it-- 294 00:14:57,220 --> 00:15:01,410 x comma y into the 2-tuple u comma v, 295 00:15:01,410 --> 00:15:06,510 where u is x squared minus y squared, and v is 2x*y. 296 00:15:06,510 --> 00:15:07,790 In other words, 297 00:15:07,790 --> 00:15:10,590 f-bar-- and notice I put the bar underneath, 298 00:15:10,590 --> 00:15:13,810 simply to indicate that E^2 is a vector space, 299 00:15:13,810 --> 00:15:15,700 and we have a function that's mapping what? 300 00:15:15,700 --> 00:15:20,890 A vector into a vector, so I indicate that f is a vector 301 00:15:20,890 --> 00:15:21,570 function here. 
302 00:15:21,570 --> 00:15:23,870 It maps a vector into a vector. 303 00:15:23,870 --> 00:15:25,730 And how does the mapping take place? 304 00:15:25,730 --> 00:15:30,670 It maps the 2-tuple x comma y into the 2-tuple x squared 305 00:15:30,670 --> 00:15:33,700 minus y squared comma 2x*y. 306 00:15:33,700 --> 00:15:35,630 (u, v). 307 00:15:35,630 --> 00:15:38,380 Now, the thing is that as long as we only have n equals 2, 308 00:15:38,380 --> 00:15:42,650 we can still draw a picture, but not a picture as nice as what 309 00:15:42,650 --> 00:15:45,470 existed when n was equal to 1. 310 00:15:45,470 --> 00:15:49,500 See, pictorially, f-bar maps the xy-plane 311 00:15:49,500 --> 00:15:52,350 into what we can call the uv-plane. 312 00:15:52,350 --> 00:15:55,880 But notice that since the domain of f-bar 313 00:15:55,880 --> 00:15:59,170 has two degrees of freedom-- a two-dimensional vector space-- 314 00:15:59,170 --> 00:16:03,360 notice that the domain of f-bar is the entire xy-plane, 315 00:16:03,360 --> 00:16:07,610 whereas the range of f-bar is the entire uv-plane. 316 00:16:07,610 --> 00:16:09,880 In other words, I can now view f-bar 317 00:16:09,880 --> 00:16:13,300 as a mapping which carries points in the xy-plane 318 00:16:13,300 --> 00:16:15,600 into points in the uv-plane. 319 00:16:15,600 --> 00:16:19,020 And this will be exploited more later in the course, 320 00:16:19,020 --> 00:16:20,550 but the idea is this. 321 00:16:20,550 --> 00:16:22,470 Let's take a look for the time being. 322 00:16:22,470 --> 00:16:26,520 Let's see what f-bar does to the point 2 comma 1. 323 00:16:26,520 --> 00:16:30,350 Remember u is x squared minus y squared, 324 00:16:30,350 --> 00:16:33,310 so at the point 2 comma 1, u becomes what? 325 00:16:33,310 --> 00:16:36,840 2 squared minus 1 squared, which is 3. 326 00:16:36,840 --> 00:16:42,360 On the other hand, 2x*y is 2 times 2 times 1, which is 4. 
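The mapping f-bar can be written down directly as a function taking a pair to a pair. The Python sketch below uses the hypothetical name `f_bar` and evaluates it at the point 2 comma 1 used in the lecture.

```python
def f_bar(x, y):
    """Map the pair (x, y) to the pair (u, v) = (x^2 - y^2, 2xy)."""
    u = x * x - y * y
    v = 2 * x * y
    return u, v


# Image of the point (2, 1) in the uv-plane.
print(f_bar(2, 1))
```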
327 00:16:42,360 --> 00:16:45,930 So f-bar can be viewed as mapping the point 2 comma 328 00:16:45,930 --> 00:16:50,010 1 into the point 3 comma 4. 329 00:16:50,010 --> 00:16:52,610 Now you recall that calculus isn't 330 00:16:52,610 --> 00:16:55,470 interested in what's happening at a particular point. 331 00:16:55,470 --> 00:16:58,630 It's interested in what's happening in the neighborhood 332 00:16:58,630 --> 00:17:00,050 of a particular point. 333 00:17:00,050 --> 00:17:03,700 So the major question is, how does f-bar behave 334 00:17:03,700 --> 00:17:06,390 near the point 2 comma 1. 335 00:17:06,390 --> 00:17:08,750 In other words, what is f-bar of 2 336 00:17:08,750 --> 00:17:13,970 plus delta x comma 1 plus delta y, when delta x and delta y are 337 00:17:13,970 --> 00:17:14,710 quite small. 338 00:17:14,710 --> 00:17:17,700 That's the question that we're raising over here. 339 00:17:17,700 --> 00:17:21,250 What we're saying is, we know that 2 comma 1 maps into 3 340 00:17:21,250 --> 00:17:22,790 comma 4. 341 00:17:22,790 --> 00:17:26,109 We also know or we'd like to believe that a point near 2 342 00:17:26,109 --> 00:17:30,050 comma 1 maps into a point near 3 comma 4. 343 00:17:30,050 --> 00:17:32,120 Well if we call this point 2 plus 344 00:17:32,120 --> 00:17:35,030 delta x comma 1 plus delta y, then 345 00:17:35,030 --> 00:17:37,100 the corresponding image over here 346 00:17:37,100 --> 00:17:42,390 should be 3 plus delta u comma 4 plus delta v. 347 00:17:42,390 --> 00:17:46,190 What we can say is that whatever the image of 2 348 00:17:46,190 --> 00:17:48,930 plus delta x comma 1 plus delta y 349 00:17:48,930 --> 00:17:54,120 is, it has the form 3 plus delta u comma 4 plus delta v, 350 00:17:54,120 --> 00:17:57,970 and all we have to do is find delta u and delta v. This 351 00:17:57,970 --> 00:18:00,720 is the pictorial idea of what's happening. 
352 00:18:00,720 --> 00:18:03,050 Now the point is that delta u and delta 353 00:18:03,050 --> 00:18:04,920 v are very difficult to find. 354 00:18:04,920 --> 00:18:07,840 After all, u and v are non-linear functions. 355 00:18:07,840 --> 00:18:10,140 To invert them is either difficult, 356 00:18:10,140 --> 00:18:14,450 or downright impossible, one or the other, in many cases. 357 00:18:14,450 --> 00:18:19,430 The thing that's easy to find is delta u tan, and delta v tan. 358 00:18:19,430 --> 00:18:23,460 Remember delta u tan was the partial of u with respect to x 359 00:18:23,460 --> 00:18:26,400 times delta x, plus the partial of u with respect to y 360 00:18:26,400 --> 00:18:27,730 times delta y. 361 00:18:27,730 --> 00:18:30,620 Since u is equal to x squared minus y squared, 362 00:18:30,620 --> 00:18:35,240 that means delta u tan is 2x delta x minus 2y delta y. 363 00:18:35,240 --> 00:18:38,640 We're interested in this at the point 2 comma 1. 364 00:18:38,640 --> 00:18:43,360 Letting x be 2, and y be 1, we see that delta u tan 365 00:18:43,360 --> 00:18:46,580 is 4 delta x minus 2 delta y. 366 00:18:46,580 --> 00:18:49,810 Similarly, since v is equal to 2x*y, 367 00:18:49,810 --> 00:18:52,790 the partial of v with respect to x is 2y; 368 00:18:52,790 --> 00:18:55,830 the partial of v with respect to y is 2x. 369 00:18:55,830 --> 00:19:01,230 Therefore, delta v sub tan is 2y delta x plus 2x delta y. 370 00:19:01,230 --> 00:19:03,450 Since we're evaluating this at x equals 2, 371 00:19:03,450 --> 00:19:08,560 y equals 1, we see that delta v tan is 2 delta x plus 4 delta 372 00:19:08,560 --> 00:19:09,500 y. 373 00:19:09,500 --> 00:19:11,390 Now here's the key point. 374 00:19:11,390 --> 00:19:14,270 This is always delta u tan. 375 00:19:14,270 --> 00:19:16,600 This is always delta v tan. 
376 00:19:16,600 --> 00:19:19,030 Where the local thing comes in is 377 00:19:19,030 --> 00:19:22,220 that we know that because u and v are continuously 378 00:19:22,220 --> 00:19:26,950 differentiable functions of x and y, that near the point 2 379 00:19:26,950 --> 00:19:32,380 comma 1, we can replace delta u by delta u sub tan, delta v 380 00:19:32,380 --> 00:19:36,010 by delta v sub tan, and we wind up with what? 381 00:19:36,010 --> 00:19:40,110 delta u is approximately 4 delta x minus 2 delta y. 382 00:19:40,110 --> 00:19:43,780 delta v is approximately 2 delta x plus 4 delta y. 383 00:19:43,780 --> 00:19:49,470 But the key point now is that this is 384 00:19:49,470 --> 00:19:51,760 a system of linear equations. 385 00:19:51,760 --> 00:19:54,620 You see, delta u is a linear combination 386 00:19:54,620 --> 00:19:57,130 of delta x and delta y, and delta v 387 00:19:57,130 --> 00:20:01,590 is also a linear combination of delta x and delta y. 388 00:20:01,590 --> 00:20:03,530 In other words, as long as u and v 389 00:20:03,530 --> 00:20:07,460 are continuously differentiable functions of x and y, 390 00:20:07,460 --> 00:20:11,650 we can approximate, locally, delta u and delta 391 00:20:11,650 --> 00:20:15,070 v by linear approximations. 392 00:20:15,070 --> 00:20:18,310 Notice how linear systems come into play. 393 00:20:18,310 --> 00:20:21,300 Now I've been emphasizing the case n equals 2 just 394 00:20:21,300 --> 00:20:23,020 so we could draw a picture. 395 00:20:23,020 --> 00:20:26,150 Notice that no matter how many variables we have-- well, 396 00:20:26,150 --> 00:20:29,820 in fact, let me just summarize this in terms of x and y first. 397 00:20:29,820 --> 00:20:34,090 And then we'll generalize it to n variables in a minute. 398 00:20:34,090 --> 00:20:36,410 The key point for two variables, and what 399 00:20:36,410 --> 00:20:40,250 happens for two variables happens for any number. 
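To make the approximation concrete, the following Python sketch compares the true changes delta u and delta v near the point 2 comma 1 with the linear system 4 delta x minus 2 delta y and 2 delta x plus 4 delta y. The function names `delta_exact` and `delta_lin` and the sample step sizes are assumptions chosen for illustration.

```python
def f_bar(x, y):
    """The mapping (x, y) -> (x^2 - y^2, 2xy)."""
    return x * x - y * y, 2 * x * y


def delta_exact(dx, dy):
    """True change (delta u, delta v) when moving from (2, 1) by (dx, dy)."""
    u0, v0 = f_bar(2.0, 1.0)
    u1, v1 = f_bar(2.0 + dx, 1.0 + dy)
    return u1 - u0, v1 - v0


def delta_lin(dx, dy):
    """Linear approximation at (2, 1): the tangent changes."""
    return 4 * dx - 2 * dy, 2 * dx + 4 * dy


# For a small step the two pairs should nearly agree.
print(delta_exact(0.01, 0.02))
print(delta_lin(0.01, 0.02))
```

For a larger step, such as delta x equals 1, the two pairs drift apart, which is the sense in which the approximation is only local.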
400 00:20:40,250 --> 00:20:41,950 But as we've often done in this course, 401 00:20:41,950 --> 00:20:44,260 we emphasize the two-variable case 402 00:20:44,260 --> 00:20:46,950 because we can still visualize the picture. 403 00:20:46,950 --> 00:20:50,320 Even though the graph idea is hard to see, 404 00:20:50,320 --> 00:20:53,910 because we're mapping two dimensions into two dimensions. 405 00:20:53,910 --> 00:20:56,520 But at least the domain and the range 406 00:20:56,520 --> 00:20:58,970 are easy to see separately, but if u 407 00:20:58,970 --> 00:21:00,920 is a continuously differentiable function 408 00:21:00,920 --> 00:21:05,280 of x and y near the point (x_0, y_0), 409 00:21:05,280 --> 00:21:10,850 then delta u is exactly the partial of u with respect to x 410 00:21:10,850 --> 00:21:14,630 times delta x, plus the partial of u with respect to y times 411 00:21:14,630 --> 00:21:20,440 delta y, plus an error term, k_1 delta x plus k_2 delta y, 412 00:21:20,440 --> 00:21:25,610 where k_1 and k_2 go to 0, as delta x and delta y go to 0. 413 00:21:25,610 --> 00:21:27,490 In other words, 414 00:21:27,490 --> 00:21:31,530 if we just look at this part alone, 415 00:21:31,530 --> 00:21:37,000 delta u is linear up to this as a correction term. 
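For the particular function u equals x squared minus y squared, the error term can be written out exactly. Expanding (x_0 + delta x) squared minus (y_0 + delta y) squared shows that what is left over after the linear part is delta x squared minus delta y squared, so one possible choice is k_1 = delta x and k_2 = minus delta y. The Python sketch below checks this numerically; the helper name `error_term` is hypothetical, and the choice of k_1 and k_2 is not the only one that works.

```python
def u(x, y):
    """u(x, y) = x^2 - y^2."""
    return x * x - y * y


def error_term(x0, y0, dx, dy):
    """delta u minus its linear part u_x*dx + u_y*dy at (x0, y0)."""
    delta_u = u(x0 + dx, y0 + dy) - u(x0, y0)
    delta_u_tan = 2 * x0 * dx - 2 * y0 * dy
    return delta_u - delta_u_tan


dx, dy = 0.01, 0.02
k1, k2 = dx, -dy  # one valid choice; both go to 0 with dx and dy
print(error_term(2.0, 1.0, dx, dy), k1 * dx + k2 * dy)
```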
416 00:21:37,000 --> 00:21:40,450 In other words, the non-linearity part of delta u 417 00:21:40,450 --> 00:21:44,060 is going to 0 as a second-order infinitesimal, 418 00:21:44,060 --> 00:21:46,470 and the reason I keep harping on this point 419 00:21:46,470 --> 00:21:49,140 is that no matter how complex the theory gets, 420 00:21:49,140 --> 00:21:51,520 in the rest of this particular block, 421 00:21:51,520 --> 00:21:54,050 the key step is always going to be 422 00:21:54,050 --> 00:21:55,810 that when you have a continuously 423 00:21:55,810 --> 00:21:59,580 differentiable function you can essentially-- as long as you 424 00:21:59,580 --> 00:22:01,380 stay locally-- you can essentially 425 00:22:01,380 --> 00:22:02,910 throw away the nasty part. 426 00:22:02,910 --> 00:22:05,730 You can essentially throw away this error term, 427 00:22:05,730 --> 00:22:08,720 because it goes to 0 so rapidly that if you stay close 428 00:22:08,720 --> 00:22:12,170 enough to the point x_0, y_0, no harm comes 429 00:22:12,170 --> 00:22:13,870 from neglecting this term. 430 00:22:13,870 --> 00:22:15,850 What you must be careful about is 431 00:22:15,850 --> 00:22:18,490 that as soon as you pick a large enough neighborhood so 432 00:22:18,490 --> 00:22:21,040 that this term is no longer negligible, then 433 00:22:21,040 --> 00:22:25,250 even though this part here is still delta u sub tan, 434 00:22:25,250 --> 00:22:29,880 delta u sub tan is no longer a good approximation for delta u. 435 00:22:29,880 --> 00:22:33,580 At any rate, in n variables, what we're saying is, 436 00:22:33,580 --> 00:22:37,280 suppose w is a function of x_1 up to x_n. 
437 00:22:37,280 --> 00:22:39,960 Then if w happens to be continuously 438 00:22:39,960 --> 00:22:42,540 differentiable at the point corresponding 439 00:22:42,540 --> 00:22:46,850 to x-bar equals a-bar-- meaning, in terms of n-tuples, 440 00:22:46,850 --> 00:22:50,510 x_1 up to x_n is the point a_1 comma up 441 00:22:50,510 --> 00:22:55,110 to a_n-- then what we're saying is that delta w can be replaced 442 00:22:55,110 --> 00:22:57,710 by-- now this has been mentioned in the text, 443 00:22:57,710 --> 00:23:00,100 I don't remember whether we've mentioned this 444 00:23:00,100 --> 00:23:02,300 in previous lectures or not. 445 00:23:02,300 --> 00:23:03,780 It's rather interesting that when 446 00:23:03,780 --> 00:23:07,660 you deal with more than three independent variables we 447 00:23:07,660 --> 00:23:11,340 somehow don't like to use the word delta w sub tan. 448 00:23:11,340 --> 00:23:15,470 Because tangent indicates a tangent line or a tangent plane 449 00:23:15,470 --> 00:23:17,280 which is a geometric concept. 450 00:23:17,280 --> 00:23:23,430 Instead we replace the word tangent by L-I-N 451 00:23:23,430 --> 00:23:26,020 as an abbreviation for linear. 452 00:23:26,020 --> 00:23:27,550 The key point being what? 453 00:23:27,550 --> 00:23:30,730 That this thing that we call delta w sub lin, 454 00:23:30,730 --> 00:23:33,260 or if you like to call it sub tan, what's in a name? 455 00:23:33,260 --> 00:23:34,800 Call it whatever you want. 456 00:23:34,800 --> 00:23:37,975 The point is that this thing that we call delta w sub 457 00:23:37,975 --> 00:23:42,370 lin or delta w sub tan is the partial of f with respect 458 00:23:42,370 --> 00:23:46,770 to x_1 evaluated at a-bar times delta x_1, 459 00:23:46,770 --> 00:23:49,390 plus the partial of f with respect 460 00:23:49,390 --> 00:23:53,970 to x sub n evaluated at a-bar times delta x_n.
461 00:23:53,970 --> 00:23:56,060 And the key point is that once you 462 00:23:56,060 --> 00:23:59,770 have chosen a specific number a-bar, 463 00:23:59,770 --> 00:24:03,770 notice that the coefficients of delta x_1 up to delta x_n 464 00:24:03,770 --> 00:24:05,940 are numbers. 465 00:24:05,940 --> 00:24:07,060 They're not variables. 466 00:24:07,060 --> 00:24:09,500 They are numbers once a is chosen. 467 00:24:09,500 --> 00:24:13,360 So that what is delta w lin, why do we call it linear? 468 00:24:13,360 --> 00:24:17,510 Notice that this expression here is a linear combination 469 00:24:17,510 --> 00:24:19,790 of delta x_1 up to delta x_n. 470 00:24:19,790 --> 00:24:21,110 In other words they're what? 471 00:24:21,110 --> 00:24:25,980 Sums of terms each involving a delta x times-- excuse me. 472 00:24:25,980 --> 00:24:29,770 A delta x times a constant. 473 00:24:29,770 --> 00:24:32,470 What we're saying is that nice functions, 474 00:24:32,470 --> 00:24:34,240 and what's a nice function? 475 00:24:34,240 --> 00:24:37,830 A nice function is one which is continuously differentiable. 476 00:24:37,830 --> 00:24:41,550 A nice function is locally linear. 477 00:24:41,550 --> 00:24:45,840 In other words, a continuously differentiable function, 478 00:24:45,840 --> 00:24:51,390 near a particular point, can be approximated 479 00:24:51,390 --> 00:24:54,460 by a linear function, where the error will 480 00:24:54,460 --> 00:24:56,590 be very small as long as you stay 481 00:24:56,590 --> 00:24:58,025 near the point in question. 482 00:24:58,025 --> 00:24:59,900 You remember, at the beginning of my lecture, 483 00:24:59,900 --> 00:25:02,510 I said something old, something new. 484 00:25:02,510 --> 00:25:06,020 This finishes the old part of the course. 
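The claim that a continuously differentiable function is locally linear can be checked numerically. The following is only a sketch: the function u, the point (x0, y0) = (1, 2), and the step sizes are assumptions chosen for illustration and do not come from the lecture. It compares delta u with delta u sub tan and shows that the error, even after dividing by the step size, keeps shrinking as the step shrinks.

```python
# Hypothetical example function; any continuously differentiable u would do.
def u(x, y):
    return x * x * y + y ** 3

def u_x(x, y):
    # Partial of u with respect to x.
    return 2 * x * y

def u_y(x, y):
    # Partial of u with respect to y.
    return x * x + 3 * y * y

# Assumed base point (x_0, y_0).
x0, y0 = 1.0, 2.0

def errors(h):
    """Return (delta_u, delta_u_tan, error) for the step delta x = delta y = h."""
    delta_u = u(x0 + h, y0 + h) - u(x0, y0)
    # The coefficients u_x(x0, y0) and u_y(x0, y0) are fixed numbers,
    # so delta_u_tan is a linear combination of delta x and delta y.
    delta_u_tan = u_x(x0, y0) * h + u_y(x0, y0) * h
    return delta_u, delta_u_tan, abs(delta_u - delta_u_tan)

for h in (0.1, 0.01, 0.001):
    du, du_tan, err = errors(h)
    print(f"h={h}: delta u={du:.8f}, delta u_tan={du_tan:.8f}, error/h={err / h:.8f}")
```

The ratio error/h going to 0 is what lets us neglect the error term near the point; with a large step the same comparison shows delta u sub tan drifting away from delta u.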
485 00:25:06,020 --> 00:25:09,020 In other words, what I've tried to motivate for you here 486 00:25:09,020 --> 00:25:13,850 is why, if we were remodeling the pre-calculus curriculum, 487 00:25:13,850 --> 00:25:17,500 much more emphasis should be paid to linear equations. 488 00:25:17,500 --> 00:25:22,210 Granted that most functions in real life are non-linear, 489 00:25:22,210 --> 00:25:27,340 the point remains that locally, functions are linear. 490 00:25:27,340 --> 00:25:28,630 OK? 491 00:25:28,630 --> 00:25:30,870 That's the key point. 492 00:25:30,870 --> 00:25:33,653 Locally we deal with linear functions. 493 00:25:36,390 --> 00:25:40,010 Therefore, since all non-linear functions 494 00:25:40,010 --> 00:25:42,790 may be viewed as being linear locally, 495 00:25:42,790 --> 00:25:44,890 this motivates why we should really study 496 00:25:44,890 --> 00:25:47,030 systems of linear equations. 497 00:25:47,030 --> 00:25:48,810 In other words, this motivates the subject 498 00:25:48,810 --> 00:25:50,800 called linear systems. 499 00:25:50,800 --> 00:25:52,630 Now what is a linear system? 500 00:25:52,630 --> 00:25:58,720 Essentially, a linear system is m equations in n unknowns. 501 00:25:58,720 --> 00:26:01,840 In many cases m and n are taken to be equal, 502 00:26:01,840 --> 00:26:03,680 but what kind of equations are they? 503 00:26:03,680 --> 00:26:06,890 They are equations where all the variables appear separately 504 00:26:06,890 --> 00:26:10,940 to the first power multiplied only by a constant term, 505 00:26:10,940 --> 00:26:13,670 and by the way, let me introduce this double subscript 506 00:26:13,670 --> 00:26:18,220 notation rather than introducing umpteen different symbols 507 00:26:18,220 --> 00:26:19,390 for constants. 508 00:26:19,390 --> 00:26:21,400 Notice that a very nice device here 509 00:26:21,400 --> 00:26:25,800 is to pick one symbol, like an a, and then use two subscripts. 
510 00:26:25,800 --> 00:26:29,280 The first subscript telling you what row 511 00:26:29,280 --> 00:26:32,520 the coefficient is referring to, and the second one 512 00:26:32,520 --> 00:26:33,760 which column. 513 00:26:33,760 --> 00:26:37,030 Or in terms of the equations, the first subscript 514 00:26:37,030 --> 00:26:39,630 tells you which equation you're dealing with, 515 00:26:39,630 --> 00:26:42,150 and the second subscript tells you 516 00:26:42,150 --> 00:26:44,510 what variable it's multiplying. 517 00:26:44,510 --> 00:26:46,360 For example this is what? 518 00:26:46,360 --> 00:26:51,930 This is the coefficient of x sub 1 in the first equation. 519 00:26:51,930 --> 00:26:57,140 This is the coefficient of x sub n in the first equation. 520 00:26:57,140 --> 00:27:04,880 This is the coefficient of x sub n in the m-th equation. 521 00:27:04,880 --> 00:27:08,270 Think of this as the row and the column if you will. 522 00:27:08,270 --> 00:27:13,270 And what we're saying then is that the solutions of this type 523 00:27:13,270 --> 00:27:16,460 of system of equations are really controlled 524 00:27:16,460 --> 00:27:18,820 by the coefficients of the x's. 525 00:27:18,820 --> 00:27:22,480 In other words, by the numbers a sub ij, 526 00:27:22,480 --> 00:27:26,760 where i and j can take on-- well i takes on all values 527 00:27:26,760 --> 00:27:28,380 from what? 528 00:27:28,380 --> 00:27:39,110 The number of rows. i goes from 1 to m, and j goes from 1 to n.
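Since the board is not visible in a transcript, here is a sketch of what a general system of m equations in n unknowns looks like in the double subscript notation. The symbols b_1 up to b_m for the constants on the right-hand side are an assumption, chosen to match the b_1 and b_2 used in the two-equation example.

```latex
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\qquad
\text{where } a_{ij} \text{ multiplies } x_j \text{ in equation } i,\quad 1 \le i \le m,\ 1 \le j \le n.
```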
529 00:27:39,110 --> 00:27:42,070 But the a's become very important, 530 00:27:42,070 --> 00:27:43,570 and this is what ultimately is going 531 00:27:43,570 --> 00:27:46,280 to motivate what we mean by a matrix, 532 00:27:46,280 --> 00:27:48,910 but before I come to that, let me give you just one 533 00:27:48,910 --> 00:27:54,510 example of what I mean by saying that the equations are governed 534 00:27:54,510 --> 00:27:59,020 by the coefficients of the x's, not by the constants 535 00:27:59,020 --> 00:28:00,480 on the right-hand side. 536 00:28:00,480 --> 00:28:02,340 By the way, notice the convention 537 00:28:02,340 --> 00:28:04,980 that when you have two equations with two unknowns, 538 00:28:04,980 --> 00:28:07,760 rather than call the unknowns x_1 and x_2, 539 00:28:07,760 --> 00:28:10,910 it's conventional to call the unknowns x and y. 540 00:28:10,910 --> 00:28:14,030 Let's take a particularly simple system here-- x plus y 541 00:28:14,030 --> 00:28:17,530 equals b_1, x minus y equals b_2. 542 00:28:17,530 --> 00:28:19,540 If we add these two equations, we 543 00:28:19,540 --> 00:28:23,320 get 2x is b_1 plus b_2, whereupon 544 00:28:23,320 --> 00:28:26,720 x is b_1 plus b_2 over 2. 545 00:28:26,720 --> 00:28:28,440 If we subtract the two equations, 546 00:28:28,440 --> 00:28:32,640 we get 2y is b_1 minus b_2, whereupon y 547 00:28:32,640 --> 00:28:35,720 is b_1 minus b_2 over 2. 548 00:28:35,720 --> 00:28:38,950 Notice that this tells us how to solve for x and y 549 00:28:38,950 --> 00:28:41,060 in terms of b_1 and b_2. 550 00:28:41,060 --> 00:28:44,400 Namely, to find x you take half the sum of the two b's. 551 00:28:44,400 --> 00:28:47,280 To find y, you take half the difference. 552 00:28:47,280 --> 00:28:49,710 Now certainly, the solution depends 553 00:28:49,710 --> 00:28:51,900 on the values of b_1 and b_2. 
554 00:28:51,900 --> 00:28:54,520 I'm not saying you don't change the answers by changing 555 00:28:54,520 --> 00:28:55,860 the constants on this side. 556 00:28:55,860 --> 00:28:59,280 What I am saying is that the structure by which you 557 00:28:59,280 --> 00:29:02,670 find the answers does not depend on b_1 and b_2; 558 00:29:02,670 --> 00:29:07,370 it's determined solely by the coefficients of x and y. 559 00:29:07,370 --> 00:29:08,950 What we're saying is, no matter what 560 00:29:08,950 --> 00:29:11,690 b_1 and b_2 are in this particular problem, 561 00:29:11,690 --> 00:29:15,400 to find x and y we take half the sum of the b's, and we 562 00:29:15,400 --> 00:29:17,300 take half the difference. 563 00:29:17,300 --> 00:29:18,860 In other words, the solution depends 564 00:29:18,860 --> 00:29:24,232 on b_1 and b_2 numerically, but not structurally. 565 00:29:24,232 --> 00:29:26,190 Well, the whole idea is this-- and this is what 566 00:29:26,190 --> 00:29:28,480 we so often do in mathematics. 567 00:29:28,480 --> 00:29:32,570 Because the solution to our equations 568 00:29:32,570 --> 00:29:36,420 depends on the coefficients of the x's, we somehow 569 00:29:36,420 --> 00:29:40,130 want to focus our attention on the coefficients. 570 00:29:40,130 --> 00:29:41,810 And we don't need the x's in there, 571 00:29:41,810 --> 00:29:45,380 because we can sort of think of the x's as being a place value 572 00:29:45,380 --> 00:29:46,430 type of situation. 573 00:29:46,430 --> 00:29:49,070 In other words, x_1 can be thought 574 00:29:49,070 --> 00:29:51,470 of as being the first column. 575 00:29:51,470 --> 00:29:53,200 x_2 the second column. 576 00:29:53,200 --> 00:29:56,090 The first equation can be thought of as the first row. 577 00:29:56,090 --> 00:29:58,070 The second equation, the second row. 578 00:29:58,070 --> 00:30:00,270 And what this motivates is a concept 579 00:30:00,270 --> 00:30:02,500 called an m by n matrix. 
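The point that the solution depends on b_1 and b_2 numerically but not structurally can be sketched in a few lines. The function name solve and the sample values of b_1 and b_2 are assumptions made for illustration; the recipe itself (half the sum, half the difference) is the one derived above for x plus y equals b_1, x minus y equals b_2.

```python
def solve(b1, b2):
    """Solve x + y = b1, x - y = b2 by adding and subtracting the equations."""
    x = (b1 + b2) / 2  # half the sum of the b's
    y = (b1 - b2) / 2  # half the difference of the b's
    return x, y

# Sample right-hand sides (hypothetical values).
for b1, b2 in [(4, 2), (10, -6), (0, 1)]:
    x, y = solve(b1, b2)
    # Substituting back recovers the original right-hand sides.
    assert x + y == b1 and x - y == b2
    print(f"b1={b1}, b2={b2} -> x={x}, y={y}")
```

The body of solve never changes as b_1 and b_2 change; only the numbers fed into it do. That fixed body is determined by the coefficients of x and y.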
580 00:30:02,500 --> 00:30:05,930 Now this sounds like a very ominous term, an m by n matrix. 581 00:30:05,930 --> 00:30:10,070 But the point is it's not a very ominous term. 582 00:30:10,070 --> 00:30:12,980 It's in fact, I think that it's too-- in fact 583 00:30:12,980 --> 00:30:16,180 the word matrix essentially indicates an array, 584 00:30:16,180 --> 00:30:17,620 and that's all this thing is. 585 00:30:17,620 --> 00:30:21,910 By an m by n matrix, we simply mean a rectangular array 586 00:30:21,910 --> 00:30:26,870 of numbers, arranged to form m rows-- 587 00:30:26,870 --> 00:30:27,690 In other words, 588 00:30:27,690 --> 00:30:33,870 the first number tells you the number of rows, 589 00:30:33,870 --> 00:30:38,070 and the second number tells you the number of columns. 590 00:30:38,070 --> 00:30:40,070 Now there's certainly nothing logical about that 591 00:30:40,070 --> 00:30:41,620 in terms of our game idea. 592 00:30:41,620 --> 00:30:44,187 Just memorize this, it's a rule of the game or a definition. 593 00:30:44,187 --> 00:30:45,770 Somebody could've said, why didn't you 594 00:30:45,770 --> 00:30:47,590 give the columns first and then the rows? 595 00:30:47,590 --> 00:30:50,000 Well we could've, but one of them had to come first. 596 00:30:50,000 --> 00:30:52,890 And the convention is that one refers to the rows 597 00:30:52,890 --> 00:30:54,910 first, and then the columns. 598 00:30:54,910 --> 00:30:57,030 An m by n matrix then is what? 599 00:30:57,030 --> 00:31:00,520 It's a rectangular array of numbers consisting 600 00:31:00,520 --> 00:31:03,970 of m rows and n columns. 601 00:31:03,970 --> 00:31:08,580 By way of an example-- by the way, to indicate that you're 602 00:31:08,580 --> 00:31:10,590 talking about a matrix, one usually 603 00:31:10,590 --> 00:31:16,209 encloses the array in brackets, or in parentheses. 604 00:31:16,209 --> 00:31:17,500 It doesn't make any difference.
605 00:31:17,500 --> 00:31:21,180 I will use whichever one strikes my fancy at the moment. 606 00:31:21,180 --> 00:31:23,310 And it happens to be brackets right now. 607 00:31:23,310 --> 00:31:26,110 But if I write down this array-- what is it now? 608 00:31:26,110 --> 00:31:31,070 [1, 1, 1; 1, -1, 2]. 609 00:31:31,070 --> 00:31:35,120 This is a rectangular array of numbers consisting of what? 610 00:31:35,120 --> 00:31:40,220 Two rows and three columns. 611 00:31:40,220 --> 00:31:44,170 And so this is an example of a 2 by 3 matrix. 612 00:31:44,170 --> 00:31:45,990 A 2 by 3 matrix. 613 00:31:45,990 --> 00:31:49,500 Now again, we don't want to invent this thing vacuously. 614 00:31:49,500 --> 00:31:53,230 Let's keep track of what this matrix is 615 00:31:53,230 --> 00:31:57,291 coding for us in terms of a system of equations. 616 00:31:57,291 --> 00:31:57,790 Well. 617 00:31:57,790 --> 00:32:01,790 For example, suppose we have the system of equations z_1 618 00:32:01,790 --> 00:32:05,350 is equal to y_1 plus y_2 plus y_3. 619 00:32:05,350 --> 00:32:09,870 z_2 is equal to y_1 minus y_2 plus 2*y_3, 620 00:32:09,870 --> 00:32:13,270 and we want to think of the y_1, y_2, 621 00:32:13,270 --> 00:32:19,000 and y_3 as being the variables, z_1 and z_2 as being 622 00:32:19,000 --> 00:32:20,020 the constants here. 623 00:32:20,020 --> 00:32:22,580 What is the matrix of coefficients here? 624 00:32:22,580 --> 00:32:25,410 Well the matrix would be what? 625 00:32:25,410 --> 00:32:32,070 The coefficient of the first variable in the first row 626 00:32:32,070 --> 00:32:35,110 is 1; second variable, first row is 1; 627 00:32:35,110 --> 00:32:39,870 third variable, first row is 1. 628 00:32:39,870 --> 00:32:40,550 You see?
629 00:32:40,550 --> 00:32:42,930 Second equation, first variable coefficient 630 00:32:42,930 --> 00:32:48,120 is 1; second equation, second variable coefficient 631 00:32:48,120 --> 00:32:49,540 is minus 1; 632 00:32:49,540 --> 00:32:53,010 second equation, third variable coefficient is 2. 633 00:32:53,010 --> 00:32:55,360 So using our matrix coding system, 634 00:32:55,360 --> 00:32:57,490 the matrix of coefficients would be what? 635 00:32:57,490 --> 00:33:03,580 [1, 1, 1; 1, -1, 2]. 636 00:33:03,580 --> 00:33:08,296 Which is exactly the matrix that we wrote down over here. 637 00:33:08,296 --> 00:33:10,170 And to put this into a different perspective, 638 00:33:10,170 --> 00:33:13,610 so to see what we're driving at, let's take a second example 639 00:33:13,610 --> 00:33:17,400 where we first start out with three equations and four 640 00:33:17,400 --> 00:33:17,950 unknowns. 641 00:33:17,950 --> 00:33:20,150 Three linear equations and four unknowns. 642 00:33:20,150 --> 00:33:22,560 And then we'll write the matrix for this afterwards. 643 00:33:22,560 --> 00:33:27,470 But let the equations be y sub 1 is x_1 plus 2*x_2 plus x_3 plus 644 00:33:27,470 --> 00:33:28,900 x_4. 645 00:33:28,900 --> 00:33:33,510 y_2 is 2*x_1 minus x_2 minus x_3 plus 3*x_4. 646 00:33:33,510 --> 00:33:37,760 y_3 is 3*x_1 plus x_2 plus 2*x_3 minus x_4. 647 00:33:37,760 --> 00:33:41,165 If I want to write the matrix of coefficients, what do I do? 648 00:33:41,165 --> 00:33:44,610 I simply leave the variables out, and write down what? 649 00:33:44,610 --> 00:33:46,230 My first row would be what? 650 00:33:46,230 --> 00:33:49,460 [1, 2, 1, 1]. 651 00:33:49,460 --> 00:33:54,700 My second row would be [2, -1, -1, 3]. 652 00:33:54,700 --> 00:34:00,320 My third row would be [3, 1, 2, -1]. 653 00:34:00,320 --> 00:34:02,440 In other words, my matrix of coefficients, 654 00:34:02,440 --> 00:34:04,840 now, would be what kind of a matrix?
655 00:34:04,840 --> 00:34:07,440 It would be a rectangular array of numbers, 656 00:34:07,440 --> 00:34:12,719 consisting of three rows and four columns. 657 00:34:12,719 --> 00:34:13,690 All right? 658 00:34:13,690 --> 00:34:16,510 And that would be called a 3 by 4 matrix. 659 00:34:16,510 --> 00:34:20,000 Again, notice, in this coding system, the number of rows 660 00:34:20,000 --> 00:34:23,330 corresponds to the number of equations. 661 00:34:23,330 --> 00:34:26,340 And the number of columns corresponds 662 00:34:26,340 --> 00:34:29,250 to the number of variables that are 663 00:34:29,250 --> 00:34:32,590 formed in linear combinations. 664 00:34:32,590 --> 00:34:36,820 To summarize this again, the matrix 665 00:34:36,820 --> 00:34:39,280 of coefficients in our second example 666 00:34:39,280 --> 00:34:48,040 is the 3 by 4 matrix [1, 2, 1, 1; 2, -1, -1, 3; 3, 1, 2, -1]. 667 00:34:48,040 --> 00:34:52,070 Well again, let's recall that when we do mathematics, 668 00:34:52,070 --> 00:34:53,909 we don't like to introduce notation 669 00:34:53,909 --> 00:34:55,409 for the sake of notation. 670 00:34:55,409 --> 00:34:59,970 And simply to be able to have a way of conveniently writing 671 00:34:59,970 --> 00:35:02,990 the coefficients, but not being able to use it 672 00:35:02,990 --> 00:35:06,620 efficiently would be a rather stupid thing to do. 673 00:35:06,620 --> 00:35:09,700 Why invent new notation if it's not going to help 674 00:35:09,700 --> 00:35:12,180 us effectively solve new problems? 675 00:35:12,180 --> 00:35:14,850 This is why in mathematics we've been emphasizing 676 00:35:14,850 --> 00:35:20,170 the game idea, whereby what we really care about is structure. 677 00:35:20,170 --> 00:35:24,270 We care about structure, not about the terms themselves. 678 00:35:24,270 --> 00:35:26,090 And to motivate what I'm driving at, 679 00:35:26,090 --> 00:35:29,000 let me return to examples one and two.
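The coding system from the two examples can be sketched with nested lists, one inner list per equation. The names A, B, shape, and evaluate are assumptions introduced here for illustration; the numbers are the coefficients read off in examples one and two.

```python
# Example one: the z's in terms of the y's (2 equations, 3 variables).
A = [[1, 1, 1],
     [1, -1, 2]]

# Example two: the y's in terms of the x's (3 equations, 4 variables).
B = [[1, 2, 1, 1],
     [2, -1, -1, 3],
     [3, 1, 2, -1]]

def shape(matrix):
    """Return (rows, columns): rows first, following the lecture's convention."""
    return len(matrix), len(matrix[0])

def evaluate(matrix, values):
    """Evaluate each equation: one linear combination of `values` per row."""
    return [sum(c * v for c, v in zip(row, values)) for row in matrix]

print(shape(A))  # number of equations, number of variables in example one
print(shape(B))  # number of equations, number of variables in example two

# With y_1 = 1, y_2 = 2, y_3 = 3 (hypothetical values):
# z_1 = 1 + 2 + 3 and z_2 = 1 - 2 + 2*3.
print(evaluate(A, [1, 2, 3]))
```

Nothing is lost by leaving the variables out: the position of a number in its row says which variable it multiplies, and the row says which equation it belongs to.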
680 00:35:29,000 --> 00:35:34,060 And bring up a question that has great impact-- 681 00:35:34,060 --> 00:35:36,200 and even if we don't appreciate it right now 682 00:35:36,200 --> 00:35:38,210 in terms of a practical application, 683 00:35:38,210 --> 00:35:40,680 let's at least see what's happening. 684 00:35:40,680 --> 00:35:44,660 You'll notice that if I look at these systems of equations 685 00:35:44,660 --> 00:35:49,000 over here, notice that the first two equations tell me 686 00:35:49,000 --> 00:35:54,810 how to express z_1 and z_2 in terms of y_1, y_2, and y_3. 687 00:35:54,810 --> 00:35:57,930 On the other hand, the second system of equations 688 00:35:57,930 --> 00:36:02,190 tells me how to express y_1, y_2, and y_3 in terms 689 00:36:02,190 --> 00:36:05,500 of x_1, x_2, x_3, and x_4. 690 00:36:05,500 --> 00:36:06,990 Now, without belaboring the point 691 00:36:06,990 --> 00:36:10,160 because the arithmetic is quite trivial here, 692 00:36:10,160 --> 00:36:12,980 a very natural question that might come up next 693 00:36:12,980 --> 00:36:15,710 is, lookit, let's look at our old friend the chain 694 00:36:15,710 --> 00:36:16,650 rule again. 695 00:36:16,650 --> 00:36:18,690 Since the z's are expressed in terms 696 00:36:18,690 --> 00:36:21,600 of the y's, and the y's are expressed in terms 697 00:36:21,600 --> 00:36:24,970 of the x's, it seems that by direct substitution, 698 00:36:24,970 --> 00:36:29,190 I should be able to express the z's in terms of the x's. 699 00:36:29,190 --> 00:36:34,410 Namely, I replace y_1 by this linear combination of the x's. 700 00:36:34,410 --> 00:36:38,510 I replace y_2 by this linear combination of the x's. 701 00:36:38,510 --> 00:36:43,050 I replace y_3 by this linear combination of the x's. 702 00:36:43,050 --> 00:36:50,040 I then combine the y's in terms of the x's as indicated here. 703 00:36:50,040 --> 00:36:52,860 And that should give me the z's in terms of the x's.
704 00:36:52,860 --> 00:36:57,780 Leaving that, hopefully, as a trivial exercise, 705 00:36:57,780 --> 00:37:00,890 we come to the next example that I'd like to mention here, 706 00:37:00,890 --> 00:37:03,710 and that is: suppose you were told 707 00:37:03,710 --> 00:37:09,350 to express z_1 and z_2 in terms of x_1, x_2, x_3 and x_4. 708 00:37:09,350 --> 00:37:13,240 The point is that, with the amount of arithmetic mentioned 709 00:37:13,240 --> 00:37:20,070 before, we could easily show that z_1 was 6*x_1 plus 2*x_2 710 00:37:20,070 --> 00:37:22,610 plus 2*x_3 plus 3*x_4, 711 00:37:22,610 --> 00:37:29,900 while z_2 was 5*x_1 plus 5*x_2 plus 6*x_3 minus 4*x_4, 712 00:37:29,900 --> 00:37:32,590 by a straightforward substitution. 713 00:37:32,590 --> 00:37:34,380 The point is that somehow or other, 714 00:37:34,380 --> 00:37:38,500 we would like to be able to handle this substitution more 715 00:37:38,500 --> 00:37:40,010 efficiently. 716 00:37:40,010 --> 00:37:46,050 Is there a neater way of being able to transform 717 00:37:46,050 --> 00:37:49,500 the z's into the x's by way of the y's? 718 00:37:49,500 --> 00:37:52,300 In other words, is there a way of replacing the y's 719 00:37:52,300 --> 00:37:54,690 by the x's, and then finding z's in terms 720 00:37:54,690 --> 00:37:58,630 of x's in a convenient, mechanical way that 721 00:37:58,630 --> 00:38:00,640 will save us many steps? 722 00:38:00,640 --> 00:38:04,620 Not so much in these easy examples where you have 2 by 3, 723 00:38:04,620 --> 00:38:08,365 and 3 by 4 systems, but cases where you might have 724 00:38:08,365 --> 00:38:10,140 10 equations and 10 unknowns. 725 00:38:10,140 --> 00:38:12,460 Or 10 equations and 12 unknowns. 726 00:38:12,460 --> 00:38:14,191 And the answer is, there is a way. 727 00:38:14,191 --> 00:38:16,190 Of course, you knew there was going to be a way.
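The direct substitution left as an exercise can be sketched as follows. The helper name substitute and the list names A and B are assumptions for illustration. The code replaces each y by its linear combination of the x's and collects the coefficient of each x, which is exactly the bookkeeping one would do by hand.

```python
# z's in terms of y's: z_1 = y_1 + y_2 + y_3, z_2 = y_1 - y_2 + 2*y_3.
A = [[1, 1, 1],
     [1, -1, 2]]

# y's in terms of x's, one row per y.
B = [[1, 2, 1, 1],
     [2, -1, -1, 3],
     [3, 1, 2, -1]]

def substitute(z_row, y_rows):
    """Replace each y by its x-combination and collect the coefficient of each x."""
    n_x = len(y_rows[0])
    coeffs = [0] * n_x
    for y_coeff, y_row in zip(z_row, y_rows):
        for k in range(n_x):
            # y_coeff multiplies every term of that y's expansion.
            coeffs[k] += y_coeff * y_row[k]
    return coeffs

z1 = substitute(A[0], B)
z2 = substitute(A[1], B)
print(z1)  # coefficients of x_1, x_2, x_3, x_4 in z_1
print(z2)  # coefficients of x_1, x_2, x_3, x_4 in z_2
```

For instance, the coefficient of x_1 in z_1 is 1*1 + 1*2 + 1*3 = 6, and the coefficient of x_4 in z_2 is 1*1 - 1*3 + 2*(-1) = -4, in agreement with the values stated above.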
728 00:38:16,190 --> 00:38:18,100 Otherwise we wouldn't be leading up to it 729 00:38:18,100 --> 00:38:21,200 in this particular way, and as so often happens, 730 00:38:21,200 --> 00:38:24,400 there usually happens to be a real-life situation that 731 00:38:24,400 --> 00:38:26,520 motivates why we invent something 732 00:38:26,520 --> 00:38:29,720 called matrix algebra. 733 00:38:29,720 --> 00:38:31,860 In terms of our present illustration, 734 00:38:31,860 --> 00:38:33,840 the chain rule that we were just talking 735 00:38:33,840 --> 00:38:36,570 about expressing the z's in terms of the y's, and then 736 00:38:36,570 --> 00:38:39,170 the y's in terms of the x's motivates 737 00:38:39,170 --> 00:38:42,240 what we mean by matrix multiplication. 738 00:38:42,240 --> 00:38:45,400 And you may notice that I put "multiplication" here 739 00:38:45,400 --> 00:38:47,000 in quotation marks. 740 00:38:47,000 --> 00:38:48,720 The reason I put it in quotation marks 741 00:38:48,720 --> 00:38:51,520 is that unfortunately the word "multiplication" 742 00:38:51,520 --> 00:38:55,250 has a connotation of multiplying numbers together. 743 00:38:55,250 --> 00:38:56,510 Don't think of it that way. 744 00:38:56,510 --> 00:38:58,490 Think of multiplication meaning what? 745 00:38:58,490 --> 00:39:03,260 A way of combining two matrices to form another matrix. 746 00:39:03,260 --> 00:39:06,950 There's going to be no logic behind this other than one 747 00:39:06,950 --> 00:39:08,930 very famous piece of logic. 748 00:39:08,930 --> 00:39:12,220 That is knowing what the answer was supposed to be, 749 00:39:12,220 --> 00:39:15,560 we make up our rules to guarantee us 750 00:39:15,560 --> 00:39:17,995 that we will get the appropriate answer. 751 00:39:17,995 --> 00:39:18,620 In other words, 752 00:39:18,620 --> 00:39:22,300 I remember when I was an undergraduate in college.
753 00:39:22,300 --> 00:39:25,030 The big type of humor that was going around at that time 754 00:39:25,030 --> 00:39:27,540 was the idea of, somebody would give you the answer, 755 00:39:27,540 --> 00:39:29,685 and you have to make up the question. 756 00:39:29,685 --> 00:39:31,310 Oh, they were silly little things like, 757 00:39:31,310 --> 00:39:35,115 if the answer to the question was 9w what was the question? 758 00:39:35,115 --> 00:39:36,490 And the question would be, do you 759 00:39:36,490 --> 00:39:39,970 spell your last name with a V, Herr Wagner, and the answer 760 00:39:39,970 --> 00:39:41,440 would be "nein, W." 761 00:39:41,440 --> 00:39:43,164 And these were funny jokes at that time. 762 00:39:43,164 --> 00:39:45,080 I don't know whether they're funny now or not. 763 00:39:45,080 --> 00:39:46,730 But the funny point is this. 764 00:39:46,730 --> 00:39:49,520 That this joke, which might not be that funny, 765 00:39:49,520 --> 00:39:52,750 is exactly how we motivate definitions and rules 766 00:39:52,750 --> 00:39:53,620 in mathematics. 767 00:39:53,620 --> 00:39:56,150 We start with the answer, and then 768 00:39:56,150 --> 00:39:58,460 go back, and answer the question. 769 00:39:58,460 --> 00:40:01,550 Knowing in advance that somehow or other, 770 00:40:01,550 --> 00:40:03,820 the matrix that expresses the z's 771 00:40:03,820 --> 00:40:07,880 in terms of the y's is given by this. 772 00:40:07,880 --> 00:40:11,940 And the matrix that expresses the y's in terms of the x's, is 773 00:40:11,940 --> 00:40:13,830 given by this matrix. 774 00:40:13,830 --> 00:40:16,680 Somehow or other what we would like to do 775 00:40:16,680 --> 00:40:22,070 is invent a way of combining these two matrices to give me 776 00:40:22,070 --> 00:40:24,970 the matrix that expresses this answer. 
777 00:40:24,970 --> 00:40:27,940 In other words, if I start knowing what the answer is 778 00:40:27,940 --> 00:40:31,390 supposed to be-- in other words, what is the matrix that 779 00:40:31,390 --> 00:40:33,950 expresses the z's in terms of the x's? 780 00:40:33,950 --> 00:40:38,020 It's the matrix whose first row is [6, 2, 2, 3]. 781 00:40:38,020 --> 00:40:42,240 And whose second row is [5, 5, 6, -4]. 782 00:40:42,240 --> 00:40:43,990 In other words, the matrix would be what? 783 00:40:43,990 --> 00:40:50,530 [6, 2, 2, 3; 5, 5, 6, -4]. 784 00:40:50,530 --> 00:40:52,880 And without even looking at any mechanical rule, 785 00:40:52,880 --> 00:40:55,390 the question that comes up is, how can I 786 00:40:55,390 --> 00:40:57,630 invent a rule that will tell me how 787 00:40:57,630 --> 00:41:04,470 to multiply this 2 by 3 matrix by this 3 by 4 matrix 788 00:41:04,470 --> 00:41:08,480 to obtain this 2 by 4 matrix? 789 00:41:08,480 --> 00:41:10,900 2 by 4 matrix. 790 00:41:10,900 --> 00:41:11,520 Now lookit. 791 00:41:11,520 --> 00:41:14,530 In the notes, I'm going to do this in great detail. 792 00:41:14,530 --> 00:41:16,680 There will be many exercises on this for you 793 00:41:16,680 --> 00:41:18,670 to sharpen your teeth on. 794 00:41:18,670 --> 00:41:22,030 But for now I just want to hit this main point, 795 00:41:22,030 --> 00:41:24,560 because the lecture is quite long. 796 00:41:24,560 --> 00:41:28,110 Your attention span probably is starting to be taxed. 797 00:41:28,110 --> 00:41:31,480 And so I just want to show you what the recipe is, 798 00:41:31,480 --> 00:41:34,180 because my feeling is that this is something 799 00:41:34,180 --> 00:41:38,350 you have to hear before you can really read it without becoming 800 00:41:38,350 --> 00:41:40,540 panicked by the notation. 
801 00:41:40,540 --> 00:41:44,320 The idea is this: first of all, to multiply two matrices, 802 00:41:44,320 --> 00:41:47,780 all we ever require is that the number of columns 803 00:41:47,780 --> 00:41:51,250 in the first matrix equals the number of rows 804 00:41:51,250 --> 00:41:52,800 in the second matrix. 805 00:41:52,800 --> 00:41:55,090 And if that sounds complicated to you, 806 00:41:55,090 --> 00:41:58,800 simply think in terms of the chain rule again. 807 00:41:58,800 --> 00:42:01,850 The number of columns in the first matrix 808 00:42:01,850 --> 00:42:03,500 tells you how many unknowns there 809 00:42:03,500 --> 00:42:06,070 are in the first system of equations. 810 00:42:06,070 --> 00:42:09,730 And that number of unknowns gives you 811 00:42:09,730 --> 00:42:12,690 the number of equations in the second system. 812 00:42:12,690 --> 00:42:16,200 In other words, the number of columns in the first matrix 813 00:42:16,200 --> 00:42:21,880 must match the number of rows in the second matrix. 814 00:42:21,880 --> 00:42:25,290 Notice, we don't care about the number of rows 815 00:42:25,290 --> 00:42:27,690 in the first one matching the number of columns 816 00:42:27,690 --> 00:42:31,280 in the second, all we care is that the number of columns 817 00:42:31,280 --> 00:42:34,230 in the first matrix-- namely three here-- match 818 00:42:34,230 --> 00:42:37,410 the number of rows of the second, which is three. 819 00:42:37,410 --> 00:42:41,640 Then the rule works in a very interesting mechanical way that 820 00:42:41,640 --> 00:42:43,610 makes use of the dot product. 821 00:42:43,610 --> 00:42:45,650 Namely, what you do is, suppose I 822 00:42:45,650 --> 00:42:50,080 want to find the term in the product of these two matrices 823 00:42:50,080 --> 00:42:55,200 that occupies the second row, third column. 824 00:42:55,200 --> 00:42:58,430 What I do is I take the second row-- in other words, 825 00:42:58,430 --> 00:43:02,190 the row comes from the first matrix.
826 00:43:02,190 --> 00:43:04,830 I take the column value from the second matrix. 827 00:43:04,830 --> 00:43:06,240 In other words, I have what? 828 00:43:06,240 --> 00:43:10,150 Second row, third column. 829 00:43:10,150 --> 00:43:14,810 And I form the usual dot product that we've talked about. 830 00:43:14,810 --> 00:43:18,350 I dot the second row with the third column. 831 00:43:18,350 --> 00:43:20,010 And what would I get if I did that? 832 00:43:20,010 --> 00:43:27,310 1 times 1 is 1; minus 1 times minus 1 is 1; and 2 times 2 833 00:43:27,310 --> 00:43:28,200 is 4. 834 00:43:28,200 --> 00:43:32,470 So 1 plus 1 plus 4 is 6. 835 00:43:32,470 --> 00:43:35,690 So in this product matrix, the term 836 00:43:35,690 --> 00:43:39,920 in the second row, third column will be 6. 837 00:43:39,920 --> 00:43:44,820 The term in the second row, third column will be 6. 838 00:43:44,820 --> 00:43:47,570 Second row, third column will be 6. 839 00:43:47,570 --> 00:43:50,460 Now, leaving it as an exercise for the time being, 840 00:43:50,460 --> 00:43:52,230 and reading it in the supplementary notes, 841 00:43:52,230 --> 00:43:54,460 I'm sure you'll be able to put this all together. 842 00:43:54,460 --> 00:43:57,220 It's not nearly as difficult as it sounds 843 00:43:57,220 --> 00:43:58,760 hearing it the first time. 844 00:43:58,760 --> 00:44:01,850 I think the most difficult part is rationalizing 845 00:44:01,850 --> 00:44:05,260 why one would invent such a definition in the first place. 846 00:44:05,260 --> 00:44:08,620 The answer is very simple: we invent the definition 847 00:44:08,620 --> 00:44:11,470 to solve a particular problem. 848 00:44:11,470 --> 00:44:13,800 Coming back here again, all I'm saying 849 00:44:13,800 --> 00:44:15,790 is that if I invent-- for example, 850 00:44:15,790 --> 00:44:18,330 let me just give you one more checking-out point here. 
851 00:44:18,330 --> 00:44:20,210 Let me see what the term would be 852 00:44:20,210 --> 00:44:22,420 in the first row, second column. 853 00:44:22,420 --> 00:44:26,150 To find the term in the first row, second column, 854 00:44:26,150 --> 00:44:29,560 I take the first row of the first matrix. 855 00:44:29,560 --> 00:44:35,260 Dot it with the second column of the second matrix. 856 00:44:35,260 --> 00:44:37,800 See first row dotted with second column, 857 00:44:37,800 --> 00:44:39,630 the answer will give me what? 858 00:44:39,630 --> 00:44:42,030 The term in the product that's in the first row, 859 00:44:42,030 --> 00:44:43,030 second column. 860 00:44:43,030 --> 00:44:44,050 Let's check that. 861 00:44:44,050 --> 00:44:50,550 1 times 2 is 2; 1 times minus 1 is minus 1; 1 times 1 is 1. 862 00:44:50,550 --> 00:44:53,530 2 minus 1 plus 1 is 2. 863 00:44:53,530 --> 00:44:57,490 And therefore, the term in the first row, second column 864 00:44:57,490 --> 00:45:00,520 should be 2. 865 00:45:00,520 --> 00:45:01,020 It is. 866 00:45:01,020 --> 00:45:02,680 You see, there's no more motivation 867 00:45:02,680 --> 00:45:05,890 to how we multiply these two matrices than the fact 868 00:45:05,890 --> 00:45:09,950 that it solves the problem that we want solved. 869 00:45:09,950 --> 00:45:13,260 To find the term that's in the i-th row, 870 00:45:13,260 --> 00:45:17,730 j-th column of the product, dot the i-th row 871 00:45:17,730 --> 00:45:20,950 of the first matrix with the j-th column 872 00:45:20,950 --> 00:45:23,550 of the second matrix. 873 00:45:23,550 --> 00:45:29,180 More generally, you can always multiply an m by n matrix 874 00:45:29,180 --> 00:45:32,330 by an n by p matrix. 875 00:45:32,330 --> 00:45:34,240 What's the key factor? 876 00:45:34,240 --> 00:45:36,720 You don't care about the number of rows in the first, 877 00:45:36,720 --> 00:45:39,170 you don't care about the number of columns in the second. 
878 00:45:39,170 --> 00:45:41,030 What you do care about is what? 879 00:45:41,030 --> 00:45:45,380 That the number of columns in the first matrix 880 00:45:45,380 --> 00:45:48,450 be equal to the number of rows in the second, 881 00:45:48,450 --> 00:45:52,120 and if you do that, when you multiply an m by n matrix 882 00:45:52,120 --> 00:45:56,550 by an n by p matrix, notice that the result will be what? 883 00:45:56,550 --> 00:45:58,750 An m by p matrix. 884 00:45:58,750 --> 00:46:01,230 In other words, the number of rows 885 00:46:01,230 --> 00:46:05,100 is governed by the number of rows in the first matrix 886 00:46:05,100 --> 00:46:07,890 and the number of columns is governed 887 00:46:07,890 --> 00:46:11,790 by the number of columns in the second matrix. 888 00:46:11,790 --> 00:46:13,840 Notice, by the way, that this tells us 889 00:46:13,840 --> 00:46:17,660 right away that when we want to multiply two matrices 890 00:46:17,660 --> 00:46:21,310 it makes a difference in which order they're written. 891 00:46:21,310 --> 00:46:26,440 If we were to take that 2 by 3 matrix, and the 3 by 4 matrix, 892 00:46:26,440 --> 00:46:30,220 and interchange them, we don't have the appropriate match 893 00:46:30,220 --> 00:46:32,600 up of rows and columns. 894 00:46:32,600 --> 00:46:35,930 You can't dot a 2-tuple with a 4-tuple. 895 00:46:35,930 --> 00:46:39,530 The very fact that we say dot the row with the column, 896 00:46:39,530 --> 00:46:42,630 the dot product is only defined for two n-tuples. 897 00:46:42,630 --> 00:46:45,480 We insist that the n-tuples be the same. 898 00:46:45,480 --> 00:46:49,520 The n has to be the same to dot two n-tuples.
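The row-dot-column rule and the size requirement described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's own notation: the helper names `dot` and `matmul` are hypothetical, and the matrices are partly assumed. The rows of `A` and the second and third columns of `B` are read off from the arithmetic spoken in the lecture (taking the first factor of each product to be the row entry), while the first and fourth columns of `B` are placeholder values chosen only to make a 3 by 4 matrix.

```python
def dot(u, v):
    """Dot product of two n-tuples; the n has to be the same for both."""
    if len(u) != len(v):
        raise ValueError(f"cannot dot a {len(u)}-tuple with a {len(v)}-tuple")
    return sum(a * b for a, b in zip(u, v))


def matmul(A, B):
    """Multiply an m by n matrix A by an n by p matrix B (lists of rows)."""
    n = len(A[0])
    if len(B) != n:
        # Columns of the first must match rows of the second.
        raise ValueError(
            f"first matrix has {n} columns but second has {len(B)} rows"
        )
    p = len(B[0])
    # Pull out the columns of B so each one can be dotted with a row of A.
    columns = [[B[k][j] for k in range(n)] for j in range(p)]
    # Entry in row i, column j = (i-th row of A) dot (j-th column of B).
    return [[dot(row, col) for col in columns] for row in A]


# 2 by 3 matrix: rows taken from the lecture's arithmetic (assumed ordering).
A = [
    [1, 1, 1],
    [1, -1, 2],
]

# 3 by 4 matrix: columns 2 and 3 taken from the lecture's arithmetic;
# columns 1 and 4 are placeholders, NOT from the lecture.
B = [
    [1, 2, 1, 0],
    [0, -1, -1, 0],
    [0, 1, 2, 1],
]

C = matmul(A, B)          # a 2 by 4 matrix
first_row_second_col = C[0][1]
second_row_third_col = C[1][2]
```

With these values, `C[0][1]` works out to 2 and `C[1][2]` to 6, matching the two entries checked in the lecture. Calling `matmul(B, A)` raises `ValueError`, because the 3 by 4 matrix has four columns while the 2 by 3 matrix has only two rows, so the rows and columns no longer match up.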
899 00:46:49,520 --> 00:46:52,680 Let me summarize today's lecture by saying 900 00:46:52,680 --> 00:46:55,850 that in overview, notice that what we've done, 901 00:46:55,850 --> 00:46:58,090 hopefully, is that we have reestablished 902 00:46:58,090 --> 00:47:01,250 the need for linear systems of equations, 903 00:47:01,250 --> 00:47:03,760 and secondly, once we have understood what 904 00:47:03,760 --> 00:47:06,970 the need for linear systems is, we are now introducing 905 00:47:06,970 --> 00:47:11,310 a mechanism whereby we can solve linear systems more 906 00:47:11,310 --> 00:47:14,580 efficiently than what we were taught in the past as to how 907 00:47:14,580 --> 00:47:15,750 to solve them. 908 00:47:15,750 --> 00:47:18,490 You see, what I'm going to do for the next few lectures now 909 00:47:18,490 --> 00:47:21,470 is concentrate on a new game, called 910 00:47:21,470 --> 00:47:23,880 the game of matrix algebra. 911 00:47:23,880 --> 00:47:27,950 But that will unfold gradually as we develop the next two 912 00:47:27,950 --> 00:47:28,750 lectures. 913 00:47:28,750 --> 00:47:31,330 And so until our next lecture, so long. 914 00:47:34,480 --> 00:47:36,850 Funding for the publication of this video 915 00:47:36,850 --> 00:47:41,730 was provided by the Gabriella and Paul Rosenbaum Foundation. 916 00:47:41,730 --> 00:47:45,900 Help OCW continue to provide free and open access to MIT 917 00:47:45,900 --> 00:47:53,610 courses by making a donation at ocw.mit.edu/donate.