1 00:00:00,000 --> 00:00:02,400 ANNOUNCER: Open content is provided under a creative 2 00:00:02,400 --> 00:00:03,830 commons license. 3 00:00:03,830 --> 00:00:06,840 Your support will help MIT OpenCourseWare continue to 4 00:00:06,840 --> 00:00:10,510 offer high-quality educational resources for free. 5 00:00:10,510 --> 00:00:13,390 To make a donation, or view additional materials from 6 00:00:13,390 --> 00:00:17,490 hundreds of MIT courses, visit MIT OpenCourseWare at 7 00:00:17,490 --> 00:00:19,430 ocw.mit.edu . 8 00:00:19,430 --> 00:00:23,120 PROFESSOR ERIC GRIMSON: Last time, we ended up, we sort of 9 00:00:23,120 --> 00:00:25,380 did this tag team thing, Professor Guttag did the first 10 00:00:25,380 --> 00:00:27,200 half, I did the second half of the lecture, and the second 11 00:00:27,200 --> 00:00:29,430 half of the lecture, we started talking about 12 00:00:29,430 --> 00:00:30,610 complexity. 13 00:00:30,610 --> 00:00:32,170 Efficiency. 14 00:00:32,170 --> 00:00:33,560 Orders of growth. 15 00:00:33,560 --> 00:00:35,340 And that's what we're going to spend today on, is talking 16 00:00:35,340 --> 00:00:36,080 about that topic. 17 00:00:36,080 --> 00:00:37,300 I'm going to use it to build over the 18 00:00:37,300 --> 00:00:39,050 next couple of lectures. 19 00:00:39,050 --> 00:00:41,530 I want to remind you that we were talking at a fairly high 20 00:00:41,530 --> 00:00:42,450 level about complexity. 21 00:00:42,450 --> 00:00:45,170 We're going to get down into the weeds in a second here. 22 00:00:45,170 --> 00:00:47,680 But the things we were trying to stress were that it's an 23 00:00:47,680 --> 00:00:51,010 important design decision, when you are coming up with a 24 00:00:51,010 --> 00:00:54,820 piece of code, as to what kind of efficiency your code has. 
25 00:00:54,820 --> 00:00:58,170 And the second thing that we talked about is this idea that 26 00:00:58,170 --> 00:01:00,630 we want you to in fact learn how to relate a choice you 27 00:01:00,630 --> 00:01:03,296 make about a piece of code to what the 28 00:01:03,296 --> 00:01:05,640 efficiency is going to be. 29 00:01:05,640 --> 00:01:07,670 So in fact, over the next thirty or forty minutes, we're 30 00:01:07,670 --> 00:01:10,920 going to show you a set of examples of sort of canonical 31 00:01:10,920 --> 00:01:13,980 algorithms, and the different classes of complexity. 32 00:01:13,980 --> 00:01:15,930 Because one of the things that you want to do as a good 33 00:01:15,930 --> 00:01:19,900 designer is to basically map a new problem 34 00:01:19,900 --> 00:01:20,890 into a known domain. 35 00:01:20,890 --> 00:01:23,460 You want to take a new problem and say, what does 36 00:01:23,460 --> 00:01:24,490 this most look like? 37 00:01:24,490 --> 00:01:27,390 What is the class of algorithm that's-- that probably applies 38 00:01:27,390 --> 00:01:30,380 to this, and how do I pull something out of that, if you 39 00:01:30,380 --> 00:01:33,230 like, a briefcase of possible algorithms to solve? 40 00:01:33,230 --> 00:01:36,350 All right, having said that, let's do some examples. 41 00:01:36,350 --> 00:01:39,540 I'm going to show you a sequence of algorithms, 42 00:01:39,540 --> 00:01:41,680 they're mostly simple algorithms, that's OK. 43 00:01:41,680 --> 00:01:44,410 But I want you to take away from this how we reason about 44 00:01:44,410 --> 00:01:47,200 the complexity of these algorithms. And I'll remind 45 00:01:47,200 --> 00:01:48,960 you, we said we're going to mostly talk about time. 46 00:01:48,960 --> 00:01:51,530 We're going to be counting the number of basic steps it takes 47 00:01:51,530 --> 00:01:53,150 to solve the problem. 48 00:01:53,150 --> 00:01:55,520 So here's the first example I want to do. 
49 00:01:55,520 --> 00:01:59,390 I'm going to write a function to compute integer power 50 00:01:59,390 --> 00:02:02,860 exponents. a to the b where b is an integer. 51 00:02:02,860 --> 00:02:05,210 And I'm going to do it only using multiplication and 52 00:02:05,210 --> 00:02:07,010 addition and some simple tests. 53 00:02:07,010 --> 00:02:07,680 All right? 54 00:02:07,680 --> 00:02:10,290 And yeah, I know it comes built in, that's OK, what we 55 00:02:10,290 --> 00:02:12,410 want to do is use it as an example to look at it. 56 00:02:12,410 --> 00:02:20,960 So I'm going to build something that's going to do 57 00:02:20,960 --> 00:02:22,920 iterative exponentiation. 58 00:02:22,920 --> 00:02:23,950 OK? 59 00:02:23,950 --> 00:02:26,980 And in fact, if you look at the code up here, and it's on 60 00:02:26,980 --> 00:02:30,300 your handout, the very first one, x 1, right here-- if I 61 00:02:30,300 --> 00:02:33,400 could ask you to look at it-- is a piece of code to do it. 62 00:02:33,400 --> 00:02:35,980 And I'm less interested in the code than how we're going to 63 00:02:35,980 --> 00:02:37,670 analyze it, but let's look at it for a second. 64 00:02:37,670 --> 00:02:42,300 All right, you can see that this little piece of code, 65 00:02:42,300 --> 00:02:45,390 it's got a loop in there, and what's it doing? 66 00:02:45,390 --> 00:02:47,380 It's basically cycling through the loop, 67 00:02:47,380 --> 00:02:49,050 multiplying by a each time. 68 00:02:49,050 --> 00:02:51,750 So first time through the loop, the answer is 1. 69 00:02:51,750 --> 00:02:53,860 Second time it-- sorry, as it enters the loop, at the time 70 00:02:53,860 --> 00:02:56,260 it enter-- exits, the answer is a. 71 00:02:56,260 --> 00:02:58,080 Next time through the loop it goes to a squared. 72 00:02:58,080 --> 00:02:59,620 Next time through the loop it goes to a cubed. 
73 00:02:59,620 --> 00:03:02,420 And it's just gathering together the multiplications 74 00:03:02,420 --> 00:03:05,090 while counting down the exponent. 75 00:03:05,090 --> 00:03:07,240 And you can see it when we get down to the end test here, 76 00:03:07,240 --> 00:03:08,990 we're going to pop out of there and we're going to 77 00:03:08,990 --> 00:03:11,190 return the answer. 78 00:03:11,190 --> 00:03:13,810 I could run it, it'll do the right thing. 79 00:03:13,810 --> 00:03:16,830 What I want to think about though, is, how much 80 00:03:16,830 --> 00:03:17,970 time does this take? 81 00:03:17,970 --> 00:03:22,040 How many steps does it take for this function to run? 82 00:03:22,040 --> 00:03:23,500 Well, you can kind of look at it, right? 83 00:03:23,500 --> 00:03:26,000 The key part of that is that WHILE loop. 84 00:03:26,000 --> 00:03:27,450 And what are the steps I want to count? 85 00:03:27,450 --> 00:03:28,810 They're inside that loop-- 86 00:03:28,810 --> 00:03:30,530 I've got the wrong glasses so I'm going to have to squint-- 87 00:03:30,530 --> 00:03:33,640 and we've got one test which is a comparison, we've got 88 00:03:33,640 --> 00:03:36,050 another test which is a multiplication-- sorry, not a 89 00:03:36,050 --> 00:03:38,850 test, we've got another step which is a multiplication-- 90 00:03:38,850 --> 00:03:41,360 and another step that is a subtraction. 91 00:03:41,360 --> 00:03:44,350 So each time through the loop, I'm doing three steps. 92 00:03:44,350 --> 00:03:46,430 Three basic operations. 93 00:03:46,430 --> 00:03:48,930 How many times do I go through the loop? 94 00:03:48,930 --> 00:03:51,330 Somebody help me out. 95 00:03:51,330 --> 00:03:51,870 Hand up? 96 00:03:51,870 --> 00:03:53,140 Sorry. b times. 97 00:03:53,140 --> 00:03:53,740 You're right. 
98 00:03:53,740 --> 00:03:56,706 Because I keep counting down each time around-- mostly I've 99 00:03:56,706 --> 00:03:59,480 got to unload this candy, which is driving me nuts, so-- 100 00:03:59,480 --> 00:04:00,560 thank you. b times. 101 00:04:00,560 --> 00:04:03,430 So I've got to go 3 b steps. 102 00:04:03,430 --> 00:04:05,830 All right, I've got to go through the loop b times, I've 103 00:04:05,830 --> 00:04:07,730 got three steps each time, and then when I pop out of the 104 00:04:07,730 --> 00:04:09,280 loop, I've got two more steps. 105 00:04:09,280 --> 00:04:12,150 All right, I've got the initiation of answer and the 106 00:04:12,150 --> 00:04:12,730 return of it. 107 00:04:12,730 --> 00:04:19,050 So I take 2 plus 3 b steps to go through this loop. 108 00:04:19,050 --> 00:04:19,390 OK. 109 00:04:19,390 --> 00:04:27,910 So if b is 300, it takes 902 steps. b is 3000, it takes 110 00:04:27,910 --> 00:04:31,995 9002 steps. b is 30,000 you get the point, it 111 00:04:31,995 --> 00:04:35,520 takes 90,002 steps. 112 00:04:35,520 --> 00:04:35,780 OK. 113 00:04:35,780 --> 00:04:38,860 So the point here is, first of all, I can count these things, 114 00:04:38,860 --> 00:04:41,750 but the second thing you can see is, as the size of the 115 00:04:41,750 --> 00:04:46,240 problems get larger, that additive constant, that 2, 116 00:04:46,240 --> 00:04:47,600 really doesn't matter. 117 00:04:47,600 --> 00:04:47,770 All right? 118 00:04:47,770 --> 00:04:51,480 The difference between 90,000 steps and 90,002 steps, who 119 00:04:51,480 --> 00:04:53,210 cares about the 2, right? 120 00:04:53,210 --> 00:04:55,220 So, and typically, we're not going to worry about those 121 00:04:55,220 --> 00:04:56,980 additive constants. 122 00:04:56,980 --> 00:05:00,260 The second one is, this multiplicative constant here 123 00:05:00,260 --> 00:05:04,620 is 3, in some sense also isn't all that crucial. 
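The 2 plus 3 b count can be checked with a small sketch. The handout is not reproduced in this transcript, so the name `exp1_counted` and the `steps` counter are assumptions added for illustration; the counting convention (three steps per pass through the loop, plus the initialization and the return) is the one used in the lecture.

```python
def exp1_counted(a, b):
    """Iterative a**b for integer b >= 0, returning (value, steps).

    Hypothetical instrumented version of the iterative exponentiator.
    Counting convention from the lecture: a comparison, a multiplication
    and a subtraction on each pass, plus one step for the initialization
    and one for the return.
    """
    steps = 1              # initialization of ans
    ans = 1
    while b > 0:
        ans *= a
        b -= 1
        steps += 3         # comparison + multiplication + subtraction
    steps += 1             # the return
    return ans, steps


for b in (300, 3000, 30000):
    value, steps = exp1_counted(2, b)
    print(b, steps)        # steps equals 2 + 3*b under this convention
```

Under this convention the final failing comparison that ends the loop is not counted, which matches the lecture's arithmetic; counting it would only change the additive constant.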
124 00:05:04,620 --> 00:05:06,860 Does it really matter to you whether your code is going to 125 00:05:06,860 --> 00:05:10,350 take 300 years or 900 years to run? 126 00:05:10,350 --> 00:05:11,880 Problem is, how big is that number? 127 00:05:11,880 --> 00:05:14,570 So we're going to typically also not worry about the 128 00:05:14,570 --> 00:05:15,880 multiplicative constants. 129 00:05:15,880 --> 00:05:18,280 This factor here. 130 00:05:18,280 --> 00:05:21,190 What we really want to worry about is, as the size of the 131 00:05:21,190 --> 00:05:25,170 problem gets larger, how does this thing grow? 132 00:05:25,170 --> 00:05:27,090 How does the cost go up? 133 00:05:27,090 --> 00:05:29,030 And so what we're going to primarily talk about as a 134 00:05:29,030 --> 00:05:46,310 consequence is the rate of growth as the size of the 135 00:05:46,310 --> 00:05:47,200 problem grows. 136 00:05:47,200 --> 00:05:53,890 That is, how much bigger does this get as I make the 137 00:05:53,890 --> 00:05:55,450 problem bigger? 138 00:05:55,450 --> 00:05:57,560 And what that really says is, that we're going to measure this 139 00:05:57,560 --> 00:05:58,720 using something we're going to just 140 00:05:58,720 --> 00:06:02,170 call asymptotic notation-- 141 00:06:02,170 --> 00:06:08,820 I love spelling this word-- meaning, as in the limit as 142 00:06:08,820 --> 00:06:11,340 the size of the problem gets bigger, how do I characterize 143 00:06:11,340 --> 00:06:12,820 this growth? 144 00:06:12,820 --> 00:06:13,870 All right? 145 00:06:13,870 --> 00:06:15,820 You'll find out, if you go on to some of the other classes 146 00:06:15,820 --> 00:06:18,390 in course 6, there are a lot of different ways that you can 147 00:06:18,390 --> 00:06:18,950 measure this. 148 00:06:18,950 --> 00:06:21,280 The most common one, and the one we're going to use, is 149 00:06:21,280 --> 00:06:27,990 what's often called big Oh notation. 
150 00:06:27,990 --> 00:06:30,430 This isn't big Oh as in, oh my God I'm shocked the markets 151 00:06:30,430 --> 00:06:33,130 are collapsing, This is called big Oh because we use the 152 00:06:33,130 --> 00:06:35,950 Greek letter, capital letter, omicron to represent it. 153 00:06:35,950 --> 00:06:38,500 And the way we're going to do this, or what this represents, 154 00:06:38,500 --> 00:06:41,540 let me write this carefully for you, big Oh notation is 155 00:06:41,540 --> 00:06:53,450 basically going to be an upper limit to the growth of a 156 00:06:53,450 --> 00:07:00,690 function as the input grow-- as the input gets large. 157 00:07:00,690 --> 00:07:06,020 Now we're going to see a bunch of examples, and I know those 158 00:07:06,020 --> 00:07:08,080 are words, let me give you an example. 159 00:07:08,080 --> 00:07:16,690 I would write f of x is in big Oh of n squared. 160 00:07:16,690 --> 00:07:17,360 And what does it say? 161 00:07:17,360 --> 00:07:21,160 It says that function, f of x, is bounded above, there's an 162 00:07:21,160 --> 00:07:25,940 upper limit on it, that this grows no faster than quadratic 163 00:07:25,940 --> 00:07:28,130 in n, n squared. 164 00:07:28,130 --> 00:07:29,440 OK. 165 00:07:29,440 --> 00:07:31,190 And first of all, you say, wait a minute, x and n? 166 00:07:31,190 --> 00:07:34,430 Well, one of the things we're going to see is x is the input 167 00:07:34,430 --> 00:07:38,860 to this particular problem, n is a measure of the size of x. 168 00:07:38,860 --> 00:07:42,280 And we're going to talk about how we come up with that. n 169 00:07:42,280 --> 00:07:46,830 measures the size of x. 170 00:07:46,830 --> 00:07:47,450 OK. 171 00:07:47,450 --> 00:07:50,030 In this example I'd use b. 
172 00:07:50,030 --> 00:07:52,250 All right, as b get-- b is the thing that's changing as I go 173 00:07:52,250 --> 00:07:54,700 along here, but it could be things like, how many elements 174 00:07:54,700 --> 00:07:56,830 are there in a list if the input is a list, could be how 175 00:07:56,830 --> 00:07:58,700 many characters are there in a string if the input's a 176 00:07:58,700 --> 00:08:01,630 string, it could be the size of the integer as we go along. 177 00:08:01,630 --> 00:08:02,690 All right? 178 00:08:02,690 --> 00:08:05,810 And what we want to do then, is we want to basically come 179 00:08:05,810 --> 00:08:09,020 up with, how do we characterize the growth-- 180 00:08:09,020 --> 00:08:11,630 God bless you-- of this problem in terms of this 181 00:08:11,630 --> 00:08:14,660 quadra-- sorry, terms of this exponential growth. 182 00:08:14,660 --> 00:08:16,540 Now, one last piece of math. 183 00:08:16,540 --> 00:08:17,260 I could cheat. 184 00:08:17,260 --> 00:08:19,070 I said I just want an upper bound. 185 00:08:19,070 --> 00:08:21,440 I could get a really big upper bound, this thing grows 186 00:08:21,440 --> 00:08:22,720 exponentially. 187 00:08:22,720 --> 00:08:23,970 That doesn't help me much. 188 00:08:23,970 --> 00:08:26,910 Usually what I want to talk about is what's the smallest 189 00:08:26,910 --> 00:08:30,420 size class in which this function grows? 190 00:08:30,420 --> 00:08:32,890 With all of that, what that says, is that this we would 191 00:08:32,890 --> 00:08:35,100 write is order b. 192 00:08:35,100 --> 00:08:40,910 That algorithm is linear. 193 00:08:40,910 --> 00:08:41,740 You can see it. 194 00:08:41,740 --> 00:08:44,310 I've said the count was 2 plus 3 b. 195 00:08:44,310 --> 00:08:47,070 As I make b really large, how does this thing grow? 196 00:08:47,070 --> 00:08:48,480 It grows as b. 197 00:08:48,480 --> 00:08:50,620 The 3 doesn't matter, it's just a constant, 198 00:08:50,620 --> 00:08:51,590 it's growing linearly. 
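The "upper limit" idea can be written down precisely. What follows is the standard textbook definition of big Oh, which the lecture states only informally; the constants c and n_0 are names introduced here, not taken from the handout. The second line applies it to the 2 plus 3 b count of the iterative exponentiator.

```latex
\begin{align*}
f(n) \in O\bigl(g(n)\bigr)
  &\iff \exists\, c > 0,\ \exists\, n_0 \ \text{such that}\
        f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0 \\
2 + 3b &\le 5b \quad \text{for all } b \ge 1,
  \qquad \text{so } 2 + 3b \in O(b) \ \text{with } c = 5,\ n_0 = 1
\end{align*}
```

The same count is also in O(b squared) or O(2 to the b), which is the "cheating" mentioned above; the convention is to report the smallest class that works, here O(b).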
199 00:08:51,590 --> 00:08:55,570 Another way of saying it is, if I, for example, increase 200 00:08:55,570 --> 00:08:59,480 the size of the input by a factor of 10, the amount of time 201 00:08:59,480 --> 00:09:00,480 increases by a factor of 10. 202 00:09:00,480 --> 00:09:03,610 And that's a sign that it's linear. 203 00:09:03,610 --> 00:09:04,150 OK. 204 00:09:04,150 --> 00:09:05,450 So there's one quick example. 205 00:09:05,450 --> 00:09:07,260 Let's look at another example. 206 00:09:07,260 --> 00:09:12,200 If you look at x 2, this one right here in your handout. 207 00:09:12,200 --> 00:09:12,580 OK. 208 00:09:12,580 --> 00:09:17,360 This is another way of doing exponentiation, but this one's 209 00:09:17,360 --> 00:09:18,550 a recursive function. 210 00:09:18,550 --> 00:09:18,820 All right? 211 00:09:18,820 --> 00:09:20,730 So again, let's look at it. 212 00:09:20,730 --> 00:09:21,620 What does it say to do? 213 00:09:21,620 --> 00:09:23,220 Well, it's basically saying a similar thing. 214 00:09:23,220 --> 00:09:26,060 It says, if I am in the base case, if b is equal to 1, the 215 00:09:26,060 --> 00:09:27,800 answer is just a. 216 00:09:27,800 --> 00:09:30,280 I could have used if b is equal to 0, the answer is 1, 217 00:09:30,280 --> 00:09:31,490 that would have also worked. 218 00:09:31,490 --> 00:09:33,060 Otherwise, what do I say? 219 00:09:33,060 --> 00:09:36,600 I say, ah, I'm in a nice recursive way, a to the b is 220 00:09:36,600 --> 00:09:41,230 the same as a times a to the b minus 1. 221 00:09:41,230 --> 00:09:43,610 And I've just reduced that problem to a simpler version 222 00:09:43,610 --> 00:09:45,050 of the same problem. 223 00:09:45,050 --> 00:09:47,340 OK, and you can see that this thing ought to unwrap, it's 224 00:09:47,340 --> 00:09:49,560 going to keep extending out those multiplications until 225 00:09:49,560 --> 00:09:51,490 it gets down to the base case, going to 226 00:09:51,490 --> 00:09:53,650 collapse them all together. 
227 00:09:53,650 --> 00:09:53,960 OK. 228 00:09:53,960 --> 00:09:57,180 Now I want to know what's the order of growth here? 229 00:09:57,180 --> 00:10:00,130 What's the complexity of this? 230 00:10:00,130 --> 00:10:01,040 Well, gee. 231 00:10:01,040 --> 00:10:04,170 It looks like it's pretty straightforward, right? 232 00:10:04,170 --> 00:10:06,670 I've got one test there, and then I've just got one thing 233 00:10:06,670 --> 00:10:09,310 to do here, which has got a subtraction and a 234 00:10:09,310 --> 00:10:10,930 multiplication. 235 00:10:10,930 --> 00:10:14,470 Oh, but how do I know how long it takes to do x 2? 236 00:10:14,470 --> 00:10:16,930 All right, we were counting basic steps. 237 00:10:16,930 --> 00:10:19,380 We don't know how long it takes to do x 2. 238 00:10:19,380 --> 00:10:21,230 So I'm going to show you a little trick for 239 00:10:21,230 --> 00:10:23,050 figuring that out. 240 00:10:23,050 --> 00:10:25,802 And in particular, I'm going to cheat slightly, I'm going 241 00:10:25,802 --> 00:10:28,200 to use a little bit of abusive mathematics, but I'm going to 242 00:10:28,200 --> 00:10:30,110 show you a trick to figure it out. 243 00:10:30,110 --> 00:10:36,130 In the case of a recursive exponentiator, I'm going to do 244 00:10:36,130 --> 00:10:36,890 the following trick. 245 00:10:36,890 --> 00:10:41,890 I'm going to let t of b be the number of steps it takes to 246 00:10:41,890 --> 00:10:43,750 solve the problem of size b. 247 00:10:43,750 --> 00:10:45,520 OK, and I can figure this out. 248 00:10:45,520 --> 00:10:49,140 I've got one test, I've got a subtraction, I've got a 249 00:10:49,140 --> 00:10:55,670 multiplication, that's three steps, plus whatever number of 250 00:10:55,670 --> 00:10:58,100 steps it takes to solve a problem of size b minus 1. 251 00:10:58,100 --> 00:11:02,480 All right, this is what's called a recurrence relation, 252 00:11:02,480 --> 00:11:05,370 there are actually cool ways to solve them. 
253 00:11:05,370 --> 00:11:07,080 We can kind of eyeball it. 254 00:11:07,080 --> 00:11:10,130 In particular, how would I write an expression for 255 00:11:10,130 --> 00:11:12,100 t of b minus 1? 256 00:11:12,100 --> 00:11:13,270 Well the same way. 257 00:11:13,270 --> 00:11:18,230 This is 3 plus 3 plus t of b minus 2. 258 00:11:18,230 --> 00:11:18,690 Right? 259 00:11:18,690 --> 00:11:21,980 I'm using exactly the same form to reduce this. 260 00:11:21,980 --> 00:11:23,350 You know, you can see what's going to happen. 261 00:11:23,350 --> 00:11:26,590 If I reduce that, it would be 3 plus t of b minus 3, so in 262 00:11:26,590 --> 00:11:33,660 general, this is 3 k plus t of b minus k. 263 00:11:33,660 --> 00:11:33,830 OK. 264 00:11:33,830 --> 00:11:36,370 I'm just expanding it out. 265 00:11:36,370 --> 00:11:38,320 When am I done? 266 00:11:38,320 --> 00:11:40,580 How do I stop this? 267 00:11:40,580 --> 00:11:43,670 Any suggestions? 268 00:11:43,670 --> 00:11:45,360 Don't you hate it when professors ask questions? 269 00:11:45,360 --> 00:11:48,830 Yeah. 270 00:11:48,830 --> 00:11:52,070 Actually, I think I want b minus k equal to 1. 271 00:11:52,070 --> 00:11:52,720 Right? 272 00:11:52,720 --> 00:11:55,990 When this gets down to t of 1, I'm in the base case. 273 00:11:55,990 --> 00:12:01,550 So I'm done when b minus k equals 1, or k 274 00:12:01,550 --> 00:12:04,010 equals b minus 1. 275 00:12:04,010 --> 00:12:05,900 Right, that gets me down to the base case, I'm solving a 276 00:12:05,900 --> 00:12:08,800 problem with size 1, and in that case, I've got two more 277 00:12:08,800 --> 00:12:12,730 operations to do, so I plug this all back in, I-- t of b 278 00:12:12,730 --> 00:12:19,840 is, I'm going to put b minus 1 in for k, I get 3 times (b minus 1) plus 279 00:12:19,840 --> 00:12:26,050 t of 1, so t of 1 is 2, so this is 3 times (b minus 1) plus 2, or 280 00:12:26,050 --> 00:12:30,670 3 b minus 1. 281 00:12:30,670 --> 00:12:30,900 OK. 
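The recurrence just solved can be checked by letting the recursive exponentiator carry its own step count. This is a sketch, not the handout code: the name `exp2_counted` and the tuple return value are assumptions, and the costs (3 steps per recursive call, 2 steps in the base case) are the ones stated in the lecture.

```python
def exp2_counted(a, b):
    """Recursive a**b for integer b >= 1, returning (value, steps).

    Hypothetical instrumented version: the step counts mirror the
    recurrence t(1) = 2 and t(b) = 3 + t(b - 1) from the lecture.
    """
    if b == 1:
        return a, 2                       # base case: t(1) = 2
    sub_value, sub_steps = exp2_counted(a, b - 1)
    return a * sub_value, 3 + sub_steps   # t(b) = 3 + t(b - 1)


for b in (1, 10, 300):
    value, steps = exp2_counted(2, b)
    print(b, steps, 3 * b - 1)            # the last two columns agree
```

Because each call reduces b by only 1, the recursion is b levels deep, so in Python this sketch is only usable for b below the interpreter's recursion limit.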
282 00:12:30,900 --> 00:12:34,560 A whole lot of work to basically say, 283 00:12:34,560 --> 00:12:38,740 again, order b is linear. 284 00:12:38,740 --> 00:12:41,510 But that's also nice, it lets you see how the recursive 285 00:12:41,510 --> 00:12:43,930 thing is simply unwrapping but the complexity in terms of the 286 00:12:43,930 --> 00:12:46,460 amount of time it takes is going to be the same. 287 00:12:46,460 --> 00:12:48,310 I owe you a candy. 288 00:12:48,310 --> 00:12:50,110 Thank you. 289 00:12:50,110 --> 00:12:51,510 OK. 290 00:12:51,510 --> 00:12:53,440 At this point, if we stop, you'll think all algorithms 291 00:12:53,440 --> 00:12:53,820 are linear. 292 00:12:53,820 --> 00:12:55,410 This is really boring. 293 00:12:55,410 --> 00:12:56,450 But they're not. 294 00:12:56,450 --> 00:12:56,540 OK? 295 00:12:56,540 --> 00:13:00,540 So let me show you another way I could do exponentiation. 296 00:13:00,540 --> 00:13:01,880 Taking an advantage of a trick. 297 00:13:01,880 --> 00:13:08,200 I want to solve a to the b. 298 00:13:08,200 --> 00:13:10,080 Here's another way I could do that. 299 00:13:10,080 --> 00:13:10,400 OK. 300 00:13:10,400 --> 00:13:23,040 If b is even, then a to the b is the same as a squared all 301 00:13:23,040 --> 00:13:24,020 to the b over 2. 302 00:13:24,020 --> 00:13:26,820 All right, just move the 2's around. 303 00:13:26,820 --> 00:13:27,470 It's the same thing. 304 00:13:27,470 --> 00:13:29,270 You're saying, OK, so what? 305 00:13:29,270 --> 00:13:31,370 Well gee, notice. 306 00:13:31,370 --> 00:13:33,500 This is a primitive operation. 307 00:13:33,500 --> 00:13:35,280 That's a primitive operation. 308 00:13:35,280 --> 00:13:38,910 But in one step, I've reduced this problem in half. 309 00:13:38,910 --> 00:13:40,340 I didn't just make it one smaller, I 310 00:13:40,340 --> 00:13:41,850 made it a half smaller. 311 00:13:41,850 --> 00:13:43,840 That's a nice deal. 312 00:13:43,840 --> 00:13:44,180 OK. 
313 00:13:44,180 --> 00:13:45,870 But I'm not always going to have b as even. 314 00:13:45,870 --> 00:13:47,810 If b is odd, what do I do? 315 00:13:47,810 --> 00:13:56,490 Well, go back to what I did before. 316 00:13:56,490 --> 00:13:59,870 Multiply a by a to the b minus 1. 317 00:13:59,870 --> 00:14:00,980 You know, that's nice, right? 318 00:14:00,980 --> 00:14:03,770 Because if b was odd, then b minus one is even, which means 319 00:14:03,770 --> 00:14:07,740 on the next step, I can cut the problem in half again. 320 00:14:07,740 --> 00:14:08,210 OK? 321 00:14:08,210 --> 00:14:17,160 All right. x 3, as you can see right here, does exactly that. 322 00:14:17,160 --> 00:14:17,760 OK? 323 00:14:17,760 --> 00:14:20,290 You can take a quick look at it, even with the wrong 324 00:14:20,290 --> 00:14:22,850 glasses on, it says if a-- sorry, b is equal to 1, I'm 325 00:14:22,850 --> 00:14:24,740 just going to return a. 326 00:14:24,740 --> 00:14:26,860 Otherwise there's that funky little test. I'll do the 327 00:14:26,860 --> 00:14:29,580 remainder multiplied by 2, because these are integers, 328 00:14:29,580 --> 00:14:31,580 that gives me back an integer, I just check to see if it's 329 00:14:31,580 --> 00:14:33,710 equal to b, that tells me whether it's even or odd. 330 00:14:33,710 --> 00:14:38,420 And in the even case, I'd square, divide by half, call 331 00:14:38,420 --> 00:14:41,790 this again: in the odd case, I go b minus 1 and 332 00:14:41,790 --> 00:14:44,210 then multiply by a. 333 00:14:44,210 --> 00:14:46,190 I'll let you chase it through, it does work. 334 00:14:46,190 --> 00:14:48,260 What I want to look at is, what's the 335 00:14:48,260 --> 00:14:51,030 order of growth here? 336 00:14:51,030 --> 00:14:52,690 This is a little different, right? 337 00:14:52,690 --> 00:14:54,890 It's going to take a little bit more work, so let's see if 338 00:14:54,890 --> 00:14:58,270 we can do it. 
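The even and odd cases just described might look like the following sketch. The handout is not shown in the transcript, so the name `exp3` and the exact form of the parity test are assumptions; the test here uses integer division (`//`) so that halving and doubling returns b only when b is even, which is one reading of the check described above.

```python
def exp3(a, b):
    """Compute a**b for integer b >= 1 by repeated squaring.

    Illustrative sketch of the halving trick described in the lecture.
    """
    if b == 1:
        return a
    if (b // 2) * 2 == b:             # b is even
        return exp3(a * a, b // 2)    # a**b == (a*a)**(b/2)
    return a * exp3(a, b - 1)         # b is odd: a**b == a * a**(b-1)


print(exp3(2, 10))
print(exp3(3, 7))
```

An equivalent and more common parity test in Python is `b % 2 == 0`.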
339 00:14:58,270 --> 00:15:01,400 In the b even case, again I'm going to let t of b be the 340 00:15:01,400 --> 00:15:03,750 number of steps I want to go through. 341 00:15:03,750 --> 00:15:05,610 And we can kind of eyeball this thing, right? 342 00:15:05,610 --> 00:15:09,470 If b is even, I've got a test to see if b is equal to 1, and 343 00:15:09,470 --> 00:15:11,880 then I've got to do the remainder, the multiplication, 344 00:15:11,880 --> 00:15:14,160 and the test, I'm up to four. 345 00:15:14,160 --> 00:15:17,240 And then in the even case, I've got to do a square and 346 00:15:17,240 --> 00:15:17,830 the divide. 347 00:15:17,830 --> 00:15:24,670 So I've got six steps, plus whatever it takes to solve the 348 00:15:24,670 --> 00:15:26,950 problem of size b over 2, right? 349 00:15:26,950 --> 00:15:33,450 Because that's the recursive call. If b is odd, well I can go 350 00:15:33,450 --> 00:15:34,940 through the same kind of thing. 351 00:15:34,940 --> 00:15:37,610 I've got the same first four steps, I've got a check to see 352 00:15:37,610 --> 00:15:40,130 is it 1, I got a check to see if it's even, and then in the 353 00:15:40,130 --> 00:15:43,500 odd case, I've got to subtract 1 from b, that's a fifth step, 354 00:15:43,500 --> 00:15:45,560 I've got to go off and solve the recursive problem, and 355 00:15:45,560 --> 00:15:48,810 then I'm going to do one more multiplication, so it's 6 356 00:15:48,810 --> 00:15:52,320 plus, in this case, t of b minus 1. 357 00:15:52,320 --> 00:15:55,500 Because it's now solving a one-smaller problem. 358 00:15:55,500 --> 00:16:01,240 On the next step though, this gets substituted by that. 359 00:16:01,240 --> 00:16:03,190 Right, on the next step, I'm back in the even case, it's 360 00:16:03,190 --> 00:16:09,150 going to take six more steps, plus t of b minus 1. 361 00:16:09,150 --> 00:16:12,580 Oops, sorry about that, over 2. 362 00:16:12,580 --> 00:16:15,490 Because b minus 1 is now even. 
363 00:16:15,490 --> 00:16:17,250 Don't sweat the details here, I just want you to see the 364 00:16:17,250 --> 00:16:17,930 reasoning behind it. 365 00:16:17,930 --> 00:16:19,350 What I now have, though, is a nice thing. 366 00:16:19,350 --> 00:16:25,510 It says, in either case, in general, t of b-- and this is 367 00:16:25,510 --> 00:16:27,740 where I'm going to abuse notation a little bit-- but I 368 00:16:27,740 --> 00:16:33,900 can basically bound it by 12 steps plus t of b over 2. 369 00:16:33,900 --> 00:16:35,710 And the abuse is, you know, it's not quite right, it 370 00:16:35,710 --> 00:16:36,900 depends upon whether it's odd or even, but you can see in 371 00:16:36,900 --> 00:16:39,605 either case, after at most 12 steps, two runs through this, I'm down to 372 00:16:39,605 --> 00:16:41,200 a problem of size b over 2. 373 00:16:41,200 --> 00:16:42,830 Why's that nice? 374 00:16:42,830 --> 00:16:49,150 Well, that then says after another 12 steps, we're down 375 00:16:49,150 --> 00:16:51,570 to a problem of size b over 4. 376 00:16:51,570 --> 00:16:59,875 And if I pull it out one more level, it's 12 plus 12 plus 12 plus t 377 00:16:59,875 --> 00:17:04,590 of b over 8, which in general is going to be, after k steps, 378 00:17:04,590 --> 00:17:08,390 12 k because I'll have k of those 12's to add up, plus t of b 379 00:17:08,390 --> 00:17:13,570 over 2 to the k. 380 00:17:13,570 --> 00:17:15,840 When am I done? 381 00:17:15,840 --> 00:17:20,060 When do I get down to the base case? 382 00:17:20,060 --> 00:17:21,650 Somebody help me out. 383 00:17:21,650 --> 00:17:25,350 What am I looking for? 384 00:17:25,350 --> 00:17:25,790 Yeah. 385 00:17:25,790 --> 00:17:28,240 You're jumping slightly ahead of me, but basically, I'm done 386 00:17:28,240 --> 00:17:29,760 when this is equal to 1, right? 
387 00:17:29,760 --> 00:17:32,790 Because I get down to the base case, so I'm done when b 388 00:17:32,790 --> 00:17:36,010 over 2 to the k is equal to 1, and you're absolutely right, 389 00:17:36,010 --> 00:17:44,400 that's when k is log base 2 of b. 390 00:17:44,400 --> 00:17:46,170 You're sitting a long ways back, I have no idea if I'll 391 00:17:46,170 --> 00:17:48,170 make it this far or not. 392 00:17:48,170 --> 00:17:50,040 Thank you. 393 00:17:50,040 --> 00:17:51,250 OK. 394 00:17:51,250 --> 00:17:52,800 There's some constants in there, but this 395 00:17:52,800 --> 00:17:57,550 is order log b. 396 00:17:57,550 --> 00:17:59,490 Logarithmic. 397 00:17:59,490 --> 00:18:01,040 This matters. 398 00:18:01,040 --> 00:18:02,000 This matters a lot. 399 00:18:02,000 --> 00:18:03,610 And I'm going to show you an example in a second, just to 400 00:18:03,610 --> 00:18:06,650 drive this home, but notice the characteristics. 401 00:18:06,650 --> 00:18:09,570 In the first two cases, the problem reduced 402 00:18:09,570 --> 00:18:11,970 by 1 at each step. 403 00:18:11,970 --> 00:18:13,150 Whether it was recursive or iterative. 404 00:18:13,150 --> 00:18:15,710 That's a sign that it's probably linear. 405 00:18:15,710 --> 00:18:19,800 This case, I reduced the size of the problem in half. 406 00:18:19,800 --> 00:18:21,920 It's a good sign that this is logarithmic, and I'm going to 407 00:18:21,920 --> 00:18:25,010 come back in a second to why logs are a great thing. 408 00:18:25,010 --> 00:18:28,410 Let me show you one more class, though, about-- sorry, 409 00:18:28,410 --> 00:18:30,130 let me show you two more classes of algorithms. Let's 410 00:18:30,130 --> 00:18:32,900 look at the next one g-- and there's a bug in your handout, 411 00:18:32,900 --> 00:18:37,210 it should be g of n and m, I apologize for that, I changed 412 00:18:37,210 --> 00:18:40,170 it partway through and didn't catch it. 413 00:18:40,170 --> 00:18:40,490 OK. 
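To see how much the halving matters, the two recurrences can be evaluated side by side. This is a sketch under stated assumptions: `steps_exp3` and `steps_linear` are hypothetical helper names, the per-call cost of 6 is the lecture's count, and the base-case cost of 2 is assumed to be the same as in the earlier recursive example, since the lecture does not state it for this version.

```python
import math


def steps_exp3(b):
    """Step count for the halving exponentiator, following the recurrences
    t(b) = 6 + t(b / 2) for even b and t(b) = 6 + t(b - 1) for odd b."""
    if b == 1:
        return 2                          # assumed base-case cost
    if b % 2 == 0:
        return 6 + steps_exp3(b // 2)     # even: problem size halves
    return 6 + steps_exp3(b - 1)          # odd: one smaller, then even


def steps_linear(b):
    """Step count 3b - 1 derived earlier for the one-at-a-time recursion."""
    return 3 * b - 1


for b in (16, 1024, 1000000):
    bound = 12 * math.log2(b) + 2         # 12 steps per halving, plus base
    print(b, steps_linear(b), steps_exp3(b), round(bound))
```

For b equal to 1024 the linear count is 3071 steps while the halving version takes 62, and the gap widens as b grows, which is the point of calling one linear and the other logarithmic.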
414 00:18:40,490 --> 00:18:45,530 Order of growth here. 415 00:18:45,530 --> 00:18:47,960 Anybody want to volunteer a guess? 416 00:18:47,960 --> 00:18:53,830 Other than the TAs, who know? 417 00:18:53,830 --> 00:18:54,300 OK. 418 00:18:54,300 --> 00:18:56,470 Let's think it through. 419 00:18:56,470 --> 00:18:57,330 I've got two loops. 420 00:18:57,330 --> 00:18:58,710 All right? 421 00:18:58,710 --> 00:19:01,530 We already saw with one of the loops, you know, it looked 422 00:19:01,530 --> 00:19:03,350 like it might be linear, depending on what's inside of 423 00:19:03,350 --> 00:19:04,350 it, but let's think about this. 424 00:19:04,350 --> 00:19:06,200 I got two loops with g. 425 00:19:06,200 --> 00:19:07,060 What's g do? 426 00:19:07,060 --> 00:19:10,260 I've got an initialization of x, and then I say, for i in 427 00:19:10,260 --> 00:19:12,740 the range, so that's basically from 0 up to n minus 428 00:19:12,740 --> 00:19:14,250 1, what do I do? 429 00:19:14,250 --> 00:19:17,080 Well, inside of there, I've got another loop, for j in the 430 00:19:17,080 --> 00:19:20,870 range from 0 up to m minus 1. 431 00:19:20,870 --> 00:19:24,980 What's the complexity of that inner loop? 432 00:19:24,980 --> 00:19:26,730 Sorry? 433 00:19:26,730 --> 00:19:27,260 OK. 434 00:19:27,260 --> 00:19:28,400 You're doing the whole thing for me. 435 00:19:28,400 --> 00:19:30,700 What's the complexity just of this inner loop here? 436 00:19:30,700 --> 00:19:31,250 Just this piece. 437 00:19:31,250 --> 00:19:36,260 How many times do I go through that loop? m. 438 00:19:36,260 --> 00:19:36,440 Right? 439 00:19:36,440 --> 00:19:39,050 I'm going to get back to your answer in a second, because 440 00:19:39,050 --> 00:19:40,100 you're heading in the right direction. 441 00:19:40,100 --> 00:19:43,440 The inner loop, this part here, I do m times. 442 00:19:43,440 --> 00:19:44,520 There's one step inside of it. 443 00:19:44,520 --> 00:19:46,160 Right? 
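The two loops being described can be sketched as follows. The handout code is not in the transcript, so this reconstruction of `g` is an assumption based on the spoken description: an initialization of x, an outer loop over n values of i, an inner loop over m values of j, and one step inside the inner loop.

```python
def g(n, m):
    """Return how many times the body of the inner loop runs."""
    x = 0                      # initialization of x
    for i in range(n):         # outer loop: i goes from 0 up to n - 1
        for j in range(m):     # inner loop: j goes from 0 up to m - 1
            x += 1             # the single step inside the inner loop
    return x


print(g(3, 4))
print(g(10, 10))
```

Since the body runs once per (i, j) pair, the returned count is n times m, so watching how it changes as n and m grow shows the growth rate directly.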
444 00:19:46,160 --> 00:19:48,900 How many times do I go through that loop? 445 00:19:48,900 --> 00:19:53,230 Ah, n times, because for each value of i, I'm going to do 446 00:19:53,230 --> 00:19:56,830 that m thing, so that is, close to what you said, right? 447 00:19:56,830 --> 00:19:59,460 The order complexity here, if I actually write it, would 448 00:19:59,460 --> 00:20:08,530 be-- sorry, order n times m, and if m was equal to n, that 449 00:20:08,530 --> 00:20:14,860 would be order n squared, and this is quadratic. 450 00:20:14,860 --> 00:20:19,660 And that's a different behavior. 451 00:20:19,660 --> 00:20:20,180 OK. 452 00:20:20,180 --> 00:20:21,420 What am I doing? 453 00:20:21,420 --> 00:20:24,050 Building up examples of algorithms. Again, I want you 454 00:20:24,050 --> 00:20:26,460 to start seeing how to map the characteristics of the code-- 455 00:20:26,460 --> 00:20:28,110 the characteristics of the algorithm, let's not call it 456 00:20:28,110 --> 00:20:29,850 the code-- to the complexity. 457 00:20:29,850 --> 00:20:31,300 I'm going to come back to that in a second with that, but I 458 00:20:31,300 --> 00:20:33,850 need to do one more example, and I've got to use my 459 00:20:33,850 --> 00:20:36,660 high-tech really expensive props. 460 00:20:36,660 --> 00:20:38,060 Right. 461 00:20:38,060 --> 00:20:39,880 So here's the fourth or fifth, whatever we're up to, I guess 462 00:20:39,880 --> 00:20:41,330 fifth example. 463 00:20:41,330 --> 00:20:42,620 This is an example of a problem 464 00:20:42,620 --> 00:20:43,580 called Towers of Hanoi. 465 00:20:43,580 --> 00:20:44,840 Anybody heard about this problem? 466 00:20:44,840 --> 00:20:47,520 A few tentative hands. 467 00:20:47,520 --> 00:20:49,100 OK. 468 00:20:49,100 --> 00:20:51,180 Here's the story as I am told it. 469 00:20:51,180 --> 00:20:53,470 There's a temple in the middle of Hanoi. 
470 00:20:53,470 --> 00:20:56,150 In that temple, there are three very large 471 00:20:56,150 --> 00:21:00,390 diamond-encrusted posts, and on those posts are sixty-four 472 00:21:00,390 --> 00:21:02,950 disks, all of a different size. 473 00:21:02,950 --> 00:21:05,170 And they're, you know, covered with jewels and all sorts of 474 00:21:05,170 --> 00:21:07,170 other really neat stuff. 475 00:21:07,170 --> 00:21:10,410 There are a set of priests in that temple, and their task is 476 00:21:10,410 --> 00:21:15,470 to move the entire stack of sixty-four disks from one post 477 00:21:15,470 --> 00:21:19,380 to a second post. When they do this, you know, the universe 478 00:21:19,380 --> 00:21:21,740 ends or they solve the financial crisis in Washington 479 00:21:21,740 --> 00:21:24,470 or something like that actually good happens, right? 480 00:21:24,470 --> 00:21:26,690 Boy, none of you have 401k's, you're not even wincing at 481 00:21:26,690 --> 00:21:28,010 that thing. 482 00:21:28,010 --> 00:21:29,210 All right. 483 00:21:29,210 --> 00:21:31,930 The rules, though, are, they can only move one disk at a 484 00:21:31,930 --> 00:21:36,500 time, and they can never cover up a smaller disk with a 485 00:21:36,500 --> 00:21:37,980 larger disk. 486 00:21:37,980 --> 00:21:38,120 OK. 487 00:21:38,120 --> 00:21:40,720 Otherwise you'd just move the whole darn stack, OK? 488 00:21:40,720 --> 00:21:42,030 So we want to solve that problem. 489 00:21:42,030 --> 00:21:43,780 We want to write a piece of code that helps these guys 490 00:21:43,780 --> 00:21:45,850 out, so I'm going to show you an example. 491 00:21:45,850 --> 00:21:47,490 Let's see if we can figure out how to do this. 492 00:21:47,490 --> 00:21:49,240 So, we'll start with the easy one. 493 00:21:49,240 --> 00:21:51,160 Moving a disk of size 1. 494 00:21:51,160 --> 00:21:53,240 OK, that's not so bad. 
495 00:21:53,240 --> 00:21:55,700 Moving a stack of size 2, if I want to go there, I need to 496 00:21:55,700 --> 00:21:57,980 put this one temporarily over here so I can move the bottom 497 00:21:57,980 --> 00:22:00,380 one before I move it over. 498 00:22:00,380 --> 00:22:02,920 Moving a stack of size 3, again, if I want to go over 499 00:22:02,920 --> 00:22:04,800 there, I need to make sure I can put the spare one over 500 00:22:04,800 --> 00:22:06,850 here before I move the bottom one, I can't cover up any of 501 00:22:06,850 --> 00:22:08,350 the smaller ones with the larger one, but 502 00:22:08,350 --> 00:22:09,200 I can get it there. 503 00:22:09,200 --> 00:22:13,110 Stack of size 4, again I'm going there, so I'm going to 504 00:22:13,110 --> 00:22:15,620 do this initially, no I'm not, I'm going to start again. 505 00:22:15,620 --> 00:22:17,800 I'm going to go there initially, so I can move this 506 00:22:17,800 --> 00:22:19,980 over here, so I can get the base part of that over there, 507 00:22:19,980 --> 00:22:22,290 I want to put that one there before I put this over here, 508 00:22:22,290 --> 00:22:24,145 finally I get to the point where I can move the bottom 509 00:22:24,145 --> 00:22:26,330 one over, now I've got to be really careful to make sure 510 00:22:26,330 --> 00:22:28,350 that I don't cover up the bottom one in the wrong way 511 00:22:28,350 --> 00:22:30,980 before I get to the stage where I wish they were posts 512 00:22:30,980 --> 00:22:31,590 and there you go. 513 00:22:31,590 --> 00:22:32,520 All right? 514 00:22:32,520 --> 00:22:34,570 [APPLAUSE] 515 00:22:34,570 --> 00:22:36,510 I mean, I can make money at Harvard Square doing this 516 00:22:36,510 --> 00:22:38,420 stuff, right? 517 00:22:38,420 --> 00:22:41,140 All right, you ready to do five? 518 00:22:41,140 --> 00:22:43,470 Got the solution? 519 00:22:43,470 --> 00:22:45,350 Not so easy to see. 
520 00:22:45,350 --> 00:22:47,210 All right, but this is actually a great one of those 521 00:22:47,210 --> 00:22:49,270 educational moments. 522 00:22:49,270 --> 00:22:52,200 This is a great example to think recursively. 523 00:22:52,200 --> 00:22:54,320 If I wanted to think about this problem recursively-- 524 00:22:54,320 --> 00:22:55,530 what do I mean by thinking recursively? 525 00:22:55,530 --> 00:22:59,020 How do I reduce this to a smaller-size problem in the 526 00:22:59,020 --> 00:23:00,160 same instant? 527 00:23:00,160 --> 00:23:02,610 And so, if I do that, this now becomes really easy. 528 00:23:02,610 --> 00:23:06,070 If I want to move this stack here, I'm going to take a 529 00:23:06,070 --> 00:23:10,990 stack of size n minus 1, move it to the spare spot, now I 530 00:23:10,990 --> 00:23:13,970 can move the base disk over, and then I'm going to move 531 00:23:13,970 --> 00:23:18,300 that stack of size n minus 1 to there. 532 00:23:18,300 --> 00:23:21,270 That's literally what I did, OK? 533 00:23:21,270 --> 00:23:22,760 So there's the code. 534 00:23:22,760 --> 00:23:23,460 Called towers. 535 00:23:23,460 --> 00:23:26,020 I'm just going to have you-- let you take a look at it. 536 00:23:26,020 --> 00:23:27,990 I'm giving it an argument, which is the size of the 537 00:23:27,990 --> 00:23:30,730 stack, and then just labels for the three posts. 538 00:23:30,730 --> 00:23:33,540 A from, a to, and a spare. 539 00:23:33,540 --> 00:23:36,330 And in fact, if we look at this-- let me just pop it over 540 00:23:36,330 --> 00:23:38,930 to the other side-- 541 00:23:38,930 --> 00:23:44,400 OK, I can move a tower, I'll say of size 2, from, to, and 542 00:23:44,400 --> 00:23:48,170 spare, and that was what I did. 543 00:23:48,170 --> 00:23:55,540 And if I want to move towers, let's say, size 5, from, to, 544 00:23:55,540 --> 00:24:01,140 and spare, there are the instructions 545 00:24:01,140 --> 00:24:02,860 for how to move it. 
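The towers code on the screen is not captured in the transcript. A sketch of the recursive idea just described, assuming the posts are plain string labels and collecting the moves in a list rather than printing them as the lecture version appears to do, could be:

```python
def towers(size, from_post, to_post, spare_post, moves=None):
    """Return the (source, destination) moves for a stack of `size` disks."""
    if moves is None:
        moves = []
    if size == 1:
        # Base case: a single disk moves directly.
        moves.append((from_post, to_post))
    else:
        # Move the top size - 1 disks out of the way, onto the spare post.
        towers(size - 1, from_post, spare_post, to_post, moves)
        # Move the base disk to its destination.
        towers(1, from_post, to_post, spare_post, moves)
        # Move the size - 1 stack from the spare post onto the base disk.
        towers(size - 1, spare_post, to_post, from_post, moves)
    return moves


for source, destination in towers(2, "from", "to", "spare"):
    print("move a disk from", source, "to", destination)
```

The names from_post, to_post, spare_post and the moves accumulator are illustrative choices, not the handout's.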
546 00:24:02,860 --> 00:24:05,330 We ain't going to do sixty-four. 547 00:24:05,330 --> 00:24:06,720 OK. 548 00:24:06,720 --> 00:24:07,020 All right. 549 00:24:07,020 --> 00:24:09,100 So it's fun, and I got a little bit of applause out of 550 00:24:09,100 --> 00:24:13,030 it, which is always nice for me, but I also showed you how 551 00:24:13,030 --> 00:24:14,420 to think about it recursively. 552 00:24:14,420 --> 00:24:16,290 Once you hear that description, it's easy to 553 00:24:16,290 --> 00:24:18,440 write the code, in fact. 554 00:24:18,440 --> 00:24:20,430 This is a place where the recursive version of it is 555 00:24:20,430 --> 00:24:22,760 much easier to think about than the iterative one. 556 00:24:22,760 --> 00:24:25,380 But what I really want to talk about is, what's the order of 557 00:24:25,380 --> 00:24:25,820 growth here? 558 00:24:25,820 --> 00:24:28,830 What's the complexity of this algorithm? 559 00:24:28,830 --> 00:24:31,550 And again, I'm going to do it with a little bit of abuse of 560 00:24:31,550 --> 00:24:33,560 notation, and it's a little more complicated, but we can 561 00:24:33,560 --> 00:24:34,370 kind of look at it. 562 00:24:34,370 --> 00:24:35,890 All right? 563 00:24:35,890 --> 00:24:40,000 Given the code up there, if I want to move a tower of size 564 00:24:40,000 --> 00:24:43,090 n, what do I have to do? 565 00:24:43,090 --> 00:24:47,050 I've got to test to see if I'm in the base case, and if I'm 566 00:24:47,050 --> 00:24:52,230 not, then I need to move a tower of size n minus 1, I 567 00:24:52,230 --> 00:24:56,360 need to move a tower of size 1, and I need to move a 568 00:24:56,360 --> 00:24:59,050 second-- sorry about that-- a second tower of 569 00:24:59,050 --> 00:25:02,000 size n minus 1. 570 00:25:02,000 --> 00:25:03,910 OK. t of 1 I can also reduce. 571 00:25:03,910 --> 00:25:06,540 In the case of a tower of size 1, basically there are two 572 00:25:06,540 --> 00:25:07,290 things to do, right?
573 00:25:07,290 --> 00:25:09,460 I've got to do the test, and then I just do the move. 574 00:25:09,460 --> 00:25:16,420 So the general formula is that. 575 00:25:16,420 --> 00:25:18,020 Now. 576 00:25:18,020 --> 00:25:20,580 You might look at that and say, well that's just a lot 577 00:25:20,580 --> 00:25:22,450 like what we had over here. 578 00:25:22,450 --> 00:25:22,610 Right? 579 00:25:22,610 --> 00:25:25,880 We had some additive constant plus a simpler version of the 580 00:25:25,880 --> 00:25:28,070 same problem reduced in size by 1. 581 00:25:28,070 --> 00:25:31,110 But that two matters. 582 00:25:31,110 --> 00:25:32,780 So let's look at it. 583 00:25:32,780 --> 00:25:35,900 How do I replace the expression for t of n minus 1? 584 00:25:35,900 --> 00:25:39,650 Substitute it in again. t of n minus 1 is 3 plus 2 585 00:25:39,650 --> 00:25:40,950 t of n minus 2. 586 00:25:40,950 --> 00:25:49,890 So this is 3, plus 2 times 3, plus 4 t of n minus 2. 587 00:25:49,890 --> 00:25:51,410 OK. 588 00:25:51,410 --> 00:25:56,640 And if I substitute it again, I get 3 plus 2 times 3 plus 4 589 00:25:56,640 --> 00:26:02,850 times 3 plus 8 t of n minus 3. 590 00:26:02,850 --> 00:26:04,610 This is going by a little fast. I'm just 591 00:26:04,610 --> 00:26:05,500 substituting in. 592 00:26:05,500 --> 00:26:08,090 I'm going to skip some steps. 593 00:26:08,090 --> 00:26:13,210 But basically if I do this, I end up with 3 times, 1 plus 2 594 00:26:13,210 --> 00:26:17,590 plus 4 up to 2 to the k minus 1, for all of 595 00:26:17,590 --> 00:26:21,120 those terms, plus 2-- 596 00:26:21,120 --> 00:26:30,960 I want to do this right, 2 to the k, sorry-- t of n minus k. 597 00:26:30,960 --> 00:26:31,250 OK. 598 00:26:31,250 --> 00:26:33,620 Don't sweat the details, I'm just expanding it out. 599 00:26:33,620 --> 00:26:36,720 What I want you to see is, because I've got two versions 600 00:26:36,720 --> 00:26:38,130 of that problem.
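Written out, the substitution being done on the board can be reconstructed as follows. This is a sketch under the step counts stated in the lecture: a base case t(1) = 2 (one test, one move) and t(n) = 3 + 2 t(n-1) for larger towers. The closed form at the end follows from those two assumptions and is not read off the board.

```latex
\begin{align*}
t(1) &= 2, \qquad t(n) = 3 + 2\,t(n-1) \\
t(n) &= 3 + 2\cdot 3 + 4\,t(n-2) \\
     &= 3 + 2\cdot 3 + 4\cdot 3 + 8\,t(n-3) \\
     &= 3\,\bigl(1 + 2 + 4 + \dots + 2^{k-1}\bigr) + 2^{k}\,t(n-k) \\
     &= 3\,\bigl(2^{k} - 1\bigr) + 2^{k}\,t(n-k) \\
\text{with } k = n-1:\quad
t(n) &= 3\,\bigl(2^{n-1} - 1\bigr) + 2\cdot 2^{n-1} = 5\cdot 2^{n-1} - 3
\end{align*}
```

The constants 3 and 5 do not affect the order of growth; the 2 to the n minus 1 term is what dominates as n gets large.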
601 00:26:38,130 --> 00:26:39,660 The next time down I've got four versions. 602 00:26:39,660 --> 00:26:40,980 Next time down I've got eight versions. 603 00:26:40,980 --> 00:26:43,510 And in fact, if I substitute, I can solve for this, I'm done 604 00:26:43,510 --> 00:26:45,150 when this is equal to 1. 605 00:26:45,150 --> 00:26:49,500 If you substitute it all in, you get basically 606 00:26:49,500 --> 00:26:54,850 order 2 to the n. 607 00:26:54,850 --> 00:26:57,590 Exponential. 608 00:26:57,590 --> 00:26:59,850 That's a problem. 609 00:26:59,850 --> 00:27:02,250 Now, it's also the case that this is fundamentally what 610 00:27:02,250 --> 00:27:04,370 class this algorithm falls into, it is going to take 611 00:27:04,370 --> 00:27:06,160 exponential amount of time. 612 00:27:06,160 --> 00:27:10,120 But it grows pretty rapidly, as n goes up, and I'm going to 613 00:27:10,120 --> 00:27:12,460 show you an example in a second. 614 00:27:12,460 --> 00:27:14,290 Again, what I want you to see is, notice the 615 00:27:14,290 --> 00:27:15,710 characteristic of that. 616 00:27:15,710 --> 00:27:21,240 That this recursive call had two sub-problems of a smaller 617 00:27:21,240 --> 00:27:22,450 size, not one. 618 00:27:22,450 --> 00:27:23,490 And that makes a big difference. 619 00:27:23,490 --> 00:27:26,530 So just to show you how big a difference it makes, let's run 620 00:27:26,530 --> 00:27:29,810 a couple of numbers. 621 00:27:29,810 --> 00:27:33,410 Let's suppose n is 1000, and we're running 622 00:27:33,410 --> 00:27:37,770 at nanosecond speed. 623 00:27:37,770 --> 00:27:52,550 We have seen log, linear, quadratic, and exponential. 624 00:27:52,550 --> 00:27:55,320 So, again, there could be constants in here, but just to 625 00:27:55,320 --> 00:27:56,980 give you a sense of this. 
626 00:27:56,980 --> 00:27:59,850 If I'm running at nanosecond speed, n, the size of the 627 00:27:59,850 --> 00:28:02,820 problem, whatever it is, is 1000, and I've got a log 628 00:28:02,820 --> 00:28:08,940 algorithm, it takes 10 nanoseconds to complete. 629 00:28:08,940 --> 00:28:12,570 If you blink, you miss it. 630 00:28:12,570 --> 00:28:18,770 If I'm running a linear algorithm, it'll take one 631 00:28:18,770 --> 00:28:21,680 microsecond to complete. 632 00:28:21,680 --> 00:28:25,570 If I'm running a quadratic algorithm, it'll take one 633 00:28:25,570 --> 00:28:29,570 millisecond to complete. 634 00:28:29,570 --> 00:28:32,600 And if I'm running an exponential 635 00:28:32,600 --> 00:28:33,830 algorithm, any guesses? 636 00:28:33,830 --> 00:28:50,310 I hope Washington doesn't take this long to fix my 401k plan. 637 00:28:50,310 --> 00:28:51,460 All right? 638 00:28:51,460 --> 00:28:54,250 10 to the 284 years. 639 00:28:54,250 --> 00:28:56,930 As Emeril would say, pow! 640 00:28:56,930 --> 00:28:58,400 That's a some spicy whatever. 641 00:28:58,400 --> 00:28:59,600 All right. 642 00:28:59,600 --> 00:29:01,220 Bad jokes aside, what's the point? 643 00:29:01,220 --> 00:29:04,700 You see, these classes have really different performance. 644 00:29:04,700 --> 00:29:05,990 Now this is a little misleading. 645 00:29:05,990 --> 00:29:08,210 These are all really fast, so just to give you another set 646 00:29:08,210 --> 00:29:10,200 of examples, I'm not going to do the-- 647 00:29:10,200 --> 00:29:17,600 If I had a problem where the log one took ten milliseconds, 648 00:29:17,600 --> 00:29:21,340 then the linear one would take a second, the quadratic one 649 00:29:21,340 --> 00:29:26,880 would take 16 minutes. 650 00:29:26,880 --> 00:29:29,670 So you can see, even the quadratic ones can 651 00:29:29,670 --> 00:29:30,280 blow up in a hurry. 652 00:29:30,280 --> 00:29:31,820 And this goes back to the point I tried 653 00:29:31,820 --> 00:29:33,130 to make last time. 
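The first set of numbers can be checked with a few lines of Python. This is a rough sketch, assuming one basic step per nanosecond, no hidden constants, and a 365-day year; the variable names are made up for illustration.

```python
import math

N = 1000                                    # problem size used in the lecture
NS_PER_YEAR = 10**9 * 60 * 60 * 24 * 365    # nanoseconds in a 365-day year

log_ns = math.log2(N)                       # logarithmic: about 10 steps
linear_ns = N                               # linear: 1000 steps
quadratic_ns = N ** 2                       # quadratic: a million steps
exponential_years = 2 ** N // NS_PER_YEAR   # exact integer arithmetic

print(f"log:         about {log_ns:.0f} ns")
print(f"linear:      {linear_ns} ns, i.e. one microsecond")
print(f"quadratic:   {quadratic_ns} ns, i.e. one millisecond")
print(f"exponential: on the order of 10^{len(str(exponential_years)) - 1} years")
```

Python integers have arbitrary precision, so 2 ** N is computed exactly here; the number of digits in the quotient gives the power of ten.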
654 00:29:33,130 --> 00:29:36,360 Yes, the computers are really fast. But the problems can 655 00:29:36,360 --> 00:29:38,910 grow much faster than you can get a performance boost out of 656 00:29:38,910 --> 00:29:39,930 the computer. 657 00:29:39,930 --> 00:29:42,270 And you really, wherever possible, want to avoid that 658 00:29:42,270 --> 00:29:44,760 exponential algorithm, because that's really deadly. 659 00:29:44,760 --> 00:29:48,660 Yes. 660 00:29:48,660 --> 00:29:50,520 All right. 661 00:29:50,520 --> 00:29:52,790 The question is, is there a point where it'll quit. 662 00:29:52,790 --> 00:29:56,130 Yeah, when the power goes out, or-- so let me not answer it 663 00:29:56,130 --> 00:29:57,080 quite so facetiously. 664 00:29:57,080 --> 00:29:58,680 We'd be mostly talking about time. 665 00:29:58,680 --> 00:30:00,890 In fact, if I ran one of these things, it would just keep 666 00:30:00,890 --> 00:30:01,930 crunching away. 667 00:30:01,930 --> 00:30:05,030 It will probably quit at some point because of space issues, 668 00:30:05,030 --> 00:30:07,130 unless I'm writing an algorithm that is using no 669 00:30:07,130 --> 00:30:08,670 additional space. 670 00:30:08,670 --> 00:30:09,250 Right. 671 00:30:09,250 --> 00:30:10,770 Those things are going to stack up, and eventually it's 672 00:30:10,770 --> 00:30:11,540 going to run out of space. 673 00:30:11,540 --> 00:30:13,730 And that's more likely to happen, but, you know. 674 00:30:13,730 --> 00:30:17,280 The algorithm doesn't know that it's going to take this 675 00:30:17,280 --> 00:30:19,800 long to compute, it's just busy crunching away, trying to 676 00:30:19,800 --> 00:30:21,190 see if it can make it happen. 677 00:30:21,190 --> 00:30:21,860 OK. 678 00:30:21,860 --> 00:30:24,290 Good question, thank you. 679 00:30:24,290 --> 00:30:26,010 All right. 
680 00:30:26,010 --> 00:30:29,510 I want to do one more extended example here, because we've 681 00:30:29,510 --> 00:30:31,670 got another piece to do, but I want to capture this, because 682 00:30:31,670 --> 00:30:33,390 it's important, so let me again try and say it the 683 00:30:33,390 --> 00:30:34,610 following way. 684 00:30:34,610 --> 00:30:37,590 I want you to recognize classes of algorithms and 685 00:30:37,590 --> 00:30:41,420 match what you see in the performance of the algorithm 686 00:30:41,420 --> 00:30:42,680 to the complexity of that algorithm. 687 00:30:42,680 --> 00:30:44,020 All right? 688 00:30:44,020 --> 00:30:47,270 Linear algorithms tend to be things where, at one 689 00:30:47,270 --> 00:30:50,100 pass-through, you reduce the problem by a 690 00:30:50,100 --> 00:30:51,790 constant amount, by one. 691 00:30:51,790 --> 00:30:53,450 If you reduce it by two, it's going to be the same thing. 692 00:30:53,450 --> 00:30:55,400 Where you go from a problem of size n to a problem of 693 00:30:55,400 --> 00:30:57,460 size n minus 1. 694 00:30:57,460 --> 00:31:01,600 A log algorithm typically is one where you cut the size of 695 00:31:01,600 --> 00:31:03,750 the problem down by some multiplicative factor. 696 00:31:03,750 --> 00:31:04,960 You reduce it in half. 697 00:31:04,960 --> 00:31:05,810 You reduce it to a third. 698 00:31:05,810 --> 00:31:07,660 All right?
699 00:31:07,660 --> 00:31:09,700 Quadratic algorithms tend to have this-- 700 00:31:09,700 --> 00:31:11,640 I was about to say additive, wrong term-- but 701 00:31:11,640 --> 00:31:15,630 doubly-nested, triply-nested things are likely to be 702 00:31:15,630 --> 00:31:18,420 quadratic or cubic algorithms, all right, because you know-- 703 00:31:18,420 --> 00:31:20,310 let me not confuse things-- double-loop quadratic 704 00:31:20,310 --> 00:31:23,000 algorithm, because you're doing one set of things and 705 00:31:23,000 --> 00:31:25,360 you're doing it some other number of times, and that's a 706 00:31:25,360 --> 00:31:27,970 typical signal that that's what you have there. 707 00:31:27,970 --> 00:31:28,390 OK. 708 00:31:28,390 --> 00:31:30,910 And then the exponentials, as you saw is when typically I 709 00:31:30,910 --> 00:31:36,200 reduce the problem of one size into two or more sub-problems 710 00:31:36,200 --> 00:31:37,930 of a smaller size. 711 00:31:37,930 --> 00:31:39,970 And you can imagine this gets complex and there's lots of 712 00:31:39,970 --> 00:31:41,580 interesting things to do to look to the real form, but 713 00:31:41,580 --> 00:31:43,960 those are the things that you should see. 714 00:31:43,960 --> 00:31:44,480 Now. 715 00:31:44,480 --> 00:31:47,790 Two other things, before we do this last example. 716 00:31:47,790 --> 00:31:50,210 One is, I'll remind you, what we're interested in is 717 00:31:50,210 --> 00:31:51,800 asymptotic growth. 718 00:31:51,800 --> 00:31:55,040 How does this thing grow as I make the problem size big? 719 00:31:55,040 --> 00:31:56,950 And I'll also remind you, and we're going to see this in the 720 00:31:56,950 --> 00:31:58,480 next example, we talked about looking at 721 00:31:58,480 --> 00:32:00,530 the worst case behavior. 722 00:32:00,530 --> 00:32:02,610 In these cases there's no best case worst case, it's just 723 00:32:02,610 --> 00:32:03,800 doing one computation. 
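The four patterns just summarized can be made concrete by counting steps. The functions here are hypothetical illustrations, not code from the handout; each one does nothing except count how many basic steps its pattern takes for a problem of size n.

```python
def steps_reduce_by_one(n):
    """Linear pattern: each pass shrinks the problem by one."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def steps_halving(n):
    """Logarithmic pattern: each pass cuts the problem in half."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


def steps_double_loop(n):
    """Quadratic pattern: a loop of size n nested inside a loop of size n."""
    steps = 0
    for i in range(n):
        for j in range(n):
            steps += 1
    return steps


def steps_two_subproblems(n):
    """Exponential pattern: one problem becomes two of size n - 1."""
    if n <= 1:
        return 1
    return 1 + steps_two_subproblems(n - 1) + steps_two_subproblems(n - 1)


for n in (4, 8, 16):
    print(n, steps_reduce_by_one(n), steps_halving(n),
          steps_double_loop(n), steps_two_subproblems(n))
```

Doubling n adds one step to the halving count, doubles the reduce-by-one count, quadruples the double-loop count, and roughly squares the two-subproblem count.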
724 00:32:03,800 --> 00:32:05,570 We're going to see an example of that in a second. 725 00:32:05,570 --> 00:32:07,850 What we really want to worry about is, what's the worst case 726 00:32:07,850 --> 00:32:09,000 that happens. 727 00:32:09,000 --> 00:32:11,410 And the third thing I want you to keep in mind is, remember 728 00:32:11,410 --> 00:32:14,760 these are orders of growth. 729 00:32:14,760 --> 00:32:17,370 It is certainly possible, for example, that a quadratic 730 00:32:17,370 --> 00:32:20,810 algorithm could run faster than a linear algorithm. 731 00:32:20,810 --> 00:32:24,330 It depends on what the input is, it depends on, you know, 732 00:32:24,330 --> 00:32:25,480 what the particular cases are. 733 00:32:25,480 --> 00:32:28,300 So it is not the case that, on every input, a linear 734 00:32:28,300 --> 00:32:29,590 algorithm is always going to be better 735 00:32:29,590 --> 00:32:30,140 than a quadratic algorithm. 736 00:32:30,140 --> 00:32:33,490 It is just in general that's going to hold true, and that's 737 00:32:33,490 --> 00:32:35,880 what I want you to see. 738 00:32:35,880 --> 00:32:36,260 OK. 739 00:32:36,260 --> 00:32:37,560 I want to do one last example. 740 00:32:37,560 --> 00:32:41,170 I'm going to take a little bit more time on it, because it's 741 00:32:41,170 --> 00:32:43,600 going to both reinforce these ideas, but it's also going to 742 00:32:43,600 --> 00:32:46,190 show us how we have to think about what's a primitive 743 00:32:46,190 --> 00:32:49,720 step, and in particular, how do data structures 744 00:32:49,720 --> 00:32:51,760 interact with this analysis? 745 00:32:51,760 --> 00:32:53,790 Here I've just been running integers, it's pretty simple, 746 00:32:53,790 --> 00:32:55,110 but if I have a data structure, I'm going to have 747 00:32:55,110 --> 00:32:56,320 to worry about that a little bit more. 748 00:32:56,320 --> 00:32:57,130 So let's look at that.
749 00:32:57,130 --> 00:33:00,290 And the example I want to look at is, suppose I want to 750 00:33:00,290 --> 00:33:04,360 search a list that I know is sorted, to see if an element's 751 00:33:04,360 --> 00:33:05,920 in the list. OK? 752 00:33:05,920 --> 00:33:17,875 So the example I'm going to do, I'm going to search a 753 00:33:17,875 --> 00:33:20,850 sorted list. All right. 754 00:33:20,850 --> 00:33:23,250 If you flip to the second side of your handout, you'll see 755 00:33:23,250 --> 00:33:26,640 that I have a piece of code there, that does this-- let 756 00:33:26,640 --> 00:33:28,580 me, ah, I didn't want to do that, let me back up 757 00:33:28,580 --> 00:33:32,870 slightly-- this is the algorithm called search. 758 00:33:32,870 --> 00:33:35,020 And let's take a look at it. 759 00:33:35,020 --> 00:33:36,050 OK? 760 00:33:36,050 --> 00:33:38,730 Basic idea, before I even look at the code, is pretty simple. 761 00:33:38,730 --> 00:33:41,960 If I've got a list that is sorted, in let's call it, just 762 00:33:41,960 --> 00:33:43,670 in increasing order, and I haven't said what's in the 763 00:33:43,670 --> 00:33:45,830 list, could be numbers, could be other things, for now, 764 00:33:45,830 --> 00:33:46,800 we're going to just assume they're integers. 765 00:33:46,800 --> 00:33:50,140 The easy thing to do would be the following: start at the 766 00:33:50,140 --> 00:33:53,010 front end of the list, check the first element. 767 00:33:53,010 --> 00:33:54,130 If it's the thing I'm looking for, I'm done. 768 00:33:54,130 --> 00:33:54,740 It's there. 769 00:33:54,740 --> 00:33:57,010 If not, move on to the next element. 770 00:33:57,010 --> 00:33:57,900 And keep doing that. 
771 00:33:57,900 --> 00:34:00,690 But if, at any point, I get to a place in the list where the 772 00:34:00,690 --> 00:34:04,240 thing I'm looking for is smaller than the element in 773 00:34:04,240 --> 00:34:07,180 the list, I know everything else in the rest of the list 774 00:34:07,180 --> 00:34:08,530 has to be bigger than that, I don't have to 775 00:34:08,530 --> 00:34:09,710 bother looking anymore. 776 00:34:09,710 --> 00:34:10,724 It says the element's not there. 777 00:34:10,724 --> 00:34:12,450 I can just stop. 778 00:34:12,450 --> 00:34:12,750 OK. 779 00:34:12,750 --> 00:34:14,550 So that's what this piece of code does here. 780 00:34:14,550 --> 00:34:15,390 Right.? 781 00:34:15,390 --> 00:34:17,700 I'm going to set up a variable to say, what's the answer I 782 00:34:17,700 --> 00:34:19,090 want to return, is it there or not. 783 00:34:19,090 --> 00:34:22,200 Initially it's got that funny value none. 784 00:34:22,200 --> 00:34:25,260 I'm going to set up an index, which is going to tell me 785 00:34:25,260 --> 00:34:29,990 where to look, starting at the first part of the list, right? 786 00:34:29,990 --> 00:34:30,630 And then, when I got-- 787 00:34:30,630 --> 00:34:32,810 I'm also going to count how many comparisons I do, just so 788 00:34:32,810 --> 00:34:34,230 I can see how much work I do here, and then 789 00:34:34,230 --> 00:34:35,010 notice what it does. 790 00:34:35,010 --> 00:34:40,330 It says while the index is smaller than the size of the 791 00:34:40,330 --> 00:34:43,840 list, I'm not at the end of the list, and I don't have an 792 00:34:43,840 --> 00:34:45,960 answer yet, check. 793 00:34:45,960 --> 00:34:49,660 So I'm going to check to see if-- really can't read that 794 00:34:49,660 --> 00:34:51,250 thing, let me do it this way-- right, I'm going to increase 795 00:34:51,250 --> 00:34:53,890 the number of compares, and I'm going to check to say, is 796 00:34:53,890 --> 00:34:57,250 the thing I'm looking for at the i'th spot in the list? 
797 00:34:57,250 --> 00:34:59,250 Right, so s of i saying, given the list, look at the i'th 798 00:34:59,250 --> 00:35:01,790 element, is it the same thing? 799 00:35:01,790 --> 00:35:04,700 If it is, OK. 800 00:35:04,700 --> 00:35:06,590 Set the answer to true. 801 00:35:06,590 --> 00:35:09,900 Which means, next time through the loop, that's going to pop 802 00:35:09,900 --> 00:35:11,930 out and return an answer. 803 00:35:11,930 --> 00:35:17,490 If it's not, then check to see, is it smaller than that 804 00:35:17,490 --> 00:35:19,880 element in the current spot of the list? 805 00:35:19,880 --> 00:35:22,200 And if that's true, it says again, everything else in the 806 00:35:22,200 --> 00:35:24,600 list has to be bigger than this, thing can't possibly be 807 00:35:24,600 --> 00:35:26,790 in the list, I'm taking advantage of the ordering, I 808 00:35:26,790 --> 00:35:29,880 can set the answer to false, change i to go to the next 809 00:35:29,880 --> 00:35:31,956 one, and next time through the loop, I'm going to pop out and 810 00:35:31,956 --> 00:35:34,140 print it out. 811 00:35:34,140 --> 00:35:36,800 OK? 812 00:35:36,800 --> 00:35:37,490 Right. 813 00:35:37,490 --> 00:35:38,460 Order of growth here. 814 00:35:38,460 --> 00:35:46,650 What do you think? 815 00:35:46,650 --> 00:35:48,840 Even with these glasses on, I can see no hands up, any 816 00:35:48,840 --> 00:35:49,600 suggestions? 817 00:35:49,600 --> 00:35:50,250 Somebody help me out. 818 00:35:50,250 --> 00:35:51,540 What do you think the order of growth is here? 819 00:35:51,540 --> 00:35:58,300 I've got a list, walk you through it an element at a 820 00:35:58,300 --> 00:36:00,780 time, do I look at each element of the 821 00:36:00,780 --> 00:36:02,890 list more than once? 822 00:36:02,890 --> 00:36:04,600 Don't think so, right? 823 00:36:04,600 --> 00:36:07,970 So, what does this suggest? 824 00:36:07,970 --> 00:36:09,100 Sorry? 825 00:36:09,100 --> 00:36:10,420 Constant. 
826 00:36:10,420 --> 00:36:12,910 Ooh, constant says, no matter what the length of the list 827 00:36:12,910 --> 00:36:16,630 is, I'm going to take the same amount of time. 828 00:36:16,630 --> 00:36:17,890 And I don't think that's true, right? 829 00:36:17,890 --> 00:36:21,080 If I have a list ten times longer, it's going to take me 830 00:36:21,080 --> 00:36:23,890 more time, so-- not a bad guess, I'll still 831 00:36:23,890 --> 00:36:27,190 reward you, thank you. 832 00:36:27,190 --> 00:36:29,080 Somebody else. 833 00:36:29,080 --> 00:36:30,500 Yeah. 834 00:36:30,500 --> 00:36:30,930 Linear. 835 00:36:30,930 --> 00:36:31,930 Why? 836 00:36:31,930 --> 00:36:39,290 You're right, by the way, but why? 837 00:36:39,290 --> 00:36:40,140 Yeah. 838 00:36:40,140 --> 00:36:41,880 All right, so the answer was it's linear, which is 839 00:36:41,880 --> 00:36:43,710 absolutely right. 840 00:36:43,710 --> 00:36:44,730 Although for a reason we're going to 841 00:36:44,730 --> 00:36:46,620 come back to in a second. 842 00:36:46,620 --> 00:36:48,450 Oh, thank you, I hope your friends help you out with 843 00:36:48,450 --> 00:36:49,920 that, thank you. 844 00:36:49,920 --> 00:36:50,670 Right? 845 00:36:50,670 --> 00:36:52,290 You can see that this ought to be linear, 846 00:36:52,290 --> 00:36:53,200 because what am I doing? 847 00:36:53,200 --> 00:36:56,220 I'm walking down the list. So one of the things I didn't 848 00:36:56,220 --> 00:36:58,350 say, it's sort of implicit here, is what is the thing I'm 849 00:36:58,350 --> 00:36:59,700 measuring the size of the problem in? 850 00:36:59,700 --> 00:37:01,450 What's the size of the list? 851 00:37:01,450 --> 00:37:08,230 And if I'm walking down the list, this is probably order 852 00:37:08,230 --> 00:37:10,910 of the length of the list s, because I'm looking at each 853 00:37:10,910 --> 00:37:13,210 element once. 854 00:37:13,210 --> 00:37:14,840 Now you might say, wait a minute.
855 00:37:14,840 --> 00:37:17,300 Thing's ordered, if I stop part way through and I throw 856 00:37:17,300 --> 00:37:19,760 away half the list, doesn't that help me? 857 00:37:19,760 --> 00:37:22,430 And the answer is yes, but it doesn't change the complexity. 858 00:37:22,430 --> 00:37:23,960 Because what did we say? 859 00:37:23,960 --> 00:37:26,150 We're measuring the worst case. 860 00:37:26,150 --> 00:37:28,300 The worst case here is, the thing's not in the list, in 861 00:37:28,300 --> 00:37:30,520 which case I've got to go all the way through the list to 862 00:37:30,520 --> 00:37:32,220 get to the end. 863 00:37:32,220 --> 00:37:33,390 OK. 864 00:37:33,390 --> 00:37:37,110 Now, having said that, and I've actually got a subtlety 865 00:37:37,110 --> 00:37:39,500 I'm going to come back to in a second, there ought to be a 866 00:37:39,500 --> 00:37:40,810 better way to do this. 867 00:37:40,810 --> 00:37:41,370 OK? 868 00:37:41,370 --> 00:37:47,180 And here's the better way to think about it. 869 00:37:47,180 --> 00:37:49,640 I'll just draw out sort of a funny representation of a 870 00:37:49,640 --> 00:37:55,780 list. These are sort of the cells, if you like, in memory 871 00:37:55,780 --> 00:37:58,060 that are holding the elements of the list. What we've been 872 00:37:58,060 --> 00:38:00,020 saying is, I start here and look. 873 00:38:00,020 --> 00:38:00,900 If it's there, I'm done. 874 00:38:00,900 --> 00:38:02,030 If not, I go there. 875 00:38:02,030 --> 00:38:04,600 If it's there, I'm done, if not, I keep walking down, and 876 00:38:04,600 --> 00:38:07,240 I only stop when I get to a place where the element I'm 877 00:38:07,240 --> 00:38:10,150 looking for is smaller than the value in the list, in 878 00:38:10,150 --> 00:38:11,870 which case I know the rest of this is too 879 00:38:11,870 --> 00:38:13,650 big and I can stop. 880 00:38:13,650 --> 00:38:16,080 But I still have to go through the list.
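The walk-down-the-list search described above is on the handout but not in the transcript. A sketch that follows the spoken description, assuming a list of integers sorted in increasing order and returning the answer together with the comparison count instead of printing them, might be:

```python
def search(s, e):
    """Linear search of sorted list s for e; returns (found, comparisons)."""
    answer = None          # no answer yet
    i = 0                  # index of the element to look at next
    num_compares = 0       # how many list elements have been examined
    while i < len(s) and answer is None:
        num_compares += 1
        if e == s[i]:
            answer = True              # found it
        elif e < s[i]:
            answer = False             # everything after s[i] is bigger still
        i += 1
    # Falling off the end without an answer also means "not there".
    return bool(answer), num_compares


print(search([1, 3, 5, 7], 5))
print(search([1, 3, 5, 7], 9))
```

Searching for a value larger than everything in the list forces a look at every element, which is the worst case the linear bound refers to.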
881 00:38:16,080 --> 00:38:17,830 There's a better way to think about this, and in fact 882 00:38:17,830 --> 00:38:20,600 Professor Guttag has already hinted at this in the last 883 00:38:20,600 --> 00:38:22,180 couple of lectures. 884 00:38:22,180 --> 00:38:24,740 The better way to think about this is, suppose, rather than 885 00:38:24,740 --> 00:38:26,900 starting at the beginning, I just grabbed some spot at 886 00:38:26,900 --> 00:38:29,880 random, like this one. 887 00:38:29,880 --> 00:38:32,120 And I look at that value. 888 00:38:32,120 --> 00:38:34,340 If it's the value I'm looking for, boy, I ought to go to 889 00:38:34,340 --> 00:38:35,630 Vegas, I'm really lucky. 890 00:38:35,630 --> 00:38:37,840 And I'm done, right? 891 00:38:37,840 --> 00:38:39,330 If not, what could I do? 892 00:38:39,330 --> 00:38:41,780 Well, I could look at the value here, and compare it to 893 00:38:41,780 --> 00:38:45,690 the value I'm trying to find, and say the following; if the 894 00:38:45,690 --> 00:38:51,010 value I'm looking for is bigger than this value, where 895 00:38:51,010 --> 00:38:53,240 do I need to look? 896 00:38:53,240 --> 00:38:53,760 Just here. 897 00:38:53,760 --> 00:38:57,940 All right? 898 00:38:57,940 --> 00:38:59,580 Can't possibly be there, because I know 899 00:38:59,580 --> 00:39:01,080 this thing is over. 900 00:39:01,080 --> 00:39:03,460 On the other hand, if the value I'm looking for here-- 901 00:39:03,460 --> 00:39:06,810 sorry, the value I'm looking for is smaller than the value 902 00:39:06,810 --> 00:39:10,350 I see here, I just need to look here. 903 00:39:10,350 --> 00:39:14,730 All right? 904 00:39:14,730 --> 00:39:17,030 Having done that, I could do the same thing, so I suppose I 905 00:39:17,030 --> 00:39:20,040 take this branch, I can pick a spot like, say, this one, and 906 00:39:20,040 --> 00:39:21,260 look there. 
907 00:39:21,260 --> 00:39:23,180 If it's there, I'm done, if not, I'm either 908 00:39:23,180 --> 00:39:27,750 looking here or there. 909 00:39:27,750 --> 00:39:31,350 And I keep cutting the problem down. 910 00:39:31,350 --> 00:39:31,760 OK. 911 00:39:31,760 --> 00:39:35,400 Now, having said that, where should I pick to 912 00:39:35,400 --> 00:39:39,220 look in this list? 913 00:39:39,220 --> 00:39:40,630 I'm sorry? 914 00:39:40,630 --> 00:39:41,270 Halfway. 915 00:39:41,270 --> 00:39:41,580 Why? 916 00:39:41,580 --> 00:39:50,060 You're right, but why? 917 00:39:50,060 --> 00:39:50,660 Yeah. 918 00:39:50,660 --> 00:39:53,250 So the answer, in case you didn't hear it, was, again, if 919 00:39:53,250 --> 00:39:55,800 I'm a gambling person, I could start, like, way down here. 920 00:39:55,800 --> 00:39:57,220 All right? 921 00:39:57,220 --> 00:39:59,320 If I'm gambling, I'm saying, gee, if I'm really lucky, 922 00:39:59,320 --> 00:40:01,220 it'll be only on this side, and I've got a little bit of 923 00:40:01,220 --> 00:40:03,910 work to do, but if I'm unlucky, I'm scrod, the past 924 00:40:03,910 --> 00:40:07,690 pluperfect of screwed, OK, or a Boston fish. 925 00:40:07,690 --> 00:40:10,140 I'll look at the rest of that big chunk of the list, and 926 00:40:10,140 --> 00:40:11,290 that's a pain. 927 00:40:11,290 --> 00:40:15,480 So halfway is the right thing to do, because at each step, 928 00:40:15,480 --> 00:40:18,850 I'm guaranteed to throw away at least half the list. Right? 929 00:40:18,850 --> 00:40:20,640 And that's nice. 930 00:40:20,640 --> 00:40:21,010 OK. 931 00:40:21,010 --> 00:40:25,060 What would you guess the order of growth here is? 932 00:40:25,060 --> 00:40:26,360 Yeah. 933 00:40:26,360 --> 00:40:29,590 Why? 934 00:40:29,590 --> 00:40:29,970 Good. 935 00:40:29,970 --> 00:40:30,520 Exactly. 936 00:40:30,520 --> 00:40:30,920 Right? 937 00:40:30,920 --> 00:40:33,570 Again, if you didn't hear it, the answer was it's log.
938 00:40:33,570 --> 00:40:36,700 Because I'm cutting down the problem in half at each time. 939 00:40:36,700 --> 00:40:38,890 You're right, but there's something we have to do to add 940 00:40:38,890 --> 00:40:40,800 to that, and that's the last thing I want to pick up on. 941 00:40:40,800 --> 00:40:41,370 OK. 942 00:40:41,370 --> 00:40:43,610 Let's look at the code-- actually, let's test this out 943 00:40:43,610 --> 00:40:45,090 first before we do it. 944 00:40:45,090 --> 00:40:47,895 So I've added, as Professor Guttag did-- ah, should have 945 00:40:47,895 --> 00:40:51,130 said it this way, let's write the code for it first, sorry 946 00:40:51,130 --> 00:40:52,810 about that-- 947 00:40:52,810 --> 00:40:54,970 OK, I'm going to write a little thing called b search. 948 00:40:54,970 --> 00:40:57,680 I'm going to call it down here with search, which is simply 949 00:40:57,680 --> 00:41:00,290 going to call it, and then print an answer out. 950 00:41:00,290 --> 00:41:03,240 In binary search-- ah, there's that wonderful phrase, this is 951 00:41:03,240 --> 00:41:05,560 called a version of binary search, just like you saw 952 00:41:05,560 --> 00:41:07,970 bin-- or bi-section methods, when we were doing numerical 953 00:41:07,970 --> 00:41:11,090 things-- in binary search, I need to keep track of the 954 00:41:11,090 --> 00:41:13,570 starting point and the ending point of the 955 00:41:13,570 --> 00:41:14,320 list I'm looking at. 956 00:41:14,320 --> 00:41:16,730 Initially, it's the beginning and the end of it. 
957 00:41:16,730 --> 00:41:18,805 And when I do this test, what I want to do, is say I'm going 958 00:41:18,805 --> 00:41:21,780 to pick the middle spot, and depending on the test, if I 959 00:41:21,780 --> 00:41:24,560 know it's in the upper half, I'm going to set my start at 960 00:41:24,560 --> 00:41:27,710 the mid point and the end stays the same, if it's in the 961 00:41:27,710 --> 00:41:29,583 front half I'm going to keep the front the same and I'm 962 00:41:29,583 --> 00:41:30,860 going to change the endpoint. 963 00:41:30,860 --> 00:41:33,350 And you can see that in this code here. 964 00:41:33,350 --> 00:41:34,130 Right? 965 00:41:34,130 --> 00:41:34,940 What does it say to do? 966 00:41:34,940 --> 00:41:37,850 It says, well I'm going to print out first and last, just 967 00:41:37,850 --> 00:41:41,240 so you can see it, and then I say, gee, if last minus first 968 00:41:41,240 --> 00:41:43,330 is less than 2, that is, if there's no more than two 969 00:41:43,330 --> 00:41:46,710 elements left in the list, then I can just check those 970 00:41:46,710 --> 00:41:50,380 two elements, and return the answer. 971 00:41:50,380 --> 00:41:52,700 Otherwise, we find the midpoint, and 972 00:41:52,700 --> 00:41:53,590 notice what it does. 973 00:41:53,590 --> 00:41:55,620 first is pointing to the beginning of the list, which 974 00:41:55,620 --> 00:41:58,160 initially might be down here at 0 but after a while, might 975 00:41:58,160 --> 00:41:59,800 be part way through. 976 00:41:59,800 --> 00:42:03,390 And to that, I simply add a halfway 977 00:42:03,390 --> 00:42:06,000 point, and then I check. 978 00:42:06,000 --> 00:42:08,880 If it's at that point, I'm done, if not, if it's greater 979 00:42:08,880 --> 00:42:11,340 than the value I'm looking for, I either take 980 00:42:11,340 --> 00:42:15,180 one half or the other. 981 00:42:15,180 --> 00:42:15,510 OK.
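For readers without the handout, the description above can be reconstructed roughly as the following Python sketch. It is an assumption-laden reconstruction, not the handout's code: the names `bsearch` and `search1` are guessed from the spoken "b search" and "search 1", the print of `first` and `last` is left out, the empty-list guard is an addition, and the functions return a value instead of printing so they are easy to check.

```python
def bsearch(s, e, first, last):
    """Look for e in the sorted list s between indices first and last (inclusive)."""
    if last - first < 2:
        # At most two elements left: just check them directly.
        return s[first] == e or s[last] == e
    # Start from first and add half the distance to last.
    mid = first + (last - first) // 2
    if s[mid] == e:
        return True
    if s[mid] > e:
        # Middle value is too big, so e can only be in the front half.
        return bsearch(s, e, first, mid - 1)
    # Otherwise e can only be in the upper half.
    return bsearch(s, e, mid + 1, last)


def search1(s, e):
    """Wrapper that starts the search over the whole list."""
    if not s:
        return False  # guard added for this sketch; an empty list has no index 0
    return bsearch(s, e, 0, len(s) - 1)
```

For example, `search1([1, 3, 5, 7, 9], 7)` checks the middle value 5, moves to the upper half, and finds 7 among the last two elements.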
982 00:42:15,510 --> 00:42:17,270 You can see that thing is cutting down the problem in 983 00:42:17,270 --> 00:42:19,380 half each time, which is good, but there's one more thing I 984 00:42:19,380 --> 00:42:20,070 need to deal with. 985 00:42:20,070 --> 00:42:22,250 So let's step through this with a little more care. 986 00:42:22,250 --> 00:42:23,640 And I keep saying, before we do it, let's just 987 00:42:23,640 --> 00:42:24,390 actually try it out. 988 00:42:24,390 --> 00:42:27,110 So I'm going to go over here, and I'm going 989 00:42:27,110 --> 00:42:29,590 to type test search-- 990 00:42:29,590 --> 00:42:33,240 I can type-- and if you look at your handout, it's just a 991 00:42:33,240 --> 00:42:36,130 sequence of tests that I'm going to do. 992 00:42:36,130 --> 00:42:36,740 OK. 993 00:42:36,740 --> 00:42:39,300 So initially, I'm going to set up the list to be the first 994 00:42:39,300 --> 00:42:40,950 million integers. 995 00:42:40,950 --> 00:42:43,140 Yeah, it's kind of simple, but it gives me an ordered list of 996 00:42:43,140 --> 00:42:44,990 these things, and let's run it. 997 00:42:44,990 --> 00:42:46,020 OK. 998 00:42:46,020 --> 00:42:48,310 So I'm first going to look for something that's not in the 999 00:42:48,310 --> 00:42:50,310 list, I'm going to see, is minus 1 in this list, so it's 1000 00:42:50,310 --> 00:42:52,070 going to be at the far end, and if I do that in 1001 00:42:52,070 --> 00:42:54,510 the basic case, bam. 1002 00:42:54,510 --> 00:42:54,790 Done. 1003 00:42:54,790 --> 00:42:56,140 All right? 1004 00:42:56,140 --> 00:42:57,970 The basic, that primary search, because it looks at 1005 00:42:57,970 --> 00:42:59,400 the first element, says it's smaller than 1006 00:42:59,400 --> 00:43:00,850 everything else, I'm done. 1007 00:43:00,850 --> 00:43:06,790 If I look in the binary case, takes a little longer. 1008 00:43:06,790 --> 00:43:07,840 Notice the printout here.
1009 00:43:07,840 --> 00:43:10,280 The printout is simply telling me, what are the 1010 00:43:10,280 --> 00:43:11,230 ranges of the search. 1011 00:43:11,230 --> 00:43:15,030 And you can see it wrapping its way down, cutting in half 1012 00:43:15,030 --> 00:43:16,250 at each time until it gets there, but it 1013 00:43:16,250 --> 00:43:18,250 takes a while to find. 1014 00:43:18,250 --> 00:43:18,520 All right. 1015 00:43:18,520 --> 00:43:23,220 Let's search to see though now if a million is in this list, 1016 00:43:23,220 --> 00:43:25,470 or 10 million, whichever way I did this, it must be a 1017 00:43:25,470 --> 00:43:26,520 million, right? 1018 00:43:26,520 --> 00:43:31,150 In the basic case, oh, took a little while. 1019 00:43:31,150 --> 00:43:35,180 Right, in the binary case, bam. 1020 00:43:35,180 --> 00:43:37,280 In fact, it took the same number of steps as it did in 1021 00:43:37,280 --> 00:43:39,030 the other case, because each time I'm cutting 1022 00:43:39,030 --> 00:43:40,710 it down by a half. 1023 00:43:40,710 --> 00:43:41,040 OK. 1024 00:43:41,040 --> 00:43:42,140 That's nice. 1025 00:43:42,140 --> 00:43:43,930 Now, let's do the following; if you look right here, I'm 1026 00:43:43,930 --> 00:43:45,400 going to set this now to-- 1027 00:43:45,400 --> 00:43:48,780 I'm going to change my range to 10 million, I'm going to 1028 00:43:48,780 --> 00:43:51,070 first say, gee, is a million in there, 1029 00:43:51,070 --> 00:43:54,680 using the basic search. 1030 00:43:54,680 --> 00:43:56,190 It is. 1031 00:43:56,190 --> 00:43:58,280 Now, I'm going to say, is 10 million in this, using the 1032 00:43:58,280 --> 00:44:01,310 basic search. 1033 00:44:01,310 --> 00:44:04,400 We may test your hypothesis, about how long does it take, 1034 00:44:04,400 --> 00:44:06,580 if I time this really well, I ought to be able to end when 1035 00:44:06,580 --> 00:44:10,610 it finds it, which should be right about now. 1036 00:44:10,610 --> 00:44:13,080 That was pure luck. 
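The runs being timed here can also be compared by counting probes instead of watching the clock. The sketch below is an illustration under assumptions, not the handout's test code: `binary_probes` is a hypothetical helper, it uses an iterative binary search rather than the recursive one on screen, and it searches a `range` object so no million-element list has to be built.

```python
def binary_probes(seq, target):
    """Return how many elements a binary search examines while looking for target.

    seq must be sorted and support len() and indexing (a range works).
    """
    lo, hi = 0, len(seq) - 1
    probes = 0
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        probes += 1
        if seq[mid] == target:
            return probes
        if seq[mid] > target:
            hi = mid - 1   # keep the front half
        else:
            lo = mid + 1   # keep the upper half
    return probes


if __name__ == "__main__":
    for n in (10**6, 10**7):
        # -1 is never present, so the search runs until the range is empty.
        print(n, binary_probes(range(n), -1))
```

A linear scan for a missing large value examines all n elements, so ten times the data means ten times the work. The halving search only needs about log2(10), roughly 3.3, extra probes for the same increase.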
1037 00:44:13,080 --> 00:44:14,400 But notice how much longer it took. 1038 00:44:14,400 --> 00:44:16,040 On the other hand, watch what happens with binary. 1039 00:44:16,040 --> 00:44:18,720 Is the partway one there? 1040 00:44:18,720 --> 00:44:19,810 Yeah. 1041 00:44:19,810 --> 00:44:21,780 Is the last one there? 1042 00:44:21,780 --> 00:44:22,190 Wow. 1043 00:44:22,190 --> 00:44:25,480 I think it took one more step. 1044 00:44:25,480 --> 00:44:27,690 Man, that's exactly what logs should do, right? 1045 00:44:27,690 --> 00:44:30,190 I make the problem ten times bigger, it takes one 1046 00:44:30,190 --> 00:44:31,680 more step to do it. 1047 00:44:31,680 --> 00:44:34,020 Whereas in the linear case, I make it ten times bigger, it 1048 00:44:34,020 --> 00:44:36,660 takes ten times longer to run. 1049 00:44:36,660 --> 00:44:38,480 OK. 1050 00:44:38,480 --> 00:44:40,700 So I keep saying I've got one thing hanging, it's the last 1051 00:44:40,700 --> 00:44:42,460 thing I want to do, but I wanted you to see how much of a 1052 00:44:42,460 --> 00:44:43,110 difference this makes. 1053 00:44:43,110 --> 00:44:46,030 But let's look a little more carefully at the code for 1054 00:44:46,030 --> 00:44:48,160 binary search-- for search 1. 1055 00:44:48,160 --> 00:44:50,560 What's the complexity of search 1? 1056 00:44:50,560 --> 00:44:52,000 Well, you might say it's constant, right? 1057 00:44:52,000 --> 00:44:54,140 It's only got two things to do, except what it really says 1058 00:44:54,140 --> 00:44:56,450 is, that the complexity of search 1 is the same as the 1059 00:44:56,450 --> 00:44:58,060 complexity of b search, because that's 1060 00:44:58,060 --> 00:44:59,210 the call it's doing. 1061 00:44:59,210 --> 00:45:00,380 So let's look at b search. 1062 00:45:00,380 --> 00:45:02,030 All right? 1063 00:45:02,030 --> 00:45:05,600 We've got the code for b search up there. 1064 00:45:05,600 --> 00:45:08,450 First step, constant, right?
1065 00:45:08,450 --> 00:45:09,970 Nothing to do. 1066 00:45:09,970 --> 00:45:13,680 Second step, hm. 1067 00:45:13,680 --> 00:45:17,480 That also looks constant, you think? 1068 00:45:17,480 --> 00:45:19,280 Oh but wait a minute. 1069 00:45:19,280 --> 00:45:21,200 I'm accessing s. 1070 00:45:21,200 --> 00:45:25,290 I'm accessing a list. How long does it take for me to get the 1071 00:45:25,290 --> 00:45:28,490 nth element of a list? 1072 00:45:28,490 --> 00:45:31,610 That might not be a primitive step. 1073 00:45:31,610 --> 00:45:35,480 And in fact, it depends on how I store a list. 1074 00:45:35,480 --> 00:45:42,620 So, for example, in this case, I had lists that I knew were 1075 00:45:42,620 --> 00:45:44,620 made out of integers. 1076 00:45:44,620 --> 00:45:50,820 As a consequence, I have a list of ints. 1077 00:45:50,820 --> 00:45:53,400 I might know, for example, that it takes four memory 1078 00:45:53,400 --> 00:45:56,350 chunks to represent one int, just for example. 1079 00:45:56,350 --> 00:46:01,250 And to find the i'th element, I'm simply going to take the 1080 00:46:01,250 --> 00:46:04,240 starting point, that point at the beginning of memory where 1081 00:46:04,240 --> 00:46:10,850 the list is, plus 4 times i, that would tell me how many 1082 00:46:10,850 --> 00:46:13,530 units over to go, and that's the memory location I want to 1083 00:46:13,530 --> 00:46:15,720 look for the i'th element of the list. 1084 00:46:15,720 --> 00:46:17,890 And remember, we said we're going to assume a random 1085 00:46:17,890 --> 00:46:21,730 access model, which says, as long as I know the location, 1086 00:46:21,730 --> 00:46:24,680 it takes a constant amount of time to get to that point. 1087 00:46:24,680 --> 00:46:27,800 So if the-- if I knew the lists were made of just 1088 00:46:27,800 --> 00:46:30,100 integers, it'd be really easy to figure it out. 
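The "start plus 4 times i" arithmetic can be made concrete with a small Python sketch. It is only an illustration under assumptions: a `bytes` buffer stands in for raw memory with the list starting at offset 0, each int is fixed at 4 bytes as in the spoken example, and `pack_ints` and `get_ith` are hypothetical helper names.

```python
import struct

INT_SIZE = 4  # bytes per element, matching the "four memory chunks" example


def pack_ints(values):
    """Lay the ints out back to back, 4 bytes each, little-endian."""
    return struct.pack("<%di" % len(values), *values)


def get_ith(buf, i):
    """Fetch element i by computing its offset directly, with no walking."""
    offset = INT_SIZE * i  # start (0 here) plus 4 times i
    (value,) = struct.unpack_from("<i", buf, offset)
    return value
```

With `buf = pack_ints([10, 20, 30, 40])`, the call `get_ith(buf, 2)` reads the four bytes at offset 8. The work is one multiplication and one read, whatever the value of `i`.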
1089 00:46:30,100 --> 00:46:32,570 Another way of saying it is, this takes constant amount of 1090 00:46:32,570 --> 00:46:34,790 time to figure out where to look, it takes constant amount 1091 00:46:34,790 --> 00:46:37,860 of time to get there, so in fact I could treat indexing 1092 00:46:37,860 --> 00:46:41,630 into a list as being a basic operation. 1093 00:46:41,630 --> 00:46:44,660 But we know lists can be composed of anything. 1094 00:46:44,660 --> 00:46:47,400 Could be ints, could be floats, could be a combination 1095 00:46:47,400 --> 00:46:49,380 of things, some ints, some floats, some lists, some 1096 00:46:49,380 --> 00:46:51,410 strings, some lists of lists, whatever. 1097 00:46:51,410 --> 00:46:58,970 And in that case, in general lists, I need to figure out 1098 00:46:58,970 --> 00:47:01,270 what's the access time. 1099 00:47:01,270 --> 00:47:03,400 And here I've got a choice. 1100 00:47:03,400 --> 00:47:05,660 OK, one of the ways I could do would be the following. 1101 00:47:05,660 --> 00:47:10,350 I could have a pointer to the beginning of the list where 1102 00:47:10,350 --> 00:47:17,550 the first element here is the actual value, and this would 1103 00:47:17,550 --> 00:47:23,810 point to the next element in the list. Or another way of 1104 00:47:23,810 --> 00:47:26,610 saying it is, the first part of the cell could be some 1105 00:47:26,610 --> 00:47:29,070 encoding of how many cells do I need to have to store the 1106 00:47:29,070 --> 00:47:31,690 value, and then I've got some way of telling me where to get 1107 00:47:31,690 --> 00:47:35,320 the next element of the list. And this would point to value, 1108 00:47:35,320 --> 00:47:42,730 and this would point off someplace in memory. 1109 00:47:42,730 --> 00:47:44,480 Here's the problem with that technique, and by the way, a 1110 00:47:44,480 --> 00:47:45,995 number of programming languages use 1111 00:47:45,995 --> 00:47:46,930 this, including Lisp. 
1112 00:47:46,930 --> 00:47:48,630 The problem with that technique, while it's very 1113 00:47:48,630 --> 00:47:51,750 general, is how long does it take me to find the i'th 1114 00:47:51,750 --> 00:47:54,240 element of the list? 1115 00:47:54,240 --> 00:47:55,410 Oh fudge knuckle. 1116 00:47:55,410 --> 00:47:56,090 OK. 1117 00:47:56,090 --> 00:47:58,910 I've got to go to the first place, figure out how far over 1118 00:47:58,910 --> 00:48:01,030 to skip, go to the next place, figure out how far over to 1119 00:48:01,030 --> 00:48:02,660 skip, eventually I'll be out the door. 1120 00:48:02,660 --> 00:48:05,620 I've got to count my way down, which means that the access 1121 00:48:05,620 --> 00:48:09,120 would be linear in the length of the list to find the i'th 1122 00:48:09,120 --> 00:48:11,720 element of the list, and that's going to increase the 1123 00:48:11,720 --> 00:48:13,330 complexity. 1124 00:48:13,330 --> 00:48:15,290 There's an alternative, which is the last point I want to 1125 00:48:15,290 --> 00:48:22,180 make, which is instead what I could do, I should have said 1126 00:48:22,180 --> 00:48:25,510 these things are called linked lists, we'll come back to 1127 00:48:25,510 --> 00:48:29,540 those, another way to do it, is to have the start of the 1128 00:48:29,540 --> 00:48:36,480 list be at some point in memory, and to have each one 1129 00:48:36,480 --> 00:48:39,210 of the successive cells in memory point off 1130 00:48:39,210 --> 00:48:40,340 to the actual value. 1131 00:48:40,340 --> 00:48:43,500 Which may take up some arbitrary amount of memory. 1132 00:48:43,500 --> 00:48:47,880 In that case, I'm back to this problem. 1133 00:48:47,880 --> 00:48:51,570 And as a consequence, access time in the list is constant, 1134 00:48:51,570 --> 00:48:56,290 which is what I want. 
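The difference between the two storage schemes can be sketched in Python. This is an illustrative model, not how any particular interpreter is implemented: `Cell`, `build_linked`, and `linked_get` are hypothetical names, and the hop counter is added only to make the cost visible.

```python
class Cell:
    """One linked-list cell: a value plus a pointer to the next cell."""

    def __init__(self, value, next_cell=None):
        self.value = value
        self.next = next_cell


def build_linked(values):
    """Chain the values together front to back and return the first cell."""
    head = None
    for v in reversed(values):
        head = Cell(v, head)
    return head


def linked_get(head, i):
    """Return (value, hops) for element i, following one pointer per hop."""
    cell, hops = head, 0
    while hops < i:
        if cell is None:
            raise IndexError("index past the end of the linked list")
        cell = cell.next
        hops += 1
    if cell is None:
        raise IndexError("index past the end of the linked list")
    return cell.value, hops
```

Reaching element `i` of the linked version costs `i` hops. In the second scheme the cells sit side by side and each holds a pointer to its value, so a mixed list such as `['a', 2.5, [1, 2], 'z']` can be indexed with the same fixed-size offset arithmetic as a list of ints.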
1135 00:48:56,290 --> 00:48:59,010 Now, to my knowledge, most implementations of Python use 1136 00:48:59,010 --> 00:49:01,665 this way of storing lists, whereas Lisp 1137 00:49:01,665 --> 00:49:02,760 and Scheme do not. 1138 00:49:02,760 --> 00:49:05,950 The message I'm trying to get to here, because I'm running 1139 00:49:05,950 --> 00:49:09,670 you right up against time, is I have to be careful about 1140 00:49:09,670 --> 00:49:11,660 what's a primitive step. 1141 00:49:11,660 --> 00:49:14,460 With this, if I can assume that accessing the i'th 1142 00:49:14,460 --> 00:49:17,670 element of a list is constant, then you can see that the 1143 00:49:17,670 --> 00:49:20,790 rest of that analysis looks just like the log analysis I 1144 00:49:20,790 --> 00:49:23,590 did before, and at each step, no matter which branch I'm 1145 00:49:23,590 --> 00:49:25,710 taking, I'm cutting the problem down in half. 1146 00:49:25,710 --> 00:49:27,250 And as a consequence, it is log. 1147 00:49:27,250 --> 00:49:35,920 And the last piece of this is, as I said, I have to make sure I 1148 00:49:35,920 --> 00:49:39,190 know what my primitive elements are, in terms of 1149 00:49:39,190 --> 00:49:41,090 operations. 1150 00:49:41,090 --> 00:49:44,660 Summary: I want you to recognize different classes of 1151 00:49:44,660 --> 00:49:46,360 algorithms. I'm not going to repeat them. 1152 00:49:46,360 --> 00:49:48,870 We've seen log, we've seen linear, we've seen quadratic, 1153 00:49:48,870 --> 00:49:50,790 we've seen exponential. 1154 00:49:50,790 --> 00:49:53,510 One of the things you should begin to do, is to recognize 1155 00:49:53,510 --> 00:49:57,070 what identifies those classes of algorithms, so you can map 1156 00:49:57,070 --> 00:49:59,420 your problems into those ranges. 1157 00:49:59,420 --> 00:50:01,640 And with that, good luck on Thursday.