The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: I wanted to give everybody a more conceptual idea of what big O notation is, as well as, hopefully, answer any lingering questions you might have about object-oriented programming. So I have these notes, and I typed them up, and they're pretty detailed. So I'm just going to go through some points kind of quickly. Who's still kind of unclear about why we even do big O notation? So who can explain, quickly, why we do big O notation? What is it? Yeah?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Right. Exactly. So big O notation gives us an upper bound on how long something is going to take. Now, something that's important to remember is that it's not a time bound.
So something that's often confusing is that people say, oh, this is something that'll tell us how long our program's going to run. That's actually not the case. Big O notation tells us how many steps something's going to take. And so why is that important? Well, I look at all you guys, and a couple of you have laptops out. Everybody's computer runs things at a different speed, right? But if we say something is big O of n, what we're saying is that the worst-case number of steps your program is going to take is going to be linear with respect to the size of the input. So if my computer is five times faster than your computer, my computer will probably run it five times faster. But on either machine, as the size of the input grows, I'm going to expect a linear increase in the amount of time it's going to take.
So why is that important? At the bottom of page one: with big O notation, we are particularly concerned with the scalability of our functions. So big O notation might not predict what's going to be the fastest for really small inputs, say an array of size 10. You guys know a little bit about graphs, right? Say we have a graph of x-squared, and we have a graph of x-cubed. There's a region where the graph of x-squared is actually bigger than x-cubed. But then all of a sudden, there's a point where x-cubed just goes, whoosh, way bigger than x-squared. So if we're in some really small range of inputs, big O notation might not tell us which is the best function. But with big O notation, we're not concerned about small inputs. We're concerned about really big inputs. We're concerned about filtering the genome. We're concerned about analyzing data from Hubble, really huge blocks of data. So say we're looking at a program that analyzes some segment of the human genome, like three million base pairs, and we have two algorithms. One runs in order n time, and one runs in order n-cubed time.
What this means is, regardless of the machine that we're running on (so this is algorithm 1, this is algorithm 2), we'd expect algorithm 2 to run approximately n-cubed over n, so approximately n-squared times, slower. So with big O notation, you can compare two algorithms by just looking at the ratio of their big O run times. So if I'm looking at something that has an array of size two million as its input, is it clear that this is going to be a much better choice?

OK. So you'll run into that, especially since a lot of you guys are taking this for the purposes of scientific computing. So you'll run into big O notation a lot. It's important to have a grasp of what it means.

On the second page of the handout, I have some common ones that you'll see. The first one is constant time. We denote constant time as order 1. But you'll notice that what I have here is: order 1 is equal to order 10 is equal to order 2 to the 100th. That's unexpected to a lot of people who are learning about big O notation. Why is this true? That seems kind of ridiculous.
This is a really big number. This is a really small number. Yeah?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. Exactly. So we look at a graph of 1 and a graph of 2 to the 100th. OK. We'll see that even though 2 to the 100th is much higher, much bigger than 1, if this is our input size, as the size of our input grows, do we see any change in these two graphs? No. They're completely constant. When you're doing big O notation, if you run across an algorithm that does not depend on the size of the input, it's always going to be order 1. Even if it's like 2 to the 100th steps, if it's a constant number of steps regardless of the size of the input, it's constant time.

Other ones you'll see are logarithmic time. Any base for logarithmic time is about the same order, so order log base 2 of n is order log base 10 of n. This is the fastest time bound for search. Does anybody know what type of search we'd be doing in logarithmic time? Something maybe we--

AUDIENCE: Bisection search?

PROFESSOR: Yeah. Exactly.
Bisection search is logarithmic time, because we take our input, and at every step, we cut it in half, cut it in half, cut it in half, and that's the fastest search we can do.

Order n is linear time. Order n log n is the fastest time bound we have for sort. We'll be talking about sort in a couple of weeks. And order n-squared is quadratic time. Anything that is order n to some constant power, so order n-squared, order n-cubed, order n to the fourth, all of that is going to be less than order something to the power of n. So if we have something that's order 2 to the n, that's ridiculous. That's a computationally very intensive algorithm.

So on page two, I have some questions for you: (1), (2), (3). Does order 100 n-squared equal order n-squared? Who says yes? All right. Very good. How about: does order one-quarter n-cubed equal order n-cubed? Does order n plus order n equal order n? The answer is yes to all of those. And the intuitive sense behind this is that big O notation deals with the limiting behavior of a function. So I made some nifty graphs for you guys to look at.
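The bisection search mentioned above isn't written out in the transcript; a minimal sketch of how it achieves logarithmic time might look like this (a hypothetical reconstruction, not the course's actual handout code):

```python
# A sketch of bisection (binary) search over a sorted list.
# Each pass discards half of the remaining range, so for a list of
# length n the loop runs at most about log2(n) times: O(log n).
def bisection_search(items, target):
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return True
        elif items[mid] < target:
            lo = mid + 1   # discard the lower half
        else:
            hi = mid - 1   # discard the upper half
    return False
```

Note that the list must already be sorted; on unsorted input the halving argument no longer applies.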
When we're comparing order 100 n-squared to n-squared, n-cubed, and 1/4 n-cubed, what people often think of is what I have here in the first figure. So these are the four functions I just mentioned. There's a legend in the top left-hand corner, and the scale of this is up to x equals 80. So you'll see that at this scale, this line right here, 100 x-squared, is on top. This is, I think, often a tripping point: when people are conceptualizing functions, they're saying, well, yeah, 100 x-squared is much bigger than x-cubed, which is a lot bigger than 1/4 x-cubed. For very small inputs, yes, that's true. But what we're concerned about is the behavior as the input gets very, very large. So now, we're looking at a size of up to 1,000. Now we see here that x-cubed, even though it's a little bit smaller than 100 x-squared in the beginning, shoots off. x-cubed is much bigger than either of the two x-squared terms. And even 1/4 x-cubed is becoming bigger than 100 x-squared out at 1,000.
So that's an intuitive sense of why x-cubed, no matter what coefficient is in front of it, is going to dominate any term with x-squared in it: because x-cubed is just going to go, whoosh, real big like that. And if we go out even further, to an input size of 50,000, we see that 100 x-squared versus just x-squared, all right, they're about the same. The x-cubed terms now are way above x-squared. So the two x-squared terms, a coefficient of 100 versus just 1, are about the same. So this is the scale we're concerned about when we're talking about big O notation: the limiting behavior as your input size grows very large. And 50,000 is not even that large, if you think about the size of the genome. I mean, does anybody here do bio? What's the size of the human genome? How many base pairs? Or even one gene or one chromosome?

AUDIENCE: [INAUDIBLE].

PROFESSOR: What's the biggest?

AUDIENCE: It's over 50,000.

PROFESSOR: Yeah, over 50,000. And we're talking about the amount of data that we get back from the Hubble Space Telescope.
I mean, the resolution on those things is absolutely ridiculous. And you run all sorts of algorithms on those images to try and see if there's life in the universe. So we're very concerned about the long-term behavior of these functions.

How about page three? One last question. Does order 100 n-squared plus 1/4 n-cubed equal order n-cubed? Who says yes? And so I have one more graph. Down here, these red dots are 100 x-squared. These blue circles are 1/4 x-cubed. And this line is the sum. We can see that this line is a little bit bigger than the 1/4 x-cubed term, but really, the 100 x-squared has no effect this far out. So that's why we just drop any lower-order terms. Whenever you're approached with a big O expression that has a bunch of constant factors and all sorts of different powers of n and stuff, you're always just going to drop all the constant factors and just pick the biggest thing. So this line right here is order n-cubed. Is that clear to everybody?
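The rule of dropping lower-order terms can also be checked numerically. In this illustrative sketch (not from the course handout), the 100 n-squared term contributes less and less of the total as n grows, so the sum behaves like its n-cubed term:

```python
# Illustrative check: lower-order terms stop mattering as n grows.
def ratio(n):
    """Fraction of n**3 that 100*n**2 + 0.25*n**3 amounts to."""
    return (100 * n**2 + 0.25 * n**3) / n**3

# As n grows, the ratio approaches the leading coefficient 0.25,
# which is why O(100*n**2 + 0.25*n**3) = O(n**3).
for n in (10, 1000, 100000):
    print(n, ratio(n))
```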
So now I've gotten through the basics of how we analyze this and why we're looking at it. Let's look at some code.

So for the first example, all of these things right here: in Python, we make the assumption that statements like x plus 1, x times y, all these mathematical operations, are constant time. That's something you can just assume. So for this function down here, we have a constant time, constant time, constant time, constant time operation. So we'd say this function bar is what? What's its complexity?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. Constant time. So the complexity of all of these functions is just order 1, because it doesn't matter how big the input is. It's all going to run in constant time.

For this multiplication function here, we use a for loop. Oftentimes, when we see a for loop that's just going through the input, that's a signal to us that it's probably going to contribute a factor of O(n). Why is that? What do we do in this for loop? We say for i in range y. What does that mean?
How many times do we execute that for loop? Yeah, y times. So if y is really small, we execute that for loop just a few times. But if y is really large, we execute that for loop a whole bunch of times. So when we're analyzing this, we see this for loop and we say, ah, that for loop must be O(y). Does that make sense to everybody? OK, good.

Let's look at factorial. Can anybody tell me what the complexity of factorial is?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. Order n. Why is it order n?

AUDIENCE: Because it's the same for loop.

PROFESSOR: Yeah. It's the exact same structure. We have a for loop that's going through range 1 to n plus 1, so that's dependent on the size of n. So this for loop is order n. And inside the for loop, we just do a constant time operation. That's the other thing. Just because we have this for loop doesn't mean that what's inside the for loop is going to be constant.
But in this case, order n times, we do a constant time operation. So this whole chunk, the for loop, is order n. The rest of everything else is just constant time. So we have constant time, plus order n times constant time, plus constant time, and that's going to be order n.

How about this one? Factorial 2.

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. Exactly. This is also order n. The only thing that's different in this code is that we initialize this count variable, and inside the for loop, we also increment this count variable. But both "result times equals num" and "count plus equals 1" are constant time operations. So if we do n times 2 constant time operations, that's still going to be order n.

So the takeaway from these two examples is that a single line of code can generate a pretty complex thing, but a collection of lines of code might still be constant time. So you have to look at every line of code and consider that.

I've thrown in some conditionals here.
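The factorial examples themselves aren't reproduced in the transcript; a hypothetical reconstruction consistent with the discussion above (a for loop over range(1, n + 1) doing only constant-time work per pass) might be:

```python
# Hypothetical reconstructions of the two factorial examples discussed.
def factorial(n):
    result = 1
    for num in range(1, n + 1):  # loop runs n times: O(n)
        result *= num            # constant time operation
    return result

def factorial2(n):
    result = 1
    count = 0                    # extra bookkeeping variable
    for num in range(1, n + 1):  # still O(n) iterations
        result *= num            # O(1)
        count += 1               # a second O(1) step per pass: still O(n)
    return result
```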
What's the complexity of this guy?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. This is also linear. What's going on here? We initialize a variable count; that's constant time. We go through each character in a_string. This is linear in the size of a_string. Now we say, if character equal-equal 't'. This character equal-equal 't' is also a constant time operation. That's just asking if this one thing equals this other thing. We're looking at two characters, or we're looking at two numbers. Equal-equal or not-equal is generally a constant time operation. The exception to this might be equality on certain types. Like, if you define a class and you define an equality method in your class, and the equality method of your class is not constant time, then this equal-equal check might not be constant time. But on two characters, equal-equal is constant time. And this is constant time as well. So linear in the size of a_string.

Something that's important when you're doing this, for exams, is that it's a good idea to define what n is before you give the complexity bound.
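The counting function under discussion isn't shown in the transcript; a hypothetical sketch matching the description (a constant-time comparison inside a single loop over the string) would be:

```python
# Hypothetical sketch: linear in n, where n = len(a_string).
def count_ts(a_string):
    count = 0                 # O(1)
    for char in a_string:     # loop body runs len(a_string) times
        if char == 't':       # comparing two characters: O(1)
            count += 1        # O(1)
    return count
```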
So here I'm saying n is equal to the length of a_string. So now I can say this function is order n, and what I'm saying is that it's linear with respect to the size, or the length, of a_string. Because sometimes, like in the one where the inputs were x and y, the running time was only linear in the size of y. So you'd want to define that n is equal to the size of y to say that it was order n. So always be clear. If it's not clear, be sure to explicitly state what n is equal to.

This code's a little more tricky. What's going on here?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. That was perfect. So just to reiterate, the for loop, we know, is linear with respect to the size of a_string. We have to go through every character in a_string. Now, the second part is "if char in b_string". When we're looking at big O notation, we're worried about the worst case complexity, an upper bound. What's the worst case?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah.
If the character is not in b_string, we have to look at every single character in b_string before we can return false. So that is linear. This one single line, "if char in b_string", that one line is linear with respect to the size of b_string. So how do we analyze the complexity of this? I want to be able to touch the screen. We have this for loop. Let's call n the length of a_string. This for loop is executed n times. Every time we execute this for loop, we execute this inner body. And what's the time bound on the inner body? Well, if we let m equal the length of b_string, and we say that this check is order m every time we run it, then we run an order m operation order n times. So the complexity is-- we do something of size m, n times.

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. Just order n times m. So if we execute an order m check order n times, we say this function is order n times m. Does that make sense to everybody? Because you'll see nested for loops.
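The nested membership example isn't reproduced in the transcript; a hypothetical sketch of the pattern just analyzed, with the O(n * m) bound noted, might be:

```python
# Hypothetical sketch of the O(n * m) pattern: for each of the
# n = len(a_string) characters, `char in b_string` may scan all
# m = len(b_string) characters in the worst case.
def count_shared(a_string, b_string):
    count = 0
    for char in a_string:      # executes n times
        if char in b_string:   # worst case: O(m) scan
            count += 1
    return count
```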
Nested for loops are very similar to this.

While loops combine the best of conditionals with the best of for loops, because a while loop has a chance to act like a for loop, but a while loop can also have a conditional. It's actually possible to write a while loop that has a complex conditional and that also executes a number of times. And so you could have one single line of code generating something like an order n-squared complexity.

Let's look at factorial 3. Who can tell me the complexity of factorial 3?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Yeah. It's also linear. It's interesting that factorial is always linear, despite its name. We have constant time operations. How many times is the while loop executed?

AUDIENCE: n times.

PROFESSOR: Yeah, n times. And what's inside the body of the while loop? Constant time operations. So we execute a bunch of constant time operations n times: order n.

How about this char split example? This one's a little tricky, because you're like, well, what's the complexity of len?
In Python, len is actually a constant time operation. This example is crafted such that all of the operations in it are constant time. So appending to a list is constant time, and indexing a string is constant time. So what's the complexity of char split? Constant time?

AUDIENCE: [INAUDIBLE].

PROFESSOR: Who would agree with constant time? And who would say it's linear time? OK, yeah. Very good. It is linear time. That's the correct intuition. We say, while the length of a_string is not equal to the length of the result; those are two constant time operations. Well, then what do we do? We append a value to the result, and then we increment this index. So when is this check going to be equal? This check's going to be equal when the length of the result is equal to the length of a_string. And that's only going to happen after we've gone through the entire a_string and added each of its characters to result. So this is linear with respect to the size of a_string.
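The char split code itself isn't in the transcript; a hypothetical reconstruction consistent with the description (only constant-time operations per pass, but one pass per character) could be:

```python
# Hypothetical reconstruction of char_split: every individual
# operation (len, !=, append, indexing) is O(1), but the loop
# runs len(a_string) times, so the function is linear, not constant.
def char_split(a_string):
    result = []
    index = 0
    while len(a_string) != len(result):  # two O(1) len calls
        result.append(a_string[index])   # O(1) append and O(1) index
        index += 1
    return result
```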
447 00:22:38,520 --> 00:22:42,200 Something that's important to recognize is that not all 448 00:22:42,200 --> 00:22:45,390 string and list operations are constant time. 449 00:22:45,390 --> 00:22:50,110 There's a website here that, first off, says CPython if 450 00:22:50,110 --> 00:22:50,780 you go to it. 451 00:22:50,780 --> 00:22:53,710 CPython just means Python implemented in C, which is 452 00:22:53,710 --> 00:22:56,860 actually what you're running, CPython. 453 00:22:56,860 --> 00:22:59,200 So don't worry about that. 454 00:22:59,200 --> 00:23:01,920 It often lists two time complexities. 455 00:23:01,920 --> 00:23:05,380 It gives the amortized time and the worst case time. 456 00:23:05,380 --> 00:23:07,990 And so if you're looking for big O notation, you don't want 457 00:23:07,990 --> 00:23:09,010 to use the amortized time. 458 00:23:09,010 --> 00:23:12,150 You want to use the worst case time. 459 00:23:12,150 --> 00:23:14,870 And it's important to note that operations like slicing 460 00:23:14,870 --> 00:23:18,250 and copying actually aren't constant time. 461 00:23:18,250 --> 00:23:22,770 If you slice a list or a string, the complexity of that 462 00:23:22,770 --> 00:23:25,870 operation is going to depend on how big your slice is. 463 00:23:25,870 --> 00:23:27,670 Does that make sense? 464 00:23:27,670 --> 00:23:29,690 The way that a slice works is that it walks through 465 00:23:29,690 --> 00:23:33,750 the list until it gets to the start index, and then keeps walking 466 00:23:33,750 --> 00:23:36,330 until the final index, and then copies that and 467 00:23:36,330 --> 00:23:37,870 returns it to you. 468 00:23:37,870 --> 00:23:40,990 So slicing is not constant time. 469 00:23:40,990 --> 00:23:43,150 Copying is similarly not constant time. 470 00:23:47,070 --> 00:23:50,630 For this little snippet of code, this is just 471 00:23:50,630 --> 00:23:51,600 similar to what we-- 472 00:23:51,600 --> 00:23:52,850 yeah? 
473 00:23:52,850 --> 00:23:54,100 AUDIENCE: [INAUDIBLE]. 474 00:24:06,215 --> 00:24:09,140 PROFESSOR: So this is what I was saying. 475 00:24:09,140 --> 00:24:12,420 You want to define what n is. 476 00:24:12,420 --> 00:24:15,800 So we say something like n equals the length of a_string. 477 00:24:15,800 --> 00:24:19,400 And then you can say it's order n. 478 00:24:19,400 --> 00:24:23,180 It's important to define what you're saying the complexity 479 00:24:23,180 --> 00:24:24,430 is related to. 480 00:24:27,070 --> 00:24:31,280 So here, I'm saying if we let n equal the size of z, can 481 00:24:31,280 --> 00:24:33,280 anybody tell me what the complexity of this 482 00:24:33,280 --> 00:24:36,870 snippet of code is? 483 00:24:36,870 --> 00:24:37,285 [UNINTELLIGIBLE]. 484 00:24:37,285 --> 00:24:37,960 AUDIENCE: [INAUDIBLE]. 485 00:24:37,960 --> 00:24:38,840 PROFESSOR: Yeah, precisely. 486 00:24:38,840 --> 00:24:39,750 Order n-squared. 487 00:24:39,750 --> 00:24:40,740 Why? 488 00:24:40,740 --> 00:24:44,360 Well, because we execute this for-i loop 489 00:24:44,360 --> 00:24:47,870 here order n times. 490 00:24:47,870 --> 00:24:50,860 Each time through this for loop, the body of this for 491 00:24:50,860 --> 00:24:53,880 loop is, in fact, another for loop. 492 00:24:53,880 --> 00:24:58,290 So my approach to problems like this is just step back a 493 00:24:58,290 --> 00:25:01,150 minute and ignore the outer loop. 494 00:25:01,150 --> 00:25:02,380 Just concentrate on the inner loop. 495 00:25:02,380 --> 00:25:04,010 What's the runtime of this inner loop? 496 00:25:06,510 --> 00:25:06,740 Yeah. 497 00:25:06,740 --> 00:25:07,620 This is order n. 498 00:25:07,620 --> 00:25:08,830 We go through this. 499 00:25:08,830 --> 00:25:10,760 Now, go to the outer loop. 500 00:25:10,760 --> 00:25:12,630 Just ignore the body since we've already 501 00:25:12,630 --> 00:25:13,550 analyzed the body. 502 00:25:13,550 --> 00:25:14,500 Ignore it. 
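This inner-loop-first counting can be checked mechanically with a minimal sketch. The snippet on the board isn't reproduced in the transcript, so this generic doubly nested loop over z is an assumption that just counts how often the innermost body runs:

```python
# Count how many times the innermost body of a doubly nested
# loop executes, to confirm the order n-squared analysis.
def count_steps(z):
    steps = 0
    for i in z:          # outer loop: order n iterations
        for j in z:      # inner loop: order n iterations each pass
            steps += 1   # constant time body
    return steps

z = list(range(10))      # n = 10
print(count_steps(z))    # 10 * 10 = 100 inner-body executions
```

Doubling n quadruples the count, which is exactly the n-squared growth being described.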
503 00:25:14,500 --> 00:25:17,640 What's the complexity of the outer loop? 504 00:25:17,640 --> 00:25:19,270 Also order n. 505 00:25:19,270 --> 00:25:21,200 So now you can combine the analysis. 506 00:25:21,200 --> 00:25:26,190 You can say for order n times, I execute this body. 507 00:25:26,190 --> 00:25:28,950 This body takes order n time. 508 00:25:28,950 --> 00:25:34,160 So if we execute something that's order n, order n times, that is 509 00:25:34,160 --> 00:25:36,040 order n-squared complexity. 510 00:25:36,040 --> 00:25:39,040 So we just multiply the number of times the outer 511 00:25:39,040 --> 00:25:40,550 loop runs by how long the inner body takes. 512 00:25:40,550 --> 00:25:44,370 And so in this fashion, I could give you now probably a 513 00:25:44,370 --> 00:25:46,240 four- or five-level nested for loop, and you could tell me the 514 00:25:46,240 --> 00:25:47,490 complexity of it. 515 00:25:52,900 --> 00:25:57,550 Harder sometimes to understand is recursion. 516 00:25:57,550 --> 00:26:00,180 I don't know how important it is to understand this because 517 00:26:00,180 --> 00:26:01,820 I've never actually taught this class before. 518 00:26:01,820 --> 00:26:03,440 But Mitch did tell me to go over this. 519 00:26:03,440 --> 00:26:06,500 So I'd like to touch on it. 520 00:26:06,500 --> 00:26:09,080 So consider recursive factorial. 521 00:26:09,080 --> 00:26:10,630 What's the time complexity of this? 522 00:26:10,630 --> 00:26:13,740 How can we figure out the time complexity 523 00:26:13,740 --> 00:26:14,990 of a recursive function? 524 00:26:23,020 --> 00:26:24,950 The way we want to figure out the time complexity of a 525 00:26:24,950 --> 00:26:27,530 recursive function is just to figure out how many times 526 00:26:27,530 --> 00:26:30,570 we're executing said recursive function. 527 00:26:30,570 --> 00:26:35,290 So here I have recursive factorial of n. 528 00:26:35,290 --> 00:26:39,750 When I make a call to this, what do I do? 
529 00:26:39,750 --> 00:26:45,850 I make a call to recursive factorial of n minus 1. 530 00:26:45,850 --> 00:26:47,330 And then what does this do? 531 00:26:47,330 --> 00:26:51,430 This calls recursive factorial on a sub-problem of 532 00:26:51,430 --> 00:26:54,340 size n minus 2. 533 00:26:54,340 --> 00:27:00,180 So oftentimes, when you're dealing with recursive 534 00:27:00,180 --> 00:27:02,620 problems, to figure out the complexity, what you need to 535 00:27:02,620 --> 00:27:06,190 do is figure out how many times you're going to 536 00:27:06,190 --> 00:27:10,040 make a recursive call before a result is returned. 537 00:27:10,040 --> 00:27:12,590 Intuitively, we can start to see a pattern. 538 00:27:12,590 --> 00:27:16,630 We can say, I called on n, and then n minus 1, and then n 539 00:27:16,630 --> 00:27:22,850 minus 2, and I keep calling recursive factorial until n is 540 00:27:22,850 --> 00:27:24,740 less than or equal to 0. 541 00:27:24,740 --> 00:27:27,180 When is n going to be less than or equal to 0? 542 00:27:27,180 --> 00:27:28,740 Well, when I get to n minus n. 543 00:27:31,360 --> 00:27:34,626 So how many calls is that? 544 00:27:34,626 --> 00:27:35,594 AUDIENCE: [INAUDIBLE]. 545 00:27:35,594 --> 00:27:36,195 PROFESSOR: Yeah. 546 00:27:36,195 --> 00:27:38,030 This is n calls. 547 00:27:38,030 --> 00:27:43,530 So it's a good practice to get into being able to draw this 548 00:27:43,530 --> 00:27:46,430 out and work yourself through how many times you're running 549 00:27:46,430 --> 00:27:47,680 the recursion. 550 00:27:47,680 --> 00:27:51,000 And when we see we're making n calls, we can say, oh, this 551 00:27:51,000 --> 00:27:52,250 must be linear in time. 552 00:27:56,720 --> 00:27:58,815 How about this one, this foo function? 553 00:28:06,410 --> 00:28:09,720 This one's a little harder to see. 554 00:28:09,720 --> 00:28:12,330 But what are we doing? 
555 00:28:12,330 --> 00:28:20,480 We call foo on input of size n, which then makes a call to a 556 00:28:20,480 --> 00:28:24,370 sub-problem of size n/2, which makes a call to a sub- 557 00:28:24,370 --> 00:28:34,200 problem of size n/4 and so on, until I make a call to a sub- 558 00:28:34,200 --> 00:28:35,760 problem of some size. 559 00:28:35,760 --> 00:28:38,260 So this is n. 560 00:28:38,260 --> 00:28:40,840 This is n over 2 to the 1st. 561 00:28:40,840 --> 00:28:43,150 This is n over 2-squared. 562 00:28:43,150 --> 00:28:44,350 We start to see a pattern-- 563 00:28:44,350 --> 00:28:47,610 n over 2-squared, over 2-cubed, over 2 to the fourth. 564 00:28:47,610 --> 00:28:50,570 So we're going to keep making calls on a smaller, and 565 00:28:50,570 --> 00:28:52,820 smaller, and smaller sub-problem size. 566 00:28:52,820 --> 00:28:56,770 But instead of decreasing linearly like before, we're decreasing 567 00:28:56,770 --> 00:28:58,100 at an exponential rate. 568 00:29:01,630 --> 00:29:03,470 There's a bunch of different ways to try and work this out 569 00:29:03,470 --> 00:29:04,140 in your head. 570 00:29:04,140 --> 00:29:06,160 I wrote up one possible description. 571 00:29:06,160 --> 00:29:10,330 But when we're decreasing at this exponential rate, what's 572 00:29:10,330 --> 00:29:15,360 going to end up happening is that for a recursive problem 573 00:29:15,360 --> 00:29:21,900 where we make a recursive call on a sub-problem of 574 00:29:21,900 --> 00:29:28,310 size n/b, the complexity of that is always going to be log 575 00:29:28,310 --> 00:29:30,450 base b of n. 576 00:29:30,450 --> 00:29:33,840 So this is just like bisection search, where 577 00:29:33,840 --> 00:29:36,620 we essentially 578 00:29:36,620 --> 00:29:39,450 restrict the problem size by half every time. 579 00:29:39,450 --> 00:29:41,950 And that leads to logarithmic time, actually 580 00:29:41,950 --> 00:29:43,610 log base 2 of n. 
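The call-counting argument can be checked with a minimal sketch. The body of foo isn't reproduced in the transcript, so this version is an assumption that keeps only the recursive structure (a call on a sub-problem of size n/2) and counts how many calls happen before the recursion bottoms out:

```python
# Count the calls made by a recursion that halves its input each
# time; the number of calls grows like log base 2 of n.
def foo(n, calls=0):
    calls += 1
    if n <= 1:                 # base case: problem size has hit 1
        return calls
    return foo(n // 2, calls)  # recurse on half the problem

print(foo(16))    # 16 -> 8 -> 4 -> 2 -> 1: 5 calls, about log2(16) + 1
print(foo(1024))  # 11 calls, about log2(1024) + 1
```

Doubling n adds only one more call, which is the signature of logarithmic growth; replacing n // 2 with n // 6 would give log base 6 behavior instead.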
581 00:29:43,610 --> 00:29:46,540 This problem is also log base 2 of n. 582 00:29:46,540 --> 00:29:54,520 If we change this recursive call from n/2 to n/6, we get a 583 00:29:54,520 --> 00:29:58,590 time complexity of log base 6 of n. 584 00:29:58,590 --> 00:30:00,120 So try and work that through. 585 00:30:00,120 --> 00:30:02,610 You can read this closer later. 586 00:30:02,610 --> 00:30:06,280 Definitely ask me if you need more help on that one. 587 00:30:06,280 --> 00:30:09,460 The last one is how do we deal with the time complexity of something 588 00:30:09,460 --> 00:30:10,710 like Fibonacci? 589 00:30:13,250 --> 00:30:19,260 Fibonacci, fib of n minus 1 plus fib of n minus 2, initially, that 590 00:30:19,260 --> 00:30:20,360 kind of looks linear. 591 00:30:20,360 --> 00:30:20,700 Right? 592 00:30:20,700 --> 00:30:24,960 We just went over the recursive factorial, and it 593 00:30:24,960 --> 00:30:28,520 made a call to a sub-problem of size n minus 1. 594 00:30:28,520 --> 00:30:31,280 And that was linear. 595 00:30:31,280 --> 00:30:33,170 Fibonacci's a little bit different. 596 00:30:33,170 --> 00:30:36,870 If you actually draw it out in a tree, you start to see that at 597 00:30:36,870 --> 00:30:44,090 every level of the tree, we expand the number of calls by 2. 598 00:30:44,090 --> 00:30:47,690 Now imagine this is just for Fibonacci of 6. 599 00:30:47,690 --> 00:30:49,610 Whenever you're doing big O complexity, you want to 600 00:30:49,610 --> 00:30:53,516 imagine putting in 100,000 or 50,000. 601 00:30:53,516 --> 00:30:55,585 And you could imagine how big that tree grows. 602 00:30:58,710 --> 00:31:03,640 Intuitively, the point to see here is that there are going to 603 00:31:03,640 --> 00:31:10,340 be about n levels to get down to 1 from your 604 00:31:10,340 --> 00:31:12,500 initial input of 6. 605 00:31:12,500 --> 00:31:15,150 So to get down to 1 from an initial input of size n is 606 00:31:15,150 --> 00:31:17,260 going to take about n levels. 
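The size of that call tree can be measured directly with a minimal sketch. The handout's exact Fibonacci code isn't shown in the transcript, so the base cases here are assumptions; the sketch counts the total number of calls the naive recursion makes:

```python
# Count every call the naive recursive Fibonacci makes:
# one for this invocation, plus everything the two sub-calls make.
def fib_calls(n):
    if n <= 1:
        return 1  # a base-case call recurses no further
    return 1 + fib_calls(n - 1) + fib_calls(n - 2)

for n in [5, 10, 15]:
    print(n, fib_calls(n))  # 5 -> 15, 10 -> 177, 15 -> 1973
```

The counts grow multiplicatively, not additively, as n increases, which is the exponential blow-up of the call tree being described.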
607 00:31:17,260 --> 00:31:21,790 The branching factor of this tree at each level is 2. 608 00:31:21,790 --> 00:31:26,460 So if we have n levels, and at each level, the 609 00:31:26,460 --> 00:31:30,780 number of calls doubles again, we can say that a loose bound 610 00:31:30,780 --> 00:31:32,840 on the complexity of this is actually 2 to the n. 611 00:31:35,700 --> 00:31:39,450 This is something that's even less intuitive, I think, than 612 00:31:39,450 --> 00:31:41,430 what we did before with the logarithms. 613 00:31:41,430 --> 00:31:43,640 So try and work through it again. 614 00:31:43,640 --> 00:31:45,020 Play with it a little bit. 615 00:31:45,020 --> 00:31:47,370 There's actually a tighter bound on this, which is like 616 00:31:47,370 --> 00:31:51,170 1.62 to the n, which involves a lot more complicated math that you 617 00:31:51,170 --> 00:31:53,130 could look up. 618 00:31:53,130 --> 00:31:56,530 But for the purposes of this class, it's sufficient to say 619 00:31:56,530 --> 00:31:58,610 that Fibonacci is order 2 to the n. 620 00:32:02,900 --> 00:32:07,870 So does that roughly clear up some of the time complexity stuff 621 00:32:07,870 --> 00:32:08,910 for you guys? 622 00:32:08,910 --> 00:32:09,400 OK, awesome. 623 00:32:09,400 --> 00:32:10,320 Does anybody have the time? 624 00:32:10,320 --> 00:32:11,700 I forgot my watch today. 625 00:32:11,700 --> 00:32:12,660 AUDIENCE: 12:42. 626 00:32:12,660 --> 00:32:15,350 PROFESSOR: OK, excellent. 627 00:32:15,350 --> 00:32:17,590 That gives us a little bit of time to talk about 628 00:32:17,590 --> 00:32:18,555 object-oriented programming. 629 00:32:18,555 --> 00:32:21,650 Does anybody have any specific questions about object-oriented 630 00:32:21,650 --> 00:32:24,500 programming? 631 00:32:24,500 --> 00:32:24,990 How about this? 632 00:32:24,990 --> 00:32:27,106 How many of you guys finished the problem set and turned it 633 00:32:27,106 --> 00:32:28,330 in already? 
634 00:32:28,330 --> 00:32:32,890 Or did any of you guys not turn in the problem set yet? 635 00:32:32,890 --> 00:32:36,640 I'll talk loosely about it then, not too specifically. 636 00:32:36,640 --> 00:32:39,120 Does anybody have any questions from, I guess, at 637 00:32:39,120 --> 00:32:40,210 least the first part? 638 00:32:40,210 --> 00:32:43,720 We're making some classes, making some trigger classes. 639 00:32:43,720 --> 00:32:44,593 Yeah? 640 00:32:44,593 --> 00:32:45,843 AUDIENCE: [INAUDIBLE]? 641 00:32:50,251 --> 00:32:51,535 PROFESSOR: Self dot what? 642 00:32:51,535 --> 00:32:52,785 AUDIENCE: [INAUDIBLE]. 643 00:32:56,415 --> 00:32:58,050 PROFESSOR: When we have like self-- 644 00:32:58,050 --> 00:33:00,120 we have like the getter methods. 645 00:33:00,120 --> 00:33:02,505 So what's important about that? 646 00:33:02,505 --> 00:33:04,870 I'll tell you what's important about that. 647 00:33:04,870 --> 00:33:07,600 So we have a class. 648 00:33:07,600 --> 00:33:08,930 Let's say we have a class Person. 649 00:33:16,490 --> 00:33:21,900 So we define our __init__ method to just take a name. 650 00:33:32,230 --> 00:33:34,600 And so now, what the problem set asked you to do was to 651 00:33:34,600 --> 00:33:36,310 define a getter method. 652 00:33:36,310 --> 00:33:44,410 Define a getter method called get_name that 653 00:33:44,410 --> 00:33:45,660 just returns the attribute. 654 00:33:50,690 --> 00:33:52,500 So what's the point of this? 655 00:33:52,500 --> 00:33:58,333 Because I can just say sally equals Person. 656 00:34:07,550 --> 00:34:09,639 So here, I defined a Person named sally. 657 00:34:09,639 --> 00:34:15,580 And I initialized the person with the string 'Sally'. 658 00:34:15,580 --> 00:34:19,730 If I just look at sally.name, that's going to just directly 659 00:34:19,730 --> 00:34:21,630 print the attribute. 660 00:34:21,630 --> 00:34:25,010 So why do we need this get_name function? 
661 00:34:25,010 --> 00:34:27,989 What's the point of this additional getter method? 662 00:34:27,989 --> 00:34:30,590 Does anybody know why that is? 663 00:34:30,590 --> 00:34:31,840 AUDIENCE: [INAUDIBLE]. 664 00:34:34,510 --> 00:34:35,239 PROFESSOR: Right. 665 00:34:35,239 --> 00:34:36,070 So that's what it does. 666 00:34:36,070 --> 00:34:38,380 This get_name does return the attribute name. 667 00:34:38,380 --> 00:34:42,130 But we don't need this method to just look at 668 00:34:42,130 --> 00:34:43,380 the attribute name. 669 00:34:46,070 --> 00:34:47,320 Let's actually code this up. 670 00:34:58,970 --> 00:35:00,220 So we have class Person. 671 00:35:18,760 --> 00:35:22,470 So if we run this code, and over here in the shell, we 672 00:35:22,470 --> 00:35:28,210 define sally equals Person with the name 'Sally'. 673 00:35:31,110 --> 00:35:37,200 If I just print sally.name, it prints the attribute. 674 00:35:37,200 --> 00:35:42,970 So why did I need to provide this getter method called 675 00:35:42,970 --> 00:35:46,660 get_name that does the same thing? 676 00:35:46,660 --> 00:35:47,890 That's the question. 677 00:35:47,890 --> 00:35:51,200 That seems sort of redundant. 678 00:35:51,200 --> 00:35:54,610 But there's actually a pretty big important reason for it. 679 00:35:54,610 --> 00:35:59,200 Let's say we set s_name equal to the attribute sally.name. 680 00:36:02,930 --> 00:36:07,300 If we look at s_name, we see Sally. 681 00:36:07,300 --> 00:36:09,202 Now if I say-- 682 00:36:09,202 --> 00:36:11,830 actually, I'm not sure if this is the correct reasoning. 683 00:36:41,770 --> 00:36:43,970 This is going to be better. 684 00:36:43,970 --> 00:36:49,910 Let's say sally equals a Person 'Sally' 685 00:36:49,910 --> 00:36:51,600 who's taking what? 686 00:36:51,600 --> 00:36:59,560 18.03, 6.05, 11.1. 687 00:36:59,560 --> 00:37:04,460 So now I can look at the attribute classes to show 688 00:37:04,460 --> 00:37:08,020 Sally's classes, which are weird floats. 
689 00:37:08,020 --> 00:37:12,610 And I can also use sally.get_classes to look at 690 00:37:12,610 --> 00:37:15,300 Sally's classes. 691 00:37:15,300 --> 00:37:21,740 If I set a variable s_classes equal to sally.classes, this 692 00:37:21,740 --> 00:37:25,240 binds the variable s_classes to the attribute 693 00:37:25,240 --> 00:37:26,960 sally.classes. 694 00:37:26,960 --> 00:37:38,290 Now if I say s_classes.append 14.01, if I now look at the 695 00:37:38,290 --> 00:37:46,990 attribute sally.classes, it now has 14.01 in it. 696 00:37:46,990 --> 00:37:48,140 This is not safe. 697 00:37:48,140 --> 00:37:49,500 This is not type safe. 698 00:37:49,500 --> 00:37:52,970 Because the reason for that is if you define a class, and you 699 00:37:52,970 --> 00:37:56,310 access the class's attributes directly instead of through a 700 00:37:56,310 --> 00:37:58,910 getter method, you can then do this. 701 00:37:58,910 --> 00:38:01,250 And sometimes, it's accidental. 702 00:38:01,250 --> 00:38:05,080 You'll set some variable equal to some attribute of a class. 703 00:38:05,080 --> 00:38:10,130 Then later on in your code, you'll alter that variable. 704 00:38:10,130 --> 00:38:14,010 But that variable is not a copy of the attribute. 705 00:38:14,010 --> 00:38:17,250 Yes, you can make copies of that attribute and stuff, but 706 00:38:17,250 --> 00:38:20,680 the overall takeaway is that in programming, we try to do 707 00:38:20,680 --> 00:38:22,610 something called defensive programming. 708 00:38:22,610 --> 00:38:23,900 This isn't defensive. 709 00:38:23,900 --> 00:38:29,770 Because it is possible, if you code it incorrectly, to alter 710 00:38:29,770 --> 00:38:33,570 the attribute of the instance of the class. 
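The accidental mutation just described can be reproduced in a minimal sketch. The Person class here is a stand-in reconstructed from the discussion, and the course numbers are the ones mentioned above:

```python
# A minimal Person class to demonstrate the aliasing problem:
# assigning an attribute to a variable does not copy it.
class Person:
    def __init__(self, name, classes):
        self.name = name
        self.classes = classes

sally = Person('Sally', [18.03, 6.05, 11.1])
s_classes = sally.classes   # an alias of the attribute, not a copy
s_classes.append(14.01)     # mutates the list through the alias
print(sally.classes)        # [18.03, 6.05, 11.1, 14.01] -- changed!
```

Both names refer to the same list object, so the append through s_classes silently changes sally's attribute.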
711 00:38:33,570 --> 00:38:36,030 But if we use the getter method-- if instead of 712 00:38:36,030 --> 00:38:38,470 sally.classes, instead of directly accessing the 713 00:38:38,470 --> 00:38:41,160 attribute here, we had set s_classes equal to 714 00:38:41,160 --> 00:38:42,930 sally.get_classes. 715 00:38:42,930 --> 00:38:45,780 And then, we had changed s_classes around. 716 00:38:45,780 --> 00:38:50,180 That wouldn't have happened, because the getter method 717 00:38:50,180 --> 00:38:51,900 does return self.classes. 718 00:38:51,900 --> 00:38:55,830 But the getter doesn't have to hand back 719 00:38:55,830 --> 00:39:00,630 the exact same thing, the reference itself-- 720 00:39:00,630 --> 00:39:02,850 it can return a copy of it. 721 00:39:02,850 --> 00:39:04,075 Does that make sense? 722 00:39:04,075 --> 00:39:04,670 All right. 723 00:39:04,670 --> 00:39:05,730 Cool. 724 00:39:05,730 --> 00:39:07,360 Other questions about classes? 725 00:39:07,360 --> 00:39:09,770 We have a little class up here if there's like some basic 726 00:39:09,770 --> 00:39:12,820 stuff that you'd like explained again. 727 00:39:12,820 --> 00:39:14,050 Now's the time. 728 00:39:14,050 --> 00:39:15,300 AUDIENCE: [INAUDIBLE]. 729 00:39:20,290 --> 00:39:23,900 PROFESSOR: So here, I'm setting just some variable 730 00:39:23,900 --> 00:39:28,620 s_classes equal to the attribute sally.classes. 731 00:39:28,620 --> 00:39:31,800 It's just like setting any sort of variable equal to some 732 00:39:31,800 --> 00:39:32,370 other quantity. 733 00:39:32,370 --> 00:39:35,230 AUDIENCE: So you appended the variable, but it also appended 734 00:39:35,230 --> 00:39:38,282 like the attribute of Sally? 735 00:39:38,282 --> 00:39:42,540 PROFESSOR: So what I did here was I set the variable 736 00:39:42,540 --> 00:39:46,710 s_classes equal to this attribute sally.classes. 
737 00:39:46,710 --> 00:39:49,730 And then, because I know this is a list, I appended another 738 00:39:49,730 --> 00:39:51,550 value to it. 739 00:39:51,550 --> 00:39:54,440 But this is the same as when we have two lists. 740 00:39:54,440 --> 00:39:58,500 If we have a list called a, and we say a is equal to 1, 2, 741 00:39:58,500 --> 00:40:02,440 3, then I say b is equal to a. 742 00:40:02,440 --> 00:40:04,894 What is b? 743 00:40:04,894 --> 00:40:12,910 Now if I say b.append 14.01, what does b look like? 744 00:40:12,910 --> 00:40:15,500 What does a look like? 745 00:40:15,500 --> 00:40:17,430 Because they're aliases of each other. 746 00:40:17,430 --> 00:40:20,900 So what I did here, when I set s_classes directly equal to 747 00:40:20,900 --> 00:40:24,170 the attribute sally.classes, I made s_classes an 748 00:40:24,170 --> 00:40:26,180 alias of the attribute. 749 00:40:26,180 --> 00:40:30,440 But the problem with that is that then I can change them. 750 00:40:30,440 --> 00:40:31,770 And because they're aliases, the 751 00:40:31,770 --> 00:40:33,720 attribute itself has changed. 752 00:40:33,720 --> 00:40:36,100 And we don't want to do that in object-oriented 753 00:40:36,100 --> 00:40:36,680 programming. 754 00:40:36,680 --> 00:40:38,300 When we define an object, 755 00:40:38,300 --> 00:40:41,520 the only way you should be able to change an attribute is 756 00:40:41,520 --> 00:40:44,330 through some method of the class that allows you to 757 00:40:44,330 --> 00:40:46,760 change that attribute. 758 00:40:46,760 --> 00:40:49,970 So if I want to be able to add a class to Sally's class 759 00:40:49,970 --> 00:40:59,700 list, I should define a method called add_class 760 00:40:59,700 --> 00:41:04,260 that does self.classes.append(new_class). 
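A defensive version of the class can be sketched like this. The class body is an assumption reconstructed from the discussion, with one caveat worth flagging: simply writing `return self.classes` in a getter hands back the very same list object, so for the getter to actually protect the attribute it has to build the copy explicitly:

```python
# A defensive Person class: reads go through a getter that returns
# an explicit copy, and changes go through a dedicated method.
class Person:
    def __init__(self, name, classes):
        self.name = name
        self.classes = classes

    def get_classes(self):
        # list() builds a new list; callers can't mutate the
        # attribute through the value they get back.
        return list(self.classes)

    def add_class(self, new_class):
        # The sanctioned way to change the attribute.
        self.classes.append(new_class)

sally = Person('Sally', [18.03, 6.05, 11.1])
s_classes = sally.get_classes()
s_classes.append(14.01)     # only the copy changes
print(sally.classes)        # still [18.03, 6.05, 11.1]
sally.add_class(14.01)      # now the attribute itself is updated
```

With this design, mutating what the getter returned leaves the instance untouched, and every change to the attribute is visible as an explicit method call.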
761 00:41:08,280 --> 00:41:11,740 While technically, it's possible to directly access an 762 00:41:11,740 --> 00:41:15,540 attribute, it's really bad practice to do so, simply 763 00:41:15,540 --> 00:41:18,260 because this unexpected behavior can result. 764 00:41:18,260 --> 00:41:21,340 And also because if you say, oh, well, it's not going to 765 00:41:21,340 --> 00:41:23,240 matter for this one time, I'll remember how to 766 00:41:23,240 --> 00:41:24,420 do the right thing. 767 00:41:24,420 --> 00:41:26,750 The problem with that is it's often the case that you're not 768 00:41:26,750 --> 00:41:29,110 the only person using your code. 769 00:41:29,110 --> 00:41:32,320 So it's a better practice to provide all of the 770 00:41:32,320 --> 00:41:35,230 methods that you would need in order to 771 00:41:35,230 --> 00:41:38,570 access and change attributes as 772 00:41:38,570 --> 00:41:41,930 methods within the class. 773 00:41:41,930 --> 00:41:43,180 Does that make sense? 774 00:41:46,060 --> 00:41:48,380 So yeah, this is maybe our one violation, if you guys have 775 00:41:48,380 --> 00:41:50,990 been attending my recitation, of 776 00:41:50,990 --> 00:41:53,910 our mantra that programmers are lazy. 777 00:41:53,910 --> 00:41:56,320 This is less lazy than just directly accessing the 778 00:41:56,320 --> 00:41:57,240 attributes. 779 00:41:57,240 --> 00:41:59,690 But even though we know that programmers are super, super 780 00:41:59,690 --> 00:42:02,950 lazy, programmers also like to be super, super safe. 781 00:42:02,950 --> 00:42:05,580 So when there's a trade off between defensive programming 782 00:42:05,580 --> 00:42:07,780 and being lazy, always pick defensive programming.