1 00:00:00,790 --> 00:00:03,190 The following content is provided under a Creative 2 00:00:03,190 --> 00:00:04,730 Commons license. 3 00:00:04,730 --> 00:00:07,030 Your support will help MIT OpenCourseWare 4 00:00:07,030 --> 00:00:11,390 continue to offer high quality educational resources for free. 5 00:00:11,390 --> 00:00:13,990 To make a donation or view additional materials 6 00:00:13,990 --> 00:00:17,880 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,880 --> 00:00:18,850 at ocw.mit.edu. 8 00:00:31,860 --> 00:00:34,800 ERIC GRIMSON: Ladies and gentlemen, I'd 9 00:00:34,800 --> 00:00:37,330 like to get started. 10 00:00:37,330 --> 00:00:38,490 My name's Eric Grimson. 11 00:00:38,490 --> 00:00:41,280 I have the privilege of serving as MIT's chancellor 12 00:00:41,280 --> 00:00:44,470 for academic advancement, you can go look up what that means, 13 00:00:44,470 --> 00:00:48,270 and like John I'm a former head of course six. 14 00:00:48,270 --> 00:00:49,860 This term, with Ana and John, I'm 15 00:00:49,860 --> 00:00:53,070 going to be splitting the lectures, so I'm up today. 16 00:00:53,070 --> 00:00:58,110 OK, last time Ana introduced the first of the compound data 17 00:00:58,110 --> 00:01:01,170 types, tuples and lists. 18 00:01:01,170 --> 00:01:03,700 She showed lots of ways of manipulating them, 19 00:01:03,700 --> 00:01:06,754 lots of built-in things for manipulating those structures. 20 00:01:06,754 --> 00:01:08,670 And the key difference between the two of them 21 00:01:08,670 --> 00:01:11,910 was that tuples were immutable, meaning you could not 22 00:01:11,910 --> 00:01:14,310 change them, lists were mutable, they 23 00:01:14,310 --> 00:01:16,290 could be changed, or mutated. 24 00:01:16,290 --> 00:01:19,320 And that led to both some nice power and some opportunities 25 00:01:19,320 --> 00:01:20,410 for challenges. 
26 00:01:20,410 --> 00:01:23,370 And, in particular, she showed you things like aliasing, 27 00:01:23,370 --> 00:01:25,740 where you could have two names pointing to the same list 28 00:01:25,740 --> 00:01:27,360 structure, and because of that, you 29 00:01:27,360 --> 00:01:29,520 could change the contents of one, 30 00:01:29,520 --> 00:01:32,940 it would change the appearance of the contents of the other, 31 00:01:32,940 --> 00:01:34,890 and that leads to some nice challenges. 32 00:01:34,890 --> 00:01:36,420 So the side effects of mutability 33 00:01:36,420 --> 00:01:38,795 are one of the things you're going to see, both as a plus 34 00:01:38,795 --> 00:01:42,360 and minus, as we go through the course. 35 00:01:42,360 --> 00:01:45,507 Today we're going to take a different direction 36 00:01:45,507 --> 00:01:47,840 for a little while, we're going to talk about recursion. 37 00:01:47,840 --> 00:01:49,950 It is a powerful and wonderful tool 38 00:01:49,950 --> 00:01:52,950 for solving computational problems. 39 00:01:52,950 --> 00:01:56,460 We're then going to look at another kind of compound data 40 00:01:56,460 --> 00:01:59,210 structure, a dictionary, which is also mutable. 41 00:01:59,210 --> 00:02:01,710 And then we're going to put the two pieces together and show 42 00:02:01,710 --> 00:02:03,900 how together they actually give you a lot of power 43 00:02:03,900 --> 00:02:08,500 for solving some really neat problems very effectively. 44 00:02:08,500 --> 00:02:11,610 But I want to start with recursion. 45 00:02:11,610 --> 00:02:13,950 Perhaps one of the most mysterious, at least according 46 00:02:13,950 --> 00:02:16,500 to programmers, concepts in computer science, one that 47 00:02:16,500 --> 00:02:18,840 leads to lots of really bad computer science jokes, 48 00:02:18,840 --> 00:02:21,090 actually all computer science jokes are bad, 49 00:02:21,090 --> 00:02:23,430 but these are particularly bad. 
50 00:02:23,430 --> 00:02:27,390 So let's start with the obvious question, what is recursion? 51 00:02:27,390 --> 00:02:31,260 If you go to the ultimate source of knowledge, Wikipedia, 52 00:02:31,260 --> 00:02:34,350 you get something that says, in essence, recursion 53 00:02:34,350 --> 00:02:39,710 is the process of repeating items in a self-similar way. 54 00:02:39,710 --> 00:02:41,796 Well that's really helpful, right? 55 00:02:41,796 --> 00:02:43,920 But we're going to see that idea because recursion, 56 00:02:43,920 --> 00:02:45,336 as we're going to see in a second, 57 00:02:45,336 --> 00:02:50,250 is the idea of taking a problem and reducing it to a smaller 58 00:02:50,250 --> 00:02:53,310 version of the same problem, and using that idea 59 00:02:53,310 --> 00:02:56,310 to actually tackle a bunch of really interesting problems. 60 00:02:56,310 --> 00:02:58,142 But recursion gets used in a lot of places. 61 00:02:58,142 --> 00:02:59,850 So it's this idea of using, or repeating, 62 00:02:59,850 --> 00:03:01,800 the idea multiple times. 63 00:03:01,800 --> 00:03:05,700 So wouldn't it be great if your 3D printer printed 3D printers? 64 00:03:05,700 --> 00:03:09,210 And you could just keep doing that all the way along. 65 00:03:09,210 --> 00:03:10,740 Or one that's a little more common, 66 00:03:10,740 --> 00:03:12,198 it's actually got a wonderful name, 67 00:03:12,198 --> 00:03:14,970 it's called mise en abyme, in art, sometimes referred 68 00:03:14,970 --> 00:03:17,142 to as the Droste effect, pictures 69 00:03:17,142 --> 00:03:19,350 that have inside them a picture of the picture, which 70 00:03:19,350 --> 00:03:22,290 has inside them a picture of the picture, and you get the idea. 71 00:03:22,290 --> 00:03:23,790 And of course, one of the things you 72 00:03:23,790 --> 00:03:25,414 want to think about in recursion is not 73 00:03:25,414 --> 00:03:27,382 to have it go on infinitely. 
74 00:03:27,382 --> 00:03:29,715 And yes there are even light bulb jokes about recursion, 75 00:03:29,715 --> 00:03:31,339 if you can't read it, it says, how many 76 00:03:31,339 --> 00:03:34,440 twists does it take to screw in a light bulb? 77 00:03:34,440 --> 00:03:38,610 And it says, if it's already screwed in, the answer is 0. 78 00:03:38,610 --> 00:03:42,780 Otherwise, twist it once, ask me again, add 1 to my answer. 79 00:03:42,780 --> 00:03:46,470 And that's actually a nice description of recursion. 80 00:03:46,470 --> 00:03:48,150 So let's look at it more seriously. 81 00:03:48,150 --> 00:03:50,700 What is recursion? 82 00:03:50,700 --> 00:03:55,480 I want to describe it both abstractly, or algorithmically, 83 00:03:55,480 --> 00:03:59,850 and semantically or, if you like, in terms of programming. 84 00:03:59,850 --> 00:04:02,670 Abstractly, this is a great instance of something 85 00:04:02,670 --> 00:04:05,770 often called divide-and-conquer, or sometimes called 86 00:04:05,770 --> 00:04:07,720 decrease-and-conquer. 87 00:04:07,720 --> 00:04:09,510 And the idea of recursion is, I want 88 00:04:09,510 --> 00:04:13,080 to take a problem I'm trying to solve and say, how could I 89 00:04:13,080 --> 00:04:17,149 reduce it to a simpler version of the same problem, 90 00:04:17,149 --> 00:04:19,194 plus some things I know how to do? 91 00:04:19,194 --> 00:04:20,610 And then that simpler version, I'm 92 00:04:20,610 --> 00:04:21,984 going to reduce it again and keep 93 00:04:21,984 --> 00:04:24,957 doing that until I get down to a simple case 94 00:04:24,957 --> 00:04:26,040 that I can solve directly. 95 00:04:26,040 --> 00:04:29,700 That is how we're going to think about designing solutions 96 00:04:29,700 --> 00:04:31,710 to problems. 
97 00:04:31,710 --> 00:04:35,160 Semantically, this is typically going to lead to the case 98 00:04:35,160 --> 00:04:38,220 where a program, a definition of a function, 99 00:04:38,220 --> 00:04:41,340 will refer to itself in its body. 100 00:04:41,340 --> 00:04:44,719 It will call itself inside its body. 101 00:04:44,719 --> 00:04:47,010 Now, if you remember your high school geometry teacher, 102 00:04:47,010 --> 00:04:48,810 she probably would rap your knuckles, which you're not 103 00:04:48,810 --> 00:04:50,982 allowed to do, because in things like geometry 104 00:04:50,982 --> 00:04:53,190 you can't define something in terms of itself, right? 105 00:04:53,190 --> 00:04:54,720 That's not allowed. 106 00:04:54,720 --> 00:04:56,460 In recursion, this is OK. 107 00:04:56,460 --> 00:05:00,990 Our definition of a procedure can in its body call itself, 108 00:05:00,990 --> 00:05:03,840 so long as I have what I call a base case, 109 00:05:03,840 --> 00:05:08,100 a way of stopping that unwinding of the problems, 110 00:05:08,100 --> 00:05:10,645 when I get to something I can solve directly. 111 00:05:10,645 --> 00:05:13,020 And so what we're going to do is avoid infinite recursion 112 00:05:13,020 --> 00:05:15,210 by ensuring that we have one or more base 113 00:05:15,210 --> 00:05:17,070 cases that are easy to solve. 114 00:05:17,070 --> 00:05:18,570 And then the basic idea is I just 115 00:05:18,570 --> 00:05:20,910 want to solve the same problem on some simpler 116 00:05:20,910 --> 00:05:23,610 input with the idea of using that solution 117 00:05:23,610 --> 00:05:25,870 to solve the larger problem. 118 00:05:25,870 --> 00:05:27,867 OK, let's look at an example, and to set 119 00:05:27,867 --> 00:05:30,450 the stage I'm going to go back to something you've been doing, 120 00:05:30,450 --> 00:05:32,100 iterative algorithms. 
121 00:05:32,100 --> 00:05:34,170 For loops, while loops, they naturally 122 00:05:34,170 --> 00:05:36,480 lead to what we would call iterative algorithms, 123 00:05:36,480 --> 00:05:38,520 and these algorithms can be described 124 00:05:38,520 --> 00:05:42,510 as being captured by a set of state variables, 125 00:05:42,510 --> 00:05:45,630 meaning one or more variables that tell us exactly 126 00:05:45,630 --> 00:05:47,890 the state of the computation. 127 00:05:47,890 --> 00:05:50,747 That's a lot of words, let's look at an example. 128 00:05:50,747 --> 00:05:52,330 I know it's trivial, but bear with me. 129 00:05:52,330 --> 00:05:54,420 Suppose I want to do integer multiplication, 130 00:05:54,420 --> 00:05:56,820 multiply two integers together, and all 131 00:05:56,820 --> 00:05:59,080 I have available to me is addition. 132 00:05:59,080 --> 00:06:03,620 So a times b is the same as adding a to itself b times. 133 00:06:03,620 --> 00:06:05,940 If I'm thinking about this iteratively, 134 00:06:05,940 --> 00:06:10,080 I could capture this computation with two state variables. 135 00:06:10,080 --> 00:06:13,200 One we'd just call the iteration number, 136 00:06:13,200 --> 00:06:15,420 and it would be something, for example, 137 00:06:15,420 --> 00:06:19,420 that starts at b, and each time through the loop reduces by 1. 138 00:06:19,420 --> 00:06:19,920 One. 139 00:06:19,920 --> 00:06:22,080 And it will keep doing that until I've counted down 140 00:06:22,080 --> 00:06:24,690 b times, and I get down to 0. 141 00:06:24,690 --> 00:06:26,130 And at the same time, I would have 142 00:06:26,130 --> 00:06:28,110 some value of the computation, I might call it 143 00:06:28,110 --> 00:06:32,520 result, which starts at 0, first time through adds an a, 144 00:06:32,520 --> 00:06:34,320 next time through adds an a, and it just 145 00:06:34,320 --> 00:06:36,240 keeps track of how many things 146 00:06:36,240 --> 00:06:38,639 I have added up, until I get done. 
147 00:06:38,639 --> 00:06:40,305 And yeah, I know you could just do mult, 148 00:06:40,305 --> 00:06:41,850 but this is trying to get this idea 149 00:06:41,850 --> 00:06:45,160 of, how would I do this iteratively. 150 00:06:45,160 --> 00:06:50,160 So I might start off with i, saying there are b things still 151 00:06:50,160 --> 00:06:51,360 to add, and the result is 0. 152 00:06:51,360 --> 00:06:55,320 The first time through the loop, I add an a, reduce i by 1. 153 00:06:55,320 --> 00:06:57,510 Next time through the loop, I add in another a, 154 00:06:57,510 --> 00:06:59,680 reduce i by 1, and you get the idea. 155 00:06:59,680 --> 00:07:01,980 I just walk down it until, eventually, I get 156 00:07:01,980 --> 00:07:05,680 to the end of this computation. 157 00:07:05,680 --> 00:07:07,690 So we could write code for this, and, actually, 158 00:07:07,690 --> 00:07:09,660 it should be pretty straightforward. 159 00:07:09,660 --> 00:07:11,170 There it is. 160 00:07:11,170 --> 00:07:15,254 Going to call it mult_iter, takes in two arguments a and b, 161 00:07:15,254 --> 00:07:17,170 and I'm going to capture exactly that process. 162 00:07:17,170 --> 00:07:19,330 So notice what I do, I set up result 163 00:07:19,330 --> 00:07:20,950 internally as just a little variable 164 00:07:20,950 --> 00:07:23,050 I'm going to use to accumulate things. 165 00:07:23,050 --> 00:07:25,880 And then, there is the iteration, 166 00:07:25,880 --> 00:07:29,320 as long as b is greater than 0 what do I do? 167 00:07:29,320 --> 00:07:33,630 Add a to result, store it away, reduce b by 1, 168 00:07:33,630 --> 00:07:35,470 and I'll keep doing that until b gets 169 00:07:35,470 --> 00:07:37,600 down to being equal to 0, in which case 170 00:07:37,600 --> 00:07:43,610 I just return the result. OK, simple solution. 171 00:07:43,610 --> 00:07:47,770 Now, let's think about this a different way. 
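The slide itself is not part of this transcript, so the following is only a sketch of the loop as it is described above. The name mult_iter and the variable result come from the lecture; the comments and the sample call are added for illustration, and b is assumed to be a non-negative integer.

```python
def mult_iter(a, b):
    """Multiply a by a non-negative integer b using only addition."""
    result = 0            # state variable: running total of the a's added so far
    while b > 0:          # state variable: b counts how many additions remain
        result += a       # add one more a
        b -= 1            # one fewer addition left to do
    return result


print(mult_iter(3, 4))    # 12
```

The two state variables, result and b, fully describe where the computation is at any point in the loop.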
172 00:07:47,770 --> 00:07:51,220 A times b is just adding a to itself b times, 173 00:07:51,220 --> 00:07:57,920 and that's the same as a plus adding a to itself 174 00:07:57,920 --> 00:08:00,710 b minus 1 times. 175 00:08:00,710 --> 00:08:02,780 OK, that sounds like legerdemain to me, 176 00:08:02,780 --> 00:08:04,820 that sounds like just playing with words. 177 00:08:04,820 --> 00:08:08,200 But it's really important, because what is this? 178 00:08:08,200 --> 00:08:12,870 Ah, that's just a times b minus 1, 179 00:08:12,870 --> 00:08:15,760 by the definition of the top point. 180 00:08:15,760 --> 00:08:17,500 And I know you're totally impressed, 181 00:08:17,500 --> 00:08:20,420 but this is actually really cool, because what have I done? 182 00:08:20,420 --> 00:08:24,180 I've taken one problem, this one up here, 183 00:08:24,180 --> 00:08:27,870 and I've reduced it to a simpler version of the same problem, 184 00:08:27,870 --> 00:08:30,020 plus some things I know how to do. 185 00:08:30,020 --> 00:08:31,760 And how would I solve this? 186 00:08:31,760 --> 00:08:36,110 Same trick, that's a plus a times b minus 2, 187 00:08:36,110 --> 00:08:37,730 I would just unwrap it one more time, 188 00:08:37,730 --> 00:08:40,580 and I would just keep doing that until I get down 189 00:08:40,580 --> 00:08:43,470 to something I can solve directly, a base case. 190 00:08:43,470 --> 00:08:47,240 And that's easy, when b is equal to 1, the answer is just a. 191 00:08:47,240 --> 00:08:51,730 Or I could do when b is equal to 0 the answer is just 0. 192 00:08:51,730 --> 00:08:54,270 And there's code to capture that. 193 00:08:54,270 --> 00:08:57,510 Different form, wonderful compact description, 194 00:08:57,510 --> 00:08:58,270 what does it say? 195 00:08:58,270 --> 00:09:03,470 It says, if I'm at the base case, if b is equal to 1, 196 00:09:03,470 --> 00:09:05,560 the answer is just a. 
197 00:09:05,560 --> 00:09:09,840 Otherwise, I'm going to solve the same problem with a smaller 198 00:09:09,840 --> 00:09:14,760 version and add it to a and return that result. 199 00:09:14,760 --> 00:09:18,970 And that's a nice, crisp characterization of a problem. 200 00:09:18,970 --> 00:09:21,400 Recursive definition that reduces a problem to a simpler 201 00:09:21,400 --> 00:09:24,430 version of the same problem. 202 00:09:24,430 --> 00:09:28,260 OK, let's look at another example. 203 00:09:28,260 --> 00:09:30,720 Classic problem in recursion is to compute factorial, 204 00:09:30,720 --> 00:09:32,700 right? n factorial, or n bang if you 205 00:09:32,700 --> 00:09:35,580 like, n exclamation point is n times n minus 1, 206 00:09:35,580 --> 00:09:36,550 all the way down to 1. 207 00:09:36,550 --> 00:09:38,640 So it's the product of all the integers from 1 208 00:09:38,640 --> 00:09:41,900 up to n assuming n is a positive integer. 209 00:09:41,900 --> 00:09:43,430 So we can ask the same question if I 210 00:09:43,430 --> 00:09:47,690 wanted to solve this recursively what would the base case be? 211 00:09:47,690 --> 00:09:52,376 Well, when n is equal to 1, it's just 1. 212 00:09:52,376 --> 00:09:55,270 In the recursive case, well, n times n minus 1 213 00:09:55,270 --> 00:09:58,720 all the way down to 1, that's the same as n times 214 00:09:58,720 --> 00:10:02,000 n minus 1 factorial. 215 00:10:02,000 --> 00:10:06,210 So I can easily write out the base case, 216 00:10:06,210 --> 00:10:08,790 and I've got a nice recursive solution to this problem. 217 00:10:11,990 --> 00:10:14,839 OK, if you're like me and this is the first time you've 218 00:10:14,839 --> 00:10:16,630 seen it, it feels like I've taken your head 219 00:10:16,630 --> 00:10:18,340 and twisted it about 180 degrees. 
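The recursive multiplication described above can be sketched as follows. The slide is not in the transcript, so the function name mult is an assumption; the base case b == 1 is the one named in the lecture, which means b is assumed to be a positive integer.

```python
def mult(a, b):
    """Multiply a by a positive integer b using only addition (recursive sketch)."""
    if b == 1:                      # base case: a * 1 is just a
        return a
    return a + mult(a, b - 1)       # recursive step: a * b = a + a * (b - 1)


print(mult(3, 4))                   # 12
```

Calling this sketch with b less than 1 would never reach the base case, which is why the assumption about b matters.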
220 00:10:18,340 --> 00:10:20,962 I'm going to take it another 180 degrees because you might 221 00:10:20,962 --> 00:10:22,420 be saying, well, wait a minute, how 222 00:10:22,420 --> 00:10:24,080 do you know it really stops. 223 00:10:24,080 --> 00:10:27,970 How do you know it really terminates the computation? 224 00:10:27,970 --> 00:10:29,410 So let's look at it. 225 00:10:29,410 --> 00:10:33,590 There is my definition for fact, short for factorial. 226 00:10:33,590 --> 00:10:36,200 Fact of n is, if n is equal to 1 return 1, 227 00:10:36,200 --> 00:10:39,900 otherwise return n times fact of n minus 1. 228 00:10:39,900 --> 00:10:41,990 And let's use the tools that Ana talked about, 229 00:10:41,990 --> 00:10:44,210 in terms of an environment and a scope, 230 00:10:44,210 --> 00:10:46,640 and think about what happens here. 231 00:10:46,640 --> 00:10:49,070 So when I read that in or I evaluate that in Python, it 232 00:10:49,070 --> 00:10:53,880 creates a definition that binds the name fact to some code, 233 00:10:53,880 --> 00:10:55,940 just all of that stuff over here plus the name 234 00:10:55,940 --> 00:11:00,050 for the formal parameter, hasn't done anything with it yet. 235 00:11:00,050 --> 00:11:04,350 And then I'm going to evaluate print of fact of 4. 236 00:11:04,350 --> 00:11:07,696 Print needs a value, so it has to get the value of fact of 4, 237 00:11:07,696 --> 00:11:08,820 and we know what that does. 238 00:11:08,820 --> 00:11:13,860 It looks up fact, there it is, its procedure definition. 239 00:11:13,860 --> 00:11:15,990 So it creates a new frame, a new environment, 240 00:11:15,990 --> 00:11:19,290 it calls that procedure, and inside that frame 241 00:11:19,290 --> 00:11:24,310 the formal parameter for fact is bound to the value passed in. 242 00:11:24,310 --> 00:11:26,590 So n is bound to 4. 
243 00:11:26,590 --> 00:11:29,320 That frame is scoped by this global frame 244 00:11:29,320 --> 00:11:31,917 meaning it's going to inherit things in the global frame. 245 00:11:31,917 --> 00:11:32,750 And what does it do? 246 00:11:32,750 --> 00:11:38,420 It says, inside of this frame evaluate the body of fact. 247 00:11:38,420 --> 00:11:41,330 OK, so it says is n equal to 1? 248 00:11:41,330 --> 00:11:43,770 Nope, it's not, it's 4. 249 00:11:43,770 --> 00:11:46,230 So in that case, it goes to the else statement and says, 250 00:11:46,230 --> 00:11:51,000 oh, return n times fact of n minus 1, and n is 4, so that 251 00:11:51,000 --> 00:11:56,530 says I need to return 4 times fact of 3. 252 00:11:56,530 --> 00:11:59,732 4 is easy, multiplication is easy, fact of 3, 253 00:11:59,732 --> 00:12:02,080 ah yes, I look up fact. 254 00:12:02,080 --> 00:12:04,210 Now I'm in this frame, I don't see fact there, 255 00:12:04,210 --> 00:12:05,510 but I go up to that frame. 256 00:12:05,510 --> 00:12:07,630 There's the definition for fact, and we're 257 00:12:07,630 --> 00:12:09,280 going to do the rest of this a little more quickly, 258 00:12:09,280 --> 00:12:10,030 what does that do? 259 00:12:10,030 --> 00:12:13,600 It creates a new frame called by fact. 260 00:12:13,600 --> 00:12:17,970 And the argument passed in for n is n minus 1, 261 00:12:17,970 --> 00:12:19,420 that value, right there, of 3. 262 00:12:19,420 --> 00:12:22,290 So n is now bound to 3. 263 00:12:22,290 --> 00:12:25,207 Same game, evaluate the body: is n equal to 1? 264 00:12:25,207 --> 00:12:28,110 No, so in that case, I'm going to go to the return statement, 265 00:12:28,110 --> 00:12:31,500 it says return 3 times fact of 2. 266 00:12:31,500 --> 00:12:33,872 And notice it's only looking at this value of n 267 00:12:33,872 --> 00:12:35,580 because that's the frame in which I'm in. 268 00:12:35,580 --> 00:12:39,480 It never sees that value of n. 
269 00:12:39,480 --> 00:12:41,850 OK, aren't you glad I didn't do fact of 400? 270 00:12:41,850 --> 00:12:44,100 We've only got two more to go, but you get the idea. 271 00:12:44,100 --> 00:12:45,770 Same thing, I need to get fact of 2, which 272 00:12:45,770 --> 00:12:48,840 is going to call fact again with n bound to 2. 273 00:12:48,840 --> 00:12:51,990 Relative to that, it evaluates the body, and n is not yet equal to 1. 274 00:12:51,990 --> 00:12:53,610 That says I'm going to the else clause 275 00:12:53,610 --> 00:12:56,330 and return 2 times fact of 1. 276 00:12:56,330 --> 00:12:59,650 I call fact again, now with n bound to 1, 277 00:12:59,650 --> 00:13:03,090 and, fortunately, now that clause is true, 278 00:13:03,090 --> 00:13:07,440 and it says return 1. 279 00:13:07,440 --> 00:13:10,517 Whoops, sorry, before I do, so there's the base case. 280 00:13:10,517 --> 00:13:13,100 And it may seem apparent to you, but this is important, right? 281 00:13:13,100 --> 00:13:15,020 I'm unwinding this till I get to something 282 00:13:15,020 --> 00:13:17,030 that can stop the computation. 283 00:13:17,030 --> 00:13:19,190 Now I'm simply going to gather the computation up, 284 00:13:19,190 --> 00:13:20,300 because it says return 1. 285 00:13:20,300 --> 00:13:21,790 Who asked for it? 286 00:13:21,790 --> 00:13:24,090 Well, that call to fact of 1. 287 00:13:24,090 --> 00:13:27,350 So that reduces to return 2 times 1. 288 00:13:27,350 --> 00:13:28,640 And who called for that? 289 00:13:28,640 --> 00:13:29,870 Fact of 2. 290 00:13:29,870 --> 00:13:34,220 That reduces to return 3 times 2, which reduces to 4 times 291 00:13:34,220 --> 00:13:36,680 6, which reduces to printing out 24. 292 00:13:39,280 --> 00:13:42,840 So it unwinds it down to a base case and it stops. 
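To make the walk-through above concrete, here is a sketch of fact with tracing added. The definition (return 1 when n is 1, otherwise n times fact of n minus 1) is the one read out in the lecture; the depth parameter and the print calls are hypothetical additions used only to show one line per frame, and n is assumed to be a positive integer.

```python
def fact(n, depth=0):
    """Factorial of a positive integer n, printing one line per call frame."""
    indent = "  " * depth                      # deeper frames are indented further
    print(f"{indent}fact({n}) called")
    if n == 1:                                 # base case stops the unwinding
        result = 1
    else:
        result = n * fact(n - 1, depth + 1)    # each call gets its own n
    print(f"{indent}fact({n}) returns {result}")
    return result


print(fact(4))
```

Each call binds its own n in its own frame, so the indented lines show the calls going down to fact(1) and the returns being gathered back up to the final value of 24.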
293 00:13:42,840 --> 00:13:47,670 A couple of observations, notice how each recursive call creates 294 00:13:47,670 --> 00:13:50,120 its own frame, and as a consequence, 295 00:13:50,120 --> 00:13:54,140 there's no confusion about which value of n I'm using. 296 00:13:54,140 --> 00:13:57,080 Also notice, in the other frames, n was not changed. 297 00:13:57,080 --> 00:13:58,010 We did not mutate it. 298 00:13:58,010 --> 00:14:00,020 So we're literally creating a local scope 299 00:14:00,020 --> 00:14:03,930 for that recursive call, which is exactly what we want. 300 00:14:03,930 --> 00:14:07,580 Also notice how there was a sense of flow of control 301 00:14:07,580 --> 00:14:10,740 in computing fact of something, that reduces to returning n 302 00:14:10,740 --> 00:14:15,450 times fact of n minus 1, and that creates a new scope. 303 00:14:15,450 --> 00:14:17,100 And that will simply keep unwinding 304 00:14:17,100 --> 00:14:19,200 until I get to something that can return a value 305 00:14:19,200 --> 00:14:21,850 and then I gather all those frames back up. 306 00:14:21,850 --> 00:14:24,400 So there's a natural flow of control here. 307 00:14:24,400 --> 00:14:27,390 But most importantly, there's no confusion about which variable 308 00:14:27,390 --> 00:14:30,591 I'm using when I'm looking for a value of n. 309 00:14:30,591 --> 00:14:32,959 All right, because this is often a place where 310 00:14:32,959 --> 00:14:35,500 things get a little confusing, I want to do one more example. 311 00:14:35,500 --> 00:14:38,320 But let me first show you side by side 312 00:14:38,320 --> 00:14:40,416 the two different versions of factorial. 313 00:14:40,416 --> 00:14:43,040 Actually, I have lied slightly, we didn't show this one earlier 314 00:14:43,040 --> 00:14:45,331 but there's factorial if I wanted to do it iteratively. 315 00:14:45,331 --> 00:14:48,530 I'd set up some initial variable to 1, 316 00:14:48,530 --> 00:14:50,300 and then I'd just run through a loop. 
317 00:14:50,300 --> 00:14:55,180 For example, from 1 up to just below n plus 1, or 1 up to n, 318 00:14:55,180 --> 00:15:00,510 multiplying it and putting it back, and then return the product. 319 00:15:00,510 --> 00:15:02,820 Which one do you like more? 320 00:15:02,820 --> 00:15:04,800 You can't say neither, you have to pick one. 321 00:15:04,800 --> 00:15:08,100 Show of hands, how many of you like this one? 322 00:15:08,100 --> 00:15:10,708 Some hesitant ones, how many prefer this one? 323 00:15:10,708 --> 00:15:13,850 Yeah, that's my view. 324 00:15:13,850 --> 00:15:16,130 I'm biased, but I really like the recursive one. 325 00:15:16,130 --> 00:15:19,610 It is crisper to look at, you can see what it's doing. 326 00:15:19,610 --> 00:15:21,680 I'm reducing this problem to a simpler 327 00:15:21,680 --> 00:15:24,146 version of that problem. 328 00:15:24,146 --> 00:15:25,520 Pick your own version but I would 329 00:15:25,520 --> 00:15:27,050 argue that the recursive version is 330 00:15:27,050 --> 00:15:29,150 more intuitive to understand. 331 00:15:29,150 --> 00:15:31,070 From a programmer's perspective, it's 332 00:15:31,070 --> 00:15:32,870 actually often more efficient to write, 333 00:15:32,870 --> 00:15:36,140 because I don't have to think about interior variables. 334 00:15:36,140 --> 00:15:39,260 Depending on the machine, it may not be as efficient 335 00:15:39,260 --> 00:15:41,960 when you call it because in the recursive version 336 00:15:41,960 --> 00:15:44,360 I've got to set up that set of frames. 337 00:15:44,360 --> 00:15:45,860 And some versions of these languages 338 00:15:45,860 --> 00:15:47,401 are actually very efficient about it, 339 00:15:47,401 --> 00:15:48,706 some of them a little less so. 340 00:15:48,706 --> 00:15:50,330 But given the speed of computers today, 341 00:15:50,330 --> 00:15:54,790 who cares as long as it actually just does the computation. 
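For readers without the slide, the two versions being compared might look roughly like this. The names factorial_iter, factorial and prod are assumptions, since the transcript does not give them, and n is assumed to be a positive integer.

```python
def factorial_iter(n):
    """Iterative sketch: multiply together the integers 1 up to n."""
    prod = 1                        # interior variable holding the running product
    for i in range(1, n + 1):       # range stops just below n + 1
        prod *= i
    return prod


def factorial(n):
    """Recursive sketch: n! = n * (n - 1)!, with 1! = 1 as the base case."""
    if n == 1:
        return 1
    return n * factorial(n - 1)


print(factorial_iter(5), factorial(5))   # 120 120
```

On the cost of frames: CPython caps recursion depth (the default limit is 1000 and can be read with sys.getrecursionlimit()), so a sketch like factorial would raise RecursionError for very large n where factorial_iter would not.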
342 00:15:54,790 --> 00:15:57,370 Right, one more example, how do we really 343 00:15:57,370 --> 00:15:59,724 know our recursive code works? 344 00:15:59,724 --> 00:16:01,390 Well, we just did a simulation but let's 345 00:16:01,390 --> 00:16:03,680 look at it one more way. 346 00:16:03,680 --> 00:16:07,330 The iterative version, what can I say about it? 347 00:16:07,330 --> 00:16:09,390 Well, I know it's going to terminate 348 00:16:09,390 --> 00:16:11,520 because b is initially positive, assuming 349 00:16:11,520 --> 00:16:13,740 I gave it an appropriate value. 350 00:16:13,740 --> 00:16:16,620 It decreases by 1 every time around this loop, 351 00:16:16,620 --> 00:16:19,860 at some point it has to get less than 1, it's going to stop. 352 00:16:19,860 --> 00:16:23,050 So I can conclude it's always going to terminate. 353 00:16:23,050 --> 00:16:25,700 What about the recursive version? 354 00:16:25,700 --> 00:16:30,000 Well, if I call it with b equal to one, I'm done. 355 00:16:30,000 --> 00:16:33,160 If I call it with b greater than one, 356 00:16:33,160 --> 00:16:35,789 again it's going to reduce it by one on the recursive call, 357 00:16:35,789 --> 00:16:38,080 which means on each recursive call it's going to reduce 358 00:16:38,080 --> 00:16:39,977 and eventually it gets down to a place, 359 00:16:39,977 --> 00:16:41,810 assuming I gave it a positive integer, where 360 00:16:41,810 --> 00:16:43,340 b is equal to one. 361 00:16:43,340 --> 00:16:47,040 So it'll stop, which is good. 362 00:16:47,040 --> 00:16:49,380 What we just did was we used the great tool 363 00:16:49,380 --> 00:16:54,464 from math, second best department at MIT. 364 00:16:54,464 --> 00:16:56,630 Wow, I didn't even get any hisses on that one, John, 365 00:16:56,630 --> 00:16:58,790 all right, and I'm now in trouble 366 00:16:58,790 --> 00:17:01,150 with the head of the math department. 
367 00:17:01,150 --> 00:17:02,620 So now that I got your attention, 368 00:17:02,620 --> 00:17:04,599 and yes, all computer science jokes are bad, 369 00:17:04,599 --> 00:17:06,640 and mine are really bad, but I'm tenured. 370 00:17:06,640 --> 00:17:10,514 You cannot do a damn thing about it. 371 00:17:10,514 --> 00:17:12,680 Let's look at mathematical induction which turns out 372 00:17:12,680 --> 00:17:14,900 to be a tool that lets us think about programs 373 00:17:14,900 --> 00:17:16,827 in a really nice way. 374 00:17:16,827 --> 00:17:18,410 You haven't seen this, here's the idea 375 00:17:18,410 --> 00:17:19,609 of mathematical induction. 376 00:17:19,609 --> 00:17:22,280 If I want to prove a statement, and we 377 00:17:22,280 --> 00:17:24,420 refer to it as being indexed on the integers. 378 00:17:24,420 --> 00:17:26,660 In other words, it's some mathematical statement 379 00:17:26,660 --> 00:17:28,300 that runs over integers. 380 00:17:28,300 --> 00:17:32,660 If I want to prove it's true for all values of those integers, 381 00:17:32,660 --> 00:17:34,580 mathematically I'd do it by simply proving 382 00:17:34,580 --> 00:17:37,640 it's true for the smallest value of n typically 383 00:17:37,640 --> 00:17:41,310 n is equal to 0 or 1, and then I do an interesting thing. 384 00:17:41,310 --> 00:17:44,310 I say I need to prove that if it's true for an arbitrary 385 00:17:44,310 --> 00:17:47,310 value of n, I'm just going to prove that it's also then 386 00:17:47,310 --> 00:17:49,530 true for n plus 1. 387 00:17:49,530 --> 00:17:51,480 And if I can do those two things I can then 388 00:17:51,480 --> 00:17:54,390 conclude for an infinite number of values of n 389 00:17:54,390 --> 00:17:56,102 it's always true. 390 00:17:56,102 --> 00:17:58,310 Then we'll relate it back to programming in a second, 391 00:17:58,310 --> 00:18:00,351 but let me show you a simple example of this, one 392 00:18:00,351 --> 00:18:02,170 that you may have seen. 
393 00:18:02,170 --> 00:18:05,690 If I add the integers from 0 up to n, or even from 1 up to n, 394 00:18:05,690 --> 00:18:10,160 I claim that's the same as n times n plus 1 over 2. 395 00:18:10,160 --> 00:18:12,220 So 1, 2, 3, that's 6, right. 396 00:18:12,220 --> 00:18:14,560 And that's exactly right, 3 times 4, 397 00:18:14,560 --> 00:18:17,740 which is 12, divided by 2, which gives me out 6. 398 00:18:17,740 --> 00:18:19,840 How would I prove this? 399 00:18:19,840 --> 00:18:21,556 Well, by induction. 400 00:18:21,556 --> 00:18:25,180 I need to do the simple case: if n is equal to 0, 401 00:18:25,180 --> 00:18:27,070 well then this side is just 0. 402 00:18:27,070 --> 00:18:29,590 And that's 0 times 1, which is 0 divided by 2. 403 00:18:29,590 --> 00:18:32,210 So 0 equals 0, it's true. 404 00:18:32,210 --> 00:18:34,360 Now the inductive step. 405 00:18:34,360 --> 00:18:36,250 I'm going to assume it's true for some k, 406 00:18:36,250 --> 00:18:38,530 I should have picked n, but for some k, 407 00:18:38,530 --> 00:18:42,180 and then what I need to show is it's true for k plus 1. 408 00:18:42,180 --> 00:18:45,200 Well, there's the left hand side, 409 00:18:45,200 --> 00:18:48,850 and I want to show that this is equal to that. 410 00:18:48,850 --> 00:18:51,640 And I'm going to do it by using exactly this recursive idea, 411 00:18:51,640 --> 00:18:55,660 because what do I know, I know that this sum, in here, I'm 412 00:18:55,660 --> 00:18:58,100 assuming is true. 413 00:18:58,100 --> 00:19:01,330 And so that says that the left hand side, the first portion 414 00:19:01,330 --> 00:19:04,000 of it, is just k times k plus 1 over 2, 415 00:19:04,000 --> 00:19:06,820 that's the definition of the thing I'm assuming is true. 416 00:19:06,820 --> 00:19:10,151 To that I'm going to add k plus 1. 417 00:19:10,151 --> 00:19:11,650 Well, you can do the algebra, right? 
418 00:19:11,650 --> 00:19:14,320 That's k plus 1 all times k over 2 419 00:19:14,320 --> 00:19:18,250 plus 1, which is k plus 2 over 2. 420 00:19:18,250 --> 00:19:22,000 Oh cool, it's exactly that. 421 00:19:22,000 --> 00:19:23,920 Having done that, I can now conclude this 422 00:19:23,920 --> 00:19:27,690 is true for all values of n. 423 00:19:27,690 --> 00:19:30,996 What does it have to do with programming? 424 00:19:30,996 --> 00:19:32,870 That's exactly what we're doing when we think 425 00:19:32,870 --> 00:19:35,450 about recursive code, right? 426 00:19:35,450 --> 00:19:38,300 We're saying, show that it's true for the base case, 427 00:19:38,300 --> 00:19:40,040 and then what I'm essentially assuming 428 00:19:40,040 --> 00:19:44,120 is that, if it works for values smaller than b, 429 00:19:44,120 --> 00:19:46,820 then does the code return the right answer for b? 430 00:19:46,820 --> 00:19:48,620 And the answer is, absolutely it does, 431 00:19:48,620 --> 00:19:51,740 and I'm using induction to deduce that, in fact, my code 432 00:19:51,740 --> 00:19:54,600 does the right thing. 433 00:19:54,600 --> 00:19:57,180 Why am I torturing you with this? 434 00:19:57,180 --> 00:20:00,487 Because this is the way I want you to think about recursion. 435 00:20:00,487 --> 00:20:02,070 When I'm going to break a problem down 436 00:20:02,070 --> 00:20:03,990 into a smaller version of the same problem, 437 00:20:03,990 --> 00:20:06,720 I can assume that the smaller version gives the answer. 438 00:20:06,720 --> 00:20:09,450 All I have to do is make sure that what I combined together 439 00:20:09,450 --> 00:20:12,510 gives me out the right result. 440 00:20:12,510 --> 00:20:15,950 OK, you may be wondering what I'm 441 00:20:15,950 --> 00:20:18,500 doing with these wonderful high tech toys down here. 442 00:20:18,500 --> 00:20:20,692 I want to show you another example of recursion. 
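For reference, the induction argument for the sum formula from a moment ago can be written out as follows. This is a sketch of the proof as described in the lecture; the summation index i is an assumed name, since the slide notation is not in the transcript.

```latex
% Claim: for all integers n >= 0,
\[
  \sum_{i=0}^{n} i = \frac{n(n+1)}{2}.
\]
% Base case (n = 0): both sides are 0.
\[
  \sum_{i=0}^{0} i = 0 = \frac{0 \cdot 1}{2}.
\]
% Inductive step: assume the claim holds for some k, then show it for k + 1.
\begin{align*}
  \sum_{i=0}^{k+1} i
    &= \left( \sum_{i=0}^{k} i \right) + (k+1) \\
    &= \frac{k(k+1)}{2} + (k+1)
       && \text{(inductive assumption)} \\
    &= (k+1)\left( \frac{k}{2} + 1 \right) \\
    &= \frac{(k+1)(k+2)}{2}.
\end{align*}
```

The last line is the claimed formula with n replaced by k + 1, which is what the inductive step has to show.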
443 00:20:20,692 --> 00:20:23,150 So far we've seen simple things that have just had one base 444 00:20:23,150 --> 00:20:25,700 case, and this is a mythical story called 445 00:20:25,700 --> 00:20:28,040 the Towers of Hanoi. The story, as I heard it, 446 00:20:28,040 --> 00:20:30,530 is there's a temple somewhere in Hanoi 447 00:20:30,530 --> 00:20:35,630 with three tall spikes and 64 jewel-encrusted golden disks 448 00:20:35,630 --> 00:20:37,600 all of a different size. 449 00:20:37,600 --> 00:20:40,180 They all started out on one spike with the property 450 00:20:40,180 --> 00:20:43,570 that they were ordered from smallest down to largest. 451 00:20:43,570 --> 00:20:46,420 And there are priests in this temple who are moving the disks 452 00:20:46,420 --> 00:20:48,580 one at a time, one per second, and their goal 453 00:20:48,580 --> 00:20:53,270 is to move the entire stack from one spike to another spike. 454 00:20:53,270 --> 00:20:55,760 And when they do, nirvana is achieved 455 00:20:55,760 --> 00:20:58,200 and we all get a really great life. 456 00:20:58,200 --> 00:20:59,790 We'll talk separately about how long 457 00:20:59,790 --> 00:21:02,040 this is going to take because there's one trick to it. 458 00:21:02,040 --> 00:21:05,380 They can never cover a smaller disk with a larger disk 459 00:21:05,380 --> 00:21:07,380 as they're doing it, so they've got a third spike 460 00:21:07,380 --> 00:21:09,161 as a temporary thing. 461 00:21:09,161 --> 00:21:11,160 And I want to show you how to solve this problem 462 00:21:11,160 --> 00:21:13,380 because you're going to write code with my help in a second, 463 00:21:13,380 --> 00:21:14,760 or I'm going to write code with your help 464 00:21:14,760 --> 00:21:15,760 in a second to solve it. 465 00:21:15,760 --> 00:21:17,790 So let's look at it, so watch carefully, 466 00:21:17,790 --> 00:21:21,960 moving a disk of size one, well that's pretty easy, right?
467 00:21:21,960 --> 00:21:23,550 Moving a disk of size two, we'll just 468 00:21:23,550 --> 00:21:25,716 put this one on the spare one while you move it over 469 00:21:25,716 --> 00:21:26,850 so you don't cover it up. 470 00:21:26,850 --> 00:21:28,650 That's easy. 471 00:21:28,650 --> 00:21:30,524 Moving a disk of size three, you've 472 00:21:30,524 --> 00:21:32,940 got to be a little more careful, you can't cover up a smaller 473 00:21:32,940 --> 00:21:34,650 one with a larger one, so you have to really think 474 00:21:34,650 --> 00:21:35,745 about where you're putting it. 475 00:21:35,745 --> 00:21:38,369 It would help if these things didn't jiggle, and there you go, 476 00:21:38,369 --> 00:21:40,066 you got it done. 477 00:21:40,066 --> 00:21:41,190 All right, you're watching? 478 00:21:41,190 --> 00:21:41,940 You've got to do four. 479 00:21:41,940 --> 00:21:44,565 To do four, again, you've got to be really careful not to cover 480 00:21:44,565 --> 00:21:46,170 things up as you do this. 481 00:21:46,170 --> 00:21:48,360 You want to get the bottom one eventually exposed, 482 00:21:48,360 --> 00:21:49,810 and so you're going to pull that one over there. 483 00:21:49,810 --> 00:21:51,060 If you do the pattern really well, 484 00:21:51,060 --> 00:21:53,185 you won't notice if I make a serious mistake as I'm 485 00:21:53,185 --> 00:21:54,610 doing this, which I just did. 486 00:21:54,610 --> 00:21:56,026 But I'm going to recover from that 487 00:21:56,026 --> 00:21:58,230 and do it that way to put this one over here, 488 00:21:58,230 --> 00:22:00,859 and that one goes there, and if I did this in Harvard Square 489 00:22:00,859 --> 00:22:01,650 I could make money. 490 00:22:01,650 --> 00:22:02,486 There you go, right? 491 00:22:05,920 --> 00:22:08,930 OK, got the solution? 492 00:22:08,930 --> 00:22:10,200 See how to solve it? 493 00:22:10,200 --> 00:22:12,600 Could you write code for this? 494 00:22:12,600 --> 00:22:15,600 Eh, maybe not.
495 00:22:15,600 --> 00:22:17,100 That's on the quiz, thanks, John, 496 00:22:17,100 --> 00:22:19,200 don't tell them on the quiz, damn. 497 00:22:19,200 --> 00:22:23,190 All right, I want to claim though that in fact there's 498 00:22:23,190 --> 00:22:26,569 a beautiful recursive solution. 499 00:22:26,569 --> 00:22:28,610 And here's the way to think about it recursively. 500 00:22:28,610 --> 00:22:31,740 I want to move a tower of size n, 501 00:22:31,740 --> 00:22:33,757 I'm going to assume I can move smaller towers 502 00:22:33,757 --> 00:22:34,840 and then it's really easy. 503 00:22:34,840 --> 00:22:37,860 What do I do, I take a stack of size n minus 1, 504 00:22:37,860 --> 00:22:41,280 I move it onto the spare one, I move the bottom one over, 505 00:22:41,280 --> 00:22:44,090 and then I move a stack of size n minus 1 506 00:22:44,090 --> 00:22:47,740 to there, a beautiful recursive solution. 507 00:22:47,740 --> 00:22:49,240 And how do I move the smaller stack? 508 00:22:49,240 --> 00:22:53,010 Just the same way, I just unwind it, 509 00:22:53,010 --> 00:22:58,850 simple, and, in fact, the code follows exactly that. 510 00:22:58,850 --> 00:23:01,080 OK, I do a little [INAUDIBLE] domain up here 511 00:23:01,080 --> 00:23:03,300 to try and get your attention, but notice 512 00:23:03,300 --> 00:23:04,881 by doing that what did I do? 513 00:23:04,881 --> 00:23:06,630 I asked you to think about it recursively, 514 00:23:06,630 --> 00:23:08,940 the recursive solution, when you see it, 515 00:23:08,940 --> 00:23:13,810 is in fact very straightforward, and there's the code. 516 00:23:13,810 --> 00:23:15,760 Dead trivial, well, dead trivial is unfair, 517 00:23:15,760 --> 00:23:16,840 but it's very simple. 518 00:23:16,840 --> 00:23:17,440 Right?
519 00:23:17,440 --> 00:23:20,050 I simply write something, so let me describe it, 520 00:23:20,050 --> 00:23:22,240 I need to say how big a tower am I moving 521 00:23:22,240 --> 00:23:25,580 and I'm going to label the three stacks a from, a to, 522 00:23:25,580 --> 00:23:26,924 and a spare. 523 00:23:26,924 --> 00:23:28,840 I have a little procedure that just prints out 524 00:23:28,840 --> 00:23:32,650 the move for me, and then what's the solution? 525 00:23:32,650 --> 00:23:35,350 If it's just a stack of size one, just print the move, 526 00:23:35,350 --> 00:23:38,020 take it from from to to. 527 00:23:38,020 --> 00:23:41,310 Otherwise, move a tower of size n minus 1 528 00:23:41,310 --> 00:23:46,660 from the from spot to the spare spot, then move 529 00:23:46,660 --> 00:23:49,630 what's left, a tower of size one, from from to to, 530 00:23:49,630 --> 00:23:51,850 and then take that thing that's stuck on spare 531 00:23:51,850 --> 00:23:56,500 and move it over to to, and I'm done. 532 00:23:56,500 --> 00:23:59,285 In that code that we handed out, you'll see this code, 533 00:23:59,285 --> 00:23:59,910 you can run it. 534 00:23:59,910 --> 00:24:01,310 I'm not going to print it out because, if I did, 535 00:24:01,310 --> 00:24:02,800 you are just going to say, OK, it 536 00:24:02,800 --> 00:24:04,970 looks like it does the right kind of thing. 537 00:24:04,970 --> 00:24:08,196 Look at the code, nice and easy, and that's 538 00:24:08,196 --> 00:24:10,320 what we like you to do when you're given a problem. 539 00:24:10,320 --> 00:24:11,986 We ask you to think about it recursively. 540 00:24:11,986 --> 00:24:13,790 How do I solve this with a smaller 541 00:24:13,790 --> 00:24:15,920 version of the same problem? 542 00:24:15,920 --> 00:24:19,310 And then how do I use that to build the larger solution? 543 00:24:19,310 --> 00:24:21,390 This case is a little different.
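The Towers of Hanoi procedure just described can be sketched in Python roughly as below. The handout code is not reproduced in this transcript, so the function names are assumptions. This version also collects the moves in a list rather than printing inside the recursion, which is a small departure from the lecture's description made only so the result is easy to check.

```python
def hanoi_moves(n, fr, to, spare):
    """Return the list of (source, destination) moves that shifts a stack
    of n disks from spike fr to spike to, using spare as temporary storage."""
    if n == 1:
        # Base case: a single disk goes straight across.
        return [(fr, to)]
    moves = []
    # Park the top n-1 disks on the spare spike.
    moves += hanoi_moves(n - 1, fr, spare, to)
    # Move the bottom disk (a tower of size one) from fr to to.
    moves += hanoi_moves(1, fr, to, spare)
    # Bring the n-1 disks from the spare spike onto the destination.
    moves += hanoi_moves(n - 1, spare, to, fr)
    return moves


def print_moves(moves):
    """Print each move on its own line, as the lecture's helper would."""
    for fr, to in moves:
        print("move from " + str(fr) + " to " + str(to))
```

Calling `print_moves(hanoi_moves(3, 'P1', 'P2', 'P3'))` lists seven moves; the spike labels are arbitrary.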
544 00:24:21,390 --> 00:24:23,030 You could argue that this is not really 545 00:24:23,030 --> 00:24:25,321 a recursive call here, it's just moving the bottom one, 546 00:24:25,321 --> 00:24:26,870 I could have done that directly. 547 00:24:26,870 --> 00:24:31,420 But I've got two recursive calls in the body here. 548 00:24:31,420 --> 00:24:33,710 I have to move a smaller stack twice. 549 00:24:33,710 --> 00:24:37,870 We're going to come back to that in a little bit. 550 00:24:37,870 --> 00:24:40,720 Let me show you one other example of recursion that 551 00:24:40,720 --> 00:24:43,215 runs a little bit differently. 552 00:24:43,215 --> 00:24:45,340 In this case it's going to have multiple base cases 553 00:24:45,340 --> 00:24:46,630 and this is another very old problem, 554 00:24:46,630 --> 00:24:48,046 it's called the Fibonacci numbers. 555 00:24:48,046 --> 00:24:50,350 It's based on something from several centuries 556 00:24:50,350 --> 00:24:52,600 ago when a gentleman named Leonardo of Pisa, 557 00:24:52,600 --> 00:24:56,350 also known as Fibonacci, posed the following challenge. 558 00:24:56,350 --> 00:24:58,570 He said, I'm going to put a newborn pair of rabbits, 559 00:24:58,570 --> 00:25:02,870 one male and one female, into an enclosure, a pen of some sort. 560 00:25:02,870 --> 00:25:05,290 And the rabbits have the following properties, 561 00:25:05,290 --> 00:25:09,660 they mate at age one month, so they take a month to mature. 562 00:25:09,660 --> 00:25:11,650 After a one month gestation period, 563 00:25:11,650 --> 00:25:15,550 they produce another pair of rabbits, a male and a female, 564 00:25:15,550 --> 00:25:19,030 and he says I'm going to assume that the rabbits never die. 565 00:25:19,030 --> 00:25:22,180 So each month mature females are going to produce another pair. 566 00:25:22,180 --> 00:25:24,670 And his question was, how many female rabbits are there 567 00:25:24,670 --> 00:25:27,410 at the end of a year, or two years, or three years?
568 00:25:30,040 --> 00:25:33,690 The idea is, I start off with two immature rabbits, 569 00:25:33,690 --> 00:25:36,270 after one month they've matured, which 570 00:25:36,270 --> 00:25:41,700 means after another month, they will have produced a new pair. 571 00:25:41,700 --> 00:25:44,760 After another month, that mature pair has produced another pair, 572 00:25:44,760 --> 00:25:47,940 and the immature pair has matured. 573 00:25:47,940 --> 00:25:49,530 Which means, after another month, 574 00:25:49,530 --> 00:25:53,770 those two mature pairs are going to produce offspring, 575 00:25:53,770 --> 00:25:56,470 and that immature pair has matured. 576 00:25:56,470 --> 00:25:59,800 And you get the idea, and after several months, 577 00:25:59,800 --> 00:26:01,152 you get to Australia. 578 00:26:03,175 --> 00:26:05,550 You can also see this is going to be interesting to think 579 00:26:05,550 --> 00:26:07,500 about how do you compute this, but what I want you to see 580 00:26:07,500 --> 00:26:08,910 is the recursive solution to it. 581 00:26:08,910 --> 00:26:11,910 So how could we capture this? 582 00:26:11,910 --> 00:26:14,130 Well here's another way of thinking about it, 583 00:26:14,130 --> 00:26:15,540 after the first month, and I know 584 00:26:15,540 --> 00:26:16,998 we're going to do this funny thing, 585 00:26:16,998 --> 00:26:18,960 we're going to index it 0, so call it month 0. 586 00:26:18,960 --> 00:26:22,040 There is 1 female which is immature. 587 00:26:22,040 --> 00:26:24,280 After the second month, that female 588 00:26:24,280 --> 00:26:28,870 is mature and now pregnant, which means after the third month it 589 00:26:28,870 --> 00:26:31,440 has produced an offspring. 590 00:26:31,440 --> 00:26:34,740 And more generally, at the n-th month, 591 00:26:34,740 --> 00:26:38,260 after we get past the first few cases, what do we have?
592 00:26:38,260 --> 00:26:41,470 Any female that was there two months ago 593 00:26:41,470 --> 00:26:44,160 has produced an offspring. 594 00:26:44,160 --> 00:26:46,210 Because it's taken at least one month to mature, 595 00:26:46,210 --> 00:26:47,585 if it hasn't already been mature, 596 00:26:47,585 --> 00:26:49,570 and then it's going to produce an offspring. 597 00:26:49,570 --> 00:26:52,810 And any female that was around last month 598 00:26:52,810 --> 00:26:55,330 is still around because they never die off. 599 00:26:55,330 --> 00:26:56,610 So this is a little different. 600 00:26:56,610 --> 00:27:00,070 This is now the number of females at month n 601 00:27:00,070 --> 00:27:02,200 is the number of females at month n minus 1, 602 00:27:02,200 --> 00:27:05,600 plus the number of females at month n minus 2. 603 00:27:05,600 --> 00:27:10,890 So two recursive calls, but with different arguments. 604 00:27:10,890 --> 00:27:14,430 Different from Towers of Hanoi, where there were two recursive 605 00:27:14,430 --> 00:27:16,200 calls, but with the same sized problem. 606 00:27:19,670 --> 00:27:22,960 So now I need two base cases, one 607 00:27:22,960 --> 00:27:26,960 for when n is equal to 0, one for when n is equal to 1. 608 00:27:26,960 --> 00:27:29,810 And then I've got that recursive case, 609 00:27:29,810 --> 00:27:34,162 so there's a nice little piece of code. 610 00:27:34,162 --> 00:27:36,620 Fibonacci, I'm going to assume x is an integer greater than 611 00:27:36,620 --> 00:27:37,260 or equal to 0. 612 00:27:37,260 --> 00:27:39,410 I'm going to return Fibonacci of x. 613 00:27:39,410 --> 00:27:42,860 And you can see now it says, if either x is equal to 0 614 00:27:42,860 --> 00:27:45,860 or x is equal to 1 I'm going to return 1, 615 00:27:45,860 --> 00:27:50,810 otherwise, reduce it to two simpler versions of the problem 616 00:27:50,810 --> 00:27:55,470 but with different arguments, and I add them up.
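The Fibonacci procedure as described, with two base cases and two recursive calls on different arguments, might look like the following sketch. The name `fib` comes from the demo that follows in the lecture; the exact layout of the handout code is an assumption.

```python
def fib(x):
    """Assumes x is an int >= 0.
    Returns Fibonacci of x, using the lecture's convention that
    fib(0) and fib(1) are both 1."""
    if x == 0 or x == 1:
        # Two base cases: month 0 and month 1 each have one female.
        return 1
    else:
        # Females last month are still around, and females from two
        # months ago have each produced one more.
        return fib(x - 1) + fib(x - 2)
```

With this indexing the first few values are 1, 1, 2, 3, 5, which matches the values checked in the demo.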
617 00:27:55,470 --> 00:27:59,630 OK, and if we go look at this, we can actually run this, 618 00:27:59,630 --> 00:28:01,465 if I can find my code. 619 00:28:01,465 --> 00:28:09,800 Which is right there, and I'm just going to, 620 00:28:09,800 --> 00:28:13,010 so we can, for example, check it by saying fib of 0. 621 00:28:21,076 --> 00:28:23,300 I just hit a bug which I don't see. 622 00:28:27,044 --> 00:28:28,200 Let me try it again. 623 00:28:32,580 --> 00:28:35,580 I'll try it one more time with fib of 0. 624 00:28:41,450 --> 00:28:49,187 Darn, it's wrong, let me try it. 625 00:28:49,187 --> 00:28:50,770 I've got two different versions of fib 626 00:28:50,770 --> 00:28:52,060 in here, that's what I've got going on. 627 00:28:52,060 --> 00:28:53,895 So let me do it again, let's do fib of 1. 628 00:28:53,895 --> 00:28:59,260 There we go, fib of 2 which is 2, fib of 3 is 3, 629 00:28:59,260 --> 00:29:01,660 and fib of 4 which should add the previous two, which 630 00:29:01,660 --> 00:29:02,491 gives me 5. 631 00:29:02,491 --> 00:29:02,990 There we go. 632 00:29:02,990 --> 00:29:05,080 Sorry about that, I had two versions of fib 633 00:29:05,080 --> 00:29:07,924 in my file, which is why it complained at me. 634 00:29:07,924 --> 00:29:09,340 And which is why you should always 635 00:29:09,340 --> 00:29:11,340 read the error instructions because it tells you 636 00:29:11,340 --> 00:29:13,760 what you did wrong. 637 00:29:13,760 --> 00:29:17,327 Let's go on and look at one more example of doing recursion, 638 00:29:17,327 --> 00:29:18,785 and we're going to do dictionaries, 639 00:29:18,785 --> 00:29:21,420 and then we're going to pull it all together. 640 00:29:21,420 --> 00:29:24,940 So far we've been doing recursion on numerical things, 641 00:29:24,940 --> 00:29:26,730 we can do it on non-numerical things.
642 00:29:26,730 --> 00:29:28,590 So a nice way of thinking about this is, 643 00:29:28,590 --> 00:29:31,350 how would I tell if a string of characters is a palindrome? 644 00:29:31,350 --> 00:29:34,240 Meaning it reads the same backwards and forwards. 645 00:29:34,240 --> 00:29:35,950 Probably the most famous palindrome 646 00:29:35,950 --> 00:29:39,940 is attributed to Napoleon "Able was I ere I saw Elba." 647 00:29:39,940 --> 00:29:41,770 Given that Napoleon was French, I really 648 00:29:41,770 --> 00:29:44,040 doubt he said "Able was I ere I saw Elba," 649 00:29:44,040 --> 00:29:45,990 but it's a great palindrome. 650 00:29:45,990 --> 00:29:47,860 Or another one attributed to Anne Michaels 651 00:29:47,860 --> 00:29:52,930 "Are we not drawn we few drawn onward to a new era," 652 00:29:52,930 --> 00:29:55,420 reads the same backwards and forwards. 653 00:29:55,420 --> 00:29:58,560 It's fun to think about how do you create the palindromes. 654 00:29:58,560 --> 00:30:01,727 I want to write code to solve this. 655 00:30:01,727 --> 00:30:03,560 Again, I want to think about it recursively, 656 00:30:03,560 --> 00:30:05,230 so here's what I'm going to do. 657 00:30:05,230 --> 00:30:08,230 I'm first going to take a string of characters, 658 00:30:08,230 --> 00:30:10,930 reduce them all to lowercase, and strip out 659 00:30:10,930 --> 00:30:12,290 spaces and punctuation. 660 00:30:12,290 --> 00:30:15,460 I just want the characters. 661 00:30:15,460 --> 00:30:17,110 And once I got that, I want to say, 662 00:30:17,110 --> 00:30:19,150 is that string, that list of characters 663 00:30:19,150 --> 00:30:22,880 or that collection of characters as I should say, a palindrome? 664 00:30:22,880 --> 00:30:24,940 And I'm going to think about it recursively, 665 00:30:24,940 --> 00:30:27,750 and that's actually pretty easy. 666 00:30:27,750 --> 00:30:32,320 If it's either 0 or 1 long, it's a palindrome. 
667 00:30:32,320 --> 00:30:34,574 Otherwise you could think about having an index 668 00:30:34,574 --> 00:30:36,490 at each end of this thing and sort of counting 669 00:30:36,490 --> 00:30:38,320 into the middle, but it's much easier 670 00:30:38,320 --> 00:30:41,930 to say take the two at the end, if they're the same, 671 00:30:41,930 --> 00:30:44,890 then check to see what's left in the middle is a palindrome, 672 00:30:44,890 --> 00:30:48,170 and if those two properties are true, I'm done. 673 00:30:48,170 --> 00:30:51,940 And notice what I just did, I nicely reduced a bigger problem 674 00:30:51,940 --> 00:30:53,190 to a slightly smaller problem. 675 00:30:53,190 --> 00:30:56,110 It's exactly what I want to do. 676 00:30:56,110 --> 00:30:57,190 OK? 677 00:30:57,190 --> 00:31:00,706 So it says to check is this, I'm going to reduce it 678 00:31:00,706 --> 00:31:02,080 to just the string of characters, 679 00:31:02,080 --> 00:31:04,621 and then I'm going to check if that's a palindrome by pulling 680 00:31:04,621 --> 00:31:07,520 those two off and checking to see they're the same, 681 00:31:07,520 --> 00:31:10,971 and then checking to see if the middle is itself a palindrome. 682 00:31:14,660 --> 00:31:16,044 How would I write it? 683 00:31:16,044 --> 00:31:19,762 I'm going to create a procedure up here, isPalindrome. 684 00:31:19,762 --> 00:31:22,220 I'm going to have inside of it two internal procedures that 685 00:31:22,220 --> 00:31:23,030 do the work for me. 686 00:31:23,030 --> 00:31:26,000 The first one is simply going to reduce this 687 00:31:26,000 --> 00:31:28,220 to all lowercase with no spaces. 688 00:31:28,220 --> 00:31:32,360 And notice what I can do because s is a string of characters. 689 00:31:32,360 --> 00:31:35,690 I can use the built in string method lower, so there's 690 00:31:35,690 --> 00:31:37,730 that dot notation, s.lower. 691 00:31:37,730 --> 00:31:40,400 It says, apply the method lower to a string.
692 00:31:40,400 --> 00:31:43,040 I need an open and close paren to actually call 693 00:31:43,040 --> 00:31:45,380 that procedure, and that will mutate 694 00:31:45,380 --> 00:31:48,254 s to just be all lowercase. 695 00:31:48,254 --> 00:31:49,920 And then I'm going to run a little loop, 696 00:31:49,920 --> 00:31:52,500 I'll set up answer or ans to be an empty string, 697 00:31:52,500 --> 00:31:56,060 and then, for everything inside that mutated string, 698 00:31:56,060 --> 00:32:01,570 I'll simply say, if it's inside this string, if it's a letter, 699 00:32:01,570 --> 00:32:03,160 add it into answer. 700 00:32:03,160 --> 00:32:05,740 If it's a space or comma or something else I'll ignore it, 701 00:32:05,740 --> 00:32:07,900 and when I'm done just return answer, 702 00:32:07,900 --> 00:32:10,450 strips it down to lowercase. 703 00:32:10,450 --> 00:32:14,810 And then I'm going to pass that into isPal which simply says, 704 00:32:14,810 --> 00:32:17,780 if this is either 0 or 1 long, it's 705 00:32:17,780 --> 00:32:19,970 a palindrome, return True. 706 00:32:19,970 --> 00:32:24,500 Otherwise, check to see that the first and last element 707 00:32:24,500 --> 00:32:26,630 of the string are the same, notice 708 00:32:26,630 --> 00:32:29,090 the indexing to get into the last element, 709 00:32:29,090 --> 00:32:31,310 and similarly just slice into the string, 710 00:32:31,310 --> 00:32:34,130 ignoring the first and last element, and ask 711 00:32:34,130 --> 00:32:36,880 is that a palindrome. 712 00:32:36,880 --> 00:32:40,111 And then just call it, and that will do it. 713 00:32:40,111 --> 00:32:42,610 And again there's a nice example of that in the code. I'm not 714 00:32:42,610 --> 00:32:44,110 going to run it, I'll let you just go look at it, 715 00:32:44,110 --> 00:32:46,276 but it will actually pull out something that checks, 716 00:32:46,276 --> 00:32:48,520 is this a palindrome. 717 00:32:48,520 --> 00:32:51,410 Notice again, what I'm doing here.
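A sketch of the palindrome checker as described is below. The names `isPalindrome` and `isPal` are from the lecture; `toChars` is an assumed name for the first helper, and the literal alphabet string is one simple way to test for a letter. One detail worth keeping in mind: in Python `s.lower()` returns a new lowercase string rather than changing `s` in place, so the result has to be assigned.

```python
def isPalindrome(s):
    """Return True if s reads the same forwards and backwards,
    ignoring case, spaces and punctuation."""

    def toChars(s):
        # lower() returns a new string, so rebind s to the result.
        s = s.lower()
        ans = ''
        for c in s:
            # Keep letters only; skip spaces, commas and anything else.
            if c in 'abcdefghijklmnopqrstuvwxyz':
                ans = ans + c
        return ans

    def isPal(s):
        # Base case: strings of length 0 or 1 are palindromes.
        if len(s) <= 1:
            return True
        # Recursive case: the ends must match and the middle,
        # sliced without the first and last characters, must itself
        # be a palindrome.
        return s[0] == s[-1] and isPal(s[1:-1])

    return isPal(toChars(s))
```

For example, `isPalindrome('Able was I, ere I saw Elba')` evaluates to True, while `isPalindrome('Is this a palindrome')` evaluates to False.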
718 00:32:51,410 --> 00:32:52,870 I'm doing divide-and-conquer. 719 00:32:52,870 --> 00:32:55,390 I'm taking a problem, reducing it, I keep saying this, 720 00:32:55,390 --> 00:32:57,700 to a simpler version of the same problem. 721 00:32:57,700 --> 00:32:59,350 Keep unwinding it till I get down 722 00:32:59,350 --> 00:33:00,860 to something I can solve directly, 723 00:33:00,860 --> 00:33:03,320 my base case, and I'm done. 724 00:33:03,320 --> 00:33:05,170 And that's really the heart of thinking 725 00:33:05,170 --> 00:33:08,082 about recursive solutions to problems. 726 00:33:08,082 --> 00:33:10,040 I would hope that one of the things you remember, 727 00:33:10,040 --> 00:33:12,770 besides my really lousy patter up here, 728 00:33:12,770 --> 00:33:15,140 is the idea of Towers of Hanoi, because to me it's 729 00:33:15,140 --> 00:33:17,390 one of the nicest examples of a problem that 730 00:33:17,390 --> 00:33:20,180 would be hard to solve iteratively, 731 00:33:20,180 --> 00:33:22,250 but when you see the recursive solution, it's 732 00:33:22,250 --> 00:33:23,570 pretty straightforward. 733 00:33:23,570 --> 00:33:28,010 Keep that in mind as you think about doing recursion. 734 00:33:28,010 --> 00:33:30,200 OK, let's switch gears, and let's 735 00:33:30,200 --> 00:33:32,930 talk very briefly about another kind of data type 736 00:33:32,930 --> 00:33:34,677 called a dictionary. 737 00:33:34,677 --> 00:33:36,260 And the idea of a dictionary I'm going 738 00:33:36,260 --> 00:33:38,850 to motivate with a simple example. 739 00:33:38,850 --> 00:33:40,580 There's a quiz coming up on Thursday. 740 00:33:40,580 --> 00:33:41,550 I know you don't want to hear that, 741 00:33:41,550 --> 00:33:44,120 but there is, which means we're going to be recording grades. 742 00:33:44,120 --> 00:33:46,580 And so imagine I wanted to build a little database just 743 00:33:46,580 --> 00:33:48,926 to keep track of grades of students.
744 00:33:48,926 --> 00:33:50,300 So one of the ways I could do it, 745 00:33:50,300 --> 00:33:53,130 I could create a list with the names of the students, 746 00:33:53,130 --> 00:33:55,580 I could create another list with their grades, 747 00:33:55,580 --> 00:33:58,650 and a third list with the actual subject or course 748 00:33:58,650 --> 00:34:01,381 from which they got that grade. 749 00:34:01,381 --> 00:34:03,630 I keep a separate list for each one of them, 750 00:34:03,630 --> 00:34:05,860 keep them of the same length, and in essence, 751 00:34:05,860 --> 00:34:08,790 what I'm doing here is I'm storing information 752 00:34:08,790 --> 00:34:13,750 at the same index in each list. 753 00:34:13,750 --> 00:34:16,900 So Ana, who's going to have to take the class again, gets a B, 754 00:34:16,900 --> 00:34:20,750 John, who's created the class, gets an A plus. Sorry Ana, 755 00:34:20,750 --> 00:34:23,130 John's had a longer time at it. 756 00:34:23,130 --> 00:34:24,880 All right, bad jokes aside, what I'm doing 757 00:34:24,880 --> 00:34:26,320 is I can imagine just creating lists. 758 00:34:26,320 --> 00:34:28,278 I could create lists of lists, but a simple way 759 00:34:28,278 --> 00:34:31,360 is to do lists where basically at each index 760 00:34:31,360 --> 00:34:34,540 I've got associated information. 761 00:34:34,540 --> 00:34:37,100 It's a simple way to deal with it. 762 00:34:37,100 --> 00:34:39,320 Getting a grade out takes a little bit of work 763 00:34:39,320 --> 00:34:41,389 because if I want to get the grade associated 764 00:34:41,389 --> 00:34:44,300 with a particular student, what would I do?
765 00:34:44,300 --> 00:34:48,610 I would go into the name list and use the method index, which 766 00:34:48,610 --> 00:34:50,860 you've seen before, again notice the dot notation 767 00:34:50,860 --> 00:34:53,860 it says, this is a list, use the index method, 768 00:34:53,860 --> 00:34:56,949 call it on student, and whatever the value of student 769 00:34:56,949 --> 00:34:58,390 is, it will find that in the list, 770 00:34:58,390 --> 00:35:01,630 return the index at that point, and then I 771 00:35:01,630 --> 00:35:04,690 can use that to go in and get the grade in the course 772 00:35:04,690 --> 00:35:08,110 and return something out. 773 00:35:08,110 --> 00:35:11,029 Simple way to do it but a little ugly, right, 774 00:35:11,029 --> 00:35:12,820 because among other things, I've got things 775 00:35:12,820 --> 00:35:14,932 stored in different places in the list. 776 00:35:14,932 --> 00:35:17,140 I've got to think about if I'm going to add something 777 00:35:17,140 --> 00:35:20,110 to the list I've got to put them in the same spot in the list. 778 00:35:20,110 --> 00:35:22,720 I've got to remember to always index using integers 779 00:35:22,720 --> 00:35:27,890 which is what we know how to do with lists, at least so far. 780 00:35:27,890 --> 00:35:30,051 It would be nice if I had a better way to do it, 781 00:35:30,051 --> 00:35:31,550 and that's exactly what a dictionary 782 00:35:31,550 --> 00:35:33,630 is going to provide for me. 783 00:35:33,630 --> 00:35:36,240 So rather than indexing on integers 784 00:35:36,240 --> 00:35:38,790 I'd like to index directly on the item of interest. 785 00:35:38,790 --> 00:35:41,340 I'd like to say where's Ana's record 786 00:35:41,340 --> 00:35:43,990 and find that in one data structure. 
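The parallel-list lookup described above can be sketched as follows. The function name `get_grade`, the list names, and the course labels are assumptions for illustration; only the two students and their grades come from the lecture.

```python
# Three parallel lists: position i in each list describes the same student.
names = ['Ana', 'John']
grades = ['B', 'A+']
courses = ['course X', 'course Y']  # placeholder course labels


def get_grade(student, name_list, grade_list, course_list):
    """Return (course, grade) for student by finding the student's
    position in name_list and reading the same position elsewhere."""
    i = name_list.index(student)  # raises ValueError if student is absent
    return (course_list[i], grade_list[i])
```

Every insertion has to touch all three lists at the same position, which is the bookkeeping a dictionary avoids.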
787 00:35:43,990 --> 00:35:47,040 And so, whereas a list is indexed by integers, 788 00:35:47,040 --> 00:35:50,880 and has elements associated with it, a dictionary is going 789 00:35:50,880 --> 00:35:55,670 to combine a key, or if you like, a name of some sort, 790 00:35:55,670 --> 00:35:57,200 with an actual value. 791 00:35:57,200 --> 00:35:59,150 And we're going to index just by the name 792 00:35:59,150 --> 00:36:02,560 or the label as we go into it. 793 00:36:02,560 --> 00:36:05,210 So let me show you some examples. 794 00:36:05,210 --> 00:36:07,750 First of all, to create a dictionary I use curly braces, 795 00:36:07,750 --> 00:36:09,790 open closed curly brace, so an empty dictionary 796 00:36:09,790 --> 00:36:11,550 would be simply that call. 797 00:36:11,550 --> 00:36:13,900 If I want to create an actual dictionary, 798 00:36:13,900 --> 00:36:15,430 before I insert things into it, I 799 00:36:15,430 --> 00:36:17,950 use a little bit of a funky notation. 800 00:36:17,950 --> 00:36:21,970 It is a key or a label, a colon, and then 801 00:36:21,970 --> 00:36:24,250 a value, in this case the string Ana 802 00:36:24,250 --> 00:36:28,210 and the string B, followed by a comma which separates it 803 00:36:28,210 --> 00:36:32,290 from the next pairing of a key and a label, 804 00:36:32,290 --> 00:36:36,080 or a key and a value, and so on. 805 00:36:36,080 --> 00:36:38,140 So if I do this, what it does in my dictionary 806 00:36:38,140 --> 00:36:44,140 is it creates pairings of those labels with the values 807 00:36:44,140 --> 00:36:47,250 I associated with them. 808 00:36:47,250 --> 00:36:50,867 OK, these are pretty simple, but in fact, there's 809 00:36:50,867 --> 00:36:52,450 lots of nice things we can do with it. 810 00:36:52,450 --> 00:36:55,780 So once we've got them, indexing now is similar to a list 811 00:36:55,780 --> 00:36:58,120 but not done by a number, it's done by value.
812 00:36:58,120 --> 00:37:02,350 So if that's my key, I can say, what's John's grade, 813 00:37:02,350 --> 00:37:05,230 notice the call, it's grades, which is in my dictionary, 814 00:37:05,230 --> 00:37:09,037 open close square brackets, with the label John. 815 00:37:09,037 --> 00:37:11,620 And what it does, it goes in and finds that in the dictionary, 816 00:37:11,620 --> 00:37:14,102 returns the value associated with it. 817 00:37:14,102 --> 00:37:15,980 If I ask for something not in the dictionary, 818 00:37:15,980 --> 00:37:19,380 it's going to give me a key error. 819 00:37:19,380 --> 00:37:22,330 Other things we can do with dictionaries, 820 00:37:22,330 --> 00:37:27,010 we can add entries just like we would do with lists. 821 00:37:27,010 --> 00:37:30,460 Grades as a dictionary, in open and closed square brackets, 822 00:37:30,460 --> 00:37:34,980 I put in a new label and a value, 823 00:37:34,980 --> 00:37:38,480 and that adds that to the dictionary. 824 00:37:38,480 --> 00:37:41,780 I can test if something's in the dictionary by simply saying, 825 00:37:41,780 --> 00:37:45,440 is this label in grades, and it simply 826 00:37:45,440 --> 00:37:49,070 checks all of the labels or the keys for the dictionary 827 00:37:49,070 --> 00:37:53,130 to see if it's there, and if it's not returns false. 828 00:37:53,130 --> 00:37:56,880 I can remove entries, del, something we've seen before, 829 00:37:56,880 --> 00:37:57,859 a very generic thing. 830 00:37:57,859 --> 00:37:59,650 It will delete something, and in this case, 831 00:37:59,650 --> 00:38:01,740 it says, in the dictionary grades, 832 00:38:01,740 --> 00:38:05,640 find the entry associated with that key, sorry, Ana, 833 00:38:05,640 --> 00:38:09,519 you're about to be flushed, remove it. 834 00:38:09,519 --> 00:38:11,810 She's only getting a b in the class and she teaches it. 835 00:38:11,810 --> 00:38:15,312 We've got to do something about this, right? 
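The dictionary operations just walked through (creating, indexing by key, adding, testing membership, deleting) can be sketched like this. The dictionary name `grades` is from the lecture; the extra student 'Sylvan' and the absent name 'Daniel' are made up for illustration.

```python
# Build a dictionary with the key: value notation.
grades = {'Ana': 'B', 'John': 'A+'}

# Index by key rather than by integer position.
john_grade = grades['John']

# Add an entry by assigning to a new key.
grades['Sylvan'] = 'A'

# Membership tests look at the keys.
john_present = 'John' in grades
daniel_present = 'Daniel' in grades

# Remove the entry associated with a key.
del grades['Ana']

# Asking for a key that is not in the dictionary raises KeyError.
try:
    grades['Ana']
    ana_lookup_failed = False
except KeyError:
    ana_lookup_failed = True
```

After these statements `grades` holds entries for 'John' and 'Sylvan' only.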
836 00:38:15,312 --> 00:38:17,020 So I can add things, I can delete things, 837 00:38:17,020 --> 00:38:18,415 I can test if things are there. 838 00:38:18,415 --> 00:38:20,320 Let me show you a couple of other things 839 00:38:20,320 --> 00:38:22,880 about dictionaries. 840 00:38:22,880 --> 00:38:26,960 I can ask for all of the keys in the dictionary. 841 00:38:26,960 --> 00:38:29,390 Notice the format, there is that dot notation, grades 842 00:38:29,390 --> 00:38:32,480 as a dictionary, it says, use the keys method associated 843 00:38:32,480 --> 00:38:35,030 with this data structure dictionaries. 844 00:38:35,030 --> 00:38:37,640 Open close actually calls it, and it gives me 845 00:38:37,640 --> 00:38:43,831 back a collection of all the keys in some arbitrary order. 846 00:38:43,831 --> 00:38:45,490 I'm going to use a funny term here 847 00:38:45,490 --> 00:38:47,330 which I'm not certain we've seen so far. 848 00:38:47,330 --> 00:38:50,560 It returns something we call an iterable, it's like range. 849 00:38:50,560 --> 00:38:52,900 Think of it as giving us back the equivalent of a list, 850 00:38:52,900 --> 00:38:54,316 it's not actually a list, but it's 851 00:38:54,316 --> 00:38:55,910 something we can walk down. 852 00:38:55,910 --> 00:38:58,210 Which is exactly why I can then say, is something 853 00:38:58,210 --> 00:39:02,230 in a dictionary, because it returns this set of keys, 854 00:39:02,230 --> 00:39:04,300 and I can test to see something's in there. 855 00:39:04,300 --> 00:39:06,500 I can similarly get all of the values 856 00:39:06,500 --> 00:39:11,470 if I wanted to look at them, giving us out two iterables. 857 00:39:11,470 --> 00:39:16,740 Here are the key things to keep in mind about dictionaries. 858 00:39:16,740 --> 00:39:20,989 The values can be anything, any type, mutable, immutable. 859 00:39:20,989 --> 00:39:22,030 They could be duplicates. 
860 00:39:22,030 --> 00:39:23,800 That actually makes sense, I could have the same value 861 00:39:23,800 --> 00:39:26,020 associated, for example, the same grade associated 862 00:39:26,020 --> 00:39:28,540 with different people, that's perfectly fine. 863 00:39:28,540 --> 00:39:31,240 The values could be lists, they could be other data structures, 864 00:39:31,240 --> 00:39:32,823 they could even be other dictionaries. 865 00:39:32,823 --> 00:39:35,670 They can be anything, which is great. 866 00:39:35,670 --> 00:39:40,590 The keys, the first part of it, are a little more structured. 867 00:39:40,590 --> 00:39:42,750 They need to be unique. 868 00:39:42,750 --> 00:39:43,890 Well duh, that makes sense. 869 00:39:43,890 --> 00:39:46,500 If I have that same key in two places in the dictionary, 870 00:39:46,500 --> 00:39:47,580 when I go to look it up, how am I 871 00:39:47,580 --> 00:39:48,871 going to know which one I want? 872 00:39:48,871 --> 00:39:51,240 So it needs to be unique, and they also 873 00:39:51,240 --> 00:39:54,254 need to be immutable, which also makes sense. 874 00:39:54,254 --> 00:39:56,420 If I'm storing something in a key in the dictionary, 875 00:39:56,420 --> 00:39:59,057 and I can go and change the value of the key, 876 00:39:59,057 --> 00:40:01,140 how am I going to remember what I was looking for? 877 00:40:01,140 --> 00:40:05,510 So they can only be things like ints, floats, strings, tuples, 878 00:40:05,510 --> 00:40:06,792 Booleans. 879 00:40:06,792 --> 00:40:08,750 I don't recommend using floats because you need 880 00:40:08,750 --> 00:40:10,370 to make sure it's exactly the same float 881 00:40:10,370 --> 00:40:12,320 and that's sometimes a little bit challenging, 882 00:40:12,320 --> 00:40:15,590 but nonetheless, you can have any immutable type as your key. 883 00:40:15,590 --> 00:40:18,670 And notice that there's no order to the keys or the values.
884 00:40:18,670 --> 00:40:21,110 They are simply stored arbitrarily by Python 885 00:40:21,110 --> 00:40:23,490 as it puts them in. 886 00:40:23,490 --> 00:40:27,210 So if I compare these two, lists are ordered sequences indexed 887 00:40:27,210 --> 00:40:30,480 by integers, I look them up by integer index, 888 00:40:30,480 --> 00:40:33,000 and the indices have to have an order as a consequence. 889 00:40:33,000 --> 00:40:35,670 Dictionaries are this nice generalization, 890 00:40:35,670 --> 00:40:37,830 arbitrarily match keys to values. 891 00:40:37,830 --> 00:40:40,470 I simply look up one item by looking up things 892 00:40:40,470 --> 00:40:42,450 under the appropriate key. 893 00:40:42,450 --> 00:40:47,270 All I require is that the keys have to be immutable. 894 00:40:47,270 --> 00:40:48,990 OK, I want to do two last things, I've 895 00:40:48,990 --> 00:40:51,170 got seven minutes to go here. 896 00:40:51,170 --> 00:40:53,652 I want to show you an example of using dictionaries, 897 00:40:53,652 --> 00:40:55,610 and I'm going to do this with a little bit more 898 00:40:55,610 --> 00:40:56,818 interesting, I hope, example. 899 00:40:56,818 --> 00:40:58,571 I want to analyze song lyrics. 900 00:40:58,571 --> 00:41:00,320 Now I'm going to show you, you can already 901 00:41:00,320 --> 00:41:02,460 tell the difference between my age and Ana's age. 902 00:41:02,460 --> 00:41:05,832 She used Taylor Swift and Justin Bieber. 903 00:41:05,832 --> 00:41:07,040 I'm going to use The Beatles. 904 00:41:07,040 --> 00:41:08,390 That's more my generation. 905 00:41:08,390 --> 00:41:09,740 Most of you have never heard of The Beatles 906 00:41:09,740 --> 00:41:11,615 unless you watched Shining Time Station where 907 00:41:11,615 --> 00:41:13,862 you saw Ringo Starr, right?
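Before the lyrics example, the rules about keys and values covered above can be sketched in a few lines. The names `grades`, `seating`, and `prices` and their contents are illustrative assumptions. Python enforces the immutability rule by requiring keys to be hashable, so a list used as a key raises `TypeError`. As a side note, Python 3.7 and later do keep dictionary entries in insertion order, though items are still looked up by key rather than by position.

```python
# Illustrative data; the names and grades are made up for this sketch.
grades = {"John": "A+", "Denise": "A", "Katy": "A"}

# keys() and values() return iterable views, not lists;
# wrap them in list() to get an actual list.
keys = grades.keys()
values = grades.values()
print(list(keys))
print(list(values))  # duplicate values are fine

# A tuple is immutable, so it can serve as a key.
seating = {("row", 1): "John"}

# A list is mutable, so using one as a key raises TypeError.
try:
    bad = {["row", 1]: "John"}
except TypeError as err:
    print("TypeError:", err)

# Float keys must match exactly, which is why they are risky:
# 0.1 + 0.2 is not exactly 0.3 in floating point.
prices = {0.1 + 0.2: "sum"}
print(0.3 in prices)  # False
```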
908 00:41:13,862 --> 00:41:15,320 OK, what I'm going to do is, I want 909 00:41:15,320 --> 00:41:17,000 to write a little set of procedures 910 00:41:17,000 --> 00:41:21,500 that record the frequencies of words in a song lyric. 911 00:41:21,500 --> 00:41:24,510 So I'm going to match strings, or words, to integers. 912 00:41:24,510 --> 00:41:27,980 How many times did that word appear in the song lyric? 913 00:41:27,980 --> 00:41:30,380 And then I want to ask, can I easily figure out 914 00:41:30,380 --> 00:41:33,627 which words occur most often, and how many times. 915 00:41:33,627 --> 00:41:35,210 Then I'm going to gather them together 916 00:41:35,210 --> 00:41:37,127 to see what are the most common words in here. 917 00:41:37,127 --> 00:41:39,585 And I'm going to do that where I'm going to let a user say, 918 00:41:39,585 --> 00:41:42,530 I want every word that appears more than some number of times. 919 00:41:42,530 --> 00:41:44,150 It's a simple example, but I want 920 00:41:44,150 --> 00:41:45,980 you to see how mutation of the dictionary 921 00:41:45,980 --> 00:41:49,950 gives you a really powerful tool for solving this problem. 922 00:41:49,950 --> 00:41:52,774 So let's write the code to do that. 923 00:41:52,774 --> 00:41:55,590 It's also in the handout, here we go. 924 00:41:55,590 --> 00:42:00,779 Lyrics to frequencies, lyrics is just a list of words, strings. 925 00:42:00,779 --> 00:42:02,570 So I'm going to set up an empty dictionary, 926 00:42:02,570 --> 00:42:05,000 there's that open close curly brace, 927 00:42:05,000 --> 00:42:06,710 and here's what I want to do. 928 00:42:06,710 --> 00:42:08,930 I'm going to walk through all the words in lyrics. 929 00:42:08,930 --> 00:42:10,596 You've seen this before, this is looping 930 00:42:10,596 --> 00:42:12,690 over every word in lyrics. 931 00:42:12,690 --> 00:42:14,330 Ah, notice what I'm going to do.
932 00:42:14,330 --> 00:42:17,960 I'm going to simply say-- so the first part is, I can easily 933 00:42:17,960 --> 00:42:20,180 iterate over the list, --but now I'm going to say, 934 00:42:20,180 --> 00:42:23,400 if the word is in the dictionary, 935 00:42:23,400 --> 00:42:25,470 and because the dictionary is iterable, 936 00:42:25,470 --> 00:42:27,600 it's simply going to give me back all of the keys, 937 00:42:27,600 --> 00:42:29,224 it's simply going to say, in this case, 938 00:42:29,224 --> 00:42:32,050 if it's in the dictionary, it's already there, 939 00:42:32,050 --> 00:42:33,890 I've got some value associated with it, 940 00:42:33,890 --> 00:42:37,040 get the value out, add 1 to it, put it back in. 941 00:42:39,640 --> 00:42:41,540 If it's not already in the dictionary, 942 00:42:41,540 --> 00:42:44,260 this is the first time I've seen it, just store it 943 00:42:44,260 --> 00:42:45,860 into the dictionary. 944 00:42:45,860 --> 00:42:48,575 And when I'm done just return the dictionary. 945 00:42:48,575 --> 00:42:50,750 OK? 946 00:42:50,750 --> 00:42:53,170 So I'm going to, if I can do this right with my Python, 947 00:42:53,170 --> 00:42:55,080 show you an example of this. 948 00:42:55,080 --> 00:43:00,900 I have put in one of the great classic Beatles songs, 949 00:43:00,900 --> 00:43:03,319 you might recognize it right there. 950 00:43:03,319 --> 00:43:05,860 Mostly because it's got a whole lot of repetitions of things. 951 00:43:05,860 --> 00:43:07,832 So she loves you yeah, yeah, yeah, yeah. 952 00:43:07,832 --> 00:43:09,790 Sorry, actually they sing it better than I just 953 00:43:09,790 --> 00:43:11,350 did it sarcastically. 954 00:43:11,350 --> 00:43:13,390 Sorry about that, but I got she loves you there, 955 00:43:13,390 --> 00:43:15,920 and here's my code up here, lyrics to frequency. 956 00:43:15,920 --> 00:43:18,230 So let's see what happens if we call it. 957 00:43:18,230 --> 00:43:26,520 And we say lyrics to frequencies she loves you. 
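The word-counting procedure just described can be sketched as below. The function name follows the spoken "lyrics to frequencies". The variable names and the short word list are assumptions standing in for the handout's full lyric.

```python
def lyrics_to_frequencies(lyrics):
    """Map each word in the list `lyrics` to the number of times it appears."""
    my_dict = {}
    for word in lyrics:
        if word in my_dict:
            # Seen before: take the count out, add 1, put it back.
            my_dict[word] += 1
        else:
            # First time this word appears.
            my_dict[word] = 1
    return my_dict


# A short stand-in for the song lyric, not the full text from the handout.
she_loves_you = ["she", "loves", "you", "yeah", "yeah", "yeah"]
beatles = lyrics_to_frequencies(she_loves_you)
print(beatles)
```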
958 00:43:29,660 --> 00:43:31,360 And it would help if I can type, all 959 00:43:31,360 --> 00:43:33,940 right, we'll try it one more time, lyrics 960 00:43:33,940 --> 00:43:44,100 to frequencies, she loves you. 961 00:43:44,100 --> 00:43:47,740 Cool, this gave me back a dictionary, 962 00:43:47,740 --> 00:43:49,987 you can see the curly braces, and there 963 00:43:49,987 --> 00:43:52,570 are all the words that appear in there and the number of times 964 00:43:52,570 --> 00:43:55,360 that they appear. 965 00:43:55,360 --> 00:43:56,760 What's the order? 966 00:43:56,760 --> 00:43:57,900 You don't care. 967 00:43:57,900 --> 00:43:58,590 You don't know. 968 00:43:58,590 --> 00:44:00,090 What we want to do is to think about 969 00:44:00,090 --> 00:44:01,860 how can we analyze this, so let's go back 970 00:44:01,860 --> 00:44:04,030 and look at the last piece of this. 971 00:44:04,030 --> 00:44:09,810 Which is, OK, I can convert lyrics to frequencies. 972 00:44:09,810 --> 00:44:12,560 So here's the next thing I want to do, how do I 973 00:44:12,560 --> 00:44:14,660 find the most common words? 974 00:44:14,660 --> 00:44:16,520 Well, here's what I'm going to do, 975 00:44:16,520 --> 00:44:19,160 frequencies is the dictionary, something 976 00:44:19,160 --> 00:44:21,570 that I just pulled out. 977 00:44:21,570 --> 00:44:24,230 So I can use the values method on it 978 00:44:24,230 --> 00:44:26,330 which returns an iterable, as I said earlier, 979 00:44:26,330 --> 00:44:28,860 again notice the open close because I got to call it. 980 00:44:28,860 --> 00:44:31,239 That gives me back an iterable that 981 00:44:31,239 --> 00:44:33,030 has all of the frequencies inside of there, 982 00:44:33,030 --> 00:44:36,172 because it's an iterable, I can use max on it, 983 00:44:36,172 --> 00:44:38,130 and it will take that iterable and give me back 984 00:44:38,130 --> 00:44:39,445 the biggest value.
985 00:44:39,445 --> 00:44:41,695 I'm going to call that best, I'm going to set up 986 00:44:41,695 --> 00:44:43,320 words to be an empty list, and then I'm 987 00:44:43,320 --> 00:44:45,210 just going to walk through all of the entries 988 00:44:45,210 --> 00:44:49,680 in the dictionary saying, if the value at that entry 989 00:44:49,680 --> 00:44:53,670 is equal to best add that entry into words, 990 00:44:53,670 --> 00:44:56,260 just append it onto the end of the list. 991 00:44:56,260 --> 00:44:57,760 And when I'm done all of that loop, 992 00:44:57,760 --> 00:45:00,960 I'm just going to return a tuple of both the collection 993 00:45:00,960 --> 00:45:03,450 of words that appeared that many times 994 00:45:03,450 --> 00:45:05,355 and how often they appeared. 995 00:45:05,355 --> 00:45:07,230 I'm going to show you an example in a second, 996 00:45:07,230 --> 00:45:10,454 but notice I'm simply using the properties of the dictionary. 997 00:45:10,454 --> 00:45:12,120 The last thing I want to do then is say, 998 00:45:12,120 --> 00:45:14,110 I want to see how often the words appear. 999 00:45:14,110 --> 00:45:17,160 So I'm going to give it a dictionary and a minimum number 1000 00:45:17,160 --> 00:45:18,530 of times. 1001 00:45:18,530 --> 00:45:21,120 And here I'm going to set result up to be an empty list, 1002 00:45:21,120 --> 00:45:23,100 I'm going to create a flag set to false, 1003 00:45:23,100 --> 00:45:25,140 it's going to keep track of when I'm done. 1004 00:45:25,140 --> 00:45:27,450 And as long as I'm not yet done, I'll 1005 00:45:27,450 --> 00:45:28,950 call that previous procedure that's 1006 00:45:28,950 --> 00:45:31,110 going to give me back the most common words 1007 00:45:31,110 --> 00:45:33,380 and how often they appeared.
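The "previous procedure" for finding the most common words can be sketched like this. The names `best` and `words` follow the talk. The function name `most_common_words` and the sample dictionary are assumptions for illustration.

```python
def most_common_words(freqs):
    """Return a tuple (words, best): the words with the highest count, and that count."""
    # values() gives an iterable of all the counts, so max works on it directly.
    best = max(freqs.values())
    words = []
    # Iterating over a dictionary walks through its keys.
    for word in freqs:
        if freqs[word] == best:
            words.append(word)
    return (words, best)


# Hypothetical frequency dictionary, as produced by a word-counting step.
sample_freqs = {"she": 1, "loves": 1, "you": 1, "yeah": 3}
print(most_common_words(sample_freqs))
```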
1008 00:45:33,380 --> 00:45:34,920 I check and remember it was a tuple, 1009 00:45:34,920 --> 00:45:37,211 how often do they appear, if it's bigger than the thing 1010 00:45:37,211 --> 00:45:41,140 I'm looking for, I'll add that into my result. 1011 00:45:41,140 --> 00:45:43,300 And then the best part is, I'm now 1012 00:45:43,300 --> 00:45:44,980 going to walk through all the words 1013 00:45:44,980 --> 00:45:47,920 that appeared that many times, and just delete them 1014 00:45:47,920 --> 00:45:49,654 from the dictionary. 1015 00:45:49,654 --> 00:45:50,820 I can mutate the dictionary. 1016 00:45:50,820 --> 00:45:54,342 And by doing that, I can go back around and do this again, 1017 00:45:54,342 --> 00:45:56,550 and it will pull out how many times has this appeared 1018 00:45:56,550 --> 00:45:57,540 and keep doing it. 1019 00:45:57,540 --> 00:45:59,250 When I can go all the way through that, 1020 00:45:59,250 --> 00:46:01,470 if I can't find any more, I'll set the flag 1021 00:46:01,470 --> 00:46:03,540 to true which means it will drop out of here 1022 00:46:03,540 --> 00:46:07,207 and return the result. I'm going to let you run this yourself, 1023 00:46:07,207 --> 00:46:09,290 if you do that, you'll find that it comes up with, 1024 00:46:09,290 --> 00:46:11,580 not surprisingly, I think yeah is the most common one 1025 00:46:11,580 --> 00:46:14,434 and she loves you, followed by loves and a few others. 1026 00:46:14,434 --> 00:46:16,850 What I want you to see here is how the dictionary captured 1027 00:46:16,850 --> 00:46:19,430 the pieces we wanted to. 1028 00:46:19,430 --> 00:46:23,110 Very last one, there's Fibonacci, 1029 00:46:23,110 --> 00:46:25,470 as we called it before. 1030 00:46:25,470 --> 00:46:27,540 It's actually incredibly inefficient, 1031 00:46:27,540 --> 00:46:29,750 because if I call it, I have to do all the sub 1032 00:46:29,750 --> 00:46:32,880 calls until I get down to the base case, which is OK. 
1033 00:46:32,880 --> 00:46:36,030 But notice, every other thing I do here, 1034 00:46:36,030 --> 00:46:39,080 I've actually computed those values. 1035 00:46:39,080 --> 00:46:41,120 I'm wasting measures, or wasting time, 1036 00:46:41,120 --> 00:46:44,180 it's not so bad with fib of 5, but if this is fib of 20, 1037 00:46:44,180 --> 00:46:46,760 almost everything on the right hand side of this tree 1038 00:46:46,760 --> 00:46:48,260 I've already computed once. 1039 00:46:48,260 --> 00:46:51,880 That means fib's very inefficient. 1040 00:46:51,880 --> 00:46:55,930 I can improve it by using a dictionary, very handy tool. 1041 00:46:55,930 --> 00:46:58,720 I'm going to call fib not only with a value of n, 1042 00:46:58,720 --> 00:47:00,220 but a dictionary which initially I'm 1043 00:47:00,220 --> 00:47:03,250 going to initialize to the base cases. 1044 00:47:03,250 --> 00:47:05,500 And notice what I do, I'm going to say if I've already 1045 00:47:05,500 --> 00:47:09,580 computed this, just return the value in the dictionary. 1046 00:47:09,580 --> 00:47:12,670 If I haven't, go ahead and do the computation, 1047 00:47:12,670 --> 00:47:15,400 store it in the dictionary at that point, 1048 00:47:15,400 --> 00:47:17,912 and return the answer. 1049 00:47:17,912 --> 00:47:19,370 Different way of thinking about it, 1050 00:47:19,370 --> 00:47:21,620 and the reason this is really nice, a method called 1051 00:47:21,620 --> 00:47:26,090 memoization, is if I call fib of 34 1052 00:47:26,090 --> 00:47:30,650 the standard way it takes 11 million plus recursive calls 1053 00:47:30,650 --> 00:47:31,685 to get the answer out. 1054 00:47:31,685 --> 00:47:32,900 It takes a long time. 1055 00:47:32,900 --> 00:47:34,400 I've given you some code for it, you 1056 00:47:34,400 --> 00:47:36,860 can try it and see how long it takes. 1057 00:47:36,860 --> 00:47:40,710 Using the dictionary to keep track of intermediate values, 1058 00:47:40,710 --> 00:47:42,572 65 calls.
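The memoized version described above can be sketched as follows. The base cases fib(1) = 1 and fib(2) = 2 are an assumption, chosen because they reproduce the 65-call figure quoted. The names `fib_efficient`, `memo`, and the `calls` counter are illustrative rather than taken from the handout.

```python
def fib(n):
    """Plain recursive version: recomputes the same subproblems many times."""
    if n == 1:
        return 1
    elif n == 2:
        return 2
    else:
        return fib(n - 1) + fib(n - 2)


# Counter used only to show how many calls the memoized version makes.
calls = {"count": 0}


def fib_efficient(n, d):
    """Memoized version: d maps already-computed n to fib(n)."""
    calls["count"] += 1
    if n in d:
        # Already computed: just return the stored value.
        return d[n]
    # Not yet computed: do the work, store it, then return it.
    ans = fib_efficient(n - 1, d) + fib_efficient(n - 2, d)
    d[n] = ans
    return ans


# The dictionary starts out holding only the base cases.
memo = {1: 1, 2: 2}
result = fib_efficient(34, memo)
print(result, "computed in", calls["count"], "calls")
```

Calling `fib(34)` directly gives the same value but takes noticeably longer, which is the comparison suggested in the talk.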
1059 00:47:42,572 --> 00:47:44,780 And if you try it, you'll see the difference in speed 1060 00:47:44,780 --> 00:47:46,040 as you run this. 1061 00:47:46,040 --> 00:47:47,510 So dictionaries are valuable, not 1062 00:47:47,510 --> 00:47:49,340 only for just storing away data, they're 1063 00:47:49,340 --> 00:47:53,750 valuable on procedure calls when those intermediate values are 1064 00:47:53,750 --> 00:47:55,125 not going to change. 1065 00:47:55,125 --> 00:47:56,750 What you're going to see as we go along 1066 00:47:56,750 --> 00:47:59,390 is we're going to use exactly these ideas, using dictionaries 1067 00:47:59,390 --> 00:48:02,600 to capture information, but especially using recursion 1068 00:48:02,600 --> 00:48:04,490 to break bigger problems down into smaller 1069 00:48:04,490 --> 00:48:07,160 versions of the same problem, to use that as a tool 1070 00:48:07,160 --> 00:48:10,430 for solving what turn out to be really complex things. 1071 00:48:10,430 --> 00:48:13,410 And with that, we'll see you next time.