1 00:00:00,050 --> 00:00:01,770 The following content is provided 2 00:00:01,770 --> 00:00:04,010 under a Creative Commons license. 3 00:00:04,010 --> 00:00:06,860 Your support will help MIT OpenCourseWare continue 4 00:00:06,860 --> 00:00:10,720 to offer high quality educational resources for free. 5 00:00:10,720 --> 00:00:13,320 To make a donation or view additional materials 6 00:00:13,320 --> 00:00:17,207 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,207 --> 00:00:17,832 at ocw.mit.edu. 8 00:00:22,130 --> 00:00:25,860 PROFESSOR: Today's lecture is about a brand new data 9 00:00:25,860 --> 00:00:29,130 structure that you've probably seen before, 10 00:00:29,130 --> 00:00:32,490 and we've mentioned earlier in 6.006, 11 00:00:32,490 --> 00:00:34,940 called a binary search tree. 12 00:00:34,940 --> 00:00:37,080 We've talked about binary search. 13 00:00:37,080 --> 00:00:40,730 It's a fundamental divide and conquer paradigm. 14 00:00:40,730 --> 00:00:42,730 There's a data structure associated with it, 15 00:00:42,730 --> 00:00:45,880 called the BST, a binary search tree. 16 00:00:45,880 --> 00:00:49,310 And what I want to do is motivate this data structure 17 00:00:49,310 --> 00:00:51,390 using a problem. 18 00:00:51,390 --> 00:00:55,250 It's a bit of a toy problem, but certainly a problem 19 00:00:55,250 --> 00:01:01,070 that you could imagine exists in all sorts 20 00:01:01,070 --> 00:01:04,690 of scheduling problems. 21 00:01:04,690 --> 00:01:08,310 It's a part of a runway reservation system 22 00:01:08,310 --> 00:01:10,160 that you can imagine building. 
23 00:01:10,160 --> 00:01:13,910 And what I'll do is define this problem 24 00:01:13,910 --> 00:01:17,780 and talk about how we could possibly solve it with the data 25 00:01:17,780 --> 00:01:23,080 structures you've already seen-- so lists and arrays, heaps 26 00:01:23,080 --> 00:01:25,270 as well, which we saw last time-- 27 00:01:25,270 --> 00:01:32,540 and hopefully motivate you into the reason behind the existence 28 00:01:32,540 --> 00:01:35,015 of binary search trees, because they 29 00:01:35,015 --> 00:01:36,850 are kind of the perfect data structure 30 00:01:36,850 --> 00:01:40,080 for this particular problem. 31 00:01:40,080 --> 00:01:45,640 So let's dive into what the runway reservation system looks 32 00:01:45,640 --> 00:01:46,140 like. 33 00:01:49,910 --> 00:01:53,050 And it's your basic scheduling problem. 34 00:01:53,050 --> 00:02:00,030 We'll assume an airport with a single runway. 35 00:02:04,286 --> 00:02:06,180 Now Logan has six runways. 36 00:02:06,180 --> 00:02:09,699 But the moment there's any sort of weather you're down to one. 37 00:02:09,699 --> 00:02:13,180 And of course, there's lots of airports with a single runway. 38 00:02:13,180 --> 00:02:15,850 And we can imagine that this runway is pretty busy. 39 00:02:15,850 --> 00:02:19,670 There's obviously safety issues associated with landing planes, 40 00:02:19,670 --> 00:02:21,500 and planes taking off. 41 00:02:21,500 --> 00:02:23,230 And so there are constraints associated 42 00:02:23,230 --> 00:02:25,840 with the system, that have to be obeyed. 43 00:02:25,840 --> 00:02:28,530 And you have to build these constraints in-- and the checks 44 00:02:28,530 --> 00:02:31,780 for these constraints-- into your data structure. 45 00:02:31,780 --> 00:02:34,520 That's sort of the summary of the context. 46 00:02:37,170 --> 00:02:43,780 So reservations for future landings 47 00:02:43,780 --> 00:02:48,240 is really what this system is built for. 
48 00:02:48,240 --> 00:02:50,200 There's a notion of time. 49 00:02:50,200 --> 00:02:52,790 We'll assume that time is continuous. 50 00:02:52,790 --> 00:02:58,440 So it could be represented by a real variable, 51 00:02:58,440 --> 00:03:00,290 or a real quantity. 52 00:03:00,290 --> 00:03:14,460 And what we'd like to do is reserve requests for landings. 53 00:03:14,460 --> 00:03:20,340 And these are going to specify landing time. 54 00:03:20,340 --> 00:03:22,630 Each of them is going to specify a landing time. 55 00:03:22,630 --> 00:03:26,110 We call it t. 56 00:03:26,110 --> 00:03:33,210 And in particular, we're going to add t 57 00:03:33,210 --> 00:03:52,370 to the set R of landing times if no other landings are scheduled 58 00:03:52,370 --> 00:03:54,595 within k minutes. 59 00:03:57,790 --> 00:04:03,430 And k is a parameter that could vary. 60 00:04:03,430 --> 00:04:07,810 I mean, it could be statically set to 3 minutes, or maybe 4. 61 00:04:07,810 --> 00:04:10,420 You can imagine varying it dynamically 62 00:04:10,420 --> 00:04:12,910 depending on weather conditions, things like that. 63 00:04:16,230 --> 00:04:18,730 For most of the examples we'll talk about today, 64 00:04:18,730 --> 00:04:23,130 we'll assume k is 3 minutes, or something like that. 65 00:04:23,130 --> 00:04:28,700 So this is about adding to the data structure. 66 00:04:28,700 --> 00:04:31,290 And so an insert operation, if you will, 67 00:04:31,290 --> 00:04:34,470 that has a constraint associated with it that you need to check. 68 00:04:34,470 --> 00:04:37,060 And so you wouldn't insert if the constraint was violated. 69 00:04:37,060 --> 00:04:40,760 You would if the constraint was satisfied. 70 00:04:40,760 --> 00:04:45,030 And time, as I said, is something 71 00:04:45,030 --> 00:04:47,500 that is part of the system. 72 00:04:47,500 --> 00:04:49,300 It needs to be modeled. 73 00:04:49,300 --> 00:04:51,560 You have the current notion of time. 
74 00:04:51,560 --> 00:04:57,240 And every time you have a plane that's already landed, 75 00:04:57,240 --> 00:05:00,370 which means that you can essentially 76 00:05:00,370 --> 00:05:03,430 take this particular landing time away 77 00:05:03,430 --> 00:05:12,650 from the set R. So this removal, or delete-- we remove 78 00:05:12,650 --> 00:05:17,730 from set R, which is the set of landing times 79 00:05:17,730 --> 00:05:20,200 after the plane lands. 80 00:05:23,330 --> 00:05:25,892 So every once in a while, as time increments, 81 00:05:25,892 --> 00:05:27,850 you're going to be checking the data structure. 82 00:05:27,850 --> 00:05:31,440 And you can do this, maybe, every minute, every 30 seconds. 83 00:05:31,440 --> 00:05:32,810 That isn't really important. 84 00:05:32,810 --> 00:05:34,810 But you have to be able to remove from this data 85 00:05:34,810 --> 00:05:36,060 structure. 86 00:05:36,060 --> 00:05:38,510 So fairly straightforward data structure. 87 00:05:38,510 --> 00:05:43,190 It's a set, R. We don't quite know how to implement it yet. 88 00:05:43,190 --> 00:05:59,150 But we'd like to do all of these operations in order log n time, 89 00:05:59,150 --> 00:06:03,090 where n is the size of the set. 90 00:06:03,090 --> 00:06:04,410 All right? 91 00:06:04,410 --> 00:06:07,940 So any questions about that? 92 00:06:07,940 --> 00:06:09,970 Any questions about the definition 93 00:06:09,970 --> 00:06:14,790 of the problem before we move on? 94 00:06:14,790 --> 00:06:16,440 Are we good? 95 00:06:16,440 --> 00:06:17,820 OK. 96 00:06:17,820 --> 00:06:25,460 So let's look at a real straightforward example, 97 00:06:25,460 --> 00:06:32,720 and put this up here so you get a better sense of this. 98 00:06:32,720 --> 00:06:37,570 Let's say that, right now, we are at time 37. 99 00:06:37,570 --> 00:06:45,580 And the set R has 41.2, 49, and 53 in it. 100 00:06:45,580 --> 00:06:48,120 And that's time. 
101 00:06:48,120 --> 00:06:53,040 Now you may get a request for landing time 53. 102 00:06:53,040 --> 00:06:55,010 And-- I'm sorry. 103 00:06:55,010 --> 00:07:01,500 I want to call this 56.3-- 41.2, 49, and 56.3. 104 00:07:01,500 --> 00:07:04,630 You may get a request for landing time 53. 105 00:07:04,630 --> 00:07:06,740 And right now the time is 37. 106 00:07:06,740 --> 00:07:10,870 It's in the future, and you say OK because you've 107 00:07:10,870 --> 00:07:11,760 done the check. 108 00:07:11,760 --> 00:07:15,340 And let's assume that k equals 3. 109 00:07:15,340 --> 00:07:24,530 And 53 is four ahead of 49, and 3.3 before 56.3, so you're OK. 110 00:07:24,530 --> 00:07:27,520 44 is not allowed. 111 00:07:30,550 --> 00:07:33,610 It's too close to 41.2. 112 00:07:33,610 --> 00:07:38,290 And 20, just for completeness, is not 113 00:07:38,290 --> 00:07:39,475 allowed because it's past. 114 00:07:42,710 --> 00:07:43,955 Can't schedule in the past. 115 00:07:43,955 --> 00:07:45,330 I mean, it could be the next day. 116 00:07:45,330 --> 00:07:46,790 But then you wouldn't call it 20. 117 00:07:46,790 --> 00:07:50,110 Let's assume that time is a monotonically increasing 118 00:07:50,110 --> 00:07:51,830 function. 119 00:07:51,830 --> 00:07:54,020 You have a 64-bit number. 120 00:07:54,020 --> 00:07:56,600 It can go to the end of the world, or 2012, 121 00:07:56,600 --> 00:07:58,390 or wherever you want. 122 00:07:58,390 --> 00:08:01,390 So you can keep the number a bit smaller, 123 00:08:01,390 --> 00:08:05,390 and do a little constant factor optimization, I guess. 124 00:08:05,390 --> 00:08:08,430 So that's sort of the set up. 125 00:08:08,430 --> 00:08:12,650 And hopefully you get a sense of what the requirements are. 126 00:08:12,650 --> 00:08:16,510 And you guys know about a bunch of data structures already. 
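[Editor's sketch, not part of the lecture] The worked example above (current time 37, R = {41.2, 49, 56.3}, k = 3) can be written down as a brute-force check. The function name can_reserve and its parameters are made up for illustration, and it is an assumption here that two landings exactly k minutes apart are allowed. This is only the specification of the problem, not an efficient solution: it looks at every element of R.

```python
def can_reserve(R, t, now, k=3):
    """Return True if landing time t may be added to the set R.

    Assumption: a conflict means another landing strictly less than
    k minutes away; exactly k minutes apart is treated as allowed.
    """
    if t < now:
        # Cannot schedule a landing in the past.
        return False
    # Linear scan over every reserved landing time: order n work.
    return all(abs(t - r) >= k for r in R)


R = {41.2, 49, 56.3}
for t in (53, 44, 20):
    print(t, can_reserve(R, t, now=37))
# 53 is accepted; 44 is too close to 41.2; 20 is in the past.
```

Any of the data structures discussed next has to give the same answers as this scan, only faster.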
127 00:08:16,510 --> 00:08:19,300 And what I want to do is list each one of them, 128 00:08:19,300 --> 00:08:23,310 and essentially shoot them down with respect 129 00:08:23,310 --> 00:08:31,520 to not being able to make this efficiency requirement. 130 00:08:31,520 --> 00:08:34,830 And I'd like you guys to help me shoot them down. 131 00:08:34,830 --> 00:08:39,235 So let's talk about an easy one first. 132 00:08:42,110 --> 00:08:44,720 Let's say you have an unsorted list or an array corresponding 133 00:08:44,720 --> 00:08:49,670 to R. That's all you have. 134 00:08:49,670 --> 00:08:51,990 What's wrong with this data structure 135 00:08:51,990 --> 00:08:55,250 from an efficiency standpoint? 136 00:08:55,250 --> 00:08:55,994 Yeah. 137 00:08:55,994 --> 00:08:58,660 AUDIENCE: Pretty much everything you want to do to it is linear. 138 00:08:58,660 --> 00:08:59,993 PROFESSOR: That's exactly right. 139 00:08:59,993 --> 00:09:03,230 Pretty much everything you want to do to it is linear. 140 00:09:03,230 --> 00:09:08,290 And so you want to check the k minute check. 141 00:09:08,290 --> 00:09:14,360 You can certainly insert into it, and just add to it. 142 00:09:14,360 --> 00:09:17,070 So that part is not linear, that's constant time. 143 00:09:17,070 --> 00:09:20,250 But certainly, anything where you 144 00:09:20,250 --> 00:09:25,040 want to go check against other elements of the array, 145 00:09:25,040 --> 00:09:26,330 it's unsorted. 146 00:09:26,330 --> 00:09:28,840 You have no idea of where to find these elements. 147 00:09:28,840 --> 00:09:31,010 You have to scan through the entire array 148 00:09:31,010 --> 00:09:33,930 to check to see whether there's a landing time that's 149 00:09:33,930 --> 00:09:38,920 within k of the current time t that you're asking for. 150 00:09:38,920 --> 00:09:42,490 And that's going to take order n time. 151 00:09:42,490 --> 00:09:53,740 So you can insert in order 1 without a check. 
152 00:09:53,740 --> 00:10:03,530 But sadly, the check takes order n time. 153 00:10:03,530 --> 00:10:04,030 All right? 154 00:10:09,680 --> 00:10:15,460 Let's do something that is a little more plausible. 155 00:10:15,460 --> 00:10:18,950 Let's talk about a sorted array. 156 00:10:18,950 --> 00:10:22,720 So this is a little more subtle question. 157 00:10:22,720 --> 00:10:25,640 Let's talk about a sorted array. 158 00:10:25,640 --> 00:10:28,640 What happens with a sorted array? 159 00:10:28,640 --> 00:10:30,226 Someone? 160 00:10:30,226 --> 00:10:33,312 What can you do with a sorted array? 161 00:10:33,312 --> 00:10:33,812 Yeah. 162 00:10:33,812 --> 00:10:37,130 AUDIENCE: Do a binary search to find the [INAUDIBLE]. 163 00:10:37,130 --> 00:10:40,900 PROFESSOR: Binary search would find a bad insert. 164 00:10:40,900 --> 00:10:41,700 OK, good. 165 00:10:41,700 --> 00:10:43,950 So that's good. 166 00:10:43,950 --> 00:10:47,530 So if you have a sorted array, and just for argument's sake, 167 00:10:47,530 --> 00:10:54,740 it looks like 4, 20, 32, 37, 45. 168 00:10:54,740 --> 00:10:56,640 And it's increasing order. 169 00:10:56,640 --> 00:11:02,180 And if you get a particular time t, you can use binary search. 170 00:11:02,180 --> 00:11:07,300 And let's say, in particular, the time is, for example, 34. 171 00:11:07,300 --> 00:11:10,110 Then what you do is you go to the midpoint of the array, 172 00:11:10,110 --> 00:11:11,860 and maybe you just look at that. 173 00:11:11,860 --> 00:11:17,880 And you say oh, 34 is greater than 32. 174 00:11:17,880 --> 00:11:22,870 So I'm going to go check and figure out 175 00:11:22,870 --> 00:11:26,200 if I need to move to the left or the right. 176 00:11:26,200 --> 00:11:28,450 And since it's greater I'm going to move to the right. 
177 00:11:28,450 --> 00:11:31,880 And within logarithmic time, you'll 178 00:11:31,880 --> 00:11:37,206 find what we call the insertion point of the sorted array, 179 00:11:37,206 --> 00:11:40,540 where this 34 is supposed to sit. 180 00:11:40,540 --> 00:11:44,600 And you don't necessarily get to insert there. 181 00:11:44,600 --> 00:11:47,330 You need to look, once you've found the insertion point, 182 00:11:47,330 --> 00:11:50,440 to your left and to your right. 183 00:11:50,440 --> 00:11:53,340 And do the k minute check. 184 00:11:53,340 --> 00:11:57,260 So finish up the answer to the question, 185 00:11:57,260 --> 00:12:02,370 tell me how long it's going to take me to find the insertion 186 00:12:02,370 --> 00:12:06,100 point, how long it's going to take me to do the check, 187 00:12:06,100 --> 00:12:08,210 and how long it's going to take me to actually do 188 00:12:08,210 --> 00:12:09,802 the insertion. 189 00:12:09,802 --> 00:12:12,325 AUDIENCE: Log n in the search-- 190 00:12:12,325 --> 00:12:14,450 PROFESSOR: Log n for the search, to find the point. 191 00:12:14,450 --> 00:12:16,384 AUDIENCE: Constant for the comparison? 192 00:12:16,384 --> 00:12:17,320 PROFESSOR: Constant to the comparison. 193 00:12:17,320 --> 00:12:18,278 And then the last step? 194 00:12:18,278 --> 00:12:20,128 AUDIENCE: Do the research [INAUDIBLE]. 195 00:12:20,128 --> 00:12:23,249 PROFESSOR: Sorry, little louder. 196 00:12:23,249 --> 00:12:23,748 Sorry. 197 00:12:23,748 --> 00:12:25,150 AUDIENCE: The insertion is constant. 198 00:12:25,150 --> 00:12:26,524 PROFESSOR: Insertion is constant? 199 00:12:26,524 --> 00:12:27,330 Is that right? 200 00:12:27,330 --> 00:12:31,324 Do you people agree with him, that insertion is constant? 201 00:12:31,324 --> 00:12:33,729 AUDIENCE: You've got a maximum size up there, right? 202 00:12:33,729 --> 00:12:35,172 There must be a maximum. 
203 00:12:35,172 --> 00:12:37,100 [INAUDIBLE] 204 00:12:37,100 --> 00:12:39,900 PROFESSOR: No, the indices-- so right now the array 205 00:12:39,900 --> 00:12:41,680 has indices i. 206 00:12:41,680 --> 00:12:47,830 And if you start with 1, it's 1, 2, 3, 4, 5, et cetera. 207 00:12:47,830 --> 00:12:49,910 So what do you mean by insertion? 208 00:12:49,910 --> 00:12:52,749 Someone explain to me what-- yeah, go ahead. 209 00:12:52,749 --> 00:12:54,374 AUDIENCE: When you put something in you 210 00:12:54,374 --> 00:12:56,042 have to shift every element over. 211 00:12:56,042 --> 00:12:57,375 PROFESSOR: That's exactly right. 212 00:12:57,375 --> 00:12:58,309 That's exactly right. 213 00:12:58,309 --> 00:13:00,892 OK, good, that's great. 214 00:13:00,892 --> 00:13:02,600 I guess I should give you half a cushion. 215 00:13:02,600 --> 00:13:05,000 But I'll do the full one, right? 216 00:13:05,000 --> 00:13:05,940 And you get one, too. 217 00:13:09,010 --> 00:13:11,420 So the point here is this is pretty close. 218 00:13:11,420 --> 00:13:13,530 It's almost what we want. 219 00:13:13,530 --> 00:13:15,610 It's almost what we want. 220 00:13:15,610 --> 00:13:18,320 There's a little bit of a glitch here. 221 00:13:18,320 --> 00:13:20,210 We know about binary search. 222 00:13:20,210 --> 00:13:22,360 The binary search is going to allow us, 223 00:13:22,360 --> 00:13:25,950 if there's n elements here, to find the place-- 224 00:13:25,950 --> 00:13:29,920 it's going to be able to find-- and I'm 225 00:13:29,920 --> 00:13:39,970 going to be precise here-- the smallest i such that R of i 226 00:13:39,970 --> 00:13:44,910 is greater than or equal to t in order log n time. 227 00:13:47,520 --> 00:13:49,970 It's going to be able to do that. 228 00:13:49,970 --> 00:14:00,410 You're going to be able to compare R of i and R of i 229 00:14:00,410 --> 00:14:06,530 minus 1-- so the left and the right-- against t 230 00:14:06,530 --> 00:14:10,000 in order 1 time. 
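[Editor's sketch, not part of the lecture] The sorted-array steps just described (find the smallest i with R[i] >= t, then compare t against R[i] and R[i-1]) might look like this using Python's standard bisect module. The function name try_insert_sorted is hypothetical, indices are 0-based rather than starting at 1 as on the board, and treating "exactly k apart" as allowed is an assumption. The final list insert is included so the cost of each step is visible in one place.

```python
from bisect import bisect_left


def try_insert_sorted(R, t, k=3):
    """Insert t into the sorted list R if no neighbor is within k.

    Assumption: a gap of exactly k minutes is acceptable.
    """
    # Smallest i such that R[i] >= t, found by binary search: O(log n).
    i = bisect_left(R, t)
    # Constant-time check against the right neighbor R[i] ...
    if i < len(R) and R[i] - t < k:
        return False
    # ... and the left neighbor R[i - 1].
    if i > 0 and t - R[i - 1] < k:
        return False
    # Inserting into an array shifts every later element: O(n).
    R.insert(i, t)
    return True


R = [4, 20, 32, 37, 45]
print(try_insert_sorted(R, 34), R)  # rejected: 34 is only 2 after 32
print(try_insert_sorted(R, 41), R)  # accepted: 4 away from 37 and 45
```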
231 00:14:10,000 --> 00:14:24,105 But sadly, the actual insertion is going to require shifting. 232 00:14:29,800 --> 00:14:33,890 And that could take order n time, because it's an array. 233 00:14:38,160 --> 00:14:40,470 So that's the problem. 234 00:14:40,470 --> 00:14:47,240 Now you could imagine that you had a sorted list. 235 00:14:47,240 --> 00:14:50,610 And you could say, hey if I have a sorted list, 236 00:14:50,610 --> 00:14:57,430 then the list looks like this, and it's 237 00:14:57,430 --> 00:14:59,650 got a bunch of pointers in it. 238 00:14:59,650 --> 00:15:05,690 And if I've found the insertion point, 239 00:15:05,690 --> 00:15:13,370 then-- the list is nice, because you can insert something 240 00:15:13,370 --> 00:15:16,660 by moving pointers in constant time 241 00:15:16,660 --> 00:15:18,840 once you've found the insertion point. 242 00:15:18,840 --> 00:15:21,620 But what's the problem with the list? 243 00:15:21,620 --> 00:15:22,120 Yeah. 244 00:15:22,120 --> 00:15:24,580 AUDIENCE: You can't do binary search [INAUDIBLE]. 245 00:15:24,580 --> 00:15:26,930 PROFESSOR: Well you can't do binary search on a list. 246 00:15:26,930 --> 00:15:30,800 There's no notion of going to the n by 2 index 247 00:15:30,800 --> 00:15:36,480 and doing random access on a conventional list, right? 248 00:15:36,480 --> 00:15:39,610 So the list does one thing right, 249 00:15:39,610 --> 00:15:41,440 but doesn't do the other thing right. 250 00:15:41,440 --> 00:15:43,600 The array does a couple things right, 251 00:15:43,600 --> 00:15:45,440 but doesn't do the shifting right. 252 00:15:45,440 --> 00:15:49,430 And so you see why we've constructed this toy problem. 253 00:15:49,430 --> 00:15:52,670 It's to motivate the binary search tree data 254 00:15:52,670 --> 00:15:53,850 structure, obviously. 255 00:15:53,850 --> 00:15:59,040 But you're close, but not quite there. 256 00:15:59,040 --> 00:15:59,830 What about heaps? 
257 00:16:03,190 --> 00:16:06,510 We talked about heaps last time. 258 00:16:06,510 --> 00:16:12,350 What's the basic problem with the heap for this problem? 259 00:16:12,350 --> 00:16:14,510 The heaps are data arrays, but you 260 00:16:14,510 --> 00:16:15,970 can visualize them as trees. 261 00:16:15,970 --> 00:16:19,070 And obviously if we're talking about min heaps and max heaps. 262 00:16:19,070 --> 00:16:23,430 So in particular, what goes wrong with a min heap or a max 263 00:16:23,430 --> 00:16:26,940 heap for this problem? 264 00:16:26,940 --> 00:16:28,515 What takes a long time? 265 00:16:28,515 --> 00:16:29,015 Yeah. 266 00:16:31,660 --> 00:16:36,372 AUDIENCE: You have to scan every element, which [INAUDIBLE]. 267 00:16:36,372 --> 00:16:37,372 PROFESSOR: That's right. 268 00:16:37,372 --> 00:16:39,990 I mean, sadly, you know when we talk about min heaps or max 269 00:16:39,990 --> 00:16:46,460 heaps, they actually have a fairly weak invariant. 270 00:16:46,460 --> 00:16:49,510 It turns out that-- I'm previewing a bit here-- 271 00:16:49,510 --> 00:16:51,040 binary search trees are obviously 272 00:16:51,040 --> 00:16:53,920 similar to heaps in the sense that you visualize 273 00:16:53,920 --> 00:16:56,280 an array as a tree, in the case of a heap. 274 00:16:56,280 --> 00:16:58,360 And binary search trees are trees. 275 00:16:58,360 --> 00:17:02,170 But the invariant in a min heap or a max heap, 276 00:17:02,170 --> 00:17:04,069 is this kind of a weak invariant. 277 00:17:04,069 --> 00:17:12,740 It essentially says, look at the min element. 278 00:17:12,740 --> 00:17:15,670 And the min element has to be the root, 279 00:17:15,670 --> 00:17:18,190 so you can do that one operation pretty quickly. 
280 00:17:18,190 --> 00:17:21,770 But if you want to look for a k minute check, 281 00:17:21,770 --> 00:17:30,760 you want to see if there's an element in the heap that 282 00:17:30,760 --> 00:17:36,820 is less than or equal to k, or greater than or equal to k 283 00:17:36,820 --> 00:17:41,520 from t, this is going to take order n time. 284 00:17:41,520 --> 00:17:43,251 OK? 285 00:17:43,251 --> 00:17:43,750 Good. 286 00:17:46,390 --> 00:17:49,250 And finally, we haven't talked about dictionaries, 287 00:17:49,250 --> 00:17:52,040 but we will next week. 288 00:17:52,040 --> 00:17:54,530 Eric will talk about hash tables and dictionaries. 289 00:17:54,530 --> 00:17:56,800 And they have the same problem. 290 00:17:56,800 --> 00:17:59,690 So it's not like dictionaries are going to solve the problem, 291 00:17:59,690 --> 00:18:02,290 for those of you who know about hash tables and dictionaries. 292 00:18:02,290 --> 00:18:04,040 But you'll hear about them in some detail. 293 00:18:04,040 --> 00:18:06,380 They're very good at other things. 294 00:18:06,380 --> 00:18:10,360 So I don't want to say much more about that, because you're not 295 00:18:10,360 --> 00:18:12,340 supposed to know about dictionaries. 296 00:18:12,340 --> 00:18:13,798 Or at least we don't want to assume 297 00:18:13,798 --> 00:18:16,130 you do, though we have talked about them 298 00:18:16,130 --> 00:18:19,190 and alluded to dictionaries earlier. 299 00:18:19,190 --> 00:18:21,405 And so that's a story here. 300 00:18:21,405 --> 00:18:22,530 Yeah, back there, question. 301 00:18:22,530 --> 00:18:25,450 AUDIENCE: Yeah, can you explain why it's [INAUDIBLE] time? 302 00:18:25,450 --> 00:18:27,530 PROFESSOR: So what is a heap, right? 
303 00:18:27,530 --> 00:18:30,220 A heap essentially-- a min heap, for example, 304 00:18:30,220 --> 00:18:34,280 or we talked about max heaps last time, 305 00:18:34,280 --> 00:18:39,440 has the property that you have an element k, 306 00:18:39,440 --> 00:18:47,420 and you're going to look at, let's say it's 21. 307 00:18:47,420 --> 00:18:52,050 Let's do min heaps, so this has to be less than what's 308 00:18:52,050 --> 00:18:55,440 here, 23, and what there, maybe it's 309 00:18:55,440 --> 00:18:57,545 30, and so on and so forth. 310 00:18:57,545 --> 00:18:59,045 And you have a recursive definition. 311 00:19:04,220 --> 00:19:07,490 And when you insert into a min heap, typically what happens 312 00:19:07,490 --> 00:19:11,590 is suppose you wanted to insert, for argument's sake, 313 00:19:11,590 --> 00:19:16,280 I want to insert 25. 314 00:19:16,280 --> 00:19:19,170 I want to insert 25 into this. 315 00:19:19,170 --> 00:19:23,010 The insertion algorithm for a min heap 316 00:19:23,010 --> 00:19:25,780 typically adds to the end of the min heap. 317 00:19:25,780 --> 00:19:29,280 So what you do is you would add 25 to this. 318 00:19:29,280 --> 00:19:33,500 And let's say that you had something out here. 319 00:19:33,500 --> 00:19:34,630 So you'd add to it. 320 00:19:34,630 --> 00:19:38,370 And you'd start flipping things. 321 00:19:38,370 --> 00:19:43,080 And you could work with just this part of the array 322 00:19:43,080 --> 00:19:45,140 to insert 25 in here. 323 00:19:45,140 --> 00:19:48,660 And you'd be able to satisfy the invariant of the min heap. 324 00:19:48,660 --> 00:19:51,750 And you'd get a legitimate min heap. 325 00:19:51,750 --> 00:19:56,110 But you'd never check the left part of it, which is 23. 
326 00:19:56,110 --> 00:20:00,360 So it's quite possible-- and this is a good example-- 327 00:20:00,360 --> 00:20:04,420 that your basic insertion algorithm, which is essentially 328 00:20:04,420 --> 00:20:07,640 a version of max-heapify, or min-heapify, 329 00:20:07,640 --> 00:20:09,840 would simply insert at the end, and keep 330 00:20:09,840 --> 00:20:12,000 flipping until you get the min heap property, 331 00:20:12,000 --> 00:20:15,110 would be unable to check for the k minute check 332 00:20:15,110 --> 00:20:16,430 during the insertion. 333 00:20:16,430 --> 00:20:18,860 But what you'd have to do is to go look elsewhere. 334 00:20:18,860 --> 00:20:20,827 That min-heapify would never look at-- 335 00:20:20,827 --> 00:20:23,035 or the insert algorithm would never look at-- and that 336 00:20:23,035 --> 00:20:24,900 would require order n time. 337 00:20:24,900 --> 00:20:25,460 All right? 338 00:20:25,460 --> 00:20:26,293 AUDIENCE: Thank you. 339 00:20:28,890 --> 00:20:31,500 PROFESSOR: So that's the story for the min heap. 340 00:20:31,500 --> 00:20:32,730 Thanks for the question. 341 00:20:32,730 --> 00:20:35,360 And it's similar for dictionaries, as I said. 342 00:20:35,360 --> 00:20:37,150 And so we're stuck. 343 00:20:37,150 --> 00:20:42,960 We have no data structure yet that can do all of the things 344 00:20:42,960 --> 00:20:48,910 that I put up on the board to the left, in order log n time. 345 00:20:48,910 --> 00:20:52,770 And as you can see, the sorted array got pretty close. 346 00:20:52,770 --> 00:20:58,350 And so if you could just solve this problem, 347 00:20:58,350 --> 00:21:04,100 if you could do fast insertion-- and by fast I mean order log n 348 00:21:04,100 --> 00:21:14,480 time-- into a sorted array, we'd be in business. 349 00:21:14,480 --> 00:21:18,200 So that's what we'd like to do with binary search trees. 
350 00:21:18,200 --> 00:21:20,340 Binary search trees, as you can imagine, 351 00:21:20,340 --> 00:21:22,080 enable binary search. 352 00:21:22,080 --> 00:21:27,320 But the sorted arrays don't allow fast insertion, 353 00:21:27,320 --> 00:21:28,420 but BSTs do. 354 00:21:30,919 --> 00:21:31,960 So let me introduce BSTs. 355 00:21:38,500 --> 00:21:40,130 As with any data structure, there's 356 00:21:40,130 --> 00:21:43,820 a nice invariant associated with BSTs. 357 00:21:43,820 --> 00:21:49,080 The invariant is stronger than the heap invariant. 358 00:21:49,080 --> 00:21:52,570 And actually, that makes them a different data structure, not 359 00:21:52,570 --> 00:21:54,520 necessarily a better data structure. 360 00:21:54,520 --> 00:21:57,957 And I'll say why, but different. 361 00:21:57,957 --> 00:21:59,290 For this problem they're better. 362 00:22:02,130 --> 00:22:04,620 So one example of a binary search tree looks like this. 363 00:22:14,000 --> 00:22:19,600 And as a binary tree you have a node, and we call it x. 364 00:22:19,600 --> 00:22:22,880 Each of the nodes has a key of x. 365 00:22:22,880 --> 00:22:27,290 So 30 is the key for this node, 17 for that one, et cetera. 366 00:22:27,290 --> 00:22:29,575 Unlike in a heap, your data structure 367 00:22:29,575 --> 00:22:31,690 is a little more complicated. 368 00:22:31,690 --> 00:22:33,870 The heap is simply an array, and you 369 00:22:33,870 --> 00:22:36,524 happen to visualize it as a tree. 370 00:22:36,524 --> 00:22:37,940 The binary search tree is actually 371 00:22:37,940 --> 00:22:44,040 a tree that has pointers, unlike a heap. 372 00:22:44,040 --> 00:22:46,810 So it's a more complicated data structure. 373 00:22:46,810 --> 00:22:50,240 You need a few more bytes for every node of the binary search 374 00:22:50,240 --> 00:22:52,170 tree, as opposed to the heap, which 375 00:22:52,170 --> 00:22:55,440 is simply an array element. 376 00:22:55,440 --> 00:22:59,670 And the pointers are parent of x. 
377 00:22:59,670 --> 00:23:03,860 I haven't bothered showing the arrows here, 378 00:23:03,860 --> 00:23:07,150 because you could be going upwards or backwards. 379 00:23:07,150 --> 00:23:09,080 And you could imagine that you actually 380 00:23:09,080 --> 00:23:11,930 have a parent pointer that goes up this way, 381 00:23:11,930 --> 00:23:14,600 and you have a child pointer that goes this way. 382 00:23:14,600 --> 00:23:17,790 So there's really, potentially, three pointers 383 00:23:17,790 --> 00:23:22,220 for each node, the parent, the left child, 384 00:23:22,220 --> 00:23:24,020 and the right child. 385 00:23:24,020 --> 00:23:26,580 So pretty straightforward. 386 00:23:26,580 --> 00:23:28,640 That's the data structure in terms 387 00:23:28,640 --> 00:23:33,420 of what it needs to have so you can operate on it. 388 00:23:33,420 --> 00:23:41,440 And there's an invariant for a BST. 389 00:23:41,440 --> 00:23:48,420 What makes a BST is that you have 390 00:23:48,420 --> 00:23:53,700 an ordering of the key values that 391 00:23:53,700 --> 00:24:05,620 satisfy the invariant that for all nodes x if y is 392 00:24:05,620 --> 00:24:18,780 in the left subtree of x, we have-- 393 00:24:18,780 --> 00:24:23,130 if it's in the left subtree then key of y 394 00:24:23,130 --> 00:24:27,950 is less than or equal to key of x. 395 00:24:27,950 --> 00:24:35,870 And if y is in the right subtree we 396 00:24:35,870 --> 00:24:42,070 have key of y is greater than or equal to key of x. 397 00:24:42,070 --> 00:24:44,540 So if we're talking about trees here, 398 00:24:44,540 --> 00:24:47,110 subtrees here, everything underneath-- 399 00:24:47,110 --> 00:24:51,170 and that's the stronger part of the invariant in the BST, 400 00:24:51,170 --> 00:24:54,700 versus in the heap we were just talking about the children. 
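[Editor's sketch, not part of the lecture] The node layout (key, parent, left child, right child) and the invariant stated above can be sketched as follows. The names Node, link and is_bst are hypothetical. The example tree uses the keys 30, 17, 14 and 20 from the board; the right child 40 is an assumption, since the transcript only says the values to the right of the root are greater than 30. The second tree shows why the invariant is about whole subtrees and not only children: 35 is a valid right child of 17 locally, but it sits in the left subtree of 30.

```python
class Node:
    """One BST node: a key plus parent, left and right pointers."""

    def __init__(self, key):
        self.key = key
        self.parent = None
        self.left = None
        self.right = None


def link(parent, left=None, right=None):
    """Attach children to parent and set their parent pointers."""
    parent.left, parent.right = left, right
    for child in (left, right):
        if child is not None:
            child.parent = parent
    return parent


def is_bst(x, lo=float("-inf"), hi=float("inf")):
    """Check that every key in x's subtree lies in [lo, hi] and that
    left-subtree keys are <= key(x) <= right-subtree keys, recursively."""
    if x is None:
        return True
    if not (lo <= x.key <= hi):
        return False
    return is_bst(x.left, lo, x.key) and is_bst(x.right, x.key, hi)


# Tree from the board; the key 40 is assumed for illustration.
root = link(Node(30), link(Node(17), Node(14), Node(20)), Node(40))
print(is_bst(root))

# 35 is greater than its parent 17 but lies in the left subtree of 30.
bad = link(Node(30), link(Node(17), Node(14), Node(35)), Node(40))
print(is_bst(bad))
```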
401 00:24:54,700 --> 00:24:58,090 And so you look at this BST, it is a BST 402 00:24:58,090 --> 00:25:01,430 because if I look to the right, from the root 403 00:25:01,430 --> 00:25:04,470 I only see values that are greater than 30. 404 00:25:04,470 --> 00:25:08,150 And if I look to the left, in the entire subtree, 405 00:25:08,150 --> 00:25:13,890 all the way down I only see values that are less than 30. 406 00:25:13,890 --> 00:25:20,110 And that has to be true for any intermediate node in the tree. 407 00:25:20,110 --> 00:25:23,940 And the only other nontrivial node here is 17. 408 00:25:23,940 --> 00:25:28,830 And you see that 14 is less than 17, and 20 is greater than 17. 409 00:25:28,830 --> 00:25:30,000 OK? 410 00:25:30,000 --> 00:25:32,317 So that's the BST. 411 00:25:32,317 --> 00:25:33,400 That's the data structure. 412 00:25:33,400 --> 00:25:34,910 This is the invariant. 413 00:25:34,910 --> 00:25:40,890 So let's look at why BSTs are a possibility for solving 414 00:25:40,890 --> 00:25:44,870 our runway reservation problem. 415 00:25:44,870 --> 00:25:50,190 And what I'll do is I'll do the insert. 416 00:25:54,790 --> 00:25:58,970 So let's start with the nil set of elements, 417 00:25:58,970 --> 00:26:04,060 or null set of elements, R. And let's start inserting. 418 00:26:08,370 --> 00:26:13,570 So I insert 49. 419 00:26:13,570 --> 00:26:19,840 And all I do is make a node that has a key value of 49. 420 00:26:19,840 --> 00:26:22,110 This one is easy. 421 00:26:22,110 --> 00:26:23,595 Next insert, 79. 422 00:26:27,090 --> 00:26:32,600 And what happens here is I have to look at 49, 423 00:26:32,600 --> 00:26:34,165 and I compare 79 to 49. 424 00:26:34,165 --> 00:26:37,780 And because 79 is greater than 49 I go to the right 425 00:26:37,780 --> 00:26:45,180 and I attach 79 to the right child of 49. 426 00:26:45,180 --> 00:26:46,675 Then I want to insert 46. 
427 00:26:49,500 --> 00:26:52,190 And when I want to insert 46 I look at this, 428 00:26:52,190 --> 00:26:54,070 I compare 49 and 46. 429 00:26:54,070 --> 00:26:59,390 46 is less, so I go to the left side and I put 46 in there. 430 00:26:59,390 --> 00:27:04,560 Next, let's say I want to insert 41. 431 00:27:04,560 --> 00:27:09,480 So far I haven't really talked about the k minute checks. 432 00:27:09,480 --> 00:27:11,700 And you could imagine that they're being done. 433 00:27:11,700 --> 00:27:14,080 I'll show you exactly, or talk about exactly how they're 434 00:27:14,080 --> 00:27:15,540 done in a second. 435 00:27:15,540 --> 00:27:17,410 It's not that hard. 436 00:27:17,410 --> 00:27:21,160 But let me go ahead and do one more. 437 00:27:21,160 --> 00:27:25,940 For 41, 41 is less than 49, so I go left. 438 00:27:25,940 --> 00:27:30,210 41 is less than 46, so I go left and attach it 439 00:27:30,210 --> 00:27:31,155 to the left child. 440 00:27:31,155 --> 00:27:31,790 All right? 441 00:27:31,790 --> 00:27:33,650 So that's what I have right now. 442 00:27:33,650 --> 00:27:36,510 Now let's talk about the k minute check. 443 00:27:36,510 --> 00:27:39,100 It's good to talk about the k minute check 444 00:27:39,100 --> 00:27:41,750 when there's actually a violation. 445 00:27:41,750 --> 00:27:45,220 And let's assume that k equals 3 here. 446 00:27:45,220 --> 00:27:47,110 And so, same thing here. 447 00:27:47,110 --> 00:27:49,780 You're essentially doing binary search here. 448 00:27:49,780 --> 00:27:52,502 And you're doing the checks as you're doing the binary search. 449 00:27:52,502 --> 00:27:53,960 So what you're going to be doing is 450 00:27:53,960 --> 00:27:58,580 you're going to check that-- you're going to compare 42 451 00:27:58,580 --> 00:28:01,930 with 49, with the k minute check. 452 00:28:01,930 --> 00:28:03,780 And you realize they're 7 apart. 453 00:28:03,780 --> 00:28:04,960 So that's OK. 
454 00:28:04,960 --> 00:28:09,010 And 42 is less than 49, so you go left. 455 00:28:09,010 --> 00:28:12,370 And then you compare 42 with 46. 456 00:28:12,370 --> 00:28:16,770 And again, it's less than 46, but it's more than k, more than 3 457 00:28:16,770 --> 00:28:18,030 away from 46. 458 00:28:18,030 --> 00:28:18,950 So that's cool. 459 00:28:18,950 --> 00:28:20,580 And you go left. 460 00:28:20,580 --> 00:28:22,350 And then you get to 41. 461 00:28:22,350 --> 00:28:25,180 And you compare 42 with 41. 462 00:28:25,180 --> 00:28:26,610 In this case it's greater. 463 00:28:26,610 --> 00:28:30,580 But it's not k more than it. 464 00:28:30,580 --> 00:28:34,510 And so that means that if you didn't have the check, 465 00:28:34,510 --> 00:28:37,930 you would be putting 42 in here. 466 00:28:37,930 --> 00:28:40,750 But because you have the check, you fail. 467 00:28:40,750 --> 00:28:43,580 And you say, look, I mean this violates 468 00:28:43,580 --> 00:28:46,485 the safety property, violates the check I need to do. 469 00:28:46,485 --> 00:28:48,110 And therefore I'm not going to insert-- 470 00:28:48,110 --> 00:28:50,750 I'm not going to reserve a request for you. 471 00:28:50,750 --> 00:28:51,520 All right? 472 00:28:51,520 --> 00:28:55,340 So what's happened here is it's basically a sorted array, 473 00:28:55,340 --> 00:28:57,850 except that you added a bunch of pointers 474 00:28:57,850 --> 00:28:59,300 associated with the tree. 475 00:28:59,300 --> 00:29:03,520 And so it's somewhere between a sorted list and a sorted array. 476 00:29:03,520 --> 00:29:05,500 And it does exactly the right thing 477 00:29:05,500 --> 00:29:09,310 with respect to being able to insert. 478 00:29:09,310 --> 00:29:11,360 Once you've found the place to insert, 479 00:29:11,360 --> 00:29:14,370 it's merely attaching this particular new node 480 00:29:14,370 --> 00:29:17,150 with its appropriate key to the pointer. 481 00:29:17,150 --> 00:29:19,200 All right?
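The insert with the k minute check walked through above can be sketched in Python roughly as follows. This is a minimal sketch, not code from the lecture: the names `Node` and `insert` are illustrative, and it assumes that two landings exactly k minutes apart are allowed (the example tree holds both 46 and 49 with k equal to 3). Checking only the nodes on the search path is enough, because the closest smaller and closest larger keys both lie on that path.

```python
class Node:
    """One BST node; `key` is a landing time in minutes (illustrative)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, t, k):
    """Insert time t unless a key on the search path is less than k away.

    Returns (root, inserted). Follows one root-to-leaf path, so O(h).
    Assumption: a gap of exactly k minutes is acceptable.
    """
    if root is None:
        return Node(t), True
    node = root
    while True:
        if abs(node.key - t) < k:      # the k minute check fails here
            return root, False
        if t < node.key:
            if node.left is None:
                node.left = Node(t)    # found the place: attach on the left
                return root, True
            node = node.left
        else:
            if node.right is None:
                node.right = Node(t)   # found the place: attach on the right
                return root, True
            node = node.right


root = None
for t in [49, 79, 46, 41]:
    root, ok = insert(root, t, k=3)

print(insert(root, 42, k=3)[1])  # False: 42 is 1 minute from 41
print(insert(root, 45, k=3)[1])  # False: 45 is 1 minute from 46
```

A rejected request returns before anything is attached, so the tree is left unchanged.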
482 00:29:19,200 --> 00:29:28,390 So what's happened here is that if h 483 00:29:28,390 --> 00:29:37,970 is the height of the tree then insertion 484 00:29:37,970 --> 00:29:44,510 with or without the check is done in order h time. 485 00:29:48,330 --> 00:29:52,044 And that's what BSTs are good for. 486 00:29:52,044 --> 00:29:52,710 People buy that? 487 00:29:52,710 --> 00:29:55,700 Any questions about how the k minute check proceeded? 488 00:29:55,700 --> 00:29:56,441 Yeah, question. 489 00:29:56,441 --> 00:29:57,732 AUDIENCE: So, what's it called? 490 00:29:57,732 --> 00:29:58,460 The what check? 491 00:29:58,460 --> 00:30:00,022 PROFESSOR: The k minute check. 492 00:30:00,022 --> 00:30:04,410 Sorry, the k was 3 minutes. 493 00:30:04,410 --> 00:30:07,790 I had this thing over here, add t to the set R 494 00:30:07,790 --> 00:30:12,110 if no other landings are scheduled within k minutes. 495 00:30:12,110 --> 00:30:13,460 So k was just a number. 496 00:30:13,460 --> 00:30:17,150 I want it to be a parameter because it 497 00:30:17,150 --> 00:30:19,130 doesn't matter what k is. 498 00:30:19,130 --> 00:30:22,870 As long as you know what it is when you do the binary search, 499 00:30:22,870 --> 00:30:26,070 you can add that in as an argument to your insert, 500 00:30:26,070 --> 00:30:27,092 and do the check. 501 00:30:27,092 --> 00:30:28,700 AUDIENCE: OK. 502 00:30:28,700 --> 00:30:33,080 PROFESSOR: So in this case, I set k to be 3 out here. 503 00:30:33,080 --> 00:30:35,910 And I was doing a check to see whether 504 00:30:35,910 --> 00:30:40,830 any elements in the BST already, any nodes, 505 00:30:40,830 --> 00:30:45,920 had keys that were within 3 minutes-- 506 00:30:45,920 --> 00:30:48,900 because I fixed k to be 3-- of the actual time 507 00:30:48,900 --> 00:30:50,590 that I was trying to insert. 508 00:30:50,590 --> 00:30:51,090 All right? 509 00:30:51,090 --> 00:30:53,254 AUDIENCE: So there's no way [INAUDIBLE].
510 00:30:53,254 --> 00:30:54,795 PROFESSOR: I'm sorry, there's no way? 511 00:30:54,795 --> 00:30:55,542 AUDIENCE: There's no way you could 512 00:30:55,542 --> 00:30:57,160 insert the 42 into the tree then? 513 00:30:57,160 --> 00:31:00,480 PROFESSOR: Well, the basic insertion 514 00:31:00,480 --> 00:31:03,535 method into a binary search tree doesn't have any constraints. 515 00:31:06,040 --> 00:31:10,650 But you can certainly augment the insertion method 516 00:31:10,650 --> 00:31:14,720 without changing the efficiency of the insertion method. 517 00:31:14,720 --> 00:31:16,710 So let's say that all you wanted to do 518 00:31:16,710 --> 00:31:19,780 was insert into a binary search tree, 519 00:31:19,780 --> 00:31:22,710 and it had nothing to do with the runway reservation. 520 00:31:22,710 --> 00:31:25,400 Then you would just insert the way I described to you. 521 00:31:25,400 --> 00:31:27,080 The beauty of the binary search tree 522 00:31:27,080 --> 00:31:31,150 is that while you're finding the place to insert, 523 00:31:31,150 --> 00:31:33,487 you can do these checks-- the k minute checks. 524 00:31:33,487 --> 00:31:34,570 Yeah, question back there. 525 00:31:34,570 --> 00:31:36,729 AUDIENCE: What about 45? 526 00:31:36,729 --> 00:31:37,770 PROFESSOR: What about 45? 527 00:31:37,770 --> 00:31:43,190 So this is after-- we haven't inserted 42 528 00:31:43,190 --> 00:31:45,630 because it violated the check. 529 00:31:45,630 --> 00:31:47,520 So when you do 45, then what happens 530 00:31:47,520 --> 00:31:51,100 is you see that 45 is less than 49 531 00:31:51,100 --> 00:31:55,510 and you pass, because you're more than 3 minutes away. 532 00:31:55,510 --> 00:31:57,230 We'll stick with that example. 533 00:31:57,230 --> 00:31:58,930 And then you get here and then you 534 00:31:58,930 --> 00:32:04,780 see that 45 is less than 46, and you'd fail right here.
535 00:32:04,780 --> 00:32:07,390 You would fail right here if you were doing the check, 536 00:32:07,390 --> 00:32:11,220 because 45 is not 3 away from 46. 537 00:32:11,220 --> 00:32:13,390 All right? 538 00:32:13,390 --> 00:32:16,580 So that's the story. 539 00:32:16,580 --> 00:32:19,130 And so if you have h being the height of the tree, 540 00:32:19,130 --> 00:32:21,670 as you can see you're just following a path. 541 00:32:21,670 --> 00:32:24,150 And depending on what the height is 542 00:32:24,150 --> 00:32:26,450 you're going to do that many operations, 543 00:32:26,450 --> 00:32:28,320 times some constant factor. 544 00:32:28,320 --> 00:32:31,010 And so you can say that this is order h time. 545 00:32:31,010 --> 00:32:32,090 All right? 546 00:32:32,090 --> 00:32:35,210 Any other questions? 547 00:32:35,210 --> 00:32:36,552 Yeah, question back there. 548 00:32:36,552 --> 00:32:38,218 AUDIENCE: In a normal array [INAUDIBLE]. 549 00:32:44,550 --> 00:32:46,410 PROFESSOR: Well, it's up to you. 550 00:32:46,410 --> 00:32:50,680 In a conventional binary search tree, or the vanilla binary 551 00:32:50,680 --> 00:32:52,560 search tree, typically what you're doing 552 00:32:52,560 --> 00:32:55,100 is you're doing either find or insert. 553 00:32:55,100 --> 00:32:57,050 And so what that means is that you would just 554 00:32:57,050 --> 00:33:00,870 return the pointer associated with that element. 555 00:33:00,870 --> 00:33:04,710 So if you're looking for find 46, for example, on the tree 556 00:33:04,710 --> 00:33:08,656 that I have out there, typically 46 is just the key value. 557 00:33:08,656 --> 00:33:10,530 And there may be a record associated with it. 558 00:33:10,530 --> 00:33:12,279 And you would get a pointer to that record 559 00:33:12,279 --> 00:33:14,200 because it's already in there. 560 00:33:14,200 --> 00:33:17,890 At that point you can say I want to override. 561 00:33:17,890 --> 00:33:21,420 Or if you want, you could have duplicate values. 
562 00:33:21,420 --> 00:33:23,890 You could have this, what's called a multiset. 563 00:33:23,890 --> 00:33:26,770 A multiset is a set that has duplicate elements. 564 00:33:26,770 --> 00:33:29,400 In that case, you would need a little more sophistication 565 00:33:29,400 --> 00:33:33,470 to differentiate between two elements that 566 00:33:33,470 --> 00:33:35,870 have the same key values. 567 00:33:35,870 --> 00:33:38,170 So you'd have to call it 46a and 46b. 568 00:33:38,170 --> 00:33:41,880 And you'd have to have some way of differentiating. 569 00:33:41,880 --> 00:33:43,420 Any other questions? 570 00:33:43,420 --> 00:33:44,258 Yeah. 571 00:33:44,258 --> 00:33:45,692 AUDIENCE: Wouldn't it be a problem 572 00:33:45,692 --> 00:33:47,604 if the tree's not balanced? 573 00:33:47,604 --> 00:33:50,770 PROFESSOR: Ah, great question. 574 00:33:50,770 --> 00:33:55,930 Yes, stay tuned. 575 00:33:55,930 --> 00:33:57,990 So I was careful, right? 576 00:34:00,369 --> 00:34:01,910 I guess I kind of alluded to the fact 577 00:34:01,910 --> 00:34:03,870 that we'd solved the runway reservation system. 578 00:34:03,870 --> 00:34:06,540 Did I actually say that we'd solved the problem? 579 00:34:06,540 --> 00:34:08,080 Did I say we had solved the problem? 580 00:34:08,080 --> 00:34:10,610 OK, so I did not lie. 581 00:34:10,610 --> 00:34:12,150 I did not lie. 582 00:34:12,150 --> 00:34:15,730 I said that the height of the tree was h. 583 00:34:15,730 --> 00:34:18,960 And I said that this was accomplished in order h time, 584 00:34:18,960 --> 00:34:19,760 right? 585 00:34:19,760 --> 00:34:23,730 Which is not quite what I want, which is really your question. 586 00:34:23,730 --> 00:34:25,300 So we'll get to that. 587 00:34:25,300 --> 00:34:28,080 So we're not quite done yet. 588 00:34:28,080 --> 00:34:30,719 But before we do that, it turns out 589 00:34:30,719 --> 00:34:34,909 that today's lecture is really part one of two. 
590 00:34:34,909 --> 00:34:40,080 You'll get a really good sense of BST operations 591 00:34:40,080 --> 00:34:41,699 in today's lecture. 592 00:34:41,699 --> 00:34:44,520 But there's going to be a few things that-- we can't cover 593 00:34:44,520 --> 00:34:47,090 all of double 06 in the lecture, right? 594 00:34:47,090 --> 00:34:50,730 We'd like to, and let you off for the entire fall, 595 00:34:50,730 --> 00:34:52,940 but that's not the way it works, all right? 596 00:34:52,940 --> 00:34:54,360 So it's a great question. 597 00:34:54,360 --> 00:34:56,959 I'll answer it towards the end. 598 00:34:56,959 --> 00:34:58,500 I just wanted to say a little bit 599 00:34:58,500 --> 00:35:01,400 about other operations. 600 00:35:01,400 --> 00:35:05,110 There's many operations that you can do on a binary search 601 00:35:05,110 --> 00:35:10,330 tree, that can be done in order h time, 602 00:35:10,330 --> 00:35:13,040 and some even in constant time. 603 00:35:13,040 --> 00:35:14,940 And I'll put these in the notes. 604 00:35:14,940 --> 00:35:16,970 Some of these are fairly straightforward. 605 00:35:16,970 --> 00:35:22,750 Find min can be done in a heap, in a min heap. 606 00:35:22,750 --> 00:35:25,450 If you want to find the minimum value, it's constant time. 607 00:35:25,450 --> 00:35:27,730 You just return the root. 608 00:35:27,730 --> 00:35:32,040 In the case of a binary search tree, how do you find the min? 609 00:35:32,040 --> 00:35:34,134 Someone? 610 00:35:34,134 --> 00:35:34,800 Worth a cushion. 611 00:35:34,800 --> 00:35:35,335 Yep. 612 00:35:35,335 --> 00:35:36,710 AUDIENCE: Keep going to the left? 613 00:35:36,710 --> 00:35:37,710 PROFESSOR: Keep going to the left. 614 00:35:37,710 --> 00:35:38,660 And how do you find the max? 615 00:35:38,660 --> 00:35:39,445 AUDIENCE: [INAUDIBLE]. 616 00:35:39,445 --> 00:35:40,903 PROFESSOR: Keep going to the right. 617 00:35:40,903 --> 00:35:42,640 All right great, thank you.
618 00:35:42,640 --> 00:35:44,646 And finally, what complexity is that? 619 00:35:44,646 --> 00:35:46,812 I sort of gave it away, but I want to hear it from you. 620 00:35:46,812 --> 00:35:47,997 AUDIENCE: [INAUDIBLE]. 621 00:35:47,997 --> 00:35:48,580 PROFESSOR: Hm? 622 00:35:48,580 --> 00:35:49,470 AUDIENCE: It's the height. 623 00:35:49,470 --> 00:35:50,970 PROFESSOR: It's the height, order h. 624 00:35:50,970 --> 00:35:52,760 All right, it's order h complexity. 625 00:35:52,760 --> 00:35:57,610 Go to the left until you hit a leaf, 626 00:35:57,610 --> 00:36:04,470 and that's order h complexity. 627 00:36:04,470 --> 00:36:05,917 Same thing for max. 628 00:36:05,917 --> 00:36:07,500 And then you can do a bunch of things. 629 00:36:07,500 --> 00:36:10,030 I'll put these in the notes. 630 00:36:10,030 --> 00:36:13,180 You can find things like next larger 631 00:36:13,180 --> 00:36:18,940 x, which is the next larger value beyond x. 632 00:36:18,940 --> 00:36:22,530 And you look at the key for x and you say, for example, 633 00:36:22,530 --> 00:36:25,840 if you put 46 in there, what's the next thing that's larger 634 00:36:25,840 --> 00:36:28,220 than that? 635 00:36:28,220 --> 00:36:31,930 In this tree here, it's 49. 636 00:36:31,930 --> 00:36:36,270 But that's something which was trivially done in this example. 637 00:36:36,270 --> 00:36:40,550 But in general you can do this in order h time as well. 638 00:36:40,550 --> 00:36:42,280 And you can see the pseudocode. 639 00:36:42,280 --> 00:36:46,130 And we'll probably cover that in section tomorrow. 640 00:36:46,130 --> 00:36:49,130 What I want to do today, for the rest of the time I have left, 641 00:36:49,130 --> 00:36:53,720 is actually talk about augmented binary search trees, which 642 00:36:53,720 --> 00:36:58,870 are things that can do more and have more data in them 643 00:36:58,870 --> 00:37:01,680 than just these pointers.
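The find min, find max, and next larger operations mentioned above might look something like this in Python. This is only a sketch under assumptions: the names `Node`, `find_min`, `find_max`, and `next_larger` are illustrative, and `next_larger` here searches down from the root by key, which may differ from the pseudocode in the course notes. Each function follows a single downward path, so each is order h.

```python
class Node:
    """Plain BST node (illustrative); children default to empty."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def find_min(node):
    """Keep going left until there is no left child."""
    while node.left is not None:
        node = node.left
    return node.key


def find_max(node):
    """Keep going right until there is no right child."""
    while node.right is not None:
        node = node.right
    return node.key


def next_larger(root, x):
    """Smallest key strictly greater than x, or None if there is none."""
    best = None
    node = root
    while node is not None:
        if node.key > x:
            best = node.key      # candidate; a smaller one may be to the left
            node = node.left
        else:
            node = node.right    # everything here and to the left is <= x
    return best


# The tree built earlier in the lecture: 49 at the root, 46 and 41 down
# the left side, 79 on the right.
tree = Node(49, Node(46, Node(41)), Node(79))
print(find_min(tree))         # 41
print(find_max(tree))         # 79
print(next_larger(tree, 46))  # 49
```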
644 00:37:01,680 --> 00:37:03,911 And that's actually something which 645 00:37:03,911 --> 00:37:06,410 should give you a sense of the richness of the binary search 646 00:37:06,410 --> 00:37:09,517 tree structure, this notion of augmentation. 647 00:37:09,517 --> 00:37:11,600 And those of you, again, who have taken double 05, 648 00:37:11,600 --> 00:37:13,760 you know about design amendments. 649 00:37:13,760 --> 00:37:16,540 And so specifications never stay the same. 650 00:37:16,540 --> 00:37:18,330 I mean, you're working for someone, 651 00:37:18,330 --> 00:37:21,030 and they never really tell you what they want. 652 00:37:21,030 --> 00:37:24,250 They might, but they change their mind. 653 00:37:24,250 --> 00:37:26,700 So in this case, we're going to change our mind. 654 00:37:26,700 --> 00:37:28,970 And so we've done this to the extent 655 00:37:28,970 --> 00:37:32,330 that we can cover all of these in order h time. 656 00:37:32,330 --> 00:37:35,110 And let's say that now the problem specification 657 00:37:35,110 --> 00:37:36,810 changed on us. 658 00:37:36,810 --> 00:37:38,570 There's an additional requirement 659 00:37:38,570 --> 00:37:42,250 that we're asked to solve. 660 00:37:42,250 --> 00:37:47,060 And so you sort of committed to BST structures. 661 00:37:47,060 --> 00:37:50,150 But now we have an additional requirement. 662 00:37:50,150 --> 00:38:00,100 And the new requirement is that we be able to compute rank t. 663 00:38:00,100 --> 00:38:08,430 And rank t is how many planes are 664 00:38:08,430 --> 00:38:22,470 scheduled to land at times less than or equal to t. 665 00:38:22,470 --> 00:38:24,040 So perfectly reasonable question. 666 00:38:24,040 --> 00:38:26,740 It wasn't part of the original spec. 667 00:38:26,740 --> 00:38:29,810 You now have built your BST data structure, 668 00:38:29,810 --> 00:38:31,480 you thought you were done. 669 00:38:31,480 --> 00:38:33,340 Sorry, you aren't. 
670 00:38:33,340 --> 00:38:35,730 You've got to do this extra stuff. 671 00:38:35,730 --> 00:38:39,540 So that's the notion of augmentation, 672 00:38:39,540 --> 00:38:42,744 and we're going to use this as an example of how we're 673 00:38:42,744 --> 00:38:45,200 going to augment the BST structure. 674 00:38:45,200 --> 00:38:46,950 And oh, by the way, I don't want you 675 00:38:46,950 --> 00:38:50,140 to change the complexity from order h. 676 00:38:50,140 --> 00:38:52,080 And we eventually will get to order log n, 677 00:38:52,080 --> 00:38:56,630 but don't go change something that was logarithmic to linear. 678 00:38:56,630 --> 00:38:57,380 That would be bad. 679 00:38:59,970 --> 00:39:01,690 So let's talk about how you do this. 680 00:39:01,690 --> 00:39:03,315 And I don't think we need this anymore. 681 00:39:07,830 --> 00:39:11,000 So the first thing we need to do is add a little bit more 682 00:39:11,000 --> 00:39:15,790 information to the node structure. 683 00:39:20,250 --> 00:39:29,283 And what we're going to do is augment the BST structure. 684 00:39:34,130 --> 00:39:38,980 And we're going to add one little number associated 685 00:39:38,980 --> 00:39:45,670 with each node, that looks at the number of nodes below it. 686 00:39:45,670 --> 00:39:49,000 So in particular, let's say that I 687 00:39:49,000 --> 00:39:59,500 have 49, 46, let's just say 49, 46 for now. 688 00:39:59,500 --> 00:40:06,700 And over here I have 79, 64, and 83. 689 00:40:09,640 --> 00:40:11,490 I'm going to modify-- I'm going to have 690 00:40:11,490 --> 00:40:16,435 an extra number associated with each of these nodes. 691 00:40:16,435 --> 00:40:18,060 And I'm just going to write that number 692 00:40:18,060 --> 00:40:20,830 on the outside of the node.
693 00:40:20,830 --> 00:40:23,640 And you can just imagine that now the key value has 694 00:40:23,640 --> 00:40:25,880 two numbers associated with it-- the thing 695 00:40:25,880 --> 00:40:30,450 that I write inside the node, and what I write outside of it. 696 00:40:30,450 --> 00:40:35,959 So in particular, when I do insert or delete 697 00:40:35,959 --> 00:40:37,625 I'm going to be modifying these numbers. 698 00:40:40,480 --> 00:40:44,470 And these are size numbers. 699 00:40:44,470 --> 00:40:46,210 And what do I mean by that? 700 00:40:46,210 --> 00:40:51,840 Well these numbers correspond to subtree sizes. 701 00:40:57,470 --> 00:41:01,180 So the subtree size here is 1, 1, 1. 702 00:41:01,180 --> 00:41:04,100 So as I'm building this tree up I'm 703 00:41:04,100 --> 00:41:06,450 going to create an augmented BST structure, 704 00:41:06,450 --> 00:41:08,400 and I've modified insert and delete 705 00:41:08,400 --> 00:41:10,364 so they do some extra work. 706 00:41:10,364 --> 00:41:11,780 So let's say, for argument's sake, 707 00:41:11,780 --> 00:41:18,090 that I've added this in sort of a bottom up fashion. 708 00:41:18,090 --> 00:41:21,390 And what I have are these particular subtree sizes. 709 00:41:21,390 --> 00:41:23,110 All of these should make sense. 710 00:41:23,110 --> 00:41:27,240 This has just a single node, same thing here. 711 00:41:27,240 --> 00:41:31,180 So the subtree sizes associated with these nodes are all 1. 712 00:41:31,180 --> 00:41:33,740 The subtree size associated with 79 713 00:41:33,740 --> 00:41:39,060 is 3, because you're counting 79 and 64 and 83. 714 00:41:39,060 --> 00:41:41,200 And the subtree size associated with 49 715 00:41:41,200 --> 00:41:44,930 is 5, because you're counting everything underneath it. 716 00:41:44,930 --> 00:41:46,396 How did we get these numbers?
717 00:41:46,396 --> 00:41:47,770 Well you want to think about this 718 00:41:47,770 --> 00:41:50,120 as you started with an empty set, 719 00:41:50,120 --> 00:41:51,530 and you kept inserting into it. 720 00:41:51,530 --> 00:41:54,650 And you were doing a sequence of insert and delete operations. 721 00:41:54,650 --> 00:41:59,090 And if I explain to you how an insert operation modifies 722 00:41:59,090 --> 00:42:02,380 these numbers, that is pretty much all you need. 723 00:42:02,380 --> 00:42:05,970 And of course, analogously, for a delete operation. 724 00:42:05,970 --> 00:42:11,180 So what would happen for, let's say you wanted to insert 43? 725 00:42:11,180 --> 00:42:15,160 You would insert 43 at this point. 726 00:42:15,160 --> 00:42:19,450 And what you'd do is you follow the insertion path 727 00:42:19,450 --> 00:42:21,160 just like you did before. 728 00:42:21,160 --> 00:42:23,430 But when you're following that path 729 00:42:23,430 --> 00:42:28,570 you're going to increment the nodes that you're seeing by 1. 730 00:42:28,570 --> 00:42:32,340 So you're going to add 43 to this. 731 00:42:32,340 --> 00:42:40,520 And you'd add 5 plus 1, because you see 49. 732 00:42:40,520 --> 00:42:45,140 And then you would go down and you'd see 46. 733 00:42:45,140 --> 00:42:47,170 And so you'd add 1 to that. 734 00:42:47,170 --> 00:42:49,610 And then finally, you add 43 and you 735 00:42:49,610 --> 00:42:51,640 assign-- since it's a leaf-- you'd 736 00:42:51,640 --> 00:42:54,380 assign the value corresponding to the subtree size 737 00:42:54,380 --> 00:42:58,370 of this new node that you put in there, to be 1. 738 00:42:58,370 --> 00:43:01,200 It gets a little, teensy bit more complicated 739 00:43:01,200 --> 00:43:04,110 when you want to do the k minute check. 740 00:43:04,110 --> 00:43:06,700 But from a complexity standpoint, 741 00:43:06,700 --> 00:43:08,860 if you're not worried about constant factors, 742 00:43:08,860 --> 00:43:10,700 you can just say, you know what?
743 00:43:10,700 --> 00:43:14,320 I'm going to first run the regular insert, 744 00:43:14,320 --> 00:43:16,680 ignoring the subtree sizes. 745 00:43:16,680 --> 00:43:19,500 And if it fails, I'm done. 746 00:43:19,500 --> 00:43:22,870 Because I'm not going to modify the BST, and I'm done. 747 00:43:22,870 --> 00:43:25,270 I'm not going to have to modify the subtree sizes. 748 00:43:25,270 --> 00:43:27,930 If it succeeds, then I'm going to go in, 749 00:43:27,930 --> 00:43:31,380 and I know now that I can increment each of these nodes, 750 00:43:31,380 --> 00:43:33,990 because I know I'm going to be successful. 751 00:43:33,990 --> 00:43:36,350 So that's sort of a trivial way of solving this problem, 752 00:43:36,350 --> 00:43:38,850 that from an asymptotic complexity standpoint 753 00:43:38,850 --> 00:43:42,300 gives you your order h augmented insert. 754 00:43:42,300 --> 00:43:44,187 That make sense? 755 00:43:44,187 --> 00:43:46,020 Now you could do something better than that. 756 00:43:46,020 --> 00:43:49,550 I mean, I would urge you, if you had wrote something 757 00:43:49,550 --> 00:43:52,130 that-- we asked you to write something like this, 758 00:43:52,130 --> 00:43:55,770 to create a single procedure that essentially uses 759 00:43:55,770 --> 00:44:00,500 a recursion appropriately to do the right thing in one pass 760 00:44:00,500 --> 00:44:01,590 through the BST. 761 00:44:01,590 --> 00:44:03,430 And we'll talk about things like that 762 00:44:03,430 --> 00:44:08,210 as we go along in sections, and possibly in lectures. 763 00:44:08,210 --> 00:44:11,516 So that's the subtree insert delete. 764 00:44:11,516 --> 00:44:12,265 Everyone buy that? 765 00:44:12,265 --> 00:44:13,693 Yeah, question back there. 766 00:44:13,693 --> 00:44:16,234 AUDIENCE: If I wanted to delete a number, like let's say 79-- 767 00:44:16,234 --> 00:44:16,859 PROFESSOR: Yep? 
768 00:44:16,859 --> 00:44:18,704 AUDIENCE: --would we have to take it out 769 00:44:18,704 --> 00:44:21,010 and then rewrite the entire BST? 770 00:44:21,010 --> 00:44:24,030 PROFESSOR: What you'd have to do is bubble up pointers. 771 00:44:24,030 --> 00:44:30,130 So you'd have to actually have 64 connected to-- what 772 00:44:30,130 --> 00:44:33,960 will happen is 83 would actually come up, 773 00:44:33,960 --> 00:44:36,410 and you would essentially have something-- this 774 00:44:36,410 --> 00:44:38,780 is not quite how it works-- but 83 would move up 775 00:44:38,780 --> 00:44:40,410 and you'd have 64 to the left. 776 00:44:40,410 --> 00:44:43,310 That's what would happen for delete in this case. 777 00:44:43,310 --> 00:44:47,360 So you would have to move pointers in the case of delete. 778 00:44:47,360 --> 00:44:50,670 And we're not done with binary search tree operations 779 00:44:50,670 --> 00:44:53,437 from a standpoint of teaching you about them. 780 00:44:53,437 --> 00:44:55,520 We'll talk about them not just in today's lecture, 781 00:44:55,520 --> 00:44:58,670 but later as well. 782 00:44:58,670 --> 00:45:00,380 So there's one thing missing here, 783 00:45:00,380 --> 00:45:03,240 though, which is I haven't quite figured out-- 784 00:45:03,240 --> 00:45:05,690 I've told you how these subtree sizes work. 785 00:45:05,690 --> 00:45:08,650 But it's not completely clear, this 786 00:45:08,650 --> 00:45:11,340 is the last thing we have to do, is how are you 787 00:45:11,340 --> 00:45:17,460 going to compute rank t from the subtree sizes? 788 00:45:17,460 --> 00:45:21,450 So everyone understand subtree sizes? 789 00:45:21,450 --> 00:45:23,780 It's just the number of nodes that are underneath you. 790 00:45:23,780 --> 00:45:27,360 And you remember to count yourself, all right? 791 00:45:27,360 --> 00:45:28,600 Now what is rank t?
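The size-maintaining insert described earlier, the two-pass version that first runs the k minute check and only then increments subtree sizes, can be sketched in Python as below. This is a sketch under assumptions, not the course's code: `Node`, `conflicts`, and `insert` are illustrative names, the `size` field counts the node itself, and a gap of exactly k minutes is treated as acceptable. Delete is not shown.

```python
class Node:
    """BST node augmented with a subtree size (illustrative)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # nodes in this subtree, counting this node


def conflicts(root, t, k):
    """First pass: True if a key on t's search path is less than k away."""
    node = root
    while node is not None:
        if abs(node.key - t) < k:
            return True
        node = node.left if t < node.key else node.right
    return False


def insert(root, t, k):
    """Two-pass augmented insert; returns (root, inserted). Each pass is O(h)."""
    if conflicts(root, t, k):
        return root, False         # failed check: tree and sizes untouched
    new = Node(t)
    if root is None:
        return new, True
    node = root
    while True:
        node.size += 1             # t will end up somewhere below this node
        if t < node.key:
            if node.left is None:
                node.left = new
                return root, True
            node = node.left
        else:
            if node.right is None:
                node.right = new
                return root, True
            node = node.right


root = None
for t in [49, 46, 79, 64, 83]:
    root, _ = insert(root, t, k=3)
print(root.size, root.right.size)  # 5 3

root, _ = insert(root, 43, k=3)
print(root.size, root.left.size)   # 6 2
```

Doing the check first keeps the second pass simple, at the cost of walking the path twice; a single recursive pass that updates sizes on the way back up would avoid that.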
792 00:45:28,600 --> 00:45:30,860 Rank t is how many planes are scheduled 793 00:45:30,860 --> 00:45:33,670 to land at times less than or equal to t. 794 00:45:33,670 --> 00:45:37,460 So now I have a BST structure that looks like the one 795 00:45:37,460 --> 00:45:40,930 I just ended up with. 796 00:45:40,930 --> 00:45:42,940 So I've added this 43. 797 00:45:42,940 --> 00:45:44,500 And so let me draw that out here, 798 00:45:44,500 --> 00:45:48,160 and see if we can answer this question. 799 00:45:48,160 --> 00:45:51,370 This is a subtle question. 800 00:45:51,370 --> 00:45:55,830 So I got 49, and that subtree size is 6. 801 00:45:55,830 --> 00:45:59,170 I got 46, subtree size is 2. 802 00:45:59,170 --> 00:46:07,010 43, 79, 64, 803 00:46:07,010 --> 00:46:08,330 and 83. 804 00:46:11,000 --> 00:46:21,640 So what I want is what lands before t? 805 00:46:24,360 --> 00:46:27,420 And how do I do that? 806 00:46:27,420 --> 00:46:30,660 Give me an algorithm that would allow 807 00:46:30,660 --> 00:46:35,700 me to compute in order h time. 808 00:46:35,700 --> 00:46:38,130 I want to do this in order h time. 809 00:46:38,130 --> 00:46:40,040 What lands before t? 810 00:46:40,040 --> 00:46:42,759 Someone? 811 00:46:42,759 --> 00:46:43,258 Yeah. 812 00:46:43,258 --> 00:46:44,662 AUDIENCE: So first you would have 813 00:46:44,662 --> 00:46:47,286 to find where to insert it, like we did before. 814 00:46:47,286 --> 00:46:48,285 PROFESSOR: Right, right. 815 00:46:48,285 --> 00:46:53,071 AUDIENCE: And then because we have the order of whatever it 816 00:46:53,071 --> 00:46:54,875 was before-- not the order, the-- 817 00:46:54,875 --> 00:46:55,750 PROFESSOR: The sizes? 818 00:46:55,750 --> 00:46:56,280 The sizes? 819 00:46:56,280 --> 00:46:56,780 Yeah.
820 00:46:56,780 --> 00:46:59,445 AUDIENCE: And then we can look what's more than it 821 00:46:59,445 --> 00:47:02,485 on the right, we can subtract it and we get-- 822 00:47:02,485 --> 00:47:04,360 PROFESSOR: What is more than it on the right. 823 00:47:04,360 --> 00:47:04,750 Do you want to say-- 824 00:47:04,750 --> 00:47:05,791 AUDIENCE: Because, like-- 825 00:47:05,791 --> 00:47:06,390 PROFESSOR: OK. 826 00:47:06,390 --> 00:47:07,060 AUDIENCE: --on the right-- 827 00:47:07,060 --> 00:47:07,768 PROFESSOR: Right. 828 00:47:07,768 --> 00:47:09,970 AUDIENCE: --and then we can take this minus this 829 00:47:09,970 --> 00:47:11,860 and we get what's left. 830 00:47:11,860 --> 00:47:13,610 PROFESSOR: That's great, that's excellent. 831 00:47:13,610 --> 00:47:15,747 Excellent. 832 00:47:15,747 --> 00:47:18,080 So I'm going to do it a little bit differently from what 833 00:47:18,080 --> 00:47:19,010 you described. 834 00:47:19,010 --> 00:47:20,470 I'm going to actually do it in a, 835 00:47:20,470 --> 00:47:23,554 sort of, a more positive way, no offense intended. 836 00:47:23,554 --> 00:47:25,095 What we're going to do is we're going 837 00:47:25,095 --> 00:47:28,170 to add up the things that we want to add up. 838 00:47:28,170 --> 00:47:30,520 And what you have to do is walk-- 839 00:47:30,520 --> 00:47:33,210 your first step was right on. 840 00:47:33,210 --> 00:47:35,260 I mean, your answer is correct. 841 00:47:35,260 --> 00:47:38,400 I'm just going to do it a little bit differently. 842 00:47:38,400 --> 00:47:42,180 You walk down the tree to find the desired time. 843 00:47:42,180 --> 00:47:43,810 This is just your search. 844 00:47:43,810 --> 00:47:46,300 We know how to do that. 845 00:47:46,300 --> 00:47:53,840 As you walk down you add in the nodes, that 846 00:47:53,840 --> 00:47:58,141 is, the subtree sizes-- you're just adding in the nodes here.
847 00:47:58,141 --> 00:48:00,140 So if you see-- depending on the number of nodes 848 00:48:00,140 --> 00:48:01,880 that you see as you're going deeper in, 849 00:48:01,880 --> 00:48:03,490 you want to add in the nodes. 850 00:48:03,490 --> 00:48:05,620 And you're going to add one to that, corresponding 851 00:48:05,620 --> 00:48:07,410 to the nodes that are smaller. 852 00:48:07,410 --> 00:48:12,167 And we're going to add in the subtree sizes to the left, 853 00:48:12,167 --> 00:48:13,250 as opposed to subtracting. 854 00:48:19,136 --> 00:48:20,510 That may not make a lot of sense. 855 00:48:20,510 --> 00:48:23,720 But I guarantee you it will once we do an example. 856 00:48:34,270 --> 00:48:36,180 So what's going on here? 857 00:48:36,180 --> 00:48:38,430 I want to find a place to insert. 858 00:48:38,430 --> 00:48:40,490 I'm not actually going to do the insert. 859 00:48:40,490 --> 00:48:42,280 Think of it as doing a lookup. 860 00:48:42,280 --> 00:48:45,870 And along the way, I need to figure out 861 00:48:45,870 --> 00:48:47,510 the less than operator. 862 00:48:47,510 --> 00:48:49,110 I want to find all of the things that 863 00:48:49,110 --> 00:48:51,530 are less than this value I'm searching for. 864 00:48:51,530 --> 00:48:54,640 And so I have to do a bit of arithmetic. 865 00:48:54,640 --> 00:49:00,380 So let's say that I'm looking for what's 866 00:49:00,380 --> 00:49:03,020 less than or equal to 79. 867 00:49:03,020 --> 00:49:07,720 So t equals 79. 868 00:49:07,720 --> 00:49:09,890 So I'm going to look at 49. 869 00:49:09,890 --> 00:49:13,310 I'm going to walk down, I'm going to look at 49. 870 00:49:13,310 --> 00:49:22,670 And because I say I'm looking at 49-- and 49 871 00:49:22,670 --> 00:49:24,810 is clearly less than 79. 872 00:49:24,810 --> 00:49:27,960 So I'm going to add 1. 873 00:49:27,960 --> 00:49:30,370 And that's this check over here.
874 00:49:30,370 --> 00:49:41,830 I move on and what I need to do now is move to the right, 875 00:49:41,830 --> 00:49:45,510 because 79 is greater than 49. 876 00:49:45,510 --> 00:49:47,400 That's how my search would work. 877 00:49:47,400 --> 00:49:50,500 But because I've moved to the right, 878 00:49:50,500 --> 00:49:55,570 I'm going to add the subtree sizes that were to the left. 879 00:49:55,570 --> 00:49:58,240 Because I know that all of the things to the left 880 00:49:58,240 --> 00:50:01,640 are clearly less than 79. 881 00:50:01,640 --> 00:50:10,020 So I'm going to add 2, corresponding to a subtree 46. 882 00:50:10,020 --> 00:50:12,120 So I'm not actually looking there. 883 00:50:12,120 --> 00:50:14,420 But I'm going to add all of that stuff in. 884 00:50:14,420 --> 00:50:18,250 I'm going to move to the right, and now I'm going to see 79. 885 00:50:18,250 --> 00:50:26,890 At this point 79 is less than or equal to 79. 886 00:50:26,890 --> 00:50:33,290 So I'm going to see 79 and I'm going to add 1. 887 00:50:33,290 --> 00:50:37,300 And because I've added 79, just like I did with 49, 888 00:50:37,300 --> 00:50:42,090 I have to add the subtree size to the left of 79. 889 00:50:42,090 --> 00:50:46,160 So the final addition is I add 1 corresponding 890 00:50:46,160 --> 00:50:50,760 to the subtree 64. 891 00:50:50,760 --> 00:50:52,390 And at this point I've discovered 892 00:50:52,390 --> 00:50:56,040 where I have to insert, I've essentially found the location, 893 00:50:56,040 --> 00:50:57,900 it matches 79. 894 00:50:57,900 --> 00:51:01,180 And there was no modification required in this algorithm. 895 00:51:01,180 --> 00:51:05,990 So if that was 78 you'd essentially do the same things. 896 00:51:05,990 --> 00:51:10,670 But you're done because you found the value, or the place 897 00:51:10,670 --> 00:51:11,770 that you want to insert. 898 00:51:11,770 --> 00:51:13,500 And you've done a bunch of additions. 
899 00:51:13,500 --> 00:51:20,160 And you go look at add 1, add 2, add 1, add 1, and you have 5. 900 00:51:23,539 --> 00:51:25,080 And that's the correct answer, as you 901 00:51:25,080 --> 00:51:28,440 can see from this example. 902 00:51:28,440 --> 00:51:31,030 So what's the bad news? 903 00:51:31,030 --> 00:51:33,820 The bad news was what this lady said up 904 00:51:33,820 --> 00:51:37,390 front, which was we haven't quite solved the problem. 905 00:51:37,390 --> 00:51:40,590 Because sadly, I could easily set things 906 00:51:40,590 --> 00:51:49,950 up such that the height h is order n, h could be order n. 907 00:51:49,950 --> 00:51:54,410 And if, for example, I gave you a sorted list, 908 00:51:54,410 --> 00:51:56,650 and I said insert into binary search tree 909 00:51:56,650 --> 00:52:00,410 that's originally null 43, and you put 43 in there. 910 00:52:00,410 --> 00:52:02,350 Then I say insert 46. 911 00:52:02,350 --> 00:52:04,090 And then I say insert 48. 912 00:52:04,090 --> 00:52:06,870 And then I say insert 49, et cetera. 913 00:52:06,870 --> 00:52:09,200 And, you know, these could be any numbers. 914 00:52:09,200 --> 00:52:12,030 Then you see that what does this look like? 915 00:52:12,030 --> 00:52:14,060 Does it look like a tree? 916 00:52:14,060 --> 00:52:16,300 It looks like a list. 917 00:52:16,300 --> 00:52:18,420 That's the bad news. 918 00:52:18,420 --> 00:52:23,200 And I'll let Eric give you good news next week. 919 00:52:23,200 --> 00:52:25,930 We need to have this notion of balanced binary search trees. 920 00:52:25,930 --> 00:52:28,950 So everything I've said is true. 921 00:52:28,950 --> 00:52:30,130 I did not lie. 922 00:52:30,130 --> 00:52:32,120 But the one extra thing is we need 923 00:52:32,120 --> 00:52:35,770 to make sure these trees are balanced so h is order log n. 924 00:52:35,770 --> 00:52:37,581 And then everything I said works. 925 00:52:37,581 --> 00:52:38,080 All right? 926 00:52:38,080 --> 00:52:39,950 See you next time.
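The rank computation and the bad case of sorted insertions can both be sketched in a few lines of Python. This is a rough sketch rather than the lecture's own code: `Node`, `insert`, `rank`, `height`, and `build` are illustrative names, and the insert here is the plain one with no k minute check so the example trees are easy to build. `rank` adds 1 plus the left subtree's size every time the search moves right or matches, which reproduces the "add 1, add 2, add 1, add 1" walk for t equal to 79.

```python
class Node:
    """BST node with a subtree size that counts the node itself."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1


def insert(root, key):
    """Plain BST insert (no k minute check) that keeps sizes current."""
    if root is None:
        return Node(key)
    node = root
    while True:
        node.size += 1
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def rank(root, t):
    """Number of keys less than or equal to t, in O(h)."""
    count = 0
    node = root
    while node is not None:
        if t < node.key:
            node = node.left
        else:
            # This node and its whole left subtree are <= t.
            count += 1 + (node.left.size if node.left else 0)
            node = node.right
    return count


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


tree = build([49, 46, 79, 43, 64, 83])
print(rank(tree, 79))   # 5
print(rank(tree, 78))   # 4

chain = build([43, 46, 48, 49])   # sorted input
print(height(chain))    # 4: every node hangs off the right, like a list
```

With sorted input the height equals the number of keys, so every order h bound above degrades to order n until the tree is kept balanced.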