1 00:00:00,070 --> 00:00:02,500 The following content is provided under a Creative 2 00:00:02,500 --> 00:00:04,019 Commons license. 3 00:00:04,019 --> 00:00:06,360 Your support will help MIT OpenCourseWare 4 00:00:06,360 --> 00:00:10,730 continue to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,340 To make a donation or view additional materials 6 00:00:13,340 --> 00:00:17,236 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,236 --> 00:00:17,861 at ocw.mit.edu. 8 00:00:22,776 --> 00:00:24,150 PROFESSOR: So this is a 2-3 tree. 9 00:00:24,150 --> 00:00:27,340 So as you can see, every node has-- so the 2-3 is either 10 00:00:27,340 --> 00:00:28,840 two children or three children. 11 00:00:28,840 --> 00:00:31,350 Every node can have either one key or two keys. 12 00:00:31,350 --> 00:00:34,665 And the correlation is that every-- 13 00:00:34,665 --> 00:00:37,610 so if there are n keys in a node, it has n plus 1 children. 14 00:00:37,610 --> 00:00:41,740 So the way that works is similar to binary search trees. 15 00:00:41,740 --> 00:00:46,350 So if you have a value here, the two children surrounding it-- 16 00:00:46,350 --> 00:00:48,850 so this side is less, this side is more. 17 00:00:48,850 --> 00:00:53,010 So it's essentially sort of an in-order traversal, left 18 00:00:53,010 --> 00:00:54,830 child, root, right child. 19 00:00:54,830 --> 00:00:56,440 So [INAUDIBLE], it's ordered. 20 00:00:56,440 --> 00:01:04,010 So generally a B-tree node will have some keys. 21 00:01:04,010 --> 00:01:10,610 So let's say n keys and n plus 1 children. 22 00:01:16,920 --> 00:01:20,130 And if you take anything in the middle, 23 00:01:20,130 --> 00:01:23,260 look at the two children, all the keys in this sub-tree 24 00:01:23,260 --> 00:01:26,360 are smaller than the key here, and all the keys 25 00:01:26,360 --> 00:01:28,680 in this sub-tree are larger than the key here. 26 00:01:28,680 --> 00:01:30,320 So that's the general node. 
27 00:01:30,320 --> 00:01:34,570 So before we go into more details 28 00:01:34,570 --> 00:01:36,520 of the properties and everything, 29 00:01:36,520 --> 00:01:39,330 the question is why use B-trees. 30 00:01:39,330 --> 00:01:42,380 So if we do a quick depth analysis, 31 00:01:42,380 --> 00:01:44,760 we can see that the depth is log n, right? 32 00:01:44,760 --> 00:01:48,630 Is that clear to everyone sort of, why the depth is log n? 33 00:01:48,630 --> 00:01:52,025 Because you have branching just like in binary search trees. 34 00:01:52,025 --> 00:01:53,860 In fact, you have more branching. 35 00:01:53,860 --> 00:01:56,020 But in any case, depth is log n. 36 00:01:56,020 --> 00:02:00,260 But why use B-trees over binary search trees? 37 00:02:00,260 --> 00:02:04,920 Anyone have a reason why you would 38 00:02:04,920 --> 00:02:07,500 prefer to use B-trees or not? 39 00:02:07,500 --> 00:02:11,370 So all the operations are still log n. 40 00:02:11,370 --> 00:02:11,910 Any guesses? 41 00:02:16,140 --> 00:02:17,080 None. 42 00:02:17,080 --> 00:02:18,500 OK. 43 00:02:18,500 --> 00:02:22,240 Well, OK, the reason is memory hierarchy. 44 00:02:22,240 --> 00:02:24,340 So normally in [INAUDIBLE], we just 45 00:02:24,340 --> 00:02:27,320 assume that the computer has access to memory, 46 00:02:27,320 --> 00:02:30,150 and you can just pick up things from disk in constant time 47 00:02:30,150 --> 00:02:32,890 and do your operations with it, and you don't worry 48 00:02:32,890 --> 00:02:34,294 about caches and everything. 49 00:02:34,294 --> 00:02:35,710 But that's not how computers work. 50 00:02:35,710 --> 00:02:37,692 So in a computer, you have-- so those of you 51 00:02:37,692 --> 00:02:39,400 who have taken some computer architecture 52 00:02:39,400 --> 00:02:42,730 class [INAUDIBLE] or something, you will know that hierarchy. 53 00:02:42,730 --> 00:02:46,570 So there's a CPU-- so let's draw it somewhere. 54 00:02:46,570 --> 00:02:48,020 So you have your CPU. 
55 00:02:48,020 --> 00:02:50,570 And [INAUDIBLE] CPU, you have some registers. 56 00:02:50,570 --> 00:02:52,930 You have your caches, L1, L2, L3, whatever. 57 00:02:52,930 --> 00:02:54,080 You have your RAM. 58 00:02:54,080 --> 00:02:55,610 You have disk after that. 59 00:02:55,610 --> 00:02:57,860 So disk [? loads. ?] Then you have your, I don't know, 60 00:02:57,860 --> 00:02:59,090 your cloud, whatever. 61 00:02:59,090 --> 00:03:01,130 So each level, your memory size grows 62 00:03:01,130 --> 00:03:04,430 and your access time grows as well. 63 00:03:04,430 --> 00:03:07,650 So in the basic memory hierarchy model, 64 00:03:07,650 --> 00:03:11,930 we have just two levels of hierarchy, let's say. 65 00:03:11,930 --> 00:03:16,370 So you have cache connected by a high bandwidth channel 66 00:03:16,370 --> 00:03:19,370 to the CPU, and you have a low bandwidth channel to disk. 67 00:03:23,040 --> 00:03:25,754 So the difference is-- so essentially 68 00:03:25,754 --> 00:03:27,920 you can consider that cache just has infinite speed. 69 00:03:27,920 --> 00:03:29,790 Cache, just like, whatever you can take it. 70 00:03:29,790 --> 00:03:32,790 You don't have any cost for bringing in stuff from cache. 71 00:03:32,790 --> 00:03:34,680 But it's finite size. 72 00:03:34,680 --> 00:03:37,560 So the way cache works is it has a bunch of words, which 73 00:03:37,560 --> 00:03:38,740 is a finite number of words. 74 00:03:38,740 --> 00:03:44,915 So each word has size B, and let's say you have m words. 75 00:03:47,610 --> 00:03:51,340 However, hard disk is just, let's say, infinite memory, 76 00:03:51,340 --> 00:03:56,130 but it has some cost associated to accessing things. 77 00:03:56,130 --> 00:03:58,580 Also when you access things from hard disk, 78 00:03:58,580 --> 00:03:59,630 you copy them into cache. 
79 00:03:59,630 --> 00:04:01,990 When you copy a block of size B, you take it up 80 00:04:01,990 --> 00:04:04,800 from the hard disk, and you take a block, 81 00:04:04,800 --> 00:04:05,920 and you put it into cache. 82 00:04:05,920 --> 00:04:08,510 And you have to get rid of something because it's finite. 83 00:04:08,510 --> 00:04:11,460 So what you want to do is you want to utilize 84 00:04:11,460 --> 00:04:14,010 that B block efficiently. 85 00:04:14,010 --> 00:04:16,459 You just want to bring a B block every time 86 00:04:16,459 --> 00:04:18,170 you want to access a new node. 87 00:04:18,170 --> 00:04:21,170 In a binary search tree, normal operations are what? 88 00:04:21,170 --> 00:04:25,520 You start in the root and go to a node. 89 00:04:25,520 --> 00:04:27,880 But that's not very easily correlated with this. 90 00:04:27,880 --> 00:04:28,380 Right? 91 00:04:28,380 --> 00:04:30,779 So if you want to utilize an entire block, 92 00:04:30,779 --> 00:04:33,320 you would want something like a block which sort of goes down 93 00:04:33,320 --> 00:04:35,709 the tree. 94 00:04:35,709 --> 00:04:37,500 But that's not how binary trees are stored. 95 00:04:37,500 --> 00:04:41,450 Binary trees are stored this way. 96 00:04:41,450 --> 00:04:43,660 So that's the nice thing about B-trees. 97 00:04:43,660 --> 00:04:45,452 So this is just a 2-3 tree. 98 00:04:45,452 --> 00:04:46,660 This is not a general B-tree. 99 00:04:46,660 --> 00:04:48,840 A general B-tree node will have a bunch of keys, 100 00:04:48,840 --> 00:04:50,690 and we'll come to that number. 101 00:04:50,690 --> 00:04:53,610 But generally you want to make that number of keys something 102 00:04:53,610 --> 00:04:57,380 like the cache-- what is it? 103 00:04:57,380 --> 00:04:58,910 The word size in the cache. 
104 00:04:58,910 --> 00:05:02,860 So once you do that, you can get an entire node from disk, 105 00:05:02,860 --> 00:05:05,950 like work on that, and then get another [INAUDIBLE], 106 00:05:05,950 --> 00:05:07,629 so your height is reduced. 107 00:05:07,629 --> 00:05:09,420 And you can do your operation much quicker, 108 00:05:09,420 --> 00:05:12,940 because you're not accessing disk every time you're 109 00:05:12,940 --> 00:05:15,060 going down a level. 110 00:05:15,060 --> 00:05:15,560 Sorry. 111 00:05:15,560 --> 00:05:17,930 You are accessing disk every time you go down a level, 112 00:05:17,930 --> 00:05:20,490 but you're utilizing the whole block 113 00:05:20,490 --> 00:05:22,226 when you're accessing disk. 114 00:05:22,226 --> 00:05:23,100 Good? 115 00:05:23,100 --> 00:05:25,180 Sort of make sense? 116 00:05:25,180 --> 00:05:27,370 OK. 117 00:05:27,370 --> 00:05:31,460 So let's write down the specifications for B-trees now. 118 00:05:34,352 --> 00:05:35,316 All right. 119 00:05:40,620 --> 00:05:43,460 So number of children. 120 00:05:50,090 --> 00:05:52,510 So first of all, a B-tree has something 121 00:05:52,510 --> 00:05:54,960 called a branching factor. 122 00:05:54,960 --> 00:05:58,567 So in the 2-3 tree, the branching factor is two. 123 00:05:58,567 --> 00:06:00,650 So what that means is simply that it just bounds 124 00:06:00,650 --> 00:06:01,608 the number of children. 125 00:06:01,608 --> 00:06:04,460 So the number of children has to be greater than or equal to B. 126 00:06:04,460 --> 00:06:05,690 Other than the root node. 127 00:06:05,690 --> 00:06:07,970 The root node can have less than B children. 128 00:06:07,970 --> 00:06:09,200 It's fine. 129 00:06:09,200 --> 00:06:13,670 Also it's upper bounded by 2B. 130 00:06:13,670 --> 00:06:15,760 Notice that this is a strict upper bound. 131 00:06:15,760 --> 00:06:22,140 So you can have at most 2B minus 1 children from a node. 
132 00:06:22,140 --> 00:06:29,790 Also remember that the number of keys, the number of keys 133 00:06:29,790 --> 00:06:32,640 is just 1 less than the number of children. 134 00:06:32,640 --> 00:06:40,730 Therefore, these inequalities are just reduced by 1. 135 00:06:40,730 --> 00:06:45,260 So you have B minus 1 and you have 2B minus 1. 136 00:06:45,260 --> 00:06:49,760 So the number of keys can be between B minus 1 and 2B minus 2. 137 00:06:49,760 --> 00:06:52,270 The rationale for that will become clear-- yeah? 138 00:06:52,270 --> 00:06:54,345 AUDIENCE: Is B the height of the tree? 139 00:06:54,345 --> 00:06:56,567 PROFESSOR: No, B is the branching. 140 00:06:56,567 --> 00:06:57,650 B is the branching factor. 141 00:06:57,650 --> 00:06:59,497 So that is the number of children. 142 00:06:59,497 --> 00:07:00,830 It's not the number of children. 143 00:07:00,830 --> 00:07:02,650 It's a bound on the number of children. 144 00:07:02,650 --> 00:07:07,050 So like in the 2-3 tree, B is equal to 2, 145 00:07:07,050 --> 00:07:08,840 and this is a 2-3 tree. 146 00:07:13,300 --> 00:07:18,950 So the 2 refers to-- you can have either two children 147 00:07:18,950 --> 00:07:21,660 or you can have three children. 148 00:07:21,660 --> 00:07:24,520 And so the upper bound on children is 2B minus 1. 149 00:07:27,110 --> 00:07:29,240 2B minus 1 is equal to 3. 150 00:07:29,240 --> 00:07:32,180 So you can have two or three children. 151 00:07:32,180 --> 00:07:34,830 And correspondingly, you can have either one or two 152 00:07:34,830 --> 00:07:37,251 keys in a node. 153 00:07:37,251 --> 00:07:37,750 Make sense? 154 00:07:37,750 --> 00:07:38,374 AUDIENCE: Yeah. 155 00:07:38,374 --> 00:07:39,340 PROFESSOR: Cool. 156 00:07:39,340 --> 00:07:42,200 OK. So coming back to this. 157 00:07:42,200 --> 00:07:44,630 So the root does not have a lower bound. 158 00:07:44,630 --> 00:07:47,160 The root can have one child in any tree. 
159 00:07:47,160 --> 00:07:50,355 So if you have a B equal to 5 tree, the root 160 00:07:50,355 --> 00:07:52,050 can still have one child-- sorry. 161 00:07:52,050 --> 00:07:55,480 Not one child, one key element, two children. 162 00:07:55,480 --> 00:07:56,900 All right. 163 00:07:56,900 --> 00:07:57,760 It's good. 164 00:07:57,760 --> 00:08:00,370 Also it's completely balanced. 165 00:08:00,370 --> 00:08:04,310 So all the leaves are at the same depth. 166 00:08:19,330 --> 00:08:21,780 So you can see it here, right? 167 00:08:21,780 --> 00:08:24,350 So you can't have a dangling node here. 168 00:08:24,350 --> 00:08:25,880 This is not allowed. 169 00:08:25,880 --> 00:08:27,047 You have to have a leaf. 170 00:08:27,047 --> 00:08:28,630 You have to have something going down, 171 00:08:28,630 --> 00:08:31,400 and everything ends at the same level. 172 00:08:31,400 --> 00:08:31,900 All right. 173 00:08:31,900 --> 00:08:33,110 So that's the thing. 174 00:08:33,110 --> 00:08:36,520 So also the leaves obviously don't have children, 175 00:08:36,520 --> 00:08:43,130 so this condition is violated by the leaf. 176 00:08:43,130 --> 00:08:46,380 So that's the basic structure of a B-tree. 177 00:08:49,101 --> 00:08:51,100 So the first operation we'll consider on B-trees 178 00:08:51,100 --> 00:08:52,560 is searching. 179 00:08:52,560 --> 00:08:55,160 So that should be relatively straightforward. 180 00:08:55,160 --> 00:08:58,110 So remember how searching is done in the binary search tree. 181 00:08:58,110 --> 00:09:01,900 You bring in a value x, compare it to the key. 182 00:09:01,900 --> 00:09:04,875 Let's say x is less than K, you go down this path. 183 00:09:04,875 --> 00:09:08,120 Let's say x is greater than K, you go down this path. 184 00:09:08,120 --> 00:09:09,570 So similarly in a B-tree. 185 00:09:09,570 --> 00:09:12,010 So let's say we bring in a value. 186 00:09:12,010 --> 00:09:15,770 Let's say you are looking for 20. 
187 00:09:15,770 --> 00:09:18,120 So you bring in 20, compare it to this. 188 00:09:18,120 --> 00:09:21,190 20 is less than 30, so you go down here. 189 00:09:21,190 --> 00:09:23,852 Now you have two values. 190 00:09:23,852 --> 00:09:25,060 So where does 20 fit in here? 191 00:09:25,060 --> 00:09:25,851 Not here. 192 00:09:25,851 --> 00:09:26,350 Not here. 193 00:09:26,350 --> 00:09:26,951 It fits here. 194 00:09:26,951 --> 00:09:27,450 OK. 195 00:09:27,450 --> 00:09:29,110 Go down this tree. 196 00:09:29,110 --> 00:09:30,980 You find 20, that's it. 197 00:09:30,980 --> 00:09:37,410 So in general, you bring in a key K, you look at this node, 198 00:09:37,410 --> 00:09:39,110 and you go through all the values. 199 00:09:39,110 --> 00:09:43,590 So something I forgot to mention, which should be clear. 200 00:09:43,590 --> 00:09:47,250 All the keys in a node, they're sorted, one after the other. 201 00:09:47,250 --> 00:09:49,350 So your values go like this. 202 00:09:49,350 --> 00:09:52,140 So they're increasing in this way. 203 00:09:52,140 --> 00:09:54,835 Make sense? 204 00:09:54,835 --> 00:09:56,440 So you bring in a key. 205 00:09:56,440 --> 00:09:59,900 Look at all the keys in the node you're looking at, 206 00:09:59,900 --> 00:10:02,230 pick the place where K fits in, unless it's already 207 00:10:02,230 --> 00:10:02,730 in the node. 208 00:10:02,730 --> 00:10:03,438 Then you're done. 209 00:10:03,438 --> 00:10:04,660 You've found it. 210 00:10:04,660 --> 00:10:08,850 Otherwise, let's say K fits in between these two guys. 211 00:10:08,850 --> 00:10:13,250 So you go down this child and continue. 212 00:10:13,250 --> 00:10:15,940 So searching is log n, similar to BSTs. 213 00:10:20,294 --> 00:10:21,835 So searching is not very interesting. 214 00:10:38,480 --> 00:10:40,225 So next is insertion. 215 00:10:44,440 --> 00:10:46,860 So insertion is a little more interesting than searching. 
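The search procedure just described can be sketched in Python. This is a minimal illustration, not code from the lecture: the `Node` class, the `search` function, and the use of `bisect_left` to find "the place where K fits in" are all assumptions, and the tree below only loosely mirrors the board (the keys 30, 10, 17, 14, 20, 38 and 3 are mentioned in the lecture; the rest are made-up placeholders).

```python
from bisect import bisect_left


class Node:
    """Hypothetical B-tree node: sorted keys plus len(keys) + 1 children (none for a leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def search(node, k):
    """Return True if k is stored somewhere in the subtree rooted at node."""
    while True:
        i = bisect_left(node.keys, k)          # first slot whose key is >= k
        if i < len(node.keys) and node.keys[i] == k:
            return True                        # k is already in this node
        if not node.children:
            return False                       # reached a leaf without finding k
        node = node.children[i]                # k fits between keys i-1 and i


# Example 2-3 tree (B = 2); leaf contents other than 3, 14 and 20 are placeholders.
root = Node([30], [
    Node([10, 17], [Node([3, 5]), Node([14]), Node([20])]),
    Node([38], [Node([32]), Node([45])]),
])

print(search(root, 20))   # True
print(search(root, 16))   # False
```

Each loop iteration touches exactly one node, which is the point of the memory-hierarchy argument: one block transfer per level of the tree.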
216 00:10:46,860 --> 00:10:48,360 So what you do in insertion is you-- 217 00:10:48,360 --> 00:11:09,020 [SIDE CONVERSATION] 218 00:11:09,020 --> 00:11:12,219 PROFESSOR: So before we resume, does anyone have any questions 219 00:11:12,219 --> 00:11:13,510 about the structure of B-trees? 220 00:11:13,510 --> 00:11:16,860 We rushed through that quite fast. 221 00:11:16,860 --> 00:11:18,655 About how B-trees are structured, 222 00:11:18,655 --> 00:11:20,165 everyone good with that? 223 00:11:20,165 --> 00:11:25,294 OK, also any questions about searching in a B-tree or a BST? 224 00:11:25,294 --> 00:11:25,794 Go ahead. 225 00:11:25,794 --> 00:11:27,770 AUDIENCE: Just a random question. 226 00:11:27,770 --> 00:11:31,722 So the 38 there, it can only have two children. 227 00:11:31,722 --> 00:11:33,210 PROFESSOR: Yep. 228 00:11:33,210 --> 00:11:36,270 So one value, two children. 229 00:11:36,270 --> 00:11:39,760 So you have some node in the B-tree, 230 00:11:39,760 --> 00:11:41,420 and whatever is below it is split 231 00:11:41,420 --> 00:11:43,350 into parts by the elements. 232 00:11:43,350 --> 00:11:45,100 So if you have n elements, it splits it up 233 00:11:45,100 --> 00:11:46,275 into n plus 1 segments. 234 00:11:49,830 --> 00:11:53,087 AUDIENCE: You said that the root didn't have to follow the rule. 235 00:11:53,087 --> 00:11:53,670 PROFESSOR: No. 236 00:11:53,670 --> 00:11:54,589 AUDIENCE: Why is that? 237 00:11:54,589 --> 00:11:57,130 PROFESSOR: Well, you'll see when we do insertion and deletion 238 00:11:57,130 --> 00:11:58,005 why that's necessary. 239 00:11:58,005 --> 00:12:02,350 But essentially you can consider that it's an invariant. 240 00:12:02,350 --> 00:12:04,770 And all we have to do is preserve that invariant. 241 00:12:04,770 --> 00:12:08,230 So the root, it has to still have less than two-- 242 00:12:08,230 --> 00:12:10,140 it still has to have the upper bound. 243 00:12:10,140 --> 00:12:14,569 But it doesn't need to have a lower bound. 
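To make the bounds concrete, here is a small sketch that computes them under the lecture's convention (at least B children, strictly fewer than 2B, keys always one less than children, root exempt from the lower bound). The function name `btree_bounds` and its `is_root` flag are hypothetical, and the bounds apply to internal nodes, since leaves have no children at all.

```python
def btree_bounds(b, is_root=False):
    """Return ((min_children, max_children), (min_keys, max_keys)) for an internal node.

    Uses the convention from the lecture: B <= children < 2B, keys = children - 1.
    A root that is not a leaf only needs one key, i.e. two children.
    """
    if b < 2:
        raise ValueError("branching factor must be at least 2")
    min_children = 2 if is_root else b
    max_children = 2 * b - 1
    return (min_children, max_children), (min_children - 1, max_children - 1)


print(btree_bounds(2))                 # ((2, 3), (1, 2)): the 2-3 tree
print(btree_bounds(4))                 # ((4, 7), (3, 6))
print(btree_bounds(5, is_root=True))   # ((2, 9), (1, 8))
```

For B = 2 this reproduces the 2-3 tree: two or three children, one or two keys.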
244 00:12:14,569 --> 00:12:17,490 AUDIENCE: How do you choose B? 245 00:12:17,490 --> 00:12:20,120 PROFESSOR: Well, the whole [INAUDIBLE] cache size, 246 00:12:20,120 --> 00:12:21,220 so something with that. 247 00:12:21,220 --> 00:12:23,850 So you probably want 2B to be about your cache size 248 00:12:23,850 --> 00:12:25,760 so you can get the whole block in one go. 249 00:12:25,760 --> 00:12:28,050 I've never implemented a B-tree, so I 250 00:12:28,050 --> 00:12:29,150 don't know how it's actually done in practice. 251 00:12:29,150 --> 00:12:31,399 But that is the reason, so I'm assuming it's something 252 00:12:31,399 --> 00:12:33,730 to do with the cache length. 253 00:12:33,730 --> 00:12:38,102 AUDIENCE: Is the 14, is it a child of both 10 and 17? 254 00:12:38,102 --> 00:12:39,935 PROFESSOR: Well, it's not a child of either. 255 00:12:39,935 --> 00:12:41,260 It's a child of this node. 256 00:12:41,260 --> 00:12:43,570 So this node has two elements, so it's 257 00:12:43,570 --> 00:12:46,380 being divided-- dividing the interval up into three parts. 258 00:12:46,380 --> 00:12:49,800 So it's in between 10 and 17 is the point here. 259 00:12:49,800 --> 00:12:54,600 AUDIENCE: So then this node has five children? 260 00:12:54,600 --> 00:12:55,857 PROFESSOR: Sorry? 261 00:12:55,857 --> 00:12:56,940 No, it has three children. 262 00:12:56,940 --> 00:13:00,010 So don't think of every key as a node. 263 00:13:00,010 --> 00:13:04,000 Think of the whole unit as a node. 264 00:13:04,000 --> 00:13:06,115 So it's not necessarily-- in a binary search tree, 265 00:13:06,115 --> 00:13:08,134 you have one element, but here every node 266 00:13:08,134 --> 00:13:09,050 has multiple elements. 267 00:13:09,050 --> 00:13:11,624 That's the point of it. 268 00:13:11,624 --> 00:13:14,840 Anyone else? 269 00:13:14,840 --> 00:13:17,650 OK, let's start with searching. 270 00:13:17,650 --> 00:13:22,236 So let's leave this here. 
271 00:13:30,210 --> 00:13:32,770 Well, you have the formulas up there, so that's good. 272 00:13:42,570 --> 00:13:43,160 Insertion. 273 00:13:43,160 --> 00:13:44,030 Let's start with insertion. 274 00:13:44,030 --> 00:13:45,155 We already did searching. 275 00:13:52,390 --> 00:13:54,444 So insertion is you bring in a new key K, 276 00:13:54,444 --> 00:13:56,110 and you want to insert it into the tree. 277 00:13:56,110 --> 00:13:57,120 So what's the problem that could happen? 278 00:13:57,120 --> 00:13:59,500 You can find the location where you want to insert it, just 279 00:13:59,500 --> 00:14:00,125 like searching. 280 00:14:00,125 --> 00:14:03,390 You just go down the tree and find where it should be placed. 281 00:14:03,390 --> 00:14:05,280 But once you do place it, you have a problem. 282 00:14:05,280 --> 00:14:06,113 What is the problem? 283 00:14:06,113 --> 00:14:10,570 The problem is that one of your nodes will become overfull. 284 00:14:10,570 --> 00:14:11,070 Whatever. 285 00:14:11,070 --> 00:14:13,840 It'll overflow, and that's not what you want. 286 00:14:13,840 --> 00:14:16,760 So you want some way so you can manage this. 287 00:14:16,760 --> 00:14:17,940 How do you manage this? 288 00:14:17,940 --> 00:14:25,190 So I have this lovely prop here, which I hope to demonstrate. 289 00:14:25,190 --> 00:14:25,690 OK. 290 00:14:29,130 --> 00:14:32,770 So here we have B equal to 4. 291 00:14:32,770 --> 00:14:39,110 So let's first figure out the number of keys. 292 00:14:39,110 --> 00:14:42,230 So what is the minimum number of keys, anyone, for B equal to 4? 293 00:14:42,230 --> 00:14:42,959 AUDIENCE: Three. 294 00:14:42,959 --> 00:14:44,125 PROFESSOR: Three, precisely. 295 00:14:44,125 --> 00:14:49,688 So what is the maximum number of keys? 296 00:14:49,688 --> 00:14:51,556 AUDIENCE: Six. 297 00:14:51,556 --> 00:14:53,770 PROFESSOR: 4 into 2 minus 2, yeah. 298 00:14:53,770 --> 00:14:55,000 Correct. 299 00:14:55,000 --> 00:14:56,170 3, 4. 
300 00:14:56,170 --> 00:14:58,500 It's not seven, there's a strictly less than sign 301 00:14:58,500 --> 00:15:00,090 somewhere. 302 00:15:00,090 --> 00:15:01,070 Yes. 303 00:15:01,070 --> 00:15:05,198 And you'll see why it's not seven in a minute. 304 00:15:05,198 --> 00:15:06,050 [LAUGHTER] 305 00:15:06,050 --> 00:15:08,150 Oh. 306 00:15:08,150 --> 00:15:09,070 Hypocritical of me. 307 00:15:11,640 --> 00:15:12,140 All right. 308 00:15:12,140 --> 00:15:15,180 So as you can see, 1, 2, 3, 4, 5, 6, 7. 309 00:15:15,180 --> 00:15:18,110 So some insertion happened. 310 00:15:18,110 --> 00:15:19,250 Is the writing clear? 311 00:15:19,250 --> 00:15:22,580 Can everyone read the numbers? 312 00:15:22,580 --> 00:15:23,750 49 looks a little skewed. 313 00:15:23,750 --> 00:15:26,160 Anyway, essentially these are all sorted. 314 00:15:26,160 --> 00:15:27,290 This is the parent node. 315 00:15:27,290 --> 00:15:28,650 Doesn't matter what's over here. 316 00:15:28,650 --> 00:15:33,100 All that matters is 8, 56, and whatever's in between. 317 00:15:38,900 --> 00:15:41,580 So what we do when we have an overfull node 318 00:15:41,580 --> 00:15:43,790 is something that's called a split operation. 319 00:15:43,790 --> 00:15:45,729 So split. 320 00:15:45,729 --> 00:15:47,270 And there's something which is called 321 00:15:47,270 --> 00:15:48,670 a merge, which we'll come to later when 322 00:15:48,670 --> 00:15:49,680 we're doing deletion. 323 00:15:49,680 --> 00:15:52,720 But a split is-- very intuitively, 324 00:15:52,720 --> 00:15:55,486 it splits the node into two parts. 325 00:15:55,486 --> 00:15:57,610 So what it does is when you have an overfull node-- 326 00:15:57,610 --> 00:15:59,700 so the number of elements here is what? 327 00:15:59,700 --> 00:16:03,220 2B minus 1, which is just 1 over the max. 328 00:16:03,220 --> 00:16:06,210 So what do you do is you take the middle element 329 00:16:06,210 --> 00:16:09,115 and remove it. 
330 00:16:09,115 --> 00:16:11,800 and now you split the node into two parts. 331 00:16:11,800 --> 00:16:14,730 Observe that there are three here and three here, 332 00:16:14,730 --> 00:16:16,111 which is perfect. 333 00:16:16,111 --> 00:16:17,860 And now what you do with the middle node-- 334 00:16:17,860 --> 00:16:19,774 so now you're actually disrupting 335 00:16:19,774 --> 00:16:21,440 the structure of the tree, because there 336 00:16:21,440 --> 00:16:23,062 was one pointer going in. 337 00:16:23,062 --> 00:16:23,895 There was one child. 338 00:16:23,895 --> 00:16:25,660 And now you have two children. 339 00:16:25,660 --> 00:16:27,940 So somehow you need to adjust the parent node, 340 00:16:27,940 --> 00:16:30,090 because the parent node had only one child. 341 00:16:30,090 --> 00:16:33,240 Well, at least there are other children off to the side. 342 00:16:33,240 --> 00:16:36,900 But here it had only one child, and now it's split apart. 343 00:16:36,900 --> 00:16:38,580 So you do something very simple. 344 00:16:38,580 --> 00:16:42,675 You just insert this guy in here. 345 00:16:42,675 --> 00:16:44,300 And then you say, oh, this points here, 346 00:16:44,300 --> 00:16:47,010 and this points here. 347 00:16:47,010 --> 00:16:47,610 Make sense? 348 00:16:50,310 --> 00:16:51,810 I'm going to get rid of these two. 349 00:16:59,850 --> 00:17:02,120 And you can even convince yourself 350 00:17:02,120 --> 00:17:04,910 that this preserves all the nice properties. 351 00:17:04,910 --> 00:17:09,310 So your children have nicely fallen back 352 00:17:09,310 --> 00:17:11,589 into their interval. 353 00:17:11,589 --> 00:17:13,627 Your sequence is completely correct, 354 00:17:13,627 --> 00:17:15,460 because this was the middle element of this. 355 00:17:15,460 --> 00:17:18,329 So this divides this interval properly. 356 00:17:18,329 --> 00:17:21,970 This is also between 8 and 56, because this was in this node. 357 00:17:21,970 --> 00:17:23,259 So all the properties. 
358 00:17:23,259 --> 00:17:25,050 But there's one property that is a problem. 359 00:17:25,050 --> 00:17:28,890 So you have just increased the size of the parent node by 1. 360 00:17:28,890 --> 00:17:32,750 So now it's possible that the parent node has overflowed. 361 00:17:32,750 --> 00:17:33,680 So what do you do? 362 00:17:33,680 --> 00:17:35,800 You split it again. 363 00:17:35,800 --> 00:17:36,950 And split it again. 364 00:17:36,950 --> 00:17:39,544 And if at any point, you're fine, 365 00:17:39,544 --> 00:17:41,710 you look at the parent node and go, OK, that's fine. 366 00:17:41,710 --> 00:17:43,250 That's in the range. 367 00:17:43,250 --> 00:17:45,280 But every time it overflows, you can keep going. 368 00:17:45,280 --> 00:17:46,210 And how many times can you do this? 369 00:17:46,210 --> 00:17:48,180 You can do this all the way up to the root. 370 00:17:48,180 --> 00:17:51,710 And when you reach the root, either it's fine 371 00:17:51,710 --> 00:17:52,930 or the root is too big. 372 00:17:52,930 --> 00:17:54,840 It's reached 2B minus 1. 373 00:17:54,840 --> 00:17:56,680 And then you split the root, and you 374 00:17:56,680 --> 00:17:58,720 get one single [INAUDIBLE] up there. 375 00:17:58,720 --> 00:18:00,220 So that, in answer to your question, 376 00:18:00,220 --> 00:18:02,780 that is why you need that property in some sense. 377 00:18:02,780 --> 00:18:06,750 Not a very convincing argument, but sort of. 378 00:18:06,750 --> 00:18:08,720 So let's actually do an insertion 379 00:18:08,720 --> 00:18:11,540 in this tree we have here. 380 00:18:11,540 --> 00:18:13,600 So we are going to insert 16. 381 00:18:20,250 --> 00:18:23,130 So 16 comes in here. 382 00:18:23,130 --> 00:18:25,450 It's less than 30, it goes to the left. 383 00:18:25,450 --> 00:18:28,390 It's between 10 and 17, it goes in the middle. 384 00:18:28,390 --> 00:18:29,280 16. 385 00:18:29,280 --> 00:18:33,480 And it's greater than 14, so we add 16 here. 
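The insert-then-split procedure described above can be sketched as follows. This is an illustrative sketch under the lecture's convention (at most 2B minus 2 keys per node, so an overflowing node holds 2B minus 1 keys and has a unique middle element); the names `Node`, `insert` and `_insert` are hypothetical, keys are assumed distinct, and the demo keys are made up rather than taken from the board.

```python
from bisect import bisect_left, insort


class Node:
    """Hypothetical B-tree node: sorted keys plus len(keys) + 1 children (none for a leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def insert(root, k, b):
    """Insert k and return the root, which is a new node if the old root had to split."""
    split = _insert(root, k, b)
    if split is None:
        return root
    middle, right = split
    # The root overflowed: the new root has one key and two children,
    # which is why the root is exempt from the lower bound.
    return Node([middle], [root, right])


def _insert(node, k, b):
    """Insert k below node. Return (middle_key, new_right_node) if node split, else None."""
    if not node.children:
        insort(node.keys, k)                    # leaf: place the key in sorted order
    else:
        i = bisect_left(node.keys, k)           # child whose interval contains k
        split = _insert(node.children[i], k, b)
        if split is not None:                   # child split: adopt its middle key
            middle, right = split
            node.keys.insert(i, middle)
            node.children.insert(i + 1, right)
    if len(node.keys) <= 2 * b - 2:
        return None                             # still within the upper bound
    mid = len(node.keys) // 2                   # node has 2b - 1 keys, so mid == b - 1
    middle = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return middle, right


# Demo with B = 2 (a 2-3 tree), inserting made-up keys 1..7 into an empty tree.
root = Node([])
for key in [1, 2, 3, 4, 5, 6, 7]:
    root = insert(root, key, b=2)
print(root.keys, [child.keys for child in root.children])   # [4] [[2], [6]]
```

The last insertion in the demo overflows a leaf, then its parent, then the root, so the tree grows by one level at the top and every leaf stays at the same depth.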
386 00:18:54,150 --> 00:18:54,920 All right. 387 00:18:54,920 --> 00:18:56,040 That seems good. 388 00:18:56,040 --> 00:18:57,570 All the properties fine. 389 00:18:57,570 --> 00:18:59,660 This still has two elements, which is the maximum, 390 00:18:59,660 --> 00:19:00,810 but it's good. 391 00:19:00,810 --> 00:19:03,324 It doesn't overflow. 392 00:19:03,324 --> 00:19:04,490 Let's insert something else. 393 00:19:04,490 --> 00:19:05,830 Let's insert 2. 394 00:19:08,600 --> 00:19:11,860 So 2 goes to 30, goes down, goes down. 395 00:19:11,860 --> 00:19:16,670 And we have a problem, because 2 has overflowed this node. 396 00:19:16,670 --> 00:19:18,590 So we split. 397 00:19:18,590 --> 00:19:21,367 And the way we split is we take the middle element. 398 00:19:21,367 --> 00:19:22,450 So we split the node here. 399 00:19:25,000 --> 00:19:31,680 And 3 goes up to the parent, so 3 goes here. 400 00:19:31,680 --> 00:19:38,015 And all good, except for the parent has overflowed. 401 00:19:38,015 --> 00:19:39,390 So what do we do with the parent? 402 00:19:39,390 --> 00:19:40,895 We split the parent again. 403 00:19:40,895 --> 00:19:44,870 And this time, it's right down the middle, the 10 goes up. 404 00:19:44,870 --> 00:19:48,520 So OK, let's get rid of this. 405 00:19:48,520 --> 00:19:53,090 So now that we split the parent, the 10 goes up here. 406 00:19:53,090 --> 00:19:54,490 And you're good. 407 00:19:54,490 --> 00:19:58,510 It's a bit cluttered, so let me reposition the 17. 408 00:20:07,940 --> 00:20:10,550 Did those two operations make sense? 409 00:20:10,550 --> 00:20:12,876 Questions? 410 00:20:12,876 --> 00:20:17,870 AUDIENCE: If your node size [INAUDIBLE] number of-- 411 00:20:17,870 --> 00:20:20,040 PROFESSOR: So just pick the-- first of all-- OK. 412 00:20:22,970 --> 00:20:25,387 If the way we're doing it-- when your node is overflowing, 413 00:20:25,387 --> 00:20:27,344 it's returning only one thing at a time, right? 
414 00:20:27,344 --> 00:20:28,000 AUDIENCE: Yeah. 415 00:20:28,000 --> 00:20:29,749 PROFESSOR: So if your node is overflowing, 416 00:20:29,749 --> 00:20:32,587 it'll be 2B minus 1, which is an odd number always. 417 00:20:32,587 --> 00:20:34,670 There might be a case where you get an even number 418 00:20:34,670 --> 00:20:35,753 if you do something weird. 419 00:20:35,753 --> 00:20:38,200 Maybe you have a-- there are different ways to do B-trees. 420 00:20:38,200 --> 00:20:41,710 But if it does, you can probably pick the one, either of them, 421 00:20:41,710 --> 00:20:43,240 and then [INAUDIBLE]. 422 00:20:43,240 --> 00:20:44,240 I'm not sure about that. 423 00:20:44,240 --> 00:20:45,400 I'll look into it. 424 00:20:45,400 --> 00:20:48,910 But in general, if you're doing it this way, it's always odd. 425 00:20:48,910 --> 00:20:51,274 So you don't have to worry about that. 426 00:20:51,274 --> 00:20:53,908 Anything else? 427 00:20:53,908 --> 00:20:56,908 AUDIENCE: If we did reach all the way to the root 428 00:20:56,908 --> 00:20:58,150 and then went one more up-- 429 00:20:58,150 --> 00:20:59,030 PROFESSOR: So what you would do is-- 430 00:20:59,030 --> 00:21:00,446 AUDIENCE: That root would have-- 431 00:21:00,446 --> 00:21:03,070 PROFESSOR: That root would have two children, one element 432 00:21:03,070 --> 00:21:05,410 and two children, which is fine because we didn't put 433 00:21:05,410 --> 00:21:06,866 that restriction on the root. 434 00:21:06,866 --> 00:21:07,795 That's good. 435 00:21:07,795 --> 00:21:08,670 How we doing on time? 436 00:21:08,670 --> 00:21:11,480 OK, we have some time. 437 00:21:11,480 --> 00:21:14,966 Let's jump into deletion, unless anyone else has questions. 438 00:21:14,966 --> 00:21:16,840 AUDIENCE: [INAUDIBLE] any point? 439 00:21:16,840 --> 00:21:18,120 PROFESSOR: So-- oh, yeah. 440 00:21:18,120 --> 00:21:19,520 That's a good-- thank you. 
441 00:21:19,520 --> 00:21:21,850 So you are going down to the leaves 442 00:21:21,850 --> 00:21:24,015 at most-- at most to the leaves once, 443 00:21:24,015 --> 00:21:25,490 and you're going back up once. 444 00:21:25,490 --> 00:21:27,880 So it's like log n plus log n, and you're good. 445 00:21:31,610 --> 00:21:32,710 Let's do deletion. 446 00:21:43,150 --> 00:21:46,210 So deletion is more complicated. 447 00:21:46,210 --> 00:21:48,520 So the reason, it'll be clear. 448 00:21:48,520 --> 00:21:51,790 So the problem in deletion will be you remove a key 449 00:21:51,790 --> 00:21:53,490 and a node is now underfull. 450 00:21:53,490 --> 00:21:57,400 So it has less than B minus 1 keys in it suddenly. 451 00:21:57,400 --> 00:22:00,600 So let's turn this around. 452 00:22:04,950 --> 00:22:07,980 So again B equal to 4. 453 00:22:07,980 --> 00:22:09,900 This node is a problem. 454 00:22:09,900 --> 00:22:11,610 Only two things in it. 455 00:22:11,610 --> 00:22:13,930 So what do we do? 456 00:22:13,930 --> 00:22:17,980 So before we go into that, let's make this assumption 457 00:22:17,980 --> 00:22:22,010 that-- there are two steps to deletion. 458 00:22:22,010 --> 00:22:24,950 The first step is making the deletion at a leaf. 459 00:22:24,950 --> 00:22:26,049 How do you do that? 460 00:22:26,049 --> 00:22:28,340 So the way you make a deletion at a leaf is, let's say, 461 00:22:28,340 --> 00:22:30,100 you have a key. 462 00:22:30,100 --> 00:22:34,470 You come down in your B-tree, and you're at a node. 463 00:22:34,470 --> 00:22:38,017 Oh, this key needs to be deleted. 464 00:22:38,017 --> 00:22:38,850 But it's not a leaf. 465 00:22:38,850 --> 00:22:40,470 So what do you do? 466 00:22:40,470 --> 00:22:46,230 So essentially what you do is you look at these two subtrees. 467 00:22:46,230 --> 00:22:47,780 So it might have only one subtree. 468 00:22:47,780 --> 00:22:49,530 If it's at the end, it will have only one. 469 00:22:49,530 --> 00:22:51,360 Actually, no, that's not true. 
470 00:22:51,360 --> 00:22:51,940 Ignore that. 471 00:22:51,940 --> 00:22:54,680 If it's not a leaf, it has two subtrees. 472 00:22:54,680 --> 00:22:57,130 So either take the rightmost element 473 00:22:57,130 --> 00:22:58,625 in this subtree, which is a leaf, 474 00:22:58,625 --> 00:23:00,250 because you can always keep going down, 475 00:23:00,250 --> 00:23:02,333 right, right, right, right till you get to a leaf, 476 00:23:02,333 --> 00:23:04,810 or the leftmost element in this subtree. 477 00:23:04,810 --> 00:23:09,310 So that is just the next element after this guy. 478 00:23:09,310 --> 00:23:13,700 So you delete this, and you bring this up to here. 479 00:23:13,700 --> 00:23:16,910 We'll do an example of this, and it'll be clearer. 480 00:23:16,910 --> 00:23:19,650 So you take either the rightmost element in the left subtree 481 00:23:19,650 --> 00:23:21,550 or the leftmost element in the right subtree 482 00:23:21,550 --> 00:23:22,620 and bring it up here. 483 00:23:22,620 --> 00:23:25,860 So you sort of like move the deletion to the leaf. 484 00:23:25,860 --> 00:23:27,310 And now it's easier to deal with. 485 00:23:27,310 --> 00:23:28,310 So we will come to that. 486 00:23:28,310 --> 00:23:32,990 Also just note that this is not what is done in the recitation. 487 00:23:32,990 --> 00:23:34,580 This algorithm for deletion, I think, 488 00:23:34,580 --> 00:23:35,880 is not done in the recitation notes. 489 00:23:35,880 --> 00:23:38,546 This is a different thing, which I'll send out a link for later. 490 00:23:38,546 --> 00:23:40,540 But I believe it works, because I got it 491 00:23:40,540 --> 00:23:44,500 from the [INAUDIBLE] reference. 492 00:23:44,500 --> 00:23:49,160 So once you move to the leaf-- so now let's look at this. 493 00:23:49,160 --> 00:23:51,510 So this is a node that is underfull. 494 00:23:51,510 --> 00:23:54,050 And you want to fix it. 495 00:23:54,050 --> 00:23:55,040 So how do you fix it? 
496 00:23:55,040 --> 00:23:58,750 So what you do is you look at its siblings. 497 00:23:58,750 --> 00:24:00,280 So in this case, it has one sibling. 498 00:24:00,280 --> 00:24:01,770 It can have up to two siblings. 499 00:24:01,770 --> 00:24:04,240 It can have left or right. 500 00:24:04,240 --> 00:24:06,390 So what you do is you look at a sibling. 501 00:24:06,390 --> 00:24:10,360 And this sibling is actually 1 over the minimum. 502 00:24:10,360 --> 00:24:13,460 And if it's 1 over the minimum, then it's really easy. 503 00:24:13,460 --> 00:24:15,840 All you have to do is take the leftmost thing here-- 504 00:24:15,840 --> 00:24:17,340 or if it's the sibling on this side, 505 00:24:17,340 --> 00:24:21,040 take the rightmost thing here. 506 00:24:21,040 --> 00:24:22,730 And look at its parent. 507 00:24:22,730 --> 00:24:30,330 So you bring the parent down, and you move the sibling's key up. 508 00:24:30,330 --> 00:24:31,220 And there we go. 509 00:24:31,220 --> 00:24:35,950 So you basically are rotating the thing into place. 510 00:24:35,950 --> 00:24:39,420 So you move the parent down into the underfull node, 511 00:24:39,420 --> 00:24:43,329 and you replace the parent by the leftmost thing here. 512 00:24:43,329 --> 00:24:45,120 Everyone see why that preserves everything? 513 00:24:50,150 --> 00:24:52,380 And the child is also shifted. 514 00:24:52,380 --> 00:24:53,490 Make sure you see that. 515 00:24:53,490 --> 00:24:57,190 So the child which was in this subtree is now in this subtree. 516 00:24:59,760 --> 00:25:01,770 But then you can have the situation 517 00:25:01,770 --> 00:25:04,620 where you don't have a nice sibling 518 00:25:04,620 --> 00:25:06,410 to take care of your problems. 519 00:25:06,410 --> 00:25:10,490 So in this scenario, the sibling is barely full. 520 00:25:10,490 --> 00:25:12,980 It has three things, and it can't donate anything to you. 521 00:25:12,980 --> 00:25:15,440 So what do you do in that case?
522 00:25:15,440 --> 00:25:18,170 So then you do something which is a parallel of the split 523 00:25:18,170 --> 00:25:18,730 operation. 524 00:25:18,730 --> 00:25:20,570 You do a merge. 525 00:25:20,570 --> 00:25:21,430 So what do you have? 526 00:25:21,430 --> 00:25:30,490 So here you have B minus 2, and here you have B minus 1. 527 00:25:30,490 --> 00:25:33,555 And you get 2B minus 3. 528 00:25:33,555 --> 00:25:34,930 Well, you've got another element. 529 00:25:34,930 --> 00:25:36,530 You also take the parent, so that's 2B minus 2. 530 00:25:36,530 --> 00:25:37,040 So how do you do the merge? 531 00:25:37,040 --> 00:25:38,706 I just want to show you the merge first. 532 00:25:38,706 --> 00:25:41,891 So the way you do it is you move the parent down, 533 00:25:41,891 --> 00:25:42,890 and you merge these two. 534 00:25:46,220 --> 00:25:48,380 Seems OK? 535 00:25:48,380 --> 00:25:52,010 So you move the parent key down and merge these two. 536 00:25:52,010 --> 00:25:53,995 And, well, now this comes together, 537 00:25:53,995 --> 00:25:55,480 and this points into the new node. 538 00:25:58,400 --> 00:26:02,530 Sort of clear what's going on? 539 00:26:02,530 --> 00:26:05,112 Questions? 540 00:26:05,112 --> 00:26:06,088 Yes? 541 00:26:06,088 --> 00:26:09,815 AUDIENCE: So now the parent is underfull? 542 00:26:09,815 --> 00:26:11,690 PROFESSOR: Well, so you have-- yeah, exactly. 543 00:26:11,690 --> 00:26:13,990 So you have decreased the size of the parent, 544 00:26:13,990 --> 00:26:15,070 so it might be underfull. 545 00:26:15,070 --> 00:26:17,190 So you propagate. 546 00:26:17,190 --> 00:26:18,220 Anything else? 547 00:26:18,220 --> 00:26:20,630 AUDIENCE: So are these all different techniques 548 00:26:20,630 --> 00:26:21,774 for doing that? 549 00:26:21,774 --> 00:26:23,190 PROFESSOR: So there are two cases. 550 00:26:23,190 --> 00:26:27,110 So either you have a sibling which has extra keys to donate
552 00:26:29,036 --> 00:26:30,660 If you don't, then you have to do this. 553 00:26:30,660 --> 00:26:33,335 AUDIENCE: But what about that case? 554 00:26:33,335 --> 00:26:34,720 Or is that just like-- 555 00:26:34,720 --> 00:26:36,875 PROFESSOR: No, that is moving it down to the leaf. 556 00:26:36,875 --> 00:26:38,708 Once you move the deletion down to the leaf, 557 00:26:38,708 --> 00:26:40,300 so here we have something now. 558 00:26:40,300 --> 00:26:44,100 And now you move it all the way back up. 559 00:26:44,100 --> 00:26:45,720 So there are two cases. 560 00:26:45,720 --> 00:26:46,650 Let's do an example. 561 00:26:46,650 --> 00:26:48,850 That'll make it clearer. 562 00:26:48,850 --> 00:26:50,366 How are we doing on time? 563 00:26:50,366 --> 00:26:51,764 Five minutes, all right. 564 00:26:54,280 --> 00:26:58,435 So we are going to delete 38. 565 00:26:58,435 --> 00:27:00,840 38 is gone. 566 00:27:00,840 --> 00:27:02,840 But we want to move it down to the leaf. 567 00:27:02,840 --> 00:27:04,580 So let's take an element. 568 00:27:04,580 --> 00:27:07,810 Let's say we take 41. 569 00:27:07,810 --> 00:27:13,560 So we take 41 and move it up here. 570 00:27:13,560 --> 00:27:17,482 41 is the leftmost thing in the right subtree. 571 00:27:17,482 --> 00:27:19,440 So this vacancy doesn't really affect anything, 572 00:27:19,440 --> 00:27:22,381 because this node still has the right number of things, 573 00:27:22,381 --> 00:27:24,630 because it's still got one thing in it, which is good. 574 00:27:24,630 --> 00:27:25,920 So you're fine. 575 00:27:25,920 --> 00:27:29,080 This is now just 48. 576 00:27:29,080 --> 00:27:32,850 Let's say we now delete 41. 577 00:27:32,850 --> 00:27:36,540 So 41 is gone. 578 00:27:36,540 --> 00:27:41,480 So now that 41 is gone, what do you 579 00:27:41,480 --> 00:27:42,850 replace this blank spot with? 580 00:27:47,840 --> 00:27:49,390 Either this or this, right? 581 00:27:49,390 --> 00:27:50,520 Doesn't matter. 
582 00:27:50,520 --> 00:27:53,900 So let's just do this one for consistency. 583 00:27:53,900 --> 00:27:55,420 So you have 48 here. 584 00:27:55,420 --> 00:27:59,230 And now you have a problem because you have a blank box. 585 00:27:59,230 --> 00:28:03,590 So can you rotate? 586 00:28:03,590 --> 00:28:04,460 Yes, no? 587 00:28:04,460 --> 00:28:05,870 No, right? 588 00:28:05,870 --> 00:28:09,860 Because the sibling is barely full. 589 00:28:09,860 --> 00:28:11,000 So what can you do? 590 00:28:11,000 --> 00:28:12,016 So you merge. 591 00:28:12,016 --> 00:28:12,890 And how do you merge? 592 00:28:12,890 --> 00:28:17,067 You move the 48 down, and you combine everything. 593 00:28:17,067 --> 00:28:18,650 So this is kind of hard to understand, 594 00:28:18,650 --> 00:28:20,940 but this is like a zero-element node. 595 00:28:23,960 --> 00:28:26,120 So when you merge, you have 32, 48, and nothing, 596 00:28:26,120 --> 00:28:28,070 so it's just 32 and 48. 597 00:28:28,070 --> 00:28:38,770 So what you do is-- so this seems weird, 598 00:28:38,770 --> 00:28:40,910 but this is just another empty node. 599 00:28:40,910 --> 00:28:43,075 You just propagated the emptiness upwards. 600 00:28:46,520 --> 00:28:49,860 Now you take this empty node, and you look for its siblings. 601 00:28:49,860 --> 00:28:54,330 Again, its sibling is-- well, it's barely full. 602 00:28:54,330 --> 00:28:55,330 So what do you do now? 603 00:28:55,330 --> 00:28:57,790 You bring the 30 down, and you merge this. 604 00:28:57,790 --> 00:28:58,950 So let's do that. 605 00:29:06,080 --> 00:29:10,765 30 comes down, and there we go. 606 00:29:13,410 --> 00:29:14,245 Looks fine? 607 00:29:14,245 --> 00:29:15,940 Does that tree look good? 608 00:29:15,940 --> 00:29:17,190 Questions about the operation? 609 00:29:20,400 --> 00:29:27,050 I'm sure it was not clear, but-- anything? 610 00:29:27,050 --> 00:29:28,370 Make sense? 611 00:29:28,370 --> 00:29:32,730 OK, let's do a deletion where we can actually do a rotation.
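The merge step from the 41/48 example above can be sketched roughly as follows. This is a minimal illustration in Python, not the code used in the course: the `Node` class and the `merge_children` function are hypothetical names, and the tree is reduced to just the parent holding 48, its sibling child holding 32, and the empty node.

```python
class Node:
    """Hypothetical B-tree node: a sorted key list plus child pointers."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])  # an empty list means a leaf


def merge_children(parent, i):
    """Merge parent.children[i] and parent.children[i + 1] into one node.

    The separator key parent.keys[i] moves down and sits between the
    two key lists. The parent loses one key and one child pointer.
    """
    left, right = parent.children[i], parent.children[i + 1]
    separator = parent.keys.pop(i)
    left.keys = left.keys + [separator] + right.keys
    left.children = left.children + right.children
    parent.children.pop(i + 1)
    return left


# Assumed fragment of the board example: sibling [32], separator 48, empty node.
parent = Node([48], [Node([32]), Node([])])
merged = merge_children(parent, 0)
print(merged.keys)  # [32, 48]
print(parent.keys)  # [] : the parent is now empty, so the fix moves up a level
```

The parent ending up with no keys is the "propagated emptiness" from the example: the same check (rotate if a sibling can donate, otherwise merge) would then be repeated one level up.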
612 00:29:32,730 --> 00:29:36,200 So let's go ahead and delete 20. 613 00:29:36,200 --> 00:29:38,205 So you do your searching, go down the tree. 614 00:29:38,205 --> 00:29:40,910 You find the 20 under here. 615 00:29:40,910 --> 00:29:43,300 So now, OK. 616 00:29:43,300 --> 00:29:45,620 So you're left with just-- actually never mind. 617 00:29:45,620 --> 00:29:46,570 We'll do another one. 618 00:29:46,570 --> 00:29:47,770 So this doesn't do anything. 619 00:29:47,770 --> 00:29:51,020 You lost the 20, and you're left with the 24 this time. 620 00:29:51,020 --> 00:29:53,359 So now you delete the 24. 621 00:29:53,359 --> 00:29:54,900 So now that you've got rid of the 24, 622 00:29:54,900 --> 00:29:56,150 you have a blank box here now. 623 00:29:58,380 --> 00:30:00,310 But its sibling is not barely full. 624 00:30:00,310 --> 00:30:02,750 It has something to donate. 625 00:30:02,750 --> 00:30:06,864 So anyone, which elements are going to rotate? 626 00:30:06,864 --> 00:30:07,840 AUDIENCE: 17 and 16. 627 00:30:07,840 --> 00:30:09,403 PROFESSOR: 16 and 17, right. 628 00:30:09,403 --> 00:30:10,370 Cool. 629 00:30:10,370 --> 00:30:16,060 So 16 goes up, 17 goes down. 630 00:30:16,060 --> 00:30:17,260 And you're done. 631 00:30:17,260 --> 00:30:18,260 You're consistent again. 632 00:30:20,850 --> 00:30:21,880 So that was deletion. 633 00:30:21,880 --> 00:30:24,180 Those are the two cases for deletion. 634 00:30:24,180 --> 00:30:26,752 Does that make sense? 635 00:30:26,752 --> 00:30:29,220 Anyone? 636 00:30:29,220 --> 00:30:31,020 Any questions? 637 00:30:31,020 --> 00:30:32,290 OK. 638 00:30:32,290 --> 00:30:36,420 So that's all the topics we were supposed to cover today. 639 00:30:36,420 --> 00:30:39,810 Any questions about any of the operations, 640 00:30:39,810 --> 00:30:43,520 any of the other topics, lecture, anything?
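The rotation from the 16/17 example can be sketched in the same spirit. Again this is only an illustration under assumptions: `Node` and `rotate_from_left` are hypothetical names, the donating sibling is assumed to be on the left, and the key 14 in that sibling is made up so that it has something to spare (the lecture only tells us that 16 goes up and 17 comes down).

```python
class Node:
    """Hypothetical B-tree node: a sorted key list plus child pointers."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])  # an empty list means a leaf


def rotate_from_left(parent, i):
    """Fix underfull parent.children[i] by borrowing through the parent.

    The separator key comes down into the underfull node, and the left
    sibling's rightmost key goes up to replace it.
    """
    left, node = parent.children[i - 1], parent.children[i]
    node.keys.insert(0, parent.keys[i - 1])   # separator comes down
    parent.keys[i - 1] = left.keys.pop()      # sibling's rightmost key goes up
    if left.children:                         # internal node: the child shifts too
        node.children.insert(0, left.children.pop())


# Assumed fragment: left sibling [14, 16] (14 is invented), separator 17,
# and the node left empty after 24 was deleted.
parent = Node([17], [Node([14, 16]), Node([])])
rotate_from_left(parent, 1)
print(parent.children[0].keys, parent.keys, parent.children[1].keys)
# [14] [16] [17]
```

Unlike a merge, a rotation leaves the parent with the same number of keys, so nothing needs to propagate upward after it.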