The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

ERIK DEMAINE: All right. Today we start a new section in advanced data structures: data structures for the memory hierarchy. The idea here is that almost all data structures think about memory as a flat thing. In the RAM model, you've got a giant array, and you can access the i-th element of that giant array in constant time. With the memory hierarchy, you admit the reality of almost all computers since, I don't know, the '80s, which have caches. The idea is that you have your CPU connected by a very high bandwidth channel to a relatively small cache, which is connected via a relatively narrow bandwidth channel to a really big memory. In real computers this keeps going: you've got level 1 cache, then level 2 cache, then maybe level 3 cache, then main memory, then disk, then the network, whatever. Usually the disk has a cache too. So you have this very deep hierarchy, which is typically growing exponentially in size and also growing exponentially in slowness, in latency. You'd like to design data structures that most of the time are working at the cache level, because that's really fast, and go to the deeper levels as little as possible.

So we're going to define two models for working with memory hierarchies. The first model is called the external memory model. It's also called the I/O model, and it's also called the Disk Access Model, I think just so that it could be called the DAM model. I usually call it the external memory model. This is sort of the simplest model. It's also the most well studied, I would say. And it's going to relate closely to the other models that we're going to look at.
The idea with the external memory model is: OK, a real computer has many levels in this hierarchy, but let's just focus on the last one and the next-to-last one, because typically that last level is going to be way, way slower than all the others, and so you really want to minimize how many memory transfers happen there. So I want to count the number of memory transfers between two levels. This model just thinks about two levels; I'll call them cache and disk. We think of the disk as infinite and slow. We think of the cache as not only fast, but infinitely fast: it takes zero time to access something in cache. All we're thinking about is how many transfers between the two have to happen. Computation on the cache side is free, access to the cache is free, and we just pay for the transfers. That's the external memory model. You can also think about running time, make sure the running time is still optimal, and that will guarantee the computation part is good. But usually the hard part is to minimize the number of memory transfers.

Now, to make this problem really interesting, we need to add one more assumption, and this is also the case in practice: when you transfer an item from disk to cache, you get not just that item but an entire block of items at the same speed. So in fact we're going to think of the disk as divided into blocks. The block size here is B, capital B. The cache is also divided into blocks of size B, and we have a limited number of them: we're going to suppose we have M over B of them, usually called cache lines at the cache level. So the total size of the cache is M; M stands for memory. Sorry, this notation is slightly confusing for historical reasons, but that's the way it is. So B is the block size, M is the cache size, the total cache size. There are M over B cache blocks of size B, and the disk has infinitely many blocks of size B. There are lots of motivations for this setup.
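To make the accounting concrete, here is a minimal editorial sketch, not from the lecture, of the two-level model in Python. The class name ExternalMemory, its methods, and the parameter values are invented for the example; the only thing it charges for is moving a block between disk and cache, and the choice of which block to evict is left to the caller, as in this model.

    class ExternalMemory:
        def __init__(self, B, M):
            self.B = B
            self.num_lines = M // B      # the cache holds M / B blocks
            self.cache = set()           # ids of blocks currently in cache
            self.transfers = 0           # the only cost we count

        def access(self, i, evict=None):
            """Touch word i; bring in its block if absent, evicting `evict` if given."""
            block = i // self.B
            if block not in self.cache:
                if len(self.cache) >= self.num_lines:
                    victim = evict if evict is not None else next(iter(self.cache))
                    self.cache.remove(victim)
                self.cache.add(block)
                self.transfers += 1      # computation and in-cache access are free

    # Example: touching B consecutive words in one block costs a single transfer.
    mem = ExternalMemory(B=8, M=64)
    for i in range(8):
        mem.access(i)
    assert mem.transfers == 1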
But basically, the main cost you're paying here is latency. You have to send a request, and then the answer has to come back. Think of the disk as far away from the CPU, so it takes a long time for the request to get there and for the data to get back. And if you're sending data back anyway, it's just as fast to send a lot of data as a little bit of data. In fact, typically you set the block size so that the time it takes for the data to arrive equals the latency, to balance those two costs. With disk it's especially the case, because you have the latency of moving the disk head; once you've moved the disk head, you might as well read the entire track, the entire circle, if you're dealing with hard drives. It's just as fast once the head gets there.

OK, so that's the model, and now we want to minimize the number of block memory transfers that happen. The blocks make things a lot more interesting, because they mean that when you read an item, you really want to use all the items right around it soon, before the block gets kicked out of cache. Obviously the cache has limited size, so when you bring something in, you've got to kick something out; that's the block replacement strategy. In the external memory model, you can do whatever you want: you have full control over this cache.

So let me write some obvious bounds, and then I'll tell you some of the main results in this model. The number of memory transfers is what we care about, and it's going to be at most the running time on a RAM, so this is a comparison between two models. At worst, every operation you do in a RAM data structure incurs one memory transfer; that's what we want to beat. There's also a lower bound: the number of memory transfers is at least the number of cell probes divided by B. I think we've talked briefly about the cell probe model, back in retroactive data structures.
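Written out, with MT standing for the number of memory transfers of an operation, these two obvious bounds sandwich it:

\[ \frac{\#\text{cell probes}}{B} \;\le\; \mathrm{MT} \;\le\; \text{RAM running time}. \]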
The cell probe model is the model where all you count is how many words you had to read in order to do the data structure operation. Computation is free; we just count the number of word reads from memory. So it's a lower bound on RAM data structures. In the external memory model we're also only counting memory accesses, but where the cell probe model counts words, here we're counting blocks, and so at best we can improve by a factor of B. So one is a lower bound, the other an upper bound, and there's roughly a slop of a factor of B between them. The upper bound is the obvious thing; we want to get down closer to the lower bound whenever we can.

So let me give you some examples of that, a bunch of basic results in this model. Result 0 is scanning: if I have an array of items and I just want to, say, add them all up, or do a linear scan to search for something, anything that involves accessing them in order, this costs ceiling of N over B. If the array has size N, the number of blocks in that array is ceiling of N over B, and that's how many block accesses it costs. The annoying part here is the ceiling: if N is smaller than B, this is not great. I mean, you have to pay one memory read, so it's optimal. But anyway, that's scanning. Pretty boring.

Slightly more interesting, although not that much more interesting, is search trees. If you want to be able to search for an item in this external memory model, what do you do? B-trees, conveniently named B-trees. We want the branching factor of the B-tree to be B. Anything that's theta of B will do, but exactly B is great: if you can read all of a node's child pointers and all of its key values, if that fits in a block of size B, then we're happy. So typically we assume the branching factor here is B plus 1.
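As a small editorial sketch, assuming each node's keys and child pointers fit in a constant number of blocks, a search in such a tree reads one node per level, so it costs O(1) block transfers per level of the tree. The node layout and function names below are made up for illustration.

    from bisect import bisect_right

    class Node:
        def __init__(self, keys, children=None):
            self.keys = keys          # up to B sorted keys: one node fits in O(1) blocks
            self.children = children  # None for a leaf, else len(keys) + 1 subtrees

    def search(root, x):
        """Walk root to leaf; each node visited costs O(1) block reads."""
        node, visits = root, 0
        while True:
            visits += 1
            if x in node.keys:        # comparisons within a block are free in this model
                return visits, True
            if node.children is None:
                return visits, False
            node = node.children[bisect_right(node.keys, x)]

    # Tiny example with B = 2 (branching factor 3); the height is about log_{B+1} N.
    leaves = [Node([1, 2]), Node([4, 5]), Node([7, 8])]
    root = Node([3, 6], leaves)
    print(search(root, 5))   # (2, True): one node per level, two block reads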
In particular, with branching factor B plus 1, even if B is 1 you still have a branching factor of 2, just to make it a reasonable tree. Now in a constant number of block reads you can read all the data you need to decide which child pointer to follow. You follow that child pointer, and so it takes log base B plus 1 of N memory transfers to do a search. This is not as big an improvement as scanning: there we improved by a factor of B, roughly; here we're improving by a factor of basically log B. So not as good as you might hope, dividing by B, but it turns out this is optimal. You need log base B plus 1 of N to do a search in the comparison model. In the same way that you need log base 2 of N time to do a search in the comparison model, in this model you need log base B of N.

Let me talk a little bit about the proof of that. It's a pretty simple information theoretic argument, so let's prove this lower bound. The idea is the following. You want to figure out where your item fits among N items. Say it fits between this element and this element; how many bits of information is that? Well, there are N plus 1 different outcomes of where your item could fit, and if you allow it to exactly match an element, it's maybe twice that, but there are theta N options for your answer. So if you take logs, that is log N bits of information. This is how you normally show that you need log N comparisons: each comparison can give you at most one bit of information, yes or no, and there are log N bits to get, so that's the lower bound for binary search. But now we're in a setting where we're not counting comparisons, even though we're in the comparison model; we're counting the number of block transfers. So when I read in some data, I get B items from my array, or from wherever.
So when I do a block read, all I can do in the comparison model is compare my item to everything that I read, so you might as well compare against all of them. But the information you learn is only log of B plus 1 bits, because what you learn is where your item fits among those B items. In the best case, that is log of B plus 1 bits of information; it could be less if those items are not well distributed among your N items, but it's at most that. So you take the ratio, and you need at least log N over log of B plus 1 block reads, and that's log base B plus 1 of N.

AUDIENCE: [INAUDIBLE]

ERIK DEMAINE: Yeah, question?

AUDIENCE: Why is it that you learn log B plus 1 bits when you [INAUDIBLE]?

ERIK DEMAINE: All right. The claim is that in the comparison model, you compare your item to everyone in the block. But if the block is presorted, then information theoretically all you're learning is where your item fits among those B items. There are order B options for where you fit, and so only log B bits of information there. You might consider this a vague argument; you can do a more formal version of it, but I like the information theoretic perspective.

Cool. So that's the best thing you can do for search in this model. Now, you might also wonder about insertions and deletions. B-trees support insertions and deletions in log base B plus 1 of N time, but that is not optimal; we can do better. For example, if you want to sort, the right answer for sorting is this crazy bound: N over B, which is really good, that's linear time in this model, times log base M over B of N over B. This is roughly log base M of N; the over-B's don't make too big a difference. So this is an improvement: normally sorting takes N log N with base 2. We're improving the base of the log to M over B instead of 2, and we're improving the overall running time by a factor of B. So this is actually a really big improvement.
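To put the two bounds just discussed side by side, here they are in symbols, followed by a back-of-the-envelope comparison with made-up example values (N = 2^30 items, B = 2^10 items per block, M = 2^20 items of cache; these numbers are illustrative, not from the lecture):

\[ \#\text{block reads for a search} \;\ge\; \frac{\log N}{\log (B+1)} \;=\; \log_{B+1} N, \qquad \mathrm{sort}(N) \;=\; O\!\left(\frac{N}{B}\,\log_{M/B}\frac{N}{B}\right). \]

With those example values, the sorting bound is about \(2^{20}\cdot\log_{2^{10}}2^{20} = 2\cdot 2^{20} \approx 2\times 10^{6}\) memory transfers, while inserting everything into a B-tree costs about \(N\log_{B+1}N \approx 2^{30}\cdot 3 \approx 3\times 10^{9}\).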
We're actually going faster than the RAM running time divided by B, because sorting in the cell probe model takes only linear time: you read all the data, you sort it, you put it back. So we can't quite get N over B, but we only lose this log factor. So this is really good, and it's something that would not result from using B-trees. This is a little funny, because in the regular RAM model, if you insert everything into a balanced binary search tree and then do an in-order traversal, that's an optimal sorting algorithm; it gives you N log N. Here, that's not optimal. You throw everything into a B-tree, you get N times log base B of N, and there are two things that are bad. One is that the base of the log is slightly off, B versus M over B; that's not so big a deal. The big thing is that it's N versus N over B. We don't want to pay N; we want to pay N over B. There's also a matching lower bound in the comparison model, which we will not cover here. By next lecture, we'll know how to do sorting optimally.

First, I want to tell you about a bunch of different problems and results. Here's a funny problem which actually does arise in practice, but it's interesting just to think about. Sorting is: I give you an array, I don't know what order it's in, you have to find the right order and then put it into that order. There's an easier problem called permuting: I give you an array and I give you the permutation that I'd like to apply, and I just want you to move the items around. You don't need to do any comparisons; I tell you what the order is, and you just have to physically move the items. This is harder than it sounds, because when you read, you read blocks of items in the original order, and so it's hard to just do an arbitrary permutation. There are two obvious algorithms. One is that you could sort your items.
You could go to every item in order, write down its desired position in the new permutation, and then run a sorting algorithm; that will take N over B log base M over B of N over B. Or the other obvious algorithm is to take the first item and put it where it belongs in a new copy of the array, take the second item and put it where it belongs, and so on; that will take at most N memory transfers, each one potentially a completely separate memory transfer. And the claim is that the optimal thing to do is the min of those two. You just check how N and B and M compare and do the better of the two algorithms. Again, there's a lower bound, in a model called the indivisible model, which assumes that each word of your data doesn't get cut into subwords; the lower bound assumes your words do not get cut up. I believe it's still unsolved whether you could do better otherwise.

The last result in this model I wanted to talk about is called buffer trees. Buffer trees are pretty cool. They're essentially a dynamic version of sorting, which is what B-trees do not give you. They achieve the sorting bound divided by N per operation, so this is optimal. I haven't told you what the operations are yet. The bound is amortized. It has to be amortized, because this bound is typically little o of 1: think of it as 1 over B with a log factor, and assuming that log factor is smaller than B, which is often the case, this is order 1 over B. It's a little o of 1 cost, which is a little fun. Of course, every real cost is an integer, an integer number of block transfers, but amortized you can get a little o of 1 bound. And what buffer trees let you do is delayed queries and batched updates. So I can do an insert and I can do a delete, I can do many of them, and then I can do a query against the current data structure.
I want to know, say, the successor of an item. But I don't get the answer right then; it's delayed. The query will sit there, and eventually the data structure will decide to give me an answer, or it may just never give me an answer. But there's a new operation at the end, flush, which means: give me all the answers to all the questions that I asked. The flush costs the sorting bound. So if you do N operations and then say flush, it doesn't take any extra time; it costs one pass through the entire data structure, one sorting step. The one thing you can get for free is a zero-cost find-min: in a buffer tree you can do find-min instantaneously, because it maintains the min all the time. So this lets you build a priority queue: you can do insert and delete-min in this much time online. But if you want to do other kinds of queries, they're slow to answer; you get the answer much later. So this is a dynamic version of sorting. It lets you do sorting in particular, it lets you do priority queues, and other great things. We will see not quite a buffer tree, but another data structure achieving similar bounds, in the next lecture.

Today's lecture is actually going to be about B-trees, which seems kind of boring, because we already know how to do B-trees. But we're going to do them in another model, which is the cache-oblivious model. So let me tell you about that model. The cache-oblivious model is much newer; it's from 1999, and it's from MIT originally. Charles Leiserson and his students invented it back then, in the context of 6.046, I believe. It's almost the same as the external memory model, with one change: the algorithm doesn't know what B or M is. Now, this is both awkward and cool at the same time. So the assumption is that the computer looks like this.
The algorithm knows that there is a cache, but it doesn't know what the parameters of the cache are; it doesn't know how the memory is blocked. As a consequence, it cannot manually say "read this block, write this block," and it can't decide which block to kick out. This is going to be more like a typical algorithm: the algorithm will look like a regular RAM algorithm. It's going to say read this word, write this word to memory, do an addition, whatever. So the algorithm looks like a RAM algorithm, but the analysis is going to be different. This is convenient, because how you implement an algorithm becomes much clearer in this model.

Now we're going to assume that the cache is maintained automatically; in particular, the way that blocks are replaced becomes automatic. The assumption is that when you read a word from memory, the entire block comes over; you just don't know how much data that is. And when something gets kicked out, the entire block gets kicked out and written back to memory. But you don't get to see how that's happening, because you don't know anything about B or M. We need to assume something about the replacement strategy, and in the cache-oblivious model we assume that it is optimal. This is an offline notion: the block we kick out is the one that will be used farthest in the future, which is very hard to implement in practice, because you need a crystal ball. But the good news is, you could instead use something like LRU, Least Recently Used, kick that one out, or an even simpler algorithm, FIFO, first in first out: the oldest block that you brought in is the one that you kick out. These are actually very close to optimal. They're constant competitive when given a cache of twice the size. So this is from the early days of competitive analysis.
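In symbols, the guarantee being invoked is the classic Sleator and Tarjan paging bound, stated here a bit loosely (sigma is the sequence of accesses and the subscript is the cache size each strategy gets):

\[ \mathrm{LRU}_{M}(\sigma) \;\le\; 2\,\mathrm{OPT}_{M/2}(\sigma) + O\!\left(\frac{M}{B}\right), \]

and the same holds for FIFO.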
It says: if I compare the offline optimal with a cache of size M over 2 against LRU or FIFO, which are online algorithms, with a cache of size M, then the online cost will be at most, I think, twice the offline cost. Now, the good news is, if you look at all these bounds and change M by a factor of 2, nothing changes. Assuming everything is at least some constant, and M and B are not super close, changing M by a factor of 2 doesn't affect these bounds; some of them don't even have M in them, so it's not a big deal. So we're going to assume optimal replacement, basically for analysis. It's not really that big a deal; you could also analyze with LRU or FIFO. The point is that they're all the same within a constant factor for the algorithms we care about.

What other fun things? If you have a good cache-oblivious algorithm, it's good simultaneously for all B and M. That's the consequence. Whereas B-trees, for example, need to know B, they need to have the right branching factor for your value of B, in the cache-oblivious model your data structure has to be correct and has to be optimized for all values of B and all values of M simultaneously. This has some nice side effects: in particular, you essentially adapt to different values of B and M. Two nice side effects of that: first, if you have a multi-level memory hierarchy, each level has its own B and its own M. If you use a cache-oblivious algorithm and all of those caches are maintained automatically using one of these strategies, then you optimize the number of memory transfers at every level. Maybe I should draw a picture: here's a multi-level memory hierarchy. You optimize the number of memory transfers between these two levels, and between these two levels, between these two levels, and between these two levels. So you optimize everything.
Each pair of adjacent levels has some B and some M: if you just think of everything up to a given level as one big cache versus all the levels to the right, that has some B and some M, and you're minimizing the number of memory transfers across that boundary. Second, B and M might be changing, and there are lots of situations for that. If you have multiple programs running at the same time, your cache size might effectively go down if they're really running in parallel on the same computer, or pseudo-parallel. The block size could change too: if you think of a disk and your block size is the size of a track, then closer to the center the tracks are smaller, and closer to the rim of the disk the blocks are bigger. In the cache-oblivious model you basically don't care about any of this, because it will just work. Now, there's no nice formalization of adapting to changing values of B and M; it's kind of an open problem to make explicit what that means. But we're going to assume in the cache-oblivious model that you have one fixed B and one fixed M and analyze relative to that. Of course, that analysis applies simultaneously to all B's and all M's, and so it feels like you adapt optimally, whatever that means.

In some sense, the most surprising thing about the cache-oblivious world is that it's possible: you can work in this model and get basically all of these results. So let me go over those results. In the cache-oblivious model, of course, scans are still good. Scanning didn't assume anything about the block size, except in the analysis; it was just a sequential read. In general, we can do a constant number of scans: some of them can go to the left, some to the right, some reading, some writing. That's the general form of a scan. All of that will take order ceiling of N over B.

OK, search. This is what we're going to be focused on today. You can achieve order log base B plus 1 of N search, insert, and delete.
That's just like B-trees, but without knowing what B is. We're going to do most of that today. What's next? Sorting: you can do the same bound cache-obliviously; that we'll see next lecture.

Next is permuting. Here there's actually a negative result, which is that you can't achieve the min. It doesn't say exactly what you can do. In particular, we can certainly permute as quickly as we can sort, and we can achieve this min in a weird sense: we can achieve N, and we can achieve the sorting bound for permutation in the cache-oblivious model, either one, but not both with the min around them, because to know which one to apply you need to know how B and M relate to N, and you don't here. There's a lower bound saying you really can't achieve the min bound, even up to constant factors, but it doesn't explicitly say anything more than that. So it's a little awkward. Fortunately, permuting isn't that big a deal, and most of the time the sorting bound is much faster than the N bound, so it's not so bad, just slightly awkward. So there are some things you can't do; this was the first negative result in the cache-oblivious model.

Another fun negative result is actually about search trees. You might ask, what is the constant factor in front of the log base B plus 1 of N? For B-trees, the constant factor is essentially 1: you get 1 times log base B plus 1 of N. In the cache-oblivious world, you can prove that 1 is unattainable, and the best possible constant factor is log base 2 of e, which is about 1.44. So that is the right answer for the constant. Today we'll get a constant of 4 for search, but 4 has since been improved to 1.44, so there's a tiny separation there between cache-oblivious and external memory.

Let's see, one more thing. Continuing over here is the priority queue.
Priority queues we can do cache-obliviously in the optimal bound, 1 over B log base M over B of N over B per operation. The only catch with that result is that it assumes something called the tall cache assumption. We'll say M is omega of B to the 1 plus epsilon; that's an ugly-looking omega. Certainly M is at least B: if you have one cache line, you have M equal to B. But we'd like M to be a little bit bigger than B. If M were at least B squared, then the height of the cache would be at least as big as the width of the cache, hence "tall cache." But we don't even need M equal to B squared; M equaling B to the 1.001 would be enough. Then you can get this result. And there's a theorem showing that cache-obliviously you need to assume this tall cache assumption; you cannot achieve this bound without it. But it's usually true for real caches, so life is good.

All right, so that's a quick summary of cache-oblivious results. Any questions about those? Then we proceed to achieving log base B of N search tree performance cache-obliviously. That's what makes it interesting, because in external memory we already know how to do it with a B-tree. We've covered both models, so the cache-oblivious B-tree is what remains. This was the first result in cache-oblivious data structures. I'm going to start with the static problem: cache-oblivious static search trees. This was actually in an MEng thesis here by Harald Prokop.

The idea is very simple. We can't use B-trees because we don't know what B is, so we'll use binary search trees; those have served us well so far. Let's use a balanced binary search tree, and because it's static, we might as well assume it's a perfect balanced binary search tree. Let me draw one over here; I'll draw my favorite one. So how do you search in a binary search tree? Binary search. Our algorithm is fixed.
So the only flexibility we have is how we map this tree into sequential order in memory. Memory is an array, so for each of these nodes we get to assign some order, some position in the array of memory, and then each child pointer changes where it points accordingly. Now, cache-obliviously, you're going to follow some root-to-leaf path; you don't know which one. All of them visit nodes in some order which looks pretty random, so in the array of memory it's going to be some scattered sequence. We would like the number of blocks that contain those nodes to be relatively small somehow. That's the hard part. You could try level order; you could try in-order, pre-order, post-order. All of them fail. The right order is something called the van Emde Boas order.

Let's say there are N nodes. What we're going to do is carve the tree at the middle level of edges. So we have a tree; we divide it at the middle level, which leaves us with a top part and many bottom parts. There are going to be about square root of N nodes in each of these little triangles, roughly speaking, and the number of triangles down at the bottom is going to be roughly square root of N. I think I have written here square root of N plus 1; it's a little overestimate, because if I multiply those two numbers it's bigger than N, but it's roughly that. If there were root N nodes up top, then there would be square root of N plus 1 children, and square root of N plus 1 triangles down there. So roughly root N different pieces, each of size root N. Then what we do is recursively lay out all those triangles of size roughly square root of N, and then concatenate. It's a very simple idea.

Let's apply it to the tree I drew, a perfect binary tree on 15 nodes. I split at the middle level, and I recursively lay out the triangle up top, drawn as a box. Then I recursively lay out this one.
Then I recursively lay out this one, and this one, and this one. To lay out the top triangle, I would again split at its middle level and lay out its pieces, so its root goes first, then its left child, then its right child: positions 1, 2, 3. Now I go over and recursively do the next triangle, which gets positions 4, 5, 6. Then the next one, 7, 8, 9, then 10, 11, 12, then 13, 14, 15. That is the order in which I store the nodes, and I claim it's a good order. For example, if B equals 3, this looks really good, because then all three nodes of the top triangle are in the first block, and so I get to choose which of the four child trees I should look at. I happen to go here and then maybe here; I look at those two nodes, look at their keys to decide which branch to go into, but I only pay one memory transfer to look at all those nodes. So this looks like a tree tuned for B equals 3, but it turns out it's tuned for all B's simultaneously if we do this recursively. It's hard to draw a small example where you see all the B's at once, but that's the best I can do.

OK, let's analyze this thing. The order of the nodes is clear. If you know what the van Emde Boas data structure is, then you know why it's called this; otherwise, see lecture 10 or so in the future, I forget exactly where we're covering it. This order was not invented by van Emde Boas, but it looks very similar to something else that was invented by van Emde Boas, so we call it that. That's the name of a guy, Peter.

Analysis. The order is clear, and the algorithm is clear: we follow our root-to-leaf path. What we claim is that for any root-to-leaf path and any value of B, the number of blocks visited is order log base B of N. To do that, in the analysis we get to know what B is. The algorithm didn't know: the algorithm here doesn't know B and doesn't need to know B. But to analyze it, to prove an order log base B of N bound, we need to know what B is.
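Concretely, the recursion "split at the middle level, lay out the top recursively, then each bottom triangle recursively, and concatenate" can be written in a few lines. This is an editorial sketch, not code from the lecture: nodes are named by their BFS/heap index (root is 1, children of i are 2i and 2i+1), and the function name veb_order is made up.

    def veb_order(root=1, height=4):
        """van Emde Boas layout of a perfect binary tree with `height` levels."""
        if height == 1:
            return [root]
        top_h = height // 2              # levels in the top triangle
        bot_h = height - top_h           # levels in each bottom triangle
        order = veb_order(root, top_h)
        # roots of the bottom triangles: descendants of `root` at depth top_h
        for j in range(2 ** top_h):
            order += veb_order(root * 2 ** top_h + j, bot_h)
        return order

    print(veb_order(1, 4))
    # [1, 2, 3, 4, 8, 9, 5, 10, 11, 6, 12, 13, 7, 14, 15]

For the 15-node tree this lists the top triangle first and then the four bottom triangles left to right, each occupying three consecutive positions, which is the same grouping as the 1 through 15 numbering on the board.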
While the layout recursion goes all the way down, dividing these triangles into smaller and smaller pieces, in the analysis we're going to look at a particular level of that recursion. I'll call it the level of detail that straddles B. The picture is the following: the little triangles at this level have size less than B, but the big triangles they came from have size greater than or equal to B. At some level of the recursion that happens, by monotonicity, I guess. The recursion could keep going; this tree is going to have substantial height, and I don't really know how tall it is, or rather that's sort of the central question. Let me draw in some pointers here. But just look at the point where you've subdivided all the big triangles until you get to small triangles of size less than B, and then stop.

Now, the algorithm of course recursively lays those out further, but what we know is that we recursively lay things out and then concatenate, meaning the recursive layouts remain consecutive; we never interleave two layouts. So however each little triangle is laid out internally, we don't really care: at this point it's going to be stored in memory, which is a giant array, in an interval of size less than B. At worst, where are the block boundaries? Maybe I'll draw those in red. The blocks have size B, and we don't know exactly how their boundaries are interspersed. So it could be that one of these little subarrays falls across two blocks, but at most two blocks. So the little triangles of size less than B live in at most two memory blocks each, and to visit each of these little triangles we spend at most two memory reads.

All that's left is to count how many of those little triangles we need to visit on a root-to-leaf path. I claim that's going to be order log base B of N. So I guess the first question is, what is the height of one of these little triangles?
791 00:47:29,280 --> 00:47:31,230 Well, we start with a tree of height 792 00:47:31,230 --> 00:47:34,594 log N. Then we divide it in half and we divide it in half 793 00:47:34,594 --> 00:47:35,510 and divide it in half. 794 00:47:35,510 --> 00:47:38,430 And we take floors and ceilings, whatever. 795 00:47:38,430 --> 00:47:41,940 But we repeatedly divide the height in half. 796 00:47:41,940 --> 00:47:44,580 And we stop whenever the number of nodes in there 797 00:47:44,580 --> 00:47:47,730 is less than B. So that means the height is going 798 00:47:47,730 --> 00:47:52,950 to be theta log B. In fact, it's going 799 00:47:52,950 --> 00:47:59,460 to be in the range 1/2 log B to log B, 800 00:47:59,460 --> 00:48:01,200 probably with a paren that way. 801 00:48:01,200 --> 00:48:05,744 Doesn't really matter, because you stop as soon as you hit B. 802 00:48:05,744 --> 00:48:07,660 You're dividing the height in half repeatedly. 803 00:48:07,660 --> 00:48:11,700 You might overshoot by at most a half. 804 00:48:11,700 --> 00:48:12,930 That's it. 805 00:48:12,930 --> 00:48:14,970 So height is log B. 806 00:48:14,970 --> 00:48:17,730 So if we look along a root to leaf path, 807 00:48:17,730 --> 00:48:19,410 each of these things we visit, we 808 00:48:19,410 --> 00:48:24,470 make vertical progress log B, theta log B. 809 00:48:24,470 --> 00:48:28,270 And the total vertical progress we need to make is log N, 810 00:48:28,270 --> 00:48:32,535 and so it's log N divided by log B. 811 00:48:32,535 --> 00:48:48,690 So I guess we visit at most 2 log N over log B 812 00:48:48,690 --> 00:48:51,350 triangles before we reach a leaf. 813 00:48:54,074 --> 00:48:56,920 It's the total progress we need to make divided by the 1/2 814 00:48:56,920 --> 00:49:01,200 log B here, the progress per triangle. 815 00:49:01,200 --> 00:49:04,820 So we get 2 times log base B of N. And then for each triangle, 816 00:49:04,820 --> 00:49:07,100 we have to pay 2 memory reads, so 817 00:49:07,100 --> 00:49:16,430 this implies at most 4 log base B of N memory transfers. 818 00:49:20,760 --> 00:49:24,020 And I implicitly assumed here that B is not 1. 819 00:49:24,020 --> 00:49:29,710 Otherwise, we'd need a B plus 1 here to be interesting. 820 00:49:29,710 --> 00:49:33,810 So long as B isn't 1, then it doesn't matter. 821 00:49:33,810 --> 00:49:34,310 OK. 822 00:49:34,310 --> 00:49:36,680 So that's the log base B of N analysis, actually really 823 00:49:36,680 --> 00:49:37,196 simple. 824 00:49:37,196 --> 00:49:38,570 Essentially what we're doing here 825 00:49:38,570 --> 00:49:44,720 is binary searching on B in a kind of weird way, using divide 826 00:49:44,720 --> 00:49:46,210 and conquer and recursion. 827 00:49:46,210 --> 00:49:48,320 So we say, well, maybe B is the whole thing, 828 00:49:48,320 --> 00:49:51,090 then B equals N. Then we don't have to do anything. 829 00:49:51,090 --> 00:49:52,500 Everything fits in a block. 830 00:49:52,500 --> 00:49:55,460 Otherwise, let's imagine dividing here and supposing 831 00:49:55,460 --> 00:50:01,240 B is root N, and then we need two accesses. 832 00:50:01,240 --> 00:50:04,640 But B might be smaller than that, so we keep recursing, 833 00:50:04,640 --> 00:50:05,480 keep going down. 834 00:50:05,480 --> 00:50:07,550 And because we're dividing in half each time-- 835 00:50:09,780 --> 00:50:10,280 sorry. 836 00:50:10,280 --> 00:50:11,863 We're not binary searching on B. We're 837 00:50:11,863 --> 00:50:15,830 binary searching on log B, which is the height 838 00:50:15,830 --> 00:50:19,037 of a tree with B nodes in it.
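Collecting the analysis into one line before moving on (the same constants as on the board):

```latex
\text{height of a little triangle} \in \left[\tfrac{1}{2}\log B,\ \log B\right]
\;\Longrightarrow\;
\#\{\text{triangles on a root-to-leaf path}\} \le \frac{\log N}{\tfrac{1}{2}\log B} = 2\log_B N
\;\Longrightarrow\;
\#\{\text{memory transfers}\} \le 2 \cdot 2\log_B N = 4\log_B N .
```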
839 00:50:19,037 --> 00:50:20,870 And that's exactly what we can afford to do. 840 00:50:20,870 --> 00:50:22,190 And conveniently, it's all we need to do, 841 00:50:22,190 --> 00:50:24,080 because to get log base B of N correct, 842 00:50:24,080 --> 00:50:25,865 you only need to get log B correct up 843 00:50:25,865 --> 00:50:26,740 to a constant factor. 844 00:50:29,510 --> 00:50:32,210 All right. 845 00:50:32,210 --> 00:50:34,080 Now, that was the easy part. 846 00:50:34,080 --> 00:50:36,540 Now the hard part is to make this dynamic. 847 00:50:36,540 --> 00:50:38,270 This is where things get exciting. 848 00:50:38,270 --> 00:50:41,022 This works fine for a static tree. 849 00:50:41,022 --> 00:50:42,230 And this is how we do search. 850 00:50:42,230 --> 00:50:47,660 But we can't do inserts and deletes in this world easily. 851 00:50:47,660 --> 00:50:50,395 But I'll show you how to do it nonetheless. 852 00:51:05,450 --> 00:51:07,995 So this is what's called a cache-oblivious B-tree. 853 00:51:15,350 --> 00:51:17,840 We're going to do it in five steps. 854 00:51:20,350 --> 00:51:21,950 And the first step, unfortunately, 855 00:51:21,950 --> 00:51:25,800 is delayed until next lecture, so I need a black box. 856 00:51:25,800 --> 00:51:29,310 It's a very useful black box. 857 00:51:29,310 --> 00:51:32,840 It's actually one that we've briefly talked about before, 858 00:51:32,840 --> 00:51:33,965 I think in the time travel. 859 00:51:40,939 --> 00:51:42,480 If you remember, in time travel there 860 00:51:42,480 --> 00:51:44,396 was one black box, which we still haven't seen 861 00:51:44,396 --> 00:51:47,080 but we will do finally next lecture, which 862 00:51:47,080 --> 00:51:48,739 was you can insert into a linked list 863 00:51:48,739 --> 00:51:50,530 and then given two items in the linked list 864 00:51:50,530 --> 00:51:52,321 know which one comes before which other one 865 00:51:52,321 --> 00:51:54,850 in constant time per operation. 866 00:51:54,850 --> 00:51:57,220 Closely related to that is a slightly different problem 867 00:51:57,220 --> 00:51:58,900 called order file maintenance. 868 00:51:58,900 --> 00:52:01,840 Here's what it lets you do. 869 00:52:01,840 --> 00:52:08,320 You can store N elements, N words in specified order, 870 00:52:08,320 --> 00:52:12,026 so just like it's a linked list. 871 00:52:12,026 --> 00:52:13,650 But we don't store it as a linked list. 872 00:52:13,650 --> 00:52:18,745 We want to store it in an array of linear size. 873 00:52:25,510 --> 00:52:27,890 So there's going to be some blank spots in this array. 874 00:52:27,890 --> 00:52:32,410 We've got N items, an array of size, say 2N or 1.1 times 875 00:52:32,410 --> 00:52:35,590 N, anything like that will suffice. 876 00:52:35,590 --> 00:52:41,860 The rest, we call them gaps, gaps in the array. 877 00:52:41,860 --> 00:52:47,250 And the updates-- so this is physically what you have to do. 878 00:52:47,250 --> 00:52:50,320 You can change this set of elements 879 00:52:50,320 --> 00:52:55,300 by either deleting an element or inserting an element 880 00:52:55,300 --> 00:52:57,430 between two given elements. 881 00:53:09,776 --> 00:53:11,650 So this is just like the linked list updates. 882 00:53:11,650 --> 00:53:13,358 I can delete somebody from a linked list. 883 00:53:13,358 --> 00:53:15,640 I can insert a new item in between two given 884 00:53:15,640 --> 00:53:16,930 items in the linked list. 885 00:53:16,930 --> 00:53:21,430 But now it has to be stored in a physical array of linear size. 
886 00:53:21,430 --> 00:53:24,370 Now, we can't do this in constant time. 887 00:53:24,370 --> 00:53:45,320 But what we can do is move elements around. 888 00:54:04,231 --> 00:54:04,730 OK. 889 00:54:04,730 --> 00:54:06,600 So let me draw a picture. 890 00:54:28,980 --> 00:54:31,080 So maybe we have some elements here. 891 00:54:35,940 --> 00:54:38,880 They could be in sorted order, maybe not. 892 00:54:42,590 --> 00:54:45,720 There are some gaps in between. 893 00:54:45,720 --> 00:54:50,235 I didn't mention it, but all gaps will have constant size. 894 00:54:54,460 --> 00:54:58,710 So this will be constant-sized gaps. 895 00:54:58,710 --> 00:55:02,250 And we're able to do things like insert a new item 896 00:55:02,250 --> 00:55:04,140 right after 42. 897 00:55:04,140 --> 00:55:06,590 Let's say we insert 12. 898 00:55:06,590 --> 00:55:09,930 So 12, we could put it right here, no problem. 899 00:55:09,930 --> 00:55:14,470 Now maybe we say insert right after 42 the number 6. 900 00:55:14,470 --> 00:55:17,009 And we're like, oh, there's no room to put 6. 901 00:55:17,009 --> 00:55:18,300 So I've got to move some stuff. 902 00:55:18,300 --> 00:55:23,100 Maybe I move 12 over here and then I put 6 here. 903 00:55:23,100 --> 00:55:27,966 And then I say, OK, now insert right after 42 the number 7. 904 00:55:27,966 --> 00:55:30,960 And you're like, uh-oh. 905 00:55:30,960 --> 00:55:37,920 I could move these guys over one and then put 7 here, and so on. 906 00:55:37,920 --> 00:55:39,300 Obvious algorithm to do this. 907 00:55:39,300 --> 00:55:43,020 Takes linear time, linear number of shifts in order 908 00:55:43,020 --> 00:55:44,400 to do one insertion. 909 00:55:44,400 --> 00:55:48,970 The claim is you can do an insertion in log squared N amortized. 910 00:55:48,970 --> 00:55:49,776 It's pretty cool. 911 00:55:49,776 --> 00:55:51,150 So worst case, you're still going 912 00:55:51,150 --> 00:55:52,740 to need to shift everybody. 913 00:55:52,740 --> 00:55:54,600 But just leaving constant-sized gaps 914 00:55:54,600 --> 00:55:57,000 will be enough to reduce amortized costs to log squared. 915 00:55:57,000 --> 00:55:58,916 And this is conjectured optimal, though no one 916 00:55:58,916 --> 00:56:00,850 knows how to prove that. 917 00:56:00,850 --> 00:56:02,370 And that's ordered file maintenance. 918 00:56:02,370 --> 00:56:04,380 Now, we're going to assume this is a black box. 919 00:56:04,380 --> 00:56:08,430 Next lecture, we'll actually do it. 920 00:56:08,430 --> 00:56:12,450 But I want to first show you how it 921 00:56:12,450 --> 00:56:15,430 lets us get cache-oblivious dynamic B-trees. 922 00:56:21,270 --> 00:56:23,535 Log base B of N, insert, delete, search. 923 00:56:32,520 --> 00:56:34,115 OK. 924 00:56:34,115 --> 00:56:36,240 We're going to use an ordered file maintenance data 925 00:56:36,240 --> 00:56:41,910 structure to store our keys, just stick them in there. 926 00:56:41,910 --> 00:56:44,100 Now, this is good-- 927 00:56:44,100 --> 00:56:46,726 well, it's really not good for anything, 928 00:56:46,726 --> 00:56:48,330 although it's a starting point. 929 00:56:48,330 --> 00:56:51,000 It can do updates in log squared of N-- 930 00:56:51,000 --> 00:56:52,710 I didn't mention this part-- 931 00:56:52,710 --> 00:56:55,110 in a constant number of scans. 932 00:56:55,110 --> 00:56:58,230 So think of these as this kind of shift. 933 00:56:58,230 --> 00:57:01,980 I have to scan left to right, copy 23 to the left, 934 00:57:01,980 --> 00:57:04,560 42 to the left, and so on.
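Before moving on, here is a deliberately naive Python sketch pinning down the interface of the ordered-file black box just described (the real structure, next lecture, achieves the O(log^2 N) amortized bound; this stand-in simply shifts right until a gap absorbs the overflow, so one insert can move Theta(N) items, and the class and method names are invented for the example).

```python
class NaiveOrderedFile:
    """Interface of the ordered-file black box: N items kept in a specified
    order in an array of linear size, with gaps.  This naive version shifts
    until it hits a gap; the real structure bounds the moved interval by
    O(log^2 N) amortized."""

    def __init__(self, capacity):
        self.slots = [None] * capacity          # None marks a gap

    def insert_after(self, index, key):
        """Insert key just after position index (index=None means the front),
        shifting items right until a gap absorbs the overflow."""
        pos = 0 if index is None else index + 1
        carry = key
        while carry is not None:
            if pos == len(self.slots):
                raise RuntimeError("array full; a real OFM rebuilds here")
            carry, self.slots[pos] = self.slots[pos], carry
            pos += 1

    def delete_at(self, index):
        """Delete the item at index, leaving a gap."""
        self.slots[index] = None


f = NaiveOrderedFile(10)
f.insert_after(None, 42)   # [42, -, -, ...]
f.insert_after(0, 12)      # [42, 12, -, ...]
f.insert_after(0, 6)       # [42, 6, 12, ...]    12 shifted right
f.insert_after(0, 7)       # [42, 7, 6, 12, ...] 6 and 12 shifted right
print(f.slots)
```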
935 00:57:04,560 --> 00:57:07,510 Scans are things we know how to do fast in N 936 00:57:07,510 --> 00:57:09,990 over B memory transfers. 937 00:57:09,990 --> 00:57:12,075 So if you're going to do log squared-- 938 00:57:12,075 --> 00:57:14,880 if you're going to update an interval log squared guys 939 00:57:14,880 --> 00:57:16,890 using constant number of scans, this 940 00:57:16,890 --> 00:57:22,920 will take log squared N over B memory transfers. 941 00:57:30,370 --> 00:57:31,570 Not quite the bound we want. 942 00:57:31,570 --> 00:57:33,440 We want log base B of N. 943 00:57:33,440 --> 00:57:36,220 This could be larger or smaller than log base B of N. 944 00:57:36,220 --> 00:57:39,160 It depends how big B is. 945 00:57:39,160 --> 00:57:40,930 But don't worry about it for now. 946 00:57:40,930 --> 00:57:43,210 This is still-- at least it's poly log. 947 00:57:43,210 --> 00:57:45,300 Updates are kind of vaguely decent. 948 00:57:45,300 --> 00:57:47,410 We'll improve them later. 949 00:57:47,410 --> 00:57:50,830 On the other hand, search is really bad here. 950 00:57:50,830 --> 00:57:52,650 I mean, we're maintaining the items-- 951 00:57:52,650 --> 00:57:55,150 I didn't mention it, but we will actually maintain the items 952 00:57:55,150 --> 00:57:55,997 in sorted order. 953 00:57:55,997 --> 00:57:57,580 So yeah, you could do a binary search. 954 00:57:57,580 --> 00:57:59,830 But binary search is really bad cache-obliviously, 955 00:57:59,830 --> 00:58:01,621 because most of the time you'll be visiting 956 00:58:01,621 --> 00:58:04,360 a new block until the very end. 957 00:58:04,360 --> 00:58:07,510 So binary search-- this is a fun exercise-- 958 00:58:07,510 --> 00:58:12,670 is log N minus log B. We want log N divided by log B. 959 00:58:12,670 --> 00:58:14,290 So we can't use binary search. 960 00:58:14,290 --> 00:58:18,580 We need to use van Emde Boas layouts, this thing. 961 00:58:18,580 --> 00:58:20,200 So we just plug these together. 962 00:58:20,200 --> 00:58:22,690 We take this static search tree and stick it 963 00:58:22,690 --> 00:58:26,000 on top of an ordered file maintenance data structure. 964 00:58:26,000 --> 00:58:32,980 So here we have a van Emde Boas layout static tree 965 00:58:32,980 --> 00:58:37,980 on top of this thing, which is an ordered file. 966 00:58:41,980 --> 00:58:43,344 And we have cross-linking. 967 00:58:47,520 --> 00:58:50,580 Really here we have gaps. 968 00:58:50,580 --> 00:58:52,590 And we don't store the gaps up here. 969 00:58:52,590 --> 00:58:53,380 Or do we? 970 00:58:53,380 --> 00:58:53,880 Yeah, sure. 971 00:58:53,880 --> 00:58:54,950 We do, actually. 972 00:58:54,950 --> 00:58:56,010 Store them all. 973 00:58:56,010 --> 00:58:57,360 Put it all in there. 974 00:58:57,360 --> 00:58:57,860 All right. 975 00:58:57,860 --> 00:58:59,568 I'm going to have an actual example here. 976 00:59:14,100 --> 00:59:30,760 Let's do it 3 slash 7, 9 slash 16, 21, 42. 977 00:59:46,400 --> 00:59:47,850 OK. 978 00:59:47,850 --> 00:59:49,080 Cross-links between. 979 00:59:49,080 --> 00:59:50,250 I won't draw all of them. 980 00:59:50,250 --> 00:59:53,210 Then we store all the data up here as well. 981 00:59:58,860 --> 01:00:01,130 And then the internal nodes here are 982 01:00:01,130 --> 01:00:03,990 going to store the max of the whole subtree. 983 01:00:03,990 --> 01:00:05,565 So here the max is 3. 984 01:00:05,565 --> 01:00:15,510 Here the max is 9, 16, 42, 42, and 9. 985 01:00:18,040 --> 01:00:20,934 If I store the max of every subtree, then at the root. 
986 01:00:20,934 --> 01:00:23,350 If I want to decide should I go left or should I go right, 987 01:00:23,350 --> 01:00:25,282 I look at the max of the left subtree 988 01:00:25,282 --> 01:00:26,740 and that will let me know whether I 989 01:00:26,740 --> 01:00:29,020 should go left or right. 990 01:00:29,020 --> 01:00:31,780 So now I can do binary search. 991 01:00:31,780 --> 01:00:33,870 I do a search up here. 992 01:00:33,870 --> 01:00:35,650 I want to do a search, I search the tree. 993 01:00:35,650 --> 01:00:38,620 This thing is laid out just like this. 994 01:00:38,620 --> 01:00:46,980 And so we pay log base B of N to search up here, plus 1, 995 01:00:46,980 --> 01:00:48,350 if you want. 996 01:00:48,350 --> 01:00:51,230 And then we get to a key and we're done. 997 01:00:51,230 --> 01:00:52,810 That's all we wanted to know. 998 01:00:52,810 --> 01:00:55,090 So search is now good. 999 01:00:55,090 --> 01:00:58,250 Updates, two problems. 1000 01:00:58,250 --> 01:01:01,892 One is that updating just this bottom part is slow. 1001 01:01:01,892 --> 01:01:03,850 Second problem is if I change the bottom thing, 1002 01:01:03,850 --> 01:01:06,620 now I have to change the top thing. 1003 01:01:06,620 --> 01:01:08,500 But let's look at that part first. 1004 01:01:08,500 --> 01:01:10,780 It's not so bad to update-- 1005 01:01:10,780 --> 01:01:14,330 I mean, this structure stays the same, unless N doubles. 1006 01:01:14,330 --> 01:01:17,020 Then we rebuild the entire data structure. 1007 01:01:17,020 --> 01:01:19,160 But basically, this thing will stay the same. 1008 01:01:19,160 --> 01:01:21,580 It's just data is moving around in this array. 1009 01:01:21,580 --> 01:01:23,620 And we have to update which values 1010 01:01:23,620 --> 01:01:24,977 are stored in this structure. 1011 01:01:24,977 --> 01:01:26,560 The layout of the structure, the order 1012 01:01:26,560 --> 01:01:28,180 of the nodes, that all stays fixed. 1013 01:01:28,180 --> 01:01:29,680 It's just the numbers that are being 1014 01:01:29,680 --> 01:01:31,360 written in here that change. 1015 01:01:31,360 --> 01:01:34,300 That's the cool idea. 1016 01:01:34,300 --> 01:01:41,490 So that's step 2. 1017 01:01:41,490 --> 01:01:42,510 Let's go to step 3. 1018 01:01:52,650 --> 01:01:54,880 I actually kind of wanted that exact picture. 1019 01:01:54,880 --> 01:01:55,760 That's OK. 1020 01:01:58,940 --> 01:02:01,520 I need to specify how updates are done. 1021 01:02:05,610 --> 01:02:08,180 So let's say I want to do an insertion. 1022 01:02:12,070 --> 01:02:14,950 If I want to insert a new key, I search for the key. 1023 01:02:14,950 --> 01:02:17,830 Maybe it fits between 7 and 9. 1024 01:02:17,830 --> 01:02:20,680 Then I do the insertion in the bottom structure 1025 01:02:20,680 --> 01:02:21,835 in the ordered file. 1026 01:02:21,835 --> 01:02:24,190 So we want to insert 8. 1027 01:02:24,190 --> 01:02:25,960 These guys move around. 1028 01:02:25,960 --> 01:02:29,180 Then I have to propagate that information up here. 1029 01:02:29,180 --> 01:02:31,210 And the key thing I want to specify 1030 01:02:31,210 --> 01:02:34,950 is an update does a search. 1031 01:02:34,950 --> 01:02:39,220 Then it updates the ordered file. 1032 01:02:39,220 --> 01:02:44,830 That's going to take log squared N over B. 1033 01:02:44,830 --> 01:02:47,950 And then we want to propagate changes 1034 01:02:47,950 --> 01:03:02,660 into the tree in post-order, post-order traversal. 
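To make steps 2 and 3 concrete, here is a small Python sketch of the tree-on-top-of-an-ordered-file idea (illustrative code, not the lecture's; it uses an ordinary heap layout for the tree, whereas the real structure stores the same nodes in van Emde Boas order, which changes only the cache behavior, not the logic). Internal nodes store the max of their subtree, search walks down comparing against the left child's max, and after the ordered file moves an interval, the affected leaves and their ancestors are recomputed children-before-parents, which is what the post-order traversal guarantees.

```python
import math

class StaticMaxTree:
    """Complete binary tree over the slots of an ordered file; gaps are None.
    Internal node v stores the max key in its subtree."""

    def __init__(self, slots):
        self.slots = slots
        self.n_leaves = 1 << max(1, math.ceil(math.log2(max(len(slots), 1))))
        self.node = [None] * (2 * self.n_leaves)
        self.refresh(0, len(slots))              # build all maxes bottom-up

    @staticmethod
    def _max(a, b):
        return b if a is None else a if b is None else max(a, b)

    def search(self, key):
        """Index of the slot holding the smallest stored key >= key, assuming
        key is at most the max stored key.  One root-to-leaf walk, which is
        O(log_B N) transfers once the nodes are in van Emde Boas order."""
        v = 1
        while v < self.n_leaves:
            left_max = self.node[2 * v]
            v = 2 * v if left_max is not None and key <= left_max else 2 * v + 1
        return v - self.n_leaves

    def refresh(self, lo, hi):
        """Slots[lo:hi] changed: rewrite those leaves, then recompute the
        maxes of all their ancestors, always children before parents."""
        for i in range(lo, hi):
            self.node[self.n_leaves + i] = self.slots[i]
        lo, hi = (self.n_leaves + lo) // 2, (self.n_leaves + max(hi - 1, lo)) // 2
        while lo >= 1:
            for v in range(lo, hi + 1):
                self.node[v] = self._max(self.node[2 * v], self.node[2 * v + 1])
            lo, hi = lo // 2, hi // 2


slots = [3, None, 7, 9, None, 16, 21, 42]
t = StaticMaxTree(slots)
print(t.search(8))           # 3: the slot of 9, which is 8's successor
slots[3], slots[4] = 8, 9    # the ordered file shifted 9 right and placed 8
t.refresh(3, 5)              # recompute maxes over the changed interval
print(t.search(8))           # 3: now the slot of 8 itself
```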
1035 01:03:02,660 --> 01:03:05,660 That's the one thing I wanted to specify, 1036 01:03:05,660 --> 01:03:09,270 because it actually matters in what order I update the item. 1037 01:03:09,270 --> 01:03:11,820 So for example, suppose I move 9 to the right 1038 01:03:11,820 --> 01:03:14,720 and then I insert 8 here. 1039 01:03:14,720 --> 01:03:18,440 So then this thing-- 1040 01:03:18,440 --> 01:03:22,930 let's do it, what changes in that situation. 1041 01:03:22,930 --> 01:03:27,140 9 will change to an 8 here, which 1042 01:03:27,140 --> 01:03:29,600 means this value has to change to an 8 1043 01:03:29,600 --> 01:03:32,630 and this value needs to change to an 8. 1044 01:03:32,630 --> 01:03:35,570 And this one I can't quite update yet, 1045 01:03:35,570 --> 01:03:37,460 because I need to do the other tree. 1046 01:03:37,460 --> 01:03:43,010 So here I change this value to a 9. 1047 01:03:43,010 --> 01:03:45,470 This one doesn't change, so this is OK. 1048 01:03:45,470 --> 01:03:46,700 This is OK. 1049 01:03:46,700 --> 01:03:48,000 This is OK. 1050 01:03:48,000 --> 01:03:50,210 Just taking the maxes as I propagate up. 1051 01:03:50,210 --> 01:03:53,330 So the red things are the nodes that I touch and compute 1052 01:03:53,330 --> 01:03:53,900 the maxes of. 1053 01:03:53,900 --> 01:03:55,983 And the order I did it was a post-order traversal. 1054 01:03:55,983 --> 01:03:57,830 I need to do a post-order traversal in order 1055 01:03:57,830 --> 01:03:59,270 to compute maxes. 1056 01:03:59,270 --> 01:04:01,130 I need to know what the max of this subtree 1057 01:04:01,130 --> 01:04:04,240 is and this subtree before I can re-compute the max of the root. 1058 01:04:06,860 --> 01:04:09,100 So it's really the obvious algorithm. 1059 01:04:09,100 --> 01:04:11,050 Now, the interesting part is the analysis. 1060 01:04:19,590 --> 01:04:21,662 So searches we've already analyzed. 1061 01:04:21,662 --> 01:04:23,120 We don't need to do anything there. 1062 01:04:23,120 --> 01:04:29,790 That's log base B of N. 1063 01:04:29,790 --> 01:04:32,532 But updates are the interesting thing to analyze. 1064 01:04:43,860 --> 01:04:49,380 And again we're going to look at the level of detail that 1065 01:04:49,380 --> 01:04:53,070 straddles B, just like before. 1066 01:05:01,490 --> 01:05:03,350 So let me draw a good picture. 1067 01:05:20,182 --> 01:05:21,890 I'm going to draw it slightly differently 1068 01:05:21,890 --> 01:05:23,098 from how I drew it last time. 1069 01:05:40,341 --> 01:05:40,840 OK. 1070 01:05:44,525 --> 01:05:45,900 What I'm drawing here is actually 1071 01:05:45,900 --> 01:05:48,510 the bottom of the tree. 1072 01:05:48,510 --> 01:05:50,760 So a real tree is like this. 1073 01:05:50,760 --> 01:05:52,820 And I'm looking at the bottom, where 1074 01:05:52,820 --> 01:05:54,570 the leaves are, because I care about where 1075 01:05:54,570 --> 01:05:56,960 I connect to this ordered file. 1076 01:05:56,960 --> 01:05:59,700 So there's the ordered file at the bottom. 1077 01:05:59,700 --> 01:06:01,945 And I'm refining each triangle recursively 1078 01:06:01,945 --> 01:06:03,570 until they're size less than B. They're 1079 01:06:03,570 --> 01:06:06,900 going to be somewhere between root B and B. 1080 01:06:06,900 --> 01:06:13,344 These big triangles are bigger than B. OK. 1081 01:06:13,344 --> 01:06:15,360 So that's where I want to look at things. 1082 01:06:15,360 --> 01:06:19,015 Now, when we update in the ordered file, 1083 01:06:19,015 --> 01:06:19,890 we have this theorem. 
1084 01:06:19,890 --> 01:06:22,990 It says you move elements in an interval. 1085 01:06:22,990 --> 01:06:24,600 So what changes is an interval. 1086 01:06:24,600 --> 01:06:27,960 The interval has size log squared N amortized. 1087 01:06:27,960 --> 01:06:33,900 So we are updating some interval here, 1088 01:06:33,900 --> 01:06:35,220 maybe something like this. 1089 01:06:39,450 --> 01:06:44,509 And so all of these items change, let's say. 1090 01:06:44,509 --> 01:06:46,050 Now, this is not going to be too big. 1091 01:06:46,050 --> 01:06:47,850 It's going to be log square N amortized. 1092 01:06:53,440 --> 01:06:55,660 And all of the ancestors of those nodes 1093 01:06:55,660 --> 01:06:58,090 have to be updated, potentially. 1094 01:06:58,090 --> 01:07:01,150 So all these guys are going to get updated. 1095 01:07:01,150 --> 01:07:05,770 Some of this tree, this path, all of this stuff 1096 01:07:05,770 --> 01:07:09,070 is going to get updated, some of this stuff, 1097 01:07:09,070 --> 01:07:10,930 the path up to here. 1098 01:07:10,930 --> 01:07:14,760 The path up to the root gets updated, in general, 1099 01:07:14,760 --> 01:07:16,220 and all these nodes get touched. 1100 01:07:16,220 --> 01:07:20,680 Now, I claim that's cheap, no more expensive than updating 1101 01:07:20,680 --> 01:07:23,442 down here. 1102 01:07:23,442 --> 01:07:24,150 Let's prove that. 1103 01:07:46,440 --> 01:07:46,950 OK. 1104 01:07:46,950 --> 01:07:50,110 First issue is, in what order do I visit the nodes here? 1105 01:07:50,110 --> 01:07:52,960 I want to think about this post-order. 1106 01:07:52,960 --> 01:07:57,170 Maybe I'll use another color to draw that. 1107 01:07:57,170 --> 01:08:00,760 So let's say we start by updating this guy. 1108 01:08:00,760 --> 01:08:03,000 We update this value. 1109 01:08:03,000 --> 01:08:03,910 Then we go-- 1110 01:08:03,910 --> 01:08:06,220 I'm doing a post-order traversal, this tree. 1111 01:08:06,220 --> 01:08:08,050 Once I've completely finished here, 1112 01:08:08,050 --> 01:08:14,350 I can go up here, maybe go up here, visit over down this way, 1113 01:08:14,350 --> 01:08:16,080 fix all the maxes here. 1114 01:08:16,080 --> 01:08:18,220 Then I go up, fix all the maxes here, 1115 01:08:18,220 --> 01:08:20,229 fix all the maxes, the maxes. 1116 01:08:20,229 --> 01:08:24,170 When I come up, then I can fix the maxes up here, and so on. 1117 01:08:24,170 --> 01:08:24,670 OK. 1118 01:08:24,670 --> 01:08:26,459 Think about that order. 1119 01:08:26,459 --> 01:08:28,209 If you just look at the big picture, which 1120 01:08:28,209 --> 01:08:30,949 is which triangles are we visiting, what we do 1121 01:08:30,949 --> 01:08:33,490 is we visit this triangle, then this one, then this triangle, 1122 01:08:33,490 --> 01:08:36,580 then this one, then this triangle, then this one. 1123 01:08:36,580 --> 01:08:38,080 I guess this is the better picture. 1124 01:08:38,080 --> 01:08:40,580 This triangle, left one, this triangle, next child, 1125 01:08:40,580 --> 01:08:43,970 this parent, next child, this parent, next child. 1126 01:08:43,970 --> 01:08:46,990 And when we're done, we go over to the next big triangle. 1127 01:08:46,990 --> 01:08:49,240 Within a big triangle, we alternate 1128 01:08:49,240 --> 01:08:51,550 between two small triangles. 
1129 01:08:51,550 --> 01:08:57,279 As long as we have four blocks of cache, 1130 01:08:57,279 --> 01:08:58,899 this will be good, because there's 1131 01:08:58,899 --> 01:09:01,267 at most two blocks for this little triangle, 1132 01:09:01,267 --> 01:09:03,100 at most two blocks for this little triangle. 1133 01:09:03,100 --> 01:09:04,750 So jumping back and forth between these 1134 01:09:04,750 --> 01:09:07,479 is basically free. 1135 01:09:07,479 --> 01:09:12,640 So we assume-- white chalk. 1136 01:09:17,170 --> 01:09:22,130 We assume that we have at least four blocks of cache. 1137 01:09:22,130 --> 01:09:37,840 Post-order traversal alternates between two little triangles. 1138 01:09:37,840 --> 01:09:40,460 So this part is free. 1139 01:09:40,460 --> 01:09:42,609 The only issue is, how many little triangles 1140 01:09:42,609 --> 01:09:44,560 do I need to load overall? 1141 01:09:49,320 --> 01:09:49,819 Sorry. 1142 01:09:49,819 --> 01:09:52,149 I should say I'm just looking at the little triangles 1143 01:09:52,149 --> 01:09:54,640 in the bottom two levels. 1144 01:09:54,640 --> 01:10:02,041 So this is in the bottom two levels. 1145 01:10:02,041 --> 01:10:02,540 Sorry. 1146 01:10:02,540 --> 01:10:03,581 I forgot to mention that. 1147 01:10:03,581 --> 01:10:06,880 As soon as you go outside the bottom big triangle, 1148 01:10:06,880 --> 01:10:08,380 let's not think about it yet. 1149 01:10:08,380 --> 01:10:10,610 Worry about that later. 1150 01:10:10,610 --> 01:10:12,930 So just looking at these bottom two levels that I drew, 1151 01:10:12,930 --> 01:10:14,565 we're just doing this alternation. 1152 01:10:14,565 --> 01:10:16,690 And so the only issue is, how many little triangles 1153 01:10:16,690 --> 01:10:18,320 do I have to load? 1154 01:10:18,320 --> 01:10:22,390 So there could be these guys. 1155 01:10:22,390 --> 01:10:25,650 These guys are good, because I'm touching all these nodes. 1156 01:10:25,650 --> 01:10:28,030 The number of nodes here was bigger than B, 1157 01:10:28,030 --> 01:10:30,910 so if I take this divided by B in the ceiling, 1158 01:10:30,910 --> 01:10:33,400 the ceiling doesn't hurt me, because the overall size is 1159 01:10:33,400 --> 01:10:34,915 bigger than B. 1160 01:10:34,915 --> 01:10:36,790 These guys I have to be a little bit careful. 1161 01:10:36,790 --> 01:10:39,250 Here the ceiling might hurt me, because I only 1162 01:10:39,250 --> 01:10:40,370 read some of these nodes. 1163 01:10:40,370 --> 01:10:41,870 It may be much less than B. It could 1164 01:10:41,870 --> 01:10:46,240 be as low as root B. I divide by B, take the ceiling, I get 1, 1165 01:10:46,240 --> 01:10:49,870 so I have to pay 1 at the beginning, 1 at the end. 1166 01:10:49,870 --> 01:10:53,710 But in between, I get to divide by B. 1167 01:10:53,710 --> 01:10:58,420 So the number of little triangles in the bottom two 1168 01:10:58,420 --> 01:11:13,720 levels that I visit is going to be 1 plus log squared N over B 1169 01:11:13,720 --> 01:11:15,140 amortized. 1170 01:11:15,140 --> 01:11:18,010 The size of the interval is log squared N amortized, 1171 01:11:18,010 --> 01:11:20,650 so this would be if I'm perfect. 1172 01:11:20,650 --> 01:11:23,270 But I get a plus 1 because of various ceilings. 1173 01:11:27,070 --> 01:11:34,720 And also, the number of blocks is the same, 1174 01:11:34,720 --> 01:11:36,400 because each of these little triangles 1175 01:11:36,400 --> 01:11:40,780 fits inside at most two blocks. 
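In symbols, the count just argued: with at least four blocks of cache, each little triangle costs at most two transfers and is loaded essentially once, so

```latex
\#\{\text{little triangles touched in the bottom two levels}\}
  = O\!\left(1 + \frac{\log^2 N}{B}\right) \text{ amortized}
\;\Longrightarrow\;
\#\{\text{memory transfers for them}\}
  \le 2 \cdot O\!\left(1 + \frac{\log^2 N}{B}\right)
  = O\!\left(1 + \frac{\log^2 N}{B}\right).
```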
1176 01:11:40,780 --> 01:11:43,830 So it's going to be order 1 plus log squared N over B. 1177 01:11:43,830 --> 01:11:44,330 OK. 1178 01:11:44,330 --> 01:11:51,940 So I can touch all those in this much time, 1179 01:11:51,940 --> 01:11:54,610 because I assume that I can alternate between two of them 1180 01:11:54,610 --> 01:11:55,630 quickly. 1181 01:11:55,630 --> 01:11:58,030 And once I finish one of these leaf blocks, 1182 01:11:58,030 --> 01:12:01,990 I never touch it again, so I visit each block essentially 1183 01:12:01,990 --> 01:12:06,830 only once as long as my cache is this big. 1184 01:12:06,830 --> 01:12:07,330 Cool. 1185 01:12:07,330 --> 01:12:09,038 So that deals with the bottom two levels. 1186 01:12:09,038 --> 01:12:11,680 I claim the bottom two levels only pay this. 1187 01:12:11,680 --> 01:12:14,200 And we were already paying that to update the ordered file, 1188 01:12:14,200 --> 01:12:16,140 so it's no worse. 1189 01:12:16,140 --> 01:12:20,620 Now, there's the levels above this one that we can't see. 1190 01:12:20,620 --> 01:12:24,090 I claim that's also OK. 1191 01:12:24,090 --> 01:12:24,960 Sorry. 1192 01:12:24,960 --> 01:12:35,260 Also need number of memory transfers is equal to that. 1193 01:12:35,260 --> 01:12:35,760 OK. 1194 01:12:35,760 --> 01:12:37,050 One more part of the argument. 1195 01:12:51,010 --> 01:12:56,000 Levels above the bottom two. 1196 01:12:59,920 --> 01:13:01,930 Nice thing about the levels above the bottom two 1197 01:13:01,930 --> 01:13:07,281 is there aren't too many triangles. 1198 01:13:07,281 --> 01:13:07,780 All right. 1199 01:13:07,780 --> 01:13:09,571 Well, there aren't too many nodes involved. 1200 01:13:09,571 --> 01:13:11,260 Let's see. 1201 01:13:11,260 --> 01:13:15,730 So down here, there were like log squared N nodes. 1202 01:13:15,730 --> 01:13:17,470 There was a lot of nodes. 1203 01:13:17,470 --> 01:13:20,530 Once I go up two levels, how many of these nodes 1204 01:13:20,530 --> 01:13:22,230 can there be, these roots? 1205 01:13:25,160 --> 01:13:29,020 Well, it was log squared N at the bottom. 1206 01:13:29,020 --> 01:13:32,300 And then I basically shaved off a factor of B here, 1207 01:13:32,300 --> 01:13:35,380 so the number of these roots is at most log squared N 1208 01:13:35,380 --> 01:13:40,150 over B. That's small. 1209 01:13:40,150 --> 01:13:42,610 For those nodes, I could afford to do an entire memory 1210 01:13:42,610 --> 01:13:44,110 transfer for every single node. 1211 01:13:47,150 --> 01:13:51,950 So the number of nodes that need updating-- 1212 01:13:51,950 --> 01:14:02,110 these are ancestors-- is order log squared 1213 01:14:02,110 --> 01:14:07,507 N over B. That's the number of leaves down there. 1214 01:14:07,507 --> 01:14:09,340 At some point, you reach the common ancestor 1215 01:14:09,340 --> 01:14:10,390 of those leaves. 1216 01:14:10,390 --> 01:14:14,390 That tree-- so here, draw a big picture. 1217 01:14:14,390 --> 01:14:18,190 Here's the bottom two layers. 1218 01:14:21,340 --> 01:14:22,360 We're shaving that off. 1219 01:14:25,920 --> 01:14:31,690 Up here, the number of nodes here, these roots that we're 1220 01:14:31,690 --> 01:14:36,340 dealing with, that's the log squared N over B. 1221 01:14:36,340 --> 01:14:38,920 So if we build a tree on those nodes, 1222 01:14:38,920 --> 01:14:41,440 total number of nodes in here is only log 1223 01:14:41,440 --> 01:14:45,640 squared N over B times 2, because it's a binary tree. 1224 01:14:45,640 --> 01:14:47,890 But then there's also these nodes up here. 
1225 01:14:47,890 --> 01:14:50,240 This is potentially log N nodes. 1226 01:14:53,099 --> 01:14:54,640 So number of nodes that need updating 1227 01:14:54,640 --> 01:15:00,520 is at most this plus log N. So what's going on here 1228 01:15:00,520 --> 01:15:04,330 is we first analyzed the bottom two layers. 1229 01:15:04,330 --> 01:15:07,000 That was OK because of this ordering thing, 1230 01:15:07,000 --> 01:15:09,910 that we're only dealing with two triangles at once 1231 01:15:09,910 --> 01:15:13,020 and therefore we pay this size divided by B, 1232 01:15:13,020 --> 01:15:14,320 log squared N over B. 1233 01:15:14,320 --> 01:15:17,290 Then there's the common ancestors of those nodes. 1234 01:15:17,290 --> 01:15:20,790 That forms a little binary tree of size log squared N over B. 1235 01:15:20,790 --> 01:15:23,270 For these nodes, this triangle up 1236 01:15:23,270 --> 01:15:26,740 here, we can afford to pay one memory transfer for every node, 1237 01:15:26,740 --> 01:15:31,720 so we don't care that we're efficient up here, which is-- 1238 01:15:31,720 --> 01:15:34,035 I mean, well, let's not worry about whether we are. 1239 01:15:34,035 --> 01:15:35,125 It doesn't matter. 1240 01:15:35,125 --> 01:15:48,100 So for these nodes, we pay at most one memory transfer each. 1241 01:15:48,100 --> 01:15:50,410 And so we pay at most log squared N over B for them. 1242 01:15:50,410 --> 01:15:53,950 That's fine, because we already paid that much. 1243 01:15:53,950 --> 01:15:55,780 And finally, we have to pay for these nodes 1244 01:15:55,780 --> 01:15:59,170 up here, which are from the common ancestors of these guys 1245 01:15:59,170 --> 01:16:00,212 up to the root. 1246 01:16:00,212 --> 01:16:02,170 That has length at most log N. Could be smaller 1247 01:16:02,170 --> 01:16:05,120 if this was a big interval. 1248 01:16:05,120 --> 01:16:15,970 So for these guys, we pay order log base B of N. Why? 1249 01:16:15,970 --> 01:16:18,740 Because this is a root to node path. 1250 01:16:18,740 --> 01:16:20,260 And this is the static analysis. 1251 01:16:20,260 --> 01:16:22,720 We already know that following a root to node path 1252 01:16:22,720 --> 01:16:25,810 we pay at most log base B of N. 1253 01:16:25,810 --> 01:16:28,440 So we pay log base B of N plus log 1254 01:16:28,440 --> 01:16:31,810 squared N over B for this, plus log squared N over B for this, 1255 01:16:31,810 --> 01:16:36,730 plus log squared N over B for the ordered file. 1256 01:16:36,730 --> 01:16:41,650 So overall running time for an update 1257 01:16:41,650 --> 01:16:49,780 is log base B of N plus log squared N over B. 1258 01:16:49,780 --> 01:16:51,460 We actually paid log base B of N twice, 1259 01:16:51,460 --> 01:16:53,530 once to do the initial search. 1260 01:16:53,530 --> 01:16:56,190 I said an update first has to search for the key. 1261 01:16:56,190 --> 01:16:57,490 Then it updates the order file. 1262 01:16:57,490 --> 01:17:00,160 That's where we've paid the first log squared N over B. 1263 01:17:00,160 --> 01:17:02,980 And then to propagate all the changes up the tree, 1264 01:17:02,980 --> 01:17:05,710 we pay log squared N over B for this part, 1265 01:17:05,710 --> 01:17:07,870 and then another log base B of N for this part. 1266 01:17:10,410 --> 01:17:10,910 OK. 1267 01:17:10,910 --> 01:17:14,080 That's updates with a bad bound. 1268 01:17:14,080 --> 01:17:15,890 It turns out we're almost there. 
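Tallying the update cost just described (the "shaved off a factor of B" step works because every little triangle at the level of detail straddling B has between root B and B nodes, so going up two levels of triangles divides the log^2 N changed cells by at least root B times root B, which is B):

```latex
\text{update}
 \;=\; \underbrace{O\!\left(1 + \tfrac{\log^2 N}{B}\right)}_{\text{ordered-file shift}}
 \;+\; \underbrace{O\!\left(1 + \tfrac{\log^2 N}{B}\right)}_{\text{bottom two levels}}
 \;+\; \underbrace{O\!\left(\tfrac{\log^2 N}{B}\right)}_{\text{common-ancestor tree}}
 \;+\; \underbrace{O(\log_B N)}_{\text{path to root (and the initial search)}}
 \;=\; O\!\left(\log_B N + \frac{\log^2 N}{B}\right) \text{ amortized.}
```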
1269 01:17:15,890 --> 01:17:18,043 So at this point we have searches optimal, 1270 01:17:18,043 --> 01:17:20,376 because searches are just searching through the van Emde 1271 01:17:20,376 --> 01:17:20,920 Boas thing. 1272 01:17:20,920 --> 01:17:22,300 That's log base B of N. 1273 01:17:22,300 --> 01:17:25,180 Updates, not quite optimal. 1274 01:17:25,180 --> 01:17:27,885 I mean not quite matching a B-tree, I should say, 1275 01:17:27,885 --> 01:17:30,010 because it's not an optimal bound we're aiming for. 1276 01:17:30,010 --> 01:17:32,070 We're just aiming for B-trees. 1277 01:17:32,070 --> 01:17:36,010 We'd want to get this bound, same as searching. 1278 01:17:36,010 --> 01:17:39,580 But we have this extra term, which 1279 01:17:39,580 --> 01:17:43,690 if B is small, for example if B is log N, 1280 01:17:43,690 --> 01:17:46,890 this term will be bigger than this one. 1281 01:17:46,890 --> 01:17:50,160 The breakpoint is when B equals log n times log log N. 1282 01:17:50,160 --> 01:17:51,700 So for small B, we're not happy. 1283 01:17:51,700 --> 01:17:54,650 And we want to be happy for all B's simultaneously. 1284 01:17:54,650 --> 01:17:57,240 So I want to fix this. 1285 01:17:57,240 --> 01:18:00,990 We're going to fix it using a general technique, 1286 01:18:00,990 --> 01:18:04,420 great technique, called indirection. 1287 01:18:04,420 --> 01:18:08,880 So this is step 5, indirection. 1288 01:18:12,310 --> 01:18:14,800 The idea is simple. 1289 01:18:14,800 --> 01:18:20,880 We want to use the previous data structure 1290 01:18:20,880 --> 01:18:26,380 on N over log N items. 1291 01:18:26,380 --> 01:18:29,250 And, well, we need to store N items, so where do we put them? 1292 01:18:29,250 --> 01:18:31,060 On the bottom. 1293 01:18:31,060 --> 01:18:35,370 So I'm going to have theta log N nodes down here, 1294 01:18:35,370 --> 01:18:39,040 theta log N nodes next to them-- 1295 01:18:39,040 --> 01:18:41,550 or items, I should say. 1296 01:18:41,550 --> 01:18:44,500 Theta log N. 1297 01:18:44,500 --> 01:18:45,290 OK. 1298 01:18:45,290 --> 01:18:48,840 And then one of these items, let's say the min item, 1299 01:18:48,840 --> 01:18:50,310 gets stored in this tree. 1300 01:18:50,310 --> 01:18:53,730 The min item over here gets stored in this tree. 1301 01:18:53,730 --> 01:18:58,840 The min item here gets stored up here. 1302 01:18:58,840 --> 01:18:59,360 OK. 1303 01:18:59,360 --> 01:19:00,705 So this item only-- 1304 01:19:00,705 --> 01:19:02,810 the data structure we just developed, 1305 01:19:02,810 --> 01:19:05,990 I only store N over log N of the items. 1306 01:19:05,990 --> 01:19:08,180 The rest that fit in between those items 1307 01:19:08,180 --> 01:19:09,830 are stored down here. 1308 01:19:09,830 --> 01:19:12,800 Now I use standard B-tree tricks, actually. 1309 01:19:12,800 --> 01:19:16,470 If I want to do an insertion, how do I do a search? 1310 01:19:16,470 --> 01:19:18,200 I follow a root to leaf path. 1311 01:19:18,200 --> 01:19:21,500 And then I do a linear scan of this entire structure. 1312 01:19:21,500 --> 01:19:23,780 How much does it cost to do a linear scan of log N? 1313 01:19:23,780 --> 01:19:28,770 Only log N divided by B. This is smaller than log base B of N, 1314 01:19:28,770 --> 01:19:32,551 so the hard part of search is the log base B of N. 1315 01:19:32,551 --> 01:19:33,050 OK. 1316 01:19:33,050 --> 01:19:33,980 How do I do an update? 1317 01:19:33,980 --> 01:19:35,450 Well, first, I search for the guy. 
1318 01:19:35,450 --> 01:19:38,210 I find the block that it fits into and just insert 1319 01:19:38,210 --> 01:19:41,180 into that block. 1320 01:19:41,180 --> 01:19:42,320 Done. 1321 01:19:42,320 --> 01:19:45,170 As long as this block doesn't get too big, 1322 01:19:45,170 --> 01:19:47,060 how do I insert into a block? 1323 01:19:47,060 --> 01:19:48,970 Rewrite the entire block. 1324 01:19:48,970 --> 01:19:49,970 It's a scan. 1325 01:19:49,970 --> 01:19:52,190 It only takes log N over B, so it's 1326 01:19:52,190 --> 01:19:54,740 basically free to rewrite this entire block every time 1327 01:19:54,740 --> 01:19:59,210 as long as it has size at most log N. If it gets too full, 1328 01:19:59,210 --> 01:20:03,340 then I split it into two parts that are half-full. 1329 01:20:03,340 --> 01:20:06,560 So in general, each of these blocks 1330 01:20:06,560 --> 01:20:14,294 will maintain the size to be between 1/4 log N and log N. 1331 01:20:14,294 --> 01:20:15,710 So if it gets bigger than log N, I 1332 01:20:15,710 --> 01:20:17,840 split into two guys of 1/2 size. 1333 01:20:17,840 --> 01:20:20,120 If it gets smaller than 1/4 log N, 1334 01:20:20,120 --> 01:20:22,430 then I'll merge it with one of its neighbors 1335 01:20:22,430 --> 01:20:25,637 and then possibly have to re-split it. 1336 01:20:25,637 --> 01:20:27,470 Or you can think of it as stealing items 1337 01:20:27,470 --> 01:20:29,090 from your siblings. 1338 01:20:29,090 --> 01:20:31,190 That's what you do in B-trees. 1339 01:20:31,190 --> 01:20:33,620 But with a constant number of merges and splits, 1340 01:20:33,620 --> 01:20:36,000 you can restore the property that these guys have 1341 01:20:36,000 --> 01:20:39,110 sizes very close to 1/2 log N. And so 1342 01:20:39,110 --> 01:20:41,030 if they end up getting full, that means 1343 01:20:41,030 --> 01:20:42,440 you inserted 1/2 log N items. 1344 01:20:42,440 --> 01:20:43,940 If they end up getting empty, that 1345 01:20:43,940 --> 01:20:48,170 means you deleted 1/4 log N items. 1346 01:20:48,170 --> 01:20:51,890 And so in both cases, you have log N items to charge to. 1347 01:20:51,890 --> 01:20:53,870 Whenever you do an update up here-- 1348 01:20:53,870 --> 01:20:55,640 so we update up here every time we split 1349 01:20:55,640 --> 01:20:57,740 or merge down here, because this guy has 1350 01:20:57,740 --> 01:21:01,070 to keep track of the blocks, not the items in the blocks. 1351 01:21:01,070 --> 01:21:06,320 So we effectively slow down the updates 1352 01:21:06,320 --> 01:21:21,210 in the top structure by a factor of log N, 1353 01:21:21,210 --> 01:21:24,720 which is great, because we take this update bound 1354 01:21:24,720 --> 01:21:29,070 and we get to divide it by log N, which removes this square. 1355 01:21:29,070 --> 01:21:31,020 So we end up with-- 1356 01:21:34,410 --> 01:21:40,770 so we used to have log base B of N plus log squared N over B. 1357 01:21:40,770 --> 01:21:45,030 We get to divide that by log N, plus 1358 01:21:45,030 --> 01:21:50,310 we have to do this part at the bottom. 1359 01:21:50,310 --> 01:21:55,650 Plus we're doing log N over B at the bottom. 1360 01:21:55,650 --> 01:21:59,850 So basically, this log cancels with this square, 1361 01:21:59,850 --> 01:22:01,650 so this is log N over B as well. 1362 01:22:01,650 --> 01:22:04,260 This thing gets even smaller, so we end up 1363 01:22:04,260 --> 01:22:07,980 being able to do updates in log N over B.
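Written out, the accounting for step 5 so far (ignoring the initial search, which is discussed next):

```latex
\underbrace{\frac{O\!\left(\log_B N + \tfrac{\log^2 N}{B}\right)}{\Theta(\log N)}}_{\text{top structure, charged to } \Theta(\log N) \text{ bottom updates}}
 \;+\; \underbrace{O\!\left(1 + \frac{\log N}{B}\right)}_{\text{rewrite one bucket}}
 \;=\; O\!\left(1 + \frac{\log N}{B}\right) \text{ amortized,}
```

since log base B of N divided by log N is 1 over log B, which is O(1).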
1364 01:22:07,980 --> 01:22:09,870 Not really though, because at the beginning, 1365 01:22:09,870 --> 01:22:12,369 we have to do a search to figure out where to do the update. 1366 01:22:12,369 --> 01:22:13,950 If you knew where you had to update, 1367 01:22:13,950 --> 01:22:15,120 you could update a little faster. 1368 01:22:15,120 --> 01:22:16,495 You could update in log N divided 1369 01:22:16,495 --> 01:22:18,420 by B. If you don't know where to update, 1370 01:22:18,420 --> 01:22:23,430 you've got to spend log base B of N to search initially. 1371 01:22:23,430 --> 01:22:25,170 But that's the fun part. 1372 01:22:25,170 --> 01:22:27,030 We can now update this data structure 1373 01:22:27,030 --> 01:22:30,090 fast, just with this one layer of indirection with log N 1374 01:22:30,090 --> 01:22:32,736 chunks at the bottom, speeding this data structure up 1375 01:22:32,736 --> 01:22:34,110 by a factor of log N for updates. 1376 01:22:34,110 --> 01:22:35,401 Searches don't take any longer. 1377 01:22:35,401 --> 01:22:37,400 You just pay this extra term, which 1378 01:22:37,400 --> 01:22:40,334 was already much smaller than what we were paying before. 1379 01:22:40,334 --> 01:22:41,280 Phew. 1380 01:22:41,280 --> 01:22:44,460 That's cache-oblivious B-trees, other than ordered 1381 01:22:44,460 --> 01:22:47,420 file maintenance, which we'll do next class.
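As a closing illustration of step 5, here is a toy Python sketch of the indirection layer (illustrative only; the names are invented, and a plain sorted list of bucket minima stands in for the cache-oblivious top structure of steps 1 through 4, so only the bucket bookkeeping is shown, not the cache behavior): items live in buckets of size Theta(log N), the top structure stores one representative per bucket, a bucket splits when it exceeds log N, and it merges with a sibling, possibly re-splitting, when it drops below 1/4 log N.

```python
import bisect
import math

class IndirectionLayer:
    """Buckets of Theta(log N) items under a top structure that stores one
    key (the bucket min) per bucket.  A sorted list stands in for the
    cache-oblivious top structure here."""

    def __init__(self, n_hint=1 << 20):
        self.cap = max(4, int(math.log2(n_hint)))   # bucket capacity ~ log N
        self.mins = []                              # keys stored "up top"
        self.buckets = []                           # buckets[i] has min mins[i]

    def _bucket_of(self, key):
        return max(bisect.bisect_right(self.mins, key) - 1, 0)

    def search(self, key):
        # Top search plus a scan of one bucket: the scan costs only an extra
        # O(log N / B) transfers, less than the O(log_B N) top search.
        return bool(self.buckets) and key in self.buckets[self._bucket_of(key)]

    def insert(self, key):
        if not self.buckets:
            self.mins, self.buckets = [key], [[key]]
            return
        i = self._bucket_of(key)
        bisect.insort(self.buckets[i], key)          # rewrite the whole bucket
        self.mins[i] = self.buckets[i][0]
        if len(self.buckets[i]) > self.cap:          # too full: split in half,
            b = self.buckets[i]                      # updating the top structure
            half = len(b) // 2
            self.buckets[i:i + 1] = [b[:half], b[half:]]
            self.mins[i:i + 1] = [b[0], b[half]]

    def delete(self, key):
        i = self._bucket_of(key)
        self.buckets[i].remove(key)
        if self.buckets[i]:
            self.mins[i] = self.buckets[i][0]
        if len(self.buckets[i]) < self.cap // 4 and len(self.buckets) > 1:
            lo = i if i + 1 < len(self.buckets) else i - 1
            merged = self.buckets[lo] + self.buckets[lo + 1]   # merge with a sibling
            self.buckets[lo:lo + 2] = [merged]
            self.mins[lo:lo + 2] = [merged[0]]
            if len(merged) > self.cap:                         # possibly re-split
                half = len(merged) // 2
                self.buckets[lo:lo + 1] = [merged[:half], merged[half:]]
                self.mins[lo:lo + 1] = [merged[0], merged[half]]


t = IndirectionLayer(n_hint=1024)
for x in (3, 7, 9, 16, 21, 42, 8):
    t.insert(x)
print(t.search(8), t.search(10))   # True False
```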